Assembly Identity — ReferenceIdentity and DefinitionIdentity, Comparison and Transformation


An Assembly Identity is simply a bag of attributes. Each attribute has a predefined range of acceptable values.


 


There are two kinds of Assembly Identity. One is called DefinitionIdentity. The other one is called ReferenceIdentity.


 


DefinitionIdentity is an integral part of an assembly. Each assembly has one and only one DefinitionIdentity.


 


For a strongly named assembly, the reverse is supposed to be true: a DefinitionIdentity should uniquely identify an assembly. In practice, not everyone changes their assembly version number often, which means multiple different assemblies can end up with the same DefinitionIdentity.


 


DefinitionIdentity is tied to a particular assembly. DefinitionIdentity does not exist without an assembly.


 


In .NET Framework 1.0/1.1, a DefinitionIdentity has four built-in attributes: Name, Version, Culture, and PublicKey(Token). In .NET Framework 2.0, we introduced a new attribute, ProcessorArchitecture. It is likely that we will introduce new attributes in the future.


 


ReferenceIdentity is an identity that is used to refer to an assembly.


 


At runtime, fusion applies binding policies to a ReferenceIdentity to get a post-policy ReferenceIdentity (still a reference identity), then probes the world based on defined rules to find an assembly. Once the assembly is found, fusion extracts the DefinitionIdentity of the assembly and does a Ref-Def comparison. If the comparison shows that the post-policy ReferenceIdentity is not binding-equivalent to the DefinitionIdentity, fusion returns FUSION_E_REF_DEF_MISMATCH, which in turn is translated to the famous FileLoadException: “The located assembly’s manifest definition with name xxx.dll does not match the assembly reference.”
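The bind sequence described above (apply policy, probe, then Ref-Def check) can be sketched in Python. This is only an illustration of the flow, not how fusion is implemented: the dictionary representation of an identity, the `redirects` table standing in for binding policy, the `world` lookup standing in for probing, and the `RefDefMismatch` exception standing in for FUSION_E_REF_DEF_MISMATCH are all assumptions made for the example.

```python
class RefDefMismatch(Exception):
    """Stands in for FUSION_E_REF_DEF_MISMATCH in this sketch."""


def apply_policy(ref, redirects):
    """Return the post-policy reference (still a reference identity)."""
    post = dict(ref)
    if post.get("version") in redirects:
        post["version"] = redirects[post["version"]]
    return post


def bind(ref, redirects, world):
    """Apply policy, 'probe' by name, then check the reference against the definition."""
    post_policy = apply_policy(ref, redirects)
    # Probing is reduced to a lookup by name (hypothetical simplification).
    definition = world.get(post_policy["name"])
    if definition is None:
        raise FileNotFoundError(post_policy["name"])
    # Every attribute present in the post-policy reference must match.
    for attr, value in post_policy.items():
        if definition.get(attr) != value:
            raise RefDefMismatch(f"{attr}: {value!r} != {definition.get(attr)!r}")
    return definition


# Hypothetical data: one assembly on disk, and a reference to an older version.
world = {
    "MyLib": {
        "name": "MyLib",
        "version": "2.0.0.0",
        "culture": "neutral",
        "public_key_token": "0123456789abcdef",  # placeholder token
    }
}
ref = {"name": "MyLib", "version": "1.0.0.0", "public_key_token": "0123456789abcdef"}

bound = bind(ref, {"1.0.0.0": "2.0.0.0"}, world)  # a redirect makes the bind succeed
```

Without the redirect, the same call raises `RefDefMismatch`, because the reference asks for version 1.0.0.0 while the assembly found is 2.0.0.0.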


 


ReferenceIdentity is not tied to any assembly. It is possible that multiple ReferenceIdentities can be resolved to the same assembly at runtime.


 


Today ReferenceIdentity manifests itself in a few forms:


 



  1. In assembly metadata, compilers emit ReferenceIdentities for assemblies the assembly references.
  2. The assembly name in Assembly.Load is a ReferenceIdentity.
  3. When you serialize a type, the serialized data contains a ReferenceIdentity.
  4. The assembly names specified in various config files are ReferenceIdentities.

 


The differences between DefinitionIdentity and ReferenceIdentity are:


 



  1. DefinitionIdentity is tied to a particular assembly, while ReferenceIdentity is not. This is discussed above.
  2. ReferenceIdentity can be partial, while DefinitionIdentity cannot. A ReferenceIdentity can contain only a subset of the built-in attributes, while a DefinitionIdentity will contain all the built-in attributes. For attributes not present when the assembly is compiled, their values are implicitly set to “neutral”.
  3. Binding policies cannot be applied to DefinitionIdentity. It is meaningless to apply policies on DefinitionIdentity.
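The second point, that a DefinitionIdentity is always complete while a ReferenceIdentity may be partial, can be illustrated with a small Python sketch. The attribute names follow the post; the dictionary representation, the `make_definition` helper, and the specific default values are assumptions made for the example, not the CLR's data model.

```python
# Hypothetical model: an identity is a dict of attribute name -> value.
BUILT_IN = ("name", "version", "culture", "public_key_token", "processor_architecture")

# Assumed "neutral" defaults; what counts as neutral differs per attribute.
DEFAULTS = {
    "version": "0.0.0.0",
    "culture": "neutral",
    "public_key_token": None,
    "processor_architecture": "None",
}


def make_definition(**attrs):
    """Return a complete identity: unspecified attributes get their default."""
    if "name" not in attrs:
        raise ValueError("a DefinitionIdentity must have a name")
    unknown = set(attrs) - set(BUILT_IN)
    if unknown:
        raise ValueError(f"unknown attributes: {sorted(unknown)}")
    return {**DEFAULTS, **attrs}


# A definition always carries every built-in attribute.
definition = make_definition(name="MyLib", version="1.0.0.0")

# A reference is just whatever subset the referrer chose to specify.
partial_reference = {"name": "MyLib"}
```

Here `definition` ends up with all five attributes even though only two were given, while `partial_reference` stays exactly as written.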

 


Comparison


 


Since there are two types of assembly name, there are three types of comparison on assembly names: Ref-Ref, Def-Def, and Ref-Def. There is no Def-Ref comparison.


 


Two ReferenceIdentities compare as equal if and only if they have the same set of attributes and each attribute has the same value. For example, ReferenceIdentities “name” and “name, culture=neutral” compare as not equal, since the first one has only one attribute, while the second one has two.
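As a minimal Python sketch of the Ref-Ref rule, assume an identity is modelled as a dictionary of attribute names to values (a representation chosen for illustration; `ref_equals` is a hypothetical helper). Dictionary equality then expresses the rule directly: same keys, same value per key.

```python
def ref_equals(ref_a: dict, ref_b: dict) -> bool:
    """Ref-Ref: equal only with the same attribute set and the same values."""
    return ref_a == ref_b


only_name = {"name": "name"}
name_and_culture = {"name": "name", "culture": "neutral"}

same = ref_equals(only_name, {"name": "name"})        # identical attribute sets
different = ref_equals(only_name, name_and_culture)   # second has an extra attribute
```

The sketch ignores details such as case-insensitive comparison of names or cultures.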


 


Two DefinitionIdentities are considered equal if and only if all the attributes have the same value. By design, two DefinitionIdentities always have the same set of attributes, so the comparison is purely on the value of each attribute. For example, DefinitionIdentities “name” and “name, culture=neutral” compare as equal, since unspecified attributes carry a default value of “neutral”, while “name, culture=neutral” and “name, culture=en-us” are not equal.
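The Def-Def rule can be sketched the same way in Python. Identities are modelled as dictionaries, and only the culture default is filled in to keep the example short; both the representation and the `def_equals` helper are assumptions for illustration.

```python
# Assumed default for an unspecified attribute (only culture is modelled here).
DEFAULTS = {"culture": "neutral"}


def def_equals(def_a: dict, def_b: dict) -> bool:
    """Def-Def: fill in defaults first, then compare value by value."""
    return {**DEFAULTS, **def_a} == {**DEFAULTS, **def_b}


implicit_neutral = {"name": "name"}
explicit_neutral = {"name": "name", "culture": "neutral"}
en_us = {"name": "name", "culture": "en-us"}

equal = def_equals(implicit_neutral, explicit_neutral)
not_equal = def_equals(explicit_neutral, en_us)
```

Because defaults are applied before comparing, “name” and “name, culture=neutral” come out equal here, unlike in the Ref-Ref case.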


 


ReferenceIdentity and DefinitionIdentity cannot be compared directly, since they have different semantics. Rather, we ask whether a given ReferenceIdentity matches a DefinitionIdentity. A ReferenceIdentity matches a DefinitionIdentity if and only if the values of all the attributes specified in the ReferenceIdentity match the values of the corresponding attributes of the DefinitionIdentity. If an attribute is missing in the ReferenceIdentity, it matches any value for that attribute in the DefinitionIdentity. For example, Ref “name” matches Def “name, culture=neutral” and Def “name, culture=en-us”. But Ref “name, culture=neutral” does not match Def “name, culture=en-us”.
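The matching rule above translates into a short Python sketch. As before, the dictionary model and the `ref_matches_def` name are illustrative assumptions rather than anything the CLR exposes.

```python
def ref_matches_def(ref: dict, definition: dict) -> bool:
    """Ref-Def: every attribute the reference specifies must match the definition.

    Attributes missing from the reference are wildcards and match anything.
    """
    return all(definition.get(attr) == value for attr, value in ref.items())


def_neutral = {"name": "name", "culture": "neutral"}
def_en_us = {"name": "name", "culture": "en-us"}

# Ref "name" matches both definitions.
partial_matches = [ref_matches_def({"name": "name"}, d) for d in (def_neutral, def_en_us)]

# Ref "name, culture=neutral" does not match Def "name, culture=en-us".
mismatch = ref_matches_def({"name": "name", "culture": "neutral"}, def_en_us)
```

Note the asymmetry: the loop runs over the reference's attributes only, which is why there is no Def-Ref comparison.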


 


In the CLR we have another special comparison: the binding comparison for Ref-Def matching. In this binding comparison context, the version number is ignored when the ReferenceIdentity does not contain any public key (token).
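A minimal Python sketch of this binding comparison, assuming the same dictionary model of an identity. The `binding_matches` helper and the token value are hypothetical; the only rule taken from the text is that the version is skipped when the reference carries no public key (token).

```python
def binding_matches(ref: dict, definition: dict) -> bool:
    """Ref-Def matching, except the version is ignored for references without a key."""
    ignore_version = ref.get("public_key_token") is None
    for attr, value in ref.items():
        if attr == "version" and ignore_version:
            continue  # simply named reference: version does not participate
        if definition.get(attr) != value:
            return False
    return True


TOKEN = "0123456789abcdef"  # placeholder, not a real public key token

simple_def = {"name": "lib", "version": "2.0.0.0", "culture": "neutral", "public_key_token": None}
strong_def = {"name": "lib", "version": "2.0.0.0", "culture": "neutral", "public_key_token": TOKEN}

simple_ref = {"name": "lib", "version": "1.0.0.0"}
strong_ref = {"name": "lib", "version": "1.0.0.0", "public_key_token": TOKEN}

simple_ok = binding_matches(simple_ref, simple_def)   # version difference is ignored
strong_ok = binding_matches(strong_ref, strong_def)   # version difference is a mismatch
```

In this sketch the simply named reference binds despite the version difference, while the strongly named one does not.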


 


Transformation


 


A ReferenceIdentity cannot be transformed to a DefinitionIdentity.


 


A DefinitionIdentity can be transformed to a ReferenceIdentity. For example, when you serialize a CLR type, the DefinitionIdentity is transformed to a ReferenceIdentity. Later when the type is deserialized, the ReferenceIdentity is used to locate the right assembly.


 


The transformation from DefinitionIdentity to ReferenceIdentity can (or rather, should) be customizable. You should be able to include only a subset of the attributes in the transformed ReferenceIdentity, and you should also be able to include the full set of attributes.
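What such a customizable transformation might look like can be sketched in Python. This is a hypothetical helper, not an existing API: `to_reference` and its `include` parameter are invented for the example, and identities are again modelled as dictionaries.

```python
def to_reference(definition: dict, include=None) -> dict:
    """Build a reference from a definition, keeping all or only some attributes."""
    if include is None:
        return dict(definition)  # full reference: every attribute is carried over
    missing = set(include) - set(definition)
    if missing:
        raise KeyError(f"definition has no attributes: {sorted(missing)}")
    return {attr: definition[attr] for attr in include}


definition = {
    "name": "MyLib",
    "version": "1.0.0.0",
    "culture": "neutral",
    "public_key_token": "0123456789abcdef",  # placeholder token
}

full_ref = to_reference(definition)
# A version-less reference, for example for serialized data that should
# keep resolving after the assembly is rebuilt with a new version.
loose_ref = to_reference(definition, include=("name", "public_key_token"))
```

Which subset to keep is a policy decision for the caller; the sketch only shows that both the partial and the full form come from the same definition.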


 


 


Much of this information is *not* present in the CLR today. Though you do see the concepts of references and definitions, there is only one AssemblyName class in the CLR. Given an AssemblyName object, you cannot tell whether it is a ReferenceIdentity or a DefinitionIdentity. You have to look at it in context to tell which one it is.


 


This omission does cause confusion about how to compare assembly identities and how to transform a DefinitionIdentity to a ReferenceIdentity. Understanding the difference between ReferenceIdentity and DefinitionIdentity is a big paradigm shift. Unfortunately, it is not possible to make such a big move in .NET Framework 2.0 at this point. We will look into making the change in the next version of the .NET Framework.

Comments (10)

  1. RonO says:

    > By design two DefinitionIdentities will always have the same set of attributes.

    Isn’t this true only if they are compiled to the same version of the Framework? What happens when comparing the DefinitionIdentity of a 1.1 assembly against the DefinitionIdentity of a 2.0 assembly? Or am I forgetting/missing something here?

  2. junfeng says:

    See this paragraph:

    ReferenceIdentity can be partial, while DefinitionIdentity cannot. A ReferenceIdentity can contain only a subset of the built-in attributes, while a DefinitionIdentity will contain all the built-in attributes. For attributes not present when the assembly is compiled, their values are implicitly set to “neutral”.

    Maybe I should be more clear about this. "neutral" means different things for different attributes. For example, V1.x assemblies have ProcessorArchitecture of NONE. If you have v2.0 .Net framework March CTP, you will notice a ProcessorArchitecture enumeration, and NONE is one of the enum.

  3. David Levine says:

    Hi,

    re: "Maybe I should be more clear about this. "neutral" means different things for different attributes. For example, V1.x assemblies have ProcessorArchitecture of NONE. If you have v2.0 .Net framework March CTP, you will notice a ProcessorArchitecture enumeration, and NONE is one of the enum. "

    Will the fusion layer provide automatic mappings for this, and are the probing and binding rules worked out for this yet?

    For example, for an assembly that was built without this attribute, what happens if it is run on a system with two different assemblies, each with a different ProcessorArchitecture value, but with all other attributes the same (i.e. name, token, version, and culture)? Which one, if any, will it bind to?

  4. junfeng says:

    David,

    The answer is in the post. You are looking for Ref-Def matching.

    "A ReferenceIdentity matches a DefinitionIdentity, if and only if the value of all the attributes specified in the ReferenceIdentity match the value of the corresponding attributes of the DefinitionIdentity. If an attribute is missing in the ReferenceIdentity, it matches any value for that attribute in DefinitionIdentity. "

    For your specific question, it is answered here:

    GAC, Assembly ProcessorArchitecture, and Probing

    http://blogs.msdn.com/junfeng/archive/2004/09/12/228635.aspx

  5. Sean Nolan says:

    It seems to me that one fairly small change could make all this a lot easier.

    How about if the version number was not checked for strong-named assemblies that are not in the GAC?

    If the assembly is not in the GAC, but only in the local folder, surely the better assumption would be that we _do_ want to use it.

    Let’s consider – if the wrong version is in the local folder there are two possibilities:

    1. It is compatible and everything works.

    2. It is not compatible and something breaks.

    But – by causing a load error if the version is different, you reduce that down to:

    1. Whether it is compatible or not, everything breaks.

    This could be controlled by a config option, so that you could turn it off if you don’t like it, but it seems it would cause the code to work most of the time with no down-side?

    As discussed at http://pluralsight.com/blogs/craig/archive/2005/03/11/6653.aspx#8449

  6. junfeng says:

    Sean,

    From a programming model’s point of view, different comparison algorithms for the GAC and the AppBase will simply confuse people more than help them.

    If you don’t want the version number to be checked, you can use simply named assemblies.

    Or you can use App.Config bindingRedirect.

    Or you can recompile your app.

  7. Sean Nolan says:

    It seems to me that 95% of .NET code is going to be deployed one of two ways – to the GAC, or to the local folder. The other 5% may get confusing, but you can make the 95% simple.

    The rule would be:

    If it’s in the GAC versioning is checked, if it is not in the GAC versioning is not checked.

    When would that ever be confusing? I’d say telling me that my perfectly compatible code is being rejected because a revision number changed, and that I should therefore edit an obscure element in an XML file, sounds more like confusion to me.

    I’d have to ask, if that simple rule was applied – when would it ever negatively affect the majority of .NET developers.

  8. junfeng says:

    Sean,

    How do you tell the developers? If the assembly is in GAC, you need the bindingRedirect, if it is not, you don’t need it? You don’t see it is confusing?

    As I replied in Craig’s blog, the current policy system does have its inefficiencies. We will re-design it in the next version of the .NET Framework.

  9. Sean Nolan says:

    Honestly I don’t think that’s confusing at all.

    In fact – if I put an assembly in the local folder and it just works without a binding redirect then I don’t even have to think about it, so 50% of the time people don’t even have to know about redirects at all let alone be confused by them.

    I think it’s much more confusing when I get some software that needs version 1.3 of an assembly and I get version 1.3 of the assembly but it doesn’t work (because I have version 1.3.1851.0 instead of version 1.3.1792.0). So now people are not just confused, they have no idea why it isn’t working, so they have to go off and research binding redirects.

    The point is that versioning of strong-named assemblies is there to support sharing in the GAC – you do not need versioning when you are not sharing multiple versions, and there cannot be multiple versions in a local folder. Applying versioning to an assembly in the local folder is confusing. If the assembly is not in the GAC it makes perfect sense that GAC rules would not be applied to it, that’s not confusing.

  10. Given a path to an assembly manifest file, how do we get its full display name, including ProcessorArchitecture?…