Override CLR Assembly Probing Logic --- IHostAssemblyManager/IHostAssemblyStore

In .NET Framework, when resolving an assembly reference, CLR first checks GAC, then searches specific locations under the application directory (https://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpguide/html/cpconhowruntimelocatesassemblies.asp). If the assembly is not in one of those locations, CLR fires the AssemblyResolve event. You can subscribe to the AssemblyResolve event and return the assembly using Assembly.LoadFrom/LoadFile/Load(byte[]).

You can change how CLR searches for the assembly within the application, but you cannot opt out of the standard probing. The AssemblyResolve event is the only place you can implement your own probing logic, and it is fired only after the CLR standard probing has finished and failed.

In .NET Framework 2.0, this has changed. You can override the probing logic with your own implementation by hosting CLR and implementing the IHostAssemblyManager and IHostAssemblyStore interfaces.

IHostAssemblyManager

IHostAssemblyManager basically does two things:

1. return an instance of IHostAssemblyStore

2. return a list of assembly references that should *bypass* the host (called NonHostStoreAssemblies).

For assembly references in the list returned by IHostAssemblyManager, CLR will use the standard probing implementation and will *not* consult the host assembly store. Otherwise, CLR will ask the host assembly store to provide the assembly.

There are a few scenarios around the list:

1. The host returns a list, and the list is not empty

In this case, when resolving an assembly reference, CLR checks whether the reference is in the list supplied by the host.

If it is, CLR moves on with the standard probing.

If it is not, CLR will call into the IHostAssemblyStore interface provided by the host.

WARNING!!! In this case, CLR will *not* probe GAC for this reference, *nor* will it probe the application directory.

For this reason, if you decide to provide a list, make sure all the framework assemblies are on the list.

2. The host returns a list, and the list is empty

This is a special case of 1). Essentially every assembly reference goes through the host provided assembly store. It is very unlikely anyone will want to do so.

3. The host does not return a list

In this case, the host wants to supplement the CLR standard probing with its own logic. Since there is already a way for a host to provide the assembly after the standard probing has failed (the AssemblyResolve event), we decided to implement the probing logic in the following order:

1. GAC

2. host provided assembly store

3. application directory

Most people should stick with 3): the host provides an assembly store, without a list of NonHostStoreAssemblies.
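The three scenarios can be summarized as a small decision function. What follows is only a Python model of the routing described above, not CLR or hosting API code. The function name, the string labels, and the use of plain string comparison in place of real assembly identity matching are all assumptions made for illustration.

```python
from typing import List, Optional, Set

GAC = "GAC"
HOST_STORE = "host store"
APP_DIR = "application directory"


def probing_order(reference: str,
                  non_host_store_assemblies: Optional[Set[str]]) -> List[str]:
    """Return the locations consulted, in order, for one assembly reference.

    non_host_store_assemblies is None when the host does not return a list
    (scenario 3). An empty set models scenario 2.
    """
    if non_host_store_assemblies is None:
        # Scenario 3: the host store supplements the standard probing.
        return [GAC, HOST_STORE, APP_DIR]
    if reference in non_host_store_assemblies:
        # On the list: standard probing only, the host is bypassed.
        return [GAC, APP_DIR]
    # Not on the list (scenario 1) or empty list (scenario 2):
    # only the host store is asked; no GAC, no application directory.
    return [HOST_STORE]


if __name__ == "__main__":
    framework = {"mscorlib", "System"}  # hypothetical list contents
    print(probing_order("System", framework))
    print(probing_order("MyLib", framework))
    print(probing_order("MyLib", set()))
    print(probing_order("MyLib", None))
```

The model makes the warning in scenario 1 easy to see: once a list exists, anything missing from it never reaches GAC or the application directory.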

IHostAssemblyStore

IHostAssemblyStore does only one thing: given an assembly reference, return the assembly. For multi-module assemblies, it also has an API to return the modules.

When we designed the interface, we wanted to be super clear about this: the host does not implement binding policies. It only returns the assembly, after CLR evaluates the binding policies. For completeness, when CLR calls the IHostAssemblyStore API, we give the host the following information, among other things:

1. The original assembly reference

2. The post binding policy assembly reference

3. The list of binding policies CLR applied

The host can decide whether to return the assembly based on this information. (For example, SQL Server 2005 does not want any publisher policy applied to assemblies in its store. If it sees that publisher policy was applied, it refuses to return the assembly.)

After the host returns the assembly, CLR does a ref-def matching check to make sure that the host does not cheat by returning an assembly with a different assembly identity.
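To make the division of labor concrete, here is a rough Python model of a host store that refuses requests with publisher policy applied, followed by the ref-def check on whatever the host returns. It is a sketch, not the hosting API: BindRequest, StoredAssembly, PolicyAverseStore, bind_through_host, and the "publisher" label are hypothetical names, and string equality stands in for the real identity comparison.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional


@dataclass(frozen=True)
class BindRequest:
    """Information handed to the host (hypothetical shape)."""
    original_reference: str
    post_policy_reference: str
    applied_policies: FrozenSet[str]


@dataclass(frozen=True)
class StoredAssembly:
    identity: str
    image: bytes


class PolicyAverseStore:
    """Host store that declines when publisher policy was applied."""

    def __init__(self, assemblies: Dict[str, StoredAssembly]) -> None:
        self._assemblies = assemblies

    def provide_assembly(self, request: BindRequest) -> Optional[StoredAssembly]:
        if "publisher" in request.applied_policies:
            return None  # refuse; the host applies no policy of its own
        # Look up by the post-policy identity, never re-evaluating policy.
        return self._assemblies.get(request.post_policy_reference)


def bind_through_host(store: PolicyAverseStore,
                      request: BindRequest) -> Optional[StoredAssembly]:
    """Ask the host, then verify the returned identity (ref-def check)."""
    assembly = store.provide_assembly(request)
    if assembly is None:
        return None
    if assembly.identity != request.post_policy_reference:
        raise ValueError("ref-def mismatch: host returned " + assembly.identity)
    return assembly
```

The point of the sketch is that the store only does a lookup and an accept/refuse decision; the policy evaluation and the identity check both stay on the caller's side.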

A few caveats about how CLR calls into the host:

1. Assembly Identity

In .NET Framework 2.0, the full identity of an assembly always includes processorArchitecture, but the majority of the time, assembly references do not include processorArchitecture. To make sure there is no ambiguity, we always give the full identity of the assembly to the host, including processorArchitecture.

Recall that when an assembly reference does not have processorArchitecture, CLR always searches the GAC in the following order: platform specific, MSIL, then without processorArchitecture.

Here we decided to do the same thing. This means a host will see three calls into the IHostAssemblyStore API: first with the platform specific processorArchitecture, then MSIL, and last with no processorArchitecture.

When you receive a request from CLR, make sure the assembly you return has the same processorArchitecture as requested. Otherwise you will receive a FUSION_E_REF_DEF_MISMATCH error when we do the ref-def matching check after you return the assembly.
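The call sequence for a reference without processorArchitecture can be modeled as below. This is a Python illustration only. The function names are hypothetical, identities are treated as plain display-name strings, and stopping at the first identity the host can satisfy is an assumption of the sketch rather than something stated above.

```python
from typing import Callable, List, Optional


def identities_to_try(reference: str, platform_arch: str) -> List[str]:
    """Identities passed to the host, in the order described above."""
    if "processorArchitecture=" in reference:
        # The reference already names an architecture: ask once.
        return [reference]
    return [
        f"{reference}, processorArchitecture={platform_arch}",  # platform specific
        f"{reference}, processorArchitecture=MSIL",             # then MSIL
        reference,                                               # then none
    ]


def ask_host(reference: str,
             platform_arch: str,
             provide: Callable[[str], Optional[bytes]]) -> Optional[bytes]:
    """Call the host once per identity until it returns an assembly.

    Assumption: the sequence stops at the first successful answer.
    """
    for identity in identities_to_try(reference, platform_arch):
        image = provide(identity)
        if image is not None:
            return image
    return None
```

A host that stores only an MSIL build would, under this model, decline the first call and answer the second; returning the MSIL image in response to the platform specific request is what leads to the mismatch error.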

2. Simply named assemblies

Simply named assemblies are allowed in the host provided store. But CLR ignores the version number of simply named assemblies when doing ref-def matching, and the host is not expected to implement the same logic. So we always pass version=0.0.0.0 for simply named assemblies to the host.

WARNING!!! In the CLR binding model, simply named assemblies must always be qualified with an application, and you cannot load a simply named assembly from outside the application directory. But in the hosting model, there is no way to associate a simply named assembly with any particular application. For this reason, you should not allow simply named assemblies in the host assembly store.

For example, if assembly Foo references a simply named assembly bar, then when bar is returned from the host assembly store, you have no idea whether it is the same bar that Foo was built against, or some random one other people have put into the store.

The only safe simply named assemblies in the host store are the ones that do not reference any other simply named assembly (that is, they only reference strongly named assemblies).

We have to allow simply named assemblies in the host assembly store, because our partner believes this is an important usability feature. But they do restrict them to what I described above.
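Two of the rules above, the version=0.0.0.0 normalization and the restriction on what a simply named assembly in the store may reference, can be written down as a short Python model. The AssemblyName class and both function names are hypothetical and exist only for this sketch; a missing public key token is used here to mean "simply named".

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class AssemblyName:
    name: str
    version: str
    public_key_token: Optional[str] = None  # None models a simply named assembly

    @property
    def is_simply_named(self) -> bool:
        return self.public_key_token is None


def identity_passed_to_host(reference: AssemblyName) -> AssemblyName:
    """Simply named references reach the host with version 0.0.0.0."""
    if reference.is_simply_named:
        return replace(reference, version="0.0.0.0")
    return reference


def references_only_strong_names(references: Tuple[AssemblyName, ...]) -> bool:
    """True when none of the given references is simply named.

    A host could run this over the references of a simply named assembly
    before accepting it into the store.
    """
    return not any(ref.is_simply_named for ref in references)
```

For instance, a simply named bar that references only strongly named libraries passes the second check, while one that references another simply named assembly does not.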

3. Partial assembly references

Just like CLR does not probe GAC for partial assembly references, we don't call the host for partial assembly references.

Assembly.LoadFrom/LoadFile/Load(byte[])

In the case where the host returns a list of non host store assemblies (that is, the host wants unknown assembly references to go through the host store), it is not clear how Assembly.LoadFrom/LoadFile/Load(byte[]) should be implemented. We made the simple decision and disabled them all.