Performance: One big assembly vs several small assemblies.

I frequently see the same question in microsoft.public.dotnet.framework.clr and microsoft.public.dotnet.framework.performance: which performs better, one big assembly or several small assemblies?

Strictly from a performance point of view, one assembly is always better than several assemblies. Each assembly load has a fixed overhead, and with multiple assemblies you pay that overhead several times.
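To make the arithmetic concrete, here is a toy cost model. It is only a sketch: the function name and both cost constants are made-up placeholders, not measurements of the CLR, and it assumes load cost splits cleanly into a fixed per-assembly part and a size-proportional part.

```python
def total_load_cost(assembly_sizes_kb, fixed_overhead_ms=5.0, per_kb_ms=0.01):
    """Toy model: every assembly pays a fixed overhead plus a size-proportional cost.

    Both constants are hypothetical placeholders chosen for illustration.
    """
    return sum(fixed_overhead_ms + per_kb_ms * size for size in assembly_sizes_kb)


# The same 1000 KB of code, shipped as one assembly or as ten.
one_big = total_load_cost([1000])
ten_small = total_load_cost([100] * 10)

# The size-proportional part is identical in both cases; the difference is
# exactly the nine extra fixed overheads paid by the ten-assembly layout.
extra = ten_small - one_big
print(one_big, ten_small, extra)
```

Whatever the real constants are, the gap grows linearly with the number of assemblies as long as the fixed part is not zero.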

At a minimum, the overhead of loading an assembly includes the following:
1. Finding the assembly.
2. The loader’s in-memory data structures that track the assembly.
3. Assembly initialization.

1. Overhead of finding the assembly.
For a .NET assembly, the probing rules are documented here. For a strongly named assembly, we apply policy, then probe the GAC, then probe the app directory. Applying policy means finding config files and parsing them. (This happens once per AppDomain, so the overhead is not always that large. Nonetheless, we always look for publisher policy, so that cost is always there.) If your assembly is not in the GAC, probing the GAC is wasted work for you. All of this means a lot of disk access. For a simply named assembly, we don’t apply policy and we don’t probe the GAC, so the overhead is smaller, but it is still there.
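The lookup order described above can be sketched as a small model. This is not how Fusion is implemented: the function name and step labels are hypothetical, and the early exit when the assembly is found in the GAC is an assumption added for illustration.

```python
def probe_steps(strongly_named, in_gac=False):
    """Return the ordered lookup steps this toy model performs for one assembly."""
    steps = []
    if strongly_named:
        # Policy (including the publisher policy lookup) applies only to
        # strongly named assemblies.
        steps.append("apply policy")
        steps.append("probe GAC")
        if in_gac:
            # Assumption: a GAC hit ends the search.
            return steps
    # Simply named assemblies skip policy and the GAC entirely.
    steps.append("probe app directory")
    return steps


print(probe_steps(strongly_named=True))   # policy, GAC, then app directory
print(probe_steps(strongly_named=False))  # app directory only
```

A strongly named assembly that lives in the app directory takes the longest path in this model, which is the wasted GAC probe mentioned above.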

2. Loader overhead.
The loader always has some overhead for each assembly, such as looking up the AppDomain to see whether the assembly is already loaded and registering the assembly in the AppDomain. All of this costs time and memory. For a .NET assembly, you pay this overhead three times per assembly: once in Fusion, once in the CLR loader, and once in the OS loader.

3. Assembly initialization.
Every assembly has some initialization cost. If you look at a .NET assembly’s import table, it has an entry pointing to mscoree!_CorDllMain. This method is executed every time an assembly is loaded. If your assembly is a Managed C++ assembly, it has its own DllMain to execute, and it may also need runtime fixups. If you use the C/C++ runtime library, that has its own initialization as well.

These costs add up when you have multiple assemblies.

There are other costs associated with multiple assemblies. Each assembly has its own metadata, which costs extra disk space. One assembly is also more likely to be laid out sequentially on disk than several assemblies are, so disk access time for several assemblies is likely to be longer.

Of course, there are many good reasons to want multiple assemblies. But from a strict performance point of view, one assembly wins, always.

Comments (9)

  1. Anonymous Coward says:

    These all seem like one time costs. Are there any runtime costs associated with multiple assemblies?

  2. Hans Jergen Ohff says:

    Give us back the linker so we can have assemblies built as .lib modules and we can static link.

  3. Ferris Beuller says:

    Some way t’ static link .net modules would be real damn supa’ fine. ‘S all


  4. Panos Theofanopoulos says:

    Does the AppDomain have a limit on the number of loaded assemblies it can handle (IIRC in the beta days it was 99)?

  5. I am a loader person, so this one time startup cost means the whole world to me;)

    Joel Spolsky complained about the linker, and Jason Zander rebutted him.

    Panos, I am not aware of any limit on the number of assemblies loaded in each AppDomain. At least in Fusion, we do not impose any limit.

  6. Brad Abrams says:
  7. Channel 9 says:

    While this posting doesn’t specifically discuss solution design, it does discuss the performance implications of single assembly vs. multiple assemblies.