Do not write in-process shell extensions in managed code


Jesse Kaplan, one of the CLR program managers, explains why you shouldn’t write in-process shell extensions in managed code. The short version is that doing so introduces a CLR version dependency which may conflict with the CLR version expected by the host process. Remember that shell extensions are injected into all processes that use the shell namespace, either explicitly by calling SHGetDesktopFolder or implicitly by calling a function like SHBrowseForFolder, ShellExecute, or even GetOpenFileName. Since only one version of the CLR can be loaded per process, it becomes a race to see who gets to load the CLR first and establish the version that the process runs, and everybody else who wanted some other version loses.
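The race can be sketched as a toy model (plain Python standing in for the hosting machinery; this is not the actual mscoree hosting API):

```python
# Toy model of the pre-.NET 4 rule: one CLR per process, and the first
# component to bind establishes the version for everyone else.
class ToyProcess:
    def __init__(self):
        self.clr = None  # no runtime loaded yet

    def bind_runtime(self, wanted):
        if self.clr is None:
            self.clr = wanted   # first loader wins the race
        return self.clr         # later callers get it, wanted or not

p = ToyProcess()
assert p.bind_runtime("v1.1") == "v1.1"  # extension A establishes v1.1
assert p.bind_runtime("v2.0") == "v1.1"  # extension B wanted v2.0 and loses
```

Whichever extension happens to load first dictates the runtime version for every managed extension that follows in that process.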

Update 2013: Now that version 4 of the .NET Framework supports in-process side-by-side runtimes, is it now okay to write shell extensions in managed code? The answer is still no.

Comments (42)
  1. c says:

    Wasn’t this one of the big reasons for the Reset?

  2. It would seem to me that desired CLR versions ought to be specified as a threshold, i.e., I need version GREATER THAN OR EQUAL TO N.N.N.

    [That assumes that there will never be any breaking changes to the CLR. Whether that’s a valid assumption or not I leave you to determine. -Raymond]
  3. Steve says:

    Is it just me or were some of those responses horrifying?  It seemed like people were thumbing their noses at two people who know a metric ton more about the inner workings of the CLR and the Windows Shell than anyone else on the planet, and yet they were trying to argue the point.  If it was all hypothetical, then that’s one thing… but I don’t believe it was.

    Raymond, is Rowland right?  Did MS release a RawImage thumbnail extension written in .NET?

    Steve

    [I don’t know. (I’m going to invoke my right not to answer questions I don’t want to, so don’t expect “I don’t know” or “I don’t care” answers to every question.) -Raymond]
  4. J says:

    "[That assumes that there will never be any breaking changes to the CLR. Whether that’s a valid assumption or not I leave you to determine. -Raymond]"

    Yes, it’s a valid assumption.  When .NET 2.0 had just entered public beta, my company released a .NET 1.1 program.  It didn’t work on 2.0.  The CLR team has no problem breaking stuff between major releases.

  5. Gabe says:

    Any change can be considered a "breaking" change, because it is possible to make a program rely on a specific behavior no matter how incorrect or obscure.

    For example, let’s say you have a CLR 1.1 shell extension Foo that does FTP, and it has a class called FtpRequest. When you run it on CLR 1.2, which implements its own FtpRequest class, Foo stops working because it gets the CLR’s FtpRequest instead of your own.

    Of course you could just put the class in a namespace (like you should have done in the first place) and recompile, but your customers can’t recompile. They already have the broken version and upgrading their CLR will break it no matter what.

    You could mark Foo as requiring CLR 1.1, but if the next extension to load requires CLR 1.2 then it will fail.

    The correct behavior in this case seems to be to just allow multiple versions of the CLR in a process. Unfortunately this means every single app with so much as a File Open dialog will end up loading every version of the CLR because one extension will be marked with v1.1, another will be v1.0, another will be v2.0, etc.

    Since that would be a disaster, it’s best to stay away from managed shell extensions.

  6. andy says:

    Is this the reason why the Windows Sidebar has a .NET object model (http://msdn2.microsoft.com/en-us/library/aa965853.aspx) but gadgets must be developed using HTML + scripting against this object model? Or did I miss something about Gadget development?

    Hopefully somebody at Microsoft is working to remedy this issue for the next release of Windows and/or .NET!

  7. andy says:

    "is Rowland right?  Did MS release a RawImage thumbnail extension written in .Net?"

    Yes, (s)he is correct. See

    http://www.microsoft.com/downloads/details.aspx?familyid=D48E808E-B10D-4CE4-A141-5866FD4A3286&displaylang=en

    Maybe some intern developed it, or somebody in their spare time. Anyway, it would be appropriate to put some kind of warning on that download page so customers become aware that installing this extension might cause incompatibility issues.

  8. Wang-Lo says:

    It’s so comforting to see that the famously successful methods developed over the past couple of decades for managing DLL version dependencies have been adopted without modification for managing the analogous CLR problem.

    -Wang-Lo.

    [I’m not sure why you’re congratulating me on something I had nothing to do with. -Raymond]
  9. Gwyn says:

    Raymond, I’m pretty sure that Wang-Lo’s comment was just dripping with sarcasm.

    [As was my response. My point is that if you have an issue with the CLR, complaining to me won’t accomplish anything. -Raymond]
  10. Cooney says:

    Gabe:

    For example, let’s say you have a CLR 1.1 shell extension Foo that does FTP, and it has a class called FtpRequest. When you run it on CLR 1.2, which implements its own FtpRequest class, Foo stops working because it gets the CLR’s FtpRequest instead of your own.

    That’s your fault for mucking around in someone else’s package :)

  11. Anthony Wieser says:

    Since only one version of the CLR can be loaded per process, it becomes a race to see who gets to load the CLR first and establish the version that the process runs, and everybody else who wanted some other version loses.

    I was worried what this might imply, and found that on the referenced link at the top, you pointed out:

    The key point is that you have to avoid injecting the CLR into processes that aren’t expecting it. COM is just a conduit for the injection. Don’t focus on the COM part; focus on the injection part. If all the processes that use your COM object are expecting the CLR (e.g. because you wrote them) then there’s no problem since there is no injection.

    So, if I read you correctly, building generally consumable in-process COM objects with managed code is a potential no-no.  

    I’m not fully up to speed with .NET, but I believe there’s a COM wrapper you can use around a .NET interface.  Do these run out of process to circumvent the potential conflict?

  12. Dean Harding says:

    Do these run out of process to circumvent the potential conflict?

    No, it’s exactly this wrapper that causes the problems.

    I’ve been toying with writing an out-of-process shell extension plugin for a while (like with IIS & ASP.NET). The extension would just pass requests to a "worker" process to handle the actual work. You could then have one worker process per version of the CLR that you needed. It should work for most extensions (property pages being the most difficult one) but I’m not sure what the performance would be like…
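    The broker idea sketches out roughly like this (a toy model with hypothetical extension names; the real IPC and process management are omitted):

```python
# Toy model of the broker: a tiny native stub in the host process forwards
# requests to a worker keyed by the CLR version each extension needs, so no
# CLR is ever injected into the host itself. (Hypothetical extension names;
# real IPC and process management omitted.)
workers = {}  # CLR version -> extensions hosted in that worker process

def load_extension(name, clr_version):
    workers.setdefault(clr_version, []).append(name)
    return clr_version

load_extension("FooThumbnailer", "1.1")
load_extension("BarOverlay", "2.0")
load_extension("BazMenu", "1.1")
assert sorted(workers) == ["1.1", "2.0"]               # one worker per version
assert workers["1.1"] == ["FooThumbnailer", "BazMenu"]
```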

  13. John Elliott says:

    This came up where I work, trying to write an MFC app that called a .NET component using the COM wrapper – and the component only worked on CLR 1.1, not 2.0. The eventual solution we reached was for the MFC program to load CLR 1.1 using CorBindToRuntime before instantiating the component.

  14. John Stewien says:

    I like Dean Harding’s idea. I’m thinking this is a problem because shell extensions are done through DLLs. The alternative, though, is that you have to run all your shell extensions as services which then communicate with the shell. I would think the Singularity team at Microsoft Research has been looking into this, as the OS they have developed doesn’t support DLLs.

  15. The versioning problem is unavoidable for in-process objects anyway, since a function can only have one definition (and two-level namespaces, ala OS X, are a cure worse than the disease). We run into the same problem in unixland with multiple incompatible libstdc++ versions in programs with plugins.

    The real WTF here is that shell extensions are loaded into every process that happens to use the shell namespace. Why wasn’t it designed to talk using IPC with some shell daemon from the get-go? It’s like the Unix NSS problem magnified a thousand-fold. If code injection is the answer, you’re probably asking the wrong question.

    [Remember, Windows 95 had to run in 4MB of memory. (Besides, imagine how many services would be running if every shell extension were a service. And imagine the new security vulnerabilities.) -Raymond]
  16. John Stewien:

    There are two traditional uses for shared library:

    A) Sharing code for efficiency’s sake (e.g., libc or msvcrt), and

    B) Loading potentially changing modules into a process (shell namespace extensions, unixland NSS, winamp plugins, etc.)

    While A, with proper versioning, is a huge win, I think that B often brings more trouble than it’s worth. Every modern OS uses that architecture at some level, but I don’t see why. Would using IPC really cost that much more? I can see a case for B in plugins that are tightly integrated into a host program’s GUI, but even that would be manageable with some kind of out-of-process COM trickery, I imagine.

  17. Dean Harding says:

    > every shell extension were a service

    That’s not how I would do it. My extension would essentially be a “framework” that developers would write their .NET extensions against. I’d only load one worker process per version of the CLR that you needed – each extension would simply reside in an AppDomain of that worker process.

    Obviously, it wouldn’t be a service, either. The worker process would just run under the same account as the launching explorer.exe.

    [Daniel appeared to be applying this principle globally, not just to managed extensions. -Raymond]
  18. Cooney says:

    Remember, Windows 95 had to run in 4MB of memory. (Besides, imagine how many services would be running if every shell extension were a service. And imagine the new security vulnerabilities.) -Raymond

    What security vulnerabilities? You’re running services on the box; since they expect to be explorer extensions, they’re only accessible from that box. The question is how to partition something like that when multiple users are on the box – you may end up with some sort of CORBA-like beast.

    All I can see from this thread is that C# isn’t nearly as handy as straight C++. Wouldn’t it be nice if we had some lightweight messaging interface? You could write an extension as out-of-proc and have the CLR and process issues handled automatically, while designing APIs that didn’t drag ass in large directories and other extreme cases. Of course, this is complicated, and wouldn’t fly too well when the guy writing the extension is at the level of a typical VB6 programmer.

  19. HS says:

    "Yes, it’s a valid assumption.  When .NET 2.0 had just entered public beta, my company released a .NET 1.1 program.  It didn’t work on 2.0.  The CLR team has no problem breaking stuff between major releases." – J

    I think they have no qualms about breaking compatibility because the CLR is versioned.

    You can’t run your 1.1 program on CLR 2.0, but on the other hand, you can install CLR 1.1 on the same machine as CLR 2.0; both will co-exist, and programs written for either version can run, each loading the version of the CLR it needs.

  20. Doug says:

    Originally, the CLR team specifically denied any claim that future versions of the CLR would be backwards compatible. Instead, IIRC, they claimed that all versions of the CLR would work side-by-side.

    Now, this solution isn’t perfect, and nobody wants 10 versions of the CLR installed, but it seems like it was probably the best solution for 1.0, 1.1, and 2.0, as .NET is changing very quickly and compatibility issues are certainly going to be a problem.

    Even so, strong backwards compatibility was an important goal for 1.1 and 2.0. Neither one has perfect back-compat, but they’re not terribly bad, either. Of course, it only takes one back-compat issue to make YOUR app (the most important app in the world) not work.

    For now, it isn’t terribly outrageous to have 1.0, 1.1, and 2.0 installed, allowing you to bypass most back-compat issues (assuming nobody does something silly like load two different .NET DLLs with different requirements into the same process).

    In the future, a whole new CLR for each revision becomes less maintainable. In addition, as more components want to share the process space, cross-CLR compatibility will become more important. And as the platform matures, this will become more possible.

    For example, notice that .NET 3.0 uses the same execution engine as .NET 2.0, so it can share the same process space. I’ve heard that other, less visible changes have also been made under the covers that will allow this to continue even with changes to the execution engine.

    In any case, while .NET hasn’t been perfect, it is not fair to say they didn’t learn anything. It is leagues ahead of plain-vanilla DLLs and COM in versioning strategy, and avoids so many versioning issues in both the framework and in end-user apps.

  21. Michiel says:

    The root question is: why is the CLR linked against the managed executable using what is in effect a process-wide namespace? Because if multiple CLRs could co-exist, an extension needing CLR 1.0 would link against that CLR, and another could link against CLR 2.0. That doesn’t sound hard: if a managed app wants symbols from CLR 1.0, have the compiler simply prefix every symbol with CLR10; if from 1.1, CLR11.

    Of course, you would still end up with a big Shell that way, but that is probably not too important. If it is important (1000+ customers or so), you’d be looking at stuff like LIBCTINY, not .NET.

  22. Stu says:

    “Remember, Windows 95 had to run in 4MB of memory. ”

    So? Vista requires at least 512MB. The .NET Framework is not even available for Windows 95, so it is irrelevant to this discussion.

    The ‘proper’ solution to this problem would be something like Dean Harding is suggesting, only supplied as standard with, say, .NET 3.2, and made the recommended way to create shell extensions, with a couple of templates included in the next iteration of Visual Studio.

    Of course, this solution would have violated Windows 95’s holy 4MB, but nobody cares about that anymore.

    And when will Microsoft seriously start to ‘dogfood’ .NET?

    So far the only ‘serious’ commercial applications written in .NET released are the hardly-heard-of-let-alone-used Microsoft Expression Suite.

    Come on! How about at least re-writing the Windows Accessories in C# or porting Visual Studio to .NET?

    Maybe then Microsoft will find more of these ‘limitations’ of .NET and have more incentive to fix them, rather than treating .NET developers as second-class citizens (except when it comes to new UI stuff).

    [The original comment was that the shell extension mechanism should have been based on out-of-proc servers. That would not have been practical in 4MB. Therefore, the shell extension mechanism is in-proc. -Raymond]
  23. c++ 4ever says:

    Load .net framework multiple times? Stupid architecture.

  24. BryanK says:

    Daniel Colascione:

    A) Sharing code for efficiency’s sake (e.g., libc or msvcrt), and

    B) Loading potentially changing modules into a process (shell namespace extensions, unixland NSS, winamp plugins, etc.)

    How about C) Being able to fix a security vulnerability in one library (e.g. zlib) and have it applied to every process that uses that library, across the whole system.  Maybe that’s not a "traditional" use of libraries, but it’s sure a good use.

  25. useless neway says:

    Remember, Windows 95 had to run in 4MB of memory.

    That’s off topic because .net fw doesn’t run in win95

    .NET Framework 1.1 cannot run Compact Framework 1.0 apps, even though the .NET FW is supposed to be a superset of the CF.

  26. jachymko says:

    Burak KALAYCI:

    Yeah, but it didn’t refuse to start on a 4MB box. If it had out-of-process shell extensions, you wouldn’t be able to start the shell at all.

    I pity you if you make all your decisions this fast. Managed code makes lot of tasks easier. It has some drawbacks you need to be aware of, just like everything else.

    Stu:

    [cite]So far the only ‘serious’ commercial applications written in .Net released are the hardly-heard-of-let-alone-used Microsoft Expression Suite.[/cite]

    Have you heard of SQL Server, Visual Studio Team System, Office Server, BizTalk Server… Microsoft has a lot of quite serious .NET-based products. There is no point in rewriting legacy code just for the rewrite’s sake.

  27. > yeah, but it didnt refuse to start on a 4MB box.

    Yes it did start, I have to credit that.

    I remember something like the base memory address for executables was at 4MB by default, and W95 had to fix up all the jumps before running them on a 4MB box, and that caused extra slowness. I always had the impression that it wasn’t really designed for 4MB…

    > I pity you if you make all your decisions this fast.

    Fast? I made my decision around May 1990. It was between 80×86 assembly and gwbasic. I feel no remorse.

    Best regards,

    Burak

    [? Non sequitur. Physical address space is not linear address space. -Raymond]
  28. > Non sequitur. Physical address space is not linear address space.

    I may be totally wrong. What I was talking about is the image base set by linker being at 4MB by default causing extra relocation effort when loading on a system with just 4MB memory. I had read it somewhere, I’m not sure if it really makes sense (or if I remember it correctly).

    Best regards,

    Burak

    [The images are loaded at their preferred linear address of 4MB. No relocation is needed. I shouldn’t need to explain this. I assume my readers know this stuff already. If I had to keep explaining the basic stuff I’d never get around to discussing the advanced stuff. -Raymond]
  29. Guess I was talking about the following,

    http://msdn2.microsoft.com/en-us/library/ms809762.aspx

    ‘In executables produced for Windows NT, the default image base is 0x10000. For DLLs, the default is 0x400000. In Windows 95, the address 0x10000 can’t be used to load 32-bit EXEs because it lies within a linear address region shared by all processes. Because of this, Microsoft has changed the default base address for Win32 executables to 0x400000. Older programs that were linked assuming a base address of 0x10000 will take longer to load under Windows 95 because the loader needs to apply the base relocations.’
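    The extra work described in that quote is a pointer fix-up pass; a toy sketch with hypothetical addresses (not a real PE parser):

```python
# Toy sketch of base relocation (hypothetical addresses, not a real PE
# parser): when an image can't load at its preferred base, the loader adds
# the delta (actual - preferred) to every absolute address in the fix-up list.
preferred_base = 0x10000   # old default for EXEs, per the quote above
actual_base = 0x400000     # where Win95 actually loads it
delta = actual_base - preferred_base

absolute_addrs = [0x11000, 0x12345]        # pointers linked against the old base
relocated = [a + delta for a in absolute_addrs]
assert relocated == [0x401000, 0x402345]   # every pointer patched -- the slow part
```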

    Turns out my memory is not as good as I’d have liked.

    Best regards,

    Burak

  30. "Remember, Windows 95 had to run in 4MB of memory. "

    I remember that! I’d say it ‘crawled’ with 4MB. 8MB was the real minimum at which it managed to ‘run’ (though I haven’t tested any memory size in between).

    My take: Don’t write *anything* in managed code.

    Best regards,

    Burak

  31. Dean Harding says:

    I made my decision around May 1990

    How did you choose not to write apps in .NET in 1990?

  32. Random Reader says:

    Gabe,

    > For example, let’s say you have a CLR 1.1 shell extension Foo that does FTP, and it has a class called FtpRequest. When you run it on CLR 1.2, which implements its own FtpRequest class, Foo stops working because it gets the CLR’s FtpRequest instead of your own.

    The class name resolution rules, at minimum, also include a reference to the containing assembly.  The CLR knows FtpRequest in your assembly is not the same class as FtpRequest in any other assembly, so this situation cannot occur.

    Michiel,

    > The root question is: why is the CLR linked against the managed executable using what is in effect a process-wide namespace? Because if multiple CLRs could co-exist, an extension needing CLR 1.0 would link against that CLR, and another could link against CLR 2.0. That doesn’t sound hard: if an managed app wants symbols from CLR 1.0, in the compiler simply prefix every symbol with CLR10. If from 1.1, CLR11.

    The CLR is not a library that is linked to an executable; it is a runtime.  On newer versions of Windows, a flag in the managed .exe header instructs the OS loader to launch and initialize the CLR, which then loads and executes the managed code from the .exe.  (On older versions of Windows, a native-code shim in the .exe’s entry point launches the CLR instead.)

    Because it is a full runtime, which manages detailed characteristics of the low-level process environment, it is not possible for it to play well with any other code that wishes to do the same thing.  Consider the Garbage Collector, which needs to understand memory usage in the process, in order to correctly self-tune.  Two such GCs would end up fighting each other, much as two processes that try to efficiently use all available RAM on a server would behave.  (There’s an old blog post somewhere about, IIRC, IIS and SQL Server that describes a similar case.)

    Or for something more high-level, consider the WinForms part of the Framework.  Naturally it requires a message loop for a managed thread, and there can be only one main message loop (broadcasts, non-window messages, etc).  How are WinForms 1.x and 2.x supposed to both have a main message loop at the same time on the same thread?

    So it’s not just a simple symbol naming issue :)

  33. Gabe says:

    RandomReader, here’s what I said: "Of course you could just put the class in a namespace (like you should have done in the first place) and recompile, but your customers can’t recompile."

    The CLR will only know that your main assembly is asking for a class called "FtpRequest". Since this putative app is not directly referencing Foo.FtpRequest, the CLR will look for FtpRequest and find its own version first.

    Like I said, there are trivial steps to take ahead of time (strong names, namespaces, etc.). If you didn’t do this when compiling your apps the first time, though, you’re in trouble if your client only has the old version of the app and the new version of the CLR.

    In the case of a shell extension, you don’t get to choose which version of the CLR to load, the app or the first managed shell extension does. The only way to guarantee it working in this case is to send out a recompiled version using properly bound names.

  34. John Stewien says:

    Gabe,

    shouldn’t people be using signed assemblies on released products, and thus not have this problem?

    Also isn’t the take home message of all this "Don’t use managed code in shell extensions"? Managed code should just be used for fancy GUI stuff and web apps surely. Is C/C++ really that hard? It’s all I used up until 5 years ago.

  35. Random Reader says:

    Gabe,

    > The CLR will only know that your main assembly is asking for a class called "FtpRequest". Since this putative app is not directly referencing Foo.FtpRequest, the CLR will look for FtpRequest and its own version first.

    That’s what I’m saying is incorrect: if your app’s assembly name is Foo (normally filenamed Foo.exe as well, but that isn’t relevant), all references to FtpRequest will in fact be to FtpRequest in assembly Foo.

    Explicit namespaces are necessary for organizational reasons, but not for type resolution.

    This is the same concept as for native executables: the PE import table references a specific DLL, not just a symbol from anywhere at random.
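    A toy model of that resolution rule (hypothetical class names, not real CLR metadata):

```python
# Toy model: a CLR type reference names both the assembly and the type, the
# way a PE import names both the DLL and the symbol, so Foo's FtpRequest can
# never be shadowed by a same-named class elsewhere. (Hypothetical names.)
types = {
    ("Foo", "FtpRequest"): "Foo's FTP class",
    ("System", "FtpRequest"): "the runtime's class",
}

def resolve(assembly, type_name):
    # keyed by (assembly, name) -- never by the bare name alone
    return types[(assembly, type_name)]

assert resolve("Foo", "FtpRequest") == "Foo's FTP class"
assert resolve("System", "FtpRequest") == "the runtime's class"
```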

  36. > How did you choose not to write apps in .NET in 1990?

    I will give a short answer as this is quite off topic.

    At the time, there was gwbasic (interpreted and managed) and the Assembly option (with Debug for .COM files); both came free with DOS, on my 12MHz 286 with 1MB RAM and a 40MB hard disk.

    I simply realized that, for my programming pleasure, *I* had to ‘manage’ all aspects of what I write as much as possible. It’s like the manual vs. automatic transmission preference (in cars)…

    Best regards,

    Burak

  37. Eric Smith says:

    What about Office add-ins?

    I’m having trouble reconciling the discussion here with the apparent encouragement to write managed add-ins for Office represented by the existence of VSTO. Doesn’t the same problem exist there (although without the file dialog issue)? Is VSTO only practical in completely controlled environments where all add-ins are guaranteed to use the same CLR?

    Sorry, I know this is straying off-topic for Raymond’s expertise; you’ve got a knowledgeable community here, though.

  38. Dean Harding says:

    Eric: Office add-ins are a little different. The problem with explorer extensions is that your extension gets injected into every other process that tries to open a file-open dialog (or other things). Office add-ins are only run in the Office executable (well, usually – I’m not sure what happens with OLE in this case).

    When you call a CCW (that is, a .NET object implementing a COM interface) from a native application, the infrastructure (by default) loads the most recent version of framework that’s installed. So Office will typically be running .NET 2.0. This is not a problem for most .NET 1.1 add-ins, which will mostly run quite happily under .NET 2.0.

    With explorer, it doesn’t always work like that. If you have a process that has ALREADY loaded the 1.1 framework, then it tries to open a file-open dialog and load your .NET 2.0 extension… bang!

  39. KJK::Hyperion says:

    Eric, Dean: Office components always run out-of-process when started through COM, so it’s not an issue in that case

    Daniel, Cooney: I’m always surprised at the lengths people will go to try and contradict Microsoft’s design choices, often apparently for the heck of it. The "try to fix this Windows bug" challenge was especially eye-opening

    To everyone else: don’t mix .NET and native code, it’s a terrible practice. Do it only if forced to do so, but don’t incorporate it in an architecture developed from scratch

    It’s inefficient and terribly embarrassing to the runtime, which never has the slightest idea how to handle native calls – will it corrupt the managed heap? will it close one of my internal handles? will it violate security policies? etc. Internal framework functions are not P/Invoke, for a good reason – their C++ implementations are so filled with static code-checking annotations (pssst, the .NET sources are public) as to virtually offer the same guarantees as managed code.

    Nevertheless, if you absolutely have to, make it so managed calls native and not the other way around. Native code rarely owns a whole process (notable exceptions are DBMSs, in fact .NET 2.0 was specifically tuned to be hosted by SQL Server) the way a runtime does, so despite the huge inefficiency it’s not quite as dramatic as pulling the runtime in a random process

  40. AndyB says:

    To everyone else: don’t mix .NET and native code, it’s a terrible practice

    I thought it just works.

    Native code rarely owns a whole process

    Really? I wonder what all these exe’s I’ve been writing for years are.

  41. Dean Harding says:

    Office components always run out-of-process when started through COM, so it’s not an issue in that case.

    Are you sure about that? I use both Lookout and NewsGator in Outlook, both of which are .NET add-ins and both of which run in-process. Perhaps they don’t use VSTO, though…

  42. Viewed as a data flow component, a property handler has a single file stream input and outputs a one

Comments are closed.