Comments on “Please Sir May I Have a Linker?”


Several folks forwarded to me Joel Spolsky’s post “Please Sir May I Have a Linker?”, in which he outlines the problems with distributing the .NET Framework redist and calls for a simpler solution:  a linker that could take your managed application and produce one atomic exe combining only the frameworks and engine components you needed to run.  Deployment would then be simpler and smaller, config issues would be far less worrisome, and you wouldn’t need to track tons of versions of the CLR coming out.  It’s an appealing idea. Such technologies have been around for 50 years now, and we’ve played with some of them in our own space for prototyping purposes.

But do you really want a linker?


As is usually the case with this kind of technology there are lots of pros and cons.  Joel’s post covers most of the pros so I won’t repeat them.  I admit I have been tempted to write such a tool myself.  I was the lead developer for the Metadata team in V1, wrote the file format spec (along with SusanRS), and helped design the CLR portion of the MC++ compiler (“IJW”; yes, I own my fair share of blame for that damn loader lock bug, and Whidbey is on its way).  Although I don’t code full time any more, I know I could create such a tool and it wouldn’t even be that hard.  I personally would use it to isolate our build tools from interim breaking changes because compiling yourself with an older version of yourself can get tricky.


But consider the down sides:


Intellectual Property – There will be people left, right, and center on this one.  I don’t want to provoke that debate with this post.  Suffice it to say it would have to be thought out.  Honestly it isn’t even the worst of my worries at least for the core engine — we already give away a lion’s share of the code through Rotor (SSCLI).  Let’s acknowledge it’s an issue and move on.


Working Set – Say this kind of tool was wildly successful, and the majority of applications out there using the CLR wound up deploying this way.  Each of those processes would wind up with its own copy of the code used to run itself, at its own address ranges, because it’s a linker’s job to merge pieces into a new thing.  There would be no sharing of pages whatsoever between processes.  This would drive up system-wide working set, making all of us want to go to Fry’s for more memory, cursing on the way there about what a pig Windows had become now that the CLR is so popular.  There are legitimate working set issues we are addressing now; this makes a tough job even harder.
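
The working-set argument can be made concrete with a little arithmetic. The sketch below is illustrative only: the function names and the megabyte figures are hypothetical, not measurements of the CLR. It assumes that pages of a shared runtime image are counted once system-wide, that a statically linked copy is counted once per process, and that `runtime_mb` stands for whatever portion of the runtime code a process actually touches.

```python
def shared_working_set_mb(processes, app_private_mb, runtime_mb):
    """Runtime pages come from one shared image, so they are counted once."""
    return processes * app_private_mb + runtime_mb


def linked_working_set_mb(processes, app_private_mb, runtime_mb):
    """Each statically linked exe carries its own private copy of the runtime code."""
    return processes * (app_private_mb + runtime_mb)


def extra_cost_mb(processes, app_private_mb, runtime_mb):
    """Additional system-wide working set caused by giving up page sharing."""
    linked = linked_working_set_mb(processes, app_private_mb, runtime_mb)
    shared = shared_working_set_mb(processes, app_private_mb, runtime_mb)
    return linked - shared


if __name__ == "__main__":
    # Hypothetical figures: 10 managed processes, 4 MB private each, 8 MB of runtime code.
    print(extra_cost_mb(processes=10, app_private_mb=4, runtime_mb=8))
```

Under these assumptions the extra cost grows as (processes - 1) × runtime_mb, so it only becomes painful once many linked applications run at the same time.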


Servicing – I have a love/hate relationship with this one.  I hate the fact that there would be no way for me to ship bug fixes for code that may be making the system more unstable than it should be.  But as an app writer I love the fact that fixes other people apply to my machine don’t muck up my perfectly working application.  You now understand why I am so conflicted about technologies like the GAC.  Call this one a wash.


Security – But here’s the real kicker: security.  Put aside stereotypes and flames you might be tempted to hurl for a minute and think rationally about the problem.  Say you used a tool like this and produced a little P2P file share utility à la iTunes sharing music between computers.  You have this program resident on the Start bar on all your machines, listening on a port for any friends to come along.  Now along comes a virus that Microsoft needs to patch in a hurry.  How do I do that?  You’ve statically linked the code with the defect I need to patch into your P2P app and who knows how many other such apps?  The vulnerable app might have simply been copied to the disk, making it harder to find.  How do I go patch that thing and not leave your machine vulnerable?  Such issues already exist for template libraries, where updates can only occur through recompilation by anyone who has ever written and deployed a program using them.


There are more potential cons but you get the idea.  We could have a good and spirited debate about mitigation strategies, how in a perfect world security bugs could never exist, and you could point out that Lawn Darts were only dangerous for those who didn’t know what they were doing.  There may even come a point where enough mitigation techniques can be brought to bear that I could be convinced it was OK to do this.  But I feel there is a lot of risk associated with such a tool, and even though it is appealing in a number of ways, it isn’t necessarily the right thing to do.

Comments (48)

  1. Louis Parks says:

    The difference between this post and Mr. Spolsky’s is that you respectfully approached the issue and he did not. Thank you for explaining the problem and separating out the emotion and politics associated with this issue. Posts like this help to explain why I read you and why I don’t read Mr. Spolsky.

  2. Paul Wilson says:

    Excellent explanation.

  3. It’s still an issue in statically linking C++ libraries, yet this is still common practice on many platforms. While I understand the security argument, it’s certainly not the trump card that you’ve played it as.

  4. Pavel Lebedinsky says:

    >It’s still an issue in statically linking C++ libraries, yet this is still common practice on many platforms.

    Let’s say a .NET app uses things from System, System.Windows.Forms, System.Xml etc, and statically links to all these assemblies. In the non-managed world that would be equivalent to statically linking to CRT, MFC, ole32, msxml etc. This is definitely not common practice (actually only CRT and MFC can be statically linked, and even that is strongly discouraged, for the reasons stated above).

    Linking in a few functions from a library might be OK but statically linking to the entire runtime is a whole different thing.

  5. There was one other point that seemed fairly obvious to me.

    He suggested that his programs would be around 5/6 MB each. The .Net framework is around 22MB. That is 3-4 times the size of his exe. So if he released 3 versions (or even patches) for any one version of .Net then the download of the .Net runtime becomes insignificant.

    Add to that the fact there will be MANY applications from other vendors all being 5/6 MB or even more and you’ll very quickly be over 22MB.
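
The break-even point in the comment above can be sketched as follows. This is a rough illustration using the comment's own figures (a 22 MB runtime, 5 to 6 MB per linked executable); the function name is hypothetical, and like the comment it treats the whole linked executable as the cost, ignoring how large the application would be without the linked-in runtime.

```python
import math


def apps_to_exceed_runtime(runtime_mb, linked_app_mb):
    """Smallest number of linked-app downloads whose total size exceeds one runtime download."""
    if linked_app_mb <= 0:
        raise ValueError("linked_app_mb must be positive")
    return math.floor(runtime_mb / linked_app_mb) + 1


if __name__ == "__main__":
    for size_mb in (5, 6):
        count = apps_to_exceed_runtime(22, size_mb)
        print(f"{size_mb} MB per app: {count} downloads exceed the 22 MB runtime")
```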

  6. Luc Cluitmans says:

    Funny, this is an issue I have been playing with the past few weeks. My problem is slightly different from what is portrayed here, but could be solved using the same solution.

    I write lots of little tool programs (mostly commandline tools) for internal use in the research institute where I work. These tools tend to reuse some fragments of code I wrote before (e.g. commandline option handling). I really want to distribute these apps not via an installer, not via XCopy deployment, but via Copy deployment: just copy the one single executable file somewhere on the path and it works.

    The problem is that in the .NET framework, when you develop modularly, you also have to deploy modularly. What I miss, more than just a linker, is support for a .Net equivalent of ‘static libraries’, allowing modular development, but monolithic deployment.

    Unlike suggested in the original message, I have no need to link in any parts of the .Net framework; requiring the .Net framework to be installed on the client machine is perfectly acceptable. What I need is a way to fuse my own exe assembly with my own dll assemblies. And yes, for now it is perfectly ok if that just works with purely managed assemblies (excluding support for Managed C++).

    Note that the security argument in the original mail doesn’t really apply to this scenario anymore.

    Just for information, I have been playing with a few avenues to get a monolithic exe.

    – One way is to use the ‘source linking’ option in VS.Net: When ‘adding an existing item’ to a C# project, pay attention to the dropdown arrow of the OK button of the dialog box, and change it to ‘Link’. Just refer to the ‘library’ source files for each project you want to use the ‘library’ code. This is far from perfect, and may cause maintenance pains, but it works.

    – Another way is using Ildasm on all compiled assemblies, doing some voodoo to glue the .il files together, and next using ilasm to create a monolithic executable.

    – Yet other ways involve simulating a monolithic executable: package the library dlls into the executable as resources, or use your homebrew methods to append them to the executable (similar to self-extracting archives), and do some voodoo involving the AppDomain.AssemblyResolve event and one of the Assembly.Load(byte[]) methods to load the dll from your ‘archive’ instead of from a file.

  7. I for one would rather see a system that could piecemeal install the framework and perhaps assemblies that exist in the gac, something we could link into our apps directly.

    Pseudocode(native?) like:

    if (!clrIsInstalled)
        DoClrInstall();
    if (!neededAssembliesInstalled)
        DoAssemblyInstall();

    or what have you.

    There are obviously some problems with this, the major one being it would likely be *A LOT* more work than linking assemblies together. It would also require some kind of server side support (a distribution system), etc., and would really be a fair sized installer system that can be patched into an executable. It’d be quite a bit better than 5 or 6 meg assemblies though, I would suspect.

    The upside is, of course, that most of the issues with the linker are circumvented (IP may be an issue still, and it does open new security issues: what if the *installer* code has an exploit?).

  8. Jason Zander says:

    With respect to security being the "trump card", Pavel is right on here. You have to consider many factors when using this kind of technique, including (a) what scenarios will the code be exposed to and how much damage could it do in those scenarios, (b) what is the surface area of the code I will allow to manifest itself elsewhere, and (c) if I were exposed to a serious security issue, how would I react and protect my users? Since I own the CLR, my job is to try and make sure our code can’t enable any nightmare scenarios, hence my caution.

    Steven – Good point on the aggregated overhead to the network at large.

    Luc – I know precisely what you are referring to here. There are engineering advantages in going this route: (a) you can limit your /r’s while compiling your code to avoid picking up an even larger set of dependencies, (b) putting together smaller dll’s that are related into fewer larger dlls can eliminate extra OS loader overhead and system working set, and (c) the scenario you mention. If you squint at a managed dll it kinda looks like an .obj file doesn’t it? This is actually the technology I was referring to above that we had prototyped in the past. Take a look at this PDC deck {http://www.gotdotnet.com/team/pdc/4076/tls401.ppt&e=7421} (scroll to "MSIL Linking") and I think you’ll be very happy to hear this is on the way! This technology is very useful in this way once you’ve vetted it with the checklist I mentioned.

    Daniel – I’m curious how close the new ClickOnce technology in Whidbey comes to your scenario? It doesn’t really solve the .NET FX download (you’d have to add your own unmanaged shim). But it does strive to bring updates/missing files from a central server to your local machine for execution. {http://longhorn.msdn.microsoft.com/lhsdk/ndp/cpovrclickoncedeploymentoverview.aspx}

  9. Julian Gall says:

    The issue here will go away in two or three years when most users have the framework installed. However, at present one of the main decisions facing a software developer is: "Can I afford to develop in a .NET language when this means I will lose a certain number of customers who are not able or willing to download 21MB just to try something out?".

    I am faced with this choice. I am developing an app. that will be launched from a single button on the IE toolbar. It has one dialog box and performs a very simple function. I want to use C#, which I know, but I may have to go back to C++ (which I don’t like much or know as well) so that the download size doesn’t put potential customers off.

    Your explanations for a linker being a bad idea are very clear and make a good case. Is there another solution then? Could MS divide the framework into 2MB chunks and download a bit with each of the next security patches? Could it be "download on demand" for the parts of the framework that an app uses? I know those are daft ideas but is there no choice but to wait until everyone has the framework?

    Joel’s secondary point is that we have already had two versions of the framework. V1.1 wasn’t a minor patch to V1.0, it was a complete new 21MB. How often is this going to happen?

  10. Jason: I am curious exactly how far ClickOnce goes; I’ve yet to have had the time to play with it. My initial understanding was that ClickOnce was more oriented towards web delivery, so I’ll have to read up on it a bit more.

    Although, while it apparently does provide necessary assemblies, patches, whatever, as you noted it doesn’t help with the framework or runtime installation. An interim solution for installing the runtime itself piece by piece would be of value, just well beyond my skillset to achieve. Similarly an MS hosted distribution point for the framework would be important.

    In regard to Julian’s point, a download on demand system is much of what I was talking about. You’d need enough native code to check for the JIT, GC, and other core services to get the code up and running. We even get the luck of not having to consider portability (a .NET exe isn’t going to run on anything but Windows *without* a CLR being installed). However I don’t even begin to understand what it would take to dynamically check for and then install the framework core components (try to bind to mscoree.dll, and if that fails initiate an install?). Although this is an area that ClickOnce should have considered (and probably did). To really achieve this Microsoft would have to create a much more miniature packaging system for the framework, something that includes only the core engine, not the libraries or asp.net or compilers, etc.

  11. Jason Zander says:

    Good questions. Right now of course we have v1.0 and v1.1. We are working on the Whidbey release, which you saw at the PDC; besides a new Visual Studio, it is the power behind Yukon’s SQL/CLR integration. And then finally we will have a version of the CLR that runs Longhorn (also seen at the PDC). Right now those last two are built from the same tree/source. Each of these new builds is (or will be) its own thing with a separate redist. We actually started supporting parts of some XP SKU’s and Windows 2003 with v1.1, so it comes with the OS in those cases. You can expect us to keep going that route.

    We’re starting to veer into deployment and app compat which is worthy of some extra details in and of itself. Let me write something up a little more thorough than this edit box will allow and post that (stay tuned). One bottom line parting thought until then: we want people to write managed code on today’s runtime, and we will do everything in our power to make that investment easier to deploy over time and work on the newer versions as they come out.

  12. MartinJ says:

    You know, I’m not so much worried about a linker for the actual runtime. I’d like a linker for my own code. I don’t want to have to include the source code files in multiple projects. Yet, I don’t want to include references to many assemblies. I’d like the happy medium where my utilities can live somewhere in limbo and get brought into my assembly at the IL level.

    That way, I don’t have to remember which file has the one method/class that I want to include in my project. You know, the one that does that thing like back in the day. What’s it called? Oh yeah. Nope. Let’s look over there. Nah…

  13. Jason Zander says:

    Carlos – You should separate the architecture from current implementation. There is nothing inherent about MSIL binaries that disallows the kinds of features you are talking about. If you were willing to forgo richness like cross-assembly inlining of methods, reflection, and other data-described technology that requires the IL and/or metadata, then you could strip all of the above. Check out that MC++ PDC deck I included a link to above for more details about the Whidbey product, which in fact has many new features in this direction.

    Martin – I believe you are also describing the MSIL linker in Whidbey I mentioned above. You compile up your utility code into a netmodule (e.g. a dll), and then link many of them together into a deployable unit.

  14. Mike Dimmick says:

    Martin, Luc: have you looked at ILMerge (at http://research.microsoft.com/~mbarnett/ilmerge.aspx)?

    Obviously this is a research rather than a production tool, so you should probably treat it with a little caution.

  15. rs says:

    For those of you who are curious, there is a linked sample (the scribble.exe). It links all framework libraries except for mscorlib.dll.

    Click Samples on the left panel.

  16. Jason Zander’s got an excellent thread going regarding Joel On Software’s "Please Sir May I Have a Linker?" Recommended reading….

  17. Luc Cluitmans says:

    Mike: thanks for the reference to ILMerge, it is exactly what I needed! As I need it mostly for internal projects in our research institute, it doesn’t matter much that it is not a ‘production quality’ tool.

    I noticed that the output it produces was a lot smaller than I expected. On closer inspection, that is because all the huge wads of 0x00 bytes in the PE files (to fill segments to a multiple of 4 kbyte) are no longer needed. So ILMerge saves some space too: in my test I merged one exe with fourteen dlls, 624 kbytes worth of raw material, while the merged result was an exe of only 416 kbytes.
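
The numbers in the comment above can be checked with a short sketch. It assumes, as the comment does, that each section in the input files is padded to a multiple of 4 kbytes; the function names and the 5000-byte example size are hypothetical, while the 624 and 416 kbyte figures come from the comment itself.

```python
def padding_bytes(size, alignment=4096):
    """Zero bytes needed to round `size` up to the next multiple of `alignment`."""
    return (-size) % alignment


def fraction_saved(before_kb, after_kb):
    """Share of the original size that the merge removed."""
    return (before_kb - after_kb) / before_kb


if __name__ == "__main__":
    # A hypothetical 5000-byte section wastes most of its second 4 kbyte block.
    print(padding_bytes(5000))
    # Figures reported in the comment: 624 kbytes of inputs, 416 kbytes merged.
    print(f"{fraction_saved(624, 416):.1%}")
```

With one exe and fourteen dlls each carrying several padded sections, padding of this kind adds up across the inputs, which is consistent with the roughly one-third reduction reported.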

  18. SBC says:

    I was wondering if there are any performance gains (or losses) if pruning tools like ilmerge and salamander (remotesoft) are used in app development. Any measurement studies?

  19. Ian says:

    The main part of the problem is that we are in a long period of transition, moving from the Win32 API to the .NET Framework API. I would like to transfer all my development to .NET (C# is a very nice, productive language), but this forces me to restrict my users to people who have .NET or people who are willing to install it. This places a technical decision on potential users, which, in general, they don’t want to take. Ease and simplicity of install is vital in getting "joe public" to try my software. Just saying that people need to install the framework (which has a perceived risk of failure) or wait till 2006 (?) for Longhorn (a full OS replacement) is missing the point. Having a linker to provide standalone .NET-based applications able to run on any previous Win32 API version of Windows would be ideal. I think MS needs to consider how to manage the transition both in terms of the development community and end-users. A linker provides flexibility during the move from the old to the new.

  20. Joe says:

    Just a technical aside on this page. When viewed with Opera in ‘identify as Opera’ mode this page is all screwed, when viewed in ‘identify as IE’ mode it works fine, why is that?

  21. Sam J. says:

    I think the main problem with not having a linker is taking away this decision from everyone else, so not giving anyone a chance in deciding this for himself.

    It’s like saying ‘alcohol is bad for you so we disallow it’.

    The only thing this will achieve is the customers (like Joel) who need this will use some other tool, probably even more insecure.

    So what does MS gain by not providing a linker? They lose programmers.

    Taking away the choice from the one who should be able to make it is always a bad move.

    Sam

  22. The Drunken says:

    MS webservers have a history of sending buggy HTML to browsers identifying themselves as Opera.

    See here: http://my.opera.com/community/dev/discussion/openweb/20030206/

  23. Dimitri. says:

    >> But do you really want a linker?

    Let me surprise you here. Yes mister, I really want a linker. And I don’t need you telling me that no, I don’t want one. You do list some advantages of the CLR, and Joel lists some of its disadvantages. I want to take that decision myself. I am a software developer, and I need the tools to make the product I want, not some preaching about this way that is the one way.

    I find this absurd and stupid. Yes, I need a linker, and if you want to tell me that in some situations it may be better without, that’s fine. Give me a linker and let me decide.

  24. My question, which has been posed by others in different ways, is why don’t you provide the tool and leave it up to the programmer to make the choice as to the appropriate technology?

    It is clear from reading the comments below that there is a demand for this and there are situations where using it is appropriate. Why not add it as another tool in the toolbox for programmers to use?

    There is a danger that it will be misapplied. I agree that is a danger. I believe that people have an almost limitless ability to make mistakes (myself included), hence the tool *will* be misused. On the other hand, the people who did this *would do the wrong thing anyway*, they would just find other ways to make mistakes.

  25. Sualeh Fatehi says:

    Take a look at this, and see if it fits the bill:

    http://research.microsoft.com/%7Embarnett/ilmerge.aspx

    ILMerge is a utility that can be used to merge multiple .NET assemblies into a single assembly. It is freely available for use from the MSR Downloads site. If you have any problems using it, please get in touch. (mbarnett _at_ microsoft _dot_ com)

    ILMerge is packaged as a console application. But all of its functionality is also available programmatically. While Visual Studio does not allow one to add an executable as a reference, the C# compiler does, so you can write a C# client that uses ILMerge as a library.

    ILMerge takes a set of input assemblies and merges them into one target assembly. The first assembly in the list of input assemblies is the primary assembly. When the primary assembly is an executable, then the target assembly is created as an executable with the same entry point as the primary assembly. Also, if the primary assembly has a strong name, and a .snk file is provided, then the target assembly is re-signed with the specified key so that it also has a strong name.

    There are several options that control the behavior of ILMerge. See the documentation that comes with the tool for details.

  26. Dan Golick says:

    If we cannot have a linker, how about incremental installs of the framework? Only download/install the referenced assemblies.

    If I could install the small subset of the assemblies that I am using it would not be too burdensome for users who do not have .net installed.

  27. Fred says:

    Re: the bad layout in Opera – I believe it’s a CSS issue, not a "bad HTML to Opera" issue. If you switch to user CSS instead of author CSS, you don’t end up with the whole article split by words. You do miss out on the author’s "design", but that was questionable in the first place.

  28. Not my real name says:

    These arguments are nothing new. Even if there is no linker, you can still have the same problems.

    – Intellectual Property:

    There must be something I’m missing but Microsoft allows the redistribution of the .Net run time. So what would be different if only libraries were left out? As for third party libraries, I’d still have to get their permission to include them in my program.

    I can see a problem with third party libraries that Microsoft includes in the runtime. But it’s had them in the past. That might take some leg work, but nothing new.

    – Working Set:

    Yes, there wouldn’t be a shared base for all the linked applications, but so what? It’s no different now. The only problem would be if every application were linked with the entire .Net runtime.

    – Servicing

    True, if it ain’t broke don’t fix it.

    – Security

    The P2P might be a bad example. You’re assuming that the problem is in the .Net runtime. What if it’s in the application? What can Microsoft do? Even if it’s in the runtime, fixing it would require that people keep their systems up to date. We all know how well that goes.

    Overall I’d be OK with no linker, IF the runtime was shipped with Windows and the vast majority of people had the latest copy.

    But they don’t, and I bet it will be a long time before everyone moves to Whidbey.

    Until then, Please Sir May I Have a Linker?

  29. Also not my real name says:

    Fred, if you’d read the issue, you’d have noticed that it’s a "bad CSS to Opera", in fact. Read the article – that’s the problem exactly.

  30. Will Gayther says:

    There is, of course, the other side of the coin. With linking, you can ensure that your app doesn’t have any accidental security holes in it because someone is running your app on an old version of the runtime. Your average user isn’t exactly enthusiastic about downloading updates.

  31. Jamie says:

    Isn’t this one ? (It may have been pointed out but I’m not going to read EVERY reply !)

  32. Security Consutlant says:

    "Now along comes a virus that Microsoft needs to patch in a hurry."

    Huh? Since when do you patch a virus? Maybe you meant to say:

    "Now along comes a vulnerability in Microsoft’s code. Well, really now, this vulnerability had been there for quite a while, but err um we kind of kept it quiet except that now that we have this err um virus we need to tell people about it… err well we can at least blame it on the virus and make it sound like the virus has some magical power over your PC, not due to any fault in MS’s code, and along comes MS to save the day with a new patch."

    It’s this attitude, of blaming everyone else for the vulnerabilities in MS’s code, that has got MS the reputation that it has.

  33. Jason Zander says:

    I take your comment to say that it is the responsibility of the software industry (and especially Microsoft) to secure our code and ensure that your computer and mine are protected, no excuses. I couldn’t agree more. It’s why I have a desire to make sure that any features we add to the product are done in a safe and responsible way.

  34. Neil says:

    Why don’t you make the CLR a critical update? Then force anyone installing MS Software to do a windows update first as part of the install process. This would not solve all your problems but would help.

  35. Jason Zander says:

    Neil – If an exploit were found in the CLR we would definitely release a critical update, so no problems there. But it won’t fix the security problems I’m describing, since the whole point of a linker is to statically pull all of the code into one big monolithic executable. Once you do this, there is no way for me to identify the code that needs to get patched.

  36. Mahavir Jain says:

    Nice article, explains lots and clears many doubts!

  37. peterhcy says:

    I think I see what you are saying!