More on Jar Hell


Regarding my prior post on Jar Hell, a colleague asked me to elaborate on the problems I experienced and why .NET would not have these problems.

.NET basics

First thing: .NET has a version-aware classloader and strongly-named assemblies. This means when I build a .NET app against an assembly, the app runs against that assembly, and that assembly only, unless I explicitly override that.

Also, .NET adds the concept of a global assembly cache, the GAC, into which fundamental, re-usable libraries can be placed. (The GAC gets over-used in my opinion. Too many people think their libraries are fundamental and machine wide, when they are clearly not. In any case, the GAC mechanism has its uses.)

Java classloading

I am not a Java classloader expert, so take all of what follows with a grain of salt. In practice, the Java classloader loads classes from the classpath. The Java application environment also has well-known directories from which libraries may be loaded: the lib and ext directories in the JDK/JRE, for example.
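To see what a given JVM is actually working with, a small sketch like the following prints the classpath and walks the chain of classloaders. The class name LoaderChain and its helper are made up for illustration, and the loader names it prints will vary by JVM version and vendor.

```java
import java.util.ArrayList;
import java.util.List;

final class LoaderChain {
    /** Walks from the given loader up through its parents; null stands for the bootstrap loader. */
    static List<String> chainOf(ClassLoader loader) {
        List<String> names = new ArrayList<>();
        for (ClassLoader current = loader; current != null; current = current.getParent()) {
            names.add(current.getClass().getName());
        }
        // The bootstrap loader has no Java object, so it is represented by a label.
        names.add("(bootstrap)");
        return names;
    }

    public static void main(String[] args) {
        // The classpath the application loader was started with.
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
        for (String name : chainOf(LoaderChain.class.getClassLoader())) {
            System.out.println("  loader: " + name);
        }
    }
}
```

Running it inside an app container instead of from the command line should show a longer chain, which is the point of the next section.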

The differences

Beyond the basics, when you have an “app container” – say JBoss, Tomcat, or WebSphere – it adds its own classloading rules, and special directories from which jars will be loaded.

This is one fundamental difference between Java and .NET. In .NET there is one way to find and load assemblies. It’s done by Fusion. Essentially there are two places to look for assemblies: in the app’s home directory (or a “special” bin directory for ASP.NET apps), or in the GAC. In Java, there are a myriad of ways to find jars, and a variety of ways to load them. Special directories abound.

Consider the setting for JBoss+Tomcat, where I can tell it to use either the JBoss classloader behavior or the Tomcat classloader behavior (go here and scroll down half the page). The very existence of this switch illustrates the problem. There is no such switch in .NET.

Beyond that, though, is the proliferation of special directories: library directories for the JVM, for the web application container, for the uber application container; extension directories; and on and on. In .NET, there are two places to look. Not seventy-two.

Next up: component libraries. On the Java side, some of the ones I used were: servlet.jar, jaxrpc.jar, log4j, xerces (an XML library). Each of these is independently versioned. But, the Java classloader doesn’t care, doesn’t respect versions. The classloader just scans the classpath and the special directories and loads a class with a given fully-qualified name.

If I build and test against Xerces 1.4.4, and then drop my app into a container that has a different version of Xerces, the Java classloader won’t complain. Won’t raise an eyebrow.
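When I suspect the wrong jar is being picked up, one way to check is to ask a class where it came from. This is a minimal sketch, not a fix: the class name WhereLoaded is hypothetical, and it relies only on the standard Class.getProtectionDomain() and CodeSource.getLocation() calls. Classes loaded by the bootstrap loader typically report no code source at all, so the sketch returns a placeholder for those.

```java
import java.security.CodeSource;

final class WhereLoaded {
    /** Returns the jar or directory a class was loaded from, or a placeholder if the JVM does not say. */
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(bootstrap or unknown)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass a fully-qualified class name, e.g. a parser class from the library you suspect.
        String name = args.length > 0 ? args[0] : "java.lang.String";
        Class<?> type = Class.forName(name);
        System.out.println(name + " loaded from " + locationOf(type));
    }
}
```

Run inside the container, with the name of a class from the library in question, this tells you which copy won. It does not tell you which version that copy is.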

A perfect example of this is the SAX problem I had. The BEA jrockit JVM ships a version of SAX, in one of its special directories. But so also (apparently) does AXIS, or JBoss, or something else in the chain. The error I saw was the result of the “wrong” SAX library being loaded when JBoss/Tomcat/AXIS was run under BEA’s JVM.

Another example is the NoSuchMethodError I encountered while running two distinct versions of AXIS within Tomcat (within JBoss). At runtime, the classloader picked up the wrong version of the library – presumably the 1.1 version and not the 1.2 version my app was built against.

A final example, although not nearly as serious, is the ImageIO thing and the support for BMP. My WordML app depended on Java 1.5 ImageIO functionality, but it ran against the Java 1.4 ImageIO libraries. It was up to me as a developer to either handle the missing support, or check whether the Java engine was older than 1.5. This is merely a hassle, because I could handle it in my own code. The other examples were not even my code.
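One way to handle this in my own code, rather than comparing version strings, is to ask ImageIO directly whether it has a reader or writer for the format. The sketch below assumes only the standard javax.imageio.ImageIO lookups; the class name BmpSupport and its helpers are made up for illustration.

```java
import javax.imageio.ImageIO;

final class BmpSupport {
    /** True if the running JVM has at least one ImageIO reader registered for the format. */
    static boolean canRead(String formatName) {
        return ImageIO.getImageReadersByFormatName(formatName).hasNext();
    }

    /** True if the running JVM has at least one ImageIO writer registered for the format. */
    static boolean canWrite(String formatName) {
        return ImageIO.getImageWritersByFormatName(formatName).hasNext();
    }

    public static void main(String[] args) {
        // Check for the capability itself instead of inferring it from the Java version.
        if (canRead("bmp") && canWrite("bmp")) {
            System.out.println("BMP is supported by this runtime's ImageIO.");
        } else {
            System.out.println("No BMP support here; fall back or fail with a clear message.");
        }
    }
}
```

Checking for the capability has the side benefit that it keeps working if a BMP plugin is added to an older runtime.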

In .NET this entire class of errors doesn’t exist. You have to have the exact right assembly – the same one you built against – or your app won’t run. This means no accidental or inadvertent class swapping. You can make this even more bulletproof with cryptographically strong names for assemblies. This means no malicious class-swapping (like stubbing out a license enforcement class by playing classpath games; I’ve proven this works with major app servers).

Is he done yet?

Yeah, almost. I’m wrapping up here.

What this all means is .NET developers can basically ignore classloading and versioning issues. With .NET, essentially, you copy in the app’s files (including library files), and the app is installed. With Java, you copy in the JARs, and then test, and test, and test. And you’re never really sure it’s going to work, because buried deep inside some routine, somewhere, maybe not even in your code, is a call out to log4j or xerces or some other library that is out-of-version.

A bunch of people have responded to my previous post and said, “I feel your pain”. Another set of people responded and said “so what else is new? it’s been like this for years.” That is very surprising to me. I am not a Java developer by profession. It’s more of a hobby ;). So I can’t believe people really deal with this every day. It’s so… 1997. It’s like COM and “DLL Hell” all over again.

I am not able to assure you that you’ll never have versioning issues with .NET. Just that I think .NET’s approach is so much more … usable.

Update 6:11pm US Eastern: removed incorrect reference

Comments (14)

  1. Walt says:

    You said: "This means, no malicious class-swapping. (Like, stubbing out a license enforcement class by playing classpath games. I’ve proven this works with major app servers)."

    I never even thought of that! That is a major, major issue. Does anyone out there know of a fix for this? What good does licensing your software if someone can just bypass it with a bogus JAR file?

  2. Dino says:

    this jar hell thing is really old news:

    http://www.sauria.com/blog/2003/01/16

    it’s just that I’ve always been late for everything.

  3. Dino says:

    Walt, there’s a workaround, yes. Sort of.

    The workaround is for the application to insist on a particular version of the jar or package being loaded.

    See

    http://java.sun.com/j2se/1.3/docs/guide/versioning/spec/VersioningSpecification.html

    In Java, the publisher of a JAR can specify a version number for the JAR. The problem is that (a) the classloader completely ignores this information, and (b) it’s easy to spoof anyway. Even if the classloader respected the versions, you could just fake it.

    So it is left to the app to check the package version, and … I don’t know… fail fast? if it gets the wrong one.

    People have also worked on other version and dependency checkers and verifiers for Java.

    Hey, wait! Maybe that’s the reason for all of this. To encourage a new market in version dependency checker tools !! Pure genius !

  4. It’s worth pointing out that there are significant downsides of the .NET model too, and that an ideal world would involve finding a balance between the two.

    First and biggest advantage Java has is in unloading of classes. In Java, classes get unloaded when their classloader gets GC’d. Period. Classloaders are, comparatively, lightweight objects and it’s perfectly feasible to have dozens or hundreds within an application. Furthermore there is no (ZERO) penalty on calls between objects loaded by different classloaders. In .NET (even in Whidbey, LCG notwithstanding) the only way to unload anything is to load it within an AppDomain, which is an insanely heavyweight object and imposes a several-order-of-magnitude penalty on all calls between objects in different AppDomains, as well as changing the *semantics* of such calls (they typically become pass-by-value instead of pass-by-reference, but if the objects involved aren’t serializable, they don’t work at all).

    Another advantage on the Java side is that the Java classloader model allows loading classes from streams or byte arrays constructed in memory – loading classes directly from URLs, for example, without having to save to disk. If it’s true what you’re saying about .NET only loading from directories on disk, then you need to download the assemblies first – and then somehow fake out the CAS system to persuade it that despite being on the local machine, these should not run with local machine privileges. Reflection.Emit tangentially partially addresses this issue, I guess. Hmm… how does Assembly.LoadFrom() fit with your notion that there are never special directories?

    If .NET could get a model for code unloading and dynamically loaded code that’s as clean and elegant as Java’s, while also keeping the versioning advantages that it has today, then it would be a clear winner. Today it’s a toss-up, IMHO. You’ve traded the lack of version problems for a lack of *power*, and as a programmer I’m not sure I like that tradeoff.

  5. Walt,

    The correct way to solve the problem is to sign and seal the jars. Then nobody can tamper with them or play tricks using the classpath.

  6. signing and sealing jars says:

    Yes, signing and sealing jars prevents the classpath tricks and replacing classes opportunistically. I don’t know why deliverers of jars don’t do this more often.

    BUT, this doesn’t address the dependency verification problem. All a signed and sealed jar says is "all classes herein are one unit, and you cannot add (or replace) classes in the same package." There’s still nothing in the classloader that allows an app to insist that it must use THAT particular version of a jar.

  7. Dino says:

    Stuart, you’re right, there is the possibility to Assembly.LoadFrom(). This enables apps to custom-load assemblies. I completely left this out because I think it is mostly interesting to container developers — like the people who developed ASP.NET, or the people who developed the CLR hosting in SQL Server or DB2 — but it’s not interesting to app developers.

    There is also a downloaded assembly cache, which is distinct from the GAC and bin. It supports the loading and running of assemblies from a URL, like an applet would do. So within IE, you can run a .NET assembly, and can grant it security (based on the assembly evidence) and so on.

    And finally, you can specify private probing paths for assembly resolution. Per AppDomain, I believe. I think this is again interesting for infrastructure development.

    As for .NET’s AppDomain versus Java’s ClassLoader – you make good points. It’s easier to unload in Java. It’s easier to make calls across the boundary (in Java there’s not really a boundary). But how mainstream a scenario is this? – unloading and reloading classes, and then having high volume communication across them.

    I can see unloading and reloading, for example on an auto-updating app. But it doesn’t require high-volume comms across the boundary. I can see high volume calls across dynamically loaded classes, but I don’t see the need to unload at high volume. Bottom line, I think the balance drawn by the Java Classloader architecture is the wrong one. Too loosey goosey. How much of a productivity tax has the classloader/classpath/versioning thing been for the zillions of Java devs out there, over the past 10 years? All that trouble is worth the power of being able to unload with a lightweight mechanism?

  8. You’re right that the problems caused by the Java classloader model probably outweigh the benefits of easy unloading. But you miss my main point, which is: why, oh why, do these have to be mutually exclusive?

    Why did we have to wait until the second major release to get LCG? Why do we have to wait until the *next* major version *after* Whidbey to have the trivial extension of LCG to being able to generate entire classes? And why will we probably have to wait until the version after *that* before "normally" loaded classes get the unloadability benefits that are being given to LCG code?

    The .NET designers seem to have looked at Java’s classloading model during their architecture design process, decided it sucked, and completely written it off. In most other areas, the framework did an excellent job of taking what was good of Java and improving the stuff that wasn’t. But in the area of unloading code and introducing security boundaries around dynamically-loaded code (which is interesting for anyone trying to make a plugin-based architecture, BTW, which is a very broad area of development) .NET just ignored all the good work that Java did, and it’s taking literally years to catch up with where Java was at 1.1! That’s what I’m disappointed about.

    No, I wouldn’t trade the .NET model for the Java one. But I don’t think it’s unreasonable to lament the fact that I can’t get the best of both…

  9. Oh, and FWIW, my own desire for easy sandboxing and code unloading was driven by NRobot, http://home.gna.org/nrobot. I have solved the biggest problems in my working copy at this point thanks to some excellent blog entries, but I had to do a pretty massive refactoring to get the right stuff passed across the appdomain boundary. It’s not so much the speed that’s a factor as the changing semantics of the calls, requiring AppDomain support to be architected in from the beginning…

  10. One thing to keep in mind – and I’m definitely not trying to support Java here – is that the .NET Framework will redirect – via a registry key that the .NET Framework installer lays down – requests for a specific version of the CLR which will, in effect, redirect the BCL assembly versions. For example, if requesting v1.0.3705 and it’s not installed, then the CLR will load v1.1.4322 which causes the 1.0.5000.0 assemblies to be loaded (if you have Whidbey beta installed, the 2.0 assemblies are actually loaded since .NET 2.0 accepts redirects for all versions since 1.0.0.0).

  11. Wayne Citrin says:

    Walt said: "I never even thought of that! That is a major, major issue. Does anyone out there know of a fix for this? What good does licensing your software if someone can just bypass it with a bogus JAR file?"

    We’ve looked and looked, and haven’t been able to find a decent licensing mechanism for Java. We have a hybrid product with components written in both Java and .NET. For exactly the reason mentioned above, our licensing mechanism is solely on the .NET side.

  12. First, pardon the interruption in your normally scheduled program. This isn’t a post on interop, it is

  13. Tanveer Badar says:

    I would like to add that class unloading in java is not related to strong naming/code signing/whatever in .net. The point of this article, at least to me, was how to make sure you don’t run into versioning hell and make sure you get what you ask for, which most have seemed to side track and digress. However, all points raised were worth discussing in any case.

    As for the class (un)loading feature, it is one of the highest-voted requests on Connect, so one can expect it will be possible some day.

  14. waleed says:

    .NET has the GAC feature, a shared location for storing shared assemblies, so that one assembly can be shared and accessed by many applications without requiring each application to have its own copy of it. Does Java have a similar feature?
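The workaround described in comment 3, where the app itself checks the package version and fails fast, might look roughly like the sketch below. It assumes the jar's manifest carries a Specification-Version entry, which many jars do not, and as noted in that comment the value is easy to spoof. The class name VersionGuard and its methods are hypothetical; Package.getSpecificationVersion() and Package.isCompatibleWith() are the standard calls.

```java
final class VersionGuard {
    /** True only if the package declares a specification version that is at least the given minimum. */
    static boolean isAtLeast(Package pkg, String minimum) {
        if (pkg == null || pkg.getSpecificationVersion() == null) {
            return false; // no version information in the manifest, so nothing can be verified
        }
        // isCompatibleWith compares dotted-decimal versions, e.g. "1.2" against "1.1".
        return pkg.isCompatibleWith(minimum);
    }

    /** Fails fast if the class's package cannot be shown to meet the minimum version. */
    static void requireAtLeast(Class<?> type, String minimum) {
        if (!isAtLeast(type.getPackage(), minimum)) {
            throw new IllegalStateException(
                "Need version " + minimum + " or later of the package containing " + type.getName());
        }
    }

    public static void main(String[] args) throws ClassNotFoundException {
        if (args.length < 2) {
            System.out.println("usage: VersionGuard <fully.qualified.ClassName> <minimumVersion>");
            return;
        }
        requireAtLeast(Class.forName(args[0]), args[1]);
        System.out.println("Version check passed for " + args[0]);
    }
}
```

An app would call requireAtLeast at startup for each library it cares about. This only reports the mismatch; it does nothing to make the classloader pick a different jar.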