Why don’t Java VMs have a Pre-JIT feature?

Java and .NET have several things in common – both runtimes can execute code written in a machine-independent “assembly language”. As we know, this code is represented in a binary format: bytecode in the Java world, and IL in .NET. The general idea of a “bytecode” is pretty old; in fact, UCSD Pascal had a similar concept called P-Code back in the 1970s, and Smalltalk later built on the same idea.

Usually, these generic instructions are not executed directly, but rather they are translated into machine code on the fly, in a process called JIT. The acronym stands for Just-In-Time, and the process is similar to the back-end phase of a C/C++ compiler. This is also not a new idea; if I remember correctly, Smalltalk-80 was the first to implement it in a successful manner.

As a side note, in the case of .NET, these instructions were specifically designed to enable both a simpler language and a potentially faster JIT process. For example, in IL you have a single, virtual “add” instruction which adds whatever two numeric operands are present on the stack, irrespective of their types. The runtime will still emit the right machine instruction, since the operand types can be deduced from the metadata information. This contrasts somewhat with the Java approach, where you have numerous flavors of “add”, one for each numeric type (iadd, ladd, fadd, dadd).
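To make the contrast concrete, here is a minimal Java sketch (the class name `AddFlavors` and the method names are mine, for illustration only). Each addition below compiles to a different type-specific JVM opcode, which you can verify with `javap -c AddFlavors`; the equivalent IL would use the single `add` instruction in every case.

```java
// Each addition compiles to a different type-specific JVM opcode,
// visible via `javap -c AddFlavors`:
//   int -> iadd, long -> ladd, double -> dadd
public class AddFlavors {
    static int addInts(int a, int b) { return a + b; }          // iadd
    static long addLongs(long a, long b) { return a + b; }      // ladd
    static double addDoubles(double a, double b) { return a + b; } // dadd

    public static void main(String[] args) {
        System.out.println(addInts(2, 3));
        System.out.println(addLongs(2L, 3L));
        System.out.println(addDoubles(2.5, 0.5));
    }
}
```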

On the other hand, the fact that the JIT relies heavily on the metadata also implies that it is pretty hard to design a “.NET/IL processor”, since you would need to understand both the .NET metadata and the IL at the same time. OK, maybe not impossible, but certainly hard. The Java bytecode, by contrast, was initially designed to run on a processor, not in a JIT environment.

But JIT brings its own challenges. First, as soon as the process exits, you lose all the optimization information gathered during that run, so you have to JIT again and again, each time the process starts. Second, when your process starts, you lose some time compiling the IL/bytecode.

A natural idea is to cache the compiled images on disk, so that at the next start you just load them and continue from there. Even more than that – there is an optimization called Pre-JIT that allows the CLR (starting with 1.0) to pre-compile a .NET assembly ahead of time and persist the generated code in a machine-dependent executable image. Pre-JIT helps, for example, to get better load times for GUI-style apps.
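As a sketch of how this looks in practice on the .NET side (the assembly name `MyApp.exe` is hypothetical), NGen is invoked from the command line to pre-compile an assembly into the native image cache:

```shell
# Pre-compile MyApp.exe (hypothetical name) into the native image cache.
# .NET 2.0 uses the explicit action verbs shown below; on .NET 1.x the
# syntax was simply `ngen MyApp.exe`.
ngen install MyApp.exe

# Remove the cached native image again if it is no longer wanted.
ngen uninstall MyApp.exe
```

At the next start of the application, the CLR can load the cached native image instead of JIT-compiling the IL from scratch.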

I am wondering why no Java virtual machine has something similar these days.

[update: fixing the link about UCSD Pascal]

Comments (10)

  1. Anonymous says:

    The UCSD Pascal link doesn’t work – 404.

  2. Anonymous says:

    "I am wondering why no Java virtual machines do not have something similar these days? "

    GCJ (the GNU Java compiler) can compile .class files into executable objects which can then be linked into a .exe file, or even C++ files. Of course, the final exe / library will end up with GC & Java runtime (statically / dynamically linked .LIB) dependencies.

    Of course, GCJ executables run slower (but start faster) than executables run under the JIT. In particular, GCJ uses the Boehm conservative GC, which limits its optimization strategies (for one thing, it can’t compact since it’s conservative).

    Also, inferring the correct machine operator from the datatypes that are statically analysed and discovered on the stack at that point in the assembly is trivial, not difficult (a fundamental limitation of both JVM and MSIL is that the stack must look the same at every instruction, no matter what path got to that instruction; and the stack type inferral needs to be done for verification anyway). It’s easier than function overload resolution, which is something compiler writers do as a matter of course when implementing a language with that feature.

    And yes, I write compilers.

  3. Anonymous says:

    By pre-JITting you are probably talking about ngen.exe right?

    ngen.exe can be very bad (premature optimization) since it creates machine code for the machine it is running on (for instance, the build machine), not for the end user’s machine.

    Since ngen is not provided with the .NET 2.0 runtime (nor with previous runtime versions, by the way), there is no way you are going to ngen your app as a post-install step.

    I must have missed something really obvious…

  4. Anonymous says:

    Oops, sorry. ngen.exe is with the runtime dist.

  5. Anonymous says:

    Back in the heady days of Java as silver bullet/panacea, one of the pillars of why Java was the end all be all was Write Hurriedly Once, Run Everywhere, and part of selling that was that JIT was really fast, and that they could (theoretically, in the next release) use runtime information to make the JITted code faster.

    Part of the culture of Java is still the devotion to machine independence.

    There are probably some technical reasons that I do not understand as well, but I think it would be hard to dismiss the white hot anti-Microsoft fires in which Java was forged.

  6. Anonymous says:

    >> GCJ (the GNU Java compiler) can compile .class files into executable objects which can then be linked into a .exe file, or even C++ files.

    Yes, I knew about GCJ (I should have mentioned it). But I don’t consider it a runtime; it’s just a compiler. So you lose all the advantages that come from a runtime environment (you cannot re-JIT if you detect a certain usage pattern over time, etc.).

  7. Anonymous says:

    The UCSD link is working now.

  8. Anonymous says:

    ngen.exe is in .Net framework redist. Check your framework directory.


    Java 5.0 has introduced the Class Data Sharing concept. It is not the same concept, but it gives a similar result. It is very limited though.

    Java was never popular for client side applications. That is probably why Sun does not do ngen.exe.

    But all are just speculations.

  9. Anonymous says:

    Pre-JIT can help with load time, but the better performance optimizations typically happen at runtime. The Java camp seem to be more interested in actual runtime performance, not loading performance.

    The usual example that I’ve seen floated about is the performance of the JVM’s virtual method calls vs the CLR’s. The JVM can inline some virtual method calls at runtime. If it realizes the optimization would not benefit the program any longer, it backs it out. Whereas the CLR currently does not bother with even attempting that optimization. Which, considering there are lots of virtual method calls, is unfortunate.

    The general idea is that a good JIT can determine more about the execution of your program than a pre-compiler can.

  10. Anonymous says:

    I’ll go out on a limb here and guess that for their particular circumstances they decided that adding support for pre-JITing of user code was more expensive relative to the customer benefit that it would provide than other features under consideration for this release cycle.