Why does the CLR report a NullReferenceException even when the faulting address is not exactly the null pointer?


We saw some time ago that before invoking a method on an object, the CLR will generate a cmp [ecx], ecx instruction to force a null reference exception to be raised if you are trying to invoke a method on a null reference.

But why does the CLR raise a Null­Reference­Exception if the faulting address is almost but not quite zero?

class Program {
    public static unsafe void Main() {
        byte* addr = (byte*)0x42;
        byte val = *addr;
    }
}

When run, this program raises a Null­Reference­Exception rather than an Access­Violation­Exception. On the other hand, if you change the address to 0x80000000, then you get the expected Access­Violation­Exception.

With a little bit of preparation, the CLR can optimize out the null pointer check entirely when it knows that the code is going to access the object anyway. For example, if you write

class Something {
 int a, b, c;
 static int Test(Something s) { return s.c; }
}

then the CLR doesn't need to perform a null pointer test against s before trying to read c, because the act of reading c will raise an exception if s is a null reference.

On the other hand, the offset of c within s is probably not going to be zero, so when the exception is raised by the CPU, the faulting address is not going to be exactly zero but rather some small number.
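Putting the two pieces together, here is a minimal sketch (the Demo wrapper class is hypothetical, added only so the example is runnable): calling Test with null faults at the offset of c, a small but nonzero address, yet the program observes a NullReferenceException.

```csharp
using System;

class Something {
    int a, b, c;
    // The JIT can omit the null check: reading s.c will fault on its own.
    public static int Test(Something s) { return s.c; }
}

class Demo {
    static void Main() {
        try {
            Something.Test(null); // hardware fault at the offset of c, near (but not at) address zero
        } catch (NullReferenceException) {
            // The CLR maps the near-null access violation back to a managed NullReferenceException.
            Console.WriteLine("caught NullReferenceException");
        }
    }
}
```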

The CLR therefore assumes that any access violation at an address close to the null pointer was the result of accessing a field relative to a null reference. The "little bit of preparation" is ensuring that the first 64KB of the address space is always invalid; Windows reserves that range and never maps it, and that guarantee is what makes the assumption, and therefore the null pointer check optimization, safe.
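The classification rule can be sketched like this (a hypothetical Classify helper illustrating the convention described above, not actual CLR source):

```csharp
using System;

static class FaultClassifier {
    const ulong NullPageSize = 64 * 1024; // the first 64KB of address space is never mapped

    // Decide which managed exception to surface for a hardware access violation.
    public static Exception Classify(ulong faultingAddress) {
        return faultingAddress < NullPageSize
            ? (Exception)new NullReferenceException()  // assume: field access off a null reference
            : new AccessViolationException();          // a genuinely wild pointer
    }
}
```

Under this rule, the read of 0x42 in the program above classifies as a null reference, while a read of 0x80000000 classifies as an access violation, matching the observed behavior.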

Of course, if you start messing with unmanaged code or unsafe code, then you can trigger access violations near the null pointer that are not the result of null references. That's what happens when you operate outside the rules of the managed memory environment.

Mind you, version 1 of the .NET Framework didn't even have an Access­Violation­Exception. In purely managed code, all references are either valid or null, so version 1 of the .NET Framework assumed that any access violation was the result of a null reference. There's even a configuration option you can set to force newer versions of the .NET Framework to treat all access violations as null reference exceptions.
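The configuration option in question is the legacyNullReferenceExceptionPolicy runtime setting; in an app.config it looks something like this (a sketch; check the documentation for your framework version before relying on it):

```xml
<configuration>
  <runtime>
    <!-- Treat all access violations as NullReferenceException, as in .NET Framework 1.x -->
    <legacyNullReferenceExceptionPolicy enabled="1"/>
  </runtime>
</configuration>
```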

Exercise: Respond to the following statement: "Consider a really large class (more than 64KB), and I access a field near the end of the class. In that case, the null pointer optimization won't work because the access will be outside the 64KB range. Aha, I have found a flaw in your design!"

Comments (28)
  1. Henke37 says:

    Answer: The runtime knows when the optimization is safe. It won't do it if it isn't safe. Remember that it can still check the pointer before indexing.

  2. Adam Rosenfield says:

    If you have a class larger than 64 KB, you very likely have a serious design problem.  You could very easily blow your stack with a single one of these (though granted I think that's only possible in C# with value types like structs, not reference types, but correct me if I'm wrong).  Some of the systems I code for use a default stack size of 128 KB for non-main threads, and those systems are only about 7 years old.

  3. Dan Bugglin says:

    Answer: Treating ALMOST nulls as nullreferenceexceptions is simply a convenience to allow the developer to recognize that their problem is likely caused by a null dereference.  The error will still properly occur as an access violation exception, it will just be up to the programmer to recognize it's actually caused by a null dereference.

  4. Dan Bugglin says:

    Answer: Treating ALMOST nulls as nullreferenceexceptions is simply a convenience to allow the developer to recognize that their problem is likely caused by a null dereference.  The error will still properly occur as an access violation exception, it will just be up to the programmer to recognize it's actually caused by a null dereference.

  5. Dan Bugglin says:

    Sorry for the double post, I was getting an error page when posting (not a null reference exception, disappointingly) so I assumed my comment did not go through.

  6. RangerFish says:

    @The MAZZTer Was it an access violation?

  7. ashx says:

    >> If you have a class larger than 64 KB, you very likely have a serious design problem.

    From the point of view of the JIT compiler, who cares? It has to work whether it's good design or not.

    But of course the JIT compiler knows the size of the bare object, so it would simply skip the optimization in those cases.

  8. John Doe says:

    I guess it was too much trouble to make a common ancestor for NullReferenceException and AccessViolationException.

  9. j b says:

    Adam Rosenfield> If you have a class larger than 64 KB, you very likely have a serious design problem.  

    You wouldn't believe some of the constructions in some software. In the software for the System 12 ISDN telephone switches, the largest struct declaration took more than 8300 lines. Printed 72 lines to the page, that would be a book of well above a hundred pages! (I came across this when System 12 was in its earlier stages – it probably grew far beyond that before the software was retired.)

    Another figure from the same software: One of the linkers used to develop the software crashed because it used a signed 16-bit number to index the table of symbols exported from a module. There was a module that exported more than 32767 symbols. The table structure for handling imported symbols from an arbitrary number of other modules used a 32-bit index, but this was a single module that alone defined >32k global symbols.

    In MY eyes, both of these cases indicate serious design problems, but the telecommunications people don't agree with me. Apparently, this is standard fare in their part of the world.

  10. James Risto says:

    Perhaps it puts a guard page, with neither read nor write access, after the structure, so any access traps?

  11. Myria says:

    Considering that you can't have arrays in a class in the sense that matters here, it'd take a seriously messed up class to get to 64 KB.

    In the example of using the unsafe pointer, the Runtime in theory could keep track of all the machine-code instructions for which a null pointer exception is possible, and note the difference.  I suppose that it is not worth the trouble to do this.

    I'm curious – does the Runtime use an unhandled exception filter or a vectored exception handler to convert the NT exception into a CLR exception?

  12. Mordachai says:

    Myria> I'm curious – does the Runtime use an unhandled exception filter or a vectored exception handler to convert the NT exception into a CLR exception?

    I'd be shocked if it doesn't! ;)

  13. Gabe says:

    Myria: The runtime absolutely allows arrays within types, although C# only allows them in unsafe contexts (so you rarely see them). I've used them for unmanaged interop before. That said, it's still unlikely to see a 64k class without trying really hard — although maybe there are likely situations where codegen could do it.

  14. Joshua says:

    @Myria, Steve Wolf: I always thought it was handled at catch time when it needs to convert the SEH exception to a .NET type.

  15. voo says:

    Well as always, why not try the whole thing? So I used the following python script:

    import math

    with open("test.txt", "w") as file:
        for var in range(int(math.ceil(70 * 1024 / 8))):
            file.write("public long foo{0};\n".format(var))

    which just generates a really long list of longs that totals 70KB of class data.

    Putting that into a class called Foo and calling it like this (to actually get the JIT to run but make sure it doesn't optimize the whole calling out):

    static void Test() {
        Random rand = new Random();
        for (int i = 0; i < 10000; i++) {
            try {
                Foo foo = rand.Next() == 0 ? new Foo() : null;
                Console.WriteLine(foo.foo8959);
            } catch (NullReferenceException x) {
            }
        }
    }

    static void Main(string[] args) {
        for (int i = 0; i < 1000; i++) {
            Test();
        }
    }

    shows that at least with VS2012 in release mode we always get a NullReferenceException and nothing else – so the JIT is clever enough to avoid the problem.

  16. saveddijon says:

    If 64KB isn't enough then you can always reserve more. Most Unix systems detect "null" pointers by not mapping memory at address zero, and some leave everything below a much larger offset unmapped; I think 64MB is used in some versions of BSD.

    Any system with ASLR will also likely detect references well into the address space.

    But if that isn't good enough, implement it in the CPU. x86 doesn't do this, but I think ARM might: if all of your memory referencing instructions are loads and stores of the form:

    ld [Rbase + offset],Rdest

    st Rsrc, [Rbase + offset]

    where offset may be a constant or a register, then just have a mode where a special exception is taken if Rbase is zero, regardless of what the offset may be. Works for any size class/array/whatever, and doesn't require an MMU.

  17. Joshua S. says:

    If I make this check:

    if (s != null)
        s.c();

    …then does the JITter optimize out the intrinsic "cmp" null check?

  18. Matt says:

    If you have a big class and reference a field at the end of it, then there's a risk of memory corruption if the field is written to. Consider the following:

    [StructLayout(LayoutKind.Sequential, Size = 64*1024*1024)]
    struct Big {}

    class ArbitraryWrite { public Big b; public uint f; }

    class Program
    {
        static void Main()
        {
            ArbitraryWrite aw = null;
            aw.f = 0x11223344;
        }
    }

    which would compile to:

    xor eax, eax ; // aw = null
    mov [eax + 64*1024*1024], 0x11223344 ; // Boom.

    This would be equivalent to *(DWORD*)(64*1024*1024) = 0x11223344, which is really bad, and effectively allows someone to jump out of the .NET sandbox (this would be a 0-day in Silverlight, for example).

    To combat this situation, the runtime says that if you're accessing a field more than 64KB into an object, it'll explicitly check the pointer first:

    xor eax, eax ; // aw = null
    cmp eax, [eax] ; // this will AV at address zero, giving us the null-reference exception that we wanted
    mov [eax + 64*1024*1024], 0x11223344 ; // this is now only reachable if we're not at null, i.e. no 0-days here. Huzzah!

  19. Danny says:

    Huh? A 64 KB class is bad design? Get serious, all the classes I use on a daily basis are bigger than 64 KB. Why? Because clients like shiny objects, that's why. Do any of you here live in real life? You've got no choice but to buy a 3rd party framework that's built on top of another 3rd party framework which is inherited from another 3rd party framework (and the chain can go on like Scheherazade's 1001 nights of names "bin…bin…bin"). Clients want reports, so I need a report framework; I don't have time to implement one, so I buy one that can print the undiscovered underground Moon tunnels. Clients want big data, so I need a framework for that too (even if the DB is the totally free and fast PostgreSQL, I don't want to waste time creating Users, Events, Passwords etc. tables, so I buy a framework that does that for me). Clients want cloud, want this, want that. Do I implement them all? Hell noooo (otherwise I would still be at the implementing-Microsoft-Windows step). They already exist, so I buy and then use them for my bid; hence the client is satisfied with a lower price and faster delivery. And that means they are all over the 64 KB on a daily basis. Over time you get fast using those frameworks, you upgrade from version to version (yes, spend money to gain money), and the overall idea is that you spend your time solving clients' problems.

  20. Csaboka says:

    People weren't talking about classes whose code is bigger than 64K – they were talking about classes whose instance data is bigger than 64K. In other words, if you add together the sizes of all non-static fields (remembering that references only take the size of a pointer, not the size of the referenced object), and you get a number bigger than 64K, you are definitely doing something wrong.

  21. voo says:

    @Danny: I think you're confusing instance size with library size or something. Just because you use dozens of 3rd party frameworks doesn't mean you get large classes. For one, most frameworks use classes, which means that a single instance of a gigantic framework still only costs you 4/8 bytes.

    To get a class with more than 64KB you have to have about 8192 longs declared (and no, standard arrays won't work, since a long[10000] field is only a reference to the data, so it's again only 4/8 bytes). So no, this is extremely rare (unsafe code has shortcuts for this kind of thing, but unsafe code doesn't have to do implicit null pointer checks, so the whole point is moot).

  22. 12BitSlab says:

    A class with 64k of data is not too big.  I'll provide an example.  I have to update a table based on another table.  I could read the first table one record at a time and update the destination table.  However, that beats the living stuff out of the database.  Alternatively, I can read the entire source table into a class, generate the SQL needed to update the destination table, then execute the SQL statements.  I do this today in a job that runs each day @ 0400 hours.  The source table has around 720k records.  My class uses a boatload of space, but it runs FAR faster than handling things one record at a time.  On low grade server hardware, it runs in around 28 seconds of wall time (not CPU seconds).

    There are limits to this.  I worked on a system 20 years ago that processed 30mm transactions per day.  Because of regulatory requirements, we had to keep 7 years of data in the database.  The technique of reading an entire table into memory won't work since it would be very difficult to buy that much memory and the system also has to serve the needs of 10,000 interactive users as well as anywhere from 100-1000 concurrent batch jobs.

  23. Jon says:

    @12BitSlab

    Unless you're doing something horribly wrong all that data is being pointed to by references and is not part of the actual class in memory. When you create a new instance of the class, it shouldn't be taking up 64K of stack, even if it is using gigs of data on the heap.

  24. Mike Dimmick says:

    @Myria: For x86, the CLR pushes a new SEH frame onto the stack (and links it up through fs:[0]) for every managed-unmanaged transition. At least, that was the case in v1.x, I don't know if it's changed in later versions. Source: blogs.msdn.com/…/51524.aspx (scroll down to 'A Single Managed Handler')

    Other processor architectures, including x64, use table-based exception handling, so RtlInstallFunctionTableCallback is used to tell Windows about the JITted functions. Source: blogs.msdn.com/…/x64-manual-stack-reconstruction-and-stack-walking.aspx

  25. LongCat says:

    ITT: "Programmers" having difficulty understanding the subtlety of 64KB of class data.

  26. Danny says:

    @voo / @12BitSlab – I am not confusing anything. Example of one shiny object I was talking about: DevExpress components, to show my clients shiny grids, shiny edits, shiny anything GUI has to offer. Try it yourself, they have a free trial; go get them and install them on your favorite VS version (mine is 2010), then launch just a small C# test with only one cxGrid and that's it. And then come back to me and tell me the size of the instantiated class. I bet it's more than 64 KB.

  27. Timothy Fries says:

    @Danny: You certainly are confusing the issue at hand.  The topic isn't whether objects generally hold references to more than 64KB worth of data in total, which is certainly common as your cxGrid demo proves; the topic is an object holding 64KB worth of data *within itself*, which is the only case where it'd be dereferencing further than 64KB from a single pointer offset.

    About the only natural objects in .NET that will get that large would be an array; or if you have some brain-damaged code generator tool that spits out a class with tens of thousands of value-typed member fields.

  28. Y says:

    The CLR can be loaded dynamically into an existing process. Does it verify that the first 64k are really unmapped? Or is this guaranteed by the OS?

Comments are closed.