Discardability in drivers has nothing to do with discardability in user-mode (which has nothing to do with discardability, really)

Some time ago we discussed what the DISCARDABLE keyword means. In summary: It has no effect in user mode, and in kernel mode, it means that the memory should be thrown away after initialization is complete.

The two uses for the DISCARDABLE flag aren't really all that related to each other. They are just two components (the 16-bit Windows memory manager and the 32-bit Windows device driver loader) that evolved independently, both of which saw a bit and said, "Hey, I can use this for something!" but they each used it for something different.

Once upon a time, back in the days when two megabytes was a decent amount of memory, someone came up with an optimization: Since driver initialization code and driver initialization data are both thrown away at the end of initialization, we can merge them into the same page and save 4KB of memory. Multiply this by the number of drivers in the system, and that's a lot of memory being saved at system boot, which in turn means that we can boot in less memory. This is important when you are trying to minimize your system requirements.
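To put rough numbers on that saving, here is a minimal sketch of the page arithmetic, assuming 4KB pages as in the story. The function names and the byte counts in the usage note are made up for illustration; they don't come from any real driver.

```c
#include <stddef.h>

#define INIT_PAGE_SIZE 4096u  /* 4KB pages, as in the article */

/* Number of whole pages needed to hold `bytes` bytes (rounds up). */
static size_t pages_for(size_t bytes)
{
    return (bytes + INIT_PAGE_SIZE - 1) / INIT_PAGE_SIZE;
}

/* Pages used when init code and init data each get their own section. */
static size_t pages_separate(size_t code_bytes, size_t data_bytes)
{
    return pages_for(code_bytes) + pages_for(data_bytes);
}

/* Pages used when init code and init data are merged into one section. */
static size_t pages_merged(size_t code_bytes, size_t data_bytes)
{
    return pages_for(code_bytes + data_bytes);
}
```

With a hypothetical 1500 bytes of initialization code and 300 bytes of initialization data, the separate layout needs two pages and the merged layout needs one. Merging never saves more than one page per driver, and it saves nothing when the combined size spills past a page boundary anyway.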

The 32-bit device driver folks needed a bit in the segment attributes to say "This memory should be thrown away once initialization is complete." They saw a bit lying around with the sticker DISCARDABLE written on it and said, "Discardable, yeah, that perfectly describes what we want. Thanks for reserving that bit for us!"

Okay, so that explains the DISCARDABLE bit, and the important-at-the-time 4KB memory savings explains why driver initialization code and data are merged into a single page. But this results in a dreaded W|X page, which negates any benefit of DEP! Why are drivers still using this optimization that isn't that useful any more?

Inertia, probably.

You had a driver originally written in the days when 4KB was a lot of memory, so it used this one weird trick to save 4KB of memory. The driver then evolves over time, but the merging of driver initialization code and data hangs around because things stay the same until something makes them change.

Who knows, maybe there will someday be evolutionary pressure to get all the old drivers to change their section attributes. (I suspect pressure is low because the W|X page is not in memory for very long, so the attack window is hard to hit reliably.)
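For the curious, here is a small sketch of how a tool might flag such a section from its attribute bits. The flag values match the `IMAGE_SCN_*` constants in winnt.h, but the shortened macro names and the helper functions are my own, and the sample value in the usage note is an illustration of a merged code-and-data section rather than something read from a particular driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* PE section characteristic bits. Values are those of the IMAGE_SCN_*
   constants in winnt.h; the names are shortened here to avoid clashes. */
#define SCN_CNT_CODE             0x00000020u
#define SCN_CNT_INITIALIZED_DATA 0x00000040u
#define SCN_MEM_DISCARDABLE      0x02000000u
#define SCN_MEM_EXECUTE          0x20000000u
#define SCN_MEM_READ             0x40000000u
#define SCN_MEM_WRITE            0x80000000u

/* True if the section is marked to be thrown away after initialization. */
static bool section_is_discardable(uint32_t characteristics)
{
    return (characteristics & SCN_MEM_DISCARDABLE) != 0;
}

/* True if the section is both writable and executable (the W|X case). */
static bool section_is_wx(uint32_t characteristics)
{
    const uint32_t wx = SCN_MEM_WRITE | SCN_MEM_EXECUTE;
    return (characteristics & wx) == wx;
}
```

A merged initialization section carrying code, discardable, execute, read, and write bits would have the value 0xE2000020, and `section_is_wx` reports it. Split the same content into a code-only section and a data-only section, and neither one trips the check.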

Comments (39)
  1. HK says:

    What is a W|X page? Tried to google it to no avail.

    1. Ray Koopa says:

      Write or executable memory page, I guess

    2. Arezz says:

      W|X = Write Or Execute (logical or), while you usually want W^X (Write Xor Execute), so that an executable page is not writable (and as such is e.g. protected against buffer overflows).

      1. Joshua says:

        But I like writing

        int func(int arg)
        {
            int innerfunc(char *a, char *b)
            {
                return arg ? strcmp(a, b) : stricmp(a, b);
            }

            qsort(…, …, innerfunc);
        }


        For the uninitiated, this puts a trampoline on the stack.

        1. Tristan Miller says:

          Any compiler I’ve seen doesn’t put innerfunc’s machine code on the stack; it’s ultimately just a scoping thing.

          1. VinDuv says:

            innerfunc’s code isn’t on the stack, but an executable trampoline is. innerfunc needs to capture the “arg” variable from the outer function, so it’s compiled as a function taking three arguments (arg, a, and b). A pointer to innerfunc is really a pointer to a trampoline, put on the stack, which adds arg to the argument list and jumps to innerfunc. (It needs to be the stack since the value of arg will change from call to call).

            If you build code with an inner function on Linux with gcc, you’ll notice that taking the pointer to an inner function returns a pointer on the stack, and that the stack is rwx instead of rw- when the program is running.

        2. voo says:

          It’s always hard to recognize sarcasm on the internet, so I hope you’re not serious.

          If you are: Using non standard extensions that open gigantic security holes is a horrible, horrible idea.

          1. Joshua says:

            Executable stack isn’t a security hole. Non-executable stack was defeated for stack smashing years and years ago.

            I was in fact in possession of the knowledge of how to do it in 2001, and did not recognize the malicious use possible of such a thing.

          2. voo says:

            Yes, there are return-to-libc attacks, but things like ASLR, flow guard (has that one been broken yet? Haven’t heard anything) and similar protections make this a great deal harder. On the other hand, an executable stack is basically child’s play to exploit.

            And return-to-libc has been around for waaay longer than 2001, so it wouldn’t have been particularly noteworthy back then indeed :-)

          3. Patrick Star says:

            This is dealt with by special casing in the kernel, not by making the stack executable (at least not in proper DEP/NX implementations).
            When there’s a fault caused by attempting to execute code on the stack, the instructions are checked to see if they match a generated trampoline. In that case the trampoline behavior is emulated instead of killing the process.

          4. @Patrick Star: No, GCC does it by making the stack executable. It uses an assembler directive like

            .section .note.GNU-stack,"x",@progbits

            to mark the object file containing the function as requiring an executable stack, which the linker will dutifully follow by creating an ELF with an executable stack when linking in that object file. Check the output of “execstack -q” on such an executable.

    3. exchange development blog team says:

      If you thought googling W|X pages was hard, just be glad we don’t have /^(?=.*\d)(?=.*[a-z])(?=.*[A-Z])(?=.[\W]).{8,}$/ pages.

  2. ranta says:

    “Driver Compatibility with Device Guard” says it makes the INIT section not writable. That closes the W|X attack window entirely.

  3. Ben Voigt (Visual Studio and Development Technologies MVP with C++ focus) says:

    The solution to the W|X issue is to have separate pages for initialization code and initialization data. Now you’re using twice as much memory (assuming that both actually fit into a single page, though bloat has been affecting drivers too), but not only is 8K not problematic, **it doesn’t matter anyway because it is going to be discarded**.

    The effective benefit is the releasing of the memory, not the merging of two sections that have very limited lifetime anyway.

    1. Dave Bacher says:

      The optimization may predate Windows 3.1.

      On Windows 1 through Windows 3.0, you had real mode. I believe Windows 1 ran in 256k, and know it ran in 384k. 8k versus 4k, times even the much lower number of drivers loaded in that era, was really significant. Even if you’re discarding it. You had maybe a 5mb, 10mb hard drive — and you’re trying to fit everything you need in there. And it was slow and noisy.

      1. alegr1 says:

        Windows in the standard mode (80286) didn’t care about pages

  4. kantos says:

    In all honesty I wouldn’t be surprised if the Kernel folks just load the page twice now. The likelihood of self-modifying code is near zero these days as the performance penalty would be atrocious. As such it should be safe to load the page twice, once marked write, the other execute. Heck, if driver inits are not run at DISPATCH_LEVEL IRQL then they could just mark it as executable and handle the access violation to do a copy on write if necessary.

    1. ranta says:

      Sure you could load the page twice to physical memory, but where would you put the copy in virtual memory?
      Code in other pages may call functions defined in the INIT page, and vice versa. Those calls are PC-relative and don’t have any relocation, so you cannot patch them if you move the executable version of the INIT page to a relative virtual address other than what the driver expects.
      Other pages may also contain pointers to objects or functions defined in the INIT page. Those have relocations but you don’t know whether they are intended to point to code or to data.
      You could try to put both versions of the page at the same virtual address, and activate the correct one during each page fault. But IIRC, x86 and amd64 page tables do not support unreadable executable pages, so if code in the INIT page reads a variable from the INIT page, it would get the original value from the executable version and not the current value from the writable version.

      1. IInspectable says:

        “Other pages may also contain pointers to objects or functions defined in the INIT page.”
        After reading today’s blog entry you should be able to explain, why this won’t happen.

        1. ranta says:

          It would be OK to have a pointer from another page to the INIT page if the driver ensures that the pointer isn’t dereferenced after the INIT page has been discarded, e.g. by changing the pointer to NULL at the end of DriverEntry. But I now checked the Windows-driver-samples source tree and you seem to be right that this doesn’t happen in practice.

          However, relocatable pointers from the INIT page to the INIT page certainly do happen; see filesys\miniFilter\minispy\filter\RegistrationData.c. There you again have the problem how to check whether the pointer needs to point to code or to data, if you want to move the writable version of the INIT page to a different virtual address.

          1. kantos says:

            Not really? This is a loader issue. Since drivers are fundamentally DLLs, and DLLs do this quite literally all the time… it’s really not an issue to do exactly that. Recall that the loader has to do fixups ANYWAY just to load the page into memory since drivers have to be HE-ASLR compatible already. The fact that we’re in kernel mode is just an artifact. The real killer would be IRQ level. But DriverEntry runs at PASSIVE_LEVEL so there is no reason that the loader couldn’t do any of this.

      2. kantos says:

        No but it supports Execute pages that aren’t writable, any attempt to write would trigger a GP fault IIRC.

    2. Myria says:

      Tell that to graphics driver writers. Although in this case it was user mode, I once dealt with a user-mode graphics driver helper that generated code at runtime inside my process, but didn’t then call RtlAddFunctionTable. The lovely undebuggable crashes when the driver had a bug were just awesome…

  5. alegr1 says:

    You need to tell your comrades at MS to stop sweating about saving non-paged and paged code space in kernel, and pay more attention to PNP paged tag leaks and MS antivirus (I guess) allocating unhealthy amounts of paged pool. They’re being penny-smart but pound-foolish.

    1. You seem to be under the impression that the people who maintain the driver loader are also the people who write PnP drivers and anti-virus software.

      1. alegr1 says:

        No, but people who set “performance” targets for these things (“if you shrink total non-pageable code by 1 megabyte, and pagefile usage by 10 meg, you get a bonus!”) don’t have a clue about the big picture.

        1. Not sure what your point is. Are you saying that the people who set performance targets should reassign people from the loader team to the PNP team or the anti-malware team? (And accept the fact that those people will basically be the “new guy on the team” who has to learn how the system works.)

          1. alegr1 says:

            No, I believe that saving 5 MB of nonpaged pool or 20 MB of pagefile commit (at the cost of having bugs that are hard to diagnose) should not be a performance target at all.

          2. Klimax says:

            Not every device with Windows has loads of RAM… (and more VMs per physical machine is always desired and competitive advantage)

          3. alegr1 says:

            I can understand wanting to be able to run more VMs with less memory pressure, but this is where memory de-duplication would help A LOT.

          4. Klimax says:

            Deduplication can get you only so far. Also it (and memory compression) won’t help you much with cheap devices like 200USD tablets…

    2. The_Assimilator says:

      Which part of “Once upon a time, back in the days when two megabytes was a decent amount of memory” did you miss?

      1. alegr1 says:

        You won’t believe how much some kernel people of MS still believe in memory pressure savings of paged code. “It lets us run 1001 VM rather than 1000”. Never mind that I personally encountered at least two regressions (bugchecks) in MS code caused by that. How many are still there who knows.

  6. DWalker says:

    There are two hard things in computer programming. Naming things is one, and I forget the other… :-)

    1. Darran Rowe says:

      There are two hard things in computer science, naming things, cache invalidation and off by one errors.

      1. DWalker says:

        Oh yes, that’s it exactly!

  7. AlexShalimov says:

    Ah, the good old days when every byte was valuable. I’m glad they are gone.

    “this one weird trick” — finally a perfectly justified usage of this phrase.

    1. Brian_EE says:

      Come over to the embedded microcontroller world. Here bytes still matter. It’s surprising what you can do with 4K FLASH and 512 bytes RAM if you’re careful.

  8. Billy O'Neal says:

    BinScope will fail any driver that still does this. We in security land are fighting the inertia :)

