Enabling ASLR for memory savings?

There isn't much easy-to-find information on how ASLR works or what it does, so to save people some effort, I'll place some links here:

https://www.nynaeve.net/?p=100

https://www.blackhat.com/presentations/bh-dc-07/Whitehouse/Paper/bh-dc-07-Whitehouse-WP.pdf

The first link talks about how to turn on ASLR for your DLLs.  The second is an in-depth discussion of the Vista RTM ASLR implementation and some of its deficiencies from a security perspective.

I'm actually interested in ASLR for an entirely different reason -- performance.

An important scenario for Axapta is hosting multiple instances of the client on a Terminal Server.  Obviously, the less memory each instance of Ax32.exe uses, the more clients we'll be able to successfully host on a single machine.  This means customers save money on hardware and licensing.

Most modern operating systems look for ways to eliminate redundancy in memory consumption.  One mechanism for doing this is attempting to load a DLL into the same spot in the virtual address space of each process on the system.  The short version goes something like this: notepad.exe and explorer.exe both need "shell32.dll" loaded into their own "private" address space.

Windows doesn't use fully position-independent code the way Linux ELF modules typically do.  (In a Linux module, virtual addresses in machine instructions are expressed relative to the program counter or some other well-known base address, like the GOT, so you can load that segment of code essentially anywhere in the flat 32-bit address space.)

Linux: https://en.wikipedia.org/wiki/Position-independent_code

Windows: https://en.wikipedia.org/wiki/Portable_Executable

Windows DLLs specify a "preferred address" in their PE image, telling the loader where, ideally, that code will be loaded.  This is a bit weird: at link time, you specify in the DLL where you'd ideally load yourself into a process's address space.
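To make the "preferred address" concrete, here is a minimal Python sketch that reads the ImageBase field out of a PE optional header, along with the DYNAMIC_BASE bit (0x0040) in DllCharacteristics, which is the bit the /DYNAMICBASE linker switch sets to opt a module into ASLR.  The function names and the make_fake_pe32 helper are my own inventions for illustration; the helper builds just enough of a header to exercise the parser and is not a loadable image.

```python
import struct

# Bit in the optional header's DllCharacteristics field that marks
# an image as opted in to ASLR.
IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE = 0x0040


def read_pe_base_info(data: bytes):
    """Return (preferred_base, aslr_opt_in) for the PE image in `data`."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew at offset 0x3C points to the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # The optional header follows the 4-byte signature and the
    # 20-byte COFF file header.
    opt = pe_offset + 4 + 20
    (magic,) = struct.unpack_from("<H", data, opt)
    if magic == 0x10B:    # PE32: 4-byte ImageBase at offset 28
        (image_base,) = struct.unpack_from("<I", data, opt + 28)
    elif magic == 0x20B:  # PE32+: 8-byte ImageBase at offset 24
        (image_base,) = struct.unpack_from("<Q", data, opt + 24)
    else:
        raise ValueError("unknown optional header magic")
    # DllCharacteristics sits at offset 70 in both layouts.
    (dll_chars,) = struct.unpack_from("<H", data, opt + 70)
    return image_base, bool(dll_chars & IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE)


def make_fake_pe32(image_base: int, dll_characteristics: int) -> bytes:
    """Hypothetical helper: a header-only PE32 stub for trying the parser."""
    buf = bytearray(0x200)
    buf[0:2] = b"MZ"
    struct.pack_into("<I", buf, 0x3C, 0x80)   # e_lfanew
    buf[0x80:0x84] = b"PE\0\0"
    opt = 0x80 + 4 + 20
    struct.pack_into("<H", buf, opt, 0x10B)   # PE32 magic
    struct.pack_into("<I", buf, opt + 28, image_base)
    struct.pack_into("<H", buf, opt + 70, dll_characteristics)
    return bytes(buf)
```

To try it on a real module, pass in the file's bytes, for example read_pe_base_info(open(path, "rb").read()), where path is whatever DLL you want to inspect.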

In the event that notepad.exe and explorer.exe can both load shell32.dll at shell32's preferred address, Windows can be pretty clever about the backing memory for shell32.  Even though each process gets a distinct and separate virtual address space, the pages backing shell32 in each of them can be backed by the same physical memory.  Essentially, the VM system is only charged "once" for shell32.

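In the event that a DLL cannot load into its preferred spot in memory, the loader must "rebase" the DLL to a new address that is available in the target process.  Rebasing means patching the absolute addresses embedded in the image, and every page that gets patched becomes a private copy in that process.  This disqualifies those VM pages from being re-used across multiple processes.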
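Here is a toy Python model of why rebasing costs memory.  It is only a sketch under simplifying assumptions: the "image" is a flat byte buffer, the relocation list is a plain list of offsets that each hold a 32-bit absolute address, and rebase_image is a name I made up.  The real loader works from the PE base relocation table, but the effect on pages is the same idea: every page containing a patched address gets written to, so it can no longer be shared with the on-disk image.

```python
import struct

PAGE_SIZE = 4096


def rebase_image(image: bytes, reloc_offsets, preferred_base: int, actual_base: int):
    """Patch each 32-bit absolute address by the load delta.

    Returns (patched_image, dirty_pages), where dirty_pages is the set of
    page indexes that were written to and would therefore become private.
    """
    delta = actual_base - preferred_base
    patched = bytearray(image)
    dirty_pages = set()
    if delta == 0:
        # Loaded at the preferred base: nothing to patch, nothing dirtied.
        return bytes(patched), dirty_pages
    for off in reloc_offsets:
        (addr,) = struct.unpack_from("<I", patched, off)
        struct.pack_into("<I", patched, off, (addr + delta) & 0xFFFFFFFF)
        # A 4-byte fixup can straddle a page boundary, so mark the page
        # of both its first and its last byte.
        dirty_pages.add(off // PAGE_SIZE)
        dirty_pages.add((off + 3) // PAGE_SIZE)
    return bytes(patched), dirty_pages
```

In this model a three-page image with fixups on two of its pages ends up with two private pages per process after a rebase, and zero when it loads at its preferred base.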

Obviously, in the case of Ax32.exe on terminal servers, we want to avoid rebasing as much as possible, as each instance of the client with a rebased module will consume more memory than is strictly needed, thus limiting capacity.

This is where ASLR comes in.  ASLR stands for "Address Space Layout Randomization", and it's a security feature in Vista.  Read the earlier links on how and why ASLR is used from a security perspective.

The practical upshot of ASLR for us is that the rebasing mechanism has now been moved into the kernel, and is system-wide per DLL instead of per-process.  This means that if a rebase does occur in Ax32.exe, it should be shared across all subsequent copies of Ax32.exe, leading to less private memory consumption per instance of the client.
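As a back-of-the-envelope illustration of what that could mean on a terminal server, here is a small Python calculation.  All of the numbers are hypothetical (50 client instances, an 8 MiB module, a quarter of its pages containing fixups), and the model assumes the behavior described above actually holds: a per-process rebase leaves each process with private copies of the fixed-up pages, while a system-wide rebase is paid for once.

```python
PAGE_SIZE = 4096


def physical_pages(instances: int, image_pages: int, fixup_pages: int,
                   rebase_is_shared: bool) -> int:
    """Estimate physical pages used by one rebased module across processes."""
    if rebase_is_shared:
        # Relocated once, system-wide: every process maps the same pages.
        return image_pages
    # Per-process rebase: untouched pages stay shared, but each process
    # holds its own private copy of every page that contains a fixup.
    return (image_pages - fixup_pages) + instances * fixup_pages


def pages_to_mib(pages: int) -> float:
    return pages * PAGE_SIZE / (1024 * 1024)


if __name__ == "__main__":
    # Hypothetical figures, chosen only to show the shape of the difference.
    instances, image_pages, fixup_pages = 50, 2048, 512
    per_process = physical_pages(instances, image_pages, fixup_pages, False)
    system_wide = physical_pages(instances, image_pages, fixup_pages, True)
    print(f"per-process rebase: {pages_to_mib(per_process):.1f} MiB")
    print(f"system-wide rebase: {pages_to_mib(system_wide):.1f} MiB")
```

The real savings depend on how many modules actually collide and how many of their pages carry fixups, which is exactly what needs to be measured.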

Additionally, because the kernel rebaser doesn't leave gaps between loaded modules, there is less memory page fragmentation and fewer pages are used in general.

We're still experimenting with the effect of ASLR on the per-client memory footprint, but we're hoping to see some savings on ASLR-enabled operating systems like Windows Server 2008.