When using Enhanced Write Filter (EWF) in RAM or RAMREG mode, customers often assume the EWF overlay is limited solely by the available physical memory. Consequently, many expect to achieve an overlay twice as large on a system with 2 GB of RAM as on a system with 1 GB of RAM. This is not the case. This article explains the factors that limit the overlay size and the significant improvements seen in Windows Embedded Standard 2011.
If you are more interested in the conclusion than the internals, skip the following sections and jump directly to the “Results and Conclusion” section.
EWF memory allocation internals
EWF maintains a list of memory descriptor lists (MDLs) that describe the entire overlay. Using MDLs is the preferred memory allocation technique for kernel mode drivers whose memory needs exceed the Paged Pool or Non Paged Pool limits. The overlay grows on an as-needed basis and is not pre-allocated in DriverEntry. Whenever a new write arrives that cannot be accommodated in the overlay, we allocate a new MDL that describes 64 KB, increasing the overlay size by 64 KB. No more allocations are needed until this 64 KB chunk is used up. This continues until the system cannot allocate any more MDLs. Next, I will explain why this happens long before we actually run out of physical memory on the system.
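The on-demand growth described above can be pictured with a small toy model. This is only a sketch and not the EWF driver's actual code: the class name `OverlaySketch`, the write sizes, and the simplification that free space is tracked as a single byte count are all assumptions made for illustration.

```python
CHUNK_BYTES = 64 * 1024  # each new MDL describes 64 KB


class OverlaySketch:
    """Toy model of on-demand overlay growth (hypothetical, not EWF code)."""

    def __init__(self):
        self.chunks = 0      # number of 64 KB MDLs allocated so far
        self.free_bytes = 0  # unused overlay space still available

    def write(self, nbytes):
        # Allocate another 64 KB chunk only when the write does not fit.
        while nbytes > self.free_bytes:
            self.chunks += 1
            self.free_bytes += CHUNK_BYTES
        self.free_bytes -= nbytes

    @property
    def size_bytes(self):
        return self.chunks * CHUNK_BYTES


overlay = OverlaySketch()
for n in (4096, 60 * 1024, 512):  # hypothetical write sizes in bytes
    overlay.write(n)
print(overlay.chunks, overlay.size_bytes)
```

The first two writes exactly fill one 64 KB chunk, so only the third write triggers a second allocation.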
Each MDL describes a set of pages – in the case of EWF, we use MDLs that describe 16 physical pages (64 KB per MDL / 4 KB per page = 16 pages per MDL). To map these 16 noncontiguous physical pages into 16 contiguous virtual pages, the operating system needs 16 Page Table Entries (PTEs). Specifically, it needs 16 system PTEs because MDLs don't use regular PTEs. So each time the overlay grows by 64 KB we have 16 fewer system PTEs.
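As a quick check of the arithmetic, the following sketch computes how many system PTEs an overlay of a given size would consume under the figures stated above (4 KB pages, 64 KB per MDL). The function name `system_ptes_for_overlay` is made up for this example.

```python
PAGE_KB = 4                        # page size
MDL_KB = 64                        # overlay growth unit per MDL
PAGES_PER_MDL = MDL_KB // PAGE_KB  # 16 pages, hence 16 system PTEs per MDL


def system_ptes_for_overlay(overlay_kb):
    """Return the system PTEs consumed by an overlay of overlay_kb kilobytes."""
    mdls = -(-overlay_kb // MDL_KB)  # round up to whole 64 KB chunks
    return mdls * PAGES_PER_MDL


print(system_ptes_for_overlay(64))          # one MDL
print(system_ptes_for_overlay(100 * 1024))  # a 100 MB overlay
```

A 100 MB overlay needs 1600 MDLs and therefore 25600 system PTEs.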
System PTEs are not an unlimited resource, and their total number does NOT increase linearly with physical memory. So the EWF overlay size is limited by the availability of system PTEs and NOT by the availability of physical memory. Note that system PTEs are also used by the operating system and other kernel mode drivers (video drivers take a big chunk).
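To make the point concrete, here is a minimal sketch of which resource runs out first. The function `max_overlay_kb` and the input numbers are hypothetical; in particular, 175000 free system PTEs is an invented figure, not a measured one.

```python
PAGES_PER_MDL = 16  # 64 KB per MDL / 4 KB per page
MDL_KB = 64


def max_overlay_kb(free_physical_pages, free_system_ptes):
    """Largest overlay (in KB) before either resource is exhausted."""
    # Every 64 KB chunk needs 16 physical pages AND 16 system PTEs,
    # so the scarcer of the two resources sets the limit.
    chunks = min(free_physical_pages, free_system_ptes) // PAGES_PER_MDL
    return chunks * MDL_KB


ptes = 175000                           # hypothetical free system PTEs
two_gb_pages = 2 * 1024 * 1024 // 4     # 4 KB pages in 2 GB
four_gb_pages = 4 * 1024 * 1024 // 4    # 4 KB pages in 4 GB
print(max_overlay_kb(two_gb_pages, ptes))
print(max_overlay_kb(four_gb_pages, ptes))
```

Both calls return the same value: with a fixed system PTE budget, doubling physical memory does not raise the limit.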
The next question is: how many system PTEs does Windows provide? It varies, and the primary factors are the version of Windows (Windows XP vs. Windows 7) and the processor architecture (x86 vs. x64). The exact numbers are not documented on MSDN, but you can easily find out for yourself using WinDbg.
Using WinDbg to determine the system PTE information
- Enable kernel debugging on the target system and restart the machine. You can either debug remotely using a serial or 1394 cable, or use local kernel debugging.
- Launch WinDbg and set the symbol path to the Microsoft public symbols server:
- .sympath SRV*C:\PublicSymbols*http://msdl.microsoft.com/download/symbols
Results and Conclusion
Windows Embedded Standard 2009 (32 bit edition) vs. Windows Embedded Standard 2011 (32 bit edition)
Note: On Windows XP, there is a memory management setting that can increase the number of system PTEs compared to the default behavior. To use it, set the REG_DWORD value “SystemPages” under “HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management” to 0xffffffff. This setting is not used by the Memory Manager in Windows 7.
Here’s how to interpret the results: on Windows Embedded Standard 2009 with 2 GB of RAM, system PTEs can only describe up to 1007484 KB (~984 MB). The EWF overlay limit will be lower than this owing to system PTE usage by other drivers and the operating system. In my experiments, I observed this usage to be around 300 MB, which limits the EWF overlay to about 684 MB. The improvement in Windows Embedded Standard 2011 is hard to miss. On Windows Embedded Standard 2011 with the same configuration, system PTEs can describe up to 1709952 KB (~1670 MB). Assuming similar system PTE usage by other components, the EWF overlay can be as large as 1370 MB. The overlay limit has roughly doubled from Windows Embedded Standard 2009 to Windows Embedded Standard 2011.
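The estimate above is simple enough to reproduce. The sketch below uses the two system PTE coverage figures quoted in this section and the roughly 300 MB of usage by other components that I observed; the function name `estimate_overlay_mb` is made up, and the 300 MB figure will differ on your system.

```python
def estimate_overlay_mb(pte_coverage_kb, other_usage_mb=300):
    """Estimate the EWF overlay limit in MB from system PTE coverage in KB."""
    # Convert KB to MB, then subtract what other drivers and the OS consume.
    return round(pte_coverage_kb / 1024) - other_usage_mb


wes2009 = estimate_overlay_mb(1007484)  # Windows Embedded Standard 2009, 2 GB RAM
wes2011 = estimate_overlay_mb(1709952)  # Windows Embedded Standard 2011, 2 GB RAM
print(wes2009, wes2011, round(wes2011 / wes2009, 2))
```

To estimate the limit for your own image, substitute the coverage and usage figures you measure on your target.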
You will also notice that the limits on a 2 GB system are smaller than those on a 1 GB system. To reiterate my earlier point, system PTEs don’t increase linearly with RAM. With more physical memory, the kernel needs to maintain larger bookkeeping structures (such as the PFN database), which leaves less kernel address space for system PTEs. Hence, increasing physical memory from 1 GB to 2 GB is not beneficial if you are looking to raise the overlay limit. It is, however, beneficial for RAM-hungry user mode processes.
Windows Embedded Standard 2011 (64 bit edition)
On the 64 bit edition of Windows, the kernel address space is much larger (both theoretical and supported) and system PTEs can describe up to 128 GB of RAM. However, EWF on Windows Embedded Standard 2011 has only been tested on systems with up to 4 GB of RAM. If you have embedded scenarios that use more RAM, please provide feedback either on the forums or on the Connect website. Refer to the product documentation that will accompany Windows Embedded Standard 2011 RTM for the final word on this.
Conclusions and call to action
- Estimate the overlay needs for your embedded scenario, compare them with the chart above, and decide on the right platform (Windows Embedded Standard 2009 or Windows Embedded Standard 2011).
- Consider using the 64 bit edition of Windows Embedded Standard 2011 if you need more overlay space than the 32 bit edition offers.
- As always, provide feedback!
References
- Mark Russinovich’s blog article - “Pushing the Limits of Windows – Virtual Memory”.
- Memory Management advances in Windows Vista and Windows Server 2008.
- For general reading on OS Internals - Windows® Internals: Including Windows Server 2008 and Windows Vista, Fifth Edition.