Installment Two: The Windows Image File format
To understand Windows imaging in Longhorn and Ximage, you first need to understand a little about the Windows Image File (WIM) format. WIM files have five sections. The first three aren’t all that interesting to us: the header, footer and metadata all describe the file itself.
The image metadata section, however, is very interesting. Within each image file we can store multiple images. When you get your Longhorn CD, instead of having hundreds of individual files, as you’re used to seeing on Windows CDs, it will have one large WIM file. Currently it is called install.wim. Inside this one WIM file you might find multiple SKUs of Windows. For example, one WIM might contain Longhorn Pro, Server and Advanced Server, or whatever we end up using for product names. The image metadata section of the WIM file describes each of the images in the WIM file, each of the files that make up each image (recall from installment one that Windows images are file based instead of sector based), and exactly how each file gets laid down on the disk when the image is restored.
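To make that description concrete, here is a rough sketch in Python of the kind of information the image metadata section carries. This is only an illustration, not the actual on-disk layout: the class names, the field names and the data_ref key are all made up for the example.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class FileEntry:
    # Hypothetical record: where the file goes on disk when the image is restored,
    # and a made-up key that points at its bytes in the file data section.
    path: str
    data_ref: str


@dataclass
class ImageMetadata:
    # One image inside the WIM, for example one SKU of Windows.
    name: str
    files: List[FileEntry] = field(default_factory=list)


@dataclass
class WimSketch:
    # A WIM file can hold several images.
    images: List[ImageMetadata] = field(default_factory=list)

    def restore_plan(self, image_name: str) -> List[Tuple[str, str]]:
        """Return (destination path, data reference) pairs for one image."""
        for image in self.images:
            if image.name == image_name:
                return [(f.path, f.data_ref) for f in image.files]
        raise KeyError(image_name)


wim = WimSketch(images=[
    ImageMetadata("Pro", [FileEntry(r"Windows\System32\ntdll.dll", "blob-1")]),
    ImageMetadata("Server", [FileEntry(r"Windows\System32\ntdll.dll", "blob-1")]),
])
plan = wim.restore_plan("Server")
```

Because the metadata is a list of files rather than a list of disk sectors, restoring an image amounts to walking that list and writing each file to its destination path.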
The final section is the file data, which contains the files that make up the images. There are a couple of things that are interesting about the file data. First, the data is compressed by default. Second, the files are always stored using Single Instance Storage (SIS). I first heard about SIS from the Exchange team. Obviously it is common for a single email to reside in the inboxes of hundreds of users. SIS enables the Exchange server to keep only one copy of that email. In the example above, a Longhorn CD with Pro, Server and Advanced Server all in the same file, there will be only one copy of ntdll.dll and only one copy of ntoskrnl.exe. It is a very efficient file storage format.
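Here is a minimal sketch of the single-instancing idea in Python. It is not how WIM itself is implemented: SHA-256 and zlib are stand-ins I picked for the example, and the class and method names are hypothetical. The point is only that identical content is stored once, no matter how many images reference it.

```python
import hashlib
import zlib


class SingleInstanceStore:
    """Toy store: each unique file content is kept once, compressed."""

    def __init__(self):
        self.blobs = {}   # content digest -> compressed bytes (the "file data")
        self.images = {}  # image name -> {path: content digest}

    def add_file(self, image, path, content):
        # Identify the content by its hash (SHA-256 is an assumption here).
        digest = hashlib.sha256(content).hexdigest()
        # Store the bytes only if this content has not been seen before.
        if digest not in self.blobs:
            self.blobs[digest] = zlib.compress(content)
        self.images.setdefault(image, {})[path] = digest

    def read_file(self, image, path):
        # Follow the image's reference to the shared blob and decompress it.
        digest = self.images[image][path]
        return zlib.decompress(self.blobs[digest])


store = SingleInstanceStore()
fake_ntdll = b"pretend this is ntdll.dll" * 100
for sku in ("Pro", "Server", "Advanced Server"):
    store.add_file(sku, r"Windows\System32\ntdll.dll", fake_ntdll)

unique_blobs = len(store.blobs)  # three images reference one stored copy
```

Each image still lists ntdll.dll at its own path, but all three entries point at the same stored blob, so adding another SKU costs only the files that actually differ.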
In my next installment I’ll talk about the public interfaces to the WIM file format.