Extract Files from Patches

From the mailbag, someone asked how to extract files from a patch. Presumably
one would want to extract the files as they would apply to a product if the
patch were installed, but I will cover both ways because one can lead to the
other. If you’re looking for the simplest and quickest way to extract files from
a patch, skip toward the end; otherwise, if you’re interested in the structure
of a .msp file and how to extract all files regardless of which product the
patch targets, read on.

Recall from What’s in a Patch that a .msp file contains sub-storages for pairs
of transforms that transform the patch target package to the patch upgrade
package, and possibly one or more sub-streams for the cabinet files that contain
the files to be patched. Because the internal structure of a .msp file uses OLE
structured storage you can extract the transforms and cabinet files; however, to
allow for stream names of 62 characters instead of the 32 characters (including
the terminating null) that OLE allows, Windows Installer compresses stream names
except for the summary information stream, named \005SummaryInformation. You’ll
also find more streams than perhaps expected, which are there for use by Windows
Installer. That doesn’t prevent you from at least extracting the cabinet files
from .msp files.
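The name compression is not spelled out in the Windows Installer documentation.
The sketch below follows the scheme used by open-source readers such as Wine’s
MSI implementation, so treat the alphabet and the 0x3800 and 0x4800 offsets as
assumptions rather than a published specification. The idea is that characters
from a 64-character alphabet are packed two at a time into a single UTF-16 code
unit, which is how a 62-character name fits within the OLE limit. The function
names are hypothetical.

```python
# Assumed 64-character alphabet: digits, upper case, lower case, '.', '_'.
_ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz._"


def decode_stream_name(raw: str) -> str:
    """Expand a compressed stream name into its readable form."""
    out = []
    for ch in raw:
        code = ord(ch)
        if 0x3800 <= code < 0x4800:
            # Two alphabet characters packed into one code unit.
            code -= 0x3800
            out.append(_ALPHABET[code & 0x3F])
            out.append(_ALPHABET[(code >> 6) & 0x3F])
        elif 0x4800 <= code < 0x4840:
            # A single trailing alphabet character.
            out.append(_ALPHABET[code - 0x4800])
        else:
            # Anything else (such as \005 in the summary stream) is unchanged.
            out.append(ch)
    return "".join(out)


def encode_stream_name(name: str) -> str:
    """Inverse of decode_stream_name, included to show the packing."""
    out = []
    i = 0
    while i < len(name):
        first = _ALPHABET.find(name[i])
        if first < 0:
            out.append(name[i])
            i += 1
            continue
        second = _ALPHABET.find(name[i + 1]) if i + 1 < len(name) else -1
        if second >= 0:
            out.append(chr(0x3800 + first + (second << 6)))
            i += 2
        else:
            out.append(chr(0x4800 + first))
            i += 1
    return "".join(out)
```

A name such as \005SummaryInformation passes through decode_stream_name
untouched because none of its characters fall in the packed ranges.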

To enumerate and thereby extract all sub-storages and streams, use the OLE
structured storage APIs like the StgOpenStorageEx function to get a pointer to
the IStorage interface. Call the IStorage::EnumElements function on the
interface to get the IEnumSTATSTG interface pointer. As is typical with
IEnumXXXX interface implementations, call the Next function. In this case you
get a STATSTG structure. If the STATSTG.type field is STGTY_STORAGE (1) you’ve
found a transform and STATSTG.pwcsName is the name of the transform. If the
STATSTG.type field is STGTY_STREAM (2) you’ve found a stream. To determine if
the stream is a cabinet you can check the first 4 bytes of the stream for
“MSCF”.
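The decision made for each enumerated element can be sketched independently of
COM. The following is a minimal illustration, not a full enumerator: it assumes
you already have the STATSTG.type value and, for streams, the first few bytes
read from the stream. The function name and the returned labels are
hypothetical.

```python
STGTY_STORAGE = 1  # element is a sub-storage (a transform in a .msp file)
STGTY_STREAM = 2   # element is a stream
CABINET_SIGNATURE = b"MSCF"


def classify_element(element_type: int, first_bytes: bytes = b"") -> str:
    """Label an enumerated element as 'transform', 'cabinet', 'stream', or 'other'."""
    if element_type == STGTY_STORAGE:
        return "transform"
    if element_type == STGTY_STREAM:
        # Cabinet files begin with the 4-byte signature "MSCF".
        if first_bytes[:4] == CABINET_SIGNATURE:
            return "cabinet"
        return "stream"
    return "other"
```

Only the first four bytes of a stream are needed, so there is no reason to read
an entire cabinet just to identify it.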

Patches produced with PatchWiz.dll from the Windows Installer SDK will contain
one cabinet with all files for all transforms in the patch. The files in the
cabinet are all named with the value of the File column of the File table, so
with a quick lookup you can get whatever files you want. This allows you to get
all of the files for a patch regardless of what product .msi packages the patch
targets. Obviously there’s quite a bit of work here.

A similar approach is to open the .msp file using the MsiOpenDatabase function,
passing MSIDBOPEN_PATCHFILE for the second parameter. Note that this cannot be
done in a custom action because the second parameter will marshal as a string,
so any value besides MSIDBOPEN_READONLY (0) won’t marshal correctly.

You can then use the view APIs like the MsiDatabaseOpenView, MsiViewExecute, and
MsiViewFetch functions to query the _Storages table to get the transforms and
the _Streams table to get the cabinet file in a patch. Querying the _Streams
table in a .msi file or a .msm file may also return other streams like binaries
in the Binary table or icons in the Icon table. While you can read data directly
from the Data column of the _Streams table using the MsiRecordReadStream
function, you cannot read from the Data column of the _Storages table. You can
use the names and the OLE structured storage APIs as described above to get the
exact name of the sub-storage to extract using the IStorage::OpenStorage
function.
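As a rough illustration of the database approach, here is a sketch that calls
msi.dll through ctypes and saves every stream in the patch that starts with
“MSCF”. It is untested against real patches and makes several assumptions: the
function name extract_cabinets is hypothetical, the output directory already
exists, stream names are usable as file names, and each stream is small enough
to read into memory in one call.

```python
import ctypes
import sys
from pathlib import Path

MSIDBOPEN_READONLY = 0    # persist value for a read-only database
MSIDBOPEN_PATCHFILE = 32  # added to the persist value to open a .msp file
ERROR_SUCCESS = 0
ERROR_NO_MORE_ITEMS = 259


def _check(code, what):
    if code != ERROR_SUCCESS:
        raise OSError(f"{what} failed with error {code}")


def extract_cabinets(msp_path, out_dir):
    """Save each _Streams entry that begins with MSCF; return the paths."""
    if sys.platform != "win32":
        raise OSError("Windows Installer is only available on Windows")
    msi = ctypes.WinDLL("msi")
    handle_p = ctypes.POINTER(ctypes.c_ulong)  # MSIHANDLE*
    # The persist argument is a pointer-sized value, not a real string.
    msi.MsiOpenDatabaseW.argtypes = [ctypes.c_wchar_p, ctypes.c_void_p, handle_p]
    db, view = ctypes.c_ulong(), ctypes.c_ulong()
    persist = MSIDBOPEN_READONLY + MSIDBOPEN_PATCHFILE
    _check(msi.MsiOpenDatabaseW(str(msp_path), persist, ctypes.byref(db)),
           "MsiOpenDatabase")
    saved = []
    try:
        query = "SELECT `Name`, `Data` FROM `_Streams`"
        _check(msi.MsiDatabaseOpenViewW(db, query, ctypes.byref(view)),
               "MsiDatabaseOpenView")
        _check(msi.MsiViewExecute(view, 0), "MsiViewExecute")
        while True:
            rec = ctypes.c_ulong()
            code = msi.MsiViewFetch(view, ctypes.byref(rec))
            if code == ERROR_NO_MORE_ITEMS:
                break
            _check(code, "MsiViewFetch")
            try:
                name = ctypes.create_unicode_buffer(256)
                cch = ctypes.c_ulong(len(name))
                _check(msi.MsiRecordGetStringW(rec, 1, name, ctypes.byref(cch)),
                       "MsiRecordGetString")
                size = msi.MsiRecordDataSize(rec, 2)
                data = ctypes.create_string_buffer(size)
                cb = ctypes.c_ulong(size)
                _check(msi.MsiRecordReadStream(rec, 2, data, ctypes.byref(cb)),
                       "MsiRecordReadStream")
                if data.raw[:4] == b"MSCF":
                    path = Path(out_dir) / (name.value + ".cab")
                    path.write_bytes(data.raw[:cb.value])
                    saved.append(path)
            finally:
                msi.MsiCloseHandle(rec)
    finally:
        if view.value:
            msi.MsiCloseHandle(view)
        msi.MsiCloseHandle(db)
    return saved
```

Because this runs in an ordinary process rather than a custom action, the
persist value is passed straight through as a pointer-sized integer and the
marshaling problem noted above does not come up.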

There is a much simpler way to accomplish all of this, but you’ll only extract
files from a patch that apply to a specific product, since only the first pair
of transforms in the patch that applies to a product is used. If your patch only
targets a single product then you have no worries. You first perform an
administrative installation of the target product .msi package, which runs only
basic actions like InstallFiles in the AdminExecuteSequence table. Passing the
command to start /wait will block until msiexec.exe completes and returns.

start /wait msiexec /a product.msi TARGETDIR="%TMP%\Product" /qn

Next you apply the patch that contains the files you want to extract. This is
the same method you would use to apply any minor upgrades that a patch might
target. Patches will typically transform the AdminExecuteSequence table to add
the PatchFiles action.

start /wait msiexec /p patch.msp /a "%TMP%\Product\product.msi" /qn

Now the files that were patched in the product will exist in the directory
structure and you can fish them out as necessary. If your patch targets multiple
products you’ll need to repeat this for each product, which is why in such cases
the more complicated method of file extraction described above is beneficial.
Note also that any directories that depend upon 64-bit redirection but whose
source directory structures are the same will overwrite files because such
redirection is not performed for administrative installations. This happened
with an early pre-release of the .NET Framework 2.0.
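If you need to repeat the administrative installation and patch steps for
several products, the two msiexec commands can be scripted. The sketch below
only builds and runs the same command lines shown earlier; the function names
are hypothetical, and it assumes paths without spaces, since quoting of the
TARGETDIR property is left to subprocess. subprocess.run waits for msiexec.exe
to exit, which plays the role of start /wait.

```python
import subprocess
from pathlib import PureWindowsPath


def admin_install_command(msi, target_dir):
    """Command line for an administrative installation of the product."""
    return ["msiexec", "/a", str(msi), f"TARGETDIR={target_dir}", "/qn"]


def admin_image_msi(msi, target_dir):
    """Path of the .msi copied into the administrative image."""
    return str(PureWindowsPath(target_dir) / PureWindowsPath(msi).name)


def apply_patch_command(msp, admin_msi):
    """Command line that applies a patch to the administrative image."""
    return ["msiexec", "/p", str(msp), "/a", str(admin_msi), "/qn"]


def extract_patched_files(msi, msp, target_dir):
    """Run both steps; the patched files end up under target_dir."""
    # Assumption: target_dir contains no spaces (see the note above).
    subprocess.run(admin_install_command(msi, target_dir), check=True)
    subprocess.run(
        apply_patch_command(msp, admin_image_msi(msi, target_dir)), check=True
    )
```

For a patch that targets several products, call extract_patched_files once per
product .msi with a separate target directory for each.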