Binary Patching tools (mspatcha, mspatchc)

Microsoft has some great binary patching tools. In my simple tests they were over twice as size-efficient as zip files, and in some cases 10x more efficient.

It took me about 2 hours of searching around to find them, so I wanted to pass it on. I started off with diff/patch, but they didn't handle binary files well. I searched around and found various commercial tools. Eventually I stumbled across "Binary Delta Compression" and found these guys.

Background:
It's called the Delta Compression API, and it specifically uses knowledge of PE files to improve the compression. It's part of the Windows Installer technologies.

Windows has 2 key dlls here:

  • mspatchc.dll - creates a patch (delta) file. This dll is in the Platform SDK.
  • mspatcha.dll - applies a patch (delta) file. This is in the system32 directory.

These dlls both expose APIs which are declared in patchapi.h and documented under the Delta Compression API on MSDN.
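
If you'd rather call the apply side directly instead of going through a command line tool, a minimal Python sketch using ctypes could look like this. It assumes the ApplyPatchToFileW export of mspatcha.dll with the (patch, old, new, flags) argument order from patchapi.h, and it passes 0 for the option flags. The wrapper name apply_patch is made up for illustration, and the whole thing is Windows-only, so treat it as a sketch rather than a reference implementation.

```python
import ctypes
import sys


def apply_patch(patch_file, old_file, new_file, flags=0):
    """Apply a delta file to old_file, writing the result to new_file.

    Hypothetical wrapper around ApplyPatchToFileW in mspatcha.dll.
    flags=0 means "no special apply options" (an assumption of this sketch).
    """
    if sys.platform != "win32":
        raise OSError("mspatcha.dll is only available on Windows")

    mspatcha = ctypes.WinDLL("mspatcha.dll", use_last_error=True)
    fn = mspatcha.ApplyPatchToFileW
    # BOOL ApplyPatchToFileW(LPCWSTR patch, LPCWSTR old, LPCWSTR new, ULONG flags)
    fn.argtypes = [ctypes.c_wchar_p, ctypes.c_wchar_p,
                   ctypes.c_wchar_p, ctypes.c_ulong]
    fn.restype = ctypes.c_int  # Win32 BOOL: nonzero on success

    if not fn(patch_file, old_file, new_file, flags):
        # Surface the Win32 error code as a Python exception.
        raise ctypes.WinError(ctypes.get_last_error())
```

Creating a patch goes through mspatchc.dll instead, which is why only the apply side is sketched here: mspatcha.dll is the one that is already in system32.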

The Platform SDK includes very convenient command line tools (apatch.exe, mpatch.exe) to use the APIs.

Patching has some advantages:
- smaller than releasing the new dll.
- the patch is only useful if the audience has the original dll, so you can freely release the patch without worrying about piracy.

Summary:
1. Download the Platform SDK. That will pull down mspatchc.dll, and it also has some great sample tools in "C:\Program Files\Microsoft Platform SDK\Samples\SysMgmt\Msi\Patching"
2. The diff tool is mpatch.exe ("make patch", the thing that makes the patch file). The apply tool is apatch.exe ("apply patch").

Syntax for mpatch is "old_file new_file patch_file".
Syntax for apatch is "patch_file old_file new_file".
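
The argument order differs between the two tools, which is easy to get wrong in a script. As a rough sketch, the two command lines could be wrapped like this in Python. The helper names (mpatch_args, apatch_args, run_tool) are hypothetical, the tools are assumed to be on the PATH, and check=True assumes they return a nonzero exit code on failure, which is not something the syntax above confirms.

```python
import subprocess


def mpatch_args(old_file, new_file, patch_file, exe="mpatch"):
    """Build the 'make patch' command line: old_file new_file patch_file."""
    return [exe, old_file, new_file, patch_file]


def apatch_args(patch_file, old_file, new_file, exe="apatch"):
    """Build the 'apply patch' command line: patch_file old_file new_file."""
    return [exe, patch_file, old_file, new_file]


def run_tool(args):
    """Run one of the tools; raises CalledProcessError on a nonzero exit code.

    Assumption: the tools signal failure through their exit code.
    """
    subprocess.run(args, check=True)


if __name__ == "__main__":
    # Example only: file names here are placeholders.
    print(mpatch_args("old.dll", "new.dll", "my_patch"))
    print(apatch_args("my_patch", "old.dll", "rebuilt.dll"))
```

Passing the arguments as a list rather than one string avoids quoting problems with paths that contain spaces, such as anything under "Program Files".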

Demo 1: Major upgrade

Let's say I want to make a patch to upgrade mscordbi.dll from V1.1 to V2.0. V1.1 mscordbi.dll was 233,472 bytes, and V2.0 was 288,768 bytes. So that's about 55k of new code in V2 (we added about 50 new interface methods, including Edit and Continue).

I run:

    mpatch c:\WINDOWS\Microsoft.NET\Framework\v1.1.4322\mscordbi.dll c:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\mscordbi.dll dbi_patch

That produces a file, dbi_patch, which is 119,468 bytes. That's only about 41% of the V2 file size. The zipped size of the V2 mscordbi.dll is 152k, so the patch is significantly smaller. And the patch has to carry roughly 50k of brand-new code (around 40% of the patch file size). Note that the patch file has very high entropy, so it won't compress well at all.

Now let's reapply the patch. We'll build a dummy dll, d2.dll, and then compare that to the V2 mscordbi.dll to ensure they're exactly the same.

    C:\temp\diff>apatch dbi_patch c:\WINDOWS\Microsoft.NET\Framework\v1.1.4322\mscordbi.dll d2.dll

    APATCH 5.1.2600.0 Patch Application Utility
    Copyright (c) Microsoft Corporation. All rights reserved.

    100.0% complete

    OK

    C:\temp\diff>fc /b d2.dll c:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\mscordbi.dll
    Comparing files d2.dll and C:\WINDOWS\MICROSOFT.NET\FRAMEWORK\V2.0.50727\MSCORDBI.DLL
    FC: no differences encountered
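
The same byte-for-byte check can be scripted if fc isn't handy. This is a small Python sketch, not part of the patching tools: the function name first_difference is made up, and it simply reports the offset of the first mismatching byte, or None when the files are identical.

```python
def first_difference(path_a, path_b, chunk_size=65536):
    """Return the offset of the first differing byte, or None if identical.

    If one file is a prefix of the other, the offset returned is the
    length of the shorter file.
    """
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if a != b:
                # Locate the exact mismatching byte inside this chunk.
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                # Chunks agree up to the shorter one, so the lengths differ.
                return offset + min(len(a), len(b))
            if not a:
                # Both files hit end-of-file together with no mismatch.
                return None
            offset += len(a)
```

For the demo above, first_difference("d2.dll", path_to_v2_dll) returning None would correspond to "FC: no differences encountered".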

Demo 2: Minor upgrade

Patching is ideal when there are just minor updates, such as for a service pack. When I make the patch between V1.0 mscordbi (221,184 bytes) and V1.1 mscordbi (233,472 bytes), the patch size is 43,166 bytes, which is 18% of the new file size. I've played around with some other cases where the patch is ~10% of the new file size, so your scenarios may hit a jackpot.

V1.1 mscordbi zips to 110k, so the patch is still < 40% of the compressed new file size.
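
To make the arithmetic behind both demos easy to recheck, here is a short Python sketch that recomputes the ratios from the byte counts quoted above. The zipped sizes are only given as "152k" and "110k", so the sketch assumes k means 1,000 bytes; using 1,024 shifts the zip ratios by a point or two but doesn't change the conclusions.

```python
# File sizes in bytes, taken from the two demos.
V10_SIZE = 221_184
V11_SIZE = 233_472
V20_SIZE = 288_768
MAJOR_PATCH = 119_468   # V1.1 -> V2.0
MINOR_PATCH = 43_166    # V1.0 -> V1.1

# Zipped sizes are approximate; "k" is assumed to mean 1,000 bytes.
V20_ZIPPED = 152_000
V11_ZIPPED = 110_000

growth = V20_SIZE - V11_SIZE                  # 55,296 bytes of growth in V2
major_vs_new = MAJOR_PATCH / V20_SIZE         # about 41.4%
minor_vs_new = MINOR_PATCH / V11_SIZE         # about 18.5%
major_vs_zip = MAJOR_PATCH / V20_ZIPPED       # patch vs zipped V2.0
minor_vs_zip = MINOR_PATCH / V11_ZIPPED       # patch vs zipped V1.1

if __name__ == "__main__":
    print(f"V2 growth over V1.1:      {growth:,} bytes")
    print(f"major patch / new file:   {major_vs_new:.1%}")
    print(f"minor patch / new file:   {minor_vs_new:.1%}")
    print(f"major patch / zipped new: {major_vs_zip:.1%}")
    print(f"minor patch / zipped new: {minor_vs_zip:.1%}")
```

The minor-upgrade patch comes out under 40% of the zipped V1.1 file under either reading of "k", while the major-upgrade patch is closer to three quarters of the zipped V2.0 file.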