Thoughts on setup repackaging and reverse engineering

While I was on the Visual Studio and .NET Framework setup team, a colleague and I visited 2 universities - MIT and USC - to validate our plans and designs for adding support to our setup UI to generate transforms for Active Directory deployment in the Whidbey version. During this trip, I was struck by the level of MSI knowledge that the system administrators at both schools possessed, and also by how willing they were to reverse engineer setups to do any or all of the following:

  • Use tools such as FileMon and RegMon to convert a non-MSI setup to an MSI so that it can be deployed via Active Directory
  • Remove specific custom actions, files, registry keys, etc. from an MSI
  • Use transforms to make changes to the installation behavior of an MSI

The administrators at the universities we visited, and many others I've talked to who are in charge of software deployment, simply need to roll out standard OS images with specific applications. Too few app developers understand the implications of their setup development decisions for scenarios like admin deployment.

Then, when I joined the Windows Embedded team, I found a common theme among embedded customers - they need to install drivers and applications onto their embedded images, but those drivers and applications do not ship with components that can be used by Target Designer and the other embedded tools. This leaves the following options:

  • Using the setup package to install on an embedded image after deploying the OS (which has drawbacks such as increased footprint and additional troubleshooting to identify missing dependencies needed by the setup package)
  • Reverse engineering the setup package, using the same set of tools as the university admins above, and creating a component that can be imported into the embedded database

It really struck me to think about the similarities in the problem spaces between the 2 teams that I have worked on at Microsoft. It also got me thinking because I have been working with and studying setup technologies for the 5 years I've been working at Microsoft, and over that time I've learned to think the way a setup developer thinks so that I can take apart a setup and figure out what it does. However, despite all of this experience, I still haven't seen any 2 setups that behave in exactly the same way.

I have come up with some tricks that can be useful to take apart a setup and figure out what it does, debug issues, and reassemble it in new ways. I'll list out some of the MSI-specific tips and tricks I've seen and used, and then some more general ideas:

Reverse Engineering MSI-based Setups

  • The biggest key to working with MSIs is to understand how Windows Installer works - what a component is, how Windows Installer does reference counting, what the data in key tables in the MSI are used for
  • Simple MSI setups are relatively easy to reverse engineer, assuming they only install files listed in the File table and write registry keys/values listed in the Registry table
  • As soon as you start introducing custom actions, type libraries, self registration, etc., the job becomes more difficult. If it weren't for these types of “black box” actions, our team could produce an msi2sld tool similar to the inf2sld tool that ships with the Windows Embedded tools. Such a tool could still be useful as a starting point for creating embedded components - I'll have to look into this some more
  • You can create an administrative install point by running msiexec /a <msi name>, which produces a file layout that includes all of the files that the MSI will install - this can be useful when the files to be installed are packaged into the MSI via CAB/MSM files, and also when creating an embedded component where you need to create a file repository
  • Orca is an invaluable tool - you can change/remove launch conditions, custom actions, registry values, component conditions, and on and on. I generally only use Orca to look at the contents of an MSI, though. I would recommend not changing a shipping setup - you never know what unintended side effects you may cause, and the worst possible outcome would be a change that ends up preventing future MSPs from applying to your machine.
  • Rather than using Orca to change existing setups, I suggest creating transforms. All you have to do is make a copy of the MSI you want to modify, keep one in the original form, modify the other with Orca, then use a transform creation tool to analyze the differences between the 2 MSIs and create an MST
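
To make the msiexec /a and transform tips above a bit more concrete, here is a minimal Python sketch that only builds the command lines and does not run them. The function names and the example paths are my own illustrations, not part of any shipping tool. The /a and /i switches, the /qb basic-UI switch and the TARGETDIR and TRANSFORMS properties are standard Windows Installer options, but check msiexec /? on your own machine before relying on them.

```python
def admin_install_command(msi_path, target_dir):
    """Build an msiexec command that extracts an MSI to a flat file layout.

    /a requests an administrative installation and TARGETDIR says where
    the layout should be written. Nothing is executed here.
    """
    return ["msiexec", "/a", msi_path, "/qb", f"TARGETDIR={target_dir}"]


def install_with_transform_command(msi_path, mst_path):
    """Build an msiexec command that installs an MSI with a transform applied.

    The original MSI stays untouched; the MST carries the changes.
    """
    return ["msiexec", "/i", msi_path, f"TRANSFORMS={mst_path}", "/qb"]


if __name__ == "__main__":
    # Hypothetical paths without spaces, chosen only for illustration.
    print(" ".join(admin_install_command(r"C:\temp\setup.msi", r"C:\layout")))
    print(" ".join(install_with_transform_command(r"C:\temp\setup.msi",
                                                  r"C:\temp\custom.mst")))
```

On a Windows machine, either list could be passed to subprocess.run to actually run the command. Paths that contain spaces need extra care with quoting, which this sketch does not attempt.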

Random Tricks for Packaging and Reverse Engineering

  • Running a setup package with the /? switch (or -? or -h or -help) will usually show a set of command line parameters supported by the setup. This is often useful to unpack a self-extracting EXE package to get to the files that will be installed or to get to the INF file for a driver to use inf2sld
  • Look for an INI file or DAT file in a self-extracting setup package or sitting next to a setup.exe - these often contain settings that are used during setup. Sometimes they will contain data or variables that are self-explanatory or contain comments (versions of Office are especially good at this), and these values can often be changed to update the behavior of a setup
  • Get to know other setup technologies - Wise and InstallShield both make MSI package creation tools, and have specific UI, packaging strategies and behaviors that are slightly different and unique; OCM and INF-based setups are used for installing drivers and some other components that ship as part of the OS on some platforms; IExpress is used for creating self-extracting EXE packages; SFXCab is a next-generation update to IExpress that is used for newer hotfixes created for Windows that are shipped on Windows Update
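
As a rough illustration of the INI file tip above, here is a small Python sketch that lists the settings in a setup INI file and changes one of them. The file contents, the section name and the key names are all made up for the example. Real setup INI files vary a lot, and some use syntax (duplicate keys, for instance) that Python's configparser will reject.

```python
import configparser
import io

# Hypothetical contents of an INI file found next to a setup.exe.
SAMPLE_INI = """\
; Settings read by the setup bootstrapper
[Startup]
ProductName = Example Product
RequireReboot = 1
"""


def load_setup_settings(text):
    """Parse INI text, keeping key case and treating % literally."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # do not lowercase key names
    parser.read_string(text)
    return parser


def set_option(parser, section, key, value):
    """Change one value and return the rewritten INI text.

    Note that configparser drops comments when writing, so keep a copy
    of the original file.
    """
    parser.set(section, key, value)
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


if __name__ == "__main__":
    settings = load_setup_settings(SAMPLE_INI)
    for section in settings.sections():
        for key, value in settings.items(section):
            print(f"[{section}] {key} = {value}")
    print(set_option(settings, "Startup", "RequireReboot", "0"))
```

The same caution applies here as with Orca: work on a copy of the file and keep the original around so you can compare the setup's behavior before and after the change.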

It seems like I have more tricks that I've used but I can't seem to remember them all right now. I'm getting tired tonight, but I think tomorrow I'll take Visual Studio and .NET Framework setup packages and walk through some of the inner workings and show how I use some of the above tricks.

I have no idea if anyone but me is really interested in the theories and technologies behind setup creation, but if you made it this far and have comments/suggestions/questions/etc please send me an email or post a comment. Thanks!