Deciphering the MSI Directory table, part 2.

Last time I left off with a quick story about how I first got started with the Directory table as an intern on the Windows Installer. I suppose that blog entry was more of a teaser than a real story, and it was certainly light on technical details. So, let's pick up with some of the data that I scrounged together before sitting down in Ben's office.

Here's an example of the simplest (non-empty) Directory.idt file that I could find as an intern:

 
Directory       Directory_Parent      DefaultDir
s72             S72                   l255
Directory       Directory
TARGETDIR                             SourceDir

First, let's decipher all of the data provided so far. This is an IDT file (a text form of an MSI table, intended for use as an archival mechanism). IDT files have a very particular format. The first line holds the names of the columns, separated by tabs. Thus this table has three columns called Directory, Directory_Parent, and DefaultDir. Easy enough, but there is a lot of contextual information stored in those names that I'll revisit later.

The second line contains the data definitions for each of the columns. The letter specifies the data type: "s" is string, "l" is localizable string, "i" is numeric and "v" is data stream. The number specifies the size of the column, which is hopefully self-explanatory, except that 0 means "unbounded". Finally, the case of the letter specifies whether the column is nullable: a lowercase letter means not nullable and an uppercase letter means nullable. In this case, only the Directory_Parent column (S72) is nullable.
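To make those rules concrete, here is a minimal Python sketch that decodes a single column definition using only what is described above. The function name parse_column_type and the KINDS table are my own inventions for illustration; they are not part of any Windows Installer tool or API.

```python
import re

# Type letters exactly as described in the text above.
KINDS = {
    "s": "string",
    "l": "localizable string",
    "i": "numeric",
    "v": "data stream",
}

def parse_column_type(spec):
    """Split a definition such as 'S72' into (kind, size, nullable).

    Hypothetical helper: a size of 0 is returned as-is and should be
    read as "unbounded".
    """
    match = re.fullmatch(r"([A-Za-z])(\d+)", spec)
    if match is None or match.group(1).lower() not in KINDS:
        raise ValueError(f"unrecognized column definition: {spec!r}")
    letter, size = match.group(1), int(match.group(2))
    # Uppercase letter means the column is nullable.
    return KINDS[letter.lower()], size, letter.isupper()

for spec in ("s72", "S72", "l255"):
    print(spec, parse_column_type(spec))
```

Running it over the three definitions in the example shows that only the second one, S72, comes back as nullable.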

The third line starts with the name of the table (finally, we know this is the "Directory" table). The rest of the strings on that line name the primary key columns of the table. Thus our Directory table has only one primary key column, which happens to be the Directory column. If you are database illiterate like I was as an intern, then know that the primary key columns in a database serve two roles. First, the data in the primary key columns must be unique for every row of a table. Second, primary key columns can be referenced from other columns (the referencing columns are called foreign keys) in different tables or even in the same table. If you stop and think about it, the fact that primary keys are unique makes them perfect identities to be referenced by other columns. I explicitly remember the night that connection clicked in my head.
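Putting the three header lines together, here is a rough Python sketch of reading an IDT file and checking the uniqueness rule for primary keys. It assumes nothing beyond the layout described in this post (three tab-separated header lines followed by tab-separated rows) and ignores any variations real IDT files might have. The names parse_idt, duplicate_keys and SAMPLE are hypothetical.

```python
# The example table from above, with the tabs written out explicitly.
SAMPLE = (
    "Directory\tDirectory_Parent\tDefaultDir\n"
    "s72\tS72\tl255\n"
    "Directory\tDirectory\n"
    "TARGETDIR\t\tSourceDir\n"
)

def parse_idt(text):
    """Parse IDT text into a dict (hypothetical helper, simplified format)."""
    lines = text.splitlines()
    columns = lines[0].split("\t")               # line 1: column names
    types = lines[1].split("\t")                 # line 2: column definitions
    table, *primary_keys = lines[2].split("\t")  # line 3: table name, then keys
    rows = [dict(zip(columns, line.split("\t"))) for line in lines[3:] if line]
    return {
        "table": table,
        "columns": columns,
        "types": types,
        "primary_keys": primary_keys,
        "rows": rows,
    }

def duplicate_keys(parsed):
    """Return primary key values that appear in more than one row."""
    seen, dupes = set(), []
    for row in parsed["rows"]:
        key = tuple(row[col] for col in parsed["primary_keys"])
        if key in seen:
            dupes.append(key)
        seen.add(key)
    return dupes

parsed = parse_idt(SAMPLE)
print(parsed["table"], parsed["primary_keys"], parsed["rows"])
print("duplicates:", duplicate_keys(parsed))
```

With a single row there is nothing to collide with, so the duplicate list is empty. Adding a second TARGETDIR row would make duplicate_keys report it, which is exactly the uniqueness rule in action.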

I thought I was so smart. I obviously had so much to learn.

[to be continued]