Sections, Symbols and References in the Windows Installer XML (WiX) toolset.

Thus far, it seems everyone has been creating a single .wxs source file for their entire MSI or MSM file.  This is understandable, since the "Getting Started" topic in WiX.chm only shows one .wxs file per MSI and MSM.  And if you started learning WiX by decompiling an existing MSI or MSM, dark will only generate a single .wxs source file for it.  But the real power of the WiX toolset only becomes apparent when you break up your setup into different sections and then let the symbols and references tie your source files back into a cohesive package.

I'll start by showing you the WiX source code, then I'll try to explain what it does.  Let's assume we have a file called "product.wxs" that looks like this:

<?xml version='1.0'?>
<Wix xmlns='http://schemas.microsoft.com/wix/2003/01/wi'>
   <Product Id='00000000-0000-0000-0000-000000000000' Name='MyProduct' Language='1033'
            Version='0.0.0.0' Manufacturer='My Corporation'> 
      <Package Description='My Product' Comments='My Product That Is Just An Example'
               InstallerVersion='200' Compressed='yes' /> 

      <Media Id='1' Cabinet='product.cab' EmbedCab='yes' /> 

      <Directory Id='TARGETDIR' Name='SourceDir'> 
         <Directory Id='ProgramFilesFolder' Name='PFiles'> 
            <Directory Id='MyDirectory' Name='MyDir' LongName='My Directory' /> 
         </Directory> 
      </Directory> 

      <Feature Id='MyFeature' Title='My Product Feature' Level='1'> 
         <ComponentRef Id='MyComponent' /> 
      </Feature> 
   </Product> 
</Wix> 

What I've defined above is the skeleton of an MSI product.  At the top are the required <Product/> and <Package/> elements that provide the identification information for this package to the Windows Installer.  Then I provide the <Media/> element that defines how any file Resources that are a part of this package should be laid out.  In this case, I want all the files compressed into a single cabinet and that cabinet stored as a stream inside the MSI file.  Next, I provide my bare bones Directory tree.  Finally, this package is finished off with a very simple Feature tree with one Feature containing one Component.

"Hey, wait!  Where's the Component definition for 'MyComponent'?" you might ask.  Before I can answer that very important question I need to add a couple more example files.  First, let's add another WiX source file called "fragment.wxs" that looks like this:

<?xml version="1.0"?>
<Wix xmlns='http://schemas.microsoft.com/wix/2003/01/wi'>
   <Fragment Id='MyFragment'> 
      <DirectoryRef Id='MyDirectory'> 
         <Component Id='MyComponent' Guid='00000000-0000-0000-0000-000000000000' DiskId='1'>
            <File Id='MyFile' Name='myfile.txt' LongName='My File.txt' src='present.txt'/> 
         </Component> 
      </DirectoryRef> 
   </Fragment> 
</Wix> 

If we skip the <Fragment/> element the rest of the WiX code should look pretty familiar.  I've defined a Component named "MyComponent" (with a bogus GUID) in the "MyDirectory" Directory and noted that any files contained by this Component will be a part of the Media with Disk Id labeled 1.  Then I declare that the Component contains a single text file.  For good measure, let's say that there is a file called "present.txt" that looks a lot like this:

 Each day is a gift.  That's why we call it the present. 

Before (finally) explaining in detail how this all works, let's first prove that it works.  Here is the output from my compilation and linking.

 C:\example>candle.exe product.wxs fragment.wxs 
Microsoft (R) Windows Installer Xml Compiler version 2.0.1621.0 
Copyright (C) Microsoft Corporation 2003. All rights reserved. 
product.wxs 
fragment.wxs 

C:\example>light.exe product.wixobj fragment.wixobj -o product.msi 
Microsoft (R) Windows Installer Xml Linker version 2.0.1621.0 
Copyright (C) Microsoft Corporation 2003. All rights reserved. 


C:\example> 

No output from light means there were no errors, so you should now have a "product.msi" file sitting in the same directory with all your other files.  You can install that MSI and see it show up in your Add/Remove Programs if you like, but trust me, this all works.

"But how did it work?"

Well, when candle compiles your source code it creates an object file (.wixobj) that has zero or more sections in it.  The elements that are children of the <Wix/> element (namely: <Product/>, <Module/>, and <Fragment/>) define a new section.  So in the example above, product.wxs defines one section and fragment.wxs defines another.
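To make the "one section per child of <Wix/>" rule concrete, here is a rough Python sketch that lists the section-defining elements in a source file.  It is only an illustration and not how candle is implemented; the helper names (local_name, list_sections) and the second fragment, "OtherFragment", are made up for the example.

```python
import xml.etree.ElementTree as ET

# Element names that open a new section, per the description above.
SECTION_ELEMENTS = {"Product", "Module", "Fragment"}

def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]

def list_sections(wxs_text):
    """Return (element name, Id) for each section-defining child of <Wix/>."""
    root = ET.fromstring(wxs_text)
    return [(local_name(child.tag), child.get("Id"))
            for child in root
            if local_name(child.tag) in SECTION_ELEMENTS]

# Hypothetical source file with two fragments, so it yields two sections.
FRAGMENT_WXS = """<Wix xmlns='http://schemas.microsoft.com/wix/2003/01/wi'>
   <Fragment Id='MyFragment'>
      <DirectoryRef Id='MyDirectory' />
   </Fragment>
   <Fragment Id='OtherFragment' />
</Wix>"""

sections = list_sections(FRAGMENT_WXS)
print(sections)  # [('Fragment', 'MyFragment'), ('Fragment', 'OtherFragment')]
```

A single .wxs file can therefore contribute more than one section to its object file.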

Sections contain data and references.  Most of the data in the section is information that will end up in the final package (MSI file).  Some of the data is just information needed by the linker or binder to build the package.  For example, the <File/> element shown above contains the necessary information to define a file Resource in the package as well as the "src" attribute that tells the binder where to find the physical file on disk so that the file can be put into a cabinet and inserted into the package.  Finally, the data in the section is used to define all of the symbols for the section.

A symbol is the unique identifier for a WiX element in your .wxs source file.  In general, the symbol for an element maps to the primary key columns of the MSI table the WiX element represents.  For example, the <File/> element's "Id" attribute in WiX maps to the MSI File table's File column, which is the primary key column.  It is pretty safe to assume that all "Id" attributes in the WiX schema represent symbols.  If I were to take a stab at the symbols defined in the example source files above, I think this would be the list:

 product.wxs
Product:00000000-0000-0000-0000-000000000000 
Media:1 
Directory:TARGETDIR 
Directory:ProgramFilesFolder 
Directory:MyDirectory 
Feature:MyFeature 

fragment.wxs
Fragment:MyFragment 
Component:MyComponent
File:MyFile 

Of course, I might be missing one or two, but hopefully you get the idea of what the compiler thinks is a symbol.  If you really want to know for sure, take a look at the tables.xml file for the columns marked "symbol='yes'".
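The "every Id attribute is a symbol" rule of thumb can be sketched in a few lines of Python.  This is an approximation under that assumption only: the real compiler consults tables.xml, and the function name guess_symbols is invented for this example.  Elements whose names end in "Ref" are skipped because they point at symbols rather than define them.

```python
import xml.etree.ElementTree as ET

def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]

def guess_symbols(wxs_text):
    """Approximate the symbol list: one 'Element:Id' entry per defining element."""
    symbols = []
    for element in ET.fromstring(wxs_text).iter():
        name = local_name(element.tag)
        element_id = element.get("Id")
        # No Id means no symbol; *Ref elements are references, not definitions.
        if element_id is None or name.endswith("Ref"):
            continue
        symbols.append(f"{name}:{element_id}")
    return symbols

FRAGMENT_WXS = """<Wix xmlns='http://schemas.microsoft.com/wix/2003/01/wi'>
   <Fragment Id='MyFragment'>
      <DirectoryRef Id='MyDirectory'>
         <Component Id='MyComponent' Guid='00000000-0000-0000-0000-000000000000' DiskId='1'>
            <File Id='MyFile' Name='myfile.txt' LongName='My File.txt' src='present.txt'/>
         </Component>
      </DirectoryRef>
   </Fragment>
</Wix>"""

fragment_symbols = guess_symbols(FRAGMENT_WXS)
print(fragment_symbols)
# ['Fragment:MyFragment', 'Component:MyComponent', 'File:MyFile']
```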

Symbols exist to be referenced.  References, the only thing other than data in a section, point at symbols in the current section or other sections.  The compiler creates references to symbols when necessary and stores the references at the top of the section in the object file.  Obviously elements like <ComponentRef/> or <DirectoryRef/> create references to Components and Directories respectively, but the compiler will create references in other cases as well.  For example, the <Component/> element's "DiskId" attribute creates a reference to a <Media/> element's "Id" attribute.  Since the .wixobj file contains the references, I can easily list them here for you:

 product.wixobj
<reference table="Component" symbol="MyComponent" />

fragment.wixobj
<reference table="Directory" symbol="MyDirectory" /> 
<reference table="Media" symbol="1" /> 
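As a rough illustration of where those references come from, the Python sketch below derives them from the source instead of reading the .wixobj.  It only knows the two cases mentioned above (the *Ref elements and a Component's DiskId), so treat it as a simplification; the name guess_references is made up for the example.

```python
import xml.etree.ElementTree as ET

def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]

def guess_references(wxs_text):
    """Return (table, symbol) pairs for the two reference cases described above."""
    references = []
    for element in ET.fromstring(wxs_text).iter():
        name = local_name(element.tag)
        if name.endswith("Ref") and element.get("Id") is not None:
            # <ComponentRef Id='X'/> points at symbol X in the Component table.
            references.append((name[:-len("Ref")], element.get("Id")))
        if name == "Component" and element.get("DiskId") is not None:
            # DiskId points at the Id of a <Media/> element.
            references.append(("Media", element.get("DiskId")))
    return references

FRAGMENT_WXS = """<Wix xmlns='http://schemas.microsoft.com/wix/2003/01/wi'>
   <Fragment Id='MyFragment'>
      <DirectoryRef Id='MyDirectory'>
         <Component Id='MyComponent' Guid='00000000-0000-0000-0000-000000000000' DiskId='1'>
            <File Id='MyFile' Name='myfile.txt' LongName='My File.txt' src='present.txt'/>
         </Component>
      </DirectoryRef>
   </Fragment>
</Wix>"""

fragment_references = guess_references(FRAGMENT_WXS)
print(fragment_references)  # [('Directory', 'MyDirectory'), ('Media', '1')]
```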

Note: I have purposely skipped over the complex reference discussion here, but I'll come back to that in some future blog entry.

Thus far, I've only talked about the compiler.  Now that we know the basics behind sections, symbols and references we can talk about the details of the linker.  This is where the real power of the WiX toolset kicks in.  I also believe the linker differentiates the WiX toolset from the other tools I have seen and/or heard of that can build MSI files today.

The linker starts by processing all of the sections in the provided object files looking for an entry section.  Today there are two types of entry sections: products and modules.  As you would expect, when the linker encounters a product entry section it knows it is generating an MSI.  If the linker encounters a module entry section, it knows it is creating an MSM file.  If the linker comes across two entry sections in the object files, it gives up with an error since the linker cannot generate two outputs at the same time.  Consider the entry section to be like the "main()" function in a C or C++ program.  That's where the linker starts the program's execution.

While the entry section is being located, the linker is also building up the table of symbols from every section in the provided object files.  If any symbols are found to be duplicated, the linker will give up with an error.  In the C/C++ linker, this error condition is very similar to the case where you define the same variable twice in the same scope.  Once all of the sections have been processed and a single entry section is found, the linker starts resolving references starting at the entry section.
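This first pass can be modeled in a few lines of Python.  The sketch is a toy model of the behavior just described and not light's actual code: the Section class, LinkError, and index_sections are names invented here, and treating "no entry section at all" as an error is my assumption.

```python
from dataclasses import dataclass, field

class LinkError(Exception):
    """Raised when the toy linker has to give up."""

@dataclass
class Section:
    name: str
    kind: str  # "product", "module" or "fragment"
    symbols: set = field(default_factory=set)
    references: set = field(default_factory=set)

def index_sections(sections):
    """Find the single entry section and build the symbol table in one pass."""
    entry = None
    symbol_table = {}
    for section in sections:
        if section.kind in ("product", "module"):
            if entry is not None:
                raise LinkError(f"two entry sections: {entry.name}, {section.name}")
            entry = section
        for symbol in section.symbols:
            if symbol in symbol_table:
                raise LinkError(f"duplicate symbol: {symbol}")
            symbol_table[symbol] = section
    if entry is None:
        # Assumption: a link without any product or module is also an error.
        raise LinkError("no entry section found")
    return entry, symbol_table

product = Section("product.wxs", "product",
                  {("Directory", "MyDirectory"), ("Media", "1"), ("Feature", "MyFeature")},
                  {("Component", "MyComponent")})
fragment = Section("fragment.wxs", "fragment",
                   {("Component", "MyComponent"), ("File", "MyFile")},
                   {("Directory", "MyDirectory"), ("Media", "1")})

entry, table = index_sections([product, fragment])
print(entry.name)                                 # product.wxs
print(table[("Component", "MyComponent")].name)   # fragment.wxs
```

Passing the same fragment twice, or two products, raises LinkError in this model, mirroring the two error cases above.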

When the current section has a reference that resolves to a symbol in another section, that other section's references are added to the list to be resolved.  The process continues until all references are resolved.  If a reference cannot be resolved, it causes the linker to bail with an error.  This error case is similar to the one where the C/C++ linker cannot find a matching function implementation for one of your calls.  Also, any sections that are not referenced are ignored.

It is important to note that sections are the atomic unit of linking.  In other words, either all of the information in a section is included in your final output or none of it is included.  This fact is important to keep in mind when splitting your source code into Fragments.  You only need one symbol in a Fragment to be referenced and the entire contents of the Fragment will be a part of your final output.
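The resolution pass and the "all or nothing" nature of sections can be sketched as a simple worklist in Python.  Again, this is a toy model under my own assumptions and not the real linker: the dictionary layout, the resolve function, and the extra "unused.wxs" section are invented for illustration, and duplicate-symbol checking is left out to keep it short.

```python
from collections import deque

class LinkError(Exception):
    """Raised when the toy linker has to give up."""

def resolve(entry_name, sections):
    """Return the names of the sections that end up in the output.

    sections maps a section name to {"symbols": set, "references": set}.
    """
    # Map every symbol to the section that defines it (no duplicate check here).
    symbol_table = {symbol: name
                    for name, section in sections.items()
                    for symbol in section["symbols"]}
    included = [entry_name]
    pending = deque(sections[entry_name]["references"])
    while pending:
        reference = pending.popleft()
        if reference not in symbol_table:
            raise LinkError(f"unresolved reference: {reference}")
        owner = symbol_table[reference]
        if owner not in included:
            # One referenced symbol pulls in the whole section,
            # and that section's references join the worklist.
            included.append(owner)
            pending.extend(sections[owner]["references"])
    return included

sections = {
    "product.wxs": {
        "symbols": {"Directory:MyDirectory", "Media:1", "Feature:MyFeature"},
        "references": {"Component:MyComponent"},
    },
    "fragment.wxs": {
        "symbols": {"Component:MyComponent", "File:MyFile"},
        "references": {"Directory:MyDirectory", "Media:1"},
    },
    # Hypothetical fragment that nothing references; it is left out of the output.
    "unused.wxs": {"symbols": {"Component:Unused"}, "references": set()},
}

linked = resolve("product.wxs", sections)
print(linked)  # ['product.wxs', 'fragment.wxs']
```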

Before wrapping up this blog entry, let's step through the example we've used so far.  Remember, up above, we provided light the fragment.wixobj and the product.wixobj object files to link.  The linker would load all of the symbols in those two object files (getting a list much like I described above) and figure out that the section created by the <Product/> element is our entry section.

The linker would then take the only reference in that section (as shown above) and start looking for the symbol "MyComponent" in the Component table.  Of course, that reference resolves into our fragment.wixobj.  Then the two references from the fragment.wixobj would be resolved.  Remember, references from each section must be resolved.  In this case, the "MyDirectory" in the Directory table and "1" in the Media table references are resolved by symbols from the entry section.  The linker now happily goes along its merry way finishing the linking process using those two sections to build the final MSI file.

Hopefully this blog entry helps explain some of the inner workings of the WiX toolset so that you can take better advantage of the tools.  This write-up (or something like it) will be making its way into the WiX documentation, so I would appreciate any feedback that helps make sections, symbols, and references in the Windows Installer XML toolset make sense.