This blog post uses the sample Heap-on-Node (HN) from section 3.8 of MS-PST to walk through the process of reading a property from it. The current version of the MS-PST open specification document can be found here: http://msdn.microsoft.com/en-us/library/ff385210(office.12).aspx
First, it's important to understand that there are several layers and structures involved here. A basic understanding of the NDB and LTP layers is extremely helpful, but not required.
The construction of the Heap-on-Node can be summarized as follows:
- The Block is a Heap-on-Node.
- The BTree-on-Heap is built on top of the Heap-on-Node.
- The Property Context is built on top of the BTree-on-Heap.
First, we will take a look at the HNHDR which is the first 12 bytes of the block.
Figure 1: The HNHDR.
For now, we really only care about 2 of the fields here. The first 2 bytes hold ibHnpm, which has a value of 0x00EC. This is the offset from the beginning of the block to the beginning of the HNPAGEMAP. The 4th byte is bClientSig, which has a value of 0xBC. This tells us that this block contains a Property Context, which is built on top of a BTree-on-Heap. This will be important later.
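To make the layout concrete, here is a minimal Python sketch of reading those two fields. Only ibHnpm (0x00EC) and bClientSig (0xBC) come from the walkthrough above; the bSig byte uses the 0xEC heap signature from MS-PST, and the last 8 bytes of the sample header are placeholder values assumed for illustration. The function name parse_hnhdr is hypothetical.

```python
import struct

# 12-byte sample header. ibHnpm and bClientSig match the values discussed
# above; the trailing 8 bytes (hidUserRoot, rgbFillLevel) are ASSUMED
# placeholders, not values taken from the text.
SAMPLE_HNHDR = bytes.fromhex("EC00" "EC" "BC" "20000000" "00000000")

def parse_hnhdr(block: bytes) -> dict:
    """Unpack the 12-byte HNHDR found at the start of a heap block."""
    ib_hnpm, b_sig, b_client_sig, hid_user_root, fill_level = struct.unpack_from(
        "<HBBI4s", block, 0
    )
    return {
        "ibHnpm": ib_hnpm,            # offset of the HNPAGEMAP in the block
        "bSig": b_sig,                # heap signature
        "bClientSig": b_client_sig,   # 0xBC means Property Context
        "hidUserRoot": hid_user_root,
        "rgbFillLevel": fill_level,
    }

hdr = parse_hnhdr(SAMPLE_HNHDR)
print(hex(hdr["ibHnpm"]), hex(hdr["bClientSig"]))
```

All multi-byte values are read as little-endian, which is why the bytes EC 00 yield 0x00EC.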
The HNPAGEMAP sits at the offset in the block specified by ibHnpm. Its length is not known until you read its first field, cAlloc, which in this case is 0x0008. This tells us that the rgibAlloc table will contain 9 (8 + 1) entries. Each entry is a 2-byte WORD value, so the rgibAlloc table is 18 (9 * 2) bytes long. Add 2 bytes each for cAlloc and cFree and you get the total length of the HNPAGEMAP, which is 22 bytes. Reading all 22 bytes, we have the following.
Figure 2: The HNPAGEMAP.
Starting at the 5th byte of the HNPAGEMAP, read the next 18 bytes as a series of 2-byte values to get the starting offset of each allocation in the block. The 9th offset is a placeholder that tells an application where the next allocation should start; it is not an allocation that contains data.
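The HNPAGEMAP read can be sketched in Python as follows. This is only an illustration: parse_hnpagemap is a hypothetical helper, the block is a synthetic buffer with nothing but a page map placed at 0x00EC, and the cFree value of 0 and the offsets after the fifth entry are assumed values rather than ones stated in this walkthrough.

```python
import struct

def parse_hnpagemap(block: bytes, ib_hnpm: int) -> dict:
    """Read cAlloc, cFree and the rgibAlloc table starting at ib_hnpm."""
    c_alloc, c_free = struct.unpack_from("<HH", block, ib_hnpm)
    # rgibAlloc holds cAlloc + 1 WORD entries; the last one is the
    # placeholder marking where the next allocation would start.
    rgib_alloc = struct.unpack_from(f"<{c_alloc + 1}H", block, ib_hnpm + 4)
    return {
        "cAlloc": c_alloc,
        "cFree": c_free,
        "rgibAlloc": list(rgib_alloc),
        "size": 4 + 2 * (c_alloc + 1),  # cAlloc + cFree + table
    }

# Synthetic data: entries 6 to 9 and cFree are ASSUMED for illustration.
offsets = [0x000C, 0x0014, 0x006C, 0x007C, 0x008C,
           0x00A4, 0x00BC, 0x00D4, 0x00EC]
block = bytes(0xEC) + struct.pack("<HH9H", 8, 0, *offsets)

page_map = parse_hnpagemap(block, 0x00EC)
print(page_map["size"], [hex(o) for o in page_map["rgibAlloc"]])
```

The size field reproduces the arithmetic above: 2 + 2 + (8 + 1) * 2 = 22 bytes.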
The first allocation starting at offset 0x000C will be the BTHHEADER which is 8 bytes in length.
Figure 3: The BTHHEADER.
We care about 3 values from the BTHHEADER: cbKey, cbEnt, and hidRoot.
It's VERY important to understand that the hidRoot is a HID structure that should be read as 2 separate 2-byte values and NOT as a single 4-byte value.
The combination of the cbKey and cbEnt tells us how long each PC BTH Record will be. Remember back when we looked at the HNHDR and the bClientSig value was 0xBC (bTypePC)? That's how we know that the records will be PC BTH Records. In this case, each record will be 8 bytes in length.
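As a rough sketch, the BTHHEADER could be unpacked like this in Python. The sample bytes are reconstructed rather than copied from the figure: the hidRoot bytes follow the values discussed in this post, cbKey = 2 and cbEnt = 6 are the sizes a Property Context uses (2 + 6 = 8 bytes per record), 0xB5 is the BTH type signature from MS-PST, and bIdxLevels = 0 is an assumption. parse_bthheader is a hypothetical name.

```python
import struct

# Reconstructed 8-byte BTHHEADER (see the assumptions in the lead-in).
SAMPLE_BTHHEADER = bytes.fromhex("B5 02 06 00 40 00 00 00")

def parse_bthheader(data: bytes) -> dict:
    """Unpack an 8-byte BTHHEADER, keeping hidRoot as two 2-byte values."""
    b_type, cb_key, cb_ent, b_idx_levels = struct.unpack_from("<BBBB", data, 0)
    # hidRoot is read as two separate WORDs, NOT as a single DWORD.
    hid_index_raw, hid_block_index = struct.unpack_from("<HH", data, 4)
    return {
        "bType": b_type,
        "cbKey": cb_key,
        "cbEnt": cb_ent,
        "bIdxLevels": b_idx_levels,
        "hidRoot": (hid_index_raw, hid_block_index),
        "record_size": cb_key + cb_ent,  # length of each BTH record
    }

bth = parse_bthheader(SAMPLE_BTHHEADER)
print(bth["record_size"], [hex(v) for v in bth["hidRoot"]])
```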
Locating The hidRoot
The hidRoot is worth spending a little more time on. The current version of the MS-PST document (v2.1, 2/10/2014) is missing some vital information for working with HID structures. What it describes was valid for older versions of Outlook, but the current version of Outlook handles this structure differently. The exact version of Outlook that first handled it this way is not known.
Looking at the hidRoot we know that the hidBlockIndex is 0x0000 which means that the heap item is contained in the same block we are currently looking at. The hidIndex has a value of 0x0040 which seems strange because it's supposed to be the 1-based index to the heap item where we can look to start reading some PC BTH Records. Since this BTree-on-Heap node only contains 8 allocations, it couldn’t possibly be the 64th allocation.
In order to get the correct allocation index we need to bit shift the hidIndex 5 places to the right. This will result in a value of 0x0002. Therefore, the PC BTH Records begin at the second allocation. This information should be included in a future release of the documentation and is vital in obtaining the correct hidIndex value.
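The shift described above is easy to get wrong, so here is a small Python sketch of it. The Hid tuple and decode_hid are hypothetical names. Exposing the low 5 bits as a separate type field is an assumption based on how MS-PST lays out a HID; the text above only requires the shift itself.

```python
from typing import NamedTuple

class Hid(NamedTuple):
    hid_type: int         # low 5 bits of the first WORD (assumed layout)
    hid_index: int        # 1-based allocation index after the shift
    hid_block_index: int  # 0 means "this block"

def decode_hid(raw: bytes) -> Hid:
    """Decode a 4-byte HID read as two little-endian 2-byte values."""
    low = int.from_bytes(raw[0:2], "little")
    block_index = int.from_bytes(raw[2:4], "little")
    return Hid(hid_type=low & 0x1F,
               hid_index=low >> 5,  # shift 5 places to the right
               hid_block_index=block_index)

root = decode_hid(bytes.fromhex("40 00 00 00"))
print(root)
```

For the hidRoot bytes 40 00 00 00, the raw value 0x0040 shifted right by 5 gives an allocation index of 2 in block 0.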
Reading The PC BTH Records
We now have enough information to read the collection of PC BTH Records from the second allocation. First, to determine the length of the 2nd allocation, we subtract the offset of the 2nd allocation from the offset of the 3rd: 0x6C - 0x14 = 0x58 (88 bytes).
Figure 4: The PC BTH Records.
We also know that each PC BTH Record is 8 bytes. Dividing 88 bytes by 8 bytes each gives us 11 records.
- 34 0E 02 01 A0 00 00 00
- 38 0E 03 00 00 00 00 00
- F9 0F 02 01 60 00 00 00
- 01 30 1F 00 80 00 00 00
- DF 35 03 00 89 00 00 00
- E0 35 02 01 C0 00 00 00
- E3 35 02 01 00 01 00 00
- E7 35 02 01 E0 00 00 00
- 33 66 0B 00 01 00 00 00
- FA 66 03 00 0D 00 0E 00
- FF 67 03 00 00 00 00 00
Each PC BTH Record starts with a 2-byte wPropId and a 2-byte wPropType. The list of possible wPropType values can be found in MS-OXCDATA section 2.11.1 and MS-OXPROPS contains the Master Property List which is probably the best place to look for Property ID values. However, you will also find Property ID values scattered throughout other documents where those properties are used.
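Splitting the 88 bytes listed above into records can be sketched in Python like this. The hex bytes are the 11 records from the list; parse_pc_bth_records is a hypothetical helper, and treating the last 4 bytes of each record as one little-endian value is a simplification for the sketch.

```python
import struct

RECORDS_HEX = """
34 0E 02 01 A0 00 00 00
38 0E 03 00 00 00 00 00
F9 0F 02 01 60 00 00 00
01 30 1F 00 80 00 00 00
DF 35 03 00 89 00 00 00
E0 35 02 01 C0 00 00 00
E3 35 02 01 00 01 00 00
E7 35 02 01 E0 00 00 00
33 66 0B 00 01 00 00 00
FA 66 03 00 0D 00 0E 00
FF 67 03 00 00 00 00 00
"""

def parse_pc_bth_records(data: bytes, record_size: int = 8) -> list:
    """Split an allocation into (wPropId, wPropType, dwValueHnid) tuples."""
    records = []
    for off in range(0, len(data) - record_size + 1, record_size):
        records.append(struct.unpack_from("<HHI", data, off))
    return records

allocation = bytes.fromhex(" ".join(RECORDS_HEX.split()))
records = parse_pc_bth_records(allocation)
for prop_id, prop_type, value in records:
    print(f"0x{prop_id:04X} 0x{prop_type:04X} 0x{value:08X}")
```

Because the values are little-endian, the bytes 01 30 1F 00 in the 4th record come out as a property ID of 0x3001 and a property type of 0x001F.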
Getting The Property
We will use the 4th PC BTH Record for demonstration purposes because we can easily look up what the Property ID and type are. The wPropId is 0x3001 and the wPropType is 0x001F. Searching in MS-OXPROPS, we find that this ID belongs to the PidTagDisplayName property and that the type is PtypString. According to MS-OXCDATA section 2.11.1, PtypString is "Variable size; a string of Unicode characters in UTF-16LE format encoding with terminating null character (0x0000)."
Now that we know what the property and its type are, we can go retrieve it. The 3rd field of the PC BTH Record is the dwValueHnid, which can represent a few different things. Refer to the PC BTH Record section of MS-PST for how to determine the type of data stored in it. In this case, it's a HID structure. Reading the hidIndex value of 0x0080 and bit shifting it 5 places to the right (like we did for the hidRoot earlier), we get an index value of 0x0004. That means that the string value that belongs to PidTagDisplayName is stored in the 4th allocation of this block. Looking back at the list of offsets from the HNPAGEMAP, we can see that the 4th allocation starts at offset 0x007C and is 0x10 (16 bytes) in length.
Figure 5: PidTagDisplayName.
This represents the string "UNICODE1". Unfortunately, this block does not contain any other meaningful properties that we can look up in MS-OXPROPS, but if there were you would follow the same process.
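Putting the last step together, a minimal Python sketch of the lookup might look like the following. The block here is a synthetic stand-in that contains only the 16 string bytes at offset 0x007C, not the real sample block, and the function names are hypothetical. The sketch assumes dwValueHnid is a HID that points into the current block (hidBlockIndex of 0).

```python
def allocation_bounds(rgib_alloc: list, hid_index: int) -> tuple:
    """Return (start, end) offsets for a 1-based allocation index."""
    return rgib_alloc[hid_index - 1], rgib_alloc[hid_index]

def read_string_property(block: bytes, rgib_alloc: list, dw_value_hnid: int) -> str:
    """Follow a HID in dwValueHnid and decode the allocation as UTF-16LE."""
    hid_index = (dw_value_hnid & 0xFFFF) >> 5  # same 5-bit shift as hidRoot
    start, end = allocation_bounds(rgib_alloc, hid_index)
    return block[start:end].decode("utf-16-le")

# First five allocation offsets, as derived in the walkthrough above.
rgib_alloc = [0x000C, 0x0014, 0x006C, 0x007C, 0x008C]

# Synthetic block: zero-filled except for the string in the 4th allocation.
block = bytearray(0x8C)
block[0x7C:0x8C] = "UNICODE1".encode("utf-16-le")

display_name = read_string_property(bytes(block), rgib_alloc, 0x00000080)
print(display_name)
```

The same allocation_bounds helper reproduces the earlier size calculation: index 2 spans 0x14 to 0x6C, which is 0x58 bytes.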
Figure 6: Complete Heap-On-Node Property Context Block.