Parsing the POP3 UIDL History

In my previous post, we discussed how to locate Outlook’s POP3 UIDL history. Now that we have the blob, let’s look at how to parse it:

POPBlob Structure:

  • Version (2 bytes): Must be PBLOB_VERSION_NUM (3)
  • Count (2 bytes): Count of resource tags
  • Resource tags (variable): Zero or more null-terminated UTF-8 strings, each encoding one resource tag. The number of null-terminated strings must match Count.
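
As a rough illustration of the layout above, here is a minimal Python sketch that splits a blob into its resource tag strings. The function name parse_pop_blob is my own, and the little-endian byte order is an assumption based on the sample bytes later in this post (03 00 for version 3); treat this as a sketch, not as Outlook's actual implementation.

```python
import struct

PBLOB_VERSION_NUM = 3  # value given in the structure description


def parse_pop_blob(blob: bytes) -> list[str]:
    """Split a POPBlob into its resource tag strings (hypothetical helper)."""
    if len(blob) < 4:
        raise ValueError("blob is too short for the 4-byte header")
    # Assumption: both 2-byte fields are little-endian, as the sample suggests.
    version, count = struct.unpack_from("<HH", blob, 0)
    if version != PBLOB_VERSION_NUM:
        raise ValueError(f"unexpected version {version}")
    body = blob[4:]
    if body and not body.endswith(b"\x00"):
        raise ValueError("last resource tag is not null terminated")
    # Splitting on the terminator leaves one empty trailing piece; drop it.
    raw_tags = body.split(b"\x00")[:-1] if body else []
    if len(raw_tags) != count:
        raise ValueError(f"Count is {count} but found {len(raw_tags)} tags")
    return [raw.decode("utf-8") for raw in raw_tags]
```

Checking the tag count against Count up front makes a truncated or corrupted blob fail loudly instead of quietly returning a partial list.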

Resource Tags:

A resource tag encodes a UID with some metadata. The format of a resource tag string is represented as follows:

Mcyyyymmddhhmmssuuuuuuuuuuuuuuuuuu...

where

  • M (1 char): '+', '-', or '&', indicating a successful get, delete, or get-and-delete, respectively
  • c (1 char): ' ', 'h', or 'b', indicating content of none, header, or body, respectively
  • yyyy (4 chars): Four digit year of download
  • mm (2 chars): Two digit month of download
  • dd (2 chars): Two digit day of download
  • hh (2 chars): Two digit hour of download
  • mm (2 chars): Two digit minute of download
  • ss (2 chars): Two digit second of download
  • uuuu... (variable): Encoded uid (Unique Identifier) of a message
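
The field list above maps naturally onto fixed string offsets. Here is a small Python sketch that pulls a tag apart. The names ResourceTag and parse_resource_tag are hypothetical, and since the format does not say whether the timestamp is local time or UTC, the sketch returns a naive datetime and leaves the UID in its encoded form.

```python
from dataclasses import dataclass
from datetime import datetime

# Flag meanings taken from the field list above.
METHODS = {"+": "get", "-": "delete", "&": "get-and-delete"}
CONTENTS = {" ": "none", "h": "header", "b": "body"}


@dataclass
class ResourceTag:
    method: str
    content: str
    downloaded: datetime  # naive: the time zone is not stated in the format
    encoded_uid: str      # still escaped with '$' sequences


def parse_resource_tag(tag: str) -> ResourceTag:
    """Split one resource tag string into its fields (hypothetical helper)."""
    if len(tag) < 16:
        raise ValueError("tag is shorter than the 16-character fixed prefix")
    if tag[0] not in METHODS or tag[1] not in CONTENTS:
        raise ValueError(f"unrecognized method/content flags: {tag[:2]!r}")
    # Characters 2..15 hold yyyymmddhhmmss; strptime rejects malformed dates.
    downloaded = datetime.strptime(tag[2:16], "%Y%m%d%H%M%S")
    return ResourceTag(METHODS[tag[0]], CONTENTS[tag[1]], downloaded, tag[16:])
```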

The encoded uid of the message has been escaped so that only alphanumeric characters and the character '$' are present. Each non-alphanumeric character in the original UID is represented as '$' followed by its two-digit hex encoding. For instance, the character '-' (0x2D) is encoded in the UID as $2d.
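
Reversing the escaping is a one-line substitution. The sketch below uses a hypothetical helper named decode_uid and assumes every escape is exactly '$' plus two hex digits standing for a single character code; the description above does not say how characters outside that range would be written, so they are not handled here.

```python
import re

# '$' followed by exactly two hex digits, e.g. "$2d" for '-'.
_ESCAPE = re.compile(r"\$([0-9A-Fa-f]{2})")


def decode_uid(encoded: str) -> str:
    """Undo the '$' + two-hex-digit escaping of a UID (hypothetical helper)."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), encoded)
```

For instance, decode_uid("A$2dB") gives "A-B", while strings without any '$' pass through unchanged.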

For example, this blob:

 030017002B623230313230393036313331313338304243353335444224326445413633243264313145312432644137354324326430303231354144374242373400 
2B623230313230393036313331313337313433444242434224326445413636243264313145312432644135463224326430303232363443313534424100 

2B623230313230393036313331313339323445383833333324326445413334243264313145312432644133414324326430303231354144374631353600 

2B623230313230393036313331313336333446324533383124326445423236243264313145312432644244353924326430303231354144383043324300 

2B623230313230393036313331313339333733443545363924326445413236243264313145312432644231363024326430303231354144393946303000 

... 

Can be interpreted as follows:

 0300 Version: PBLOB_VERSION_NUM (0x0003, stored little-endian) 
1700 Count: Count of resource tags (0x0017 = 23, stored little-endian) 

2B623230313230393036313331313338304243353335444224326445413633243264313145312432644137354324326430303231354144374242373400 Tag 1 

2B623230313230393036313331313337313433444242434224326445413636243264313145312432644135463224326430303232363443313534424100 Tag 2 

2B623230313230393036313331313339323445383833333324326445413334243264313145312432644133414324326430303231354144374631353600 Tag 3 

2B623230313230393036313331313336333446324533383124326445423236243264313145312432644244353924326430303231354144383043324300 Tag 4 

2B623230313230393036313331313339333733443545363924326445413236243264313145312432644231363024326430303231354144393946303000 Tag 5 

...

And we can parse one of the resource tags as follows:

 2B623230313230393036313331313338304243353335444224326445413633243264313145312432644137354324326430303231354144374242373400 = 
"+b201209061311380BC535DB$2dEA63$2d11E1$2dA75C$2d00215AD7BB74" 

+ = Successful get 

b = Content is body 

2012 = Year 

09 = Month 

06 = Day 

13 = Hour 

11 = Minute 

38 = Second 

0BC535DB$2dEA63$2d11E1$2dA75C$2d00215AD7BB74 = UID

This UID can then be interpreted as:

 0BC535DB + $2d + EA63 + $2d + 11E1 + $2d + A75C + $2d + 00215AD7BB74 = 
0BC535DB + '-' + EA63 + '-' + 11E1 + '-' + A75C + '-' + 00215AD7BB74 = 

0BC535DB-EA63-11E1-A75C-00215AD7BB74

So the first UID encoded in this blob is "0BC535DB-EA63-11E1-A75C-00215AD7BB74", and the body was successfully retrieved on September 6, 2012 (2012-09-06) at 13:11:38.
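
If you want to double check the walkthrough, the first line of the sample dump (the 4-byte header plus Tag 1) can be decoded in a few lines of Python. This is only a sanity check under the same assumptions as the walkthrough, namely little-endian header fields and '$' + two hex digit escapes; the variable names are mine.

```python
import re

# Header and Tag 1 from the sample dump, split into readable pieces.
dump = bytes.fromhex(
    "03001700"                          # version, count
    "2B623230313230393036313331313338"  # "+b20120906131138"
    "3042433533354442"                  # "0BC535DB"
    "243264" "45413633"                 # "$2d" "EA63"
    "243264" "31314531"                 # "$2d" "11E1"
    "243264" "41373543"                 # "$2d" "A75C"
    "243264" "303032313541443742423734" # "$2d" "00215AD7BB74"
    "00"                                # null terminator
)

version = int.from_bytes(dump[0:2], "little")
count = int.from_bytes(dump[2:4], "little")

# Tag 1 runs from offset 4 up to the first null byte after the header.
end = dump.index(b"\x00", 4)
tag = dump[4:end].decode("utf-8")

# Everything past the 16-character prefix is the escaped UID.
uid = re.sub(r"\$([0-9A-Fa-f]{2})", lambda m: chr(int(m.group(1), 16)), tag[16:])

print(version, count)
print(tag)
print(uid)
```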