PST Files In Multiple Profiles

A scenario recently came up that a couple of customers have hit with Outlook’s version of MAPI. These customers are in the business of processing PST files found on a user’s machine. They may be loading them to scan the messages for viruses, or to ensure they’re backed up. Since opening a PST requires MAPI, they would create a MAPI profile to do their work, add the PST to the profile, read the data they needed, then log off the profile and delete it. These scans would happen throughout the day without user interaction. The interesting scenario occurs when the user happens to be running Outlook with the same PST already loaded in Outlook’s profile. With Outlook 2003, we didn’t get any reports of a problem with this. But with Outlook 2007, both of these customers started seeing failures to open these PST files. Most of the time there was no problem, but every once in a while they couldn’t open the file.

Investigation

Let’s look at what happens when you open a PST file using MAPI. First, MAPI itself doesn’t deal with the PST. It’s the PST provider, mspst32.dll, that opens the file. The first thing the provider tries to do is open the file with exclusive write access, allowing others to read but not write. If this fails, it assumes another instance of the PST provider has already opened the file, so it requests read access instead. If both processes accessing the PST are running as the same user, in the same session, the two provider instances are able to coordinate access to the file. This is how it worked in Outlook 2003, and usually there weren’t any problems sharing access, even if the different MAPI sessions were using different profiles.

In Outlook 2007, as part of some optimizations around how we read and write to the PST, we implemented a cache. Access to this cache is controlled by a number of shared objects (a memory mapped file, some events, etc.). These shared objects derived their name from the path of the PST file. And here is where the problem came in.

Investigation using Process Explorer showed that when the problem happened, Outlook’s MAPI session and the customer’s MAPI session were referencing the same file, using the same path, but the case was different. For instance, in one process, the file handle might point to:

C:\TestFiles\MyPST.pst

where in the other process, the file handle points to:

C:\testfiles\mypst.pst

Note that some characters are uppercase in one path and lowercase in the other. For access to the file, this difference in case doesn’t matter, so everything worked in Outlook 2003. And in Outlook 2007, again, for access to the PST file itself, the difference in case wasn’t a problem. However, when we build the names for the shared memory objects, case does matter. For instance, memory mapped files are created using the function CreateFileMapping. Although it’s not specifically documented as such, this function treats object names as case sensitive. The names

C__TestFiles_MyPST_pst_WCINFO

and

C__testfiles_mypst_pst_WCINFO

when used in the lpName parameter of CreateFileMapping will point to two different objects. So our mechanism for synchronizing access to the cache fails, and the second process to try to access the PST ends up returning an error, usually MAPI_E_FAILONEPROVIDER.
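To make the mismatch concrete, here is a minimal sketch of how names like the two above could be derived from a path. The helper names and the substitution rule (replace ':', '\' and '.' with '_', then append "_WCINFO") are assumptions inferred from the two example names only; the provider's actual algorithm isn't documented here.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: builds a shared-object name from a PST path by
// replacing ':', '\\' and '.' with '_' and appending "_WCINFO". This rule is
// only inferred from the example names in the post, not taken from the provider.
std::string SharedObjectName(const std::string& pstPath) {
    std::string name = pstPath;
    for (char& ch : name) {
        if (ch == ':' || ch == '\\' || ch == '.') {
            ch = '_';
        }
    }
    return name + "_WCINFO";
}

// True only when both paths map to the same name. The comparison is an exact,
// case-sensitive string match, mirroring how the named objects are matched.
bool SameSharedObject(const std::string& pathA, const std::string& pathB) {
    return SharedObjectName(pathA) == SharedObjectName(pathB);
}

// Usage sketch: two spellings of one file end up with two different names.
void DemonstrateCaseMismatch() {
    assert(SharedObjectName("C:\\TestFiles\\MyPST.pst") ==
           "C__TestFiles_MyPST_pst_WCINFO");
    assert(!SameSharedObject("C:\\TestFiles\\MyPST.pst",
                             "C:\\testfiles\\mypst.pst"));
}
```

Nothing in the derivation looks at the file itself, so any difference in how the path is spelled carries straight through to the object name.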

I raised this as a bug with development, with the suggested fix that we just lowercase (or uppercase) the paths before building the shared object names. However, in the course of trying to fix it, we realized the problem is actually much bigger than the case of the path. For instance, if one profile uses the path:

C:\TestFiles\MyPST.pst

and the other uses

C:\testfi~1\mypst.pst

both are still accessing the same file. However, this scenario wouldn’t be fixed by simply lowercasing the file name. Symbolic links, drive mappings, and the like would be just as problematic. The real fix is to not depend on the path name as part of the synchronization, and instead use some internal characteristic of the PST file itself. This fix is in the works, but it’s too big to get into a hotfix. The next version of Outlook should handle all of these scenarios much better.
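A small sketch shows why lowercasing alone falls short. The function names are hypothetical, and the lowercasing is a plain per-character std::tolower, which may not match whatever normalization a real fix would use.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical normalization: lowercase every character using the current
// C locale (ASCII letters only in the default "C" locale).
std::string LowercasePath(const std::string& path) {
    std::string lowered = path;
    for (char& ch : lowered) {
        ch = static_cast<char>(std::tolower(static_cast<unsigned char>(ch)));
    }
    return lowered;
}

// True when two paths are identical once both are lowercased.
bool MatchAfterLowercasing(const std::string& pathA, const std::string& pathB) {
    return LowercasePath(pathA) == LowercasePath(pathB);
}

// Usage sketch: the case-only pair now agrees, the short-name pair does not.
void DemonstrateLowercasingLimit() {
    assert(MatchAfterLowercasing("C:\\TestFiles\\MyPST.pst",
                                 "C:\\testfiles\\mypst.pst"));
    assert(!MatchAfterLowercasing("C:\\TestFiles\\MyPST.pst",
                                  "C:\\testfi~1\\mypst.pst"));
}
```

The second pair still differs after lowercasing, so any names built from those strings would differ too.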

Workaround

As it turns out, the workaround for this is fairly straightforward. It’s based on this fact: As long as all the profiles accessing the PST use the same path, with the same case, then the problem can’t happen. All the shared memory objects will use the same names and there won’t be any problems with synchronization. So, if you have an application that routinely adds PSTs to your own profile, and wish to avoid this problem, all you have to do is check if there are any other profiles, and if they’re using the same PST. If they are, use the same path they’re using, and then you can’t conflict with them! In practice, scanning the profiles looks like this:

  1. Use MAPIAdminProfiles to get an IProfAdmin object and IProfAdmin::GetProfileTable to get a table of profiles.
  2. For each profile in the list, use IProfAdmin::AdminServices to get an IMsgServiceAdmin object and then IMsgServiceAdmin::GetMsgServiceTable to get a list of services in the profile.
  3. For each service where PR_SERVICE_NAME is “MSPST MS” or “MSUPST MS”, use IMsgServiceAdmin::AdminProviders with the value from PR_SERVICE_UID to open an IProviderAdmin object, then use IProviderAdmin::GetProviderTable to get a list of providers.
  4. For each provider in the table, use IProviderAdmin::OpenProfileSection to open the profile section, from which you can read PR_PST_PATH.
  5. Add this path to your list of “known PSTs”.

This process can be repeated each time you have a new PST to manipulate, or just periodically. Once you’ve got your list of known PSTs, you just need to compare your PST to the ones in the list. As long as you ensure the path you use is the same as the path in the other profiles, you eliminate the possibility of this problem happening.

This comparison can be simple or complex, depending on what scenarios you want to cover. Given that the most common scenario you’re going to hit here is that the paths are the same except for casing, you could just do a case-insensitive compare and cover most cases. If you want to cover more cases, you could use GetFullPathName on the paths before you do your comparison. And if you want to cover even more cases, you can use GetFileInformationByHandle to get the volume serial number and file index for each path and compare those.
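Here is one way the selection step might look, as a sketch under a few assumptions. It assumes the list of known PST paths has already been collected from the other profiles. ChoosePstPath and EqualsIgnoreCase are hypothetical names. For the file-identity check it swaps in std::filesystem::equivalent as a portable stand-in for comparing the values returned by GetFileInformationByHandle, so it is not the Win32 call itself.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <filesystem>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Case-insensitive string comparison (ASCII letters in the default C locale).
bool EqualsIgnoreCase(const std::string& a, const std::string& b) {
    if (a.size() != b.size()) {
        return false;
    }
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (std::tolower(static_cast<unsigned char>(a[i])) !=
            std::tolower(static_cast<unsigned char>(b[i]))) {
            return false;
        }
    }
    return true;
}

// Returns the spelling to use when adding `candidate` to your own profile.
// If a known path refers to the same file, that known spelling is returned
// unchanged; otherwise the candidate is returned as given.
std::string ChoosePstPath(const std::string& candidate,
                          const std::vector<std::string>& knownPsts) {
    assert(!candidate.empty());

    // Pass 1: cheap string comparison, covers the casing-only scenario.
    for (const std::string& known : knownPsts) {
        if (EqualsIgnoreCase(candidate, known)) {
            return known;
        }
    }

    // Pass 2: ask the file system whether both names resolve to the same
    // file. Errors (for example, a path that does not exist) count as "no".
    for (const std::string& known : knownPsts) {
        std::error_code ec;
        if (fs::equivalent(candidate, known, ec) && !ec) {
            return known;
        }
    }
    return candidate;
}
```

With a known list containing C:\TestFiles\MyPST.pst, calling ChoosePstPath with c:\testfiles\mypst.pst hands back the known spelling, which is the one to pass along when adding the PST to your profile. The second pass only helps when the paths actually resolve on the machine running the check.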