PDB files rarely get much attention, but if you ship them to customers you will want to ensure they don't contain more information than is necessary. The debugging information they store can be used to reverse engineer or attack your code, so it is worthwhile to take a precautionary step and verify (i.e. test) that no private data is being exposed via your public symbol files.
You can use the Debug Interface Access (DIA) SDK functions to crack open and query your .pdb files and see exactly what information they provide. Then it is just a matter of creating a simple tool that can be inserted into your release process to validate the files. As a straightforward sample, we will show how to use the DIA APIs to query for keywords that could be useful to an attacker.
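1  void KeywordList(IDiaSymbol *globalSym, LPOLESTR search)
2  {
3    IDiaEnumSymbols *searchEnum = NULL;
4    globalSym->findChildren(SymTagNull, search, nsCaseInRegularExpression, &searchEnum);
5    if(searchEnum)
6    {
7      LONG numItems = 0;
8      searchEnum->get_Count(&numItems);
10     for(LONG i=0; i<numItems; i++)
11     {
12       IDiaSymbol *sym = NULL;
13       searchEnum->Item(i, &sym);
15       if(sym)
16       {
17         BSTR symName = NULL;
18         sym->get_name(&symName);
20         printf(" %ws\n", symName);
22         SysFreeString(symName);
23         sym->Release();
24         sym = NULL;
25       }
26     }
28     searchEnum->Release();
29     searchEnum = NULL;
30   }
31 }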
The "globalSym" pointer is the PDB file's main collection of symbols; to get this pointer, the typical call flow looks like:
· Create an IDiaDataSource object
· Call IDiaDataSource::loadDataFromPdb to open a specific PDB file
· Call IDiaDataSource::openSession
· Call IDiaSession::get_globalScope to get the global symbol container
On line 4 we do a case-insensitive wild-card search through the global scope (* and ? are the only supported wild-cards). If any matches are found, the result will be a collection stored in (and accessed through) the IDiaEnumSymbols interface.
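To make the matching rules concrete, here is a small stand-alone sketch of case-insensitive matching with * and ?. It is not DIA's implementation, and WildcardMatch is a hypothetical helper name; it only illustrates the semantics described above, where * matches any run of characters (including none) and ? matches exactly one.

```cpp
#include <cstddef>
#include <cwctype>
#include <string>

// Returns true if `name` matches `pattern`, ignoring case.
// '*' matches zero or more characters, '?' matches exactly one.
bool WildcardMatch(const std::wstring &pattern, const std::wstring &name)
{
    std::size_t p = 0, n = 0;
    std::size_t starP = std::wstring::npos;  // position of the last '*' seen
    std::size_t starN = 0;                   // name position that '*' currently covers up to

    while (n < name.size()) {
        if (p < pattern.size() && pattern[p] == L'*') {
            starP = p++;                     // remember the star, try matching zero chars
            starN = n;
        } else if (p < pattern.size() &&
                   (pattern[p] == L'?' ||
                    std::towlower(pattern[p]) == std::towlower(name[n]))) {
            ++p;                             // single character matched
            ++n;
        } else if (starP != std::wstring::npos) {
            p = starP + 1;                   // backtrack: let the star absorb one more char
            n = ++starN;
        } else {
            return false;
        }
    }
    while (p < pattern.size() && pattern[p] == L'*') ++p;  // trailing stars match nothing
    return p == pattern.size();
}
```

With these rules, a pattern such as `*password*` matches a name like `g_AdminPassword`, while `pass?ord` matches `PASSWORD` but not `passord`.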
The "Count" property (line 8) will tell us how many matches were found. From there we pull out each symbol using the "Item" method (line 13) and then query the symbol's name (line 18). Even though the documentation for get_name doesn’t mention anything about freeing up the returned BSTR, you will want to clean up the memory (line 22) using SysFreeString, otherwise you will end up with a memory leak.
We can then call this helper function to, for example, dump out all instances of the word "password", which is probably not something we would want to leave around for others to find.
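To turn the keyword dump into a release-process check, the symbol names can be compared against a deny list and the build failed if anything matches. The sketch below assumes the names have already been collected into a vector (for instance by storing them instead of printing them in KeywordList); FindSensitiveSymbols and the keyword list are hypothetical and only illustrate the idea.

```cpp
#include <algorithm>
#include <cwctype>
#include <string>
#include <vector>

// Lower-cases a copy of the string so comparisons ignore case.
static std::wstring ToLower(std::wstring s)
{
    std::transform(s.begin(), s.end(), s.begin(),
                   [](wchar_t c) { return static_cast<wchar_t>(std::towlower(c)); });
    return s;
}

// Returns every symbol name that contains at least one deny-listed keyword.
std::vector<std::wstring> FindSensitiveSymbols(
    const std::vector<std::wstring> &symbolNames,
    const std::vector<std::wstring> &keywords)
{
    std::vector<std::wstring> hits;
    for (const std::wstring &name : symbolNames) {
        const std::wstring lowered = ToLower(name);
        for (const std::wstring &keyword : keywords) {
            if (lowered.find(ToLower(keyword)) != std::wstring::npos) {
                hits.push_back(name);
                break;  // report each symbol only once
            }
        }
    }
    return hits;
}
```

A validation tool could return a non-zero exit code whenever the result is non-empty, so that the release script stops before the PDB files are published.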