I’m just finishing up my Master’s in Computer Science, and I was surprised when I recently got an assignment that involved parsing PDB files. Over the years I’ve been involved in many discussions about why the Microsoft PDB format isn’t public. John Robbins recently went as far as to say “The actual file format of a PDB file is a closely guarded secret”, but that’s probably an overstatement given that the CCI code, which contains a C# managed-PDB reader, has now been released on CodePlex. So needless to say, I didn’t expect to have to parse PDB files for school … especially in a computational biology class.
OK, so if you do a web search now for “PDB file format”, the first hit you get isn’t for Microsoft’s Program Database file format, but for the Protein Data Bank file format. Whoever came up with the idea that file types should be uniquely determined by a 3-letter extension that isn’t coordinated by any central registry, anyway? Now I’ve got to decide which way I want Windows to treat PDB files (and both have been pretty important to me lately):