Here is the library I found, libpff: Library and tools to access the Personal Folder File (PFF) and the Offline Folder File (OFF) format (Google Project Hosting).
It contains a shared library and tools to analyze Microsoft Outlook Personal Folder Files (PFF), including the PAB, PST and OST file types. The package includes pffexport to export items from the archived file structure, pffinfo to provide basic information about the files, and pffexport -m recover to recover and export PFF items.
Another resource I found was a library called libpst: libpst Utilities, version 0.6.59.
On that same site I found a nice, detailed review of the header data and file structure of a PST: outlook.pst
These files are structured as a B-tree, which makes them very similar to a mini filesystem. I figured out that some of the internal data could be uncompressed with the zlib library. I also found a few documents written by Joachim Metz on the structure of the PFF format. What was important for me in these documents was that the format is little-endian, along with the note that there are 2 types of PFF, 32-bit and 64-bit, which tells us whether the file is in ANSI or Unicode format, among a few other details. Luckily I didn't have to go as far as Outlook 2013, or I would have had to figure out which of 3 formats it was, because 2013 introduces another 64-bit Unicode format with 4096-byte pages.
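For the zlib part, here is a minimal sketch in Python of the idea: try to inflate a block, and fall back to the raw bytes if it is not a zlib stream. The function name `maybe_inflate` is made up for illustration, and which blocks in a given file are actually compressed is something you would have to take from the format documents, not from this snippet.

```python
import zlib

def maybe_inflate(block: bytes) -> bytes:
    """Return the inflated payload if `block` is a zlib stream, else `block` unchanged."""
    try:
        return zlib.decompress(block)
    except zlib.error:
        # Not a zlib stream (or a damaged one): treat the block as stored uncompressed.
        return block
```

This is only the decompression step; locating the blocks means walking the B-tree first.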
It didn't bother me that I was looking at PST documentation, because I figured the file types were closely related and pretty much all documented in the PFF specification. Instead of 0x53 0x4D in the file header I was dealing with 0x53 0x4F, with only a few very minor changes otherwise. Whether the file was in the 32-bit or the 64-bit format was also shown in the file header; in my case it was the 64-bit format.
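To make the header check concrete, here is a rough Python sketch. The two-byte values 0x53 0x4D and 0x53 0x4F are the ones mentioned above; everything else (the "!BDN" signature at offset 0, the content type at offset 8, the little-endian version field at offset 10, the version numbers, and the "AB" entry for PAB) is my reading of the public PFF and MS-PST documentation, so treat those as assumptions and check them against the documents. `describe_header` is a made-up name.

```python
import struct

SIGNATURE = b"!BDN"  # assumption: 4-byte file signature at offset 0
CONTENT_TYPES = {
    bytes([0x53, 0x4D]): "PST",  # "SM"
    bytes([0x53, 0x4F]): "OST",  # "SO"
    b"AB": "PAB",                # assumption, not mentioned in the post
}
# Assumption: data-version values as I understand them from the format documents.
FORMATS = {
    14: "32-bit ANSI",
    15: "32-bit ANSI",
    23: "64-bit Unicode",
    36: "64-bit Unicode, 4096-byte pages",
}

def describe_header(header: bytes) -> tuple:
    """Return (content type, format) for the first 12 bytes of a PFF file."""
    if len(header) < 12 or header[:4] != SIGNATURE:
        raise ValueError("not a PFF file")
    content_type = header[8:10]
    # The format is little-endian, hence "<H" for the 16-bit version field.
    (version,) = struct.unpack_from("<H", header, 10)
    return (
        CONTENT_TYPES.get(content_type, "unknown"),
        FORMATS.get(version, "unknown (%d)" % version),
    )

# Synthetic header for illustration only: signature, 4 filler bytes, "SO", version 23.
sample = SIGNATURE + b"\x00" * 4 + bytes([0x53, 0x4F]) + struct.pack("<H", 23)
print(describe_header(sample))  # ('OST', '64-bit Unicode')
```

For a real file you would read the first 12 bytes with `open(path, "rb").read(12)` and pass them in.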
All these documents are downloadable from that Google Code page. I had one more that I found on MSDN, but I can't remember which one it was. They provide lots of Exchange Server protocol papers, along with documentation for other Office file formats. I found lots of stuff, but I only have the downloaded content left over. I still have my search history, but I can't work out which page some of the information was on (I was on a ton of websites).
Anyway, I'd even figured out how to extract individual .msg files from the storage objects within the PST once I had it converted (which is what the customer requested). So I have all of this archived and ready to go.