AceInfinity
Emeritus, Contributor
I was with a customer, fixing some PC issues along with some networking stuff. They had PST backups and were using them directly from external hard drives (which is a problem, but I won't go into details here). The question came up of whether it was possible to see what was in a PST file without having to open it through Outlook.
I did some research...
The PST format uses two cipher algorithms (both applied directly, without a key) to encode the data blocks. Because they don't use a key, they are really just data obfuscation, so that somebody can't open the file in a hex viewer and read all of your messages that way. Only the end-user data blocks are encoded, however. Everything else (the header, the allocation metadata pages, and the BTree) is stored without any kind of encoding.
The important thing to note is that, because there is no key, the encoding is always the same. Even if the PST file is password protected, you don't control the level of security for the data obfuscated within the file. (Perhaps this is not a good thing?) Once you understand the algorithms used in the encoding process, it doesn't matter what the password is, because the instructions for decoding the data are already mapped out for you.
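To make the "no key" point concrete, here is a minimal Python sketch of a keyless byte-substitution encoding. The table is a hypothetical stand-in generated from a fixed seed, not the actual table used by the PST format, and `encode_block` / `decode_block` are names I made up for illustration. The point is only that the decode table comes from the encode table alone, so anyone who knows the table can reverse it.

```python
import random

# Hypothetical stand-in table: a fixed, publicly known permutation of 0..255.
# This is NOT the real PST table; it is built from a fixed seed purely
# so the example is reproducible.
_rng = random.Random(0)
_values = list(range(256))
_rng.shuffle(_values)
ENCODE_TABLE = bytes(_values)

# The inverse table is derived from the encode table alone.
# No password or key is involved anywhere.
_inverse = [0] * 256
for plain, coded in enumerate(ENCODE_TABLE):
    _inverse[coded] = plain
DECODE_TABLE = bytes(_inverse)


def encode_block(data: bytes) -> bytes:
    """Obfuscate a block by substituting every byte through the fixed table."""
    return data.translate(ENCODE_TABLE)


def decode_block(data: bytes) -> bytes:
    """Reverse the substitution using the inverse table."""
    return data.translate(DECODE_TABLE)


message = b"Subject: quarterly numbers"
obfuscated = encode_block(message)
recovered = decode_block(obfuscated)
print(obfuscated.hex())
print(recovered)
```

Whatever password is set on the file, `decode_block` would be the same function, which is exactly why this kind of encoding only hides data from a casual hex view.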
As for the password, it is stored as a property in the message store, as a CRC-32 hash of the original plaintext password. When the password is validated, the plaintext given by the user trying to open the file is run through CRC-32 and compared with the stored value for equality. If they match, the file is decoded and its contents are shown to that person.
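The problem with this is that CRC-32 is only a 32-bit checksum, not a cryptographic hash. Like MD5, it allows hash collisions, only far more easily: there are only about 4.3 billion possible values, so many different passwords produce the same hash, and any one of them will be accepted. That makes it a weak spot for brute forcing.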
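The check described above can be sketched roughly like this in Python. I'm assuming the standard `zlib.crc32` as a stand-in (the CRC variant and password encoding a real PST uses may differ), and `password_hash`, `check_password` and `brute_force` are hypothetical names for illustration only. The search space is kept tiny so it finishes instantly.

```python
import itertools
import string
import zlib


def password_hash(password: str) -> int:
    # Assumption: plain zlib CRC-32 over the UTF-8 bytes, as a stand-in
    # for whatever CRC variant the file format actually uses.
    return zlib.crc32(password.encode("utf-8"))


def check_password(candidate: str, stored_hash: int) -> bool:
    # The check only compares 32-bit values, so ANY string with the
    # same CRC-32 is accepted, not just the original password.
    return password_hash(candidate) == stored_hash


def brute_force(stored_hash: int,
                alphabet: str = string.ascii_lowercase,
                max_len: int = 3):
    """Try every string up to max_len and return the first one that passes."""
    for length in range(1, max_len + 1):
        for chars in itertools.product(alphabet, repeat=length):
            candidate = "".join(chars)
            if check_password(candidate, stored_hash):
                return candidate
    return None


stored = password_hash("ace")   # what would sit in the message store
found = brute_force(stored)     # attacker only needs a string that matches
print(hex(stored), found)
```

With a longer alphabet and length limit the loop takes more time, but the target is still only one of 2^32 values, so a match (the real password or a collision) is always within reach.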
Microsoft knows about this though. I just thought I would share this information in case any of you are relying on password-protected PST files being secure. The main point I am trying to make is that decoding the obfuscated data doesn't rely on any key derived from the password. I could write the decoding process once, compile it into a program, and use it over and over again to view the contents of any such file without knowing the user's password. It would be the exact same process from start to finish, regardless of what password was set on the file. Thus the password seems kind of pointless: I could make it "a" or "sEc09X4@!r60", and it wouldn't matter. The latter would only be better if someone decided to brute force the password itself. So don't rely on the PST password alone if you have sensitive data in your emails that you don't want seen and you think others have access to the PST file.
That password is basically just a way of saying to Outlook, for instance, when you want to view the file: "Okay, I don't know how the encoding algorithm works well enough to decode this file myself, so I'll give you the password, and you can do it for me." In essence, it's just a "bribe", if you think about it that way. Like giving a piece of candy to someone because you don't understand the decoding algorithm well enough to do it on your own.
~Ace