Generating a Key from a Password


If you’re trying to encrypt data using a password, how do you convert the password into a key for symmetric encryption? The easiest way might be to simply convert the password to a byte array, and use this array as your key. However, this is a very bad idea and will lead to an easily cracked system. First of all, for a 256 bit encryption algorithm your passwords would all have to be exactly 32 bytes, or you would end up with not enough bits for the key; or worse, too many bits for the key, meaning that every password that starts with the same eight characters will work to decrypt the data.


In the English language, passwords will probably only contain the characters a-z, A-Z, 0-9, and perhaps some symbols. This means that only 70-75 of the possible 256 bit combinations for each byte will be used. In addition, not every symbol will be used with even frequency. Various estimates for the entropy in a single byte of an English language password vary from 1.3 to 4 bits. This means that in order to generate a good 256 bit key, you’ll need a password that’s anywhere from 64 to 197 bytes long!


Fortunately the .Net Framework has provided several ways for you to convert passwords to keys, all of which are much better alternatives than the simple method mentioned above. All of the methods I mention below will always generate the same key given the same set of inputs, so they can be use to effectively create password-based encryption in your code.


CryptDeriveKey


The PasswordDeriveBytes class is available in current releases of the framework, and contains two methods for you to generate keys.


One way of using PasswordDeriveBytes is as a simple wrapper around CAPI’s CryptDeriveKey function. This is done by calling the appropriately named CryptDeriveKey method of PasswordDeriveBytes. When calling CryptDeriveKey, you’ll be using the password found in the class constructor, but you’ll need to pass in the cryptographic algorithm that you’re generating a key for, the size of the key that you’d like to use, and the name of the hash algorithm you’d like to use to generate the key.  When calling CryptDeriveKey, the salt and iteration count that are set on the PasswordDeriveBytes object are not used, so even having different salts and iteration counts will produce the same key given that the rest of the inputs are also the same. (I’ll discuss the use of the salt and iteration count later.)


Some sample code might clear up the use of CryptDeriveKey.


PasswordDeriveBytes cdk = new PasswordDeriveBytes(“P@$$w0rd”null);

// generate an RC2 key
byte[] iv = new byte[] { 0, 0, 0, 0, 0, 0, 0, 0 };
byte[] key = cdk.CryptDeriveKey(“RC2”“SHA1”, 128, iv);
Console.WriteLine(key.Length * 8);
        
// setup an RC2 object to encrypt with the derived key
RC2CryptoServiceProvider rc2 = new RC2CryptoServiceProvider();
rc2.Key = key;
rc2.IV = new byte[] { 21, 22, 23, 24, 25, 26, 27, 28};
        
// now encrypt with it
byte[] plaintext = Encoding.UTF8.GetBytes(“Message”);
MemoryStream ms = new MemoryStream();
CryptoStream cs = new CryptoStream(ms, rc2.CreateEncryptor(), CryptoStreamMode.Write)

cs.Write(plaintext, 0, plaintext.Length);
cs.Close();
byte[] encrypted = ms.ToArray()

 


The sample generates a 128 bit key to use with the RC2 algorithm. This key is generated by using the SHA1 hash algorithm. The IV parameter to CryptDeriveKey is actually an out parameter — it’s supposed to represent the IV that you can use with your algorithm. However, current implementations just set this to an array of zeros, so you’ll need an IV anyway. (Obviously my IV and password aren’t the strongest in the world.)


I occasionally get asked what algorithms you can pass to CryptDeriveKey. Since CryptDeriveKey is a wrapper around CAPI, you need to pass a parameter that CAPI can understand. This means, it must be a string that CryptoConfig will map to a *CryptoServiceProvider class. (Actually, this is simplified a bit …. any algorithm that has the same OID as a CSP class will work, but for all intents and purposes, stick with the CryptoServiceProvider classes). In v1.1 of the framework, these classes are:




Hash Algorithms























String Implementation
http://www.w3.org/2000/09/xmldsig#sha1 System.Security.Cryptography.SHA1CryptoServiceProvider
MD5 System.Security.Cryptography.MD5CryptoServiceProvider
SHA System.Security.Cryptography.SHA1CryptoServiceProvider
SHA1 System.Security.Cryptography.SHA1CryptoServiceProvider
System.Security.Cryptography.HashAlgorithm System.Security.Cryptography.SHA1CryptoServiceProvider
System.Security.Cryptography.MD5 System.Security.Cryptography.MD5CryptoServiceProvider
System.Security.Cryptography.SHA1 System.Security.Cryptography.SHA1CryptoServiceProvider




Symmetric Algorithms


























String Implementation
3DES System.Security.Cryptography.TripleDESCryptoServiceProvider
DES System.Security.Cryptography.DESCryptoServiceProvider
RC2 System.Security.Cryptography.RC2CryptoServiceProvider
System.Security.Cryptography.DES System.Security.Cryptography.DESCryptoServiceProvider
System.Security.Cryptography.RC2 System.Security.Cryptography.RC2CryptoServiceProvider
System.Security.Cryptography.TripleDES System.Security.Cryptography.TripleDESCryptoServiceProvider
Triple DES System.Security.Cryptography.TripleDESCryptoServiceProvider
TripleDES System.Security.Cryptography.TripleDESCryptoServiceProvider


PBKDF1


The other use of PasswordDeriveBytes is as an implementation of the PBKDF1 algorithm, specified in RFC 2898, section 5.1. (PBKDF stands for Password Based Key Derivation Function). PBKDF1 is a pretty simple algorithm:


  1. Concatenate the Password and Salt: R0 = Pwd + Salt
  2. Hash the result Iteration Count times: Rn = Hash(Rn – 1)
  3. The result is the Rn where n = Iteration Count

As you can see, PBKDF1 uses a salt to reduce the risk of a dictionary attack.  Having a large salt will reduce the risk that an attacker can create a list of the output keys for a set of given passwords. Instead of just having to calculate one key, the attacker would have to calculate one key for each salt.  RSA, who developed the algorithm as a part of PKCS #5, recommends a salt of at least 64 bits.  A 64 bit salt would mean the attacker would need to generate 2^64 keys for each password in order to use a dictionary attack.


In addition, an iteration count is required. In general, the larger the iteration count, the stronger the resulting key. Here, RSA recommends a value of at least 1000. PBKDF1 uses either MD4, MD5, or SHA1 as its underlying hash algorithm, although PasswordDeriveBytes will not care if you use another hash algorithm.


In order to generate your key, you call GetBytes passing in the number of bytes you need for a key. One neat thing about the output of GetBytes is that getting two smaller byte arrays is the same as getting one larger one. For instance, two arrays of 20 bytes put together will be the same as a call to GetBytes(40). You can call GetBytes as many times as you need. PBKDF1 will only produce the number of bytes that the hash algorithm generates, but GetBytes will extend the result further, allowing you to get a large number of bytes out of your password.


The following sample is a rewrite of the CryptDeriveKey sample, using PBKDF1. In this case, I’m using a simple salt, an iteration count of 1000, and the SHA1 hash algorithm. I also use PasswordDeriveBytes to generate an IV for me.


// setup the password generator
byte[] salt = new byte[] { 0, 1, 2, 3, 4, 5, 6, 7 };
PasswordDeriveBytes pwdGen = new PasswordDeriveBytes(“P@$$w0rd”, salt);
pwdGen.IterationCount = 1000;
pwdGen.HashName = “SHA1”;
        
// generate an RC2 key
byte[] key = pwdGen.GetBytes(16);
byte[] iv = pwdGen.GetBytes(8);
        
// setup an RC2 object to encrypt with the derived key
RC2CryptoServiceProvider rc2 = new RC2CryptoServiceProvider();
rc2.Key = key;
rc2.IV = iv;
        
// now encrypt with it
byte[] plaintext = Encoding.UTF8.GetBytes(“Message”);
MemoryStream ms = new MemoryStream();
CryptoStream cs = new CryptoStream(ms, rc2.CreateEncryptor(), CryptoStreamMode.Write);
        
cs.Write(plaintext, 0, plaintext.Length);
cs.Close();
byte[] encrypted = ms.ToArray();

 


PBKDF2


With the release of Whidbey, the frameworks will have an implementation of PBKDF2, in the class Rfc2898DeriveBytes. PBKDF2 is also defined in RFC 2898, in section 5.2. The big advantage of PBKDF2 over PBKDF1 is that the output from PBKDF2 is not bounded to the size of a hash algorithm’s output. Although the .Net implementation of PBKDF1 does not impose this limitation on you, in order to be more secure I’d recommended that you move to PBKDF2 when you make the move to Whidbey. As you’ll see, moving from PasswordDeriveBytes to Rfc2898DeriveBytes is trivial since Rfc2898DeriveBytes works in much the same way as PasswordDeriveBytes. You start by setting up a password, salt, and iteration count (PBKDF2 uses HMACSHA1 as an underlying pseudo-random generator). Then you call GetBytes repeatedly until you have enough data to encrypt with. The following code is a rewrite of the PBKDF1 code to use PBKDF2:


// Setup the password generator
byte[] salt = new byte[] { 0, 1, 2, 3, 4, 5, 6, 7 };
Rfc2898DeriveBytes pwdGen = new Rfc2898DeriveBytes(“P@$$w0rd”, salt, 1000);
        
// generate an RC2 key
byte[] key = pwdGen.GetBytes(16);
byte[] iv = pwdGen.GetBytes(8);
        
// setup an RC2 object to encrypt with the derived key
RC2CryptoServiceProvider rc2 = new RC2CryptoServiceProvider();
rc2.Key = key;
rc2.IV = iv;
        
// now encrypt with it
byte[] plaintext = Encoding.UTF8.GetBytes(“Message”);
MemoryStream ms = new MemoryStream();
CryptoStream cs = new CryptoStream(ms, rc2.CreateEncryptor(), CryptoStreamMode.Write);
        
cs.Write(plaintext, 0, plaintext.Length);
cs.Close();
byte[] encrypted = ms.ToArray();

 


As you can see, the only change to the code is replacing the PasswordDeriveBytes class with an Rfc2898DeriveBytes class.


Other changes for Whidbey include the ability to set a byte array password (on both PasswordDeriveBytes and Rfc2898DeriveBytes), which increases security by allowing you more control over the password object. (More on that in a future post.)


Hopefully this has provided some useful examples of how to generate a good cryptographic key using a password.

Comments (31)

  1. Just a few comments:

    Isn’t the strength of GetBytes is related to the hash algorithm (nothing magic behind the scenes)? If so, no one should ever get more bytes out than the hash algorithm generates, since they aren’t as strong after the hash algorithm "runs out". This is why the PBKDF1 specs say that if the requested length is greater than the hash size, fail.

    An IV should not be derived from the password. Passwords are likely to be reused (and thus the resulting key and IV would be the same). The same key / IV combo should never be reused. The IV is not sensitive, so generating a random one (or incrementing a sequential one) is fine, and the IV can be send along with the message (as a prefix).

    Salt most definitely does NOT increase password entropy. Salt, like the IV, is not secured. The only entropy comes from the password itself. Salt serves to stop dictionary-like attacks, or attacks against multiple hashes (salt makes a given plaintext always compute to different hash (since the salt modifies the plaintext and it’s no longer the same plaintext)). Regardless of salt, the password entropy is related only to the password itself. Even adding 1024-bits of random salt would do nothing more (except increase processing time a tad).

    To increase password strength, the iterations come into play. Iterations can add n-bits (where n = log2(iterations)) of work required to brute force the password, up to normal constraints (i.e., not going to get past 64-bits on a 128-bit hash). The password entropy still remains the same.

    Also, this is a good way to store password hashes in general, not just for keys. I wish more apps would take this approach (hey, say, like Windows NT password hashes :)).

  2. Shawn says:

    >Isn’t the strength of GetBytes is related to the hash algorithm (nothing magic behind the scenes)? If so, no one should ever get more bytes out than the hash algorithm generates, since they aren’t as strong after the hash algorithm "runs out".

    PasswordDeriveBytes extends PBKDF1 in the case that you start requesting more bytes than the hash algorithm can provide. However, I do recommend using PBKDF2 if you are moving to Whidbey.

    > An IV should not be derived from the password. Passwords are likely to be reused (and thus the resulting key and IV would be the same). The same key / IV combo should never be reused. The IV is not sensitive, so generating a random one (or incrementing a sequential one) is fine, and the IV can be send along with the message (as a prefix).

    I agree that you should never use the same key / IV combo, and your solution to this problem works. Perhaps I should have been more clear in my posting; I was thinking that the salt would change every time, which would lead to different key / IV combos. This way the salt and the IV don’t both have to be sent along with the encrypted text, only the salt would have to go.

    > Salt most definitely does NOT increase password entropy. Salt, like the IV, is not secured. The only entropy comes from the password itself. Salt serves to stop dictionary-like attacks, or attacks against multiple hashes

    True enough…..updated the post to reflect that.

    Thanks for your comments!

    -Shawn

  3. Well, transmitting a salt or an IV in plaintext comes out to the same thing. A 64-bit salt or a 64-bit IV — user’s choice :). Either way, you’ve got some initialization (be it for the cipherstream or the hashing) being sent.

  4. Shawn says:

    Right, but if you choose to send the IV than you need to also send the salt because there’s no way to calculate that (unless you have it hardcoded in your application, or have some other pre-determined way to create a common salt). This way, I only have to send the salt, and the IV can be calculated from the salt + password.

  5. Anonymous Coward says:

    As noted in Section 5.2 and Appendix B.1.1 of RFC 2898, the entropy (search space) of a PBKDF2/HMAC-SHA1-mapped passphrase is limited to 160 bits. Generally speaking, that’s big enough, as the RFC itself notes. However, I’m wondering whether the entropy might potentially be somewhat less than that?

    What concerns me here is the iterative part of the PBKDF2 algorithm: U_n+1 := PRF(P, U_n), where PRF = HMAC-SHA1. There’s no guarantee I’m aware of that HMAC_SHA1 maps 160-bit intropy to 160-bit entropy. It certainly can’t increase the entropy, and hance almost certainly decreases it (assuming HMAC-SHA1 isn’t bijective/invertible on 160-bite blocks, which is isn’t designed to be, since it’s a hash function not an encryption/decryption function).

    Thus, if HMAC-SHA1 decreases entropy by even a fraction of a bit per iteration, a large iteration count is going to decrease the overall entropy unacceptably. In the limit, it’s not out of the question that HMAC-SHA1 has a small fixed set (i.e., a subset of 160-bit space such that HMAC-SHA1 always maps that subset into itself), and that fixed set is reached after some not-too-unreasonable number of iterations (say 1,000,000). If that fixed set is small enough. and the attacker knows the IC is > 1,000,000 (which must be assumed, given that the IC must be assumed to be public knowledge), then the attacker "wins".

    Granted, the workfactor involved in computing a high IC does still make the attacker’s life harder, but that’s no the point here. The point is simply one of entropy.

    So, my question is: What’s the state of understanding about this iterated-HMAC-SHA1 qustion? (This question is independend of the compression function that goes into the construction of SHA1 itself, of course.)

  6. Anonymous Coward says:

    Pardon my typos in the preceding post (sunny day, on my deck, couldn’t see screen), but I think the ideas are correct …

  7. Shawn says:

    Sitting on the deck on a sunny day …… you’re making me jealous :-)

    I haven’t seen a formal proof anywhere that iterations do not decrease the effectiveness of PBKDF2, in fact, it’s pretty generally accepted that iteration count increases the effectiveness.

    Unfortunately I couldn’t find any resources to point you at. Since the RSA people are the ones who came up with PBKDF2, I’d recommend starting a search over on their site to see if you can find anything. If you do find some information, please post it back here …. it’d be interesting to read.

  8. Anonymous Coward says:

    Thanks for the response. I tend to agree with your comment that "IC increases the *effectiveness*" (which I phrased as *workfactor* in my original post). As I acknowledged, the question was a knowingly "academic" one about pure-entropy, as opposed to overall practicality of the design. But that’s fair game in the crypto business: an "provably academically secure" technique [admittedly within the bounds of cryptographic/probabilistic surety] is always more reassuring than a "merely practically hard" technique.

    I have indeed done a web search or two on this topic, but haven’t found diddly. I was sort-of hoping you’d been more lucky. RSA is more likely to talk to Microsoft than to an "anonymous coward" like me … :-)

  9. noulouk says:

    Hello, I’m very interested in Whidbey security and how to upgrade my application (.NET 1.1)when ASP .NET 2.0 will be released.

    If you have time, could you help me and follow the link :

    http://www.asp.net/Forums/ShowPost.aspx?tabindex=1&PostID=657449

    Thanks in advance for any help.

  10. I am doing some research on how I can efficiently encrypt data .NET 1.1 and decrypt with VC++ 6.0, I found the following links very useful: the old and the new? WINCRYPT.H Generating a Key from a Password CryptDeriveKey…

  11. There’s a ton of new and enhanced security features coming with the v2.0 release of the CLR.  However,…

  12. There’s a ton of new and enhanced security features coming with the v2.0 release of the CLR.  However,…

  13. Хранить пароли в открытом виде — это клиническая форма легкомыслия. Идеальное ре

  14. Ngaga, Gisse says:

    Dear Sir,

    I have the following problem:

    I propose to use the hash function output as an input to the AES algorithm.

    The output of the AES key is used to encrypt message M.

    Question:

    1) I want to use a hash output (e.g. 160 bits stream) as an input to the the AES algorithm  1) I want to use a hash output. What techniques should be used to match the hash ouput and the AES input ?,  Or what actually should be done?

    2) Question1, but Hash(160 bits) ouput  less than the AES (192 bits) input.

    3) Are there any known functions that can solve the problem?

    Kind regards,

    Ngaga, Gisse

  15. shawnfa says:

    You should use Rfc2898DeriveBytes class mentioned in this post, rather than just the hash value, since as you note you’re limited in the number of bits you can create.

  16. Itamar says:

    I have the following problem:

    A c++ application that get encrypted message from a NET application. The C++ is using Crypto API and the NET should be using what is availabel in the NET. Each App can Ecrypt/Decrypt its own but I can’t get the C++ to decript NET encrypted Message. Using RC2,

    (RC4 is not valailabel on NET and AES not available on Windows 2000 without the NET frame work), the crypto API is using hashed password (MD5) and the NET is using : Password,Salt and IV. How the Salt and IV on the NET should be handled at the C++ end ?  

  17. shawnfa says:

    You need to use the same algorithm to derive your key on both ends.  Both CAPI and .NET Support CryptDeriveKey which you should probably be using.

    -Shawn

  18. harafeh says:

    I have the same problem as Itamar above. We have old VB 6.0 code that uses the MS base cryptographic service written using calls to the advapi32.dll (things like CryptEncrypt, CryptCreateHash and so on).

    I ported the code to C# (still using CAPI) and it works fine in that I can encode a string in C# and decode it in the old code and vice versa.

    The problem is when I try to use the .Net native calls (SymmetricAlgorithm RC2 and other .Net classes). I can encrypt and decrypt in .Net, but the strings are not interchangeable between the two versions.

    As far as I can tell, I’m using the exact same parameters, at least the ones I can get to. For example, .Net won’t let me set the provider name so it uses the enhanced MS provider while the old code uses the basic one. But even if I change the old code to use the enhanced provider, I cannot interchange strings between the two. This must mean that I’m missing some parameter that is set by default in either version.

    I am using CryptDeriveKey("RC2", "MD5", 40, iv) in C# and I’m positive these are the same parameters in CAPI (with iv being all zeros).

    Any ideas?

  19. shawnfa says:

    Hi Harafeh,

    The CryptDeriveKey API does the following:

    * Hash the incoming password using the specified hash algorithm

    * Call the CAPI CryptDeriveKey API, using that hash object

    * Return the resulting key

    -Shawn

  20. yo mama says:

    in what world does this make sense?

    "First of all, for a 256 bit encryption algorithm your passwords would all have to be exactly 8 bytes"

    8 bytes * 8 bits/byte = 64 bits.

    32 bytes * 8 bits/byte = 256 bits.

  21. shawnfa says:

    Yep, you’re right.  I’ve updated the post.

    -Shawn

  22. Even though this was written 4 years ago, this post is still beautiful and useful.

    I translated the russian comment above ‘Хранить пароли в открытом виде — это клиническая форма легкомыслия. Идеальное ре ‘ (google translate provided: Store passwords in clear text – is the clinical form of levity. Perfect D)

    So to follow up with our Russian friend, how and where is a safe place to store the password?

  23. Steve Fox says:

    I’ve got a situation where I need to use Wincrypt and .Net to implement an AES encryption/decryption.  Basically I’ve got most of the application using .Net for AES, but I’ve got one piece that has to use non .Net and therefore I want to use Wincrypt.  However, I’m having issues generating the same key and as an end result the same encryption.

    The main issue I’ve figured is that Wincrypt has a very strict method to generate keys/perform the encryption. HOwever, the .Net side is the issue.  I can’t seem to figure out how to duplicate the Wincrypt process in .Net.  Any ideas help would be greatly appreciated.

  24. Stuart says:

    A poster incorrectly pointed out that PBKDF2/HMAC-SHA1 might possibly reduce to a small set after a significant number of iterations. A closer examination of the algorithm shows that all the intermediate Hash values are in fact XORed together to prevent this condition. In effect the first Hash output acts as an OTP (one time pad) for those that follow, and so on. At worst, the effect of a small set of high iteration hash values is to reduce the benefit of increasing the iteration count further, it cannot conceivably reduce security.

    Also, Appendix B.1.1 of RFC 2898 only applies to the key since SHA1 only propagates 20bytes (160bits) of internal state from one 64byte block to the next. But, if the "salt" is not known to our attacker (since we are dealing with a general KDF we could be using it in areas other than the discussed PW, Salt case) we could extend the search space further (up to 40bytes). Even if we consider the first 64byte-block to only pass on 20bytes of internal state then the next 64byte fed to SHA1 block should be able to output any 20byte output state. So as long as the second block is different for each additional 20bytes of PBKDF2 output and unknown to our attacker the search space for all our output is indeed higher than the 160bit space for just one block. To achieve this the salt must be less than 60bytes so the 4byte PBKDF block counter can be appended to it in the same 64byte internal hash block. However, this is clearly a special case with ill-defined security bounds and in general cannot be assumed to apply.