How much strength does PKCS #5 password-based key derivation (PBKDF2) add to a key?

Last week I posted my tests with Markov chains as memorable passphrase generators. This weekend I’m exploring some ideas for introducing more entropy into the resultant chains so they can be shorter and hopefully more memorable. In the midst of that I remembered something from my previous life doing crypto stuff: brute-force attacks on passwords can be made more difficult by using a key derivation function that performs some computationally expensive operation on the password to produce a cryptographic key. If this is done right, it means the attacker must perform this expensive function for each password he wants to try, which can make an otherwise easily cracked password better and an already strong password stronger still.

The standard algorithm for this is specified in PKCS #5 v2, and is called PBKDF2 (Password-Based Key Derivation Function, version 2). You can see the spec for details, but it basically consists of a number of hash iterations which transform a password and salt value into binary key material. Depending on how many iterations you specify, PBKDF2 can be as fast or as slow as you need.

The beauty of this solution is that it makes brute-forcing keys orders of magnitude harder, without imposing a significant burden on legitimate users. For example, if the iteration count is so high that the PBKDF2 function takes one second to run (which is a ridiculously large number of iterations), it takes your computer an additional one second of computation to authenticate you based on your password. But an attacker trying to guess your password from a dictionary has to spend that additional second of computation once for each password in his list, which could easily be tens of millions of words long.

PBKDF2 is used in a number of real-world systems, most notably the WPA/PSK security specification in the 802.11(g) wireless networking (WiFi) standard. As a result, the best known attack on WPA/PSK-secured networks is to brute-force the key, which is a helluva lot stronger than WEP which can be cracked in a few seconds.

That’s all fine, but when I generate my Markov text I know how much entropy is represented in the string, and in my experience it’s not much (16 to 80 bits is typical for my test data). If I’m using this text to derive a 128-bit AES key, that weakens the resulting key considerably. But what if I could use a PBKDF to make it harder for attackers to brute-force the Markov text, making the cracking effort equivalent to brute-forcing a 128-bit AES key? That would be helpful as it would allow users to use shorter and less sophisticated passphrases without losing security. However, how many bits of security does the PBKDF add?

That was the question I sought to answer. The only real guidance I can find is in the RFC for AES in Kerberos, in which the author benchmarks PBKDF2 iterations on the his machine to determine how many can be done per second, then estimates how long it would take an attacker with the same machine to brute-force a list of 2^32 passwords (that’s over 4 billion for those of you on the decimal system). The number he came up with isn’t important, because it wasn’t particularly helpful. An attacker will have more than a single Pentium 4, and I don’t need an estimated time to brute-force the derived key, I need a number relative to the time to brute-force without the derived key so I can determine how strong the input key needs to be to satisfy my security goals.

Well, I couldn’t find anyone else asking that question, so I wrote a little benchmark tool to help me. It uses the superb LibTomCrypt cryptography library, and is in my SVN repository here. It figures out roughly how many AES ECB decrypt operations can be performed in one second, then how many PBKDF2 key derivation/AES ECB decrypt operations in one second as well. It compares these two numbers and translates the difference into the number of bits of additional AES key length the time difference is equivalent to.

If that all sounds a little speculative, that’s because it is. It assumes the ratio between an AES decrypt operation and the PBKDF2 function is the same on my machine as it would be for an attacker, but in reality a determined attacker would probably use an array of dedicated FPGA hardware crackers or some other implementation specifically tuned for brute-forcing which might have different AES/PBKDF2 performance characteristics. However, I think it provides a back-of-the-envelop estimate of the additional effort PBKDF2 requires of an attacker, and roughly how much more entropy in the input key would require the same amount of effort to crack.

Enough of the theory (such as it is); what were the results? Well:

AES Key Size PBKDF2 iteration count PBKDF2 hash function AES ECB decrypts/second PBKDF2/AES ECB decrypts/second Equivalent additional key bits
128 bits 2000 iterations SHA256 1,139,930.82 72.62 13.94 bits
128 bits 4000 iterations SHA256 1,144,402.52 36.05 14.94 bits
256 bits 2000 iterations SHA512 838,315.36 35.81 14.51 bits
256 bits 2000 iterations SHA256 833,425.22 34.96 14.54 bits
256 bits 4000 iterations SHA256 845,963.13 17.60 15.55 bits
256 bits 4000 iterations SHA512 831,825.83 17.49 15.54 bits

So if my methods are to be believed, using a 2000 to 4000 iteration PBKDF2 to derive a key from user input makes the attacker do extra work equivalent to an additional 13 to 16 bits of entropy in the input key. That’s a huge increase in the attacker’s computational workload, and seems well worth the additional effort.

Tags: , , , , ,

Leave a Reply