Passwords have been around for millenia. The oldest use was to prove to sentries whether you were friend or foe. (I use “password” in this article to include a wide variety of schemes, including passphrases and numerical PINs).
Choosing a proper password has always required considering the attacker’s capabilities and limitations. There are several factors at play:
- Entropy — How many possibilities does the attacker need to consider? It may be lower than you’d think since it only depends on how little an attacker knows about you. If she has retrieved paper from your shredder, there may only be a few possibilities to try, even though the password itself is complex and impossible for someone else to guess.
- Guess rate — How quickly can the attacker try guesses? This is often determined by the attacker’s vantage point. If they have hashes from your server, an offline attack is many times faster than an online attack. It can also be limited by the attacker’s own hardware and by the hashing algorithm.
- Responses — What can you do about guessing? Can you disable a user’s account if there are too many bad attempts? Require secondary authentication? Can you shut down the entire system?
Unix systems in the 1980′s and 90′s limited passwords to 8 characters because of the the underlying DES cipher key size. Because this is too short to use multiple words, most recommendations from that time tried to maximize the entropy of every character while remaining memorable. A common suggestion was to take the first letter of each word in a phrase and mix in some punctuation and numerals. This kind of scheme persists to this day, with many websites enforcing a minimum (and sometimes maximum) length and the use of an uppercase letter or numeral.
Password cracking programs such as Crack (Unix) and Cracker Jack (DOS) targeted this scheme. To mirror user behavior, they would take a dictionary (wordlist) and append numerals or change case. A useful strategy would be to start with a common wordlist and add in local terms such as sports teams or city names. After a few passwords were cracked, you could identify patterns (such as user nationality or college major) and add similar terms to your set. But as long as the user didn’t use too short of a password or an actual word or close variant as the base string, they would usually be secure against Crack.
With the advent of the FreeBSD-MD5 scheme in the early 90′s, passwords could now be arbitrarily long. This brought login systems in line with PGP, which had supported long passwords for a while. The recommended scheme then changed to “use a difficult-to-guess passphrase.” However, not many concrete recommendations were made for what makes a passphrase difficult enough.
Many users thought that just having any passphrase was difficult enough. Who could guess all the letters and spaces among multiple words? While this might have been true if attackers stuck to Cracker Jack, it ignores the fact that attackers can change strategies. Each word can be treated like a single character as before. As long as the words were in a dictionary, multi-word passphrases might have less entropy than a password constructed the old way. Newer tools like John the Ripper help target passphrases.
In choosing a password, consider the entropy for multiple attacker vantage points. How much advantage would a co-worker have over a random stranger? Do they realize you like good Scotch and might use those names in your passphrase? Know you like Will Farrell movies and might use a quote from one? A good passphrase is one where even your spouse would not have an advantage over a stranger.
Additional entropy can be gained by varying it. Misspell or make up words, Dr. Seuss style (but don’t use words from his books!) Ever heard of a “omliyeti”? Me neither, but it might be memorable. Don’t capitalize the first word or put the punctuation (if any) at the end. Put spaces in the middle of words but run the beginning/ends together.
Admins can suggest schemes to help users pick good passwords, and they can attempt to crack their choices to establish password strength. But a user might still pick a low-entropy password that happens to pass this check. Fortunately, the second two factors above (guess rate and responses) are independent of entropy yet still have a big impact on actual password security.
The bcrypt and scrypt password hashing algorithms have greatly slowed the attacker’s guess rate. They use hash functions that are intentionally slow (and in the case of scrypt, memory intensive). More importantly, they have a tunable difficulty parameter that allows the admin to keep pace with Moore’s Law.
Responses can be very important as well. PINs can be numeric and short because access is usually limited to online guessing with lockout after a few tries. One approach I’ve used before is to seed the password file with fake accounts that have easier passwords than the rest (but still hard enough to prevent online guessing). If anyone logs in to them, we know the password file has been retrieved and someone is cracking it.
Another response would be to require secondary authentication. Google does this with their text message authentication. Duo Security provides a phone app. This can be required all the time or activated when the user logs in from a new IP address or doesn’t have the prerequisite cookie.
Password security is a difficult problem, especially with a varied user base. However, most admins focus too much on increasing entropy of user choices and not enough on decreasing the attacker’s guess rate and implementing responses to limit their access when they do get a hit.