Passwords have been around for millennia. The oldest use was to prove to sentries whether you were friend or foe. (I use “password” in this article to include a wide variety of schemes, including passphrases and numerical PINs.)
Choosing a proper password has always required considering the attacker’s capabilities and limitations. There are several factors at play:
- Entropy — How many possibilities does the attacker need to consider? It may be lower than you’d think since it only depends on how little an attacker knows about you. If she has retrieved paper from your shredder, there may only be a few possibilities to try, even though the password itself is complex and impossible for someone else to guess.
- Guess rate — How quickly can the attacker try guesses? This is often determined by the attacker’s vantage point. If they have hashes from your server, an offline attack is many times faster than an online attack. It can also be limited by the attacker’s own hardware and by the hashing algorithm.
- Responses — What can you do about guessing? Can you disable a user’s account if there are too many bad attempts? Require secondary authentication? Can you shut down the entire system?
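To see how the first two factors interact, here is a rough sketch of the arithmetic. The function name and the example numbers (8 random lowercase letters, 1,000 versus 1,000,000 guesses per second) are illustrative assumptions, not measurements.

```python
import math

def seconds_to_exhaust(entropy_bits: float, guesses_per_second: float) -> float:
    """Worst-case time to try every candidate; the average case is half of this."""
    return 2 ** entropy_bits / guesses_per_second

# Illustrative only: 8 characters chosen uniformly from 26 lowercase letters.
bits = 8 * math.log2(26)
for rate in (1_000, 1_000_000):
    days = seconds_to_exhaust(bits, rate) / 86_400
    print(f"{bits:.1f} bits at {rate:,} guesses/s: about {days:,.0f} days")
```

The point is that the same password can be adequate against one vantage point and hopeless against another, since the guess rate divides directly into the search space.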
Unix systems in the 1980’s and 90’s limited passwords to 8 characters because of the underlying DES cipher key size. Because this is too short to use multiple words, most recommendations from that time tried to maximize the entropy of every character while remaining memorable. A common suggestion was to take the first letter of each word in a phrase and mix in some punctuation and numerals. This kind of scheme persists to this day, with many websites enforcing a minimum (and sometimes maximum) length and the use of an uppercase letter or numeral.
Password cracking programs such as Crack (Unix) and Cracker Jack (DOS) targeted this scheme. To mirror user behavior, they would take a dictionary (wordlist) and append numerals or change case. A useful strategy would be to start with a common wordlist and add in local terms such as sports teams or city names. After a few passwords were cracked, you could identify patterns (such as user nationality or college major) and add similar terms to your set. But as long as the user didn’t use too short of a password or an actual word or close variant as the base string, they would usually be secure against Crack.
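The wordlist-plus-variations approach can be sketched in a few lines. This is only an illustration of the idea: the suffixes, the case rules, and the sample words are assumptions, not Crack’s or Cracker Jack’s actual rule sets.

```python
from itertools import product

def mangle(word: str, suffixes=("", "1", "123", "!")):
    """Yield simple variants of one wordlist entry: case changes plus a suffix."""
    cases = {word.lower(), word.capitalize(), word.upper()}
    for base, suffix in product(sorted(cases), suffixes):
        yield base + suffix

def candidates(wordlist, local_terms=()):
    """Common wordlist first, then site-specific terms, each with its variants."""
    seen = set()
    for word in list(wordlist) + list(local_terms):
        for guess in mangle(word):
            if guess not in seen:
                seen.add(guess)
                yield guess

# Hypothetical inputs: two common words plus one local sports team.
guesses = list(candidates(["monday", "dragon"], local_terms=["packers"]))
```

Each base word only multiplies the search by a small constant, which is why a password built on a real word or a close variant falls quickly.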
With the advent of the FreeBSD-MD5 scheme in the early 90’s, passwords could now be arbitrarily long. This brought login systems in line with PGP, which had supported long passwords for a while. The recommended scheme then changed to “use a difficult-to-guess passphrase.” However, not many concrete recommendations were made for what makes a passphrase difficult enough.
Many users thought that just having any passphrase was difficult enough. Who could guess all the letters and spaces among multiple words? While this might have been true if attackers stuck to Cracker Jack, it ignores the fact that attackers can change strategies. Each word can be treated like a single character as before. As long as the words were in a dictionary, multi-word passphrases might have less entropy than a password constructed the old way. Newer tools like John the Ripper help target passphrases.
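Treating each word as one symbol makes the comparison easy to compute. The sizes below (94 printable characters, a 2,048-word dictionary) are assumptions for illustration, and both figures are upper bounds that only hold when every symbol is picked uniformly at random.

```python
import math

def uniform_bits(alphabet_size: int, length: int) -> float:
    """Entropy of `length` independent, uniform picks from `alphabet_size` symbols."""
    return length * math.log2(alphabet_size)

# Assumed sizes, for illustration only.
old_style = uniform_bits(94, 8)       # 8 characters from printable non-space ASCII
three_words = uniform_bits(2048, 3)   # 3 words from a 2,048-word dictionary

print(f"8 random characters: {old_style:.1f} bits")
print(f"3 dictionary words:  {three_words:.1f} bits")
```

Under these assumptions the three-word passphrase is far longer to type yet has fewer bits than the 8-character password.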
In choosing a password, consider the entropy for multiple attacker vantage points. How much advantage would a co-worker have over a random stranger? Do they realize you like good Scotch and might use those names in your passphrase? Know you like Will Ferrell movies and might use a quote from one? A good passphrase is one where even your spouse would not have an advantage over a stranger.
Additional entropy can be gained by varying the passphrase. Misspell or make up words, Dr. Seuss style (but don’t use words from his books!). Ever heard of an “omliyeti”? Me neither, but it might be memorable. Don’t capitalize the first word or put the punctuation (if any) at the end. Put spaces in the middle of words but run the beginning/ends together.
Admins can suggest schemes to help users pick good passwords, and they can attempt to crack their choices to establish password strength. But a user might still pick a low-entropy password that happens to pass this check. Fortunately, the second two factors above (guess rate and responses) are independent of entropy yet still have a big impact on actual password security.
The bcrypt and scrypt password hashing algorithms have greatly slowed the attacker’s guess rate. They use hash functions that are intentionally slow (and in the case of scrypt, memory intensive). More importantly, they have a tunable difficulty parameter that allows the admin to keep pace with Moore’s Law.
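As a minimal sketch of what a tunable difficulty parameter looks like in practice, here is scrypt via Python’s standard `hashlib.scrypt`. The helper names, the record layout, and the default cost values are assumptions chosen for illustration, not a recommendation for any particular system.

```python
import hashlib
import hmac
import os

def hash_password(password: str, n: int = 2 ** 14, r: int = 8, p: int = 1):
    """Return (salt, n, r, p, digest). Raising n makes every guess cost more."""
    salt = os.urandom(16)
    digest = hashlib.scrypt(password.encode(), salt=salt, n=n, r=r, p=p, dklen=32)
    return salt, n, r, p, digest

def verify_password(password: str, record) -> bool:
    """Recompute the hash with the stored salt and cost, then compare in constant time."""
    salt, n, r, p, digest = record
    candidate = hashlib.scrypt(password.encode(), salt=salt, n=n, r=r, p=p, dklen=32)
    return hmac.compare_digest(candidate, digest)
```

Storing the cost parameters next to each hash means old records still verify after the admin raises the defaults for new ones.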
Responses can be very important as well. PINs can be numeric and short because access is usually limited to online guessing with lockout after a few tries. One approach I’ve used before is to seed the password file with fake accounts that have easier passwords than the rest (but still hard enough to prevent online guessing). If anyone logs in to them, we know the password file has been retrieved and someone is cracking it.
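The detection side of that idea might look something like this minimal sketch. The account names, the function, and the `alert` callback are all hypothetical.

```python
# Hypothetical decoy usernames that no real person uses.
HONEY_ACCOUNTS = {"backup_op", "jsmith_old"}

def on_successful_login(username: str, alert) -> bool:
    """Call after a password check succeeds; returns True if login may proceed."""
    if username in HONEY_ACCOUNTS:
        # Nobody legitimately knows this password, so the hashes were
        # probably retrieved and cracked offline.
        alert(f"decoy account {username!r} used: assume the password file leaked")
        return False
    return True
```

For example, passing `print` (or a pager hook) as `alert` turns any decoy login into an immediate signal to rotate everyone’s passwords.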
Another response would be to require secondary authentication. Google does this with their text message authentication. Duo Security provides a phone app. This can be required all the time or activated when the user logs in from a new IP address or doesn’t have the prerequisite cookie.
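The “all the time, or only when something looks new” policy reduces to a small decision function. This is a sketch with hypothetical names; it is not how Google or Duo Security implement it.

```python
def needs_second_factor(ip: str, known_ips: set, has_device_cookie: bool,
                        always: bool = False) -> bool:
    """Decide whether to prompt for a second factor after the password checks out."""
    if always:
        return True
    # Step up when the login comes from an unseen IP or lacks the expected cookie.
    return ip not in known_ips or not has_device_cookie
```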
Password security is a difficult problem, especially with a varied user base. However, most admins focus too much on increasing entropy of user choices and not enough on decreasing the attacker’s guess rate and implementing responses to limit their access when they do get a hit.
A great comment on password strength:
http://xkcd.com/936/
There’s a lot wrong with that comic. The problem is that people take it as serious advice instead of humor.
The main issue is that he strictly rules out a local dictionary attack. However, as we’ve seen in so many site compromises, the hashed passwords are readily available. So 44 bits of security is not nearly enough. Assuming 11 bits per word, you need a minimum of 6 words, not 4.
Also, it is not at all clear in the comic that the words have to be truly randomly chosen (i.e., not by a human). Tell a human to think of 6 “random” words and you will get something with much less than 11 bits of entropy each. Note how he doesn’t have any 3-letter words or actually obscure words.
I took 2048 words from the dictionary and grabbed 6 pseudorandom ones from that set:
wrecker challenge inappreciatively landship bedabble unarrayed
Sure, you can memorize that but it doesn’t quite fit the comic’s smug tone of “see how simple this can be?”
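A minimal sketch of that kind of selection follows. It is not the script used for the example above: the function names and the stand-in word list are placeholders, and Python’s `secrets` module is just one reasonable source of randomness.

```python
import math
import secrets

def random_passphrase(words, count=6):
    """Pick `count` words uniformly at random, with replacement, using the OS CSPRNG."""
    return " ".join(secrets.choice(words) for _ in range(count))

def passphrase_bits(list_size, count=6):
    """Entropy in bits, valid only because each word is chosen uniformly."""
    return count * math.log2(list_size)

# Stand-in list so the sketch runs; swap in 2,048 real dictionary words.
words = [f"word{i}" for i in range(2048)]
print(random_passphrase(words))
print(passphrase_bits(len(words)), "bits")
```

With 2,048 words each pick contributes exactly 11 bits, so six words give 66 bits. The figure says nothing about words a human picked.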
Heh, well the main thing I took from it was that a fairly good-looking password, “Tr0ub4dor&3”, has dramatically less entropy than 4 easy-to-remember words, which are 65,536 times harder to guess.
Do you really think it’s a common attack to search 2^44 for a password? I suspect 2^44 is radically better than the average and would improve security quite a bit if it was common. Sure, someone with a botnet or a couple GPUs and some patience could crack it.
“The main issue is that he strictly rules out a local dictionary attack.” How so?
Searching 2^44 is far from trivial. He’s assuming 1000 guesses a second, which sounds like he’s expecting a local dictionary attack. How many sha512 checksummed guesses can a random hacker achieve per second?
Care to attack a password I set using 4 randomly chosen words from /usr/dicts/words of length 5 to length 7 (xkcd used one 5, one 6, and two 7) using whatever password hash is default on linux desktops these days?
I agree that your six words are more secure than his 4. But disagree that you can’t have humans pick them. As long as your coworker can’t socially engineer a word list by picking favorite countries, hobbies, wife/kids names, tv shows, sports stars and the like it’s going to be a computer guessing.
So as long as someone can avoid picking 4 words from one of their favorite sports/tv/friends/family the fact that they are not random, but follow a theme doesn’t make them significantly easier for a hacker (or computer) to guess them.
So my main argument is that showing a user that comic is likely to hugely improve the entropy (and thus the difficulty of guessing) of the resulting password.
Maybe add the warnings “Don’t pick 4 words from anything you love/hate.” “Pick them so they are easy to remember, but not personal to you”.
Like say correct horse battery staple ;-).
Seems like 2^44 salted sha512 checksummed passwords is plenty to hide in. Likely more than enough so that 99.9% of the threats are not going to be from a brute force attack of your checksums.
A local dictionary attack on SHA-512 would be much faster than 1000 passwords/second. More like some millions of passwords per second. http://www.cryptopp.com/benchmarks.html cites 99 MiB/s for SHA-512 … it is not really clear how this mass-hashing rate translates to password hashing, but we can guess that at least a million passwords (2^20) in a second should be reasonable … without any parallelization.
This means that we have 2^24 seconds (this is 194 days) on one processor … and a lot faster if we consider GPU password cracking, where lots of processors can work concurrently on the same hash.
Hi,
Am new to the world of security and as I was reading through the article about varying entropy i.e. “Don’t capitalize the first word or put the punctuation (if any) at the end. Put spaces in the middle of words but run the beginning/ends together.”, I didn’t understand why you suggested not to capitalize or use punctuation. Secondly what did you mean by “Put spaces in the middle of words but run the beginning/ends together”. Can you give examples?
The general idea is to make your password hard to guess. So from easy to guess to hard (in order):
* monday
* Monday (adds capital where expected)
* mondaY (adds capital where not expected)
* m0ndaY (substitute zero for “o”)
* m0n daY (add a space where not expected)
* m0n daY! (add punctuation where expected)
* m0n! day (add punctuation where not expected)
So capitalization, punctuation, or anything really doesn’t help much if it’s easy to guess. So MondayTuesday is much easier to guess than m0n! dayzeB&ra (added zebra and punctuation). Careful though: as people pick easy transformation rules, password crackers add them.
It varies case by case but I suggest regular password changing ISN’T WORTH IT unless it’s very easy. It interferes with your ability to remember them.
* Fred Cohen revisits his 1997 writing http://all.net/Analyst/2011-04.pdf
* RJA – http://www.cl.cam.ac.uk/~rja14/Papers/SEv2-c02.pdf
* Spaf – http://www.cerias.purdue.edu/site/blog/post/password-change-myths/
* Schneier – http://www.schneier.com/blog/archives/2010/11/changing_passwo.html
Anybody arguing that something tiresome is a “best practice” should be asked to prove it.
I agree, but I don’t think I said in this post to change passwords regularly.
Creating a password that is both memorable and strong can be difficult, but is crucial for online security. And along with strong passwords, I’d like to suggest users also consider what other forms of security are available when interacting online, specifically two-factor authentication. While it isn’t available everywhere, many sites, businesses and financial institutions are making two-factor authentication available to users as either a requirement or an additional security option. As a Symantec employee, I’ve seen the evolution of two-factor authentication, especially how users can now use their mobile device as an authentication token. Unlike traditional two-factor authentication token solutions, approaches that enable re-use of existing mobile devices are faster and easier to deploy, and more cost-effective to maintain. And, unlike traditional hardware tokens, users are far less likely to forget their mobile device at home.
To protect against online password guessing it is possible to limit the total number of guesses that an attacker can make against a password, rather than the guess rate. That can be done by using two counters: a counter of consecutive bad guesses, which is reset after a good guess; and a cumulative counter of bad guesses, which is NOT reset after a good guess. After the first counter reaches, say, 5 guesses, the password is disabled and must be reset. After the second counter reaches, say, 30 guesses, the user is asked to change the password. The username associated with the password can be changed to counter a denial-of-service attack in which the attacker submits bad guesses to lock the user out. For more details, see http://pomcor.com/whitepapers/protecting_against_password_guessing_attacks.pdf.
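A minimal sketch of the two counters, using the “say, 5” and “say, 30” thresholds from the description. The class and field names are hypothetical, and the whitepaper may handle details (such as what happens to the counters after a password change) differently.

```python
from dataclasses import dataclass

@dataclass
class GuessCounters:
    consecutive_limit: int = 5    # "say, 5" consecutive bad guesses
    cumulative_limit: int = 30    # "say, 30" bad guesses in total
    consecutive_bad: int = 0
    cumulative_bad: int = 0
    disabled: bool = False        # password must be reset before further logins
    must_change: bool = False     # user is asked to choose a new password

    def record(self, guess_ok: bool) -> None:
        """Update both counters after one login attempt."""
        if self.disabled:
            return
        if guess_ok:
            self.consecutive_bad = 0   # the cumulative counter is NOT reset
            return
        self.consecutive_bad += 1
        self.cumulative_bad += 1
        if self.consecutive_bad >= self.consecutive_limit:
            self.disabled = True
        if self.cumulative_bad >= self.cumulative_limit:
            self.must_change = True
```

An attacker who slips in a few guesses between the user’s own good logins never trips the first counter, but still runs into the cumulative limit.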
Yes, that falls under option 3 in the above list (responses).
However, your particular strategy requires users to keep changing their username if your site encounters a persistent DoS attack. Since usernames should not have to be secret, it seems poor.
In an operating system such as Unix usernames are not secret because they are used for purposes other than login. For example, a user’s username is also the name of the user’s home directory. But in a Web site or Web application a username can very well be secret; and a secret username is a good way of augmenting the password’s entropy.
In non-social sites or applications where users don’t interact with, or know of, each other, there is no reason to publish the username, so it should be considered secret. This includes high-security applications such as Web banking.
Also, there are important cases where it makes sense to change a username in order to thwart a denial-of-service attack even though the username is not secret.
For example, social sites often ask for an email address as a login username. The email address is not a secret, but for privacy reasons it is not shown to other social site users; so it can be changed without disrupting the user’s connections and activities at the site. Granted, some users may not have multiple email addresses and may not be able to create email address aliases. But it would be easy to let the user change the login email address to a username that is not an email address, if needed or desired.
For a second example, consider the case of a user accessing an intranet through a VPN. The user may log in using a username that may not be a secret because it is known to insiders; but changing it is an effective countermeasure against a DoS attack by an outsider who has been able to guess it.
I’d place a large bet on username choice having measurably less entropy than password choices, especially across all accounts on your server. If I can lock out 10+% of your web service, it doesn’t matter that one user picked ADFAJOIOWER as their username.
I’m not assuming that users will choose high entropy usernames. I’m saying that in the unusual case where a particular user gets repeatedly locked out because an attacker knows the username and is trying to guess the password the user can change the username to stop the attack. In the even more unusual case where the attacker is randomly trying to lock out users, those 10+% who do get locked out can change their usernames.