Learn from Faculty

Steven Myers, Associate Professor of Computer Science and Informatics at Indiana University Bloomington, explains the nuances and intricacies of password/passphrase authentication, including HOW and WHY passphrases are vastly superior to old, short passwords.


Passphrases in brief

What is a passphrase?

A passphrase is simply a different way of thinking about a much longer password. Dictionary words and names are no longer restricted. In fact, one of the very few restrictions is a minimum length of 15 characters. Your passphrase can be a favorite song lyric, a quote from a book, magazine, or movie, or something your kids said last week. It's really that easy.

Better security

IU account passwords protect all sorts of valuable data — personal finances, university finances, HR data, student data, etc. Would you want the people with access to all your information using old, insecure passwords? Better security is provided by increasing the length of a password to a minimum of fifteen characters. ALL IU users should consider the risks and switch today.

Almost anything goes

The restrictions of numbers and/or symbols in certain places in your password are gone. Passphrases can be simple short sentences of five or six words with spaces, using natural language. Since you type emails and such every day, typing in natural language shouldn't be anything new.

A happy medium

Passphrases bring into balance the trade-off between hard to remember but much more secure passwords, and easy to remember but much less secure passwords. By extending the length, IU is able to reduce the complexity requirement, and offer passwords that can allow virtually any character, word, or symbol.


More information

One of the weakest links in the security process is the use of passwords. However, people use passwords every day and rely on the security of those passwords to protect information and data that is extremely valuable, be it their financial portfolio, confidential health information, private emails, business correspondence or online banking data. The security of a user's data is only as strong as the weakest link, and more often than not, that weak link is the poor choice users make when choosing a password.

One option would be to assign users their passwords, rather than allowing them to pick their own. In practice this usually isn't feasible because users often have a hard time remembering their assigned password. This can lead them to write it down and store it insecurely, or it can lead to an increased support burden of having to perform multiple password resets. So ultimately, system administrators usually allow users to pick their own passwords.

However, allowing users to pick their own passwords comes with another set of problems. Humans are notoriously bad when it comes to picking things at random. In a truly random password, every character should be equally likely to appear at any position in the password. When we look at the passwords that users typically pick, we do not find that to be true. Some of the reasons for this are the very same reasons why users have a hard time remembering truly random passwords in the first place.

Passwords are hard to remember.

We know from memory studies that most people have trouble remembering more than seven or so unrelated items. So what do people do when they need to remember multiple items? They relate them in some way. In other words, they impose some kind of order or apply various rules that will help them remember. Users may or may not even be conscious of the fact that they're constraining their passwords in this way. At some level, it's just human nature to do so.

Remembering the password in your brain is not the only memory that comes into play. "Muscle memory" becomes a factor as well. In fact, users may only need to remember their passwords long enough until their fingers start to remember how to type them. I'm sure we've all had the experience where we couldn't "remember" a phone number and were unable to recite it to someone else. However, if we picked up the phone and started to dial it, we'd quickly be able to tell the other person the digits of the phone number.

Since people need to type their passwords, they prefer to pick characters that they can easily type on the keyboard. Using common keystrokes that are related to each other makes it that much easier to commit the password to muscle memory. The keystrokes people type most often are, naturally, the same keystrokes they use to type everyday words. When we start to examine the passwords that users pick, we find that people tend to prefer certain character combinations, as well as certain types of characters, over others.

People make for a poor /dev/random

In general, people show a preference for alpha characters over numeric characters and numeric characters over symbol characters. Among the alpha characters, people prefer lowercase over uppercase, and among the symbol characters, people prefer the "upper row" (above the keyboard numerals) over the other symbol characters on the keyboard. Furthermore, we're more likely to find certain characters in predictable places in the password. We find that numbers and symbols are placed at the very end or the very beginning of the password more often than at other positions in the password.

All of these tendencies and inherent rules reduce the randomness, or entropy, of the passwords we pick. The easier it is to predict aspects of the password, the easier it is for an attacker to guess the password. Common password cracking programs take advantage of all these statistical tendencies; rather than having to perform an exhaustive search of all the possible password combinations, the search can be limited to a smaller set of possible passwords with great success.

Unfortunately, all of the things we do to help us remember our passwords are the very things that make them predictable and easier to crack. Even common tricks, like substituting the number zero for the letter O or appending or prefixing a number or symbol to a common word, don't buy us much because they are the very same tricks the attackers use.
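To see why such tricks add so little, here is a minimal sketch of how a guessing program might undo them before consulting a word list. The substitution table, the tiny word list, and the function name looks_like_common_word are assumptions made for illustration; they are not the rules of any particular cracking tool.

```python
import string

# Hypothetical substitution table: look-alike characters mapped back to letters.
LEET_TO_LETTER = str.maketrans({
    "0": "o", "1": "l", "3": "e", "4": "a",
    "5": "s", "@": "a", "$": "s", "!": "i",
})

# Hypothetical stand-in for a real dictionary of common words.
COMMON_WORDS = {"password", "dragon", "monkey", "letmein"}

def looks_like_common_word(candidate: str) -> bool:
    """Return True if the candidate reduces to a common word once
    padding and look-alike substitutions are undone."""
    # Digits and a few symbols at either end are treated as padding.
    core = candidate.lower().strip(string.digits + "!#*?.")
    return core.translate(LEET_TO_LETTER) in COMMON_WORDS

print(looks_like_common_word("p4ssw0rd1"))
print(looks_like_common_word("Dr@gon7!"))
```

Both candidates reduce to a dictionary word after two cheap steps, which is why an attacker who tries these transformations loses very little time to them.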

Can we make people pick stronger passwords?

We can encourage stronger passwords by banning common words or requiring some mixture of uppercase, lowercase, numbers, and symbols. Strictly speaking, these password constraints also reduce the true entropy of the password, but in practice they often produce more "random" passwords than the passwords that would be picked in the absence of those constraints. However, even with these constraints in place, users still manage to pick weak passwords (e.g., P@s$w0rd1). We could deploy more complex constraints to try to get users to pack more randomness into an eight or nine character password, but we quickly approach the point where the password becomes harder to remember.
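A composition rule of this kind is easy to express in code, which also makes its blind spot easy to see. The sketch below assumes a hypothetical policy (at least eight characters, with at least one lowercase letter, one uppercase letter, one digit, and one symbol); it is not IU's actual rule set.

```python
import string

def meets_complexity_rules(password: str, min_length: int = 8) -> bool:
    """Check a hypothetical composition policy: minimum length plus
    at least one character from each of four classes."""
    return (
        len(password) >= min_length
        and any(c.islower() for c in password)
        and any(c.isupper() for c in password)
        and any(c.isdigit() for c in password)
        and any(c in string.punctuation for c in password)
    )

print(meets_complexity_rules("P@s$w0rd1"))
print(meets_complexity_rules("apples flute steve four at chair"))
```

The check accepts the weak example from the text while rejecting a long all-lowercase phrase, because it looks only at which character classes are present and not at how predictable the result is.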

Everything should be made as simple as possible, but no simpler.

So what do we do? One option would be to stop fighting human nature and allow users to pick more natural passwords that are easier for them to remember, yet still afford an adequate measure of strength (i.e., still be hard to crack). To do that requires us to start thinking about passwords in a different way. In fact, we need to stop thinking about them as passwords altogether and start to think about them as passphrases instead.

A passphrase uses multiple natural words or phrases to construct the "password." Because the characters that make up a word naturally have a relation to each other they can be thought of as atomic units. When you think of the word "chair" you don't think of the letters c-h-a-i-r, you think of the single item you know to be a chair. So instead of thinking about a password as being made up of eight or nine characters, start to think of a passphrase as being made up of five or six words. Consider the following nonsense passphrase:

apples flute steve four at chair

First, please do not use the above example as your actual choice of a passphrase. Now, what can we say about this passphrase compared to our traditional way of looking at passwords? You might be inclined to say it's weak because it's constructed of English dictionary words all in lowercase with no symbols or numbers. However, one thing stands out that is very different from a traditional password. It's long! In fact, it's a 32-character password in our old way of thinking; surely long enough to meet even the strictest of minimum length requirements that might be imposed. When we adopt a passphrase strategy for passwords, we're basically trading character complexity for length.

If the system requires users to use complex characters, then the passphrase could be modified to comply.

apples flute Steve four @ chair
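A nonsense passphrase of this kind could also be drawn by a program rather than by a person. The following is a minimal sketch, assuming a word list is available; the twelve-word list and the function name random_passphrase are hypothetical, and a real list would need thousands of words to be useful.

```python
import secrets

# Hypothetical, far-too-small word list used only for illustration.
WORDS = (
    "apples", "flute", "chair", "river", "lantern", "window",
    "marble", "garden", "pencil", "rocket", "violin", "candle",
)

def random_passphrase(word_count=6, words=WORDS):
    """Join `word_count` independently chosen words with spaces."""
    # secrets.choice draws from the operating system's secure random source,
    # so no word is favored the way a human's choice would be.
    return " ".join(secrets.choice(words) for _ in range(word_count))

print(random_passphrase())
```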

What do we do if the bad guys find out?

Even if attackers had a priori knowledge that a user's password was constructed entirely of lowercase alpha characters (and the space key), they would still need to try an average of 3.18 x 10^45 guesses to brute force a 32-character passphrase (27 possible characters in each of 32 positions, halved for the average case). Assuming our attacker gained access to the hash of the passphrase, it would take a 3GHz Pentium XP machine generating 5,000,000 guesses per second over 2.0 x 10^28 millennia to crack the passphrase. Considering our sun will burn out in about 5.0 x 10^6 millennia, I don't think we have anything to worry about.
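The arithmetic behind this estimate can be checked with a few lines of code. This is a sketch under the paragraph's own assumptions: an alphabet of 27 characters (26 lowercase letters plus the space), 32 positions, an attacker who on average searches half of the space, and 5,000,000 guesses per second. The function names and the 365.25-day year are choices made here for illustration.

```python
SECONDS_PER_MILLENNIUM = 1000 * 365.25 * 24 * 60 * 60

def average_guesses(alphabet_size: int, length: int) -> float:
    """On average, a brute-force search covers half of the search space."""
    return alphabet_size ** length / 2

def millennia_to_search(guesses: float, guesses_per_second: float = 5_000_000) -> float:
    """Convert a number of guesses into millennia at a fixed guessing rate."""
    return guesses / guesses_per_second / SECONDS_PER_MILLENNIUM

guesses = average_guesses(alphabet_size=27, length=32)  # 26 letters plus space
print(f"average guesses: {guesses:.2e}")
print(f"millennia at 5,000,000 guesses per second: {millennia_to_search(guesses):.1e}")
```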

But wait; let's assume our adversaries learn our users are using six-word passphrases, and let's further assume that they know the users are only picking words from a 5,000-word vocabulary (the approximate active vocabulary of an average English-speaking five-year-old). With that knowledge they could construct a more sophisticated passphrase cracker rather than the traditional character-by-character password crackers of today. Armed with such a passphrase cracker, they would still need to try 7.8 x 10^21 guesses on average to brute force the passphrase. Using our trusty XP system, that would only take 5.0 x 10^4 millennia; still plenty of time to develop a rich, deep tan before the sun burns out.

Even though a much reduced character set is being used, the added strength comes from the length of the passphrase. In fact, the 6-word passphrase in the example is roughly equivalent to an 11-character traditional password. The passphrase is superior, however, because we already know that people aren't likely to remember an 11-character password. So in effect, we're getting the "strength" of a complex short password in a form that is easier for users to remember and easier for them to type.
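The word-based estimate and the 11-character comparison can be reproduced in the same way. The sketch below treats each word as one symbol drawn from a 5,000-word vocabulary and then asks how long a random character password would need to be to offer the same search space. The 95-character alphabet (the printable ASCII characters) is an assumption made here; the text does not say which alphabet the comparison uses.

```python
import math

def word_search_space(vocabulary_size: int, word_count: int) -> int:
    """Number of distinct passphrases when every word is chosen independently."""
    return vocabulary_size ** word_count

def equivalent_password_length(search_space: int, alphabet_size: int = 95) -> float:
    """Length of a random password over `alphabet_size` characters
    whose search space matches `search_space`."""
    return math.log(search_space) / math.log(alphabet_size)

space = word_search_space(vocabulary_size=5000, word_count=6)
print(f"average guesses: {space / 2:.1e}")
print(f"equivalent password length: {equivalent_password_length(space):.1f} characters")
```

Under these assumptions the equivalent length comes out slightly above 11 characters, in line with the comparison in the text.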

Passphrases can make sense too!

Just as a traditional password is stronger when constructed of truly random characters, a passphrase is going to be stronger if it is made up of truly random words that have no relation to each other and are nonsensical when taken as a phrase. However, if we're willing to trade some randomness, we could encourage users to make their passphrases into meaningful sentences or phrases. In fact, by using true sentences, users are more likely to add entropy by using punctuation and capitalization naturally. For example, consider the following passphrase:

Steve found four apples and a flute at his chair.

It's important to note that the randomness drops significantly when natural English is used as a user's passphrase. Even though the above passphrase consists of 10 words, it's only roughly equivalent to a 9-character password.

Don't be trite!

Just as you wouldn't want a single common word to be used for a tradit