Sunday 15 March 2009

Passwords

The issue of 'having a good password' is something that comes up in conversation from time to time. I generally suggest that, even if you think you know what that means, you should do some research (because things change). However, I realise that most of my friends won't bother, and that it might help a few people (those who can be bothered to read it) if I were to write up my current understanding of what makes a good password.

The reason I say 'current understanding', and suggest that things change, is because that is my experience; and I don't just mean my own understanding, but that of the 'industry' as a whole. While there are things that have always been regarded as insecure, there are also lots of things that were regarded as secure but are now regarded as a risk. This is largely because, as the speed of computers has increased, the crackers (people who try to break into systems) have more powerful tools at their disposal. For example, I have an old programming book containing code for a 'password generator' that works by randomly choosing a couple of three or four letter words and sticking them together to create a password. These days the crackers use programs that attempt to get in by trying common dictionary words forwards, backwards, and stuck together in combinations, so a password that uses actual dictionary words is regarded as relatively insecure.

A Perfect Password



A perfect password would be a completely random collection of letters, numbers, and punctuation marks. Why? Because if it is truly random then the only way for somebody to crack it is to try every possible combination. This does not mean that such a password is uncrackable; but it means that of all the passwords that we could use, this is the hardest to crack.
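
As a rough illustration of what 'completely random' means in practice, here is a minimal Python sketch. It uses the standard library's secrets module; the function name random_password and the default length of 12 are just assumptions made for the example, not a recommendation from this post.

```python
import secrets
import string

# Letters (both cases), digits and punctuation marks: the full character set
# described above.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def random_password(length: int = 12) -> str:
    """Return a password of `length` characters chosen independently at random.

    The default length of 12 is an arbitrary choice for this example.
    """
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


print(random_password())  # different every time it is run
```

The result is exactly the kind of string that is hard to crack and equally hard to remember.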

So, if we know what a perfect password is, why don't we all use them? The answer is that we are human and we find them difficult to remember. There's not much point having a great password for your desktop computer if it's written on a post-it note stuck to the monitor; partly because it could be a security risk and partly because if you lose the bit of paper you are locked out.

The security risk aspect of having a post-it note on your screen depends of course on what the password is for. For example, if I had the root password for my server (at a server farm) written on the side of my monitor at home then the only person who sees it besides me is my wife and I'm pretty confident that she's not interested in hacking my server. The post-it note is inaccessible to a cracker unless they come around to my house and although I am sure there are a whole bunch of folks on the Internet who might like to gain access to my server, I doubt very much that they'll go that far.

Doing The Math



Okay, so I've said that a random collection of characters is best and while that might seem obvious it's worth taking a few moments to consider exactly why that is the case:

If I ask you to pick a letter of the alphabet you have 26 choices. If I ask you to differentiate between upper and lower case that doubles to 52. If I say to include digits then you gain another 10 possibilities giving 62 options. If we add in punctuation marks then we increase it further but I am going to leave them out (I'll explain why later).

So you have 62 options for a single character. If I now ask you for a second character, you have 62 choices for that one too. That's 62 options for the first one and 62 options for the second. Thus for our two character 'password' there are 62x62 options (which is 3,844 possibilities). Each additional character multiplies the number of possibilities by 62, so by the time we have a 6 character password we have 62x62x62x62x62x62, which is 56,800,235,584 possibilities. Now unless our password happens to be the very last password that a cracker tries, they aren't going to have to try all of them, but with that many possibilities to go at the cracker is going to have to try a hell of a lot. Or are they?
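
The arithmetic above is easy to check for yourself. This is a small Python sketch; the function name keyspace is simply a label chosen for the example.

```python
# 26 lower case letters + 26 upper case letters + 10 digits
ALPHABET_SIZE = 26 + 26 + 10


def keyspace(length: int, alphabet_size: int = ALPHABET_SIZE) -> int:
    """Number of possible passwords of the given length."""
    return alphabet_size ** length


for n in (1, 2, 6):
    print(f"{n} characters: {keyspace(n):,} possibilities")
# 1 characters: 62 possibilities
# 2 characters: 3,844 possibilities
# 6 characters: 56,800,235,584 possibilities
```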

Bear in mind that our 56.8 billion possibilities include all possible 6 letter dictionary words, people's names, and dates of birth. However these will be a relatively small percentage of the total number of options. Just for the fun of it, let's look at dates and let's say that I'm trying to guess your date of birth, wedding anniversary, child's birthday or other 'secret' date that you may have used as a password:

There are 12 months in a year, each of which has a maximum of 31 days. Chances are that your date is within the last 50 years so for a DD-MM-YY date (6 digits) I have 31x12x50 options which is 18600 possibilities. Now that's a lot for a human but a password cracking program could work through them in a matter of minutes.
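
To give a feel for how small that list is, here is a Python sketch that generates every candidate. The six digit DDMMYY format without separators and the particular 50 year window (1959 to 2008) are assumptions made for the example.

```python
from itertools import product

# Two digit years for an assumed 50 year window, 1959 to 2008 inclusive.
YEARS = [year % 100 for year in range(1959, 2009)]


def date_candidates(years=YEARS):
    """Yield every DDMMYY string for 31 days x 12 months x the given years.

    Like the estimate in the text, this does not bother to skip impossible
    dates such as the 31st of February.
    """
    for day, month, year in product(range(1, 32), range(1, 13), years):
        yield f"{day:02d}{month:02d}{year:02d}"


candidates = list(date_candidates())
print(len(candidates))  # 18600

# Fraction of the 62**6 possible six character passwords that are dates.
print(len(candidates) / 62 ** 6)
```

The fraction printed at the end is tiny, which is the whole point: a cracker who tries dates first has almost nothing to search.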

Add every name and dictionary word, forwards and backwards, into the equation and our cracker program still only has to deal with a tiny fraction of the number of possibilities that it would have to deal with if we use a truly random password. Of course a cracker could write a program to try totally random passwords but they don't need to because many, many people out there will use a name, date, dictionary word or something else and a program that tries these first will give them access to a lot of systems.

How To Remember It



Okay, so now that I've (hopefully) sold you the idea that a password should be random, how do we go about remembering such a password?

Well, one option that we already mentioned is to write it down, and while there are a lot of cases where this would be a security risk, there are a lot where it is not a problem. I don't have my server password written on a post-it note but I do have a document on my computer that lists a whole bunch of passwords and PINs that I didn't have any choice about and that I don't trust myself to always remember. The file is backed up so there's no chance of me losing it, and it's password protected (with a good password of my own devising) so to get in there a cracker would need to access my computer and crack that password. In other words, it is HIGHLY unlikely, so although the passwords are 'written' down, they are safe. Of course this still leaves me with the problem of how to remember the password that protects the file, and this is where my technique for remembering a random password comes into play. However, I now have a confession to make: it isn't random!!!!!

What? Not random? Well, no, but almost. Let me explain:

My wife's name, let's call her Anne, would make a really poor password. First of all because it's only 4 letters and secondly because it's a name. However let's solve the length problem by adding letters from her surname until we have 6. Hell, let's be different and create a 7 letter password. Assuming that our surname is Jones we now have 'AnneJon'.

Now that's still pants as a password and of course if she'd been called Deborah then we wouldn't have needed the extra letters from the surname and it would be even worse. So, to improve things, let's change some of the letters to numbers that look a bit like them such that we get '4nn3J0n'. Now let's reverse it: 'n0J3nn4'.
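
For anyone who wants to see those two steps spelled out, here is a minimal Python sketch. The lookalike table only covers the swaps used in the example (A to 4, e to 3, o to 0); the function names are made up for illustration.

```python
# Letters swapped for digits that look a bit like them. Only the swaps used
# in the 'AnneJon' example are included; a fuller table is left as an exercise.
LOOKALIKES = str.maketrans({"A": "4", "a": "4", "E": "3", "e": "3", "O": "0", "o": "0"})


def swap_lookalikes(word: str) -> str:
    """Replace letters with similar looking digits."""
    return word.translate(LOOKALIKES)


def disguise(word: str) -> str:
    """Swap the lookalike letters, then reverse the result."""
    return swap_lookalikes(word)[::-1]


print(swap_lookalikes("AnneJon"))  # 4nn3J0n
print(disguise("AnneJon"))         # n0J3nn4
```

Being this easy to write is exactly why it is easy for a cracking program to undo.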

Immediately it is looking a lot more random. However, that number swap is a common technique, and so is reversing a word. Plus, a name is a bad starting point in the first place. Clearly this gives us two areas that we can improve on: our seed word and our 'encryption' technique.

Now we still want to be able to remember this thing so rather than using one seed word, let's use three. Let's say I want a password for my account at Amazon, that my wife's name is Anne Jones and that I was born on 04/03/64. By converting the site name to upper case and taking a character from each seed in turn (skipping a seed once it runs out of characters) we get: AA0Mn4An0Ze3OJ6No4nes. However, I only want 7 characters so let's take the first 7 (because I can do that in my head without having to write anything down) and I get: AA0Mn4A.
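
The whole point of the scheme is that it can be done in your head, but a short Python sketch makes the steps explicit. The function names and the way the seeds are tidied up (spaces removed from the name, only the digits kept from the date) are assumptions made for the example.

```python
from itertools import zip_longest


def interleave(*seeds: str) -> str:
    """Take one character from each seed in turn until all are used up."""
    rounds = zip_longest(*seeds, fillvalue="")
    return "".join("".join(chars) for chars in rounds)


def make_password(site: str, name: str, date: str, length: int = 7) -> str:
    """Build a password from a site name, a person's name and a date."""
    seeds = (
        site.upper(),                               # site name in upper case
        name.replace(" ", ""),                      # name with spaces removed
        "".join(ch for ch in date if ch.isdigit()), # digits of the date only
    )
    return interleave(*seeds)[:length]


print(make_password("Amazon", "Anne Jones", "04/03/64"))  # AA0Mn4A
```

Changing only the first argument gives a different password for each site while the other two seeds stay the same.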

Now that isn't truly random in the sense that it's been generated using two pieces of personal information and a modified (converted to upper case) site name, but given the huge number of pieces of information that I could have used as 'seeds' and the huge number of ways in which I could manipulate them, it will be close to impossible for somebody else to guess. In the example I used the site name, my wife's name, and my date of birth, in that order, and converted the site name to upper case. I could have used my father's middle name, the registration number of the first car I ever bought, my sister's birthday, the last letter of each word in the first line of the song that was playing when I met my wife, the last 6 digits of my phone number or any number of other pieces of information as seeds. Furthermore I could have 'encrypted' them by reversing them, taking only the first and last letters of the names, using just the odd numbered letters, omitting vowels, or dozens of other techniques.

The important thing is that rather than using a memorable password I'm using memorable techniques to generate what is to all intents and purposes a random password. If I decide upon a set of seeds and a set of techniques and always use the same ones then I will always know how to generate my passwords. Note that by using the site name (or system name or something that I strongly associate with it) as one of the seeds, I can use the same seeds and the same encryption techniques for the others and have different passwords for different sites/systems. Using the scheme above as an example, my password for eBay would be 'EA0Bn4A' and my password for GoogleMail would be 'GA0On4O'. What's more it means that if I register on any forums or shop checkout systems that store member/customer passwords in an unencrypted form, then a malicious person at the company can only see my password for their system; a password that gives them no clue about what I might be using on other systems.

Before we finish, I would just like to step back to my earlier decision to exclude punctuation characters when we were calculating how many possibilities there are for a six character password. Now that I have explained my technique for generating passwords it should be obvious why I did this: because the kind of seeds I suggest don't include punctuation symbols. You could however build in punctuation symbols at the encryption stage, just in case you are thinking that the 56.8 billion possibilities that we calculated before aren't enough.

2 comments:

3ls said...

I can see that you have put a lot of thought into making a secure password, with an easy way to try and remember it. I understand that this blog was published in March 2009 and lots of things have changed since then. The fact that there are 56 billion possibilities for a 6 character password makes it sound very safe. If you take a basic website such as http://howsecureismypassword.net/ and type in a completely random 6 character password consisting of lower and upper case letters as well as numbers, the website claims it could be hacked in 3 minutes. There is an incredibly small chance that the password they try last is going to be yours, a probability of 1 in 56 billion in fact; therefore the actual time is likely to be a lot less than that. I understand that this website might not be very accurate, but for it to come up with a time period so low is quite worrying.
The main thing that I believe keeps a password secure today is length: the longer the password, the more possibilities there are, and so the longer it will take a computer to crack. You gave quite a strong view that using dictionary words is unsafe; I don't think this is the case as long as you use the right words. Ideally, using words that aren't in the dictionary would make a much stronger password, but these are a lot harder to remember, and not many people have gone to the effort of creating new words. There are over 171 thousand words in the dictionary; this is a very small number in comparison to your 56 billion combinations. Using your idea of random letters and numbers for an 18 character password would be extremely secure, but remembering this is very difficult and writing it down decreases the safety of the password. If I create a password from words that are completely unrelated to each other, such as hopscotch0tea0bush (an 18 character password), and enter it on the same website, it says that it would take 1 trillion years to crack. That is a very secure password. I would recommend a length of over 16 characters, and splitting up your words with numbers or punctuation.
If you compare this with 18 random letters, digits and punctuation marks, such as %kT8.GeYp26#Wyd0J}, it comes up with 60 quintillion years. That is an incredibly strong password, but very difficult to remember, and so impractical for the average person. This shows that a random password is by far the most secure, but remembering it is very difficult; it is a lot better to have a set of words split by numbers or punctuation, as that is a lot easier to remember than a short password consisting of random letters. If you can go to the effort of remembering something so random, it is worthwhile, as you can create a password which is very difficult to crack.
If you want to create a password consisting of dictionary words that is easy to remember, make sure that none of the words are associated with you or with each other. For added security, misspell them or leave out letters, in essence creating your own new words. All I do is think of three subjects that come into my head, such as shape, furniture and food; then I come up with a password involving these, and to remember it all I need to think of is shape, furniture and food. By using this idea I could create a password such as:
Oblong1chair2beetroot, and this would take 5 sextillion years. By remembering shape, furniture and food, the password comes into my head, especially after using it a couple of times. As you can see, I started my password with a capital for added security, and just put two numbers in to split up my words. The words I have chosen are obscure and have no association with me. What do you think?

LuAn said...

Your proposed method for generating passwords sounds reasonable to me; however, it's worth noting that you don't always have the option to use a long password. Passwords have to be stored (usually in an encrypted form) within the protected system, and if the system programmer has set it to store passwords of up to say 6 characters, then that's all you can use. In that instance you can only use the "Oblong" part of your "Oblong1chair2beetroot" password, which is pretty weak. If instead you take the letters from each of your "seed words" in turn, i.e. the "O" from "Oblong" followed by the "c" from "chair", the "b" from "beetroot", then the "b" from "Oblong", and so on, then you get "Ocbbhe"... which is much better.

Something else that's well worth considering, although it's in the hands of system programmers rather than users (programmers take note), is to employ a lockout mechanism after a set number of attempts (like ATMs do with PINs). As you point out, it would be unlikely that the correct password would be the last one that a cracker tries. In fact according to the laws of probability they would on average need only try half of them (50% of the time it would be less than half, and 50% it would be more). Now if the system were to lock them out after say 3 attempts, and not let them try again for say half an hour, this would GREATLY increase the time it would take to crack a system. Furthermore, if triggering the lockout were to send a notification to a system administrator who made it their business to ascertain whether this was a legitimate user having a bad day or something more sinister...
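
To put a rough number on that, here is a small Python sketch of the arithmetic. It assumes a cracker guessing at random against a single account, 3 attempts per lockout and a 30 minute wait each time; the function name is made up for the example.

```python
def average_crack_time_years(keyspace: int,
                             attempts_per_lockout: int = 3,
                             lockout_minutes: float = 30) -> float:
    """Average time to guess a password when every few attempts cost a lockout.

    Assumes that on average half of the possible passwords must be tried, and
    that the time taken by the guesses themselves is negligible.
    """
    average_attempts = keyspace / 2
    lockouts = average_attempts / attempts_per_lockout
    minutes = lockouts * lockout_minutes
    return minutes / (60 * 24 * 365.25)


# A 6 character password drawn from 62 letters and digits.
print(f"{average_crack_time_years(62 ** 6):,.0f} years")
```

For the 6 character, 62 symbol case from the original post this works out at roughly half a million years, compared with the minutes quoted for an unrestricted attack.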

Something else I'd like to mention here, given that the original post is a few years old now, is another method that I currently use to generate passwords, using songs and poems. It's pretty simple, easy to remember, and generates "random" strings of characters.

In a nutshell, all you do is pick a song or poem that you know well, then take the first letter from each word. So, for example, the poem:

Mary had a little lamb
Its fleece was white as snow

Generates the password:

MhallIfwwas

You can of course truncate it to fewer characters if that's all you're allowed, or expand it if you want and are allowed more.
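
For the curious, the first letter method is about one line of Python. This is only a sketch; the function name first_letters is made up for the example.

```python
def first_letters(text: str) -> str:
    """Join the first character of every word in the text."""
    return "".join(word[0] for word in text.split())


poem = "Mary had a little lamb\nIts fleece was white as snow"

print(first_letters(poem))      # MhallIfwwas
print(first_letters(poem)[:6])  # MhallI (truncated for a 6 character limit)
```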