In previous posts relating to my LAMP server (see tags) I've described how I use PHP to write iptables rules so I can keep various ports closed when I'm not actually using them. Something else I use it for is to block IP addresses from which bad bots are operating. This technique is discussed and documented on various other sites so I'm not going to give full details here. However, in a nutshell, my scheme involves:
1. Using robots.txt to designate a directory into which bots should not go.
2. Inserting hidden links on various documents that link to a file in the above directory. Humans will not see these links (because they are invisible) and well behaved robots (that pay attention to robots.txt) will not follow them either. Thus the only things that will follow those links are bad bots.
3. My hidden links point to a PHP script which writes a new iptables rule, thereby blocking the bot. An entry is also made in a database recording the time and the request that got the bot blocked.
4. Bots will often operate using addresses from pools that are used legitimately at other times, so it's important to release the blocked addresses after a suitable period. I have a cron job to run a script that checks the database and releases anything that has been blocked for longer than a specified period.
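Steps 3 and 4 boil down to "add a rule, remember when, remove it later". The real thing here is a PHP script plus a cron job, neither of which is reproduced in this post, so what follows is only a rough Python sketch of the idea. The function names, the 24 hour release period, and the dictionary standing in for the database are all assumptions made for the example, and the commands are built but never run.

```python
import ipaddress
from datetime import datetime, timedelta

BLOCK_PERIOD = timedelta(hours=24)  # assumed release period, pick your own


def _checked(ip):
    """Refuse anything that is not a plain IPv4 address (raises ValueError)."""
    return str(ipaddress.IPv4Address(ip))


def block_command(ip):
    """Build (but don't run) an iptables command that drops traffic from ip."""
    return ["iptables", "-I", "INPUT", "-s", _checked(ip), "-j", "DROP"]


def release_command(ip):
    """Build the matching command that deletes that rule again."""
    return ["iptables", "-D", "INPUT", "-s", _checked(ip), "-j", "DROP"]


def expired(blocked, now, period=BLOCK_PERIOD):
    """Return the addresses that have been blocked for longer than period.

    blocked maps ip -> time it was blocked (a stand-in for the database).
    """
    return sorted(ip for ip, when in blocked.items() if now - when > period)


if __name__ == "__main__":
    now = datetime(2009, 6, 21, 12, 0)
    blocked = {
        "192.0.2.10": now - timedelta(hours=30),  # due for release
        "192.0.2.20": now - timedelta(hours=2),   # stays blocked for now
    }
    for ip in expired(blocked, now):
        print(" ".join(release_command(ip)))
```

Checking the address before it goes anywhere near a command line matters, because the value ultimately comes from whoever made the request.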
This has been working well for some time. More recently, however, I've been seeing something else in my server reports that I wanted to do something about: attempts to access URLs such as these:
//Admin//scripts/setup.php
//MyAdmin//scripts/setup.php
//admin//scripts/setup.php
//phpMyAdmin//scripts/setup.php
There's usually a great long list of them trying lots of variations. Something else I've seen quite a bit of is this kind of thing:
/index.php?gzip=0&file=/etc/passwd
Again, they usually attempt the same, or similar, things with any .php file they can find.
Now, provided that all of your other security is in place, attempts such as these shouldn't be a problem. However, they are clearly attempts to break into the server. As such, whatever is generating them is making a nuisance of itself and should, in my humble opinion, be told to **** off at the first available opportunity. So I've added a few more lines to httpd.conf:
RewriteRule scripts/setup.php /var/www/cgi-bin/ip_blocker.php
RewriteCond %{QUERY_STRING} /etc/passwd
RewriteRule ^(.+) /var/www/cgi-bin/ip_blocker.php
The first line looks for URLs that contain "scripts/setup.php" and redirects them to my blocking script. Obviously if you use anything on your server where that would be part of a legitimate request you need to modify that, but I don't, so I can use it. The next two lines do a similar redirect on any request where the text "/etc/passwd" appears in the query string.
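If you want to check what those patterns will and won't catch before letting them loose on a live server, the matching can be imitated in a few lines of Python. This is only a sketch of the pattern matching, not of how Apache actually processes a request, and `would_block` is a name I've made up for the example.

```python
import re

# Unanchored, like the RewriteRule pattern: a match anywhere in the path counts.
# The dot is left unescaped, as in the rule, so it matches any character.
SETUP_PATTERN = re.compile(r"scripts/setup.php")

# Same idea for the RewriteCond on the query string.
PASSWD_PATTERN = re.compile(r"/etc/passwd")


def would_block(path, query_string=""):
    """Return True if either pattern would send this request to the blocker."""
    if SETUP_PATTERN.search(path):
        return True
    return bool(PASSWD_PATTERN.search(query_string))


if __name__ == "__main__":
    samples = [
        ("//phpMyAdmin//scripts/setup.php", ""),
        ("/index.php", "gzip=0&file=/etc/passwd"),
        ("/index.php", "page=2"),
    ]
    for path, query in samples:
        print(path, query, would_block(path, query))
```

Feeding it a handful of legitimate URLs from your own logs is a quick way to spot a pattern that is too greedy.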
Note that because I wanted these rules to apply to all of the virtual sites on my server, I've put these rules such that they apply to the main server and told all of the virtual sites to inherit them using:
RewriteOptions inherit
Note however that despite them appearing before the virtual server directives in the httpd.conf file, the virtual server directives are processed first. Thus it is important that none of the virtual server rules end with [F] as this would result in a match there halting the processing of rewrite rules before these are run.
Enjoy, unless you're a bot. ;-)
Sunday, 21 June 2009
Sunday, 15 March 2009
Passwords
The issue of 'having a good password' is something that comes up in conversation from time to time. I generally suggest that, even if you think you know what that means, you should do some research (because things change). However, I realise that most of my friends won't bother, and that it might help a few people, those that can be bothered to read it, if I were to do a write-up of my current understanding of what makes a good password.
The reason I say 'current understanding', and suggest that things change, is because that is my experience; and I don't just mean my own understanding, but that of the 'industry' as a whole. While there are things that have always been regarded as insecure, there are also lots of things that were regarded as secure but are now regarded as a risk. This is largely because, as the speed of computers has increased, the crackers (people who try to break into systems) have more powerful tools at their disposal. For example, I have an old programming book containing code for a 'password generator' that works by randomly choosing a couple of three or four letter words and sticking them together to create a password. These days the crackers use programs that attempt to get in by trying common dictionary words forwards, backwards, and stuck together in combinations, so a password that uses actual dictionary words is regarded as relatively insecure.
A Perfect Password
A perfect password would be a completely random collection of letters, numbers, and punctuation marks. Why? Because if it is truly random then the only way for somebody to crack it is to try every possible combination. This does not mean that such a password is uncrackable; but it means that of all the passwords that we could use, this is the hardest to crack.
So, if we know what a perfect password is, why don't we all use them? The answer is that we are human and we find them difficult to remember. There's not much point having a great password for your desktop computer if it's written on a post-it note stuck to the monitor; partly because it could be a security risk and partly because if you lose the bit of paper you are locked out.
The security risk aspect of having a post-it note on your screen depends of course on what the password is for. For example, if I had the root password for my server (at a server farm) written on the side of my monitor at home then the only person who sees it besides me is my wife and I'm pretty confident that she's not interested in hacking my server. The post-it note is inaccessible to a cracker unless they come around to my house and although I am sure there are a whole bunch of folks on the Internet who might like to gain access to my server, I doubt very much that they'll go that far.
Doing The Math
Okay, so I've said that a random collection of characters is best and while that might seem obvious it's worth taking a few moments to consider exactly why that is the case:
If I ask you to pick a letter of the alphabet you have 26 choices. If I ask you to differentiate between upper and lower case that doubles to 52. If I say to include digits then you gain another 10 possibilities giving 62 options. If we add in punctuation marks then we increase it further but I am going to leave them out (I'll explain why later).
So you have 62 options for a single character. If I now ask you for a second character, you have 62 choices for that one too. That's 62 options for the first one and 62 options for the second. Thus for our two character 'password' there are 62x62 options (which is 3,844 possibilities). Each additional character multiplies the number of possibilities by 62, so by the time we have a 6 character password we have 62x62x62x62x62x62, which is 56,800,235,584 possibilities. Now unless our password happens to be the very last password that a cracker tries, they aren't going to have to try all of them, but with that many possibilities to go at the cracker is going to have to try a hell of a lot. Or are they?
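If you'd rather not take my multiplication on trust, the sums are easy to reproduce. A quick Python sketch (the function name `keyspace` is just something I've picked for the example):

```python
import string

# 26 lower case + 26 upper case + 10 digits = 62 symbols
ALPHABET = string.ascii_letters + string.digits


def keyspace(length, symbols=len(ALPHABET)):
    """Number of different passwords of the given length."""
    return symbols ** length


if __name__ == "__main__":
    for length in (1, 2, 6):
        total = keyspace(length)
        print(f"{length} characters: {total:,} possibilities")
    # On average a cracker trying them in some fixed order finds a random
    # password about half way through the list.
    print(f"average guesses at 6 characters: {keyspace(6) // 2:,}")
```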
Bear in mind that our 56.8 billion possibilities include all possible 6 letter dictionary words, people's names, and dates of birth. However these will be a relatively small percentage of the total number of options. Just for the fun of it, let's look at dates and let's say that I'm trying to guess your date of birth, wedding anniversary, child's birthday or other 'secret' date that you may have used as a password:
There are 12 months in a year, each of which has a maximum of 31 days. Chances are that your date is within the last 50 years, so for a DD-MM-YY date (6 digits) I have 31x12x50 options, which is 18,600 possibilities. Now that's a lot for a human but a password cracking program could work through them in a matter of minutes.
Add every name and dictionary word, forwards and backwards, into the equation and our cracker program still only has to deal with a tiny fraction of the number of possibilities that it would have to deal with if we use a truly random password. Of course a cracker could write a program to try totally random passwords but they don't need to because many, many people out there will use a name, date, dictionary word or something else and a program that tries these first will give them access to a lot of systems.
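To get a feel for how small the 'dates' slice is, here is a rough Python sketch that generates every DD-MM-YY candidate the way I counted them above (31 days for every month, impossible dates and all) and compares the total with the full 62-symbol keyspace. The 1960 to 2009 range is my assumption for "the last 50 years".

```python
def date_candidates(first_year=1960, years=50):
    """Yield every DDMMYY string for 31 days x 12 months x the given years."""
    for year in range(first_year, first_year + years):
        for month in range(1, 13):
            for day in range(1, 32):
                yield f"{day:02d}{month:02d}{year % 100:02d}"


if __name__ == "__main__":
    guesses = list(date_candidates())
    keyspace = 62 ** 6
    print(f"date guesses:  {len(guesses):,}")
    print(f"full keyspace: {keyspace:,}")
    print(f"dates are {len(guesses) / keyspace:.7%} of the keyspace")
```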
How To Remember It
Okay, so now that I've (hopefully) sold you the idea that a password should be random, how do we go about remembering such a password?
Well, one option that we already mentioned is to write it down. While there are a lot of cases where this would be a security risk, there are a lot where it is not a problem. I don't have my server password written on a post-it note, but I do have a document on my computer that lists a whole bunch of passwords and PINs that I didn't have any choice about and that I don't trust myself to always remember. The file is backed up, so there's no chance of me losing it, and it's password protected (with a good password of my own devising), so to get in there a cracker would need to access my computer and crack that password. In other words, it is HIGHLY unlikely, so although the passwords are 'written' down, they are safe. Of course this still leaves me with the problem of how to remember the password that protects the file, and this is where my technique for remembering a random password comes into play. However, I now have a confession to make: it isn't random!!!!!
What? Not random? Well, no, but almost. Let me explain:
My wife's name, let's call her Anne, would make a really poor password. First of all because it's only 4 letters and secondly because it's a name. However let's solve the length problem by adding letters from her surname until we have 6. Hell, let's be different and create a 7 letter password. Assuming that our surname is Jones we now have 'AnneJon'.
Now that's still pants as a password and of course if she'd been called Deborah then we wouldn't have needed the extra letters from the surname and it would be even worse. So, to improve things, let's change some of the letters to numbers that look a bit like them such that we get '4nn3J0n'. Now let's reverse it: 'n0J3nn4'.
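For anyone who wants to play with this, the swap-and-reverse trick is a couple of lines of Python. The lookalike table below only covers the letters used in the example (A, e and o) and is my own choice; extend it as you see fit.

```python
# Letters swapped for digits that look a bit like them.
LOOKALIKES = str.maketrans({"A": "4", "a": "4", "E": "3", "e": "3", "O": "0", "o": "0"})


def swap_digits(seed):
    """Replace lookalike letters with digits."""
    return seed.translate(LOOKALIKES)


def disguise(seed):
    """Swap the lookalike letters, then reverse the result."""
    return swap_digits(seed)[::-1]


if __name__ == "__main__":
    print(swap_digits("AnneJon"))
    print(disguise("AnneJon"))
```

The fact that it is this easy to automate is exactly why the trick on its own doesn't buy you much.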
Immediately it is looking a lot more random. However, that number swap is a common technique, and so is reversing a word. Plus, a name is a bad starting point in the first place. Clearly this gives us two areas that we can improve on: our seed word and our 'encryption' technique.
Now we still want to be able to remember this thing so rather than using one seed word, let's use three. Let's say I want a password for my account at Amazon, that my wife's name is Anne Jones and that I was born on 04/03/64. By converting the site name to upper case and taking a character from each seed in turn we get: AA0Mn4An0Ze3OJ6No4nes. However, I only want 7 characters so let's take the first 7 (because I can do that in my head without having to write anything down) and I get: AA0Mn4A.
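The whole point is that you can do this in your head, but for checking your working (or trying out different seeds) the same recipe can be sketched in Python. The function names are made up for the example, and the default seeds are the pretend ones used in this post.

```python
from itertools import zip_longest


def interleave(*seeds):
    """Take one character from each seed in turn, skipping seeds that run out."""
    return "".join(
        ch for group in zip_longest(*seeds, fillvalue="") for ch in group
    )


def site_password(site, name="AnneJones", date="040364", length=7):
    """Upper-case the site name, interleave it with the other two seeds,
    and keep the first `length` characters."""
    return interleave(site.upper(), name, date)[:length]


if __name__ == "__main__":
    for site in ("Amazon", "eBay", "GoogleMail"):
        print(site, site_password(site))
```

Because the site name is one of the seeds, changing the site changes the result while the other two seeds stay the same.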
Now that isn't truly random, in the sense that it's been generated using two pieces of personal information and a modified (converted to upper case) site name, but given the huge number of pieces of information that I could have used as 'seeds' and the huge number of ways in which I could manipulate them, it will be close to impossible for somebody else to guess. In the example I used the site name, my wife's name, and my date of birth, in that order, and converted the site name to upper case. I could have used my father's middle name, the registration number of the first car I ever bought, my sister's birthday, the last letter of each word in the first line of the song that was playing when I met my wife, the last 6 digits of my phone number or any number of other pieces of information as seeds. Furthermore I could have 'encrypted' them by reversing them, taking only the first and last letters of the names, using just the odd-numbered letters, omitting vowels, or dozens of other techniques.
The important thing is that rather than using a memorable password I'm using memorable techniques to generate what is to all intents and purposes a random password. If I decide upon a set of seeds and a set of techniques and always use the same ones then I will always know how to generate my passwords. Note that by using the site name (or system name or something that I strongly associate with it) as one of the seeds, I can keep the other seeds and my 'encryption' techniques the same and still have different passwords for different sites/systems. Using the scheme above as an example, my password for eBay would be 'EA0Bn4A' and my password for GoogleMail would be 'GA0On4O'. What's more, it means that if I register on any forums or shop checkout systems that store member/customer passwords in an unencrypted form, then a malicious person at the company can only see my password for their system; a password that gives them no clue about what I might be using on other systems.
Before we finish, I would just like to step back to my earlier decision to exclude punctuation characters when we were calculating how many possibilities there are for a six character password. Now that I have explained my technique for generating passwords it should be obvious why I did this: because the kind of seeds I suggest don't include punctuation symbols. You could however build in punctuation symbols at the 'encryption' stage, just in case you are thinking that the 56.8 billion possibilities that we calculated before aren't enough.
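For the curious, here is roughly what adding punctuation does to the numbers. This Python sketch assumes "punctuation" means the 32 punctuation marks in the standard ASCII set (Python's `string.punctuation`); other systems may allow more or fewer.

```python
import string

letters_and_digits = len(string.ascii_letters + string.digits)
with_punctuation = letters_and_digits + len(string.punctuation)


def keyspace(symbols, length=6):
    """Number of different passwords of the given length."""
    return symbols ** length


if __name__ == "__main__":
    plain = keyspace(letters_and_digits)
    richer = keyspace(with_punctuation)
    print(f"{letters_and_digits} symbols: {plain:,}")
    print(f"{with_punctuation} symbols: {richer:,}")
    print(f"about {richer / plain:.1f} times as many")
```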
Domain Names
From time to time I encounter people who need a domain name so I thought I'd share my thinking about choosing one:
Choosing a domain name is like naming a child in that you're as well to give it some very careful consideration because, although it's not impossible to change it again later, it can be rather difficult. In another way, however, it's not at all like that because while there are a lot of people called, for example, Andy Slater, there is, and can only be, one www.andyslater.com.
Things to consider when choosing a domain name:
- Availability
As stated above, every domain name has to be unique so if somebody has already registered the one you'd like, although you may be able to buy it from them for some exorbitant sum, it's probably just a case of hard luck.
You can find out if a name is available at places like http://www.uk2.net
On the front page of their site is a box where you can type the domain name that you're interested in registering and click a button to see if it's available. Chances are that your first choice won't be so be prepared for some disappointment and for some considerable time spent thinking up alternatives.
Please note that if you are planning to ask me to host your website on my server, the whole thing is a lot easier if your domain name is registered via UK2, but please contact me before jumping in.
- www.
Pretty much all domain names start with www. so you don't need to worry about that bit.
- .com? .co.uk? .net? .info?
The bit at the end is called the extension and there are various different ones available. However:
  - .com is the one that's best known. If people can't remember your extension, .com will probably be the first one they'll try so it's well worth having.
  - Search engine results show your domain name including the extension. If you've ever used a search engine to find something you want to buy then you'll be aware that much of the time you end up looking at shops in the USA. Amongst the results however you will see sites that have a .co.uk extension and this is a valuable clue that the site is based in the UK. When I'm shopping, I make a beeline for the ones with a .co.uk extension. Thus if you are selling to the UK market, a .co.uk extension can be a bonus.
- Name Length
Short names are best. eBay, Amazon, Google, etc. Short and snappy. I used to run a business from a site called www.themodelmakersresource.co.uk and found that the domain name was too long to fit on till receipts, too long to get printed onto pens for promotional purposes, etc, etc. Doh!
- Hyphens and Underscores
You can't use spaces in domain names but you can use hyphens. If your desired name contains more than one word you could therefore:
  - Join all the words together: www.themodelmakersresource.co.uk
  - Use hyphens: www.the-modelmakers-resource.co.uk
You may also see underscores suggested (www.the_modelmakers_resource.co.uk) but I recommend that you don't go there. Underscores aren't valid in registered domain names, which are limited to letters, digits and hyphens, and even where they do turn up in web addresses they are bound to cause confusion; people will mistake them for hyphens and spaces.
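If you're drawing up a list of names to check for availability, the joined and hyphenated forms are easy to churn out. A small Python sketch (the function name and the choice of extensions are mine, purely for illustration):

```python
from itertools import product


def name_variants(words, extensions):
    """Return the joined and hyphenated forms of a name for each extension."""
    # dict.fromkeys drops the duplicate you get with a one-word name
    stems = list(dict.fromkeys(["".join(words), "-".join(words)]))
    return [f"www.{stem}{ext}" for stem, ext in product(stems, extensions)]


if __name__ == "__main__":
    for name in name_variants(["the", "modelmakers", "resource"], [".co.uk", ".com"]):
        print(name)
```

A one-word name gives you half as many entries per extension, which is rather the point of keeping it short.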
Note also that for business purposes it's probably wise to obtain all variants. You can for example access Games Workshop's site with:
www.gamesworkshop.com
www.games-workshop.com
www.gamesworkshop.co.uk
www.games-workshop.co.uk
Smart move guys. Note however that eBay, Amazon, Google (businesses that were Internet businesses from day 1) and the like only need half as many domain names to cover all the options. Even smarter.
- Memorability
My experience with The Model Makers Resource has shown me that on the web, unless you can become a household name, you're at the mercy of the search engines if you want people to 'stumble' across your site. Obviously the more memorable your name, the better your chances of people coming straight back to you rather than heading for the search engines. The Model Makers Resource Ltd is too long and clumsy for most folks to remember. There's also an issue about whether "model makers" should be one word, two words, or hyphenated and arguing that your choice is correct according to the dictionary is of little value if potential customers are failing to reach you because they are typing it in wrong.
It is strange, but it is a fact that I would often hear people outside my shop saying "Oh look, The Model Makers Resource Centre". The sign did NOT have the word 'centre' anywhere on it but for some reason people just seemed to add it on the end. What chance have they got of remembering the domain name if they can't even read the company name off the sign above the shop window correctly?
It's a bad name and more 'field testing' would have told me that before I invested in it.
- Spelling
If you're a business, having a name that's easy to remember has to be a smart move. Even if you're not a business, obscure spellings, or things that people have difficulty in spelling, are best avoided. Choosing something that you have to explain or spell out every time you tell someone your domain name over the phone is a bad move. Choose something such that when you tell people your domain name, their first guess at how to spell it will be correct.
- Keywords
'Keywords' are words that people type into search engines when they are looking for something. Having a keyword in your domain name can enhance its position on the list (its ranking). Perhaps more important, the search engines being fickle after all, is the fact that your domain name is displayed and can be used to say something about you.
When I was setting up The Model Makers Resource, having the word 'model' in the name was on my list of priorities. Part of the trouble I had finding a domain name was that the word 'model' can be used in more than one context. You've got model kits, catwalk models, data modelling, model citizens and a huge range of other things. As a result, pretty much every domain name I thought of was already in use. I did find one short one, models-uk, uk-models or something like that; however, all of the other 'variants' were taken up by competitors, model agencies and in one case a rather nasty porn site. I decided to steer clear.
If you can slip a keyword into your name it's certainly worth considering but I believe that there are more important issues, as listed above, and I wouldn't compromise them for the sake of inserting a keyword.
At the end of the day, Boots (the chemists) don't seem to have suffered from having an inappropriate name, although one is inclined to wonder whatever became of Mr Chemist the cobbler?
HTML Special Characters
This document used to live on one of my websites however it doesn't really fit in there any more. I still need it though, so I've moved it here.
For the benefit of those who don't know, HTML special character codes are codes that you can use when writing HTML content and they exist for a number of reasons:
- Some characters don't appear on the keyboard. Special character codes provide a means of entering them.
- Some characters, the greater-than character for example, have a special meaning in HTML. If you want a browser to display those characters rather than trying to interpret them as part of an HTML tag, you need to use the special character code.
- Some characters are language and/or character set dependent (currency symbols being a good example). You need a way to force the viewer's browser to display the correct character. Special character codes provide a way of doing this.
The list below is far from complete. There are many more codes to define all kinds of symbols and foreign characters with accents, umlauts and the like. The following list however shows the codes that I use the most frequently.
Code | Character | Description |
&amp; | & | ampersand |
&lt; | < | less-than sign |
&gt; | > | greater-than sign |
&nbsp; | | nonbreaking space |
&pound; | £ | pound sterling |
&euro; | € | euro |
&copy; | © | copyright |
&trade; | ™ | trademark sign |
&reg; | ® | registered trademark |
&deg; | ° | degree sign |
&plusmn; | ± | plus or minus |
&sup2; | ² | superscript two |
&sup3; | ³ | superscript three |
&frac14; | ¼ | one-quarter |
&frac12; | ½ | one-half |
&frac34; | ¾ | three-quarters |
&times; | × | multiplication sign |
&divide; | ÷ | division sign |
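If you are generating HTML from a script rather than typing it by hand, most languages will do the conversion for you. As a small illustration, here is how it looks with Python's standard `html` module; the wrapper names `to_html` and `from_html` are just mine for the example.

```python
import html


def to_html(text):
    """Replace &, < and > with their special character codes."""
    return html.escape(text, quote=False)


def from_html(markup):
    """Turn special character codes back into the characters they stand for."""
    return html.unescape(markup)


if __name__ == "__main__":
    print(to_html("5 < 6 & 7 > 2"))
    print(from_html("&pound;10 &times; 2 = &pound;20"))
    print(from_html("&copy; 2009"))
```

Note that `html.escape` only deals with the characters that have a special meaning in HTML; things like the pound sign are left alone, so it covers the second reason in the list above rather than the first or third.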