net-progress

War of the worlds – website security

Forget the summer blockbuster starring Tom Cruise. The real pyrotechnics took place between two astronomers last year.

In July 2005, Jose-Luis Ortiz and his team at the Institute of Astrophysics of Andalusia announced that they had discovered a giant object orbiting beyond Neptune. Mike Brown, an astronomer at Caltech, emailed his congratulations to Ortiz, and at the same time, told the Minor Planet Center (MPC) that he had also been tracking the object. Soon after, Brian Marsden of the MPC told Brown that telescope logs including his observations were publicly available on the internet.

Brown then checked his server records and, by performing a reverse DNS lookup (incidentally demonstrating what a valuable process this is), discovered that his logs had been accessed from two computers at the Institute of Astrophysics of Andalusia. Ortiz readily admits that this is the case, but claims that he did nothing wrong, as he found the logs on a publicly available website via a Google search. However, as the use of the Caltech logs was not acknowledged, it is not clear whether the log file data was used to validate the Spanish findings, or whether it caused the team to re-examine images taken more than two years previously.
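For anyone who wants to try a reverse DNS lookup on their own visitor logs, here is a minimal Python sketch. It is not a reconstruction of what Brown did (we don't know which tools he used); it simply illustrates the idea using the standard library's socket.gethostbyaddr. The function names are our own, chosen for illustration.

```python
import socket
from collections import Counter


def reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return the hostname registered for an IP address, or None if there is none."""
    try:
        hostname, _aliases, _addresses = resolver(ip)
        return hostname
    except (socket.herror, socket.gaierror):
        # No reverse DNS entry exists, or the lookup failed.
        return None


def visitors_by_host(ips, resolver=socket.gethostbyaddr):
    """Count requests per resolved hostname; unresolved addresses stay as bare IPs."""
    cache = {}  # look each address up only once
    counts = Counter()
    for ip in ips:
        if ip not in cache:
            cache[ip] = reverse_lookup(ip, resolver) or ip
        counts[cache[ip]] += 1
    return counts
```

Passing in the visitor IP addresses from a server log gives a tally per hostname, which is often enough to see which organisation a visit came from. The resolver argument is only there so that the lookup can be swapped out when no network is available.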

Putting to one side the elements of the debate particular to the astronomy community, let’s concentrate on the accessing of the log files. Within the letter of the law, you would have to say that Ortiz is right: the log files were in the public domain, and therefore “fair game”. In fact, we found that the log files are still available to the public. For us, however, it isn’t right ethically.

Don’t think that finding information not really intended for everyone is uncommon. Not that long ago, we learned that we had been nominated for an award when we came across the entry form via Google. It wasn’t particularly sensitive, but we knew that it wasn’t supposed to be available to the general public. If your website can be indexed by Google, Google will index it. Normally, of course, this is a good thing, but it’s worth sitting back for a moment and thinking about what you have on your website and whether you want Google to index everything it finds.

One thing you can do is to go to Google and type in “site:www.mydomain.com”, substituting your own domain name. This will list the pages that Google has indexed from your website.
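If you look after several domains, the same query can be turned into a link rather than typed by hand. The short Python sketch below builds the search URL; the function name is our own, and www.mydomain.com is the same placeholder as above.

```python
from urllib.parse import urlencode


def site_query_url(domain):
    """Build a Google search URL that restricts results to a single domain."""
    return "https://www.google.com/search?" + urlencode({"q": f"site:{domain}"})


# Placeholder domain; substitute your own.
print(site_query_url("www.mydomain.com"))
```

Opening the printed link in a browser gives the same listing as typing the query into Google directly.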

Assuming that you want to keep information from Google, what can you do? Firstly, you can password-protect directories and pages. This is probably the best solution, as it is difficult to argue that information is in the public domain if someone has to hack a password to get it. You can also use a robots.txt file to tell the search engine spiders (the programs that crawl and index a website) what they can list and what is off limits. Similarly, a meta tag can be placed in the head of individual pages to the same effect.
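To make the robots.txt and meta tag options concrete, here is a small Python sketch. It holds an example robots.txt and an example meta tag as strings, then uses the standard library's urllib.robotparser to check which addresses a well-behaved spider would be allowed to fetch. The paths and the function name are hypothetical, made up for illustration. Bear in mind that robots.txt is itself a public file and is only a request to spiders, which is one more reason password protection is the stronger option.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt; the paths below are hypothetical.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /awards/entry-form.html
"""

# Per-page alternative, placed inside the page's <head> element.
META_TAG = '<meta name="robots" content="noindex, nofollow">'


def is_crawlable(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt rules let this user agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


for url in ("http://www.mydomain.com/index.html",
            "http://www.mydomain.com/private/logs.txt"):
    print(url, is_crawlable(ROBOTS_TXT, url))
```

Running a check like this before uploading a new robots.txt is a cheap way of confirming that the rules cover what you meant them to cover.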

This won’t completely fireproof you, however. On some websites, the web server will list the contents of a directory in your browser if there is no index page. That loophole can also be closed; just don’t forget to do it.
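One rough way to spot an open directory is to look at the page that comes back when you request the directory address. The Python sketch below is a heuristic only: it assumes the server produces the kind of automatic listing that Apache and nginx generate, whose page title begins “Index of /”. Other servers word their listings differently, and the function name is our own.

```python
import re

# Apache and nginx auto-generated listings use a title of the form "Index of /path".
_LISTING_TITLE = re.compile(r"<title>\s*Index of\s+/", re.IGNORECASE)


def looks_like_directory_listing(html):
    """Heuristic: True if the page title matches an auto-generated directory index."""
    return bool(_LISTING_TITLE.search(html))


# Made-up page content, shaped like a typical auto-generated listing.
sample = "<html><head><title>Index of /logs</title></head><body></body></html>"
print(looks_like_directory_listing(sample))
```

On Apache, for example, automatic listings are switched off with the Options -Indexes directive; other servers have their own equivalent setting.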

turning information into business intelligence since 1997