Ok, this is just plain hilarious... though I do feel sorry for the poor sysadmin who had to trace this issue, and I am impressed that he actually managed to solve it. Basically:
“Josh [Breckman] was called in to investigate and noticed that one particularly troublesome external IP had gone in and deleted all of the content on the system. The IP didn’t belong to some overseas hacker bent on destroying helpful government information. It resolved to googlebot.com, Google’s very own web crawling spider. Whoops.”
So when Googlebot crawled the site, it followed every link it found, including the “delete page” links in the admin interface, deleting content as it went. This is really poor application design: the authentication mechanism depended on the client having JavaScript enabled, and if it didn’t, the application simply granted the visitor full access to the site. Googlebot doesn’t run JavaScript, so it walked straight in with admin rights. Not smart…
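There are really two mistakes stacked on top of each other here: access control enforced only in client-side JavaScript, and a destructive action reachable by a plain GET link. Here is a minimal sketch of both the flaw and the fix, written as hypothetical Flask routes with a made-up `delete_from_database` helper (the write-up doesn’t say what the actual CMS was built with):

```python
# Hypothetical Flask sketch of the flaw described above -- the real
# CMS's stack isn't named in the write-up.
from flask import Flask, abort, session

app = Flask(__name__)
app.secret_key = "change-me"  # required for server-side sessions


def delete_from_database(page_id: int) -> None:
    """Stub for illustration; a real app would hit its datastore."""
    print(f"page {page_id} deleted")


# BROKEN: the "protection" is a JavaScript redirect on the page that
# links here; the server itself never checks who is asking. A crawler
# that ignores JavaScript and follows every href (like Googlebot)
# issues this GET and wipes the page.
@app.route("/broken/admin/delete/<int:page_id>")
def delete_page_broken(page_id: int):
    delete_from_database(page_id)
    return "deleted"


# FIXED: authorization is checked on the server for every request, and
# the destructive action requires POST, which well-behaved crawlers
# never send when merely following links.
@app.route("/admin/delete/<int:page_id>", methods=["POST"])
def delete_page_fixed(page_id: int):
    if not session.get("logged_in"):
        abort(403)  # no valid server-side session: refuse
    delete_from_database(page_id)
    return "deleted"
```

The two rules together would have prevented the whole mess: never trust the client to enforce access, and never make state-changing operations reachable by GET.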
Complete write-up: The Spider of Doom – The Daily WTF (has some good suggestions on how to avoid this)
Thanks to Daily Blogoscoped for the link.
– Suramya