Thursday, June 19, 2008

Google Search wipes out 85% of web site content


From timalex ...
Our company's web site uses a content management system whose interface is entirely browser based. Turning the Google Search Appliance (GSA) loose on our web site with an administrative account ended up wiping out 85% of the site's content, because the crawler followed the delete links in the administrative interface of the content management system.

The CMS we use is built in ColdFusion (which we're rapidly moving away from, to .NET, sometime next year). These ColdFusion pages have buttons and images that are all hyperlinked to perform different actions on content records, content folders, and unfortunately whole web site instances. One of these hyperlinked image buttons deletes the content when clicked, which the crawler furiously did last night.

Lesson learnt: crawl with an appropriate account, one that doesn't have access to the CMS authoring functionality.

3 Comments:

  • Anonymous said…

      While I have no love for Google, this problem could occur with any spider (e.g. Microsoft Search) that has the admin credentials.

  • Anonymous said…

      As a rule, spiders will not make POSTs, but they will make GETs. It's a good idea to make destructive actions require a form post.

      Also, what was the GSA doing with CMS admin credentials?

  • Anonymous said…

      No, the lesson is NOT learned. Go and learn HTTP protocol basics first. Namely, the difference between GET and POST requests.   
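
The commenters' point about GET and POST can be illustrated with a minimal sketch. It is written in Python with only the standard library, not in the ColdFusion the CMS in question actually uses, and the `/admin/delete/` URL layout, the `PAGES` store, and the `route` and `AdminHandler` names are all hypothetical. The idea is simply that a delete URL reached by GET, which is what a crawler following a hyperlink sends, changes nothing; only a POST removes content.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical in-memory store standing in for the CMS content records.
PAGES = {"1": "Home", "2": "About us"}
DELETE_PREFIX = "/admin/delete/"  # hypothetical admin URL layout


def route(method, path, pages):
    """Return (status, text) for a request; only POST may delete."""
    if path.startswith(DELETE_PREFIX):
        if method != "POST":
            # A crawler following a hyperlink sends GET, so nothing is deleted.
            return 405, "Deleting requires POST"
        page_id = path[len(DELETE_PREFIX):]
        if pages.pop(page_id, None) is None:
            return 404, "No such page"
        return 200, "Deleted page " + page_id
    if method == "GET":
        # Any other GET just lists the remaining page titles.
        return 200, "\n".join(pages[k] for k in sorted(pages))
    return 404, "Not found"


class AdminHandler(BaseHTTPRequestHandler):
    """Thin HTTP wrapper around route(), sharing the PAGES store."""

    def _respond(self, method):
        status, text = route(method, self.path, PAGES)
        body = (text + "\n").encode("utf-8")
        self.send_response(status)
        if status == 405:
            # Tell the client which method the delete URL accepts.
            self.send_header("Allow", "POST")
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        self._respond("GET")

    def do_POST(self):
        self._respond("POST")
```

To try it, pass `AdminHandler` to `http.server.HTTPServer` and call `serve_forever()`. This complements the lesson in the post rather than replacing it: a crawler account without authoring rights is still the first line of defence.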
