Updating between versions and robots.txt

When you install concrete5, the /updates folder is not disallowed in robots.txt. Most of the other folders are added automatically on first install, but /updates is not. Example:

User-agent: *
Disallow: /blocks
Disallow: /concrete
Disallow: /config
Disallow: /controllers
Disallow: /css
Disallow: /elements
Disallow: /helpers
Disallow: /jobs
Disallow: /js
Disallow: /languages
Disallow: /libraries
Disallow: /mail
Disallow: /models
Disallow: /packages
Disallow: /single_pages
Disallow: /themes
Disallow: /tools

I came across this by accident when searching for my website on Google. A few days after updating, there were suddenly thousands of new entries for my site. Investigating, I was shocked to find that the entire contents of my /updates folder (over a thousand files) had been indexed by various search engines (mostly Google). It seems I'm not the only one caught out: do a Google search using the terms "concrete5" and "updates" and you'll find hundreds of other sites that have had their entire directory contents and more listed. Messy stuff and a possible security risk.

For people in the same boat as me, add /updates to your robots.txt file thus:

Disallow: /updates
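If you want to double-check that the new rule actually blocks crawlers, here is a quick sketch using Python's standard urllib.robotparser module (the file path under /updates is just a made-up example):

```python
import urllib.robotparser

# A minimal robots.txt containing the fix described above
robots = """\
User-agent: *
Disallow: /updates
"""

parser = urllib.robotparser.RobotFileParser()
parser.parse(robots.splitlines())

# Anything under /updates should now be off-limits to crawlers
# (the specific file path here is hypothetical)
print(parser.can_fetch("Googlebot", "/updates/concrete5.4.1/concrete/dispatcher.php"))  # False

# Regular pages are still crawlable
print(parser.can_fetch("Googlebot", "/index.php"))  # True
```

Note that Disallow matches by prefix, so a single "/updates" line covers every file and subfolder inside it.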

If you have Google Webmaster Tools access, there is a nifty way of removing the entire contents of a directory without having to remove each URL individually. Add the disallow rule to your robots.txt as above, then log in to your Google Webmaster dashboard. Go to "Site configuration" in the left-hand menu and click "Crawler access". Click "Remove URL", then "New removal request". In the text box, enter /updates; press "Continue" and "/updates" will automatically be appended to your site's URL. In the "Reason" menu, select "Remove directory", then press "Submit Request" when finished. Done. Saved me a lot of hassle.