robots.txt?

  • Yaroslav
    Junior Member
    • Mar 2004
    • 19

    For some reason, a robots.txt file keeps appearing in the root folder of one of my accounts. Since I run AdSense and similar services, blocking crawlers is the last thing I want. Why could this be happening? Is there some automated "crawler protection" in place?

    Also, is there a way for me to find out what is creating the file (a script, an FTP session, or something else)?
  • AndrewT
    Administrator
    • Mar 2004
    • 3653

    #2
    I can't really say why such a file might have appeared in your case, but you should certainly have some kind of robots.txt file configured to rate-limit bot requests. By default, many of these bots run through pages very quickly, and when those pages are PHP/SQL pages this can create problems.
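
    As a rough illustration of the rate limiting mentioned above, here is a minimal sketch that parses a sample robots.txt with Python's standard urllib.robotparser and reads back the requested delay. The file contents, the 10-second value, the bot name "ExampleBot", and the path /index.php are all assumptions made up for the example. Crawl-delay is a non-standard directive and not every crawler honors it (Googlebot, for one, ignores it).

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: allow everything, but ask bots to wait
# 10 seconds between requests. The value is only an example.
RATE_LIMIT_ROBOTS = """\
User-agent: *
Crawl-delay: 10
Disallow:
"""


def parse_robots(text):
    """Build a parser from robots.txt content held in a string."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rate_parser = parse_robots(RATE_LIMIT_ROBOTS)

# "ExampleBot" is a made-up user agent; it falls under the "*" entry.
delay = rate_parser.crawl_delay("ExampleBot")
allowed = rate_parser.can_fetch("ExampleBot", "/index.php")

print(f"crawl delay: {delay}, allowed: {allowed}")
```

    An empty Disallow line blocks nothing, so a file like this slows polite crawlers down without hiding any pages from them.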

    • Frank Hagan
      Senior Member
      • Mar 2004
      • 724

      #3
      The big search engines, like Google, MSN and Ask, want to find a robots.txt file. They will honor it, for the most part.

      Andrew mentioned a problem I had with "twiceler", a robot that used up the monthly bandwidth on one of my accounts in a single day. I had an Amazon.com shop, and it was trying to index Amazon's entire catalog through my site. I excluded it in robots.txt because it wasn't reading the robots meta tags ("index, nofollow") in the HTML head of the individual pages.
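
      For anyone worried that a robots.txt file might be shutting out the search engines, here is a small sketch of excluding a single robot while leaving everyone else alone, checked with Python's standard urllib.robotparser. The user-agent token "twiceler" and the path /shop/item.html are assumptions for illustration; the real token should be taken from the bot's entries in the access log.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out one misbehaving bot, allow all others.
# The "twiceler" token is assumed; verify it against your access logs.
BLOCK_ONE_BOT = """\
User-agent: twiceler
Disallow: /

User-agent: *
Disallow:
"""

block_parser = RobotFileParser()
block_parser.parse(BLOCK_ONE_BOT.splitlines())

# /shop/item.html is a made-up path standing in for a shop page.
twiceler_ok = block_parser.can_fetch("twiceler", "/shop/item.html")
googlebot_ok = block_parser.can_fetch("Googlebot", "/shop/item.html")

print(f"twiceler allowed: {twiceler_ok}")
print(f"Googlebot allowed: {googlebot_ok}")
```

      Running the same check against the file that keeps appearing in an account's root folder would show whether it actually blocks the crawlers that matter.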
