
Thread: A question for those who have been visited by Googlebot

  1. #1
    Junior Member
    Join Date
    Feb 2006
    Posts
    16


    I noticed the other day that my bandwidth had been spiking at an alarming rate. I live in absolute paranoia that I'll wake up one day and get a horrid screen that says, "Your site has been suspended...". So when I saw that my site had used more in a day than it usually does in a week, I freaked out.

    Looking through my stats, it would seem to be Googlebot: it is listed as using roughly the same amount of data that I find so unusual. Fine, I would love to be indexed. I have about 90GB of bandwidth a month, I think, and even though I am alarmed that just a few days into the month I'm close to using a GB (I usually only use between 4 and 6GB the whole month), I think I should hold up okay. I only have 2GB worth of content after all... just how many times is Googlebot going to visit me?

    That's as far as my bandwidth limit goes. My real concern is: what is a visit from Googlebot like? This is shared hosting after all. I've read up on the situation and seen that many people have had their sites shut down for exceeding their monthly limit. Fair enough, not cool, but that situation has two easy solutions: buy more bandwidth or just wait until next month rolls over. But has anyone heard of sites being shut down due to Googlebot causing problems in a shared environment? Is the Googlebot method like one user going from one page to the next, or is it like 1000s of users all hitting my site at once?

    This thing seems like a blessing and a curse. On one hand, I'd love to have my site indexed, as it isn't indexed at all so far. On the other hand, I certainly don't want to lose it altogether because it is causing stability issues and I get booted off my server. It's a good thing I am still working on my site, so I have a lot of unused bandwidth to spare. If I had launched it, I'd probably be sweating even more.

    Should I worry? I read the other posts here about that txt file trick to stop it. I actually don't want to stop Googlebot from visiting me unless it might get the site shut down, or unless it will be doing this every day for a year or more.

  2. #2
    Administrator AndrewT's Avatar
    Join Date
    Mar 2004
    Location
    Tulsa, OK
    Posts
    3,639


    On more than one occasion I've seen Google bots completely flood sites with page views as they were indexing. Usually we just temporarily block the IP if it is a problem.

  3. #3
    Senior Member
    Join Date
    Mar 2004
    Location
    California
    Posts
    724


    You might check out the entry on "robots.txt" that Google has at http://www.google.com/support/webmas....py?topic=8843 They go into a lot of detail on how they index a site.

    I have the "Crawl-Delay: 10" command in my robots.txt file to slow down the indexing, and I have certain directories excluded from indexing. If you have a site with rapidly changing pages, like forums, etc., then disallowing those pages from indexing will lessen the load.
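
    For anyone who wants to sanity-check rules like these before uploading them, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt content, the /forum/ path, and the example.com URLs are made up for illustration; only the "Crawl-delay: 10" value comes from the post above. Keep in mind that this only shows how the file parses. Whether a given crawler actually honors Crawl-delay is up to that crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the /forum/ path is an illustrative assumption,
# not taken from any real site. The 10-second delay mirrors the post above.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /forum/
"""


def load_rules(text: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

# Delay (in seconds) requested from any crawler matching "User-agent: *".
print(rules.crawl_delay("Slurp"))  # 10

# The excluded directory is off limits; everything else is allowed.
print(rules.can_fetch("Slurp", "http://www.example.com/forum/index.php"))  # False
print(rules.can_fetch("Slurp", "http://www.example.com/index.html"))       # True
```

    To check a live file instead, the same module can fetch it with set_url() and read(), but parsing a local copy first avoids publishing a typo.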

  4. #4
    Junior Member
    Join Date
    Feb 2006
    Posts
    16


    I did read over it Frank, but I guess I missed the part about "Crawl-Delay: 10" - sounds like a good idea, I might incorporate something like that when I update more often. Thanks for the tip.

    Thanks for your response also, Andrew - that's pretty comforting to know how it is usually handled.

  5. #5
    Senior Member Buddha's Avatar
    Join Date
    Mar 2004
    Location
    Florida USA
    Posts
    825


    Since I fear Slurp more than Googlebot, here's a link to Yahoo's help page explaining crawl-delay:
    http://help.yahoo.com/help/us/ysearc.../slurp-03.html

    MSN supports this instruction too.
    http://search.msn.com.my/docs/siteow...otIndexing.htm
    "Whatcha mean I shouldn't be rude to my clients?! If you want polite then there will be a substantial fee increase." - Buddha

  6. #6
    Junior Member
    Join Date
    Feb 2006
    Posts
    16


    Thanks, reading that over now. Yahoo so far hasn't shown much interest in me, accounting for only a handful of hits. Strangely, I have 100+ pages indexed with them, and their interest seems to be picking up at the same time as Google's. Good to know.

  7. #7
    Junior Member
    Join Date
    Dec 2005
    Posts
    13


    I recommend that you sign up for Google Webmaster Tools.

    It's free, and if you already have an account you can go to https://www.google.com/webmasters/sitemaps/crawlrate?siteUrl=http://www.[yourdomain].com%2F&hl=en and change the rate at which Googlebot crawls your site.

    Also, Googlebot performs bandwidth benchmarking, and you can't increase the frequency of the visits if they don't think your site can handle it. You can always decrease the frequency, though.

  8. #8
    Senior Member
    Join Date
    Mar 2004
    Location
    California
    Posts
    724


    Also, the "crawl-delay" command isn't recognized by Google. MSN and Slurp obey it, I think. I had a problem with MSNbot hitting the site very rapidly.

    Google doesn't index PHP forums accurately, so I have it excluded from the forum folder, and use Boardtracker for forum searches.

  9. #9
    Senior Member KyleC's Avatar
    Join Date
    Mar 2004
    Location
    Dallas, TX
    Posts
    291


    The MSN bot got caught in a loop on one of my customers' sites. It was trying to access the member list on the forums and kept trying again and again, and racked up 3GB of bandwidth in 8 days.
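
    A quick way to spot this kind of runaway crawler is to total the bytes served per bot from the raw access log. Below is a rough sketch, assuming the server writes the common "combined" log format; the sample lines, IP addresses, and byte counts are invented for illustration, and the bot names are matched as simple substrings of the user-agent string.

```python
import re
from collections import Counter

# Tail of a combined-format log line:
#   "request" status bytes "referer" "user-agent"
LOG_TAIL = re.compile(
    r'"[^"]*" (?P<status>\d{3}) (?P<bytes>\d+|-) "[^"]*" "(?P<agent>[^"]*)"'
)

# Substrings to look for in the user-agent field (case-insensitive).
BOTS = ("Googlebot", "Slurp", "msnbot")


def bytes_per_bot(lines):
    """Sum response sizes per known bot; non-bot traffic is ignored."""
    totals = Counter()
    for line in lines:
        match = LOG_TAIL.search(line)
        if not match or match.group("bytes") == "-":
            continue  # unparseable line, or no response body was sent
        agent = match.group("agent").lower()
        for bot in BOTS:
            if bot.lower() in agent:
                totals[bot] += int(match.group("bytes"))
                break
    return totals


# Invented sample data, for illustration only.
SAMPLE = [
    '192.0.2.1 - - [03/Mar/2006:10:00:00 +0000] "GET /forum/memberlist.php HTTP/1.1" 200 5120 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"',
    '192.0.2.1 - - [03/Mar/2006:10:00:01 +0000] "GET /forum/memberlist.php HTTP/1.1" 200 5120 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"',
    '192.0.2.2 - - [03/Mar/2006:10:00:02 +0000] "GET /index.html HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.3 - - [03/Mar/2006:10:00:03 +0000] "GET /index.html HTTP/1.1" 200 1000 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1)"',
]

totals = bytes_per_bot(SAMPLE)
for bot, total in totals.most_common():
    print(f"{bot}: {total} bytes")
```

    On a real log, pass an open file object instead of SAMPLE. A bot that dominates the totals while requesting the same URL over and over is a good candidate for a Disallow rule or a temporary block.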

  10. #10
    Junior Member
    Join Date
    Dec 2005
    Posts
    13


    I've also heard of instances where bots get stuck on dynamically generated calendars in which you can go REALLY FAR to any time in the future or past.
