A question for those who have been visited by Googlebot

  • PNPx
    Junior Member
    • Feb 2006
    • 16

    A question for those who have been visited by Googlebot

    I noticed the other day that my bandwidth had been spiking at an alarming rate. I live in absolute paranoia that I'll wake up one day to a horrid screen that says, "Your site has been suspended...". So when I saw that my site had used more in a day than it usually does in a week, I freaked out.

    Looking through my stats, it would seem it is Googlebot: it is listed as using up roughly the same amount of data as that unusual spike. Fine, I would love to be indexed. I have about 90GB of bandwidth a month, I think, and even though I am alarmed that just a few days into the month I'm already close to using a GB (I usually only use between 4-6GB the whole month), I think I should hold up okay. I only have 2GB worth of content, after all... just how many times is Googlebot going to visit me?

    That's as far as my bandwidth limit goes. My real concern is what a visit from Googlebot is actually like - this is shared hosting, after all. I've read up on the situation and seen that many people have had their sites shut down for exceeding their monthly limit. Fair enough, not cool, but that situation has two easy solutions: buy more bandwidth or just wait until the next month rolls over. Has anyone heard of sites being shut down because Googlebot caused problems in a shared environment? Does Googlebot crawl like one user going from one page to the next, or like thousands of users all hitting my site at once?

    This thing seems like a blessing and a curse. On one hand, I'd love to have my site indexed, as it isn't indexed at all so far. On the other hand, I certainly don't want to lose it altogether because it causes stability issues and I get booted off my server. It's a good thing I am still working on my site, so I have a lot of unused bandwidth to spare. If I had launched it, I'd probably be sweating even more.

    Should I worry? I read the other posts here about the robots.txt trick to stop it. I actually don't want to stop Googlebot from visiting me, unless it might get the site shut down or unless it will be doing this every day for a year or more.
  • AndrewT
    Administrator
    • Mar 2004
    • 3653

    #2
    On more than one occasion I've seen Googlebot completely flood sites with page views while indexing. Usually we just temporarily block the IP if it becomes a problem.

    • Frank Hagan
      Senior Member
      • Mar 2004
      • 724

      #3
      You might check out the entry on "robots.txt" that Google has at http://www.google.com/support/webmas....py?topic=8843. They go into a lot of detail on how they index a site.

      I have the "Crawl-Delay: 10" command in my robots.txt file to slow down the indexing, and I have certain directories excluded from indexing. If you have a site with rapidly changing pages, like forums, etc., then disallowing those pages from indexing will lessen the load.

      • PNPx
        Junior Member
        • Feb 2006
        • 16

        #4
        I did read over it, Frank, but I guess I missed the part about "Crawl-Delay: 10" - sounds like a good idea; I might incorporate something like that when I update more often. Thanks for the tip.

        Thanks for your response also, Andrew - that's pretty comforting to know how it is usually handled.

        • Buddha
          Senior Member
          • Mar 2004
          • 825

          #5
          Since I fear Slurp more than Googlebot, here's a link to Yahoo's help page explaining crawl-delay:


          MSN supports this instruction too.
          "Whatcha mean I shouldn't be rude to my clients?! If you want polite then there will be a substantial fee increase." - Buddha

          • PNPx
            Junior Member
            • Feb 2006
            • 16

            #6
            Thanks, reading that over now. Yahoo so far hasn't shown much interest in me, accounting for only a handful of hits - strangely, I have 100+ pages indexed with them, but their interest seems to be picking up at the same time as Google's. Good to know.

            • AndyP
              Junior Member
              • Dec 2005
              • 13

              #7
              I recommend that you sign up for Google webmaster tools.

              It's free, and if you already have an account you can go to https://www.google.com/webmasters/sitemaps/crawlrate?siteUrl=http://www.[yourdomain].com%2F&hl=en and change the rate at which Googlebot crawls your site.

              Also, Googlebot performs bandwidth benchmarking, and you can't increase the frequency of its visits if Google doesn't think your site can handle it. You can always decrease the frequency, though.

              • Frank Hagan
                Senior Member
                • Mar 2004
                • 724

                #8
                Also, the "crawl-delay" command isn't recognized by Google. MSN and Slurp obey it, I think. I had a problem with MSNbot hitting the site very rapidly.

                Google doesn't index PHP forums accurately, so I have Googlebot excluded from the forum folder and use Boardtracker for forum searches.
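
                Roughly how that looks in a robots.txt (the /forum/ path here is just an example, and each bot follows only the record that names it):

                # MSNbot and Slurp honor Crawl-delay, so slow them down
                User-agent: msnbot
                Crawl-delay: 10
                Disallow: /forum/

                User-agent: Slurp
                Crawl-delay: 10
                Disallow: /forum/

                # Googlebot ignores Crawl-delay, so excluding the forum folder is the only lever for it
                User-agent: Googlebot
                Disallow: /forum/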

                • KyleC
                  Senior Member
                  • Mar 2004
                  • 291

                  #9
                  The MSN bot got caught in a loop on one of my customers' sites. It kept trying to access the member list on the forums again and again, and racked up 3GB of bandwidth in 8 days.
                  -Kyle

                  • AndyP
                    Junior Member
                    • Dec 2005
                    • 13

                    #10
                    I've also heard of instances where bots get stuck on dynamically generated calendars where you can page REALLY FAR into the future or past.
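
                    One fix I've seen is to block the calendar URLs in robots.txt (the script name and parameter below are made up - check what your own calendar actually uses):

                    # Keep every crawler off the endless previous/next calendar pages
                    User-agent: *
                    Disallow: /calendar.php

                    # Googlebot also understands simple * wildcards, so any URL carrying a date parameter can be blocked too
                    User-agent: Googlebot
                    Disallow: /calendar.php
                    Disallow: /*month=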
