03 Sep
Posted by Andrew as Web & Tech
On 01 August, 2007, I noticed a significant drop in traffic to many of my sites, including my main website. Now, most webmasters will be familiar with the occasional blip when things go wrong, so you learn to sit tight and monitor the situation. Sometimes, these things can last anywhere from a day to 2 weeks, and the last thing you want to do is react in panic and make changes that aggravate the situation. However, after 3 weeks and no sign of recovery, I started to worry. I decided to do some checking on my own. Along the way, I learnt many lessons.
The symptoms
* Google Sitemaps reports “Errors” that alternated between :
* Checking the Diagnostics tab, I started to see many pages listed as unreachable.

* The Diagnostics tab showed that Google’s last cached date for my robots.txt file was July 13, 2007. This would suggest that Googlebot had problems accessing my robots.txt file (and my site) after that.
* A look into my server logs showed Googlebot’s last single visit for the day was on August 10. That seemed strange when in the past, Googlebot would visit me at least once every hour on average.
* The graph of Google’s crawl rate showed Google stopped crawling my site in late July, and the second graph showed erratic download times coinciding with Googlebot’s absence.
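The server-log check described in the list above can be sketched in a few lines of Python. This is only an illustration: it assumes the log is in Apache's "combined" format and that Googlebot identifies itself in the user-agent field. The sample lines, IP addresses and the function name `googlebot_hits_per_day` are made up for the example.

```python
import re
from collections import Counter

# Apache "combined" format (assumed):
# ip - - [10/Aug/2007:06:25:14 +0800] "GET / HTTP/1.1" 200 512 "referer" "user agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def googlebot_hits_per_day(lines):
    """Count, per calendar day, the log lines whose user agent mentions Googlebot."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and "googlebot" in match.group("agent").lower():
            hits[match.group("day")] += 1
    return hits


# Illustrative sample lines, not taken from the real logs.
SAMPLE_LOG = [
    '66.249.66.1 - - [09/Aug/2007:11:00:00 +0800] "GET / HTTP/1.1" 200 5120 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '66.249.66.1 - - [09/Aug/2007:12:00:00 +0800] "GET /about HTTP/1.1" 200 900 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '10.0.0.5 - - [10/Aug/2007:07:01:02 +0800] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/4.0 (compatible; MSIE 6.0)"',
    '66.249.66.1 - - [10/Aug/2007:06:25:14 +0800] "GET /robots.txt HTTP/1.1" 200 120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]

for day, count in googlebot_hits_per_day(SAMPLE_LOG).items():
    print(day, count)
```

A day that normally shows dozens of hits and suddenly shows one or none is the pattern to look for. Anyone can fake the user-agent string, so treat the counts as a rough signal only.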


I believe that as a result of Googlebot being unable to access my pages, my site dropped out of the search engine results pages (SERPs), so traffic that used to come from being ranked on page one for popular terms stopped. This caused an unnerving drop in revenue.
Checking Google
As usual, my first stop when this sort of thing happens is the online forums. I needed to check if this was a widespread issue that webmasters were facing. I did find a number of threads on the same problem, but not enough to verify that this was a serious bug on Google’s part. However, the small number of forum threads about a particular problem doesn’t necessarily mean that it isn’t a Google bug. After all, Google IS a machine, made up of hundreds of thousands of servers and many, many algorithms. And if your site is not a heavy-weight traffic generator, it’s unlikely that a Google bug affecting your site is going to be top-priority for the guys at Mountain View.
Checking Myself and My Sites
Once I had more or less determined that this wasn’t a widespread Google bug, I needed to make sure Googlebot didn’t stop visiting because of my own doing.
* Did Googlebot stop visiting due to a penalty?
The only thing I could think of that would possibly cause a penalty by Google was reciprocal links pages. I had stopped reciprocal linking a long time ago BUT maintained those pages out of courtesy to the sites that still kept their links to me. However, recent updates in Google’s Webmaster Guidelines suggest that reciprocal linking will hurt your site :
| Don’t participate in link schemes designed to increase your site’s ranking or PageRank. In particular, avoid links to web spammers or “bad neighborhoods” on the web, as your own ranking may be affected adversely by those links. |
I wasn’t sure if my reciprocal links pages were part of the problem, but in the interest of self-preservation, I decided to remove all reciprocal link pages from all my sites.
* Was my robots.txt file okay?
First, Google Webmaster Tools indicated that my robots.txt file was okay, although its cached version was about 30 days old. The long lapse between the current date and the last cached date of the robots.txt file SHOULD have indicated to me that Googlebot had problems accessing the file. The robots.txt validators that I used, listed below, showed there was nothing wrong with my file :
http://www.invision-graphics.com/robotstxt_validator.html
http://validator.czweb.org/robots-txt.php
Conclusion : My robots.txt file seemed okay.
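For readers who prefer a local check to an online validator, here is a minimal sketch using Python's standard `urllib.robotparser` module. The sample rules and the helper name `googlebot_allowed` are invented for illustration. It only tells you whether the rules would let Googlebot in, not whether Googlebot can actually reach the file, which turned out to be the real problem in my case.

```python
from urllib.robotparser import RobotFileParser


def googlebot_allowed(robots_txt, path):
    """Return True if the given robots.txt text lets Googlebot fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Googlebot", path)


# Hypothetical robots.txt content, used only for this example.
SAMPLE_ROBOTS = """User-agent: Googlebot
Disallow: /tmp/

User-agent: *
Disallow: /private/
"""

for path in ("/index.html", "/tmp/cache.html"):
    print(path, googlebot_allowed(SAMPLE_ROBOTS, path))
```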
* Was my sitemap file okay?
Google’s sitemaps can seem like an unruly monster to the uninitiated. Many webmasters have reported that the Google sitemap reports have been known to be buggy at times and report errors when there were none. In any case, I used the following tools to check the validity of my sitemaps and their schemas :
http://www.validome.org/google/
http://www.smart-it-consulting.com/internet/google/submit-validate-sitemap/
http://www.xml-sitemaps.com/validate-xml-sitemap.html
https://www.google.com/webmasters/tools/docs/en/protocol.html
http://www.sitemaps.org/protocol.php
Conclusion : My sitemap.xml file seemed okay and passed validation by the tools above.
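A rough local sanity check of a sitemap can also be sketched with Python's standard XML parser. This is not a replacement for the validators above. It assumes the 0.9 namespace published at sitemaps.org (older Google-specific sitemaps used a different namespace), and the function name `sitemap_problems` and the sample document are made up for the example.

```python
import xml.etree.ElementTree as ET

# Namespace from the sitemaps.org protocol (assumed to be the one in use).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def sitemap_problems(xml_text):
    """Return a list of human-readable problems found in a sitemap document."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    problems = []
    if root.tag != f"{{{SITEMAP_NS}}}urlset":
        problems.append(f"unexpected root element: {root.tag}")

    for index, url in enumerate(root.findall(f"{{{SITEMAP_NS}}}url"), start=1):
        loc = url.find(f"{{{SITEMAP_NS}}}loc")
        text = (loc.text or "").strip() if loc is not None else ""
        if not text:
            problems.append(f"url entry {index} has no <loc>")
        elif not text.startswith(("http://", "https://")):
            problems.append(f"url entry {index} is not an absolute URL: {text}")
    return problems


# Hypothetical sitemap used only for this example.
SAMPLE_SITEMAP = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc></url>
  <url><loc>/relative/page.html</loc></url>
</urlset>"""

print(sitemap_problems(SAMPLE_SITEMAP))
```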
* Was my .htaccess file okay?
My investigations revealed that incorrect coding in the .htaccess file could cause unnecessary looping (meaning I could have been sending Googlebot in circles and it threw up a red flag), so this had to be checked out. There were no known validators that I could use to verify that my .htaccess file was okay so I took my problem to the forums :
Since I had no experience in .htaccess coding and debugging, I had to rely on the more experienced webmasters who contribute to the forum listed above. Thankfully, all who looked at the content of my .htaccess file cleared it, saying there wasn’t any problem with my .htaccess file.
Conclusion : My .htaccess file didn’t seem to contain coding errors
* Were my pages blocking Googlebot?
It was unlikely that my pages were blocking Googlebot since I had not made major changes and Googlebot was visiting them up till the last visit on July 13, 2007. However, I needed to make sure I had not inadvertently placed any tag like the following in my pages :
Update : Another way you may be blocking Googlebot is if you load your pages with scripts that leave Googlebot “stranded”.
Conclusion : All my pages DID NOT contain any tags that would have blocked Googlebot
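The tag example from the original post did not survive here, so the sketch below rests on an assumption: that the tags in question are robots meta tags such as `noindex` or `nofollow`. Under that assumption, a page can be scanned for them with Python's standard `html.parser`. The class and function names are hypothetical.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect directives from <meta name="robots"> / <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() in ("robots", "googlebot"):
            self.directives.extend(
                part.strip().lower()
                for part in attributes.get("content", "").split(",")
                if part.strip()
            )


def blocking_directives(html):
    """Return the directives in `html` that would keep a crawler away (assumed set)."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return [d for d in finder.directives if d in ("noindex", "nofollow", "none")]


# Made-up page used only for this example.
SAMPLE_PAGE = '<html><head><meta name="robots" content="noindex, nofollow"></head><body></body></html>'
print(blocking_directives(SAMPLE_PAGE))
```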
Checking My Web Host Provider
The nature of the error in Google Sitemaps strongly suggested that Google was having problems accessing my site where other bots and search engines didn’t. It was a really perplexing situation. My web host initially replied that they had not blocked Googlebot, so it was left to me to find possible loopholes. I turned again to the boards. The difficult part initially was trying to find out the right question to ask. So I started by describing my problem. Then one by one, contributors made suggestions and I followed through on every one of those suggestions.
A contributor commented that using her header check tool, she found all my URLs were returning a 403 (forbidden) status. I have not placed a live link to her tool because she has since taken it offline for upgrading and maintenance. In any case, to verify the results her tool was giving me, I used another header check tool. Indeed using this tool, most of my URLs returned an “Operation Timed Out” error.
I then wondered if my site was the only one that was experiencing 403 Forbidden and time-out errors. I copied the URLs of all the clients listed on my web host’s “prominent clients” page and checked them with the 2 tools. Indeed, I found many of those sites returned the same 403 - Forbidden status on the first tool and timed out on the second tool. I reported this finding to my web host.
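A header check of the kind described above can be approximated with Python's standard `urllib`. This is a sketch, not a copy of either tool: the names `describe_status` and `check_headers` and the user-agent string are assumptions for the example. A check run from your own machine comes from your own IP address, so it cannot reproduce a block that a firewall applies to Googlebot's addresses.

```python
import urllib.error
import urllib.request

# User-agent string resembling Googlebot's; illustrative only.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


def describe_status(status):
    """Turn an HTTP status code into a short diagnosis."""
    if 200 <= status < 300:
        return f"{status}: reachable"
    if 300 <= status < 400:
        return f"{status}: redirect"
    if status == 403:
        return "403: forbidden (something on the server side is refusing the request)"
    if status == 404:
        return "404: not found"
    if 500 <= status < 600:
        return f"{status}: server error"
    return f"{status}: unexpected status"


def check_headers(url, timeout=10.0):
    """Send a HEAD request and describe the result, including time-outs."""
    request = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": GOOGLEBOT_UA}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return describe_status(response.status)
    except urllib.error.HTTPError as error:
        return describe_status(error.code)
    except OSError as error:  # covers URLError and socket time-outs
        return f"no response: {error}"


# Example call (needs network access, so it is left commented out):
# print(check_headers("http://www.example.com/robots.txt"))
```

Note that `urlopen` follows redirects on its own, so this sketch reports the status of the final page and not the 301 along the way.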
The first sign that I was probably on to something was when there was a reply from them stating they needed time to “conduct tests”. I signed up with a trial account with WatchMouse.com, a website monitoring service to see if any more timeout errors were popping up and indeed they were.

Again I reported this to my web host. Their reply stated that timeouts could be caused by many reasons other than the servers, which I accepted. However, I noted that they had requested their Security Team to investigate the matter, which was another indication that we were on the right track.
Hurray!
Checking my stats on August 30th, I was surprised to see that the server had registered over a hundred visits from Googlebot. I immediately contacted my web host and asked if they had made any changes and they confirmed that they did. So here’s what caused the inadvertent blocking of Googlebot according to them :
| Our firewall has an automated mechanism which will block IP addresses deemed to be making too many concurrent connections to our server in a short time. Our security department has whitelisted the google network range that is noticed to make these connections. On top of that we have made the firewall less stringent in the sense we will allow a higher threshold of concurrent connections compared to previously. Based on your feedback, the configuration is just right. It is not the server that has the problem but the datacenter network that is not reachable from certain locations. We have not change any settings at the time. However, it is possible that there are more users who use Google Sitemap, causing increased concurrent connections to the server. For the current issue, it appears that our firewall’s stringent policy has temporary block the bot. |
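To make the mechanism in that explanation concrete, here is a toy simulation. It is not my web host's firewall, whose rules I never saw. It assumes "too many concurrent connections in a short time" can be modelled as a cap on connections per IP within a sliding time window, with an optional whitelist. The class name, thresholds and IP address are all invented.

```python
from collections import defaultdict, deque


class ConnectionLimiter:
    """Toy model: block an IP once it exceeds a connection cap within a time window."""

    def __init__(self, max_connections, window_seconds, whitelist=()):
        self.max_connections = max_connections
        self.window_seconds = window_seconds
        self.whitelist = set(whitelist)
        self.recent = defaultdict(deque)  # ip -> timestamps of recent connections
        self.blocked = set()

    def allow(self, ip, now):
        if ip in self.whitelist:
            return True
        if ip in self.blocked:
            return False  # once blocked, the IP stays blocked
        window = self.recent[ip]
        while window and now - window[0] >= self.window_seconds:
            window.popleft()  # forget connections that fell out of the window
        window.append(now)
        if len(window) > self.max_connections:
            self.blocked.add(ip)
            return False
        return True


BOT_IP = "66.249.66.1"  # illustrative address only

# Strict policy: at most 5 connections per second from one IP.
strict = ConnectionLimiter(max_connections=5, window_seconds=1.0)
burst = [strict.allow(BOT_IP, tick / 10) for tick in range(10)]
print("allowed during burst:", sum(burst), "of", len(burst))
print("allowed much later:", strict.allow(BOT_IP, 100.0))

# Same burst after whitelisting the bot's address.
relaxed = ConnectionLimiter(max_connections=5, window_seconds=1.0, whitelist=[BOT_IP])
print("allowed when whitelisted:", all(relaxed.allow(BOT_IP, tick / 10) for tick in range(10)))
```

In this model a single burst is enough to lock the crawler out indefinitely, which matches what I saw: one day Googlebot was visiting hourly, then nothing for weeks.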
Lessons Learned
In hindsight, it occurred to me that modification to my .htaccess file could have caused an increase in the concurrent connections to the server. I had modified my .htaccess file to solve canonicalization problems by redirecting :
I theorize that since these redirections involved hundreds of URLs, it’s possible that when I deployed the changes in my .htaccess file in mid July, it triggered the “increase” in concurrent connections as the bots were redirected to the correct pages. In other words, Googlebot attempted to make 2 connections for every page - once to the old/non-www URL and then to the new/www URL. As the concurrent connections increased, it triggered the automated mechanism that blocked Googlebot’s IP address. This in turn caused more time-out errors. The spikes in the Googlebot Download Time chart (above) indicate long download times which eventually ended in timeouts. Unfortunately, this affected one of the most important files - the robots.txt file - which every bot needs to read before it accesses a site’s pages. These time-outs also made my sitemap inaccessible, so since Googlebot could not access these 2 important files, it could not confirm my site still existed!
However, my web host’s analysis of the situation brought them to the conclusion that more of their clients (like myself) had begun to use Google Sitemaps and this is the reason for the increase in concurrent connections. Whatever the reason, at last check, Googlebot has resumed its crawl of my site and I must say it is a welcome sight.

I’ve learnt a lot as I struggled with this problem and I hope this post helps some of you who may be wondering why Googlebot has stopped visiting your site or you are experiencing the “Network unreachable: robots.txt unreachable” error.
Interesting article - I’ve been having similar issues fetching robots.txt, so I wonder if they might be related. I’ll look into it when I get home
Hi Blogs for Money…
Hope you find the fault. When I asked on forums whether “network” meant the physical server infrastructure or website network, I got a variety of answers. One other webmaster I know also inadvertently blocked Googlebot with a setting in his server. Whatever the case, one thing I do know is that when Googlebot stops visiting, it leads to a whole lot of other symptoms and usually ends with a revenue drop. I hope you catch the problem before it hurts your earnings.
Andrew
Hello,
I am also having same problem.. google bot stopped visting one site .. there are other sites on the same server with different IP’s .. have no problem with google bot …
Only one site was working fine until last week but sitemap is getting rejected with the same error network unreachable…
Checked the server logs.. google bot is not coming at all..
Did not do any thing that prevents the google bot as you have mentioned ion your blog..
Any thoughts on this issue ..
Please share , if possible please email me …
Regards,
vrsane
Hi vrsane…
Hope you received my email reply, but for the benefit of other readers who might also be affected, I think the best way to deal with this is to troubleshoot systematically.
It’s a three-pronged approach. You have to be open that the error could be at Google’s end, your site or your web host, although the nature of the error seems to point to a problem which the web host has to deal with.
Andrew
[…] I’ve investigated this issue a bit and it seems to be similar to the case described in: Debugging The Network Unreachable / Robots. txt Unreachable Error - HomeWithAndrew.com The problem was related to the host/datacenter: […]
ok
i have the same problem
what shoud i do??
if there is any modification to .htaccess file
plz write it
thanks
hello bahgdad…
I must admit that .htaccess coding is really alien to me. My .htaccess contents are standard codes used to redirect my index.php to root and non-www pages to www. These were copied from examples in online forums and okayed by my web host support.
From what I’ve read, if you have NOT done anything to your website and the error suddenly popped up, the solution is likely to be found at your web host.
Hope it helps bahgdad…
thanks andrew
but im biggener
and not understand what you say
plz
if there is any suggestion to tell my host
bec i ask my host
he said the problem not from server
plz help

I’m sorry to hear your problem with Googlebot.
When I first asked my web host, they too said that it was not a problem on their server. So I had to do a lot of investigation and learning on my own. It took me three weeks before I knew WHAT questions to ask.
I understand that every situation is different when it comes to this error, so my best advice to you is to describe in detail what has happened, and ASK your web host to explain to you possible reasons WHY this error occurred. That way, you learn and take it from there.
I understand from Vamsee (the earlier commenter) that he checked his server logs and found that his firewall WAS indeed blocking 2 of Googlebot’s IPs.
Here’s another post about someone with the same problem and how he found the solution :
http://www.sabahan.com/2007/08/18/banned-from-google-find-out-how-to-entice-googlebot-to-recrawl-your-site/
If you don’t know how to explain to your webhost, perhaps you can point them to this post or the one above.
Hope it helps…
thanks
i will do so
and come back again
see you
[…] have three hosting providers and it wouldn’t be that big of an issue. Then I found a post by Andrew Shim , a very stubborn man who refused all the answers I just received and set out on his own to conquer […]
Hi,
I just found your site googling for “network unreachable.” I’m having the same problem - 140 of my pages are now unreachable according to my Google tools stats. This has been going on for two weeks. I’ve spent the last few hours testing everything that I can, and trying to think of what I installed on my site in the last two weeks. I just discovered I started having the issue the day after I installed BlogRush - blogrush.com. I uninstalled it a few days ago, so I’ll soon see if that has anything to do with this, or if it’s just a coincidence.
I’ve put a message out to my webhost to see if they can look into the problem. I don’t have access to my logs. My sitemap and robots.txt appear fine.
hello jeni…
I don’t think the BlogRush widget would cause this problem. As the error suggests, it is most likely an issue that your web host needs to deal with. Remember that the Google sitemaps errors may not be real-time. That means that your webhost may have been down while you were not aware and heaven forbid, that WAS the time Googlebot came visiting. This would result in a network unreachable error. You might want to use the tools mentioned in the post to help debug.
There’s also a free service called WatchMouse that will check your website every hour or so. You might want to use this to see if your website is accessible. Note that errors in accessing a website can occur at many points. Don’t assume it is just occurring at your webhost’s server.
As for your sitemap and robots.txt files (for the benefit of newbies), they may be viewable on your browser but you might still get the network unreachable error. Remember that this error (network unreachable) is reporting the status of Googlebot’s attempt to visit your website. As my webhost mentioned, it could be due to an automatic barring by their server. Unless you are using a free blogging site - like blogspot - you should be able to access your server logs. If the problem persists, it may be time to change to your own domain.
I hope the error clears up by itself - which has been reported before by other webmasters. This would indicate your server was down at the time of Googlebot’s visit.
Cheers Jeni!
Thanks for your response! According to the Google tools, my site was having trouble on about 5 separate days over the last 2 weeks, for a total of 147 unreachable urls. Google didn’t have trouble reaching the sitemap or robots.txt - just regular pages on my site.
I just signed up for WatchMouse. I’ve been using InternetSeer for the past 7 years on my other site, and that only monitors once an hour, but I’ve only had my site down maybe 3 of 4 times in those 7 years, according to InternetSeer.
I have my own domain, but I don’t have a private server. Since I have so many different scripts installed across all of my websites, I think moving hosts would be a bigger nightmare than the loss of Google traffic. I feel like I’m stuck. I’ve always liked my host for their high uptime %…
I did try all the other tools you suggested, and found no problems. However, in my .htaccess I also use the redirect to the www version of my site, but I’ve had this on my site til day one, and the google errors only started two weeks ago. Again I guess it points to a host issue.
The only other thing I was wondering is if something on my pages is causing the pages to load too slow for Google, causing the errors. Guess I’ll just play it by ear and hope my host comes up with something.
Sounds like you know what you’re doing Jeni…
If you have your own domain and it’s hosted on an Apache server, your webhost SHOULD allow you to download your server logs - it’s a common feature of every hosting package. If they don’t, me thinks it’s time to look for another host…
Since you say that it’s the regular pages that are affected, you need to check to see WHEN the errors were occurring. Only your log files can tell you this. When you know the exact time these errors were occurring, and what the errors were (404 etc) then you need to confirm this with your webhost. I know… NO webhost likes to admit fault immediately. Sometimes, it takes a bit of “convincing” before they admit….
The other possibility would be an internal issue on Google’s part. In all the forums that I asked this question, a number of webmasters also confirmed that they experienced similar problems which sort of resolved themselves. This would point to Google being the source of the problem itself.
However, when you mention that you have lots of scripts running, it does throw up a few red flags. A common webmaster rule is never to make a page script-heavy with too many calls to external scripts or even place scripts within multiple nested tables. It’s a long shot, but if the problem persists, what I would do is to create a test page that is script-free. Then add those scripts one by one, and wait to see if Googlebot throws up the same errors.
I hope you get to the root of the problem Jeni. Just out of curiosity, is this the URL you are talking about?
Yes this url is the one I’m having the problems with - savvyskin.com. I still haven’t heard back from my host, so I’m going to keep monitoring with WatchMouse, and check back every day with google tools. I’m trying to keep my blog as simple as possible, but I’ve installed a lot of plugins and widgets, which could always slow things down. This is my first blog, so I am still learning how to perfect it for speed and function. But in the end I have a feeling the problem is my host (or hopefully a google glitch). Thanks for your help!
Just a quick update. Even though my site doesn’t seem to go offline at all, WatchMouse has found all sorts of errors - Timeout during negotiation, Host name lookup failure, Broken pipe, while sending HTTP header, etc. There have been about 20 errors per day! Did you notice such dramatic errors on your site when you got WatchMouse?
At this point, I’m considering switching hosts just for my one site, but I’m extremely nervous about finding a new host, since the one I chose had excellent reviews when I chose it. And I’m worried it will be a nightmare switching my blog over - not sure if I need to reinstall it on the new server, etc.
Something I discovered with my own problem with the bot is that the robots.txt file can’t have blank lines at the end of it nor any blank spaces at the end of the last line. The czweb site that you direct folks to pointed this out to me as being the problem while the other site didn’t even mention it.
Thanks for the information.
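The check described in the comment above is easy to run locally. The sketch below only reports trailing blank lines and trailing spaces on the last line, as that validator apparently did. Whether Googlebot itself objects to them is not established in this thread, so treat it as a tidy-up check. The function name is made up for the example.

```python
def trailing_whitespace_issues(robots_txt):
    """Report blank lines at the end of the file and spaces at the end of the last line."""
    issues = []
    lines = robots_txt.replace("\r\n", "\n").split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # the single newline that normally terminates the file

    blank_tail = 0
    while lines and lines[-1].strip() == "":
        lines.pop()
        blank_tail += 1
    if blank_tail:
        issues.append(f"{blank_tail} blank line(s) at end of file")

    if lines and lines[-1] != lines[-1].rstrip():
        issues.append("last line ends with spaces or tabs")
    return issues


print(trailing_whitespace_issues("User-agent: *\nDisallow:\n"))
print(trailing_whitespace_issues("User-agent: *\nDisallow: \n\n\n"))
```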
yes… it was an eye opener for me too. I’ve been a programmer long enough to know that even a semi-colon (or lack of it) can throw everything out of kilter but come on… my first impression was… how the H*LL are we supposed to know a thousand and one syntax requirements for every language and file!
Sigh… live and learn….
It’s a picky little bastard.
Actually still can’t get Google to see my sites. *sigh*
I think I only noticed Googlebot stopped visiting 2 weeks after its last visit. Then it was about another 1 week of frantic searches on forums and web for possible solutions and another desperate 1 week of ding-dong with my webhost. All in all, I lost a couple hundred bucks of earnings when traffic disappeared.
It took a personal plea to the CEO of my webhost before they actually got their Technical Manager into the picture and still she (the Technical Manager) refused to admit that the problem was on their end. Fortunately, one of the Engineers was a really decent fellow with a conscience. He did his own research and found the problem.
What I’m saying is… Don’t give up drmike. Like everyone who’s posted their comments here, your problem probably has a different twist to it that you need to figure out systematically.
And yes…I really understand - it’s like staring at a blank wall everywhere you turn!
This thread is really helpful. I do have this to add. On EVERY website we’re running with Apple WebStar serving software, our robots.txt file and our sitemap.xml files are being read just fine. On EVERY website we’re running with Apache web serving software (two different servers), NONE of the robots.txt files appear to be read by Google — all indicate the infamous Robots.txt file unavailable error.
This can’t be a network issue because the servers are all on the same LAN and the same WAN.
I keep thinking it must be an Apache thing, but WHAT?
I believe Apache is the most popular serving software on the internet, so if it’s an Apache thing, gobs of people must be fighting this. Any Apache guru’s out there?
Thanks for the great input Stanton. I DID notice that and wondered if it could indeed be the cause. I’m running on Apache, but when checking other sites that returned a 301 error using the tools mentioned above, a number of them were on IIS.
Perhaps those who posted could leave a note to mention what server you’re on. Who knows… we might help thousands of other baffled webbies end their nightmare!
My site is on an Apache server, and Google doesn’t have a problem reading my robots.txt file.
I am still having the same “network unreachable” errors on a lot of my individual pages. WatchMouse is finding all sorts of errors almost every day on my site. Since I don’t have any sites on other hosts, I don’t know if it’s common for WatchMouse to find lots of errors, or if my host is indeed really bad.
I’ve been briefly researching new webhosts, but am extremely nervous to make a move. I haven’t figured out if I have to reinstall my WordPress software and all my plugins, and have a new Mysql database created at the new host, etc. And I’ve been reading glowing reviews, and then horrible reviews, about different hosts, and I can’t find any with mostly positive reviews.
My host has written back to me, and said they made some changes to my server to try to make it run better. But, since I’m on a shared host, they didn’t have much else to offer me, other than moving to my own server. But since my site is not making much money, I’d probably pay more in hosting than I’m making.
I’m not sure if the errors are really affecting my Google traffic. In the end, I wonder what harm having my pages “unreachable” is doing, if my pages aren’t actually dropping out of Google.
Hello Jeni… please remember that data has to travel through many points from your server to the visitor and errors occurring at any one of these points CAN throw up an error on WatchMouse.
Finding a new webhost can be very scary if it’s your first time, but generally speaking, when you decide to move, your current webhost SHOULD port over your WHOLE site (as is) to the new host. This means that they will send over a DUPLICATE of your site to the new host. They then will make the necessary changes to the DNS and wait for it to propagate. All in all, moving webhosts shouldn’t take any more than 3 days.
Yes, everybody will have different views about webhosts. Here’s how I checked on my webhost prior to moving:
- I contacted their Live Support chat and asked questions. If they’re a good host, they should be online almost immediately.
- I asked technical questions. If they can explain everything and clear all my doubts (without sounding like they have better things to do) then they’ve scored another point.
- I then asked if they would handle all the necessary tasks to port my site over and they confirmed everything would be handled according to procedure.
- Then I visited their website and forums to gauge if they were a reliable host. What you don’t want to do is to go with a reseller.
Tell your webhost that the site is running fine, but it may be blocking Googlebot. Check your affected pages against HTML validators and see if they pass, although Googlebot is pretty lenient about this.
If the Network Unreachable errors are recent, then ask your webhost to provide access to the server logs and go through them line by line for the affected dates. See what the error code is.
Ask your webhost to check their firewall settings.
In the meantime, post questions on forums and http://www.google.com/support/webmasters/. I didn’t find the answer directly by doing this, but it led me slowly as to WHAT to look for, WHAT questions to ask and WHERE to ask them.
Thanks !
I reached your blog using google trying to find a solution for this problem.
At last i found the firewall is blocking googlebot from accessing my server.
I unblocked google’s ip from the the black list and i am waiting for google coming visits.
Thanks for your detailed post !
10/10
Great Rasheed! My sources tell me that some web hosts set a low threshold for concurrent visits by the same bot to prevent abuse. In fact, many website owners ARE experiencing this problem but don’t even realize it! They just chalk it up to “being penalized” by Google.
Just a quick update! I had sort of forgotten about my “unreachable” problem because despite having 100 unreachable URLs at any given time, my Google traffic had tripled in the last month or two, so I was finally happy.
Last week, my site went offline for an entire day, and I had enough, and switched hosts finally. I just checked, and I went from having 100 unreachable URLs to 3! Woohoo! It’s not making any difference in my Google traffic, but it’s a relief nonetheless!
Case closed!
Wow! That’s great news Jeni! Wasn’t that much of a nightmare switching hosts right? However, now that you’re on a new server, you need to keep an eagle eye on your stats. If you’re using Google Sitemaps, then go to Tools > Set Crawl Rate and you will find 3 graphs that show you the number of pages Google has crawled, the number of kB it has downloaded a day and the Time spent downloading. The third one is important. If you see erratic spikes in this graph, it means Googlebot is having problems accessing your pages.
Hope all is smooth sailing from here on Jeni!
Yeah switching hosts ended up being really easy. I thought I would have to reinstall WordPress and all the plugins, but it turns out I didn’t have to do any of that!
I just checked the graphs, and there’s only one giant spike in the last 90 days. I’ll keep monitoring it to see how it pans out with the new host. Thanks!
This great, thanks a lot Andrew. I had exactly same problem. Lil bit headache, I thought it was my hosting firewall block googlebot IP. Then I read this article, I follow all the steps then violaa…, no more errors on google webmaster tools sitemap. Thanks…
Thank you. I was pulling my hair out at this. Great post. I am on to my hosting provider about this.
Heh… just a reminder - please check things on your end first. I hate being the one that points the finger at webhosts… I hope things work out for you…
Currently! I have same problem with my site. I found your article very interesting. I will try
Thanks man
i cant figure out this xml sitemap i cant not get indexed at all
Hi Avery…
There is a difference between an ordinary sitemap and an XML sitemap. An ordinary sitemap is just an ordinary (HTML) page that lists the links to subsections and pages. An “XML sitemap” is a file written in XML that conforms to the Sitemap protocol (see http://www.sitemaps.org/protocol.php) and lists your URLs for search engines to crawl. It is not the same thing as an RSS feed, which is also an XML file but is basically used by RSS aggregators to “feed” web users with your content without them having to actually visit your site. In order to read a feed, web users will need to use an RSS reader. You can Google “RSS readers” to learn more about RSS feeds.
As far as your site is concerned, I searched for “snipersuitghillie” and found 38 entries in Google, with your Home page at PR4. This means that you HAVE indeed been indexed by Google, so you are doing something right! Don’t worry. Getting the hang of SEO and Google takes some time.
All the best.
Hey Andrew,
Your post really helped me! I have a lot of sites and was getting the exact “network unreachable” and “sitemap error because of robots.txt” that you mentioned. I just forwarded this url to my webhost and hopefully they can fix it in their firewall rules. Your analysis was spot on and I really appreciate you spelling it all out. If you had some adsense ads on here I’d click them just to make you some money, but alas you only have the one webhosting ad and I’ve already got that…
hi hard money… glad you found the post helpful.
I now make it a habit to visit Webmaster Tools to check my sites’ status and possible errors. Doing that daily is a heck of a lot better than getting one gigantic heart attack when traffic suddenly stops!
Adsense doesn’t do too well on HomeWithAndrew.com so I pulled them out long ago. But your kind intentions have made me think about putting a “tipping jar” script so’s nice folks like you can buy me a Big Mac! Cheers!
[…] may have to contact your web hosting provider but read this first Debugging The Network Unreachable / Robots. txt Unreachable Error - HomeWithAndrew.com __________________ Devanand […]
Andrew, this thread has been my living nightmare for the past 10 days. I have experienced all the problems above and then some. I have done everything to try and get Google and my miserable host–HostGator–to play nicely and help me work through a solution, but they’re acting like spoiled brats, blaming each other and pointing the finger. I’m at the end of my rope and I want OUT!!!! I cannot run my business this way. I do not want to spend the next week teaching HostGator how to do their job. I have simply given up and want a new Host. PLEASE!!!! Tell me a Host that most people like and has already resolved the issues illustrated here. If anyone knows a decent host, please let me know. I’ve got 25 blogs that are sitting there with no traffic because Googlebot can’t access my plain-as-day, well-written sitemap.xml file (server time-outs) and breaks when it sees my simple, generic Google-supplied robots.txt file.
Desperate in Boston.
Wow Jasper… 25 blogs and no visitors. That is really the pits. To be fair, I have NEVER met a System Admin who admits the fault could be on their end until they exhaust all possible options. By then, it may be too late to say “I told you so”.
Let’s get down to your problem. If I were you, I would first transfer ONE or TWO domain(s) out to any other host. Then you know the drill. Put out some links or get your pals to put out some links pointing to your site. Hard as it may be, sit tight and wait for Googlebot. Continue to monitor the situation for a couple of days (yeah… it’s nerve wracking). If you’re sure that Googlebot is spidering your (transferred) blog without problems, then chances are you are right and there is a problem over at your present host that they may be unable (or unwilling) to fix. However, IF you still can’t get Googlebot to visit after your domain transfer, you will need to check 1001 other things that may have gone wrong.
As far as I’m concerned (from the way they handled my problem above), my present webhost is as reliable and professional as I can expect. You might want to give them a try. Your 2 options are :
1. The banner on the right leading to Exabytes.com. This is an affiliate ad. If you prefer not to go via my affiliate link, feel free to
2. Go direct to http://www.exabytes.com
I really hope you get out of this mess soon Jasper!
Hey Andrew…thank you very much for your excellent advice. I am definitely going to take it… starting right now. The latest update from my current host, HostGator, is that they suggest I try to verify my site from a different location using a different ISP. They suggest that maybe my current ISP, Verizon, is dropping packets in the transfer. I suspect they believe this because they do not block any bots or spiders coming from Google IPs, and they have been able to verify my sites without an issue. I have three PCs at home–two that connect wirelessly and one direct-connected. I’ve tried all three PCs and used both Firefox and Internet Explorer, all without success. The odd thing is that a month ago, I had no problem verifying my sites in Webmaster Tools. And a strange new update is this: HostGator verified several of my sites for me, and these sites showed ‘no errors’ for the sitemap.xml. Google has since spidered the file and now reports errors. Some of these sites only have two or three pages, and I had rebuilt and resent the sitemap.xml file prior to HostGator verifying it, so this can’t be a problem on my end. This has to be some problem in communication occurring between Google and HostGator. Well, anyway…I’m going to test the waters with a new host, as per your suggestion, and see how that goes. This thread, by the way, is excellent and I really appreciate your rapid feedback, Andrew! Much appreciated, mate!
Hi Andrew, Great post, thanks for this! I am experiencing the same problem. I am in contact with my ISP now and they want to know the google network range so they can whitelist it. Do you know how I can provide them with this?
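One way a host can recognise Googlebot without depending on a fixed list of addresses is the reverse-then-forward DNS check that Google documents for verifying its crawler. The Python below is only a rough sketch of that idea, not anything taken from the post or from a host's actual firewall. The function name `is_googlebot`, the injectable `reverse` and `forward` parameters, and the suffix list are assumptions made for illustration, and the default forward lookup only handles IPv4.

```python
import socket
from typing import Callable, List

# Hostname suffixes assumed for Google's crawlers (illustrative; check
# Google's current documentation before relying on this list).
GOOGLE_CRAWLER_SUFFIXES = (".googlebot.com", ".google.com")


def _reverse_lookup(ip: str) -> str:
    """Return the PTR hostname for an IP address."""
    return socket.gethostbyaddr(ip)[0]


def _forward_lookup(host: str) -> List[str]:
    """Return the IPv4 addresses a hostname resolves to."""
    return socket.gethostbyname_ex(host)[2]


def is_googlebot(
    ip: str,
    reverse: Callable[[str], str] = _reverse_lookup,
    forward: Callable[[str], List[str]] = _forward_lookup,
) -> bool:
    """Hypothetical helper: reverse DNS, suffix check, then forward confirm."""
    try:
        host = reverse(ip).rstrip(".").lower()
    except OSError:
        # No PTR record or lookup failure: treat as unverified.
        return False
    if not host.endswith(GOOGLE_CRAWLER_SUFFIXES):
        return False
    try:
        # The hostname must resolve back to the same IP, otherwise the
        # PTR record could have been spoofed.
        return ip in forward(host)
    except OSError:
        return False
```

A host could run a check like this against the addresses in its firewall's block log to see whether any of the blocked visitors were genuine Google crawlers.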
[…] having read an interesting post on the subject, though dated 2007, I found several useful answers in the Google forum and in […]
[…] Debugging the Network unreachable / Robots.txt unreachable error […]
I was facing the same error and I also contacted my host, but they said they are not blocking any IP range. I found that my default WordPress privacy settings were blocking search engines from crawling my site, so I changed that option to allow all search engines. Other search engines like Yahoo and Bing are crawling now, but not Google. Even Google ads are not showing properly: they are showing public service ads on my site, even though my content is original and not adult.
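When other crawlers get through but Googlebot does not, one thing worth ruling out is a robots.txt rule that singles out Googlebot or blocks everyone. The sketch below uses Python's standard `urllib.robotparser` to test robots.txt text offline. The helper name `crawler_allowed` and the three sample rule sets are made up for illustration and are not the commenter's actual file.

```python
from urllib.robotparser import RobotFileParser


def crawler_allowed(robots_txt: str, user_agent: str, path: str = "/") -> bool:
    """Hypothetical helper: would this user agent be allowed to fetch path?"""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


# Illustrative rule sets (assumptions, not taken from any real site).
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
ALLOW_ALL = "User-agent: *\nDisallow:\n"
BLOCK_GOOGLE_ONLY = (
    "User-agent: Googlebot\n"
    "Disallow: /\n"
    "\n"
    "User-agent: *\n"
    "Disallow:\n"
)

if __name__ == "__main__":
    for name, rules in [
        ("block all", BLOCK_ALL),
        ("allow all", ALLOW_ALL),
        ("block Googlebot only", BLOCK_GOOGLE_ONLY),
    ]:
        print(name, "-> Googlebot allowed:", crawler_allowed(rules, "Googlebot"))
```

Pasting the live robots.txt contents into a check like this shows whether the file itself is the obstacle. If it passes, the cause is more likely on the network or server side, as described in the post.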
Great post. I am experiencing the exact same problem with only a few of my pages (as compared to all pages, which makes it a bit more difficult to debug) and this was very helpful.
To my mind, the network unreachable problem occurs when Googlebot tries to make too many concurrent connections and the web server temporarily denies it access. Google then reports this as network unreachable.
Hi, Nice post. I too had a similar problem. I added the following Meta Tags on all my pages and it worked out fine.
I too had the same problem. I added the following meta tags to all my pages and it all worked out fine.
<meta name="GOOGLEBOT" content="INDEX, FOLLOW" />
<meta name="robots" content="noodp,index,follow" />
<meta name="revisit-after" content="7 days" />