If you’re serious about your websites – and you should be if they’re making you money – then you must monitor your server logs. This is especially important if you see a sudden spike in traffic in your stats.
Here’s something you might not be aware of – not all bots are good. I’ve been having problems with one particular bot these past 2 months. Last month, when I noticed a huge spike in hits, I peeked into my logs and noticed a bot called FunWebProducts hitting my sites and eating my bandwidth. At first, I thought “cool… this bot is picking up my content”. Then whammo… I checked to see where this bot was from and it turns out FunWebProducts is NO GOOD.
Bad bots do harm to you in many ways :
- Some bots are designed to scrape your content – copy and steal your content.
- They eat up bandwidth – remember you pay for your bandwidth.
- If they hit you often enough, they can choke your bandwidth and real visitors will have problems accessing your site.
- They make your server log files look like one long piece of s**t. It’s tough enough looking at your log file without having to see this crap popping up in every other line.
So the good thing is that FunWebProducts made me learn 2 ways I can block people or bots from accessing my websites.
The first way is to deny specific IP addresses or blocks of IP addresses by entering those IP addresses in your CPanel > IP Deny Manager panel.
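The same kind of IP block can also be written by hand in .htaccess. Here is a minimal sketch, assuming Apache 2.2-style access directives. The addresses are placeholders from the reserved documentation ranges, not real FunWebProducts addresses:

```apache
# Hypothetical example: placeholder addresses only
Order Allow,Deny
Allow from all
# Block a single IP address
Deny from 192.0.2.15
# Block a whole range using CIDR notation
Deny from 198.51.100.0/24
```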
Now, the problem is that FunWebProducts has way too many IP addresses to block, so I checked if there was another way to block the bot by name and indeed there were a couple of ways. I found that the simplest way to block bots was to use your .htaccess file to do it.
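A user-agent block in .htaccess generally looks something like the sketch below. It assumes mod_setenvif is enabled and Apache 2.2-style access directives. The variable name bad_bot is just an illustrative label, and these are not necessarily the exact rules deployed on this site:

```apache
# Flag any request whose User-Agent header contains "FunWebProducts" (case-insensitive)
SetEnvIfNoCase User-Agent "FunWebProducts" bad_bot

# Allow everyone except requests flagged above
Order Allow,Deny
Allow from all
Deny from env=bad_bot
```

On Apache 2.4 and later, the Order/Allow/Deny lines are replaced by Require directives, for example `Require not env bad_bot` inside a `<RequireAll>` block.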
I’ve just deployed the changes to my .htaccess file. I’ll post an update later.
Update (20 January 2007) :
Apparently, the second method above failed to block the FunWebProducts bot from accessing my site. I contacted my web host and they helped me add one line to my .htaccess file :
|SecFilterSelective HTTP_USER_AGENT "FunWebProducts" "deny,log"|
This seems to have worked, going by my server logs. Notice the 403 Forbidden status resulting from FunWebProducts trying to access my homepage :
|198.xx.202.xx - - [20/Jan/2008:00:34:12 +0800] "GET / HTTP/1.1" 403 - "http://www.google.co.za/search?sourceid=navclient&ie=UTF-8&rlz=1T4GGLR_enZA229ZA229&q=making+money" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; FunWebProducts; .NET CLR 1.1.4322; .NET CLR 2.0.50727)"|
Now, the problem with throwing a blanket block on FunWebProducts is that any visitor using any product or tool from FunWebProducts (who may genuinely want to visit my sites) will see a 403 Forbidden page. I will monitor my stats and see if this adversely affects my earnings. If it does, I may be forced to find some sort of compromise and let FunWebProducts through even though it is bloating my log files terribly.
Update (21 January 2007) :
Lots of people have inadvertently downloaded toolbars from FunWebProducts without knowing the full consequences. I found some resources that you might want to read :