Canonicalization. That’s a mouthful. It’s a Google term that many website owners are not familiar with, even though they have likely experienced its effects. This might be a fairly “techie” sort of post, but I guarantee that if you are serious about making money with your website and you depend on Google for traffic, you will want to take note of URL canonicalization.

Here’s how Matt Cutts defines it :

… it’s a strange word; that’s what we call it around Google. Canonicalization is the process of picking the best url when there are several choices, and it usually refers to home pages.

Here’s the problem.
We tend to use different URLs when we link to our homepage in our navigation system, marketing activities and emails. To the uninitiated, these URLs mean the same thing :

URL canonicalization

However, Matt Cutts puts it this way :

…But technically all of these urls are different. A web server could return completely different content for all the urls above. When Google “canonicalizes” a url, we try to pick the url that seems like the best representative from that set.

URL canonicalization

The effect.
Google recognizes all the URL variations as different, but having the same content (your homepage). Technically this means that you have many different URLs with duplicate content, so Google picks a URL that they feel is the best and ignores / filters the rest.

Now, let’s assume you have been focussing all your marketing efforts using www.homepage.com and you are ranking for multiple keywords and phrases with this URL. If Google canonicalizes your URL and somehow picks www.homepage.com/index.html as THE best representation of your homepage content, they will treat www.homepage.com as duplicate content and ignore it.

So how will webmasters feel the effects of URL canonicalization? Since Google has chosen to ignore www.homepage.com, webmasters may suddenly find that they’ve been dropped from Google’s search results for the keywords they once ranked for. It’s as if their website has ceased to exist overnight. They may search for their domain name and find that www.homepage.com doesn’t appear anymore, while www.homepage.com/index.html or www.homepage.com/home.php shows up in the results instead.

How to fix the problem.
Surprisingly, the solution is simple and is supposed to be standard practice. There are 2 main things that you want to accomplish :

  1. redirect all non-www versions of pages to the www versions (or vice versa if you choose to use non-www URLs)
  2. redirect all variants of your homepage URLs to ONLY one main URL.
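
To make the two goals concrete, here is a minimal Python sketch of the mapping the redirects are meant to enforce. It is only an illustration, not server configuration: the host names and the set of index file names are assumptions taken from the examples in this post, and `canonical_url` is a made-up helper name.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical settings for illustration; substitute your own values.
CANONICAL_HOST = "www.homepage.com"
BARE_HOST = "homepage.com"
INDEX_FILES = {"index.html", "index.php", "home.php"}


def canonical_url(url):
    """Map a URL variant to the single preferred form."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    # Goal 1: send the non-www host to the www host.
    if host == BARE_HOST:
        host = CANONICAL_HOST
    # Goal 2: collapse homepage variants to the root path.
    path = parts.path or "/"
    if path.lstrip("/") in INDEX_FILES:
        path = "/"
    return urlunsplit((parts.scheme, host, path, parts.query, parts.fragment))


if __name__ == "__main__":
    for variant in ("http://homepage.com",
                    "http://www.homepage.com/index.html",
                    "http://homepage.com/example.html"):
        print(variant, "->", canonical_url(variant))
```

Every homepage variant ends up at one URL, while other pages keep their path and only change host. The 301 redirects described next ask the web server to apply the same mapping.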

You will need to create a 301 (permanent) redirect that instructs the server to send requests for any variant of your homepage URL to www.homepage.com. If you have been using different URLs to point to your homepage and you find yourself facing canonical issues, then it is imperative that you do a 301 redirect immediately.

URL 301 redirect

You can modify your .htaccess file to redirect your index file (index.html or index.php) to your root directory. The .htaccess file should be located in the root of your website. If it does not exist, then you will need to create a new .htaccess file. Your web host should be able to do a 301 redirect for you free of charge. If they don’t, it’s time to change web hosts. Here’s how the .htaccess file for one of my websites looks :

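# Switch the rewrite engine on before any rules
RewriteEngine on

# Redirect index.php to domain.com
# (THE_REQUEST is the browser's original request line, so only direct requests for /index.php match)
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.php\ HTTP/
RewriteRule ^index\.php$ http://www.yourdomainname.com/ [R=301,L]

# Redirect non-www requests to the www version
RewriteCond %{HTTP_HOST} ^yourdomainname\.com [NC]
RewriteRule ^(.*)$ http://www.yourdomainname.com/$1 [L,R=301]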

The first section instructs the server to do a 301 redirect from www.homepage.com/index.php to the root, www.homepage.com/.

The second section tells the server to redirect all non-www URLs (eg. homepage.com) to www versions (eg. www.homepage.com). This means that when anyone requests homepage.com/example.html or any other URL without the www, the server will redirect it to the www version - www.homepage.com/example.html.
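
If the regular expressions look cryptic, the following Python sketch imitates what the two rules decide for a given request. It is a rough model under stated assumptions, not how Apache works internally: `simulate` is a made-up function, the patterns are adapted from the rules above, query strings are ignored, and the leading slash is stripped from the path to mimic how .htaccess rules see it.

```python
import re

# Patterns adapted from the .htaccess example (yourdomainname.com is a placeholder).
THE_REQUEST_RE = re.compile(r"^[A-Z]{3,9}\ /index\.php\ HTTP/")
HOST_RE = re.compile(r"^yourdomainname\.com", re.IGNORECASE)  # [NC] = ignore case
INDEX_RE = re.compile(r"^index\.php$")


def simulate(request_line, host):
    """Return the 301 target the two rules would produce, or None for no redirect."""
    # "GET /example.html HTTP/1.1" -> "example.html"
    path = request_line.split(" ")[1].lstrip("/")
    # index.php rule: only a direct request for /index.php is redirected to the root.
    if THE_REQUEST_RE.match(request_line) and INDEX_RE.match(path):
        return "http://www.yourdomainname.com/"
    # non-www rule: any path on the bare host is redirected to the www host.
    if HOST_RE.match(host):
        return "http://www.yourdomainname.com/" + path
    return None


if __name__ == "__main__":
    print(simulate("GET /index.php HTTP/1.1", "www.yourdomainname.com"))
    print(simulate("GET /example.html HTTP/1.1", "yourdomainname.com"))
    print(simulate("GET / HTTP/1.1", "www.yourdomainname.com"))
```

A request that already uses the www host and the root path matches neither rule, which is what you want: the canonical URL itself is never redirected.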

After you’ve completed the 301 redirect - it should be done for all your sites - you need to make it a habit to use only ONE URL when linking to your homepage in your website navigation and hyperlinks, emails and marketing collateral. While you can’t control how other people will link to you, controlling your own links should establish consistency and hopefully you will never ever wake up one morning to find your homepage missing from Google’s search results pages because of canonical issues.
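
One way to build that habit is to scan your own pages for links that still use a homepage variant. The sketch below is a hypothetical audit helper, assuming the example host names used in this post; it only inspects absolute links in `<a href>` attributes and skips relative ones.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Hypothetical values for illustration; substitute your own.
CANONICAL_HOME = "http://www.homepage.com/"
HOME_HOSTS = {"homepage.com", "www.homepage.com"}
HOME_PATHS = {"", "/", "/index.html", "/index.php", "/home.php"}


class HomeLinkAudit(HTMLParser):
    """Collect links that point at the homepage without using the canonical URL."""

    def __init__(self):
        super().__init__()
        self.offenders = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        parts = urlsplit(href)
        is_home = parts.netloc.lower() in HOME_HOSTS and parts.path in HOME_PATHS
        if is_home and href != CANONICAL_HOME:
            self.offenders.append(href)


def audit(html):
    """Return the non-canonical homepage links found in an HTML string."""
    parser = HomeLinkAudit()
    parser.feed(html)
    return parser.offenders


if __name__ == "__main__":
    page = ('<a href="http://www.homepage.com/">Home</a> '
            '<a href="http://homepage.com">Home</a> '
            '<a href="http://www.homepage.com/index.html">Home</a>')
    print(audit(page))
```

Anything the audit reports is a link you control and can change to the one URL you have chosen.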

Closing thoughts

URL canonicalization

Having your main URL drop out of Google’s SERPs will definitely leave a lasting impression financially. Take a look at the two blips in the chart (above) showing my AdSense earnings. The first blip was caused by my failure to redirect non-www URLs to their www versions. The homepage went missing for three weeks. Then I did a 301 redirect, but it only solved the non-www to www problem. The second blip, two months later, was caused by my failure to redirect my index.html page to the main URL. I have since gone through my .htaccess file and made sure I covered all possible effects of URL canonicalization.

I have to admit that I kept skipping threads on online forums about canonical issues because I thought it was a techie issue that I didn’t need to consider. I was proven wrong. It’s easy to get complacent and forget to keep up with what’s happening on the SEO scene. Keeping your website ranked in the search engine results involves lots of interconnected issues that you MUST be aware of. Canonicalization is one of them. Get familiar with the issue if you are depending on your websites to make money.

