The Canonical Link Element (also referred to as the “Canonical tag”) has been around for over two years now and has received a bit of attention lately as more and more websites use this powerful link to address a variety of duplicate content issues. But there are clear instances when the Canonical tag should be used and times when it should be avoided.

I like to think of the Canonical Link Element as the “spare wheel” of website architecture. It works in an emergency, but it comes with performance issues. Still, if you have ever had a flat tire, you know that putting on the spare and driving slowly is far better than being stranded on the side of the road.

Duplicate content occurs when multiple URLs return exactly the same content. Examples include www vs. non-www domains, adding /index.html to the home page URL, capitalizing URLs, session IDs, tracking codes from marketing campaigns and many, many others. Search engines really frown upon duplicate content: it costs them database resources, causes confusion about which URL is the valid one and can fragment PageRank. As a result, websites can suffer in search engine rankings if duplicate content infractions are substantial.
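To make the list of variants above concrete, here is a minimal Python sketch that groups URLs which would likely serve the same page. The domain, the parameter names in `IGNORED_PARAMS` and the function names are all hypothetical, and the lowercasing of paths assumes a server that treats URLs as case-insensitive; your own site may behave differently.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical parameter names; real sites use their own.
IGNORED_PARAMS = {"sessionid", "utm_source", "utm_medium", "utm_campaign"}


def normalize(url: str) -> str:
    """Collapse common duplicate-URL variants into one comparison key."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]  # treat www and non-www as the same site
    path = parts.path.lower()  # assumption: the server ignores URL case
    if path.endswith("/index.html"):
        path = path[: -len("index.html")]  # /index.html -> /
    if not path:
        path = "/"
    query = urlencode(
        [(k, v) for k, v in parse_qsl(parts.query) if k.lower() not in IGNORED_PARAMS]
    )
    return urlunsplit((parts.scheme, host, path, query, ""))


def group_duplicates(urls):
    """Return only the keys that more than one raw URL collapses to."""
    groups = defaultdict(list)
    for url in urls:
        groups[normalize(url)].append(url)
    return {key: found for key, found in groups.items() if len(found) > 1}
```

Running `group_duplicates` over the URLs found in a crawl or a server log is one rough way to see how many addresses are competing for the same page before deciding how to fix them.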

Back in February of 2009, the big three search engines (Google, Yahoo and Microsoft, but let’s be current and just say Google and Bing) came together to recognize the Canonical link element as a way to deal with duplicate content problems. Jill Whalen famously stated that “developers keep SEOs in business,” and there are many ways suboptimal coding can cause duplicate content. But beyond the myriad developer and coding issues that can cause duplication, the problem is exacerbated when marketing teams append tracking parameters to URLs in advertising campaigns.

One piece of advice given by Matt Cutts regarding the use of the Canonical link element was that it is “far better” to avoid duplicate URLs in the first place and to “exhaust alternatives” before turning to the Canonical element. For instance, the “www” vs. “non-www” URL duplication is best handled on the server if possible. Likewise, a consistent internal linking structure and a consistent XML sitemap should be enforced by your website’s code, not by the Canonical link element.
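As an illustration of handling the “www” question on the server, here is a small Python sketch written as WSGI middleware. It assumes the application serves a single site, that `www.example.com` is the preferred host, and that a permanent (301) redirect is the chosen fix; the function name `force_www` is hypothetical. In practice this kind of redirect is often configured in the web server itself rather than in application code.

```python
from urllib.parse import quote


def force_www(app, canonical_host="www.example.com"):
    """Wrap a WSGI app so every other host name redirects to canonical_host."""

    def middleware(environ, start_response):
        host = environ.get("HTTP_HOST", "").lower()
        if host and host != canonical_host:
            scheme = environ.get("wsgi.url_scheme", "http")
            # PATH_INFO arrives decoded, so re-quote it for the Location header.
            path = quote(environ.get("PATH_INFO", "") or "/")
            location = f"{scheme}://{canonical_host}{path}"
            query = environ.get("QUERY_STRING", "")
            if query:
                location += "?" + query
            start_response("301 Moved Permanently", [("Location", location)])
            return [b""]
        # Already on the preferred host: pass the request through untouched.
        return app(environ, start_response)

    return middleware
```

Because the duplicate address never returns a page of its own, there is nothing left for a Canonical link element to clean up in this case.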

The Canonical link should be reserved for the limited set of situations that are often out of developers’ hands. For example, very difficult duplicate content issues can be caused by tracking code parameters, inconsistent links from external domains and restrictive content management systems that create unique URLs for non-unique pages, such as the same listing in a different sort order.
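For the cases above where the extra URLs cannot be prevented, the page itself can declare its preferred address. The Python sketch below builds the element for a requested URL by dropping parameters that do not change the content. The parameter names in `NON_CONTENT_PARAMS`, the domain and the function name are assumptions for illustration only; which parameters are safe to drop depends entirely on your site.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical parameters that change the URL but not the page content.
NON_CONTENT_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sort"}


def canonical_link(requested_url: str) -> str:
    """Return a <link rel="canonical"> element for the requested URL."""
    parts = urlsplit(requested_url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in NON_CONTENT_PARAMS
    ]
    href = urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))
    # Escape the URL so a remaining "&" is valid inside the HTML attribute.
    return f'<link rel="canonical" href="{escape(href, quote=True)}" />'


print(canonical_link("http://www.example.com/shoes?sort=price&utm_source=newsletter"))
# prints: <link rel="canonical" href="http://www.example.com/shoes" />
```

The resulting element belongs in the `<head>` of every variant of the page, so that each tracked or re-sorted URL points back to the same preferred address.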

Although it serves a good purpose, the Canonical tag should be used strategically and only when necessary. It is a crutch with a limited scope of uses.

by John Sherrod