The Canonical Link Element (also referred to as the “Canonical tag”) has been around for over two years now and has received a bit of attention lately as more and more websites utilize this powerful link to respond to a variety of duplicate content issues. But there are clear instances when the Canonical tag should be used and times when it should be avoided.

I like to think of the Canonical Link Element as the “spare wheel” of website architecture. It works in an emergency, but it comes with performance issues. Still, if you have ever had a flat tire, you know that hooking up the spare and driving slowly is far better than being stranded.

Duplicate content occurs when multiple URLs return the exact same content. Examples include www vs. non-www domains, adding /index.html to the home page URL, capitalizing URLs, session IDs, tracking codes from marketing campaigns and many, many others. Search engines really frown upon duplicate content: it costs them database resources, causes confusion about the validity of URLs and can fragment PageRank. As a result, websites can suffer in search engine rankings if duplicate content infractions are substantial.
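To make the problem concrete, here is a minimal Python sketch showing how several of the variants listed above collapse into a single page. The domain, the parameter names in `IGNORED_PARAMS`, the choice of the www host and the decision to lowercase paths are all assumptions made for illustration; your own site's rules may differ.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical tracking/session parameter names; real sites have their own list.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid"}


def normalize(url: str) -> str:
    """Collapse common duplicate-URL variants into one form."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if not host.startswith("www."):
        host = "www." + host  # assumption: the www host is the preferred one
    path = parts.path.lower()  # assumption: paths are case-insensitive on this site
    if path.endswith("/index.html"):
        path = path[: -len("index.html")]  # keep the trailing slash
    if not path:
        path = "/"
    # Drop parameters that do not change the content of the page.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    ]
    return urlunsplit((parts.scheme.lower(), host, path, urlencode(query), ""))


variants = [
    "http://example.com/",
    "http://www.example.com/index.html",
    "http://www.example.com/INDEX.html",
    "http://www.example.com/?utm_source=newsletter",
    "http://www.example.com/?sessionid=abc123",
]
unique = {normalize(u) for u in variants}
print(unique)
```

Five URLs, one page. A search engine sees five candidates unless something tells it otherwise.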

Back in February of 2009, the big three search engines (Google, Yahoo and Microsoft, but let’s be current and just say Google and Bing) came together to recognize the Canonical link element as a way to deal with duplicate content problems. Jill Whalen famously stated that “developers keep SEOs in business” and there are many ways suboptimal coding can cause duplicate content. But beyond the myriad developer and coding issues that can cause duplication, the problem is exacerbated when marketing teams append tracking parameters to URLs in advertising campaigns.

One piece of advice given by Matt Cutts regarding the use of the Canonical link element was that it is “far better” to avoid duplicate URLs in the first place and “exhaust alternatives” before turning to the Canonical element. For instance, handling the “www” vs. “non-www” URL duplication is best executed on the server if possible. Likewise, a consistent internal linking structure and a consistent XML sitemap should be enforced by your website’s code, not by the Canonical link element.
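As a rough illustration of the server-side approach, the sketch below decides whether a request should get a permanent (301) redirect to the preferred hostname. In practice this is usually a rewrite rule in the web server's configuration rather than application code; the function name `host_redirect`, the hostname and the plain `http` scheme are assumptions for the example.

```python
from typing import Optional, Tuple

PREFERRED_HOST = "www.example.com"  # hypothetical preferred hostname


def host_redirect(host: str, path: str, query: str = "") -> Optional[Tuple[int, str]]:
    """Return (301, location) if the request used a non-preferred host, else None."""
    if host.lower() == PREFERRED_HOST:
        return None  # already on the preferred host; serve the page normally
    location = f"http://{PREFERRED_HOST}{path}"  # assumption: the site is served over http
    if query:
        location += "?" + query
    return 301, location


print(host_redirect("example.com", "/products", "page=2"))
print(host_redirect("www.example.com", "/products"))
```

Because the duplicate URL never returns content of its own, there is nothing left for the Canonical tag to clean up.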

The situations that are often out of the hands of developers should constitute the limited scope where the Canonical link should be used. For example, very difficult duplicate content management issues can be caused by tracking code parameters, inconsistent links from external domains and restrictive CMS platforms that create unique URLs for non-unique pages, such as the same listing in a different sort order.
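For those cases, the tag itself is a single `<link rel="canonical">` line in the page's `<head>`. Here is a small sketch of how a template helper might build it by dropping the parameters that do not change the content. The helper name `canonical_link` and the parameter names in `NON_CANONICAL_PARAMS` are hypothetical.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical names: tracking parameters plus a CMS sort-order parameter.
NON_CANONICAL_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sort"}


def canonical_link(requested_url: str) -> str:
    """Build a canonical link element for the URL that was requested."""
    parts = urlsplit(requested_url)
    # Keep only the parameters that actually select different content.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in NON_CANONICAL_PARAMS
    ]
    canonical = urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))
    # Escape the URL so characters such as "&" are valid inside the attribute.
    return f'<link rel="canonical" href="{escape(canonical, quote=True)}" />'


print(canonical_link("http://www.example.com/shoes?sort=price&utm_source=ad"))
```

Every sorted or tracked variant of the page then points search engines back to the same URL.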

Although it serves a good purpose, the Canonical tag should be used strategically and only when necessary. It is a crutch with a limited scope of uses.

by John Sherrod