by John Sherrod, Director of SEO
I work with a lot of ASP.NET websites, and inevitably the issue of URL case sensitivity and SEO rears its ugly head. It is not uncommon for a single URL to be written several different ways across an ASP.NET website, because the variation doesn’t affect the outcome of the page request. On Microsoft Windows servers, /page1.html is treated the same as /Page1.html and /PAGE1.html. And I’ve heard it said many times: “if it works, it’s right.”
Unfortunately, Google doesn’t subscribe to that philosophy.
Google treats the characters in a URL, including their case, literally, much like a password. That could be a conscious decision, but it also happens to be how UNIX/Linux servers work (they’re customizable, so for the sake of simplicity I’ll just call them all UNIX and move on). By default, Microsoft servers don’t distinguish between upper- and lowercase letters in a URL path. UNIX servers do.
Without getting into the quality, stability and reliability debate between Microsoft and UNIX servers, suffice it to say that they work differently. If you’re developing in PHP, you’re most likely on a UNIX server. Because a UNIX server requires the requested URL to match the file name exactly, it is much more difficult to introduce duplicate content through URLs that differ only in case.
On UNIX, mismatched URLs simply break and the damage is immediately apparent. So if you have a page called /page1.php and you accidentally link to /Page1.php, the server will respond with a 404 Not Found error.
If your site is in ASP.NET, then you are most likely using a Microsoft server. In this case, /page1.html, /Page1.html and /PAGE1.html will all be returned successfully and will not break. In other words, requests for those URLs will return a 200 OK server response indicating, “All is well with those URLs”. So if Google indexes those 3 different URLs (and it will), it will see them as unique and independent.
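The difference between the two server behaviors can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the FILES_ON_DISK set and both function names are hypothetical, and a real web server involves far more than a file-name lookup.

```python
# Hypothetical site with a single file on disk.
FILES_ON_DISK = {"/page1.html"}


def status_case_sensitive(path):
    """Exact-match lookup, as on a typical UNIX file system."""
    return 200 if path in FILES_ON_DISK else 404


def status_case_insensitive(path):
    """Case-folded lookup, as on a typical Windows file system."""
    known = {name.lower() for name in FILES_ON_DISK}
    return 200 if path.lower() in known else 404


requests = ["/page1.html", "/Page1.html", "/PAGE1.html"]

# Map each requested spelling to the status code each server type would return.
unix_results = {path: status_case_sensitive(path) for path in requests}
windows_results = {path: status_case_insensitive(path) for path in requests}

print("UNIX:   ", unix_results)
print("Windows:", windows_results)
```

Under these assumptions only the exact spelling succeeds on the UNIX-style lookup, while all three spellings succeed on the Windows-style lookup, which is how one page ends up with three indexable URLs.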
Fortunately, in most cases Google is smart enough to eventually figure out that these unique URLs are really all the same page. Unfortunately, the flow of PageRank to each of those URLs can be diluted because the incoming links are split among them rather than consolidated. When this happens, rankings can actually suffer.
To put it simply, 50 links to /page1.html and 50 links to /Page1.html are not equal to 100 links to /page1.html.
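Here is a minimal sketch of that arithmetic. The inbound_links list is made-up data, and this only counts links per URL; it is not a model of how PageRank is actually calculated.

```python
from collections import Counter

# Hypothetical inbound link targets: 50 links to each spelling.
inbound_links = ["/page1.html"] * 50 + ["/Page1.html"] * 50

# Counted the way a case-sensitive crawler sees them: two separate URLs.
by_exact_url = Counter(inbound_links)

# Counted after normalizing case: one URL receiving every link.
by_lowercase_url = Counter(link.lower() for link in inbound_links)

print(dict(by_exact_url))
print(dict(by_lowercase_url))
```

With case-sensitive counting, each spelling is credited with only its own 50 links. Only after the URLs are normalized to one spelling does a single page receive credit for all 100.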
Several years ago, Google had a hard time recognizing that http://www.website.com and http://website.com were the same site if they both returned a 200 OK server response. That is practically a non-issue with Google today but can be a serious problem with Bing.
More recently, Google began to better understand that /page1.html and /Page1.html are just two instances of the same content. I don’t believe we will have to wait too long until Google’s ranking algorithm consolidates the PageRank of /page1.html and /Page1.html. I can only guess that it is on their radar.
However, don’t count Bing out of the race. Yes, Microsoft created both ASP.NET and Bing, but they are like wildly different siblings that just happen to have the same parents. Bing is picky. Bing likes a defined canonical URL, light code, clean architectures and pages that are named, tagged and optimized with keywords. And even though Bing uses its own servers (one would assume), its CDN is actually Akamai, which runs on Linux. But I digress.
So, back to the question – Is case important in URL structures when it comes to SEO? Yes, it is.
At a fundamental level, search engines need to clearly understand the difference between bird the animal, Bird the nickname (Charlie Parker) and Larry Bird the legendary basketball player. Capitalization matters because the intent is different. And intent is that little nuanced piece of loveliness that search engines constantly struggle to determine about their users. They are good at it, but not great.
So don’t leave it up to the search engines to try to determine your intended structure. That’s just more work for them to do, and history says they can and will get it wrong. And that can cost you or your clients money when it comes to SEO.
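One common way to take the guesswork away is to pick a single spelling (all lowercase is the usual choice) and permanently redirect every other spelling to it. The sketch below shows only the decision logic, assuming lowercase paths are the preferred form; the function name lowercase_redirect_target is hypothetical, and on an ASP.NET site the equivalent rule would normally live in the server’s own rewrite or redirect configuration rather than in Python.

```python
from urllib.parse import urlsplit, urlunsplit


def lowercase_redirect_target(url):
    """Return the lowercase-path URL to 301 to, or None if no redirect is needed."""
    parts = urlsplit(url)
    lowered_path = parts.path.lower()
    if lowered_path == parts.path:
        # The path is already in the preferred form.
        return None
    # Leave the query string untouched: parameter values may be case-sensitive.
    return urlunsplit(
        (parts.scheme, parts.netloc, lowered_path, parts.query, parts.fragment)
    )


for requested in [
    "http://www.website.com/page1.html",
    "http://www.website.com/Page1.html",
    "http://www.website.com/PAGE1.html?ID=7",
]:
    target = lowercase_redirect_target(requested)
    if target is None:
        print(requested, "-> serve as is")
    else:
        print(requested, "-> 301 redirect to", target)
```

Because every mixed-case request is answered with a permanent redirect instead of a 200 OK, links pointing at any spelling end up pointing at one URL.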