This creates several problems, such as:
- For the webmaster, page weight is scattered across multiple URLs, which hurts rankings.
- For search engines, it wastes resources and bandwidth.
- When a search engine finds several URLs with the same content, it does not penalize the site but tries to work out which URL is the canonical one. A program is only a program, though, and it can get this wrong: the URL it picks may not be the one the webmaster wants.
- If a site's URL normalization problems are severe enough, indexing may also suffer. For a domain without much authority, the total number of pages that will be indexed is limited. When search engines spend resources on non-canonical URLs, fewer are left for genuinely different content.
There are also many options for resolving URL normalization issues, such as:
- In Google Webmaster Tools, set whether the www or non-www version is the canonical one
- Use 301 redirects to send all non-canonical URLs to the canonical URLs
- Make sure the CMS you use generates only canonical URLs
- Make sure all internal links on your site point to canonical URLs
- Specify a canonical URL in the sitemap submitted to the search engine
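As a rough illustration of the 301 option in the list above, here is a minimal Python sketch that decides whether a requested URL should be redirected to its www version. The hostnames and the function name `redirect_for` are assumptions made up for this example; in practice this is usually configured in the web server rather than written by hand.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption for illustration: the site has chosen the "www" host as canonical.
CANONICAL_HOST = "www.example.com"
ALTERNATE_HOSTS = {"example.com"}  # hypothetical non-canonical hostnames


def redirect_for(url):
    """Return (301, canonical_url) if url should be redirected, else None."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host in ALTERNATE_HOSTS:
        # Swap only the host; scheme, path, and query string are kept as-is.
        fixed = parts._replace(netloc=CANONICAL_HOST)
        return 301, urlunsplit(fixed)
    return None


print(redirect_for("http://example.com/product.php?item=swedish-fish"))
print(redirect_for("http://www.example.com/product.php?item=swedish-fish"))
```

The first call yields a 301 pointing at the www URL; the second returns `None` because the URL is already on the canonical host.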
But each of these methods has its limitations:
- The Google Webmaster Tools setting does not apply to other search engines.
- Some webmasters cannot set up 301 redirects for one reason or another.
- In most cases webmasters do not control how their CMS behaves.
- Webmasters can control their own internal links, but not how other sites link to them.
In short, although several solutions exist, URL normalization has remained a big problem.
A few days ago, Google, Yahoo, and Microsoft jointly announced a new tag, the canonical tag, to solve the URL normalization problem.
To put it simply, add a piece of code to the head of the HTML file:
<link rel="canonical" href="http://www.example.com/product.php?item=swedish-fish" />
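This means the canonical URL of this page should be:
http://www.example.com/product.php?item=swedish-fish
The same code can be added to the other URLs that display this content.
The canonical URL for all of those URLs then becomes:
http://www.example.com/product.php?item=swedish-fish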
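To check which canonical URL a page actually declares, something like the following Python sketch could be used. It relies only on the standard library's `html.parser`; the class name `CanonicalFinder`, the helper `find_canonical`, and the sample page markup are assumptions invented for this example.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel may hold several space-separated values, so split before testing.
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr.get("href")


def find_canonical(html):
    """Return the declared canonical URL, or None if the page has none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical sample page used only for this illustration.
page = """<html><head>
<title>Swedish Fish</title>
<link rel="canonical" href="http://www.example.com/product.php?item=swedish-fish" />
</head><body>...</body></html>"""

print(find_canonical(page))
```

Running the same check on every duplicate URL is a quick way to confirm that they all point at the same canonical address.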
Simply put, this tag works much like a 301 redirect inside the page. The difference is that the user is not redirected and stays on the same URL, while the search engine treats it like a 301 redirect, meaning that the page's link weight is concentrated on the canonical URL specified in the code.
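The idea of "concentrating link weight" can be pictured with a toy model. The Python sketch below is only an illustration and not a description of how any search engine actually calculates weight: the link counts are made-up sample numbers, and the duplicate URL with a `trackingid` parameter is hypothetical.

```python
from collections import defaultdict

CANONICAL = "http://www.example.com/product.php?item=swedish-fish"

# Hypothetical sample data: inbound link counts per URL (made-up numbers).
inbound_links = {
    CANONICAL: 5,
    CANONICAL + "&trackingid=1234": 3,  # hypothetical duplicate URL
    "http://www.example.com/about.php": 2,
}

# Canonical URL declared by each duplicate page's canonical tag.
declared_canonical = {
    CANONICAL + "&trackingid=1234": CANONICAL,
}


def consolidate(links, canonical_of):
    """Credit each URL's inbound links to its declared canonical URL."""
    totals = defaultdict(int)
    for url, count in links.items():
        # Pages without a declared canonical simply count for themselves.
        totals[canonical_of.get(url, url)] += count
    return dict(totals)


print(consolidate(inbound_links, declared_canonical))
```

In this toy model the links to the duplicate URL are credited to the canonical URL, while the visitor who followed one of those links would still see the duplicate address in the browser.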
In addition, there are several details that the webmaster needs to pay attention to:
- This tag is only a suggestion or hint, not a directive like those in a robots file. Search engines will give it considerable weight, but not 100%, and will also take other signals into account when deciding on the canonical URL. This also protects webmasters who specify the wrong URL by mistake.
- The tag can use either an absolute or a relative address. Absolute addresses are generally recommended as the safer choice.
- The content at the specified canonical URL does not have to be exactly the same as on the non-canonical URLs that carry the tag; it may differ somewhat. For example, many e-commerce sites sort listings by price, colour, or size, which produces different URLs with essentially the same content. Only small differences like these are acceptable.
- The specified canonical URL can be a page that does not exist and returns 404, or a page that has not yet been indexed. This is not recommended, though; don't go looking for trouble.
- This tag applies within the same domain name, including subdomains. It does not apply across different domain names, to prevent hijacking. (Update: canonical can now be used across domain names)
- Don't treat this tag as a cure-all. First get your site structure right and avoid creating URL normalization problems wherever possible. The tag is only a last resort.
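On the point about absolute versus relative addresses, the small Python sketch below shows why absolute addresses are the safer choice. It uses the standard `urllib.parse.urljoin`, which resolves a relative reference against a base URL in the same way a browser would. The `/shop/` directory and the `sort=price` parameter are hypothetical, chosen only to make the difference visible.

```python
from urllib.parse import urljoin

ABSOLUTE = "http://www.example.com/product.php?item=swedish-fish"

# Hypothetical page that carries the canonical tag.
page_url = "http://www.example.com/shop/product.php?item=swedish-fish&sort=price"


def resolve_canonical(page_url, href):
    """Resolve the tag's href against the URL of the page it appears on."""
    return urljoin(page_url, href)


# A relative href is resolved against the page's own directory (/shop/ here).
print(resolve_canonical(page_url, "product.php?item=swedish-fish"))

# A root-relative href ignores the directory but still depends on the host.
print(resolve_canonical(page_url, "/product.php?item=swedish-fish"))

# An absolute href resolves to itself no matter where the page lives.
print(resolve_canonical(page_url, ABSOLUTE))
```

The first call resolves to a URL under `/shop/`, which may not be the address the webmaster intended, while the absolute form always comes out unchanged.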
Sharp-eyed readers can probably see an opportunity to build a lot of external links with this new standard.
Finally, the standard is supported by three major search engines: Google, Yahoo, and Microsoft. Why no mention of Baidu? I remember seeing reports that, by search volume, Baidu is the second-largest search engine in the world. Why were we left out? (2013 update: Baidu also supports canonical tags)