Every year or so, I take it upon myself to re-read Google’s Search Quality Rater Guidelines document: 160+ pages of pure joy covering the work of the mysterious search quality raters employed by Google.
While some SEOs use this guide to find tiny flaws in Google’s system, it’s important to understand that this guide explains only part of how Google indexes and ranks websites in its engine: the part shaped by humans, not by robots. I was reading Google’s PageRank patent the other day, hoping to find golden nuggets of information about how links truly work, and came across a very interesting note in the document:
“[…] selecting seeds involves a human manually identifying these high-quality pages […]”
Producing a ranking for pages using distances in a web-link graph – P.4, L.34-35
There are a few key terms to understand here:
- Seeds: This refers to a sample of pages considered “trusted” by Google. These pages were previously rated and indexed as excellent sources to refer to.
- Human: In this case, the “humans” this patent is referring to are the Search Quality Raters; people employed by Google to review the quality and the relevance of indexed pages.
- High-Quality Pages: This quality rating is based on the human’s interaction with the page itself and on the Quality Rater Guidelines document.
With these definitions in mind, it’s easier to understand the role of human interaction in Search, as well as how Google’s algorithms may benefit from vetting pages here and there (looking at you, RankBrain). Conveniently, the last time this patent was updated coincided with RankBrain’s introduction as an “official” update for the search engine.
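To make the idea concrete, here is a minimal sketch in Python of what “ranking by distance from trusted seeds” can look like: a multi-source shortest-path run starting from the seed pages, where pages closer to a seed rank higher. The link graph, URLs and seed set below are entirely made up, and every link is given the same length; the patent describes computed link lengths and far more machinery, so treat this as an illustration of the concept, not Google’s implementation.

```python
import heapq

# Toy link graph: page -> pages it links to (purely hypothetical URLs).
LINK_GRAPH = {
    "seed.example/guide": ["a.example", "b.example"],
    "a.example": ["c.example"],
    "b.example": ["c.example", "d.example"],
    "c.example": ["d.example"],
    "d.example": [],
}

# Pages a human has vetted as high quality (the "seeds" in the patent's terms).
SEEDS = {"seed.example/guide"}

def rank_by_seed_distance(graph, seeds):
    """Multi-source Dijkstra: distance from the nearest seed, shorter = better."""
    dist = {page: 0.0 for page in seeds}
    heap = [(0.0, page) for page in seeds]
    while heap:
        d, page = heapq.heappop(heap)
        if d > dist.get(page, float("inf")):
            continue  # stale heap entry
        for target in graph.get(page, []):
            nd = d + 1.0  # uniform link length, a simplification for this sketch
            if nd < dist.get(target, float("inf")):
                dist[target] = nd
                heapq.heappush(heap, (nd, target))
    # Closest-to-a-seed first; pages unreachable from any seed are simply absent.
    return sorted(dist.items(), key=lambda item: item[1])

if __name__ == "__main__":
    for page, distance in rank_by_seed_distance(LINK_GRAPH, SEEDS):
        print(f"{distance:.0f}  {page}")
```

The takeaway is simply that the hand-picked seed set anchors the whole ranking: move a page further away from every trusted seed in the link graph and, under this scheme, its score degrades.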
A Quick Recap of the Ranking Process
Crawling, indexing and ranking pages hasn’t really changed since the 1990s; it has stayed more or less the same, and the foundations are pretty strong. Roughly, the process goes like this (a toy code sketch follows the list):
- A web crawler parses the web for web pages
- The crawler categorizes and interprets web pages based on their content and context.
- The categorized and interpreted pages are then compressed, indexed and ranked in the search engine
  - Compression is used to lighten the weight of large amounts of data
  - Indexation is the categorization process
  - Ranking involves many, many factors based on a recurring process
- Everything is then stored in a data center (the server side) and queried by users (the client side of the search engine)
- Every time a user submits a query, the search engine looks up what’s stored in the data center and returns the highest-ranking indexed pages
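As a mental model, here is a heavily simplified, self-contained Python sketch of that pipeline: a handful of “crawled” pages, a tiny inverted index, and a naive term-count ranking at query time. The page contents and scoring are invented for illustration; real engines compress, distribute and score this data in far more sophisticated ways. The point is only to show where each step in the list above fits.

```python
from collections import defaultdict

# Steps 1-2: pretend these pages were crawled and their text extracted.
CRAWLED_PAGES = {
    "example.com/coffee": "how to brew great coffee at home",
    "example.com/tea": "a beginner guide to brewing tea",
    "example.com/beans": "buying and storing coffee beans",
}

# Step 3: build a tiny inverted index (term -> {page: term count}).
def build_index(pages):
    index = defaultdict(lambda: defaultdict(int))
    for url, text in pages.items():
        for term in text.lower().split():
            index[term][url] += 1
    return index

# Query time: score pages by how many query terms they contain.
def search(index, query):
    scores = defaultdict(int)
    for term in query.lower().split():
        for url, count in index.get(term, {}).items():
            scores[url] += count
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

if __name__ == "__main__":
    index = build_index(CRAWLED_PAGES)
    for url, score in search(index, "brewing coffee"):
        print(score, url)
```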
None of this should come as a surprise to more experienced SEOs; there’s a plethora of information about crawling and indexing out there, including a good article from our friends at OnCrawl. What we’re interested in here is the ranking part of the process above: the “how” of the whole thing, and where humans actually have their say in this tech-dominated world.
Ranking Basics: What We Know
Again, nothing new here: long-time SEO experts are still speculating about the exact number of ranking factors Google uses to rank web pages. This article from Backlinko lists about 200 of them, and some say there are many more. At Bloom, we group ranking factors into three (3) categories, as follows:
- Technical factors
- Content-related factors
- External factors
While it is true that we have an 80+ point technical SEO checklist for our clients, you’ll notice that not all of the 200+ factors listed by Backlinko and other sources are covered. Simply put: we don’t have control over most of these ranking factors, so we choose to focus on what’s most actionable for our clients. Content-wise, there are some standard rules and tools that can help us identify opportunities, but “relevance” per se is determined by users. As for external factors, they’re external, and there is a very limited number of actions we can take to help sites out.
So in theory, if a web page checks all the boxes in the three (3) categories mentioned above, it should rank well in the search results, correct? Well, if it were that easy, every single page on the web would be a candidate for the #1 spot. Even though Google has a ton of complex algorithms running in the background with the sole purpose of finding the most relevant content to offer its users, the best possible judge of relevance is still a human being, and this is where the Quality Raters come in.
The Raters, Their Jobs and Their Impact On Search
Keeping in mind that most of the “bulk” work is done by AI, robots and other machines, the qualitative aspect of ranking search results often requires a human touch in order to consistently improve the recurring ranking process. The Raters are the people who analyze individual web pages to determine how relevant they are to a specific query, thus influencing the algorithms surrounding rankings.
The Raters have to follow a strict set of guidelines and methodologies governing how they should rate these pages. All of the details can be found in that juicy 160+ page document we referred to earlier in this article.
These Raters are inspecting web pages and are rating them based on two (2) distinct factors:
- Page Quality
- Needs Met
Page Quality
Page Quality, or PQ for short, is a rating attributed to a web page solely based on what is on the page. It is divided into three (3) individual sections:
- Main Content (MC)
- Secondary Content (SC)
- Advertisements/Monetization (Ads)
Each section is evaluated individually and then mashed up together as a whole. The MC of a page refers to the “core” purpose of the page; what it tries to solve or what question it tries to answer. The SC of a page is everything else related to the navigation of the site, extra resources, documents, etc.; anything that helps the MC achieve its purpose. The Ads are… well, ads; anything related to monetizing the website itself. This does NOT include products and services. It is strictly for advertisements.
All of these sections, once rated on how well they achieve the page’s purpose, are then submitted to an “E-A-T” reference check. If you’re familiar with SEO, you may have heard that acronym before: E-A-T stands for “Expertise, Authoritativeness & Trustworthiness”. This part of the rating process involves extensive reputation research to confirm the E-A-T of the creator/owner of the page’s MC and SC. An expert, an authoritative figure or a trustworthy person is well regarded by search engines (and Raters) when that specific page is being reviewed.
In the past few years, Google has been exceptionally unforgiving when discovering and ranking pages with low E-A-T, especially when it comes to YMYL (Your Money Your Life) topics. These topics are judged more harshly since they could directly impact a user’s financial, physical or mental health. E-A-T factors are of utmost importance in these cases.
All in all, Page Quality checks the page’s purpose (and how well it’s achieved) as well as the E-A-T references. Once both are rated, a Page Quality score is given to the web page.
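For readers who like to think in data structures, here is one way to picture that rubric as a small Python model. The field names, the 0-to-1 scales and the way the sub-scores are combined are entirely my own simplification for illustration; the actual rating scale and aggregation live in the guidelines document, and nothing here reflects how Google stores or weighs these ratings.

```python
from dataclasses import dataclass

@dataclass
class PageQualityRating:
    """Illustrative model of a rater's Page Quality inputs (not Google's schema)."""
    main_content: float        # how well the MC achieves the page's purpose (0-1)
    secondary_content: float   # how well the SC supports the MC (0-1)
    ads: float                 # how appropriate/unobtrusive monetization is (0-1)
    eat: float                 # outcome of the E-A-T reputation research (0-1)
    ymyl: bool = False         # YMYL topics are judged more harshly

    def overall(self) -> float:
        # Naive aggregation for illustration only: average the sections,
        # then let weak E-A-T drag the score down, more so for YMYL topics.
        sections = (self.main_content + self.secondary_content + self.ads) / 3
        eat_weight = 0.6 if self.ymyl else 0.3
        return round(sections * (1 - eat_weight) + self.eat * eat_weight, 2)

# Example: a well-built page on a YMYL topic with weak E-A-T still scores poorly.
print(PageQualityRating(0.9, 0.8, 0.8, eat=0.2, ymyl=True).overall())
```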
Needs Met
This factor could be considered an extension of the “purpose” part of the Page Quality rating. The Needs Met factor tries to correlate the query used by a user with the “response” provided by the search engine. It rates the relevance of the provided page and the degree to which the web page “meets” the user’s expectations. This part of the Rater’s job involves less analysis since it targets the SERP and not the site itself: Raters have to rate a specific result provided by the search engine for a specific query.
What’s interesting to note here is that a page could be very well crafted and have a decent Page Quality score, yet ultimately fail to meet the user’s expectations, thus obtaining a low “Needs Met” score. A well-ranking page needs both scores, not just one of the two. The good thing about this factor is that you can somewhat test it on current sites by measuring the page’s engagement metrics.
Depending on the nature of the site, a high bounce rate usually (but not always) indicates that user expectations are not being met. While Google (the search engine itself) and its Raters do not have access to Google Analytics (or any other analytics software), you can form your own hypotheses and test different content targeting different user queries.
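As a concrete starting point, here is a small sketch of that kind of check: reading an analytics export and flagging landing pages whose bounce rate looks high given the amount of traffic they get. The file name, column names and thresholds are all hypothetical; adapt them to whatever your analytics tool actually exports.

```python
import csv

# Hypothetical export with columns: landing_page, sessions, bounces.
EXPORT_FILE = "landing_pages.csv"
MIN_SESSIONS = 100      # ignore pages with too little traffic to judge
BOUNCE_THRESHOLD = 0.7  # flag pages above this bounce rate for review

def pages_to_review(path):
    flagged = []
    with open(path, newline="") as handle:
        for row in csv.DictReader(handle):
            sessions = int(row["sessions"])
            if sessions < MIN_SESSIONS:
                continue
            bounce_rate = int(row["bounces"]) / sessions
            if bounce_rate > BOUNCE_THRESHOLD:
                flagged.append((row["landing_page"], round(bounce_rate, 2)))
    # Worst offenders first: these are candidates for a "Needs Met" rethink.
    return sorted(flagged, key=lambda item: -item[1])

if __name__ == "__main__":
    for page, rate in pages_to_review(EXPORT_FILE):
        print(f"{rate:.0%}  {page}")
```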
Another way of testing expectations is by looking at the current SERP for a specific query. If a user searches for “hotel” on their mobile, what should they expect? A Google Maps result? A specific hotel website? A directory? Listing out the different expected results makes it easier to scope the expectations and tailor content that best answers the user’s questions.
It’s Not Only About Robots
SEOs have always said that “users come first” and that your SEO actions should always focus on the user, not on the robots. The intervention of Raters in the ranking process reinforces that argument. If the Raters see that the reviewed content is purposefully made for robots, it will be scored negatively, sending signals to the search engine algorithms; signals that form patterns which help the algorithms better detect poor content.
SEOs have always tried to game Google’s algorithms, finding flaws and techniques that “work” and making poor-quality sites rank higher than better-rated sites. There will always be ways to game the system. Raters are there not only to apply a corrective force, but also to help algorithms such as RankBrain improve and learn to better detect faulty sites.
P.S.: While I was studying these documents on my end, another SEO in the UK, Matthew Woodward, published a very interesting analysis of his own. I invite you to read his article (which generated quite a lot of “heat”, as he puts it) here.