Dr. Epstein, You Don’t Understand How Search Engines Work

Every few months an undercooked analysis of a Google conspiracy comes across my desk. With the election underway, our media is in high gear perpetuating inane ideas. You can’t blame them too much; after all, the election is pretty laughable.

Noted Google-hater and senior research psychologist Dr. Robert Epstein is back, on SputnikNews.com, to throw some more gas on the fire that SourceFed started earlier this year and that Rhea Drysdale, CEO of OutSpoken Media, already doused with an ocean’s worth of facts. But it’s back in the news cycle, riding on the gusto of a research study that doesn’t quite make sense.

So with my newborn daughter crying, and me up at 6 am, I felt the need to highlight some of the specific issues with Dr. Epstein’s study that make this a non-story.

The Inherent Bias of Search Engine Audiences

What’s most surprising about Dr. Epstein’s study, given his seniority, is that it ignores the obvious reality of how search engine audiences are stratified. It compares the search suggest results between Google, Bing and Yahoo as though all search engines should return the same thing.

This ignores the fact that these suggestions are the result of user behavior. The demographics and psychographics of users on the three search engines are vastly different. A quick review of ComScore, or any other market research outlet, would indicate that there is a skew in who uses each search engine that can potentially align with political affiliation or otherwise lead to different results. While ComScore’s data is not freely available without an account, Bing makes the data about their search engine easy to access.

Consider this screenshot highlighting the age breakdown of Bing users.

This data indicates that the Bing audience skews more toward the older segments. Oh, and by the way Yahoo is powered by Bing as well. Now consider the next screenshot comparing the indexation of age per search engine.

The data above, from this study completed by the UK-based marketing agency Further, indicates that Bing and Yahoo under-index for younger demographics and over-index for older demographics, while Google is the opposite. The inference here is that people who don’t know enough to change their default configuration from Bing/Yahoo to Google are also likely to be the same people who are looking for negative results on Hillary Clinton.

Whether or not that is actually true, it’s definitely a bias that needs to be accounted for when making a “scientific” examination of Google’s “biased” results.

The Inherent Bias of Dr. Epstein’s Study

There is also a series of biases in how the study was conducted. I won’t harp on the obvious fact that Dr. Epstein has a history of going after Google half-cocked. Let’s instead discuss data rather than just the anecdotal examples he presents in his analysis.

First, let’s start with the autosuggest for the term Hillary Clinton. As Dr. Epstein mentions, and quickly casts aside, what we’re seeing in my screenshot is a series of results that are, to some degree, dictated by the personalization Google uses to prepare rankings. Google may or may not use your search history or any other details to segment you in their database of affinity by leveraging data from across their entire ecosystem.

Nevertheless, when you dig into the data comparatively, it is pretty obvious why these are the terms that most people are seeing. They are the result of what users are actually looking for more widely across the United States.

While Dr. Epstein highlights the growth of a query like “hillary clinton is a liar,” he does not highlight the geographical biases of such a query.


As you can see, while the trend has spiked for that term, there is only interest in Florida (ha!), New York, California and Texas. For the term “Hillary Clinton dead,” which is the second result in Google Suggest, there is also a spike.


The difference, however, is that people all across the country care about that term and therefore it often appears in the suggested searches.

I won’t go through every single query that Dr. Epstein has presented, but some of them simply are not specific enough to return a suggested search related to Hillary Clinton in Google, despite the fact that they do in Yahoo and Bing. Let’s take the “Crooked…” queries as an example.

Since the Hummingbird update, Google has moved to examine entities first when considering queries. “Crooked” does not necessarily result in a vector that Google automatically associates with Hillary Clinton. Rather, Google leaning on the historical association of the word “crooked” with a hit song by J. Cole makes a lot more sense, given both the bias towards what the audience is actually searching for and the entity association.

However, the fact that I just justified my answer highlighting a song that I like shows my own inherent bias. See what I did there?

Secondly, the way this study was conducted suggests a lack of understanding of search. Dr. Epstein presents query trends in isolation and wonders why they don’t outperform the queries that do have visibility. It’s akin to saying “why isn’t everyone else thinking about the thing I’m thinking about as much as I’m thinking about it?”

Additionally, he mentions that they used proxies and Tor and flushed their cache and cookies, but there is no mention of accounting for location. As I said, Google uses a number of data points for personalization, but location is one of the primary vectors. As long as their searches were conducted in the United States, it’s likely that the suggest results would be biased in kind.

How Search Suggest Works

Ultimately, the key takeaway from both Dr. Epstein’s and SourceFed’s analyses on the subject is that, well, these folks only have a cursory understanding of how Organic Search and Search Suggest work. Both of these posts are essentially pop science used to fuel something in the undercurrent of the collective consciousness for pageviews and TV appearances. In fact, I suspect that as I type this, Dr. Epstein is in an Uber heading to a news studio.

Irrespective of any biases, the obvious reality is that Donald Trump generates more negative – everything. Based on how Search Suggest actually works, it’s easy to see why he has more negative results in Search suggest.

Finally, I disagree with Google’s statement about them not showing negative terms in Search Suggest. A third-party tool called KeywordTool.io keeps track of search suggest terms by extracting them from Google and a variety of other sources. For the uninitiated, this method is called “scraping” and is not condoned by Google, as it gives marketers insights into things that the search engine would normally like to keep away from people.

That said, the suggested keywords for Hillary Clinton related queries certainly feature negative terms, but there simply are not many of them and there’s not much search volume to support them appearing in the search suggest.

Search suggest is entirely dictated by search volume of the queries you see featured. I know this because I’ve helped people clean up their search suggest in the past through a variety of methods that Google has since “outlawed.” Those methods largely had to do with getting many people to search for a certain query or faking those searches using tools that impersonate users searching.

These methods don’t work as well now largely due to the fact that there is a geographical bias and the people that would help you fake searches for a low cost tend to be off-shore.
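To make the volume and location point concrete, here is a minimal sketch of a suggest box driven purely by how often people search for something and where they search from. To be clear, this is a toy model and not Google’s actual system: the query log, the “candidate a” queries, the state codes and the suggest function are all invented for illustration.

```python
from collections import Counter

# Hypothetical query log of (query, state) pairs. Every row here is
# invented for illustration; none of it is real search data.
QUERY_LOG = [
    ("candidate a emails", "NY"), ("candidate a emails", "CA"),
    ("candidate a emails", "TX"), ("candidate a emails", "FL"),
    ("candidate a emails", "OH"),
    ("candidate a age", "NY"), ("candidate a age", "CA"),
    ("candidate a age", "OH"), ("candidate a age", "WA"),
    ("candidate a rally", "FL"), ("candidate a rally", "FL"),
    ("candidate a rally", "FL"),
    ("candidate a speech", "TX"), ("candidate a speech", "WA"),
]


def suggest(prefix, log, region=None, limit=3):
    """Return up to `limit` logged queries that start with `prefix`,
    ordered by search volume. If `region` is given, only searches
    from that region are counted."""
    prefix = prefix.lower()
    counts = Counter(
        query
        for query, state in log
        if query.startswith(prefix) and (region is None or state == region)
    )
    # Highest volume first; ties are broken alphabetically so the
    # ordering is deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [query for query, _ in ranked[:limit]]
```

In this toy data, suggest("candidate a", QUERY_LOG) puts “candidate a emails” first because people in five different states searched for it, while suggest("candidate a", QUERY_LOG, region="FL") puts “candidate a rally” first because that query is concentrated in Florida. Same log, different location, different suggestions.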

Google Is Certainly Not Perfect

Now, none of this is to say Google is without faults. I can certainly present you with a long list of what Google is doing wrong. We can talk for hours about their actual biases, like how they prefer to highlight big brands over smaller brands with better content, how the link graph that powers rankings is still very problematic, and how they keep stripping data away only to make marketers pay for it. Or we could talk about bigger picture things like how they have all of the technology to become Terminator’s SkyNet.

However, the idea that they are actively attempting to change the results of an election, in my professional opinion, is bullshit.

Now, while I have your attention, I just want to remind you to register to vote. And when you click that link, notice how every result on that page is skewed to your state. That’s how Google works.

Mike King

8 Comments

  • Reply

    Boom goes the rebuttal dynamite. Michael, thanks for writing this up. It’s unfortunate that this all started with half-baked “research” several months back, that Dr. Epstein referenced in his “scientific research” rant.

    People who don’t understand how it works are not surprisingly going to wonder. People who claim to perform true research, while completely failing to take the time to speak with experts on how it works, or educate themselves on how it works, and go by speculation, just create a dishonest, if not outright deceitful presence. At the very least, it’s junk “research”, and serves nobody other than the good ad revenue monitors at the site he posted on.

  • Sam Hollingsworth
    Reply

    Mic drop!

  • Reply

    One of the things I love about search technology is how it forces us to understand context and nuance. This is something you and Rhea so brilliantly illuminated. Thank you.

  • Reply

    This is fun, Mike, but I wish you would have debunked this in a more systematic way. Many of your counterclaims could be supported with good old search volume data, so I’m not sure why you didn’t use it.

    “Search suggest is entirely dictated by search volume of the queries you see featured.” The correlation between the rank order of terms shown in Google Suggest and the rank order of the terms’ search volume (including volume-stable terms) is high, but not perfect (factoring in personalized search, location, etc).

    Why not:

    1. Take all of the negative terms that Epstein claims Google is removing (you could pull these from the Bing or Yahoo lists if you want).
    2. Determine current search volume for these using Google’s tools (keyword volume or even Google Trends as a relative benchmark).

    Do these terms have higher search volume than those that are actually appearing in Google Suggest? If so, then Google removed them. I’m not sure why you don’t believe their claim that they do this. Google hasn’t disclosed their negative list and those I’ve asked haven’t been keen on divulging it, but what is clear is that Google’s classification of what is negative is not transparent and seems pretty arbitrary, the more anecdata you look at. I completely agree with you that one candidate may have more negativity than another and that this negativity is going to be concentrated in different areas for each candidate. I think it likely that Google’s filtering of negative keywords may benefit one candidate over another just because of where the negativity is concentrated, topically or semantically.

  • Christi Olson
    Reply

    Great article, thanks for the early morning write up. I love that you call out the differences between audiences… people don’t get that. Even people within our space don’t seem to get it. BTW, The link to Bing’s audience page is broken in the post (https://ads.bingads.microsoft.com/en-us/audience) is the page.

  • Reply

    Nice response Mike! It’s unfortunate that someone with “credibility” can write something that is as inaccurate as his and it be taken seriously, without fact checking by real and knowledgeable sources.

  • Reply

    Great rebuttal, Mike! Unfortunately, Dr. Epstein is just one of a multitude of folks willing to present slanted data that will support his personal wishlist. Even more unfortunate – too many will accept it as truth, ’cause “Internet” and “Dr.”

  • Reply

    If only people would believe researched facts over fear of ‘black box’ algorithms.
    Haters gonna hate.
    Good stuff Mike.
    Cheers
