Scrape Rate and Shareability Rate

As I tweeted about a month ago, my top draft picks for relatively unknown SEOs who are about to make tsunamis in 2012 are Dan Shure (@dan_shure) and Jon Cooper (@pointblankseo). Neither of them is taking that declaration lightly, and both have come out swinging. Last week Jon Cooper invented a new metric for guest post prospecting called Scrape Rate, and I’m going to build upon that a little bit here.

The basic premise is that posts on many popular blogs are scraped and republished or syndicated on other websites, so a guest post containing one link back to a given site gets a multiplier effect once the post goes live and is scraped. I’ve noticed this in the past when writing for SEOmoz. As seen below, there are 103 pages that have scraped and republished the “Just How Smart are Search Robots” post that I wrote at the end of November 2011.

Intitle SERP example

And for that reason alone I am on the first page (between 7 and 9 depending) for the keyword “Googlebot,” a highly competitive keyword according to Moz’s Keyword Difficulty Report.

It’s pretty safe to say that this is very powerful stuff that should be weighed when considering where to guest post.

Shareability Rate (or Share Rate)

Immediately after seeing Jon’s post I thought, “This is brilliant, but why aren’t we also computing the propensity of posts on a given site to be shared?” For example, as soon as a post goes live on the main SEOmoz blog it gets roughly 300 tweets almost immediately, no matter what, thanks to their devoted following and RSS scrapers.

Case in point: Andrew’s post went up sometime early this afternoon and already has 537 tweets.

Computing Share Rate

I skipped the explanation of how to compute Scrape Rate with good reason: Share Rate plays into what I believe to be the most effective way to compute Scrape Rate.

Share metrics are available via free APIs, or even directly in Excel via Niels Bosma’s SeoTools plugin, and are therefore far easier to come by than Google result counts. So while Jon suggested that you pull 3 random URLs when computing Scrape Rate, I’m going to suggest something more empirical. We’re going to pull all the URLs for the section of the site that we want to guest post on, pull social metrics for all of those URLs, and then calculate an average of all the social metrics for those URLs. For speed and simplicity in creating this post, I’m using 11 URLs from the SEOmoz blog.
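If you would rather script the URL collection than open the sitemap in Excel, here is a minimal Python sketch of the same idea, assuming the site publishes a standard sitemaps.org XML sitemap. The function name `section_urls`, the `example.com` URLs, and the sample sitemap are all hypothetical and only there for illustration; this is not part of SeoTools.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol for <urlset> documents.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def section_urls(sitemap_xml: str, section_prefix: str) -> list[str]:
    """Return every <loc> URL in the sitemap that starts with section_prefix."""
    root = ET.fromstring(sitemap_xml)
    urls = [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]
    return [url for url in urls if url.startswith(section_prefix)]


# Hypothetical sitemap content, made up for this example.
EXAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/blog/post-a</loc></url>
  <url><loc>https://example.com/blog/post-b</loc></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>"""

if __name__ == "__main__":
    # Keep only the blog section, i.e. the part of the site we want to post on.
    for url in section_urls(EXAMPLE_SITEMAP, "https://example.com/blog/"):
        print(url)
```

Filtering by a URL prefix is just one way to isolate "the section of the site"; a real site may need a different rule.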

  1. Download and install Niels Bosma’s SeoTools – that is, if you haven’t already.
  2. Download the XML Sitemap
  3. Open the XML Sitemap in Excel – Hat tip to Justin Briggs for this one. I had no idea that Excel opened and properly formatted XML files.
  4. Pull the Facebook Likes, Google Plus Ones, and Tweet Counts for each URL. The functions are =FacebookLikes(), =TwitterCount() and =GooglePlusCount().
  5. Sort the values from lowest to highest – this is important later for computing Scrape Rate.
  6. Calculate the average of each
  7. Calculate the average of the averages

Voila, we’ve determined that, on average across Google+, Facebook and Twitter, a guest post in this example set will get 142 social shares. Note: This is just an example, not the actual Share Rate of an SEOmoz blog post.
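For anyone working outside of Excel, the arithmetic in steps 5 through 7 can be sketched in a few lines of Python. This assumes you have already collected the share counts some other way; the URLs and numbers in `SHARES` are made up for illustration, and the helper names `share_rate` and `sort_by_shares` are my own, not SeoTools functions.

```python
from statistics import mean

# Hypothetical share counts per URL. These numbers are invented for the example.
SHARES = {
    "https://example.com/blog/post-a": {"facebook": 40, "google_plus": 20, "twitter": 300},
    "https://example.com/blog/post-b": {"facebook": 60, "google_plus": 30, "twitter": 500},
    "https://example.com/blog/post-c": {"facebook": 20, "google_plus": 10, "twitter": 100},
}


def sort_by_shares(shares):
    """Return the URLs ordered from fewest to most total shares (step 5)."""
    return sorted(shares, key=lambda url: sum(shares[url].values()))


def share_rate(shares):
    """Average each network across all URLs (step 6), then average those averages (step 7)."""
    networks = sorted({name for counts in shares.values() for name in counts})
    per_network = {
        name: mean(counts.get(name, 0) for counts in shares.values())
        for name in networks
    }
    return per_network, mean(per_network.values())


if __name__ == "__main__":
    per_network, rate = share_rate(SHARES)
    print("Per-network averages:", per_network)
    print("Share Rate:", rate)
    print("URLs, lowest to highest:", sort_by_shares(SHARES))
```

Treating a missing network as zero shares is an assumption on my part; you may prefer to skip URLs with incomplete data instead.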

Computing Scrape Rate

Jon talks about pulling 3 random URLs in order to determine the Scrape Rate. While this is a good quick-and-dirty gauge of the Scrape Rate, it’s not empirical. That is to say, without a more empirical approach you may end up with a Scrape Rate that isn’t truly indicative of the site’s propensity to be scraped. This is where Share Rate can inform the Scrape Rate. Using the share counts as an indication of popularity lets you weigh the importance of pages, so you can get a better Scrape Rate by averaging the 3 pages at the median of the URL list once it is sorted by share count.

  1. Find the median of the URL list, sorted by share count as above. So if there are 200 URLs, the median is the 100th URL.
  2. Pull the median URL, the URL preceding it, and the URL following it.
  3. Pull the page titles for these 3 pages using =HtmlTitle() in SeoTools.
  4. Run “intitle:” queries with these 3 page titles and record the number of results Google returns. Note: SeoTools has a function for pulling the number of results (=GoogleResultCount) but I couldn’t get it to work properly with concatenate and an intitle query.
  5. Take an average of these numbers.

Voila, we have an empirical Scrape Rate!
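The steps above can also be sketched in Python. This is only a rough illustration under a few assumptions: the URL list is already sorted by share count, the result counts are recorded by hand from Google (nothing here queries Google), and wrapping the title in quotes for the intitle: query is my own choice. The helper names and the sample counts are hypothetical.

```python
from statistics import mean


def median_window(sorted_urls):
    """Return the median URL plus the URL before it and the URL after it.

    Uses the lower middle for even-length lists, so 200 URLs yields the 100th.
    """
    if len(sorted_urls) < 3:
        raise ValueError("need at least 3 URLs")
    mid = (len(sorted_urls) - 1) // 2
    return sorted_urls[mid - 1 : mid + 2]


def intitle_query(title):
    """Build an intitle: query for a page title (quoting is an assumption)."""
    return f'intitle:"{title}"'


def scrape_rate(result_counts):
    """Average the Google result counts recorded for the 3 queries."""
    return mean(result_counts)


if __name__ == "__main__":
    urls = [f"https://example.com/blog/post-{n}" for n in range(1, 12)]
    print(median_window(urls))
    print(intitle_query("Just How Smart are Search Robots"))
    # Made-up result counts, as if copied by hand from three searches.
    print(scrape_rate([90, 120, 60]))
```

Keeping the Google lookups manual mirrors the post, since the automated result count did not cooperate with the intitle query.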


Guest posting takes a lot of time and effort, and you always want to get the most bang for your buck. Scrape Rate and Share Rate are great metrics for determining how worthwhile it will be to write a guest post on a given site. As always, you know I’m not into doing all this manual labor to compute something like this, so I’m already working on a tool… Stay tuned.

As always, your thoughts and feedback are appreciated. Also, a big shout-out to Jon Cooper for his great work!

Happy guest posting!

Mike King


  • Reply

    Great stuff Mike.

    I think the value of DMOZ over the years wasn’t so much its high PageRank (so much of it is diluted) but rather the fact that so many sites just scrape the heck out of it so you get lots more links with the anchor text you’re looking for in that way.

    Interesting concept, great way to prioritize bloggers for outreach.

  • Sam

    You can also take into account the niche you’re writing in… you may find a lot of autoblogs that pull content from that site’s RSS feed.

    You can use footprints to find out what site has the most autoblogs pulling from their feed and solicit them to write a guest post. It’d be possible to find out then how many links and what types of links you could expect to get.

    I dig this thinking. How many times do you see people complaining about their content getting scraped? Way too often.
