Category : Tools


6 Marketing Tools We Use That No One Else Really Talks About

As a marketing technologist, I can tell you that, frankly, there are far too many marketing technology solutions out there. I’m sure you’ve seen that aesthetically challenged image that keeps getting busier and busier each year with all of the marketing and ad tech solutions. We’re in the middle of the MarTech cash grab.

I spend a lot of time playing around with these tools and separating the good from the bad. Specifically with regard to marketing analysis, there are a number of great all-in-ones and point solutions that everyone is using, like Searchmetrics, Ahrefs, Screaming Frog, et al. However, there are a number of solutions that we get a lot of value out of that I don’t hear talked about much in the online marketing echo chamber. I feel that some of these tools give us an unfair advantage to scale our approaches and do a lot of great work with our small team. So today, I just want to introduce you to some of these tools in hopes that you may find value in adding them to your workflow and toolbox.


When I started doing link building in 2006, everything we did was tracked in Excel sheets. Then many of the non-tinfoil hat people moved to Google Docs due to its ImportXML features. At some point a lot of people with dev chops realized there are a lot of things that can be automated and tracked to improve and scale outreach. As a practitioner who really spent the time to get to know my prospects, I’ve been a big fan of BuzzStream because of how conducive it is to living within workflows outside of its system, and I very much appreciate some of their latest features (hello BuzzStream Discovery).

As a manager, I’ve found a lot of value in using Pitchbox due to its process-driven approach and its UI that keeps you within the system and focused on your specific tasks.

Pitchbox is the process-driven outreach tool. It has features that let you identify prospects in a variety of ways based on queries. You can import a list of prospects and it pulls many features and information about these sites and their authors. The tool also allows you to maintain a number of email templates with complex mail merge fields. Best of all, the system allows you to review prospects without even leaving it and provides analytics to show the activity of your team.

I love how Pitchbox helps inform strategy based on its prospecting. I also love that it makes it easy to provide complete transparency to our clients regarding our outreach efforts and, more importantly, it allows for a separation of concerns. The reality is that someone who is good at finding contacts, and knowing what a good link is, is not always someone who is good at outreach. Think about it, there are definitely SEO people that you wouldn’t want speaking to people without prior approval in real life or otherwise.

We can specify a user that just reviews link opportunities, and we can specify a user that just looks for features and data about the writer and their site and article to fill in the details for the mail merge fields in the emails. While I have historically been against form letters, I recognize that in order to develop predictable results across a larger team, we need to use very tailored form letters and Pitchbox allows for the best of both universes. Also, you have the ability to approve what’s going out a number of times before the emails are sent.

Judge for yourself, though: Jon Cooper wrote an epic post comparing all of the available solutions for managing outreach. He brings up great points about all of the solutions, but I still consider Pitchbox the go-to for my current use case.


I appreciate innovation, and I respect tool companies that are able to maintain a high quality of output that continues to shift what we can do as marketers. CognitiveSEO is one of those companies that continually releases features that are very relevant to the changing landscape of Organic Search. They were one of the first tools to really capitalize on the changes in the wake of Penguin and prepare actionable tools that allow you to get to the bottom of issues in your link profile.
CognitiveSEO is positioned as an all-in-one, providing rank reporting, penalty analysis, link reporting and quantitative content analysis. For our purposes, we primarily leverage it for its link analysis features.

A long time ago, I’d built a tool that crawled all of my links from all the available link indices to collect data and segment. What I love about Cognitive is that they’ve taken the same ideas and built them to a point well beyond my imagination. One of my favorite features is the force-directed visualization of the link profile. It allows you to visually highlight the nofollow vs. dofollow links, unnatural vs. natural and also isolate links using search. Visualizations like this make it a lot easier for clients to understand what’s at play.

I also really love their spam classification system. It’s driven off human input in that it requires the user to classify anchor text first and then run the spam classification algorithm. Once that’s done you get a detailed report of what they algorithmically determine to be spam links. While the human classification process can be very time-intensive, there is typically a very high payoff and you definitely shave several hours off of reviewing every link by hand.


Again, on the thread of innovation, I get excited by the rate at which new APIs are popping up. I frequently click around on ProgrammableWeb to see what new datasets are available. What I don’t have time for is fiddling with libraries or reconciling code against outdated API documentation versus how things actually work. So Postman is a great Chrome Extension for me to play with an API, see what types of data it will return and how I can work it into something else we’ve built. It’s a great tool for non-coders to get insights on APIs so they can better communicate to developers what data they want to use.

I love the fact that it allows me to quickly prototype an idea, understand the idiosyncrasies of the API’s responses and develop functional requirements.


I think it’s funny that you can put the word “science” after anything and it sounds way more interesting than it actually is. Think about it: Computer Science. Marketing Science. And of course, Data Science. That being said, I still haven’t gotten around to deep fluency in Python or R, so Orange is an incredibly helpful tool for me to use in data mining. You can visually set up your analysis, set the options and let it rip. Of course, you need an understanding of statistics to begin with, but once you’ve got that, there’s no coding involved. Orange is a data mining tool for Windows, Linux and MacOS that allows you to visually set up data analysis. Effectively, it’s a GUI for data science tasks in Python.

I love it because complicated computations are drag and drop. You can drag in your dataset, drag in linear regression and the method of visualization and voila! It’s that simple to get a result, although it’s certainly not that simple to interpret and ensure you get the right results. As a data miner, I find it incredibly valuable for running models and digging into data.
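If you’re curious what that drag-and-drop linear regression boils down to, here is a minimal sketch in plain Python. It is not how Orange is implemented; it is just ordinary least squares on a made-up dataset (the `referring_domains` and `organic_visits` numbers are invented for illustration) so you can see the computation the widget hides.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally sized lists with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data, invented for this example only.
referring_domains = [10, 20, 30, 40, 50]
organic_visits = [120, 210, 290, 405, 480]

slope, intercept = fit_line(referring_domains, organic_visits)
print(f"visits = {slope:.2f} * domains + {intercept:.2f}")
```

As with Orange, getting the line is the easy part. Deciding whether a straight line is the right model for your data is still on you.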


There’s been a lot of talk in recent years about topic modeling, entities and the usage of co-relevant or co-occurring terms, but I find it hard to believe many people are truly putting these ideas into practice. Virante, on the other hand, has done the research and built an otherwise ugly and heuristic-defying tool called nTopic to support the practice.

Simply, nTopic identifies the keywords your content should feature to be more relevant when considered for ranking for your target keywords. For example, taking the first page of our magnificent GTM Guide, nTopic tells me that while the page is a 99.87% for the keyword “google tag manager” there are some words to consider when editing the copy to have the page perform better.

I love it because, as opposed to Searchmetrics’ version of the tool, I don’t need to set up a campaign to access the analysis. It’s easy to do it on an ad hoc basis. They also have plugins for WordPress and Chrome that make it easy for these optimizations to happen within a writer’s normal workflow.

Keyword Studio

Our approach to keyword research is perhaps one of the most thoughtful, audience-driven, time-consuming and, most importantly, actionable methodologies available. As an agency, though, time comes at a premium, so it’s up to us to stay on top of ways to automate where possible to achieve quality and scale. The most time-intensive part of the process is the segmentation of keywords into categories, personas and need states.

While any tool for the latter two would require machine learning, Keyword Studio leverages a synonymy engine to greatly improve speed of categorization. It’s the only true all-in-one keyword research tool in that it pulls keyword opportunities, rankings, search volumes and CPCs. Basically it’s what Google’s Keyword Tool would be if they supported Organic the same way they support Paid Search.

That’s All Folks

There you have it. Some other lesser known tools that are becoming incredibly critical to our various online marketing analyses and workflows. Hope they can help you with your speed and scale.

Your turn, what tools do you use that people aren’t talking about?

How to Run Screaming Frog and URL Profiler on Amazon Web Services

I’ve been a huge fan of Screaming Frog SEO Spider for a number of years now. One would be hard-pressed to find a finite number of use cases for the tool. I also very much appreciate Dan Sharp and his team’s continued focus on innovation and improvement with the tool as well.

I also love a lot of the other crawler tools that have popped up in its wake like DeepCrawl and URLProfiler. Now I’m also getting to know as well and I encourage you to give their free trial a spin.

URL Profiler, though, has planted itself as the go-to tool for our content auditing process. That said, I’d encourage you to check out Moz’s new content auditing tool as well.

From what I know of these tools, they all have their own strengths, weaknesses and use cases. For example, if we’re doing a population (vs. sample-based) content audit on millions of pages, we’d typically use DeepCrawl and then run batches of 50k URLs in URLProfiler.

However, despite how awesome the SaaS crawlers are, I always feel like I “know” a website better when I do a Screaming Frog or URLProfiler crawl. Also one of our team members has built to bring headless browsing features to Screaming Frog, so that is an added incentive for us to make it work. I’m well aware that this is more a reflection of how well I know these products than the shortcomings of the other products. Nonetheless, it’s more important to do what it takes to do work that we’re PROUD of than to use the most sophisticated tool.

All that said, how many times have you been frustrated by this dialog box?

Why Does That Happen?

Technologically, cloud-based crawlers have a distinct advantage over desktop crawlers. Typically, cloud-based crawlers operate using a series of nodes that distribute the crawl. Each of these nodes runs a small application managed by another centralized application that makes the crawling fault-tolerant. Also, cloud-based crawlers save their crawl data to a database, so memory overhead can be kept very low. Finally, cloud-based crawlers have a virtually infinite set of computing resources to pull from to facilitate the crawl. In summation, cloud-based crawlers can be distributed, faster and more resistant to failure. The diagram below from an eBay patent gives a visual representation of how a cloud-based distributed crawling system typically works.
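To make that architecture a little more concrete, here is a toy sketch of the pattern, not any vendor’s actual system. Threads stand in for the crawl nodes, a shared queue plays the coordinator’s frontier, and results go straight into a SQLite database rather than piling up in memory. The `fetch_links` function is a hypothetical stand-in that you would replace with real HTTP fetching and link extraction.

```python
import queue
import sqlite3
import threading


def crawl(seed, fetch_links, node_count=4):
    """Crawl outward from `seed` with `node_count` worker threads ("nodes")."""
    frontier = queue.Queue()        # URLs waiting to be crawled
    seen = {seed}                   # URLs that have already been queued
    lock = threading.Lock()         # guards `seen` and the database handle
    db = sqlite3.connect(":memory:", check_same_thread=False)
    db.execute("CREATE TABLE pages (url TEXT PRIMARY KEY, outlinks INTEGER)")
    frontier.put(seed)

    def node():
        while True:
            url = frontier.get()
            if url is None:         # shutdown signal from the coordinator
                frontier.task_done()
                return
            try:
                try:
                    links = fetch_links(url)
                except Exception:
                    links = []      # one bad page should not kill the node
                with lock:
                    db.execute("INSERT INTO pages VALUES (?, ?)",
                               (url, len(links)))
                    for link in links:
                        if link not in seen:
                            seen.add(link)
                            frontier.put(link)
            finally:
                frontier.task_done()

    workers = [threading.Thread(target=node) for _ in range(node_count)]
    for worker in workers:
        worker.start()
    frontier.join()                 # block until every queued URL is handled
    for _ in workers:
        frontier.put(None)          # tell each node to stop
    for worker in workers:
        worker.join()
    db.commit()
    return db


# A tiny fake site so the sketch runs without touching the network.
FAKE_SITE = {"/": ["/a", "/b"], "/a": ["/b", "/c"], "/b": [], "/c": ["/"]}
database = crawl("/", lambda url: FAKE_SITE.get(url, []))
print(sorted(database.execute("SELECT url, outlinks FROM pages")))
```

The point of the design is that the set of pages lives in the database and the work list lives in the queue, so adding nodes adds throughput without any single process having to hold the whole crawl in RAM.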

Conversely, desktop crawlers are limited by the specs of your computer and they run in memory. If your machine has 4 CPU cores, 8 GB of RAM, you’re running Windows 8, have 50 tabs open in Chrome and have a bunch of TSRs running, the Frog is very likely to actually be screaming in pain while it’s crawling for you. A desktop crawl is inherently a resource limited crawl; this is why it’s prone to crash or run out of memory when it crawls too many pages.

Screaming Frog’s advantage over URL Profiler is that, once it reaches the resource limitation, it will ask you if you’d like to save your crawl and then keep going. URL Profiler on the other hand will just crash and all of that data is gone. Typically, I watch the usage of processes in Task Manager and start closing other applications when CPU or memory get too close to 100%.

Sounds like the odds are against you for big sites with desktop tools? Sure, they certainly can be, but none of the cloud-based tools get me the combination of data I want just the way I want it. So what can we do?

Enter Amazon Web Services

What we’re going to do now is run Screaming Frog and URLProfiler on Amazon Web Services. This will allow us to run the tools on an isolated machine that has far more resources and likely more consistent speed than anything you or I have in our respective offices. My own machine, which is a fantastic Samsung ATIV-9, has 2 cores, 8 GB RAM and a 256 GB SSD. On AWS we can configure a machine that has 40 cores, 160 GB and virtually infinite space. We won’t, because that’s overkill, but you get the point.

Odds are that you’ve heard of Amazon Web Services (AWS) and you may throw it around as an option for how you can do fancy things on the web. Or perhaps you’ve read about how it powers many of the apps that we all use every day. Whatever the case, the long and short of it is Amazon Web Services gives you virtual computing resources in a variety of different ways. Effectively, you can host a series of servers, databases, storage space, et al in myriad configurations and manipulate them programmatically on-demand. For example, when you fire up a crawl in DeepCrawl, it takes a few minutes for it to get started because it has to launch a number of EC2 instances to facilitate that crawl.

That use case doesn’t apply to what we’re doing here, but you now have a picture of how those tools use AWS to their advantage. In this case, we’ll spin up one box and configure it to just run exactly what we need.

As you can see below, there are numerous different services that Amazon offers. The one we will be focusing on most is Elastic Compute Cloud, commonly referred to as EC2.

You’ll also need to know a little bit about VPC to get access to your servers remotely, but we won’t go too deep into that.

Although the list of services above can appear daunting, I promise you the process of getting set up will be pretty painless. Shall we?

How to Set Up a Windows Box on AWS with Screaming Frog and URLProfiler

To get yourself going on Amazon Web Services, we’ll effectively be setting up an instance of a Windows Server, installing the programs on it, running our crawls, saving an image of that instance and shutting it down. Here we go!

  1. Log in to Amazon Web Services – You’ll be using your Amazon account for this. Amazon gives a free 12 months of AWS service to first time users. Be advised that the free tier only applies to certain usage types. Instances in the free tier won’t be adequate for what we’re looking to accomplish, but pricing beyond those usage types is quite reasonable.
  2. Launch your Instance – First, make sure you’re in the right region (in the upper right, next to my name). North Virginia is the cheapest of the data centers. After that click Launch Instance.
  3. Choose your AMI – An Amazon Machine Image (AMI) is a pre-installed set of configured software. Rather than setting up a blank machine and needing to install an operating system, Amazon allows you to clone a fresh machine with an Operating System of your choice already installed. You could set up your own configurations and create your own AMIs as well, but we won’t. In this case we’ll be choosing the Windows Server 2012 R2 Base AMI.
  4. Choose an Instance Type – This is where you get to choose your computing power. As you can see the free tier (t2.micro) only gives you one core and one GB of RAM. That’d be fine, for a single node, if you’re writing a script that did your crawling, but you’re not, you’re running a full featured memory-hungry Windows application. Go with the r3.4xlarge instance type with 16 cores and 122 GB of RAM and let those programs breathe. You can find out more information on the instance types that AWS offers here. Spoiler alert: The R3 instances are “memory-optimized” and suggested specifically for running analytics programs.
  5. Configure Instance Details – You can pretty much leave these all as defaults. Well, this being your first instance, you’ll have to set up a VPC and configure a network interface so that you can actually login to your Windows server. You should also check protect from automatic shut down since this is your first time playing with AWS; that way you’re sure to not lose any data.

    Read this for more information on configuring a VPC.

  6. Configure Security Group – AWS is annoyingly secure. You’re going to need to configure a security group using the launch wizard. Security groups allow you to give access to users based on their IP addresses. However, since you’re not storing anything significant on this box you can go ahead and give the security group access from any IP. Should you start saving anything of value, I’d recommend locking it down to the IPs that only you and your team can access.
  7. Review Instance Launch – As with any tool that uses a wizard, you are just making a final check of your configuration at this point. Double check that your screen looks pretty close to this. You should see the two warning indicators at the top if you’ve configured it as I have. Your instance type will be reflective of whatever options you’ve set.
  8. Create a New Key Pair – A key pair is a public and private key that AWS uses for logging in. For Windows Server, AWS uses this so you can retrieve the administrator password. Create the key pair and download the file.

  9. Connect to Your Instance – AWS will give you a configuration file to download in order to connect to your instance using the Remote Desktop application. You’ll also need to upload your key pair first to get the administrator password here. Once you do this, the admin password does not change so as long as you keep it, you won’t need to connect via this interface again. So go ahead and save your password and login using the Remote Desktop Connection app directly. You’ll want to save the file and password to make it easy to share login details with your colleagues.

    Once you’ve logged in, you’ll get a window of Windows that looks like this (minus Chrome, URL Profiler and my Screaming Frog crawls directory):

    Naturally Windows Server has different features from the Home versions, but it will operate fundamentally the same as Windows 8. RDC will take over hot keys whenever the window is maximized. If this is your first time using the Remote Desktop application, check out this post on how to map your drives so you can access your local files on the remote machine.

  10. Install Chrome – The first thing you will want to do is install Chrome so you are not saddled with the abomination that is Internet Explorer.
  11. Change Internet Security Settings – You’re going to run into some issues trying to install Java on this annoyingly “secure” install of Windows Server. Go to Security Settings and configure the custom level by enabling everything. You can go ahead and change it back after Java is installed.
  12. Install Java 64-bit – You’ll want to install Windows Offline (64-bit) from the manual install page. 64-bit is important because the memory allocation option breaks Screaming Frog otherwise.

  13. Install Screaming Frog SEO Spider – Because Screaming Frog requires a little more configuration to get it supercharged, let’s start with that first. Download Screaming Frog and input your license key.

  14. Maximize Screaming Frog’s Memory Allocation – Screaming Frog has a configuration file that allows you to specify how much memory it allocates for itself at runtime. This ScreamingFrogSEOSpider.l4j.ini file is located with the executable application files. Open it in Notepad and change its default 512MB memory allocation to 120GB. For those that want to know what this does, this value is a JVM option that tells Java the maximum amount of memory Screaming Frog may use. Screaming Frog simply passes this through to Java when it runs.
  15. Ramp up the threads – By default Screaming Frog only uses 5 threads at a time to be nice to webmasters. Let’s ramp that up to 15 so we can get this job done quicker.

  16. Install URL Profiler – Download URL Profiler, install it and put in your license key.

  17. Set Up Your API Keys – Set up your API keys for all of the services that you want it to use.
  18. Create an AMI Image – Now that your instance is completely configured, we’ll want to create an image of it just in case anything goes wrong or you want to create several instances of your box if you need to run multiple high-octane crawls at once.

    Give your image a name.
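If you end up cloning this box often, the memory change from step 14 is easy to script. The sketch below assumes the configuration file holds a standard JVM maximum-heap option such as `-Xmx512M`; that file content is my assumption, so open your own copy and check before relying on it. The `set_max_heap` helper is hypothetical and only rewrites text, which leaves reading and writing the actual file up to you.

```python
import re


def set_max_heap(config_text, new_size):
    """Return `config_text` with its -Xmx option changed to `new_size`.

    `new_size` uses the usual JVM notation, e.g. "120g" or "2048m".
    """
    if not re.fullmatch(r"\d+[kKmMgG]", new_size):
        raise ValueError("size must look like 512m or 120g")
    updated, replaced = re.subn(r"-Xmx\d+[kKmMgG]?", "-Xmx" + new_size,
                                config_text)
    if replaced == 0:
        # Refuse to guess where the option belongs if it is not there.
        raise ValueError("no -Xmx option found in the configuration text")
    return updated


# Assumed default content of the file; verify against your install.
print(set_max_heap("-Xmx512M\n", "120g"))
```

Leave a little headroom below the instance’s physical RAM, since Windows itself and URL Profiler need memory too.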

Now You’re Ready to Roll

While I don’t know the limitations of this configuration, I’m currently looking at it in the middle of a 20 million URL crawl. If you run into any problems you can always go to the bigger instance for more memory. Ideally, you’d be able to add bigger volumes (hard drives) to the instances so the programs could lean on virtual memory, but from tests and the documentation it appears that Screaming Frog and URLProfiler only use physical memory. Effectively, you are limited to whatever the maximum memory configuration (244 GB in case you’re wondering) can hold at once. For reference, Screaming Frog’s documentation specifies that “Generally speaking with the standard memory allocation of 512mb the spider can crawl between 10K-100K URI of a site. You can increase the SEO spider’s memory and as a very rough guide, a 64bit machine with 8gb of RAM will generally allow you to crawl a couple of hundred thousand URLs.” While I’m skeptical of that number based on those specs, assuming 8GB gets you 200k URLs, then 122GB should get you 3.05 million URLs.
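For anyone who wants to plug in their own numbers, that back-of-the-envelope estimate is just linear scaling from the 8GB baseline. The sketch below assumes crawl capacity grows in direct proportion to RAM, which is a simplification (page size and the data you collect per URL both matter), and `estimated_urls` is a name I made up for illustration.

```python
def estimated_urls(ram_gb, baseline_ram_gb=8, baseline_urls=200_000):
    """Scale the documented rough guide linearly to another RAM size."""
    return ram_gb * baseline_urls // baseline_ram_gb


print(estimated_urls(122))  # 3050000
print(estimated_urls(244))  # 6100000
```

Treat the output as a floor to sanity-check against rather than a promise. The 20 million URL crawl mentioned above suggests real-world mileage can be much better than the rough guide.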

Additionally, the beauty of Remote Desktop is that you can start the crawl, close the window and then remote back in later and it will have kept running the entire time. Remember that Amazon Web Services charges you by the hour, so don’t forget that you’re running an instance if you’re concerned with what you’re spending. Which brings me to my next point…

What’s this Going to Cost Me?

Amazon’s pricing is completely dependent upon your configuration and they have a price calculator as well as the spending alert system to help you stay on top of it.

Based on the configuration that we’ve chosen, if we left it up for 100 hours (a little over 4 consecutive days) per month, it’d cost $237.33. Provided you could crawl 3 million URLs in that time period (site speed and throttling dependent), that’s far cheaper than the $2980 that DeepCrawl charges for 3 million URLs with their pay-as-you-go plan.
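Here is the same comparison worked through as a quick calculation, using only the figures quoted above. The helper names are mine, and the 3 million URL throughput is the same assumption made in the paragraph, not a measured result.

```python
def hourly_rate(total_cost, hours):
    """Effective cost per hour of running the instance."""
    return total_cost / hours


def cost_per_million_urls(total_cost, urls_crawled):
    """Normalize a bill to dollars per one million URLs crawled."""
    return total_cost / (urls_crawled / 1_000_000)


AWS_TOTAL = 237.33         # 100 hours on the chosen instance
DEEPCRAWL_TOTAL = 2980.00  # quoted pay-as-you-go price for 3 million URLs
URLS = 3_000_000           # assumed throughput in those 100 hours

print(round(hourly_rate(AWS_TOTAL, 100), 2))                  # 2.37
print(round(cost_per_million_urls(AWS_TOTAL, URLS), 2))       # 79.11
print(round(cost_per_million_urls(DEEPCRAWL_TOTAL, URLS), 2)) # 993.33
```

If your crawl runs slower than assumed, the AWS cost per million URLs rises in proportion, so rerun the numbers with your own hours before deciding.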

Wrapping Up

Naturally, cloud-based crawlers offer different plans and do a lot of the work for you. Alternatively, you could just build a maxed-out machine that just runs Screaming Frog and URLProfiler and save money. Or you could run Screaming Frog on a Linux box to save more overhead and potentially run on a smaller instance, but I’m guessing that if you could, you’re probably not reading this post. Either way, hosting Screaming Frog and URLProfiler on AWS is a great short-term solution when your desktop crawl needs more power.

Now it’s your turn. I’d like to hear how you’ve overcome the limitations of desktop crawling in the comments below!

*** UPDATE: Fili Wiese actually beat me to the punch on this. Check out his discussion on how to run Screaming Frog on Google Cloud Servers! ***

How to Uncover 100s of New Longtail Keywords in Minutes

Hey, I’m Andrew Breen. I run Outshine Online Marketing. We’re a small company with big ambitions. Let’s connect on Twitter: @breenandrew.

By now you’re probably well acquainted with Google Suggest. It’s Google’s search tool that gives you search suggestions as you type your query into the Google search box.

Google Canada Prepaid Credit Cards Screenshot

And while Google Suggest can generate some hilarious and weird results, you can also use it to quickly generate a massive longtail keyword list in minutes.

This article will show you how to combine two awesome Google Suggest scrape tools to generate a list of hundreds or thousands of related keywords in minutes. Then I’ll show you how to turn that huge list of keywords into something you can actually use.

What the heck do I want all those keywords for?

If you’re wondering why you’d want such a list of longtail keywords, wonder no more. I use this Google Suggest scrape method to find keywords for two things:

  1. New content ideas – Stuck on topics to write about in your niche? Scraping Google Suggest will give you content ideas you’re not going to find anywhere else. The results Google Suggest shows you are a reflection of search activity on the web. Sure, the keyword phrases it suggests may not get a lot of searches, but they are getting some. You can use this scrape method to unearth great longtail phrases that are easy to rank for and still generate traffic.
  2. PPC campaigns – Think you’ve found all the right keywords to bid on in Adwords? Maybe not. By scraping Google Suggest keywords, you can find keywords to add to your campaign that you would have never thought of. Just as importantly, Google Suggest scraping is a great way to find negative match keywords to target before you waste your money on them.

Now that you know why you want to scrape Google Suggest, let’s get into my method of actually doing it.

Step 1

Start at Ubersuggest. It’s a free web-based tool that lets you export lists of Google Suggest phrases based on a keyword you enter. Kudos to Ken Jurina for showing me this.

Enter your keyword and select your language. You can choose to scrape Google Suggest phrases from the web, news, or products searches. In this case I’ll use the web results.

Now check the txt box – that will let you download the results in a text file.

Click “suggest” and a suggestion.txt file will download to your computer. Open it in Notepad++; regular Notepad screws up the spacing.

Now I have a list of 242 “prepaid credit card” related keywords that Ubersuggest has extracted from Google Suggest.

Prepaid Credit Cards Keywords

Step 2

Here’s where the fun starts. We are going to take all 242 keywords from the previous step and look for even more Google Suggest results using ScrapeBox. While UberSuggest lets you scrape the results for one keyword, ScrapeBox lets you scrape the results for hundreds of keywords at a time.

Not familiar with ScrapeBox? You should be. It’s a powerful way to speed up SEO tasks like keyword research and link building.

I should fully disclose here that ScrapeBox is typically a black hat SEO tool. Sure, it’s popular for mass blog comment spamming, but it’s also a versatile tool that can be used for white hat SEO too. Think of Scrapebox as a weapon – in the wrong hands it’s deadly, but it can be used for good too. And we’re all about the good.

Once ScrapeBox is open, drop that list of keywords from Step 1 into ScrapeBox’s Keyword Scraper Tool.

Select your scrape sources and search engines. There are a number of options here, and your choices will depend on the type of site you are doing keyword research for. The product and shopping suggestions are handy for ecommerce research, but for content ideas I just focus on the main search engines:

Scrapebox Keyword Scraper Options Screenshot

Click “Scrape” and kick back as ScrapeBox does its thing. This can take a few minutes depending on the number of scrape sources you picked, the number of keywords in your main list, and the speed of your proxies.

When ScrapeBox finishes running, I click “Remove Duplicate Keywords” and I am left with a list of 574 keywords related to prepaid credit cards.

Want even more keywords? Transfer the list you just scraped back into the main keyword list and run another scrape.
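The loop in Steps 1 and 2 (take a list, fetch suggestions for every phrase, remove duplicates, feed the new phrases back in) can be sketched in a few lines. This is only an illustration of the idea, not how Ubersuggest or ScrapeBox work internally. The `suggest` argument is a hypothetical function you supply that returns suggestions for one phrase, and the fake data below exists purely so the example runs offline.

```python
def normalize(phrase):
    """Lowercase and collapse whitespace so near-duplicates match."""
    return " ".join(phrase.lower().split())


def expand_keywords(seeds, suggest, rounds=1):
    """Grow a keyword list by feeding each round's new phrases back in."""
    seen = set()
    found = []      # keeps first-seen order, duplicates removed
    frontier = []
    for seed in seeds:
        phrase = normalize(seed)
        if phrase and phrase not in seen:
            seen.add(phrase)
            found.append(phrase)
            frontier.append(phrase)
    for _ in range(rounds):
        next_frontier = []
        for keyword in frontier:
            for suggestion in suggest(keyword):
                phrase = normalize(suggestion)
                if phrase and phrase not in seen:
                    seen.add(phrase)
                    found.append(phrase)
                    next_frontier.append(phrase)
        frontier = next_frontier  # only new phrases are scraped next round
    return found


# Invented suggestions standing in for a real suggestion source.
FAKE_SUGGESTIONS = {
    "prepaid credit card": [
        "prepaid credit card canada",
        "Prepaid Credit Card  Canada",
        "prepaid credit card for teens",
    ],
    "prepaid credit card canada": ["prepaid credit card canada no fee"],
}

keywords = expand_keywords(
    ["prepaid credit card"],
    lambda keyword: FAKE_SUGGESTIONS.get(keyword, []),
    rounds=2,
)
print(keywords)
```

Each extra round is the scripted equivalent of transferring the scraped list back into the main keyword list and running another scrape.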

What do I do with all these keywords?

So now you have hundreds of keywords. Are they all useful? Of course not. But with a few minutes of work in Excel, you can turn this unmanageable mass of keywords into a targeted list that you can actually use. Here are two different ways you can refine the list.

    1. If I’m using the list for PPC keyword ideas, I rely on Excel’s Conditional Formatting and Sort & Filter functions to home in on the keywords I’m interested in. Let’s say I’m running an Adwords campaign offering prepaid credit cards from a major credit card provider. The client is sensitive about their brand image – they don’t want to appear to be marketing to minors. With Excel, I can use the “Text That Contains” Formatting option to highlight uses of keywords like “teen,” “kid” and “child.” This highlights all keyword phrases that contain my specified text. But the highlighted keywords are still mixed in with the regular keywords, so I’d then filter the list using the “Sort by Color” option. The “Sort by Color” option brings all the highlighted keywords to the top of the list so I can review them all at once.

      The Google Suggest scrape method is great for finding new negative match keywords you didn’t consider. In this case I found people were using phrases I hadn’t considered, like “under 13,” which I immediately add to my negative match list in Adwords.

    2. If you’re doing keyword research for content ideas, you’re going to love what I am about to tell you: With Richard Baxter’s Google Adwords API Extension for Excel you can take your list of keywords from Scrapebox and, from within Excel, grab Adwords search volume data. That’s right – no more flipping back and forth between the web-based Adwords Keyword Tool and your Excel sheet. Talk about a timesaver. Now you can sort all your content idea keywords by search volume, which will show you where you’re best off investing your content-creation time. The Excel extension is free; all you pay for is the API costs to Google, which are negligible. High-five to John Doherty for showing me this.
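If Excel isn’t your thing, both refinements above translate to a few lines of code. This sketch mirrors the “Text That Contains” highlight with a simple substring check and the volume sort with a lookup table. The function names and the search volume numbers are invented for illustration; real volumes would come from whatever data source you use.

```python
def split_flagged(keywords, flagged_terms):
    """Separate phrases containing any flagged term from the rest."""
    terms = [term.lower() for term in flagged_terms]
    flagged, clean = [], []
    for keyword in keywords:
        lowered = keyword.lower()
        if any(term in lowered for term in terms):
            flagged.append(keyword)   # candidates for negative match
        else:
            clean.append(keyword)
    return flagged, clean


def sort_by_volume(keywords, volumes):
    """Order phrases from highest to lowest search volume."""
    return sorted(keywords, key=lambda kw: volumes.get(kw, 0), reverse=True)


scraped = [
    "prepaid credit card canada",
    "prepaid credit card for teens",
    "prepaid credit card under 13",
    "prepaid credit card no fee",
    "prepaid credit card for kids",
]
flagged, clean = split_flagged(scraped, ["teen", "kid", "child", "under 13"])
print(flagged)

# Hypothetical monthly search volumes, made up for this example.
volumes = {
    "prepaid credit card canada": 1900,
    "prepaid credit card no fee": 320,
}
print(sort_by_volume(clean, volumes))
```

Like the Excel highlight, this is a plain substring match, so “kid” would also flag a phrase containing “kidney”. Review the flagged list by eye before adding anything to your negative match list.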

So now that you know how to build a huge list of longtail keywords using Google Suggest scrapers, what are you waiting for? Get cracking! Test it out now and start generating new ideas for your website or PPC campaign.