Puppeteer on The Edge: SEO Use Cases with Cloudflare’s New Rendering API (beta)

With Cloudflare's new Rendering API, you can run creative browser-based automations on the Edge with Puppeteer for SEO processes.

In search engine optimization engineering, automated testing is critical to a stable organic search presence. Despite the utility of a framework like Puppeteer, there is a surprising number of hoops to jump through before you can run the library properly.

In an exciting turn of events, Cloudflare recently announced the ability to run Puppeteer within a Cloudflare Worker as part of its Workers Browser Rendering API, which is currently in closed beta but accepting new users via a waitlist.

This opens the door to a variety of possibilities, as many developers, myself included, have been using Google Cloud Functions, AWS Lambda, or even a dedicated virtual machine to handle the bloat of running a full web browser as part of an application.

Today, you can make fetch calls from a Cloudflare Worker to a cloud function (Google Cloud, AWS, DO, Heroku, etc.) that runs the browser for you. You have 30 seconds to respond, which is normally enough time, or up to 15 minutes on a scheduled Cloudflare Worker.

Community Cloudflare Excitement and Innovation

Taking a step back, it seems I was not the only one wanting this feature:

Adam Wathan of TailwindCSS called this out as a killer feature that Cloudflare could build: pre-rendering for SPAs when a bot or crawler is detected. Cloudflare already offers similar integrations, such as IndexNow and Signed Exchanges (SXGs).

After the announcement, there was more discussion on how this could have a major impact on tools that are built on top of Puppeteer.

Repeat.dev offers a webhook to schedule tasks and generate PDFs, which is the exact use case Cloudflare calls out, and they hope to integrate the solution soon.

The timing of this tweet is just perfect as it was announced a few days later:

Example Cloudflare Puppeteer Integration

Although I have only requested early access so far, here is what we know about what Cloudflare is offering and how easily it could be integrated.

Puppeteer can be imported as a package built specifically for Cloudflare.

import puppeteer from "@cloudflare/puppeteer"

You can create a browser instance and open a page, just like you would in any Puppeteer script. It will be interesting to see whether there are Cloudflare-specific methods, or methods missing from the normal Puppeteer package.

// Launch a browser through the Worker's browser binding
const browser = await puppeteer.launch({
  browserBinding: env.MY_BROWSER
})
const page = await browser.newPage()

// Load the page and capture a screenshot
await page.goto("https://example.com/")
const img = await page.screenshot() as Buffer
await browser.close()

Cloudflare's example even shows how to upload the screenshot to R2, Cloudflare's object storage solution, which is much like AWS S3 or Google Cloud Storage.

try {
  await env.MY_BUCKET.put("screenshot.jpg", img);
  return new Response(`Success!`);
} catch (e) {
  return new Response("", { status: 400 })
}

What kind of limitations will Puppeteer have on Cloudflare?

Since Cloudflare is a CDN and Workers operate on the edge, I’m curious to see what limits or backdoors this opens up in the world of scraping.

Cloudflare Workers already allow you to visit sites that normally would block you if you are making requests from another server. Because Cloudflare Workers are on the Edge, you are less likely to be blocked. 

One limit that I came across recently was around subrequests. Rather than have the client make 50 individual requests that would render data on the fly, I had the client make 1 request to a Cloudflare Worker, which would then wrap 50 requests and respond with the final result. 

This allows for heavy lifting, such as filtering unique values and sorting arrays, before sending the data back to the client. As a result, none of that business logic is needed in the front end, which can simply take a payload and render a view.
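The fan-out pattern described above could be sketched roughly as follows. This is a minimal illustration rather than the code from that project: `fetchAll`, `uniqueSorted`, and the `ListFetcher` type are hypothetical names, and the sketch assumes each upstream request resolves to a list of strings.

```typescript
// Hypothetical shape: fetches one upstream URL and resolves to the
// list of string values it returned.
type ListFetcher = (url: string) => Promise<string[]>;

// Remove duplicates and sort (default code-unit order), so the client
// receives a list that is ready to render.
function uniqueSorted(values: string[]): string[] {
  return Array.from(new Set(values)).sort();
}

// Fan out to every upstream URL in parallel, then merge the results.
// Assumption: a failed upstream is treated as an empty list instead of
// failing the whole batch.
async function fetchAll(urls: string[], fetchList: ListFetcher): Promise<string[]> {
  const lists = await Promise.all(
    urls.map((url) => fetchList(url).catch(() => [] as string[]))
  );
  const merged: string[] = [];
  for (const list of lists) {
    merged.push(...list);
  }
  return uniqueSorted(merged);
}
```

Inside a Worker, `fetchList` would typically wrap `fetch(url)` and parse the JSON body, and the Worker would return the merged array as a single response.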

What are some possible projects for Puppeteer on the Edge?

The obvious use case is crawling a page and parsing the content for SEO insights:

  • Page Title
  • Meta Tags
  • Headings (H1-H6)
  • Extracting Text
  • Extracting Links
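As a rough sketch of that extraction step: once Puppeteer has rendered a page, `page.content()` returns the serialized HTML as a string, and the fields above can be pulled out of it. The `extractSeo` function and `SeoSnapshot` type below are hypothetical, and the regex-based parsing is a shortcut chosen for brevity that assumes double-quoted attributes and `name` appearing before `content`. A real crawler would more likely query the DOM inside `page.evaluate` or use a proper HTML parser.

```typescript
interface SeoSnapshot {
  title: string | null;
  metaDescription: string | null;
  headings: { level: number; text: string }[];
  links: string[];
}

// Strip tags and collapse whitespace to get readable text from a fragment.
function textOf(fragment: string): string {
  return fragment.replace(/<[^>]*>/g, " ").replace(/\s+/g, " ").trim();
}

function extractSeo(html: string): SeoSnapshot {
  const titleMatch = /<title[^>]*>([\s\S]*?)<\/title>/i.exec(html);
  // Simplification: only matches name="description" followed by content="...".
  const metaMatch = /<meta\s+name="description"\s+content="([^"]*)"/i.exec(html);

  // Collect H1-H6 in document order.
  const headings: { level: number; text: string }[] = [];
  const headingRe = /<h([1-6])[^>]*>([\s\S]*?)<\/h\1>/gi;
  let h: RegExpExecArray | null;
  while ((h = headingRe.exec(html)) !== null) {
    headings.push({ level: Number(h[1]), text: textOf(h[2]) });
  }

  // Collect the href of every anchor tag.
  const links: string[] = [];
  const linkRe = /<a\s[^>]*href="([^"]+)"/gi;
  let a: RegExpExecArray | null;
  while ((a = linkRe.exec(html)) !== null) {
    links.push(a[1]);
  }

  return {
    title: titleMatch ? textOf(titleMatch[1]) : null,
    metaDescription: metaMatch ? metaMatch[1] : null,
    headings,
    links,
  };
}
```

In a Worker this would be called as `extractSeo(await page.content())` after `page.goto(...)` has finished.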

There are already a ton of businesses built on top of Puppeteer:

  • API & E2E Monitoring
  • PDF/GIF/Screenshot Rendering
  • Product Monitoring

Specifically for the SEO community, here’s a list of opportunities that I am excited about:

  • Edge Crawling – a Screaming Frog crawler built on Cloudflare that exports to R2 storage. No server, no local machine, just Edge.
  • Crawl + AI – Fetch a SERP, parse out the content, and use the content to generate new content to upload to your site.
  • SEO Alerts – Crawl your clients’ sites to monitor for changes and report issues.
  • SEO + NLP – Crawl sites, extract text, and process NLP for better insights and opportunities.
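To make the SEO Alerts idea from the list above a little more concrete, here is a minimal sketch of the comparison step: take the fields captured on the previous crawl and on the current crawl, and report what changed. The names `PageSnapshot`, `Change`, and `diffSnapshots` are hypothetical, and the sketch assumes each snapshot is a flat map of field name to string value (for example title, canonical, or robots).

```typescript
// Assumed snapshot shape: field name -> captured value.
type PageSnapshot = Record<string, string>;

interface Change {
  field: string;
  before: string | null; // null means the field did not exist previously
  after: string | null;  // null means the field has been removed
}

function valueOf(snapshot: PageSnapshot, field: string): string | null {
  return Object.prototype.hasOwnProperty.call(snapshot, field) ? snapshot[field] : null;
}

// Compare two crawls of the same URL and list every field that was
// added, removed, or modified, sorted by field name.
function diffSnapshots(previous: PageSnapshot, current: PageSnapshot): Change[] {
  const allFields = Object.keys(previous).concat(Object.keys(current));
  const fields = Array.from(new Set(allFields)).sort();
  const changes: Change[] = [];
  for (const field of fields) {
    const before = valueOf(previous, field);
    const after = valueOf(current, field);
    if (before !== after) {
      changes.push({ field, before, after });
    }
  }
  return changes;
}
```

A scheduled Worker could store the previous snapshot in R2, run the crawl, and send an alert whenever `diffSnapshots` returns a non-empty list.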

There are a ton of low-hanging opportunities as well: even something as simple as status code monitoring to find broken links and redirect issues could be configured.
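A status code monitor along those lines might look something like the sketch below. The names `checkLinks`, `classifyStatus`, and `StatusFetcher` are hypothetical, and the four-way classification is an assumption made for illustration, not an established convention.

```typescript
type LinkIssue = "ok" | "redirect" | "broken" | "server_error";

interface StatusResult {
  url: string;
  status: number;
  issue: LinkIssue;
  location: string | null; // redirect target, when the server sent one
}

// Minimal response shape; a standard fetch Response satisfies it.
interface MinimalResponse {
  status: number;
  headers: { get(name: string): string | null };
}
type StatusFetcher = (url: string) => Promise<MinimalResponse>;

// Bucket an HTTP status code into a coarse SEO-relevant category.
function classifyStatus(status: number): LinkIssue {
  if (status >= 500) return "server_error";
  if (status >= 400) return "broken";
  if (status >= 300) return "redirect";
  return "ok";
}

// Request every URL in parallel and report its status and redirect target.
async function checkLinks(urls: string[], fetchStatus: StatusFetcher): Promise<StatusResult[]> {
  return Promise.all(
    urls.map(async (url) => {
      const res = await fetchStatus(url);
      return {
        url,
        status: res.status,
        issue: classifyStatus(res.status),
        location: res.headers.get("location"),
      };
    })
  );
}
```

In a Worker, `fetchStatus` could be `(url) => fetch(url, { redirect: "manual" })`, so that 3xx responses are reported rather than silently followed.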

What will you build or want to build with Puppeteer on the Edge? Hit me up on Twitter – @johnmurch with your thoughts.

Want to see how iPullRank can set your organization up with SEO monitoring automation? Check out how we help advanced SEO teams and contact us for any projects you want to launch.

John Murch
