How to Scrape Cloudflare-Protected Websites
Cloudflare protects a significant portion of the web. When you try to scrape a Cloudflare-protected site with a standard HTTP request or a basic headless browser, you get a challenge page instead of the content you need. Here is how to get through it.
How Cloudflare identifies automated traffic
JavaScript challenges
Cloudflare's first line of defense is a JavaScript challenge that runs in the browser. It checks whether your client can actually execute JavaScript and evaluates browser environment properties. Plain HTTP requests fail immediately because they cannot execute JavaScript at all.
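If you are already fetching pages with a plain HTTP client, it helps to recognize when you received a challenge page rather than real content. The sketch below is one way to do that, not an official detection method. It relies on the cf-mitigated: challenge response header that Cloudflare sets on challenge pages, and falls back to status codes and HTML markers that are assumptions based on commonly seen challenge pages. The function name and the marker list are hypothetical.

```python
from typing import Mapping

# Heuristic body markers; assumptions for illustration, not an official list.
CHALLENGE_BODY_MARKERS = ("just a moment", "challenge-platform")


def looks_like_cloudflare_challenge(
    status: int, headers: Mapping[str, str], body: str
) -> bool:
    """Guess whether a response is a Cloudflare challenge page, not real content."""
    # Header names are case-insensitive, so normalize before lookup.
    lowered = {k.lower(): v.lower() for k, v in headers.items()}

    # Cloudflare sets this header on challenge page responses.
    if lowered.get("cf-mitigated") == "challenge":
        return True

    # Fallback heuristic: challenge-style status code, served by Cloudflare,
    # and a known marker somewhere in the HTML.
    served_by_cloudflare = "cloudflare" in lowered.get("server", "")
    has_marker = any(marker in body.lower() for marker in CHALLENGE_BODY_MARKERS)
    return status in (403, 503) and served_by_cloudflare and has_marker
```

Pass in the status code, headers, and body from whatever HTTP client you use. A True result means the HTML is the challenge page and should not be parsed as content.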
Browser fingerprinting
Even if your headless browser can run JavaScript, Cloudflare's challenge script collects detailed browser fingerprints: canvas rendering, WebGL info, audio context, screen properties, and more. Default headless Chrome and Puppeteer installations fail these checks because their fingerprints differ from real browsers.
Turnstile challenges
Sites in Cloudflare's "Under Attack" mode or with higher security settings serve Turnstile challenges. These are more advanced than basic JavaScript challenges and require the browser to complete a proof-of-work or behavioral verification. They are designed to be invisible to real users but block most automated tools.
IP reputation scoring
Cloudflare maintains reputation scores for IP addresses across its entire network. Datacenter IPs and IPs with a history of automated traffic receive higher challenge rates. Even a clean headless browser will get challenged more often if it comes from a datacenter IP range.
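Cloudflare's scoring is not public, but the datacenter part of the idea comes down to range membership: does the address fall inside a block known to belong to a hosting provider? The sketch below shows only that check. The range list is a placeholder built from reserved documentation addresses, and the names are hypothetical; a real list would come from the IP ranges that cloud providers publish.

```python
import ipaddress

# Placeholder CIDR blocks taken from the reserved documentation ranges (RFC 5737).
# These are NOT real datacenter ranges; substitute published provider ranges.
HYPOTHETICAL_DATACENTER_RANGES = [
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("203.0.113.0/24"),
]


def is_datacenter_ip(address: str) -> bool:
    """Return True if the address falls inside any listed datacenter range."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in HYPOTHETICAL_DATACENTER_RANGES)
```

Running your own proxy or server IPs through a check like this gives a rough idea of whether they are likely to be treated as datacenter traffic.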
Scrape through Cloudflare automatically
Browser7 runs real Chrome browsers on residential IPs, a combination that passes Cloudflare's standard JavaScript challenges and fingerprint checks without any special configuration. For most Cloudflare-protected sites, a normal render call is all you need.
from browser7 import Browser7

client = Browser7(api_key="b7_your_api_key")

# No special configuration needed for Cloudflare-protected sites
result = client.render(
    "https://www.example.com",
    country_code="US",
)
print(result.html)

There is no Cloudflare-specific parameter or mode. Browser7's default behavior (a real browser, residential IP, proper TLS fingerprint) is what gets through Cloudflare's checks. The same code works whether the target site uses Cloudflare or not.
Handling Turnstile challenges
Some Cloudflare-protected sites use Turnstile challenges, which are more aggressive than standard JavaScript checks. For these sites, add the captcha="auto" parameter to let Browser7 handle the Turnstile challenge automatically.
from browser7 import Browser7

client = Browser7(api_key="b7_your_api_key")

# For sites in Cloudflare's "Under Attack" mode with Turnstile challenges
result = client.render(
    "https://www.example.com",
    captcha="auto",
    country_code="US",
)
print(result.html)

Try without captcha first. Most Cloudflare-protected sites work with a standard render. Only add CAPTCHA solving if you are getting Turnstile challenge pages in the response.
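The "try without captcha first" advice can be sketched as a small helper. This is an illustration, not part of the Browser7 SDK: render_with_fallback and TURNSTILE_MARKER are hypothetical names, and the helper only assumes a render callable that accepts the parameters shown above and returns an object with an html attribute. The marker is the host and path Turnstile loads its script from. A page that legitimately embeds a Turnstile widget (for example, on a login form) would also match, so treat the check as a rough heuristic.

```python
from typing import Any, Callable

# Turnstile widgets load their script from this host and path.
TURNSTILE_MARKER = "challenges.cloudflare.com/turnstile"


def render_with_fallback(render: Callable[..., Any], url: str, **params: Any) -> Any:
    """Render once without CAPTCHA solving; retry with captcha="auto" only if needed."""
    result = render(url, **params)
    if TURNSTILE_MARKER not in result.html:
        return result  # The standard render came back without Turnstile markup.
    # Turnstile markup detected: retry once with CAPTCHA handling enabled.
    return render(url, captcha="auto", **params)
```

With the client from the examples above, the call would look like render_with_fallback(client.render, "https://www.example.com", country_code="US").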
Why most scraping tools fail on Cloudflare
HTTP libraries cannot pass JavaScript challenges
Python requests, Node.js fetch, Go net/http, and PHP cURL cannot execute JavaScript. Cloudflare's challenge requires JavaScript execution, so these tools get stuck on the challenge page permanently.
Default headless browsers have detectable fingerprints
By default, Puppeteer and Playwright expose the navigator.webdriver flag, are missing the usual browser plugins, and produce canvas and WebGL fingerprints that differ from those of real browsers. Cloudflare's challenge script detects these differences. Stealth plugins help but do not cover every signal Cloudflare checks.
Datacenter IPs face higher challenge rates
Even a perfectly configured browser will get challenged more often on a datacenter IP. Cloudflare knows which IP ranges belong to cloud providers and applies stricter scrutiny to traffic from those ranges. You need residential IPs to get the same treatment as regular visitors.
What this costs
Cloudflare bypass is included in every Browser7 request at $0.01 per page. There is no premium tier for Cloudflare-protected sites, no extra credits for JavaScript challenges, and no surcharge for Turnstile solving. The price is the same whether the site uses Cloudflare or not.
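Because the price is flat, estimating a job's cost is a single multiplication. The helper below is a hypothetical sketch that uses only the $0.01 per page figure stated above; it uses Decimal so that cent amounts stay exact.

```python
from decimal import Decimal

PRICE_PER_PAGE_USD = Decimal("0.01")  # Flat per-page rate stated above.


def estimate_cost_usd(pages: int) -> Decimal:
    """Return the total cost in USD for rendering the given number of pages."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return PRICE_PER_PAGE_USD * pages
```

For example, estimate_cost_usd(50_000) works out to $500.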