How to Handle Anti-Bot Detection
Major websites use commercial anti-bot systems - DataDome, Akamai Bot Manager, PerimeterX, Kasada, and others - to identify and block automated traffic. These systems go far beyond simple rate limiting. They analyze your browser at a deep technical level and reject anything that does not look like a real person browsing. Here is how they work and how to get past them.
How anti-bot systems detect scrapers
Browser fingerprinting
Anti-bot scripts collect dozens of signals from the browser environment: canvas rendering hashes, WebGL vendor and renderer strings, audio context fingerprints, installed plugins, screen dimensions, timezone, language settings, and more. These signals are combined into a fingerprint that should be unique to each browser but consistent across visits. Headless browsers produce fingerprints that are either missing signals or internally inconsistent.
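The "combined into a fingerprint" step can be sketched in a few lines. This is illustrative only: the signal names and values below are hypothetical stand-ins, and real anti-bot scripts collect far more signals and score them with server-side models rather than a plain hash.

```python
import hashlib
import json

def fingerprint(signals: dict) -> str:
    """Combine browser signals into one stable identifier.

    Canonical JSON keeps the hash identical across visits as long
    as the underlying signals do not change.
    """
    canonical = json.dumps(signals, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()

# Hypothetical signal values for one browser.
signals = {
    "canvas_hash": "a93f0c1e",  # stand-in for a canvas rendering hash
    "webgl_renderer": "ANGLE (Intel(R) UHD Graphics)",
    "timezone": "America/New_York",
    "languages": ["en-US", "en"],
    "screen": [1920, 1080],
}

print(fingerprint(signals))
```

The same inputs always produce the same hash, which is why a headless browser with a missing or inconsistent signal stands out: its fingerprint either changes between visits or fails internal consistency checks.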
TLS fingerprinting
Before any HTTP traffic is exchanged, the TLS handshake reveals the client's cipher suites, extensions, and supported protocols. Real Chrome, Python requests, Node.js fetch, and Go net/http all produce different TLS fingerprints. Anti-bot systems like Akamai and Cloudflare check that the TLS fingerprint matches what the User-Agent header claims - if you say you are Chrome but your TLS looks like Python, the request is blocked.
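You can see one half of this mismatch from Python itself. The sketch below inspects the cipher suites Python's default TLS context would offer in a handshake; it uses only the standard library and no network. A full JA3-style fingerprint also covers extensions and supported groups, which this does not show.

```python
import ssl

# The cipher list Python's default TLS client offers in its
# ClientHello -- one component of a TLS fingerprint.
ctx = ssl.create_default_context()
python_ciphers = [c["name"] for c in ctx.get_ciphers()]

# Chrome offers a short, fixed suite list (plus GREASE values);
# a typical OpenSSL build offers a different, usually longer list
# in a different order. Spoofing the User-Agent header does not
# change any of this.
print(len(python_ciphers), python_ciphers[:3])
```

This is why header spoofing alone fails: the handshake happens before the User-Agent header is ever sent, and it already identifies the client stack.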
JavaScript environment checks
Anti-bot scripts probe for signs of automation: the navigator.webdriver property, missing Chrome-specific APIs, phantom properties left by Puppeteer or Playwright, and overridden functions. Stealth plugins try to hide these signals, but the detection scripts are updated frequently and new checks appear regularly.
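The logic of these probes can be sketched as a consistency checker. The checks below are hypothetical Python stand-ins for what real detection scripts run as in-page JavaScript, and they cover only three of the hundreds of signals tested in practice.

```python
def automation_flags(env: dict) -> list:
    """Return the automation tells found in a browser environment."""
    flags = []
    # Probe 1: the WebDriver spec requires this flag under automation.
    if env.get("navigator.webdriver"):
        flags.append("navigator.webdriver is true")
    # Probe 2: a Chrome UA without the chrome object is inconsistent.
    if "Chrome/" in env.get("userAgent", "") and not env.get("window.chrome"):
        flags.append("Chrome user agent but no window.chrome object")
    # Probe 3: phantom properties left behind by automation frameworks.
    artifacts = ("__puppeteer_evaluation_script__", "__playwright__binding__")
    if any(k in env.get("window_keys", []) for k in artifacts):
        flags.append("automation framework artifact on window")
    return flags

headless = {
    "navigator.webdriver": True,
    "userAgent": "Mozilla/5.0 (X11) Chrome/120.0.0.0 Safari/537.36",
    "window.chrome": None,
    "window_keys": ["__puppeteer_evaluation_script__"],
}
real = {
    "navigator.webdriver": False,
    "userAgent": "Mozilla/5.0 (X11) Chrome/120.0.0.0 Safari/537.36",
    "window.chrome": {"runtime": {}},
    "window_keys": [],
}

print(automation_flags(headless))  # three tells
print(automation_flags(real))      # []
```

Stealth plugins patch known probes like these one by one, which is exactly why they fall behind when detection scripts add new ones.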
Behavioral analysis
Some anti-bot systems track mouse movements, scroll patterns, typing cadence, and navigation timing. A request that arrives, reads the page, and leaves without any mouse movement or scrolling is suspicious. While this is less common for basic page loads, sites with aggressive protection may use behavioral signals as an additional layer.
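A minimal sketch of the "no interaction" heuristic described above, with two illustrative checks. The event format and thresholds are assumptions for the example; production systems feed these signals into learned behavioral models rather than fixed rules.

```python
def lacks_human_interaction(events) -> bool:
    """Flag a session with no interaction or robotically regular timing.

    events: list of (timestamp_ms, kind) tuples.
    """
    moves = [t for t, kind in events
             if kind in {"mousemove", "scroll", "keydown"}]
    if not moves:
        # Arrived, read the page, left: no mouse, scroll, or keys.
        return True
    gaps = [b - a for a, b in zip(moves, moves[1:])]
    # Identical gaps between every event look scripted, not human.
    if len(gaps) >= 5 and len(set(gaps)) == 1:
        return True
    return False

bot = [(0, "pageload"), (900, "unload")]                # no interaction
scripted = [(i * 100, "mousemove") for i in range(10)]  # metronome timing
human = [(0, "mousemove"), (130, "mousemove"),
         (310, "scroll"), (620, "mousemove")]           # irregular, sparse

print(lacks_human_interaction(bot))       # True
print(lacks_human_interaction(scripted))  # True
print(lacks_human_interaction(human))     # False
```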
Anti-bot systems you will encounter
The most common commercial anti-bot systems used by major websites:
Cloudflare
Protects a large portion of the web. Uses JavaScript challenges, Turnstile CAPTCHAs, and IP reputation scoring. See our Cloudflare guide for details.
Akamai Bot Manager
Used by Walmart, Nike, and many enterprise sites. Relies heavily on browser fingerprinting and TLS analysis. One of the more challenging systems to get through with custom tooling.
DataDome
Used by Reddit (new.reddit.com), Hermes, and others. Combines device fingerprinting with behavioral analysis and machine learning classification.
PerimeterX (HUMAN)
Used by Zillow, Crunchbase, and others. Focuses on behavioral biometrics and advanced JavaScript environment probing.
How Browser7 handles anti-bot detection
Browser7 does not try to make a headless browser look real - it uses an actual Chrome browser. The fingerprint is real because the browser is real. The TLS handshake matches Chrome because it is Chrome. Combined with residential IPs, this passes the checks that catch most automated tools.
from browser7 import Browser7

client = Browser7(api_key="b7_your_api_key")

# Works on sites protected by DataDome, Akamai, PerimeterX, etc.
result = client.render(
    "https://www.walmart.com/search?q=laptop",
    country_code="US",
)
print(result.html)

There is no anti-bot mode to enable, no stealth flag, and no detection-specific configuration. The default behavior handles it:
- Real Chrome browser - genuine fingerprint signals for canvas, WebGL, audio context, and all other probes
- Native TLS - the TLS fingerprint matches real Chrome because the request comes from a real Chrome instance
- Residential IPs - clean IPs from real ISPs, not datacenter ranges that anti-bot systems flag automatically
- No automation leaks - no navigator.webdriver, no Puppeteer artifacts, no Playwright traces
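The result.html returned by render is ordinary Chrome-rendered HTML, so any parser works on it downstream. A standard-library sketch, run here against a stand-in snippet rather than a live response (the "title" class name is hypothetical, not Walmart's actual markup):

```python
from html.parser import HTMLParser

class TitleCollector(HTMLParser):
    """Collect the text of elements carrying a given class attribute --
    a stand-in for whatever extraction you run on result.html."""

    def __init__(self, cls):
        super().__init__()
        self.cls = cls
        self.capture = False
        self.titles = []

    def handle_starttag(self, tag, attrs):
        if ("class", self.cls) in attrs:
            self.capture = True

    def handle_data(self, data):
        if self.capture and data.strip():
            self.titles.append(data.strip())
            self.capture = False

# Stand-in for result.html from the render call above.
html = '<div class="title">Laptop A</div><div class="title">Laptop B</div>'
parser = TitleCollector("title")
parser.feed(html)
print(parser.titles)  # → ['Laptop A', 'Laptop B']
```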
Why building your own anti-detection is difficult
It is a constant arms race
Anti-bot companies update their detection scripts regularly. A workaround that works today might fail next week when DataDome adds a new fingerprint check or Akamai updates their JavaScript challenge. Maintaining custom anti-detection code is an ongoing time investment, not a one-time setup.
Stealth plugins cover the obvious signals
The puppeteer-extra-plugin-stealth package and similar libraries patch the most well-known detection signals. But anti-bot systems test for hundreds of signals, and stealth plugins typically address 10-20 of them. The gap between "patched the known signals" and "genuinely undetectable" is where most custom solutions fail.
Each anti-bot system is different
What works against Cloudflare might not work against Akamai. What works against DataDome might fail on PerimeterX. If you scrape multiple sites with different anti-bot providers, you need different workarounds for each. An API that handles all of them removes that complexity.
What this costs
Anti-bot bypass is included in every Browser7 request at $0.01 per page. There is no premium tier for protected sites and no extra credits when anti-bot detection is present. Whether the site uses Cloudflare, Akamai, DataDome, or no protection at all, the price is the same.
See it in practice
These guides scrape sites with commercial anti-bot protection:
- How to Scrape Walmart - Akamai Bot Manager
- How to Scrape Amazon - custom anti-bot with aggressive fingerprinting
- How to Scrape LinkedIn - strict session and rate limit enforcement