How to Handle Anti-Bot Detection
Major websites use commercial anti-bot systems - DataDome, Akamai Bot Manager, PerimeterX, Kasada, and others - to identify and block automated traffic. These systems go far beyond simple rate limiting. They analyze your browser at a deep technical level and reject anything that does not look like a real person browsing. Here is how they work and how to get past them.
How anti-bot systems detect scrapers
Browser fingerprinting
Anti-bot scripts collect dozens of signals from the browser environment: canvas rendering hashes, WebGL vendor and renderer strings, audio context fingerprints, installed plugins, screen dimensions, timezone, language settings, and more. These signals are combined into a fingerprint that should be unique to each browser but consistent across visits. Headless browsers produce fingerprints that are either missing signals or internally inconsistent.
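The "combined into a fingerprint" step can be sketched in a few lines. This is illustrative only: the signal names and values below are hypothetical stand-ins, and real anti-bot scripts collect far more signals and score them with server-side models rather than a plain hash.

```python
import hashlib
import json

def fingerprint(signals: dict) -> str:
    """Combine browser signals into one stable identifier.

    Canonical JSON keeps the hash identical across visits as long
    as the underlying signals do not change.
    """
    canonical = json.dumps(signals, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()

# Hypothetical signal values for one browser.
signals = {
    "canvas_hash": "a93f0c1e",  # stand-in for a canvas rendering hash
    "webgl_renderer": "ANGLE (Intel(R) UHD Graphics)",
    "timezone": "America/New_York",
    "languages": ["en-US", "en"],
    "screen": [1920, 1080],
}

print(fingerprint(signals))
```

The same inputs always produce the same hash, which is why a headless browser with a missing or inconsistent signal stands out: its fingerprint either changes between visits or fails internal consistency checks.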
TLS fingerprinting
Before any HTTP traffic is exchanged, the TLS handshake reveals the client's cipher suites, extensions, and supported protocols. Real Chrome, Python requests, Node.js fetch, and Go net/http all produce different TLS fingerprints. Anti-bot systems like Akamai and Cloudflare check that the TLS fingerprint matches what the User-Agent header claims - if you say you are Chrome but your TLS looks like Python, the request is blocked.
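You can see one half of this mismatch from Python itself. The sketch below inspects the cipher suites Python's default TLS context would offer in a handshake; it uses only the standard library and no network. A full JA3-style fingerprint also covers extensions and supported groups, which this does not show.

```python
import ssl

# The cipher list Python's default TLS client offers in its
# ClientHello -- one component of a TLS fingerprint.
ctx = ssl.create_default_context()
python_ciphers = [c["name"] for c in ctx.get_ciphers()]

# Chrome offers a short, fixed suite list (plus GREASE values);
# a typical OpenSSL build offers a different, usually longer list
# in a different order. Spoofing the User-Agent header does not
# change any of this.
print(len(python_ciphers), python_ciphers[:3])
```

This is why header spoofing alone fails: the handshake happens before the User-Agent header is ever sent, and it already identifies the client stack.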
JavaScript environment checks
Anti-bot scripts probe for signs of automation: the navigator.webdriver property, missing Chrome-specific APIs, phantom properties left by Puppeteer or Playwright, and overridden functions. Stealth plugins try to hide these signals, but the detection scripts are updated frequently and new checks appear regularly.
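The logic of these probes can be sketched as a consistency checker. The checks below are hypothetical Python stand-ins for what real detection scripts run as in-page JavaScript, and they cover only three of the hundreds of signals tested in practice.

```python
def automation_flags(env: dict) -> list:
    """Return the automation tells found in a browser environment."""
    flags = []
    # Probe 1: the WebDriver spec requires this flag under automation.
    if env.get("navigator.webdriver"):
        flags.append("navigator.webdriver is true")
    # Probe 2: a Chrome UA without the chrome object is inconsistent.
    if "Chrome/" in env.get("userAgent", "") and not env.get("window.chrome"):
        flags.append("Chrome user agent but no window.chrome object")
    # Probe 3: phantom properties left behind by automation frameworks.
    artifacts = ("__puppeteer_evaluation_script__", "__playwright__binding__")
    if any(k in env.get("window_keys", []) for k in artifacts):
        flags.append("automation framework artifact on window")
    return flags

headless = {
    "navigator.webdriver": True,
    "userAgent": "Mozilla/5.0 (X11) Chrome/120.0.0.0 Safari/537.36",
    "window.chrome": None,
    "window_keys": ["__puppeteer_evaluation_script__"],
}
real = {
    "navigator.webdriver": False,
    "userAgent": "Mozilla/5.0 (X11) Chrome/120.0.0.0 Safari/537.36",
    "window.chrome": {"runtime": {}},
    "window_keys": [],
}

print(automation_flags(headless))  # three tells
print(automation_flags(real))      # []
```

Stealth plugins patch known probes like these one by one, which is exactly why they fall behind when detection scripts add new ones.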
Behavioral analysis
Some anti-bot systems track mouse movements, scroll patterns, typing cadence, and navigation timing. A request that arrives, reads the page, and leaves without any mouse movement or scrolling is suspicious. While this is less common for basic page loads, sites with aggressive protection may use behavioral signals as an additional layer.
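A minimal sketch of the "no interaction" heuristic described above, with two illustrative checks. The event format and thresholds are assumptions for the example; production systems feed these signals into learned behavioral models rather than fixed rules.

```python
def lacks_human_interaction(events) -> bool:
    """Flag a session with no interaction or robotically regular timing.

    events: list of (timestamp_ms, kind) tuples.
    """
    moves = [t for t, kind in events
             if kind in {"mousemove", "scroll", "keydown"}]
    if not moves:
        # Arrived, read the page, left: no mouse, scroll, or keys.
        return True
    gaps = [b - a for a, b in zip(moves, moves[1:])]
    # Identical gaps between every event look scripted, not human.
    if len(gaps) >= 5 and len(set(gaps)) == 1:
        return True
    return False

bot = [(0, "pageload"), (900, "unload")]                # no interaction
scripted = [(i * 100, "mousemove") for i in range(10)]  # metronome timing
human = [(0, "mousemove"), (130, "mousemove"),
         (310, "scroll"), (620, "mousemove")]           # irregular, sparse

print(lacks_human_interaction(bot))       # True
print(lacks_human_interaction(scripted))  # True
print(lacks_human_interaction(human))     # False
```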
Anti-bot systems you will encounter
The most common commercial anti-bot systems used by major websites:
Cloudflare
Protects a large portion of the web. Uses JavaScript challenges, Turnstile CAPTCHAs, and IP reputation scoring. See our Cloudflare guide for details.
Akamai Bot Manager
Used by Walmart, Nike, and many enterprise sites. Relies heavily on browser fingerprinting and TLS analysis. One of the more challenging systems to get through with custom tooling.
DataDome
Used by Reddit (new.reddit.com), Hermes, and others. Combines device fingerprinting with behavioral analysis and machine learning classification.
PerimeterX (HUMAN)
Used by Zillow, Crunchbase, and others. Focuses on behavioral biometrics and advanced JavaScript environment probing.
How Browser7 handles anti-bot detection
Browser7 does not try to make a headless browser look real - it uses an actual Chrome browser. The fingerprint is real because the browser is real. The TLS handshake matches Chrome because it is Chrome. Combined with residential IPs, this passes the checks that catch most automated tools.
from browser7 import Browser7

client = Browser7(api_key="b7_your_api_key")

# Works on sites protected by DataDome, Akamai, PerimeterX, etc.
result = client.render(
    "https://www.walmart.com/search?q=laptop",
    country_code="US",
)
print(result.html)

There is no anti-bot mode to enable, no stealth flag, and no detection-specific configuration. The default behavior handles it:
- Real Chrome browser - genuine fingerprint signals for canvas, WebGL, audio context, and all other probes
- Native TLS - the TLS fingerprint matches real Chrome because the request comes from a real Chrome instance
- Residential IPs - clean IPs from real ISPs, not datacenter ranges that anti-bot systems flag automatically
- No automation leaks - no navigator.webdriver, no Puppeteer artifacts, no Playwright traces
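The result.html returned by render is ordinary Chrome-rendered HTML, so any parser works on it downstream. A standard-library sketch, run here against a stand-in snippet rather than a live response (the "title" class name is hypothetical, not Walmart's actual markup):

```python
from html.parser import HTMLParser

class TitleCollector(HTMLParser):
    """Collect the text of elements carrying a given class attribute --
    a stand-in for whatever extraction you run on result.html."""

    def __init__(self, cls):
        super().__init__()
        self.cls = cls
        self.capture = False
        self.titles = []

    def handle_starttag(self, tag, attrs):
        if ("class", self.cls) in attrs:
            self.capture = True

    def handle_data(self, data):
        if self.capture and data.strip():
            self.titles.append(data.strip())
            self.capture = False

# Stand-in for result.html from the render call above.
html = '<div class="title">Laptop A</div><div class="title">Laptop B</div>'
parser = TitleCollector("title")
parser.feed(html)
print(parser.titles)  # → ['Laptop A', 'Laptop B']
```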
Why building your own anti-detection is difficult
It is a constant arms race
Anti-bot companies update their detection scripts regularly. A workaround that works today might fail next week when DataDome adds a new fingerprint check or Akamai updates their JavaScript challenge. Maintaining custom anti-detection code is an ongoing time investment, not a one-time setup.
Stealth plugins cover the obvious signals
The puppeteer-extra-plugin-stealth package and similar libraries patch the most well-known detection signals. But anti-bot systems test for hundreds of signals, and stealth plugins typically address 10-20 of them. The gap between "patched the known signals" and "genuinely undetectable" is where most custom solutions fail.
Each anti-bot system is different
What works against Cloudflare might not work against Akamai. What works against DataDome might fail on PerimeterX. If you scrape multiple sites with different anti-bot providers, you need different workarounds for each. An API that handles all of them removes that complexity.
What this costs
Anti-bot bypass is included in every Browser7 request at $0.01 per page. There is no premium tier for protected sites and no extra credits when anti-bot detection is present. Whether the site uses Cloudflare, Akamai, DataDome, or no protection at all, the price is the same.
See it in practice
These guides scrape sites with commercial anti-bot protection:
- How to Scrape Walmart - Akamai Bot Manager
- How to Scrape Amazon - custom anti-bot with aggressive fingerprinting
- How to Scrape LinkedIn - strict session and rate limit enforcement