How to Scrape Amazon in 2026
Amazon uses aggressive anti-bot protection, JavaScript-rendered product pages, and IP-based rate limiting. Scraping it yourself means managing residential proxies, handling CAPTCHAs, rotating browser fingerprints, and dealing with constantly changing page structures. With Browser7, it is a single API call.
What makes Amazon hard to scrape
Anti-bot detection
Amazon detects and blocks automated requests using browser fingerprinting, behavioral analysis, and request pattern detection. Datacenter IP addresses are blocked almost immediately. Even residential IPs get flagged if request patterns look automated.
JavaScript-rendered content
Product prices, availability, reviews, and "frequently bought together" sections are loaded dynamically via JavaScript. A simple HTTP request returns a page shell with missing data. You need a real browser to get the full page.
CAPTCHA challenges
Amazon serves CAPTCHA challenges to suspicious requests. If you are scraping at any meaningful volume without proper proxy rotation and browser fingerprinting, you will hit CAPTCHAs frequently.
Geo-targeted pricing
Amazon shows different prices, availability, and product selections based on the visitor's location. To see what customers in a specific country or city see, you need proxies in that location.
Scrape an Amazon product page
Browser7 handles all of the hard parts - residential proxies, browser fingerprinting, CAPTCHA solving, and JavaScript rendering. You send a URL and get fully rendered HTML back. This example uses the North America API endpoint and geo-targets the US to ensure you see US pricing and availability.
```python
from browser7 import Browser7

client = Browser7(
    api_key="b7_your_api_key",
    base_url="https://ca-api.browser7.com/v1"
)

result = client.render(
    "https://www.amazon.com/dp/B0DDZJS3SB",
    country_code="US"
)

print(result.html)
```
That is the complete code. No proxy configuration, no browser setup, no CAPTCHA handling logic. The response contains the fully rendered HTML of the Amazon product page, including dynamically loaded prices, reviews, and availability data.
Data you can extract
The rendered HTML contains all the data Amazon shows to a real visitor. Common data points to extract:
Product details
- Title, brand, and description
- ASIN and product category
- Images and gallery URLs
- Product specifications and dimensions
- Variation options (size, color, etc.)
Pricing and availability
- Current price and list price
- Deal and coupon information
- Prime eligibility
- Stock status and delivery estimates
- Seller information and Buy Box winner
Reviews and ratings
- Overall star rating
- Total review count
- Rating distribution (5-star, 4-star, etc.)
- Individual review text and ratings
- Verified purchase badges
Search and rankings
- Search result positions
- Best Sellers Rank
- Sponsored vs organic results
- "Frequently bought together" products
- "Customers also viewed" products
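As one example of pulling a data point from the lists above, here is a minimal sketch that reads the Best Sellers Rank out of rendered HTML. The sample snippet and the regex are illustrative assumptions - the real markup varies by product category, so inspect the live page before relying on a pattern like this.

```python
import re

# Illustrative snippet of the "Product information" section; real markup varies
sample_html = """
<span>Best Sellers Rank: #1,205 in Electronics (#3 in eBook Readers)</span>
"""

def best_sellers_rank(html: str):
    """Extract the first Best Sellers Rank number, e.g. 1205."""
    m = re.search(r"Best Sellers Rank:\s*#([\d,]+)", html)
    return int(m.group(1).replace(",", "")) if m else None

print(best_sellers_rank(sample_html))  # 1205
```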
Complete example: render and parse product data
Here is a complete example that renders an Amazon product page and extracts structured data from the HTML. It uses Python with BeautifulSoup, the standard HTML parsing approach; the same pattern applies with Cheerio in Node.js or DOMDocument in PHP.
```python
from browser7 import Browser7
from bs4 import BeautifulSoup
import json
import re

client = Browser7(
    api_key="b7_your_api_key",
    base_url="https://ca-api.browser7.com/v1"
)

result = client.render(
    "https://www.amazon.com/dp/B0DDZJS3SB",
    country_code="US"
)

soup = BeautifulSoup(result.html, "html.parser")

product = {
    "title": None,
    "price": None,
    "rating": None,
    "review_count": 0,
    "asin": None,
    "merchant_id": None,
}

# Title
title_el = soup.find("span", id="productTitle")
if title_el:
    product["title"] = title_el.get_text(strip=True)

# Price
price_el = soup.find("span", class_="a-offscreen")
if price_el:
    product["price"] = price_el.get_text(strip=True)

# Rating
rating_el = soup.select_one(
    "#acrPopover .a-size-small.a-color-base"
)
if rating_el:
    product["rating"] = rating_el.get_text(strip=True)

# Review count ("1,132 ratings" -> 1132)
review_el = soup.find("span", id="acrCustomerReviewText")
if review_el:
    digits = re.sub(r"[^\d]", "", review_el.get_text(strip=True))
    if digits:
        product["review_count"] = int(digits)

# ASIN
asin_el = soup.find("input", id="ASIN")
if asin_el:
    product["asin"] = asin_el.get("value")

# Merchant ID
merchant_el = soup.find("input", id="merchantID")
if merchant_el:
    product["merchant_id"] = merchant_el.get("value")

print(json.dumps(product, indent=2))
```
CSS selectors may change if Amazon updates its page structure. Inspect the current page if any fields return null.
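The price field comes back as a display string like "$179.99". For price monitoring you will usually want a numeric value. A minimal sketch of the conversion - currency symbols and locale formats are simplified assumptions here:

```python
import re
from decimal import Decimal

def parse_price(price_text: str):
    """Convert a display price like '$1,179.99' to a Decimal, or None."""
    m = re.search(r"[\d,]+(?:\.\d+)?", price_text)
    return Decimal(m.group(0).replace(",", "")) if m else None

print(parse_price("$179.99"))    # 179.99
print(parse_price("$1,299.00"))  # 1299.00
```

Using Decimal avoids the floating-point rounding surprises you would get comparing float prices over time.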
Sample output:
```json
{
  "title": "Amazon Kindle Paperwhite 16GB (newest model)...",
  "price": "$179.99",
  "rating": "4.8",
  "review_count": 1132,
  "asin": "B0DDZJS3SB",
  "merchant_id": "ATVPDKIKX0DER"
}
```
Scrape Amazon from a different country
Amazon shows different prices, products, and availability depending on the visitor's location. Change the country and city parameters to see exactly what a customer in a specific location sees. In this example, because we are targeting the UK, we use the EU API endpoint for optimal performance and lower latency. Geo-targeting is included in the $0.01 per page price - no extra charge.
```python
from browser7 import Browser7

# Use the EU endpoint for European targets
client = Browser7(
    api_key="b7_your_api_key",
    base_url="https://eu-api.browser7.com/v1"
)

# Get Amazon UK pricing from a London IP
result = client.render(
    "https://www.amazon.co.uk/dp/B0CP31T5M6",
    country_code="GB",
    city="london"
)

print(result.html)
print(f"Rendered from: {result.selected_city}")
```
Wait for specific content to load
Some Amazon data loads after the initial page render - reviews, Q&A sections, and recommendation carousels. Use wait actions to ensure the data you need is present before Browser7 returns the HTML.
```python
from browser7 import Browser7, wait_for_selector, wait_for_click

client = Browser7(
    api_key="b7_your_api_key",
    base_url="https://ca-api.browser7.com/v1"
)

result = client.render(
    "https://www.amazon.com/dp/B0DDZJS3SB",
    country_code="US",
    wait_for=[
        # Wait for the product title to load
        wait_for_selector("#productTitle"),
        # Click "See all reviews" if present
        wait_for_click("#acrCustomerReviewLink", timeout=5000),
        # Wait for the reviews widget to appear
        wait_for_selector(".cr-widget-FocalReviews", timeout=10000),
    ]
)

print(result.html)
```
Wait actions run in order. You can wait for elements to appear, click buttons to expand sections, and then wait for the expanded content to load. Up to 10 wait actions per request.
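Even with wait actions, it is worth verifying that the content you waited for actually made it into the HTML before you parse it. A minimal check using BeautifulSoup - the widget class is the same one targeted by the wait action above:

```python
from bs4 import BeautifulSoup

def has_reviews_widget(html: str) -> bool:
    """Return True if the reviews widget is present in the rendered HTML."""
    soup = BeautifulSoup(html, "html.parser")
    return soup.select_one(".cr-widget-FocalReviews") is not None

print(has_reviews_widget('<div class="cr-widget-FocalReviews"></div>'))  # True
print(has_reviews_widget("<html><body></body></html>"))                  # False
```

If the check fails, retrying the render is usually cheaper than debugging a parser that silently returned empty fields.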
Take a screenshot of the product page
Need a visual record of the page alongside the HTML? Enable screenshots to get a PNG or JPEG image of the rendered page. Useful for price monitoring dashboards, compliance records, or visual diffing.
```python
import base64
from browser7 import Browser7

client = Browser7(
    api_key="b7_your_api_key",
    base_url="https://ca-api.browser7.com/v1"
)

result = client.render(
    "https://www.amazon.com/dp/B0DDZJS3SB",
    country_code="US",
    block_images=False,
    include_screenshot=True,
    screenshot_full_page=True,
    screenshot_format="png"
)

# Save the screenshot
with open("amazon-product.png", "wb") as f:
    f.write(base64.b64decode(result.screenshot))

print("Screenshot saved")
```
What this costs
Every Amazon page render costs $0.01 - the same as any other website. Residential proxies, JavaScript rendering, CAPTCHA solving, geo-targeting, and screenshots are all included. There are no per-domain surcharges (unlike ScraperAPI and Oxylabs which charge extra for Amazon), no credit multipliers, and no bandwidth fees.
10,000 Amazon product pages cost $100. You know this before you start, not after.
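A quick sanity check of the flat pricing - a one-liner you can adapt for budgeting a crawl (the $0.01 figure comes from the pricing above):

```python
PRICE_PER_PAGE = 0.01  # flat rate per rendered page

def estimated_cost(pages: int) -> float:
    """Estimate the total cost for a given number of page renders."""
    return pages * PRICE_PER_PAGE

print(f"${estimated_cost(10_000):,.2f}")  # $100.00
```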