How to Scrape Amazon in 2026
Amazon uses aggressive anti-bot protection, JavaScript-rendered product pages, and IP-based rate limiting. Scraping it yourself means managing residential proxies, handling CAPTCHAs, rotating browser fingerprints, and dealing with constantly changing page structures. With Browser7, it is a single API call.
What makes Amazon hard to scrape
Anti-bot detection
Amazon detects and blocks automated requests using browser fingerprinting, behavioral analysis, and request pattern detection. Datacenter IP addresses are blocked almost immediately. Even residential IPs get flagged if request patterns look automated.
JavaScript-rendered content
Product prices, availability, reviews, and "frequently bought together" sections are loaded dynamically via JavaScript. A simple HTTP request returns a page shell with missing data. You need a real browser to get the full page.
CAPTCHA challenges
Amazon serves CAPTCHA challenges to suspicious requests. If you are scraping at any meaningful volume without proper proxy rotation and browser fingerprinting, you will hit CAPTCHAs frequently.
Geo-targeted pricing
Amazon shows different prices, availability, and product selections based on the visitor's location. To see what customers in a specific country or city see, you need proxies in that location.
Scrape an Amazon product page
Browser7 handles all of the hard parts - residential proxies, browser fingerprinting, CAPTCHA solving, and JavaScript rendering. You send a URL and get fully rendered HTML back. This example uses the North America API endpoint and geo-targets the US to ensure you see US pricing and availability.
```python
from browser7 import Browser7

client = Browser7(
    api_key="b7_your_api_key",
    base_url="https://ca-api.browser7.com/v1"
)

result = client.render(
    "https://www.amazon.com/dp/B0DDZJS3SB",
    country_code="US"
)

print(result.html)
```
That is the complete code. No proxy configuration, no browser setup, no CAPTCHA handling logic. The response contains the fully rendered HTML of the Amazon product page, including dynamically loaded prices, reviews, and availability data.
Data you can extract
The rendered HTML contains all the data Amazon shows to a real visitor. Common data points to extract:
Product details
- Title, brand, and description
- ASIN and product category
- Images and gallery URLs
- Product specifications and dimensions
- Variation options (size, color, etc.)
Pricing and availability
- Current price and list price
- Deal and coupon information
- Prime eligibility
- Stock status and delivery estimates
- Seller information and Buy Box winner
Reviews and ratings
- Overall star rating
- Total review count
- Rating distribution (5-star, 4-star, etc.)
- Individual review text and ratings
- Verified purchase badges
Search and rankings
- Search result positions
- Best Sellers Rank
- Sponsored vs organic results
- "Frequently bought together" products
- "Customers also viewed" products
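As one example of pulling a data point from the lists above, here is a minimal sketch that reads the Best Sellers Rank out of rendered HTML. The sample snippet and the regex are illustrative assumptions - the real markup varies by product category, so inspect the live page before relying on a pattern like this.

```python
import re

# Illustrative snippet of the "Product information" section; real markup varies
sample_html = """
<span>Best Sellers Rank: #1,205 in Electronics (#3 in eBook Readers)</span>
"""

def best_sellers_rank(html: str):
    """Extract the first Best Sellers Rank number, e.g. 1205."""
    m = re.search(r"Best Sellers Rank:\s*#([\d,]+)", html)
    return int(m.group(1).replace(",", "")) if m else None

print(best_sellers_rank(sample_html))  # 1205
```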
Complete example: render and parse product data
Here is a complete example that renders an Amazon product page and extracts structured data from the HTML. It uses Python with BeautifulSoup, the standard HTML parsing approach; the same pattern applies with Cheerio in Node.js or DOMDocument in PHP.
```python
from browser7 import Browser7
from bs4 import BeautifulSoup
import json
import re

client = Browser7(
    api_key="b7_your_api_key",
    base_url="https://ca-api.browser7.com/v1"
)

result = client.render(
    "https://www.amazon.com/dp/B0DDZJS3SB",
    country_code="US"
)

soup = BeautifulSoup(result.html, "html.parser")

product = {
    "title": None,
    "price": None,
    "rating": None,
    "review_count": 0,
    "asin": None,
    "merchant_id": None,
}

# Title
title_el = soup.find("span", id="productTitle")
if title_el:
    product["title"] = title_el.get_text(strip=True)

# Price
price_el = soup.find("span", class_="a-offscreen")
if price_el:
    product["price"] = price_el.get_text(strip=True)

# Rating
rating_el = soup.select_one(
    "#acrPopover .a-size-small.a-color-base"
)
if rating_el:
    product["rating"] = rating_el.get_text(strip=True)

# Review count ("1,132 ratings" -> 1132)
review_el = soup.find("span", id="acrCustomerReviewText")
if review_el:
    digits = re.sub(r"[^\d]", "", review_el.get_text(strip=True))
    if digits:
        product["review_count"] = int(digits)

# ASIN
asin_el = soup.find("input", id="ASIN")
if asin_el:
    product["asin"] = asin_el.get("value")

# Merchant ID
merchant_el = soup.find("input", id="merchantID")
if merchant_el:
    product["merchant_id"] = merchant_el.get("value")

print(json.dumps(product, indent=2))
```
CSS selectors may change if Amazon updates its page structure. Inspect the current page if any fields return null.
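The price field comes back as a display string like "$179.99". For price monitoring you will usually want a numeric value. A minimal sketch of the conversion - currency symbols and locale formats are simplified assumptions here:

```python
import re
from decimal import Decimal

def parse_price(price_text: str):
    """Convert a display price like '$1,179.99' to a Decimal, or None."""
    m = re.search(r"[\d,]+(?:\.\d+)?", price_text)
    return Decimal(m.group(0).replace(",", "")) if m else None

print(parse_price("$179.99"))    # 179.99
print(parse_price("$1,299.00"))  # 1299.00
```

Using Decimal avoids the floating-point rounding surprises you would get comparing float prices over time.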
Sample output:
```json
{
  "title": "Amazon Kindle Paperwhite 16GB (newest model)...",
  "price": "$179.99",
  "rating": "4.8",
  "review_count": 1132,
  "asin": "B0DDZJS3SB",
  "merchant_id": "ATVPDKIKX0DER"
}
```
Scrape Amazon from a different country
Amazon shows different prices, products, and availability depending on the visitor's location. Change the country and city parameters to see exactly what a customer in a specific location sees. In this example, because we are targeting the UK, we use the EU API endpoint for optimal performance and lower latency. Geo-targeting is included in the $0.01 per page price - no extra charge.
```python
from browser7 import Browser7

# Use the EU endpoint for European targets
client = Browser7(
    api_key="b7_your_api_key",
    base_url="https://eu-api.browser7.com/v1"
)

# Get Amazon UK pricing from a London IP
result = client.render(
    "https://www.amazon.co.uk/dp/B0CP31T5M6",
    country_code="GB",
    city="london"
)

print(result.html)
print(f"Rendered from: {result.selected_city}")
```
Wait for specific content to load
Some Amazon data loads after the initial page render - reviews, Q&A sections, and recommendation carousels. Use wait actions to ensure the data you need is present before Browser7 returns the HTML.
```python
from browser7 import Browser7, wait_for_selector, wait_for_click

client = Browser7(
    api_key="b7_your_api_key",
    base_url="https://ca-api.browser7.com/v1"
)

result = client.render(
    "https://www.amazon.com/dp/B0DDZJS3SB",
    country_code="US",
    wait_for=[
        # Wait for the product title to load
        wait_for_selector("#productTitle"),
        # Click "See all reviews" if present
        wait_for_click("#acrCustomerReviewLink", timeout=5000),
        # Wait for the reviews widget to appear
        wait_for_selector(".cr-widget-FocalReviews", timeout=10000),
    ]
)

print(result.html)
```
Wait actions run in order. You can wait for elements to appear, click buttons to expand sections, and then wait for the expanded content to load. Up to 10 wait actions per request.
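Even with wait actions, it is worth verifying that the content you waited for actually made it into the HTML before you parse it. A minimal check using BeautifulSoup - the widget class is the same one targeted by the wait action above:

```python
from bs4 import BeautifulSoup

def has_reviews_widget(html: str) -> bool:
    """Return True if the reviews widget is present in the rendered HTML."""
    soup = BeautifulSoup(html, "html.parser")
    return soup.select_one(".cr-widget-FocalReviews") is not None

print(has_reviews_widget('<div class="cr-widget-FocalReviews"></div>'))  # True
print(has_reviews_widget("<html><body></body></html>"))                  # False
```

If the check fails, retrying the render is usually cheaper than debugging a parser that silently returned empty fields.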
Take a screenshot of the product page
Need a visual record of the page alongside the HTML? Enable screenshots to get a PNG or JPEG image of the rendered page. Useful for price monitoring dashboards, compliance records, or visual diffing.
```python
import base64
from browser7 import Browser7

client = Browser7(
    api_key="b7_your_api_key",
    base_url="https://ca-api.browser7.com/v1"
)

result = client.render(
    "https://www.amazon.com/dp/B0DDZJS3SB",
    country_code="US",
    block_images=False,
    include_screenshot=True,
    screenshot_full_page=True,
    screenshot_format="png"
)

# Save the screenshot
with open("amazon-product.png", "wb") as f:
    f.write(base64.b64decode(result.screenshot))

print("Screenshot saved")
```
What this costs
Every Amazon page render costs $0.01 - the same as any other website. Residential proxies, JavaScript rendering, CAPTCHA solving, geo-targeting, and screenshots are all included. There are no per-domain surcharges (unlike ScraperAPI and Oxylabs which charge extra for Amazon), no credit multipliers, and no bandwidth fees.
10,000 Amazon product pages cost $100. You know this before you start, not after.
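A quick sanity check of the flat pricing - a one-liner you can adapt for budgeting a crawl (the $0.01 figure comes from the pricing above):

```python
PRICE_PER_PAGE = 0.01  # flat rate per rendered page

def estimated_cost(pages: int) -> float:
    """Estimate the total cost for a given number of page renders."""
    return pages * PRICE_PER_PAGE

print(f"${estimated_cost(10_000):,.2f}")  # $100.00
```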