Use Case: Web Scraping
When target sites use anti-bot mechanisms (Cloudflare, PerimeterX, DataDome, etc.), regular headless Chrome is often detected and blocked. Browser Forest's anti-detection engine enables all patches by default, so that browser instances are designed to be indistinguishable from real users in fingerprint, behavior, and CDP characteristics.
Approach A: Scrape API (Recommended, Simplest)
No Session lifecycle management needed. Ideal for single-shot scraping: provide a URL, get back rendered HTML / Markdown / screenshot. The platform creates a browser, waits for page load, extracts content, and auto-destroys — entirely transparent to the caller.
Typical Scenario: Scraping E-commerce Product Detail Pages
The target page has JavaScript-rendered price and inventory data, requiring JS execution to complete before correct data can be retrieved.
curl -X POST https://bf.mktindex.com/api/v1/scrape \
-H "X-API-Key: bf_live_xxxxxxxx" \
-H "Content-Type: application/json" \
-d '{
"url": "https://www.amazon.com/dp/B09G9FPHY6",
"format": "markdown",
"waitFor": "networkidle"
}'
Example response:
{
"url": "https://www.amazon.com/dp/B09G9FPHY6",
"content": "# Apple AirPods Pro (2nd Generation)\n\n**Price**: $189.99\n\n**In Stock**: Yes\n...",
"metadata": {
"title": "Amazon.com: Apple AirPods Pro",
"description": "Active Noise Cancellation..."
},
"durationMs": 4821
}
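Once the Markdown comes back, individual fields still have to be pulled out of `content`. A minimal sketch, assuming the response has the shape of the example above and that the page renders its price as `**Price**: $...`. The `ScrapeResponse` interface and `extractPrice` helper are illustrative names, not part of the platform API.

```typescript
// Shape of the example response above; no fields beyond these are assumed.
interface ScrapeResponse {
  url: string;
  content: string;
  metadata: { title?: string; description?: string };
  durationMs: number;
}

// Pull a dollar price out of Markdown such as "**Price**: $189.99".
// Returns null when no price line is found, so the caller can retry or alert.
function extractPrice(res: ScrapeResponse): number | null {
  const match = /\*\*Price\*\*:\s*\$([\d,]+(?:\.\d+)?)/.exec(res.content);
  const raw = match?.[1];
  if (raw === undefined) return null;
  return Number(raw.replace(/,/g, ""));
}
```

Real pages will format prices differently, so the regular expression is the part most likely to need adjusting per site.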
| Parameter | Type | Default | Description |
|---|---|---|---|
| url | string | Required | Target URL to scrape |
| format | string | html | html / markdown / text / screenshot |
| waitFor | string | load | load / domcontentloaded / networkidle |
| selector | string | None | Extract only the region matching a CSS selector |
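The parameters in the table can be captured as types so that invalid values are caught before a request is sent. A minimal sketch: `buildScrapeRequest` and the type names are illustrative (not part of the platform or SDK), and the defaults simply mirror the table.

```typescript
type ScrapeFormat = "html" | "markdown" | "text" | "screenshot";
type WaitFor = "load" | "domcontentloaded" | "networkidle";

interface ScrapeRequest {
  url: string;
  format: ScrapeFormat;
  waitFor: WaitFor;
  selector?: string;
}

// Build the JSON body for POST /api/v1/scrape, filling in the documented defaults.
function buildScrapeRequest(
  url: string,
  opts: { format?: ScrapeFormat; waitFor?: WaitFor; selector?: string } = {},
): ScrapeRequest {
  if (!/^https?:\/\//i.test(url)) {
    throw new Error(`url must be an absolute http(s) URL: ${url}`);
  }
  const req: ScrapeRequest = {
    url,
    format: opts.format ?? "html",
    waitFor: opts.waitFor ?? "load",
  };
  // Only send selector when the caller actually provided one.
  if (opts.selector !== undefined && opts.selector.trim() !== "") {
    req.selector = opts.selector;
  }
  return req;
}
```

The result can be passed through `JSON.stringify` and sent as the request body with the same headers as the curl example.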
networkidle waits until network requests have been quiet for 500ms, which suits SPA/React pages; load is faster and better for server-rendered static pages.
Approach B: Session + Puppeteer (Complex Interactions)
When scraping requires multi-step operations like login, pagination, clicks, and form filling, first create a Session, then connect and control the browser via CDP WebSocket using Puppeteer or Playwright.
Typical Scenario: Data Behind a Login Wall
Step 1: Create a Session
curl -X POST https://bf.mktindex.com/api/v1/sessions \
-H "X-API-Key: bf_live_xxxxxxxx" \
-H "Content-Type: application/json" \
-d '{
"os": "windows",
"timeout": 300,
"idleTimeout": 60
}'
The response includes a cdpUrl of the form wss://bf.mktindex.com/ws/session/ses_xxxxxxxx. The browser is ready for connections at this point.
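If only the cdpUrl was kept (for example in a job queue), the Session ID needed for later REST calls can be recovered from it. A minimal sketch, assuming the URL always ends in `/ws/session/<id>` as shown above and that IDs consist of `ses_` followed by letters and digits (the character set is an assumption). Both helper names are hypothetical.

```typescript
// Extract the session ID from a cdpUrl shaped like
// wss://bf.mktindex.com/ws/session/ses_xxxxxxxx
function sessionIdFromCdpUrl(cdpUrl: string): string {
  const match = /\/ws\/session\/(ses_[A-Za-z0-9]+)\/?$/.exec(cdpUrl);
  const id = match?.[1];
  if (id === undefined) {
    throw new Error(`Unrecognized cdpUrl: ${cdpUrl}`);
  }
  return id;
}

// Build the REST URL for the same Session (used with DELETE to end it).
function sessionRestUrl(cdpUrl: string): string {
  return `https://bf.mktindex.com/api/v1/sessions/${sessionIdFromCdpUrl(cdpUrl)}`;
}
```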
Step 2: Connect and Operate with Puppeteer
import puppeteer from 'puppeteer-core';
const browser = await puppeteer.connect({
browserWSEndpoint: 'wss://bf.mktindex.com/ws/session/ses_xxxxxxxx',
});
const [page] = await browser.pages();
// Login
await page.goto('https://target-site.com/login', { waitUntil: 'networkidle2' });
await page.type('#email', 'user@example.com');
await page.type('#password', 'secret123');
// Start waiting for the navigation before clicking, so a fast redirect is not missed
await Promise.all([
  page.waitForNavigation({ waitUntil: 'networkidle2' }),
  page.click('[type="submit"]'),
]);
// Paginate and scrape
const allItems = [];
for (let p = 1; p <= 5; p++) {
await page.goto(`https://target-site.com/products?page=${p}`, { waitUntil: 'networkidle2' });
const items = await page.evaluate(() =>
Array.from(document.querySelectorAll('.product-card')).map(el => ({
name: el.querySelector('.title')?.textContent?.trim(),
price: el.querySelector('.price')?.textContent?.trim(),
sku: el.dataset.sku,
}))
);
allItems.push(...items);
}
console.log(`Scraped ${allItems.length} items`);
// Disconnect (do NOT call browser.close() — it would close the remote browser process)
await browser.disconnect();
Step 3: Delete the Session
curl -X DELETE https://bf.mktindex.com/api/v1/sessions/ses_xxxxxxxx \
-H "X-API-Key: bf_live_xxxxxxxx"
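The items collected in Step 2 are raw strings and may contain gaps or repeats across pages. A minimal clean-up sketch, assuming prices use a `.` decimal separator (as in `$1,299.00`) and that `sku` uniquely identifies a product. `cleanItems` and its types are illustrative names.

```typescript
interface RawItem { name?: string; price?: string; sku?: string }
interface Item { name: string; price: number; sku: string }

// Drop incomplete rows, parse the price text, and keep the first row per SKU.
function cleanItems(raw: RawItem[]): Item[] {
  const bySku = new Map<string, Item>();
  for (const r of raw) {
    if (!r.name || !r.price || !r.sku) continue;
    const digits = r.price.replace(/[^0-9.]/g, "");
    if (digits === "") continue; // e.g. "N/A" or "Sold out"
    const price = Number(digits);
    if (!Number.isFinite(price)) continue; // e.g. "1.2.3"
    if (!bySku.has(r.sku)) {
      bySku.set(r.sku, { name: r.name, price, sku: r.sku });
    }
  }
  return Array.from(bySku.values());
}
```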
Persistent Login State (Context)
After logging in once, save cookies and localStorage to a Context. Subsequent Sessions that specify the same contextId do not need to log in again. This is especially valuable when you rotate IPs frequently (a new Session per proxy change) but want to stay logged in to the same account.
First Time: Login and Save State
# 1. Create Context (once)
curl -X POST https://bf.mktindex.com/api/v1/contexts \
-H "X-API-Key: bf_live_xxxxxxxx" \
-H "Content-Type: application/json" \
-d '{"name": "amazon-account"}'
# Returns: { "id": "ctx_xxxxxxxx", ... }
# 2. Create Session bound to the Context
curl -X POST https://bf.mktindex.com/api/v1/sessions \
-H "X-API-Key: bf_live_xxxxxxxx" \
-H "Content-Type: application/json" \
-d '{"contextId": "ctx_xxxxxxxx"}'
# 3. Connect with Puppeteer and manually log in
# 4. Delete Session (automatically uploads a cookie snapshot to S3)
curl -X DELETE https://bf.mktindex.com/api/v1/sessions/ses_xxxxxxxx \
-H "X-API-Key: bf_live_xxxxxxxx"
Subsequent Times: Restore Login State Directly
# Create a new Session with the same contextId — login state auto-restored
curl -X POST https://bf.mktindex.com/api/v1/sessions \
-H "X-API-Key: bf_live_xxxxxxxx" \
-H "Content-Type: application/json" \
-d '{"contextId": "ctx_xxxxxxxx"}'
Approach C: Cookie REST API (No CDP Client Needed)
If you already have login cookies (exported from DevTools / EditThisCookie), you can directly inject them into an active Session via API, or write them to a Context for automatic restoration on subsequent Sessions. Ideal for Python/curl scripts and agent toolchains.
Inject into Session, Then Visit Target Site
# 1. Create Session
curl -X POST https://bf.mktindex.com/api/v1/sessions \
-H "X-API-Key: bf_live_xxxxxxxx" \
-H "Content-Type: application/json" \
-d '{"timeout": 300}'
# 2. Inject cookies (CDP format JSON array)
curl -X PUT https://bf.mktindex.com/api/v1/sessions/ses_xxxxxxxx/cookies \
-H "X-API-Key: bf_live_xxxxxxxx" \
-H "Content-Type: application/json" \
-d '{"cookies": [{"name":"session","value":"...","domain":".target.com","path":"/","secure":true}]}'
# 3. Connect with Puppeteer / Playwright via cdpUrl and navigate
# 4. Export current cookies for backup
curl "https://bf.mktindex.com/api/v1/sessions/ses_xxxxxxxx/cookies?domain=.target.com" \
-H "X-API-Key: bf_live_xxxxxxxx"
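Cookies exported by EditThisCookie use the Chrome extension field names (`expirationDate`, lowercase `sameSite` values), which differ from CDP's cookie parameters (`expires`, `Strict` / `Lax` / `None`). A conversion sketch, assuming the cookies endpoint accepts the CDP field names shown in the curl example. The interfaces and helper names are illustrative, and only a subset of CDP fields is covered.

```typescript
// Cookie as exported by EditThisCookie (Chrome extension cookies API field names).
interface ExportedCookie {
  name: string;
  value: string;
  domain: string;
  path?: string;
  secure?: boolean;
  httpOnly?: boolean;
  session?: boolean;
  expirationDate?: number; // seconds since the Unix epoch
  sameSite?: string; // "no_restriction" | "lax" | "strict" | "unspecified"
}

// Subset of the CDP Network.CookieParam fields.
interface CdpCookie {
  name: string;
  value: string;
  domain: string;
  path: string;
  secure: boolean;
  httpOnly: boolean;
  expires?: number;
  sameSite?: "Strict" | "Lax" | "None";
}

const SAME_SITE: Record<string, "Strict" | "Lax" | "None"> = {
  strict: "Strict",
  lax: "Lax",
  no_restriction: "None",
};

function toCdpCookie(c: ExportedCookie): CdpCookie {
  const out: CdpCookie = {
    name: c.name,
    value: c.value,
    domain: c.domain,
    path: c.path ?? "/",
    secure: c.secure ?? false,
    httpOnly: c.httpOnly ?? false,
  };
  // Session cookies carry no expiry; persistent ones keep their timestamp.
  if (!c.session && c.expirationDate !== undefined) {
    out.expires = c.expirationDate;
  }
  // "unspecified" (or anything unknown) is omitted so the browser default applies.
  const sameSite = c.sameSite === undefined ? undefined : SAME_SITE[c.sameSite];
  if (sameSite !== undefined) {
    out.sameSite = sameSite;
  }
  return out;
}

// JSON body for PUT .../cookies, in the {"cookies": [...]} shape used above.
function buildCookiePayload(cookies: ExportedCookie[]): string {
  return JSON.stringify({ cookies: cookies.map(toCdpCookie) });
}
```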
Write to Context for Persistence
curl -X PUT https://bf.mktindex.com/api/v1/contexts/ctx_xxxxxxxx/cookies \
-H "X-API-Key: bf_live_xxxxxxxx" \
-H "Content-Type: application/json" \
-d '{"cookies": [ ... ]}'
Repo examples: test/pm-agent-login.py (supports test / prod environment switching), test/cookie-api-test.py (Cookie API smoke test). See test/.env.example for config.
Proxy Configuration (Bypass IP Blocking)
Specify a proxy when creating a Session — all browser traffic will go through it. You can integrate with residential proxy pools to use a different IP per Session.
curl -X POST https://bf.mktindex.com/api/v1/sessions \
-H "X-API-Key: bf_live_xxxxxxxx" \
-H "Content-Type: application/json" \
-d '{
"os": "windows",
"proxy": {
"type": "http",
"host": "residential-proxy.provider.com",
"port": 8080,
"username": "user123",
"password": "pass456"
}
}'
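Using a different IP per Session while staying logged in combines the proxy field with a contextId. A round-robin sketch that builds the request body for each new Session: it assumes `proxy` and `contextId` can be sent together in one create call (the text implies this but does not show it), and `sessionRequestFor` and the interface names are hypothetical.

```typescript
interface ProxyConfig {
  type: string; // "http" in the example above
  host: string;
  port: number;
  username?: string;
  password?: string;
}

interface SessionRequest {
  os: "windows" | "macos" | "linux";
  proxy: ProxyConfig;
  contextId?: string;
}

// Pick the proxy for the Nth Session (0-based) by cycling through the pool.
function sessionRequestFor(
  attempt: number,
  pool: ProxyConfig[],
  contextId?: string,
): SessionRequest {
  const proxy = pool.length > 0 ? pool[attempt % pool.length] : undefined;
  if (proxy === undefined) {
    throw new Error("empty proxy pool or invalid attempt number");
  }
  const req: SessionRequest = { os: "windows", proxy };
  if (contextId !== undefined) {
    req.contextId = contextId; // reuse the saved login state
  }
  return req;
}
```

Each returned object can be serialized with `JSON.stringify` and posted to `/api/v1/sessions` like the curl example.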
OS Fingerprint Simulation
The os parameter makes the browser present a complete fingerprint matching the target platform (User-Agent, Platform, WebGL renderer, font list, etc.), indistinguishable from real Chrome on that OS.
| os Value | User-Agent Example | Recommended Scenario |
|---|---|---|
| windows | Windows NT 10.0; Win64; x64 | Most e-commerce and financial sites |
| macos | Macintosh; Intel Mac OS X 10_15_7 | Apple services, design platforms |
| linux | X11; Linux x86_64 | Dev tools, GitHub, API documentation sites |
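A quick sanity check after connecting is to confirm that the page's User-Agent carries the platform token for the requested os value. A minimal sketch using the example tokens from the table; actual User-Agent strings may differ in version details, so treat a mismatch as a prompt to inspect rather than as a hard failure. `userAgentMatchesOs` is an illustrative helper.

```typescript
type OsValue = "windows" | "macos" | "linux";

// Platform tokens copied from the "User-Agent Example" column above.
const UA_TOKEN: Record<OsValue, string> = {
  windows: "Windows NT 10.0; Win64; x64",
  macos: "Macintosh; Intel Mac OS X 10_15_7",
  linux: "X11; Linux x86_64",
};

// True when the User-Agent string contains the token expected for this os value.
function userAgentMatchesOs(userAgent: string, os: OsValue): boolean {
  return userAgent.includes(UA_TOKEN[os]);
}
```

With Puppeteer, the string to check can be read with `await page.evaluate(() => navigator.userAgent)`.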
Node.js SDK Version
In a Node.js project, we recommend managing the Session lifecycle through the SDK:
import { BrowserForestClient } from '@browser-forest/sdk';
import puppeteer from 'puppeteer-core';
const client = new BrowserForestClient({ apiKey: 'bf_live_xxxxxxxx' });
async function scrapeWithLogin(url: string) {
// To restore login state from a saved Context, also pass contextId here
const session = await client.sessions.create({
os: 'windows',
timeout: 180,
});
const browser = await puppeteer.connect({
browserWSEndpoint: session.cdpUrl!,
});
try {
const [page] = await browser.pages();
await page.goto(url, { waitUntil: 'networkidle2' });
const data = await page.evaluate(() => ({
title: document.title,
price: document.querySelector('.price')?.textContent,
}));
return data;
} finally {
await browser.disconnect();
await client.sessions.delete(session.id);
}
}