To set up a proxy for Puppeteer in Node.js, configure the browser launch arguments with --proxy-server=<proxy_address>:<port> to route all browser traffic through the specified proxy. This enables IP address masking, geographic targeting, or bypassing rate limits and IP bans during automated web interactions.
Basic Proxy Configuration
Puppeteer launches and controls a Chromium instance, so proxy settings are passed to the browser as command-line arguments at launch. The primary argument is --proxy-server.
Specifying a Proxy Server
The --proxy-server argument accepts the proxy's address and port. The protocol (HTTP, HTTPS, SOCKS4, SOCKS5) can be explicitly defined; HTTP is typically assumed if omitted.
```javascript
const puppeteer = require('puppeteer');

// Launch a browser whose traffic is routed through the given proxy.
async function launchBrowserWithProxy(proxyAddress) {
  const browser = await puppeteer.launch({
    args: [
      `--proxy-server=${proxyAddress}`,
      // Optional: to reduce WebRTC IP leaks, consider restricting WebRTC
      // to proxied traffic only:
      // '--force-webrtc-ip-handling-policy=disable_non_proxied_udp'
    ],
    headless: true, // or false for headful
  });
  const page = await browser.newPage();
  return { browser, page };
}

async function runBasicProxyExample() {
  const proxy = 'http://your-proxy-ip:port'; // Example: 'http://192.168.1.1:8080' or 'socks5://192.168.1.1:1080'
  let browser;
  try {
    const { browser: launchedBrowser, page } = await launchBrowserWithProxy(proxy);
    browser = launchedBrowser; // Assign to outer scope variable for finally block
    await page.goto('https://httpbin.org/ip'); // A service to show your external IP
    const ipInfo = await page.evaluate(() => document.body.textContent);
    console.log('External IP through proxy:', ipInfo);
  } catch (error) {
    console.error('Error launching browser with proxy:', error);
  } finally {
    if (browser) {
      await browser.close();
    }
  }
}

runBasicProxyExample();
```
Replace your-proxy-ip:port with the actual IP address and port of your proxy server. Ensure the protocol prefix (e.g., http://, socks5://) matches your proxy type.
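Because a malformed address or a leftover placeholder only surfaces later as a navigation error, it can help to validate the address before launching. The following is a minimal sketch, not part of Puppeteer: buildProxyArg and SUPPORTED_SCHEMES are hypothetical names, and the helper assumes a plain scheme://host:port address (no credentials, no IPv6 literals).

```javascript
// Hypothetical helper (not a Puppeteer API): validates a proxy address and
// returns the matching Chromium command-line argument.
const SUPPORTED_SCHEMES = ['http', 'https', 'socks4', 'socks5'];

function buildProxyArg(proxyAddress) {
  // Assume HTTP when no scheme is given.
  const withScheme = proxyAddress.includes('://')
    ? proxyAddress
    : `http://${proxyAddress}`;

  // Expect scheme://host:port with a numeric port.
  const match = /^([a-z][a-z0-9]*):\/\/([^\s/:@]+):(\d{1,5})$/i.exec(withScheme);
  if (!match) {
    throw new Error(`Expected scheme://host:port, got "${proxyAddress}"`);
  }

  const [, rawScheme, host, port] = match;
  const scheme = rawScheme.toLowerCase();
  if (!SUPPORTED_SCHEMES.includes(scheme)) {
    throw new Error(`Unsupported proxy scheme "${scheme}"`);
  }
  if (Number(port) < 1 || Number(port) > 65535) {
    throw new Error(`Port out of range: ${port}`);
  }

  return `--proxy-server=${scheme}://${host}:${port}`;
}
```

The returned string can be placed directly in the args array passed to puppeteer.launch(). An unreplaced placeholder such as 'http://your-proxy-ip:port' is rejected because the port is not numeric.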
Authenticated Proxies
Many proxy services require authentication. Puppeteer handles this via page.authenticate(): call it after creating the page and before the first navigation, and Puppeteer will answer the proxy's HTTP authentication challenge (status 407) with the supplied credentials.
Proxy Authentication Methods
- page.authenticate(): Recommended for HTTP Basic authentication with Puppeteer.
- username:password@ip:port in --proxy-server: Some tools accept this format directly, but Chromium generally ignores credentials embedded in this flag, so page.authenticate() is the standard Puppeteer approach for authentication.
```javascript
const puppeteer = require('puppeteer');

async function launchBrowserWithAuthenticatedProxy(proxyAddress, username, password) {
  const browser = await puppeteer.launch({
    args: [`--proxy-server=${proxyAddress}`],
    headless: true,
  });
  const page = await browser.newPage();
  // Authenticate with the proxy
  await page.authenticate({ username, password });
  return { browser, page };
}

async function runAuthenticatedProxyExample() {
  const proxy = 'http://your-authenticated-proxy-ip:port';
  const proxyUsername = 'your_username';
  const proxyPassword = 'your_password';
  let browser;
  try {
    const { browser: launchedBrowser, page } = await launchBrowserWithAuthenticatedProxy(proxy, proxyUsername, proxyPassword);
    browser = launchedBrowser;
    await page.goto('https://httpbin.org/ip');
    const ipInfo = await page.evaluate(() => document.body.textContent);
    console.log('External IP through authenticated proxy:', ipInfo);
  } catch (error) {
    console.error('Error with authenticated proxy:', error);
  } finally {
    if (browser) {
      await browser.close();
    }
  }
}

runAuthenticatedProxyExample();
```
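Proxy providers often hand out a single URL with the credentials embedded (scheme://username:password@host:port). One way to bridge that format to the approach above is to split the URL into the part that goes into --proxy-server and the part that goes to page.authenticate(). This is a rough sketch under that assumption; splitProxyUrl is a hypothetical helper, and it assumes credentials are percent-encoded where they contain reserved characters.

```javascript
// Hypothetical helper (not a Puppeteer API): splits a proxy URL that may
// contain credentials into a server address and a credentials object.
function splitProxyUrl(proxyUrl) {
  const pattern =
    /^([a-z][a-z0-9]*):\/\/(?:([^:@/\s]+):([^@/\s]*)@)?([^\s/:@]+):(\d{1,5})$/i;
  const match = pattern.exec(proxyUrl);
  if (!match) {
    throw new Error(`Unrecognized proxy URL: "${proxyUrl}"`);
  }

  const [, scheme, username, password, host, port] = match;
  return {
    // Value for --proxy-server, with credentials stripped.
    server: `${scheme.toLowerCase()}://${host}:${port}`,
    // Argument for page.authenticate(), or null when no credentials are present.
    credentials:
      username === undefined
        ? null
        : {
            username: decodeURIComponent(username),
            password: decodeURIComponent(password),
          },
  };
}
```

With this split, server would be interpolated into the --proxy-server argument, and credentials (when not null) would be passed to page.authenticate() before the first page.goto() call.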
Proxy Rotation Strategies
For large-scale scraping, routing every request through a single proxy concentrates traffic on one IP address, which tends to trigger rate limits or bans. Proxy rotation switches between multiple proxy servers to distribute requests and maintain anonymity.
Implementing Proxy Rotation
- Maintain a Proxy List: Store available proxies in an array, including authentication details if necessary.
- Selection Logic: Implement a strategy to pick a proxy for each new browser instance or request. Common strategies: Round-Robin, Random Selection, Weighted Rotation.
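The three selection strategies named above can be sketched as small functions over a proxy list. This is an illustrative sketch rather than a fixed recipe: the function names are hypothetical, the weight field is an assumed addition to each proxy entry (treated as 1 when absent), and the random parameter exists only so the selection can be made deterministic in tests.

```javascript
// Round-robin: returns a function that cycles through the list in order.
function createRoundRobinSelector(proxies) {
  let index = 0;
  return () => {
    const proxy = proxies[index];
    index = (index + 1) % proxies.length;
    return proxy;
  };
}

// Random selection: every proxy is equally likely.
function pickRandom(proxies, random = Math.random) {
  return proxies[Math.floor(random() * proxies.length)];
}

// Weighted rotation: a proxy with weight 3 is picked about three times as
// often as one with weight 1. The weight field is an assumption of this sketch.
function pickWeighted(proxies, random = Math.random) {
  const totalWeight = proxies.reduce((sum, p) => sum + (p.weight ?? 1), 0);
  let threshold = random() * totalWeight;
  for (const proxy of proxies) {
    threshold -= proxy.weight ?? 1;
    if (threshold < 0) {
      return proxy;
    }
  }
  // Fallback for floating-point edge cases.
  return proxies[proxies.length - 1];
}
```

Round-robin spreads load evenly and predictably, random selection avoids a detectable ordering, and weighted rotation lets faster or more reliable proxies take a larger share of requests.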
```javascript
const puppeteer = require('puppeteer');
// Example list of proxies, some with authentication details
const proxyConfigurations = [
{ address: 'http://proxy1.example.com:8080' },
{ address: 'http://auth-proxy1.example.com:8080', username: 'userA', password: 'passA' },
{ address: 'socks5://proxy3.