An HTTP proxy is an intermediary server that sits between your Node.js application and the target server, forwarding requests and responses. Using proxies in Node.js with libraries like axios, Puppeteer, and Playwright allows you to mask your IP address, bypass geographical restrictions, load balance requests, and scrape websites more effectively. This article provides practical examples of how to configure these libraries to use proxies.
Configuring Proxies with Axios
Axios is a popular promise-based HTTP client for Node.js. Configuring a proxy with Axios is straightforward using the proxy configuration option.
Basic Proxy Configuration
The proxy option accepts an object with host and port properties, an optional protocol, and an optional auth object that holds the username and password.
const axios = require('axios');

async function fetchData() {
  try {
    const response = await axios.get('https://api.example.com/data', {
      proxy: {
        protocol: 'http',
        host: 'your_proxy_host',
        port: 8080,
        // Optional: only needed when the proxy requires authentication
        auth: {
          username: 'your_username',
          password: 'your_password'
        }
      }
    });
    console.log(response.data);
  } catch (error) {
    console.error('Error fetching data:', error.message);
  }
}

fetchData();
Replace your_proxy_host and 8080 with your actual proxy host and port. If your proxy requires authentication, provide the username and password inside the auth object; otherwise, omit auth.
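Proxy providers often hand out a single URL rather than separate fields. As a minimal sketch, the helper below splits such a URL into the object shape that Axios documents for its proxy option, using Node's built-in URL class. The name toAxiosProxy, the example hostname, and the fallback ports are assumptions made for illustration; none of them come from Axios itself.

```javascript
// Hypothetical helper (not part of Axios): converts a proxy URL into the
// object shape documented for the Axios `proxy` request option.
function toAxiosProxy(proxyUrl) {
  const url = new URL(proxyUrl);
  const protocol = url.protocol.replace(':', ''); // 'http:' -> 'http'

  const proxy = {
    protocol,
    host: url.hostname,
    // URL.port is '' when the URL omits the port (or uses the scheme default),
    // so fall back to 443 or 80. The fallback values are an assumption.
    port: url.port ? Number(url.port) : (protocol === 'https' ? 443 : 80),
  };

  // Credentials are optional. URL components are percent-encoded, so decode them.
  if (url.username) {
    proxy.auth = {
      username: decodeURIComponent(url.username),
      password: decodeURIComponent(url.password),
    };
  }
  return proxy;
}

console.log(toAxiosProxy('http://user:secret@proxy.example.com:8080'));
```

The returned object can be passed as the proxy property of an Axios request config.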
Using Environment Variables
For security and maintainability, it's best to store proxy credentials in environment variables.
const axios = require('axios');

async function fetchData() {
  try {
    const proxyHost = process.env.PROXY_HOST;
    const proxyPort = process.env.PROXY_PORT;
    const proxyUsername = process.env.PROXY_USERNAME;
    const proxyPassword = process.env.PROXY_PASSWORD;

    const response = await axios.get('https://api.example.com/data', {
      proxy: {
        host: proxyHost,
        // Environment variables are strings, so convert the port to an integer
        port: parseInt(proxyPort, 10),
        // Axios expects the credentials inside an auth object
        auth: {
          username: proxyUsername,
          password: proxyPassword
        }
      }
    });
    console.log(response.data);
  } catch (error) {
    console.error('Error fetching data:', error.message);
  }
}

fetchData();
Make sure to set the environment variables before running the script. For example, in your terminal:
export PROXY_HOST=your_proxy_host
export PROXY_PORT=8080
export PROXY_USERNAME=your_username
export PROXY_PASSWORD=your_password
node your_script.js
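A missing or mistyped variable otherwise surfaces only as an obscure connection error. The sketch below shows one way to validate the settings up front. The function name readProxyEnv is hypothetical, and the rule that credentials are attached only when both parts are present is an assumption rather than an Axios requirement.

```javascript
// Hypothetical helper: reads proxy settings from an env-like object and
// fails fast with a clear message when something is missing or malformed.
function readProxyEnv(env = process.env) {
  const { PROXY_HOST, PROXY_PORT, PROXY_USERNAME, PROXY_PASSWORD } = env;

  if (!PROXY_HOST) {
    throw new Error('PROXY_HOST is not set');
  }
  const port = Number.parseInt(PROXY_PORT, 10);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`PROXY_PORT is not a valid port: ${PROXY_PORT}`);
  }

  const config = { host: PROXY_HOST, port };
  // Only attach credentials when both parts are present (assumption).
  if (PROXY_USERNAME && PROXY_PASSWORD) {
    config.auth = { username: PROXY_USERNAME, password: PROXY_PASSWORD };
  }
  return config;
}

// Called with an explicit object so the example does not depend on the real environment.
console.log(readProxyEnv({ PROXY_HOST: 'proxy.example.com', PROXY_PORT: '8080' }));
```

Calling readProxyEnv() with no argument reads from process.env.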
Handling HTTPS Proxies
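When the target URL uses HTTPS, the request is normally tunneled through the proxy with the HTTP CONNECT method, so make sure your proxy supports it. Axios accepts the same proxy configuration for HTTPS targets, but depending on the Axios version, its built-in proxy handling may not tunnel HTTPS requests through an HTTP proxy reliably. If such requests fail, a common workaround is to set proxy: false and pass an agent from a package such as https-proxy-agent through the httpsAgent option. If the proxy server itself is reached over TLS, set protocol: 'https' in the proxy object.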
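To make the CONNECT step concrete, the sketch below builds the request options that Node's built-in http.request() would need to open a tunnel. It only constructs the options and does not open a connection. The function name buildConnectOptions and the shape of its proxy argument are assumptions for illustration.

```javascript
// Hypothetical helper: builds the options for the CONNECT request that opens
// a tunnel through an HTTP proxy. Pass the result to http.request() and listen
// for the 'connect' event to obtain the tunneled socket.
function buildConnectOptions(proxy, targetHost, targetPort = 443) {
  const authority = `${targetHost}:${targetPort}`;
  const headers = { Host: authority };

  if (proxy.username) {
    // Proxy credentials travel in the Proxy-Authorization header (Basic scheme).
    const token = Buffer.from(`${proxy.username}:${proxy.password}`).toString('base64');
    headers['Proxy-Authorization'] = `Basic ${token}`;
  }

  return {
    host: proxy.host,
    port: proxy.port,
    method: 'CONNECT',
    path: authority, // CONNECT uses "host:port" as the request target
    headers,
  };
}

console.log(buildConnectOptions({ host: 'proxy.example.com', port: 8080 }, 'api.example.com'));
```

Once the proxy answers with a 2xx status, the client performs the TLS handshake with the target through the tunneled socket, so the proxy relays the encrypted bytes without reading them.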
Configuring Proxies with Puppeteer
Puppeteer is a Node.js library that provides a high-level API to control headless Chrome or Chromium. Using proxies with Puppeteer is crucial for web scraping and automation tasks where you need to avoid IP bans or access geographically restricted content.
Launching Puppeteer with a Proxy
You can configure a proxy when launching Puppeteer using the --proxy-server launch argument.
const puppeteer = require('puppeteer');

async function scrapeWebsite() {
  const browser = await puppeteer.launch({
    args: [
      `--proxy-server=http://${process.env.PROXY_HOST}:${process.env.PROXY_PORT}`
    ]
  });
  const page = await browser.newPage();
  await page.goto('https://www.example.com');
  console.log(await page.title());
  await browser.close();
}

scrapeWebsite();
This example reads the proxy host and port from the PROXY_HOST and PROXY_PORT environment variables, so set them before running the script. It assumes an HTTP proxy; for an HTTPS proxy, use https:// in the --proxy-server argument.
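If either variable is unset, the template string silently produces http://undefined:undefined. One way to guard against that is to assemble the flags in a small function, sketched below. The name buildProxyArgs and its option names are hypothetical; --proxy-server and --proxy-bypass-list are Chromium command-line switches.

```javascript
// Hypothetical helper: assembles Chromium proxy flags for puppeteer.launch({ args }).
function buildProxyArgs({ host, port, scheme = 'http', bypass = [] }) {
  if (!host || !port) {
    throw new Error('Proxy host and port are required');
  }
  const args = [`--proxy-server=${scheme}://${host}:${port}`];
  if (bypass.length > 0) {
    // Hosts listed here are contacted directly, skipping the proxy.
    // Chromium expects the entries to be separated by semicolons.
    args.push(`--proxy-bypass-list=${bypass.join(';')}`);
  }
  return args;
}

console.log(buildProxyArgs({ host: 'proxy.example.com', port: 8080, bypass: ['*.internal.example'] }));
```

The resulting array can be passed as the args option of puppeteer.launch().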
Authenticated Proxies with Puppeteer
Puppeteer doesn't support passing a username and password through launch arguments, because Chromium ignores credentials embedded in the --proxy-server value. Instead, supply the credentials for each page with page.authenticate().
const puppeteer = require('puppeteer');

async function scrapeWebsite() {
  const browser = await puppeteer.launch({
    args: [
      `--proxy-server=http://${process.env.PROXY_HOST}:${process.env.PROXY_PORT}`
    ]
  });
  const page = await browser.newPage();

  // Authenticate with the proxy using page.authenticate
  await page.authenticate({
    username: process.env.PROXY_USERNAME,
    password: process.env.PROXY_PASSWORD,
  });

  await page.goto('https://www.example.com');
  console.log(await page.title());
  await browser.close();
}

scrapeWebsite();
The page.authenticate() method provides the necessary credentials to the proxy server. This method must be called before navigating to the target page.
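Because the ordering is easy to get wrong once pages are opened in several places, it can help to centralize it. The following is a minimal sketch under that assumption: openThroughProxy is a hypothetical wrapper that takes an already launched Puppeteer browser and uses only newPage(), authenticate(), and goto().

```javascript
// Hypothetical helper: opens a page and makes sure proxy credentials are
// registered before the first navigation. `browser` is expected to be a
// Puppeteer Browser launched with a --proxy-server argument.
async function openThroughProxy(browser, url, credentials) {
  const page = await browser.newPage();
  if (credentials && credentials.username) {
    // Must run before goto(), otherwise the proxy rejects the first request.
    await page.authenticate(credentials);
  }
  await page.goto(url);
  return page;
}
```

A caller would pass the browser returned by puppeteer.launch() together with an object holding username and password, then work with the returned page as usual.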
Proxy per Request Considerations
The --proxy-server argument applies to the entire browser instance. If you need different proxies for different requests, launch multiple browser instances, each configured with its own proxy. Alternatively, a library that intercepts requests can apply proxy settings per request, but this approach is less common and more complex.
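When launching one browser per proxy, the work first has to be divided among the proxies. The sketch below shows a simple round-robin split. The function name groupUrlsByProxy and the example URLs are hypothetical, and round-robin is only one possible assignment strategy.

```javascript
// Hypothetical helper: distributes URLs across proxies round-robin and returns
// a Map from proxy server to the URLs it should handle. Each entry can then be
// served by one browser launched with that proxy's --proxy-server argument.
function groupUrlsByProxy(urls, proxies) {
  if (proxies.length === 0) {
    throw new Error('At least one proxy is required');
  }
  const groups = new Map(proxies.map((proxy) => [proxy, []]));
  urls.forEach((url, index) => {
    groups.get(proxies[index % proxies.length]).push(url);
  });
  return groups;
}

const plan = groupUrlsByProxy(
  ['https://example.com/a', 'https://example.com/b', 'https://example.com/c'],
  ['http://proxy-one.example:8080', 'http://proxy-two.example:8080']
);
for (const [proxy, urls] of plan) {
  console.log(proxy, '->', urls);
}
```

Each browser instance consumes memory and startup time, so the number of proxies used at once is a trade-off against available resources.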
Configuring Proxies with Playwright
Playwright is another powerful Node.js library for browser automation and end-to-end testing. Like Puppeteer, it supports configuring proxies for its browser instances.
Launching Playwright with a Proxy
Playwright offers a more direct way to configure proxies compared to Puppeteer, using the proxy property in the launch options.
const { chromium } = require('playwright');

async function scrapeWebsite() {
  const browser = await chromium.launch({
    proxy: {
      server: `http://${process.env.PROXY_HOST}:${process.env.PROXY_PORT}`,
      username: process.env.PROXY_USERNAME,
      password: process.env.PROXY_PASSWORD,
    }
  });
  const page = await browser.newPage();
  await page.goto('https://www.example.com');
  console.log(await page.title());
  await browser.close();
}

scrapeWebsite();
This approach is cleaner and more explicit, as it directly supports username and password authentication within the proxy configuration.
Proxy per Context
Playwright supports browser contexts, which allow for isolated browsing sessions within a single browser instance. Each context can have its own proxy configuration. This is useful for running parallel tasks, each with a different proxy.
const { chromium } = require('playwright');

async function scrapeWebsite() {
  const browser = await chromium.launch();
  const context = await browser.newContext({
    proxy: {
      server: `http://${process.env.PROXY_HOST}:${process.env.PROXY_PORT}`,
      username: process.env.PROXY_USERNAME,
      password: process.env.PROXY_PASSWORD,
    }
  });
  const page = await context.newPage();
  await page.goto('https://www.example.com');
  console.log(await page.title());
  await browser.close();
}

scrapeWebsite();
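To run several tasks in parallel, each through a different proxy, the same pattern can be repeated once per context. The following is a minimal sketch under the assumption that the caller passes in an already launched Playwright browser. The function name titlesViaProxies and the shape of the jobs array are hypothetical.

```javascript
// Hypothetical helper: visits each URL in its own browser context, each with
// its own proxy settings. `browser` is expected to be a Playwright Browser.
// Each job looks like { url, proxy: { server, username, password } }.
async function titlesViaProxies(browser, jobs) {
  return Promise.all(
    jobs.map(async ({ url, proxy }) => {
      const context = await browser.newContext({ proxy });
      try {
        const page = await context.newPage();
        await page.goto(url);
        return await page.title();
      } finally {
        // Closing the context also closes the pages that belong to it.
        await context.close();
      }
    })
  );
}
```

The returned array keeps the order of the jobs. Closing each context in a finally block releases its pages even when a navigation fails.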
Comparison Table: Proxy Configuration
| Feature | Axios | Puppeteer | Playwright |
|---|---|---|---|
| Configuration | proxy option in request config | --proxy-server launch argument, page.authenticate() | proxy option in launch or newContext |
| Authentication | auth object (username and password) inside proxy | page.authenticate() | username and password in proxy |
| HTTPS Support | Same proxy option; some setups need an agent such as https-proxy-agent | Yes, use https:// in --proxy-server | Yes, same configuration as HTTP |
| Proxy per Request | Yes, proxy can be set in each request's config | No, global for the browser instance | Per context rather than per request, using browser contexts |
Best Practices
- Use Environment Variables: Store proxy credentials in environment variables to avoid hardcoding them in your code.
- Handle Errors: Implement error handling to gracefully manage proxy connection issues.
- Rotate Proxies: For web scraping, rotate proxies frequently to avoid IP bans. Consider using a proxy pool.
- Test Your Setup: Verify that your proxy is working correctly by checking your IP address through the proxy. Use a website such as whatismyipaddress.com.
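- Secure Proxies: Prefer a proxy that you connect to over TLS (an HTTPS proxy) so that traffic between your application and the proxy is encrypted. SOCKS5 does not encrypt traffic by itself, so rely on HTTPS to the target site for end-to-end protection.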
- Legal Considerations: Ensure your web scraping activities comply with the target website's terms of service and all applicable laws.
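The advice on rotating proxies and handling errors can be sketched as a small pool that hands out proxies in turn and retires one after a failure. ProxyPool and withRotation are hypothetical names, the proxy addresses are placeholders, and retiring a proxy on any error is a simplifying assumption, since a failure may also come from the target site.

```javascript
// Hypothetical proxy pool: hands out proxies round-robin and lets callers
// retire a proxy after a failure.
class ProxyPool {
  constructor(proxies) {
    this.proxies = [...proxies];
    this.index = 0;
  }

  // Returns the next proxy in rotation, or throws when none are left.
  next() {
    if (this.proxies.length === 0) {
      throw new Error('No working proxies left in the pool');
    }
    const proxy = this.proxies[this.index % this.proxies.length];
    this.index = (this.index + 1) % this.proxies.length;
    return proxy;
  }

  // Removes a proxy so that it is not handed out again.
  remove(proxy) {
    const position = this.proxies.indexOf(proxy);
    if (position === -1) return;
    this.proxies.splice(position, 1);
    // Keep the cursor pointing at the same upcoming proxy after the removal.
    if (position < this.index) this.index -= 1;
    if (this.index >= this.proxies.length) this.index = 0;
  }
}

// Runs task(proxy) and moves on to the next proxy when the task throws.
async function withRotation(pool, task, maxAttempts = 3) {
  let lastError;
  for (let attempt = 0; attempt < maxAttempts; attempt += 1) {
    const proxy = pool.next();
    try {
      return await task(proxy);
    } catch (error) {
      lastError = error;
      pool.remove(proxy); // Assumption: any failure retires the proxy.
    }
  }
  throw lastError;
}

const pool = new ProxyPool(['http://a.example:8080', 'http://b.example:8080', 'http://c.example:8080']);
console.log(pool.next()); // http://a.example:8080
console.log(pool.next()); // http://b.example:8080
pool.remove('http://b.example:8080');
console.log(pool.next()); // http://c.example:8080
```

The task passed to withRotation would typically wrap an Axios request or a browser launch that uses the given proxy. A production pool would likely distinguish proxy failures from target failures and re-admit proxies after a cool-down period.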
Conclusion
Configuring proxies in Node.js with libraries like axios, Puppeteer, and Playwright is essential for various tasks, including web scraping, bypassing geo-restrictions, and enhancing privacy. Each library offers different approaches to proxy configuration, with Playwright providing the most flexible and direct method, especially for authenticated proxies and per-context configurations. By following the best practices outlined in this article, you can effectively utilize proxies in your Node.js applications.