HAProxy (High Availability Proxy) is an open-source, high-performance TCP/HTTP load balancer and proxy server that distributes network traffic across multiple backend servers to maximize performance, reliability, and server capacity. It operates at both Layer 4 (TCP) and Layer 7 (HTTP) of the OSI model, enabling precise traffic management, high availability, and efficient resource utilization for applications and services.
HAProxy is renowned for its speed, stability, and ability to handle very high traffic volumes. It is commonly deployed in front of web servers, application servers, and database clusters to ensure even distribution of client requests, prevent server overload, and facilitate seamless maintenance operations.
Load Balancing with HAProxy
Load balancing is the process of distributing network traffic across a group of backend servers, known as a server farm or cluster. HAProxy employs various algorithms to determine which server receives the next request, aiming to optimize resource use, maximize throughput, minimize response time, and avoid individual server overload.
Key aspects of HAProxy load balancing include:
- Algorithm Selection: HAProxy offers several algorithms to suit different application needs.
- Server Health Checks: Continuous monitoring of backend server availability and responsiveness.
- Server Weighting: Prioritizing certain servers to receive more traffic.
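As a minimal sketch of server weighting, the backend below gives one server three times the share of the other. The backend name and addresses are hypothetical; weight is a standard server keyword that accepts values from 0 to 256 and defaults to 1.

```haproxy
# Hypothetical backend: web1 receives roughly 3 of every 4 requests.
backend weighted_web_servers
    balance roundrobin
    server web1 192.168.1.10:80 check weight 3
    server web2 192.168.1.11:80 check weight 1
```

Weights like these are typically used when servers differ in capacity, or to send a small share of traffic to a newly deployed instance.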
Load Balancing Algorithms
HAProxy provides a range of algorithms, selected with the balance directive in a backend, listen, or defaults section:
- roundrobin: Distributes requests sequentially to each server in the backend group. Default algorithm.
- leastconn: Directs new connections to the server with the fewest active connections. Optimal for long-lived connections.
- source: Uses a hash of the source IP address to determine the server. Ensures a client consistently connects to the same server, useful for stateful applications without explicit session persistence.
- uri: Hashes the left part of the URL (before the query string) to select a server. Useful for caching proxies.
- hdr(<name>): Hashes the value of a specified HTTP header.
- random: Randomly picks a server.
backend web_servers
    balance roundrobin
    server web1 192.168.1.10:80 check
    server web2 192.168.1.11:80 check
    server web3 192.168.1.12:80 check
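The hash-based algorithms can be sketched in the same way. The example below assumes a hypothetical cache tier (the backend name and addresses are illustrative) and combines balance uri with hash-type consistent, which is one reasonable pairing rather than the only one.

```haproxy
# Hypothetical backend for a cache tier: hash the URI so that a given
# object is always requested from the same server.
backend cache_servers
    balance uri
    # Consistent hashing limits how many keys move to a different
    # server when one is added or removed.
    hash-type consistent
    server cache1 192.168.1.30:80 check
    server cache2 192.168.1.31:80 check
```

Without hash-type consistent, HAProxy uses a map-based hash, and a change in the number of servers remaps most keys.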
Proxying with HAProxy
Proxying involves an intermediary server that acts on behalf of a client or server. HAProxy functions as a reverse proxy, accepting connections from clients and forwarding them to backend servers, then returning the servers' responses to the clients. This abstraction layer provides security, performance, and operational benefits.
Layer 4 (TCP) Proxying
At Layer 4, HAProxy forwards raw TCP connections without inspecting the application-layer content. This is suitable for non-HTTP services, databases, or custom protocols where content inspection is not required or desired.
listen mysql_cluster
    bind *:3306
    mode tcp
    balance leastconn
    server db1 192.168.1.20:3306 check
    server db2 192.168.1.21:3306 check
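In the block above, the check keyword alone only verifies that a TCP connection can be opened. A possible variant, sketched below, uses option mysql-check so that the probe speaks the MySQL protocol. The section name and the haproxy_check user are assumptions; that user would have to exist on each database server, without a password.

```haproxy
# Variant of the listener above with a protocol-aware health check.
listen mysql_cluster_checked
    bind *:3306
    mode tcp
    balance leastconn
    # Authenticate as a dedicated check user (hypothetical name)
    # instead of only opening a TCP connection.
    option mysql-check user haproxy_check
    server db1 192.168.1.20:3306 check
    server db2 192.168.1.21:3306 check
```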
Layer 7 (HTTP) Proxying
At Layer 7, HAProxy can inspect and manipulate HTTP request and response headers, URLs, and cookies. This enables advanced features like content-based routing, SSL termination, URL rewriting, and session persistence.
frontend http_frontend
    bind *:80
    mode http
    default_backend web_servers
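Header manipulation at Layer 7 might look like the following sketch. The frontend name is hypothetical and the specific headers are chosen only for illustration; the directives themselves are standard HAProxy keywords.

```haproxy
frontend http_frontend_headers
    bind *:80
    mode http
    # Add the client address in an X-Forwarded-For request header.
    option forwardfor
    # Tell the application which scheme the client used.
    http-request set-header X-Forwarded-Proto http
    # Remove a response header that reveals the server software.
    http-response del-header Server
    default_backend web_servers
```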
HAProxy Configuration Components
HAProxy configuration typically resides in /etc/haproxy/haproxy.cfg and is structured into several sections.
global section
The global section defines process-wide parameters, such as logging, security settings, and performance limits. These settings apply to the entire HAProxy instance.
global
    log /dev/log local0 info
    maxconn 20000
    chroot /var/lib/haproxy
    pidfile /var/run/haproxy.pid
    user haproxy
    group haproxy
    daemon
    stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners
defaults section
The defaults section specifies default parameters for all listen, frontend, and backend sections that follow it. This reduces configuration redundancy.
defaults
    mode http
    log global
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms
    option httplog
    option dontlognull
    option http-server-close
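Any later section can override a value it inherits from defaults. The sketch below assumes a hypothetical TCP service on port 5000 (section name, port, and address are illustrative) to show the override mechanism.

```haproxy
defaults
    mode http
    timeout connect 5s
    timeout client 50s
    timeout server 50s

# Hypothetical TCP service that overrides the inherited values.
listen raw_tcp_service
    bind *:5000
    mode tcp              # overrides "mode http" from defaults
    option tcplog         # TCP-oriented log format
    timeout client 10m    # overrides the 50s default
    timeout server 10m
    server app1 192.168.1.40:5000 check
```

Settings not repeated in the listen section, such as timeout connect here, keep the value from defaults.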
frontend section
A frontend defines the public-facing entry point where HAProxy listens for client connections. It specifies the IP address, port, protocol (mode), and rules for routing requests to specific backends using Access Control Lists (ACLs).
frontend http_in
    bind *:80
    mode http
    acl host_app1 hdr(host) -i app1.example.com
    acl host_app2 hdr(host) -i app2.example.com
    use_backend app1_servers if host_app1
    use_backend app2_servers if host_app2
    default_backend default_web_servers
backend section
A backend defines a group of servers that HAProxy can forward requests to. It includes the load balancing algorithm, health check parameters, and the individual server definitions.
backend app1_servers
    balance leastconn
    option httpchk GET /healthz
    server s1 10.0.0.10:8080 check inter 2000 fall 3 rise 2
    server s2 10.0.0.11:8080 check inter 2000 fall 3 rise 2 backup
- check: Enables health checks for the server.
- inter 2000: Checks every 2000ms.
- fall 3: Marks the server down after 3 consecutive failed checks.
- rise 2: Marks the server up after 2 consecutive successful checks.
- backup: Server will only be used when all other non-backup servers are down.
listen section
A listen section combines the functionalities of a frontend and a backend into a single block. This is often used for simpler configurations or for services like HAProxy's statistics page.
listen stats_page
    bind *:8080
    mode http
    stats enable
    stats uri /haproxy?stats
    stats realm HAProxy\ Statistics
    stats auth admin:securepassword
    stats refresh 5s
Access Control Lists (ACLs)
ACLs are powerful conditional rules used to match specific criteria in client requests (e.g., source IP, host header, URL path, HTTP method). They enable dynamic routing, content switching, and blocking.
frontend website_frontend
    bind *:443 ssl crt /etc/haproxy/certs/mydomain.pem
    # ACLs based on path
    acl is_admin_area path_beg /admin
    acl is_api_v1 path_beg /api/v1
    # Route based on ACLs
    use_backend admin_backend if is_admin_area
    use_backend api_v1_backend if is_api_v1
    default_backend main_website_backend
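ACLs can also be combined and used to block requests. The following sketch assumes a hypothetical office network (203.0.113.0/24 is a documentation address range) and hypothetical frontend and api_backend names; the operators shown are standard ACL syntax.

```haproxy
frontend website_frontend_acl_combo
    bind *:443 ssl crt /etc/haproxy/certs/mydomain.pem
    mode http
    acl is_admin_area path_beg /admin
    acl from_office   src 203.0.113.0/24
    acl is_api_v1     path_beg /api/v1
    acl is_api_v2     path_beg /api/v2
    # AND is implicit between ACLs and "!" negates:
    # reject admin requests that do not come from the office network.
    http-request deny if is_admin_area !from_office
    use_backend admin_backend if is_admin_area
    # "||" expresses OR between conditions.
    use_backend api_backend if is_api_v1 || is_api_v2
    default_backend main_website_backend
```

The use_backend rules are evaluated in order and the first match wins, so more specific conditions are usually listed first.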
Health Checks
HAProxy continually monitors the health of backend servers to ensure requests are only sent to operational instances. If a server fails health checks, HAProxy temporarily removes it from the rotation until it recovers.
- TCP Check (check): Basic port connectivity.
- HTTP Check (option httpchk): Sends an HTTP request (e.g., GET /health) and expects a valid HTTP status code (2xx or 3xx).
- SSL Hello Check (option ssl-hello-chk): Sends an SSL client hello and verifies that the server replies with a server hello. It does not complete a full handshake.
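As a sketch of an HTTP check with a stricter success condition, the backend below assumes a hypothetical /health endpoint and backend name. It uses http-check expect to narrow what counts as healthy and default-server to share the check timing across servers.

```haproxy
backend app_health_checked
    # Probe a hypothetical /health endpoint over HTTP.
    option httpchk GET /health
    # Accept only 200; without this line any 2xx or 3xx status passes.
    http-check expect status 200
    # Shared check settings, so each server line stays short.
    default-server inter 2s fall 3 rise 2
    server s1 10.0.0.10:8080 check
    server s2 10.0.0.11:8080 check
```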
Advanced HAProxy Features
SSL Termination and Offloading
HAProxy can handle SSL/TLS encryption and decryption, offloading this CPU-intensive task from backend servers. It decrypts incoming HTTPS traffic and forwards plain HTTP to the backend, or can re-encrypt for end-to-end SSL.
frontend https_in
    bind *:443 ssl crt /etc/haproxy/certs/mydomain.pem
    mode http
    default_backend web_servers
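The re-encryption case, together with a redirect from HTTP to HTTPS, could be sketched as follows. The section names, the backend port, and the backend-ca.pem path are assumptions for illustration.

```haproxy
frontend https_in_redirect
    bind *:80
    bind *:443 ssl crt /etc/haproxy/certs/mydomain.pem
    mode http
    # Send plain-HTTP clients to HTTPS; ssl_fc is true on TLS connections.
    http-request redirect scheme https code 301 unless { ssl_fc }
    default_backend web_servers_tls

backend web_servers_tls
    mode http
    # Re-encrypt traffic to the servers and verify their certificates
    # against a hypothetical private CA file.
    server web1 192.168.1.10:443 ssl verify required ca-file /etc/haproxy/certs/backend-ca.pem check
    server web2 192.168.1.11:443 ssl verify required ca-file /etc/haproxy/certs/backend-ca.pem check
```

Re-encryption gives back part of the CPU savings of offloading, so it is usually reserved for networks where traffic between the proxy and the servers cannot be trusted.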
Sticky Sessions (Persistence)
Sticky sessions ensure that a client's requests are consistently routed to the same backend server throughout their session. This is critical for applications that maintain session state on individual servers.
- Cookie-based persistence: HAProxy adds a cookie to the response. The browser returns it on subsequent requests, and HAProxy uses its value to select the same backend server.
- Source IP persistence: Uses the client's source IP address to consistently route to the same server (less reliable when many clients share one address behind NAT, or when a client's address changes).
backend web_servers_sticky
    balance roundrobin
    cookie SERVERID insert indirect nocache
    server s1 10.0.0.10:80 check cookie s1
    server s2 10.0.0.11:80 check cookie s2
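Source IP persistence can be sketched with a stick table. The backend name, table size, and expiry below are illustrative assumptions rather than recommended values.

```haproxy
backend web_servers_src_sticky
    balance roundrobin
    # Remember which server each client IP address was sent to.
    stick-table type ip size 200k expire 30m
    stick on src
    server s1 10.0.0.10:80 check
    server s2 10.0.0.11:80 check
```

Unlike balance source, a stick table keeps existing clients on their server when the number of servers changes, at the cost of holding per-client state in memory.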
Content-Based Routing
Content-based routing directs traffic to different backend server pools based on specific attributes within the client's request, such as the requested URL path, HTTP host header, or custom HTTP headers. This facilitates microservices architectures or multi-tenant applications.
The frontend section above shows an example: the host_app1 and host_app2 ACLs match the Host header, and use_backend selects the corresponding backend.
High Availability of HAProxy Itself
While HAProxy is designed for high availability of backend services, HAProxy instances themselves can be made highly available using external mechanisms like VRRP (Virtual Router Redundancy Protocol) with tools such as Keepalived. This creates a floating IP address that automatically fails over between primary and secondary HAProxy servers in case of a failure, ensuring continuous load balancing service.
HAProxy vs. Nginx (Brief Comparison)
Both HAProxy and Nginx can function as reverse proxies and load balancers. Their primary design goals and typical deployment patterns differ.
| Feature | HAProxy | Nginx |
|---|---|---|
| Primary Role | Dedicated, high-performance Load Balancer & Proxy | Web Server, Reverse Proxy, Load Balancer, Cache |
| Performance | Extremely high (especially L4/TCP) | High (good all-rounder) |
| Configuration | Purpose-built for load balancing, extensive options | More general-purpose, flexible |
| Caching | Limited (built-in cache for small objects since version 1.8) | Native, powerful HTTP caching |
| Static File Serving | Not its primary focus | Excellent, highly optimized |
| Modularity | Relatively monolithic; extensible through Lua scripting | Highly modular with a rich ecosystem of modules |
| SSL Termination | Yes | Yes |
| WebSockets | Yes | Yes |
HAProxy is often chosen for critical, high-traffic environments where pure load balancing performance and robust health checking are paramount. Nginx is frequently used when a combination of web serving, caching, and reverse proxying is needed, alongside load balancing. It is common to see them deployed together, with HAProxy acting as the primary load balancer and Nginx handling specific application-level proxying or static content.