HAProxy Load Balancing & Proxying with GProxy

HAProxy (High Availability Proxy) is an open-source, high-performance TCP/HTTP load balancer and proxy server that distributes network traffic across multiple backend servers to maximize performance, reliability, and server capacity. It operates at both Layer 4 (TCP) and Layer 7 (HTTP) of the OSI model, enabling precise traffic management, high availability, and efficient resource utilization for applications and services.

HAProxy is renowned for its speed, stability, and ability to handle very high traffic volumes. It is commonly deployed in front of web servers, application servers, and database clusters to ensure even distribution of client requests, prevent server overload, and facilitate seamless maintenance operations.

Load Balancing with HAProxy

Load balancing is the process of distributing network traffic across a group of backend servers, known as a server farm or cluster. HAProxy employs various algorithms to determine which server receives the next request, aiming to optimize resource use, maximize throughput, minimize response time, and avoid individual server overload.

Key aspects of HAProxy load balancing include:

Algorithm Selection: HAProxy offers several algorithms to suit different application needs.
Server Health Checks: Continuous monitoring of backend server availability and responsiveness.
Server Weighting: Prioritizing certain servers to receive more traffic.
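
Server weighting can be sketched as follows. The backend and server names here are hypothetical; weight is a standard server keyword, and a server without it defaults to a weight of 1.

```haproxy
backend weighted_servers
    balance roundrobin
    # big1 receives roughly three of every four requests
    server big1   192.168.1.10:80 check weight 3
    server small1 192.168.1.11:80 check weight 1
```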

Load Balancing Algorithms

HAProxy provides a range of algorithms configured within the backend section:

roundrobin: Distributes requests to each server in turn, in proportion to the server weights. This is the default algorithm.
leastconn: Directs new connections to the server with the fewest active connections. Optimal for long-lived connections.
source: Uses a hash of the source IP address to determine the server. Ensures a client consistently connects to the same server, useful for stateful applications without explicit session persistence.
uri: Hashes the left part of the URL (before the query string) to select a server. Useful for caching proxies.
hdr(<name>): Hashes the value of a specified HTTP header.
random: Randomly picks a server.

backend web_servers
    balance roundrobin
    server web1 192.168.1.10:80 check
    server web2 192.168.1.11:80 check
    server web3 192.168.1.12:80 check
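
The hash-based algorithms follow the same pattern as the roundrobin example above. The sketch below assumes a hypothetical pool of caching proxies; the backend name, addresses, and ports are illustrative. The hash-type consistent line is optional and limits how many requests are remapped when a server is added or removed.

```haproxy
backend cache_servers
    mode http
    # Hash the URI so the same object is always requested from the same cache
    balance uri
    hash-type consistent
    server cache1 192.168.1.30:3128 check
    server cache2 192.168.1.31:3128 check
```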

Proxying with HAProxy

Proxying involves an intermediary server that acts on behalf of a client or server. HAProxy functions as a reverse proxy, accepting connections from clients and forwarding them to backend servers, then returning the servers' responses to the clients. This abstraction layer provides security, performance, and operational benefits.

Layer 4 (TCP) Proxying

At Layer 4, HAProxy forwards raw TCP connections without inspecting the application-layer content. This is suitable for non-HTTP services, databases, or custom protocols where content inspection is not required or desired.

listen mysql_cluster
    bind *:3306
    mode tcp
    balance leastconn
    server db1 192.168.1.20:3306 check
    server db2 192.168.1.21:3306 check

Layer 7 (HTTP) Proxying

At Layer 7, HAProxy can inspect and manipulate HTTP request and response headers, URLs, and cookies. This enables advanced features like content-based routing, SSL termination, URL rewriting, and session persistence.

frontend http_frontend
    bind *:80
    mode http
    default_backend web_servers
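
As one illustration of header manipulation, a variant of the frontend above might look like this. The header names and values are assumptions chosen for the example, not requirements of HAProxy.

```haproxy
frontend http_frontend
    bind *:80
    mode http
    # Pass the client address to the backend in an X-Forwarded-For header
    option forwardfor
    # Tell the backend which scheme the client used
    http-request set-header X-Forwarded-Proto http
    # Add a header to every response returned to the client
    http-response set-header X-Content-Type-Options nosniff
    default_backend web_servers
```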

HAProxy Configuration Components

HAProxy configuration typically resides in /etc/haproxy/haproxy.cfg and is structured into several sections.

`global` section

The global section defines process-wide parameters, such as logging, security settings, and performance limits. These settings apply to the entire HAProxy instance.

global
    log /dev/log    local0 info
    maxconn 20000
    chroot /var/lib/haproxy
    pidfile /var/run/haproxy.pid
    user haproxy
    group haproxy
    daemon
    stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners

`defaults` section

The defaults section specifies default parameters for all listen, frontend, and backend sections that follow it. This reduces configuration redundancy.

defaults
    mode http
    log global
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms
    option httplog
    option dontlognull
    option http-server-close

`frontend` section

A frontend defines the public-facing entry point where HAProxy listens for client connections. It specifies the IP address, port, protocol (mode), and rules for routing requests to specific backends using Access Control Lists (ACLs).

frontend http_in
    bind *:80
    mode http
    acl host_app1 hdr(host) -i app1.example.com
    acl host_app2 hdr(host) -i app2.example.com

    use_backend app1_servers if host_app1
    use_backend app2_servers if host_app2
    default_backend default_web_servers

`backend` section

A backend defines a group of servers that HAProxy can forward requests to. It includes the load balancing algorithm, health check parameters, and the individual server definitions.

backend app1_servers
    balance leastconn
    option httpchk GET /healthz
    server s1 10.0.0.10:8080 check inter 2000 fall 3 rise 2
    server s2 10.0.0.11:8080 check inter 2000 fall 3 rise 2 backup

check: Enables health checks for the server.
inter 2000: Checks every 2000ms.
fall 3: Marks the server down after 3 consecutive failed checks.
rise 2: Marks the server up after 2 consecutive successful checks.
backup: Server will only be used when all other non-backup servers are down.
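
When every server shares the same check parameters, they can be stated once with default-server. A minimal sketch of the backend above rewritten that way, assuming the same example addresses:

```haproxy
backend app1_servers
    balance leastconn
    option httpchk GET /healthz
    # Applies to every server line below unless a server overrides it
    default-server inter 2000 fall 3 rise 2
    server s1 10.0.0.10:8080 check
    server s2 10.0.0.11:8080 check backup
```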

`listen` section

A listen section combines the functionalities of a frontend and a backend into a single block. This is often used for simpler configurations or for services like HAProxy's statistics page.

listen stats_page
    bind *:8080
    mode http
    stats enable
    stats uri /haproxy?stats
    stats realm HAProxy\ Statistics
    stats auth admin:securepassword
    stats refresh 5s

Access Control Lists (ACLs)

ACLs are powerful conditional rules used to match specific criteria in client requests (e.g., source IP, host header, URL path, HTTP method). They enable dynamic routing, content switching, and blocking.

frontend website_frontend
    bind *:443 ssl crt /etc/haproxy/certs/mydomain.pem
    mode http

    # ACLs based on path
    acl is_admin_area path_beg /admin
    acl is_api_v1   path_beg /api/v1

    # Route based on ACLs
    use_backend admin_backend if is_admin_area
    use_backend api_v1_backend if is_api_v1
    default_backend main_website_backend
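
ACLs can also deny requests rather than route them. The sketch below restricts the admin area to an internal network; the 10.0.0.0/8 range and the ACL name from_internal are assumptions for illustration.

```haproxy
frontend website_frontend
    bind *:443 ssl crt /etc/haproxy/certs/mydomain.pem
    mode http

    acl is_admin_area path_beg /admin
    acl from_internal src 10.0.0.0/8

    # Reject admin requests from outside the internal network (403 by default)
    http-request deny if is_admin_area !from_internal

    use_backend admin_backend if is_admin_area
    default_backend main_website_backend
```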

Health Checks

HAProxy continually monitors the health of backend servers to ensure requests are only sent to operational instances. If a server fails health checks, HAProxy temporarily removes it from the rotation until it recovers.

TCP Check (check): Basic port connectivity.
HTTP Check (option httpchk): Sends an HTTP request (e.g., GET /health) and expects a valid HTTP status code (2xx or 3xx).
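SSL Hello Check (option ssl-hello-chk): Sends an SSLv3 client hello and checks that the server answers with a valid server hello. It confirms that the server speaks SSL but does not complete a full handshake.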
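
A minimal sketch of an HTTP check that requires one specific status code instead of accepting any 2xx or 3xx response. The backend name, the /health path, and the addresses are assumptions.

```haproxy
backend checked_servers
    mode http
    option httpchk GET /health
    # Only a 200 response counts as healthy
    http-check expect status 200
    server s1 10.0.0.10:8080 check
    server s2 10.0.0.11:8080 check
```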

Advanced HAProxy Features

SSL Termination and Offloading

HAProxy can handle SSL/TLS encryption and decryption, offloading this CPU-intensive task from backend servers. It decrypts incoming HTTPS traffic and forwards plain HTTP to the backend, or can re-encrypt for end-to-end SSL.

frontend https_in
    bind *:443 ssl crt /etc/haproxy/certs/mydomain.pem
    mode http
    default_backend web_servers
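
Two common companions to the frontend above are sketched here: redirecting plain HTTP to HTTPS, and re-encrypting traffic to the backend for end-to-end SSL. The names http_redirect and web_servers_tls and the CA file path are hypothetical.

```haproxy
frontend http_redirect
    bind *:80
    mode http
    # Send every plain-HTTP request to its HTTPS equivalent
    http-request redirect scheme https code 301

backend web_servers_tls
    mode http
    balance roundrobin
    # 'ssl' encrypts the connection to the server; 'verify required'
    # validates the server certificate against the given CA file
    server web1 192.168.1.10:443 check ssl verify required ca-file /etc/haproxy/certs/backend-ca.pem
```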

Sticky Sessions (Persistence)

Sticky sessions ensure that a client's requests are consistently routed to the same backend server throughout their session. This is critical for applications that maintain session state on individual servers.

Cookie-based persistence: HAProxy adds a cookie to the response sent to the client. The browser returns that cookie on subsequent requests, and HAProxy uses it to identify the correct backend server.
Source IP persistence: Uses the client's source IP address to consistently route to the same server (less reliable behind NAT).

backend web_servers_sticky
    balance roundrobin
    cookie SERVERID insert indirect nocache
    server s1 10.0.0.10:80 check cookie s1
    server s2 10.0.0.11:80 check cookie s2
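
Source IP persistence can be sketched with a stick table, as an alternative to the cookie approach above. The backend name, table size, and expiry are assumptions to adjust per deployment.

```haproxy
backend web_servers_source_sticky
    balance roundrobin
    # Remember which server each client IP was assigned to
    stick-table type ip size 200k expire 30m
    stick on src
    server s1 10.0.0.10:80 check
    server s2 10.0.0.11:80 check
```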

Content-Based Routing

Content-based routing directs traffic to different backend server pools based on specific attributes within the client's request, such as the requested URL path, HTTP host header, or custom HTTP headers. This facilitates microservices architectures or multi-tenant applications.

The frontend section above shows an example: the host_app1 and host_app2 ACLs match the Host header, and use_backend selects the corresponding server pool.
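
Routing on a custom header works the same way. In this sketch the X-Api-Version header and the backend names are hypothetical.

```haproxy
frontend api_frontend
    bind *:80
    mode http
    # Match requests whose X-Api-Version header is exactly "2"
    acl wants_v2 req.hdr(X-Api-Version) -m str 2
    use_backend api_v2_backend if wants_v2
    default_backend api_v1_backend
```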

High Availability of HAProxy Itself

While HAProxy is designed for high availability of backend services, HAProxy instances themselves can be made highly available using external mechanisms like VRRP (Virtual Router Redundancy Protocol) with tools such as Keepalived. This creates a floating IP address that automatically fails over between primary and secondary HAProxy servers in case of a failure, ensuring continuous load balancing service.

HAProxy vs. Nginx (Brief Comparison)

Both HAProxy and Nginx can function as reverse proxies and load balancers. Their primary design goals and typical deployment patterns differ.

Feature	HAProxy	Nginx
Primary Role	Dedicated, high-performance Load Balancer & Proxy	Web Server, Reverse Proxy, Load Balancer, Cache
Performance	Extremely high (especially L4/TCP)	High (good all-rounder)
Configuration	Purpose-built for load balancing, extensive options	More general-purpose, flexible
Caching	Limited (built-in cache for small objects since version 1.8)	Native, powerful HTTP caching
Static File Serving	Not its primary focus	Excellent, highly optimized
Modularity	Relatively monolithic	Highly modular with a rich ecosystem of modules
SSL Termination	Yes	Yes
WebSockets	Yes	Yes

HAProxy is often chosen for critical, high-traffic environments where pure load balancing performance and robust health checking are paramount. Nginx is frequently used when a combination of web serving, caching, and reverse proxying is needed, alongside load balancing. It is common to see them deployed together, with HAProxy acting as the primary load balancer and Nginx handling specific application-level proxying or static content.
