Load Balancing with NGINX and NGINX Plus, Part 1

NGINX is a capable accelerating proxy for a wide range of HTTP‑based applications. Its caching, HTTP connection processing, and offload significantly increase application performance, particularly during periods of high load.

Editor – NGINX Plus Release 5 and later can also load balance TCP-based applications. TCP load balancing was significantly extended in Release 6 by the addition of health checks, dynamic reconfiguration, SSL termination, and more. In NGINX Plus Release 7 and later, the TCP load balancer has full feature parity with the HTTP load balancer. Support for UDP load balancing was introduced in Release 9.

You configure TCP and UDP load balancing in the stream context instead of the http context. The available directives and parameters differ somewhat because of inherent differences between HTTP and TCP/UDP; for details, see the documentation for the HTTP and TCP Upstream modules.

NGINX Plus extends the capabilities of NGINX by adding further load-balancing capabilities: health checks, session persistence, live activity monitoring, and dynamic reconfiguration of load-balanced server groups.

This blog post steps you through the configuring NGINX to load balance traffic to a set of web servers. It highlights some of the additional features in NGINX Plus.

For further reading, you can also take a look at the overview of load balancing on nginx.org, and the follow‑up article to this one, Load Balancing with NGINX and NGINX Plus, Part 2.

Proxying Traffic with NGINX

We’ll begin by proxying traffic to a pair of upstream web servers. The following NGINX configuration is sufficient to terminate all HTTP requests to port 80, and forward them in a round‑robin fashion across the web servers in the upstream group:

http {
server {
listen 80;

location / {
proxy_pass http://backend;

upstream backend {
server web-server1:80;
server web-server2:80;

With this simple configuration, NGINX forwards each request received on port 80 to web‑server1 and web‑server2 in turn, establishing a new HTTP connection in each case.

Setting the Load‑Balancing Method

By default, NGINX uses the Round Robin method to spread traffic evenly between servers, informed by an optional “weight” assigned to each server to indicate its relative capacity.

The IP Hash method distributes traffic based on a hash of the source IP address. Requests from the same client IP address are always sent to the same upstream server. This is a crude session persistence method that is recalculated whenever a server fails or recovers, or whenever the upstream group is modified; NGINX Plus offers better solutions if session persistence is required.

The Least Connections method routes each request to the upstream server with the fewest active connections. This method works well when handling a mixture of quick and complex requests.

All load‑balancing methods can be tuned using an optional weight parameter on the server directive. This makes sense when servers have different processing capacity. In the following example, NGINX directs four times as many requests to web‑server2 than to web‑server1:

upstream backend {
zone backend 64k;

server web-server1 weight=1;
server web-server2 weight=4;

In NGINX, weights are managed independently by each worker process. NGINX Plus uses a shared memory segment for upstream data (configured with the zone directive), so weights are shared between workers and traffic is distributed more accurately.

Failure Detection

If there is an error or timeout when NGINX tries to connect with a server, pass a request to it, or read the response header, NGINX retries the connection request with another server. (You can include the proxy_next_upstream directive in the configuration to define other conditions for retrying the request.) In addition, NGINX can take the failed server out of the set of potential servers and occasionally try requests against it to detect when it recovers. The max_fails and fail_timeout parameters to the server directive control this behavior.

NGINX Plus adds a set of out‑of‑band health checks that perform sophisticated HTTP tests against each upstream server to determine whether it is active, and a slow‑start mechanism to gradually reintroduce recovered servers back into the load‑balanced group:

server web-server1 slow_start=30s;

A Common ‘Gotcha’ – Fixing the Host Header

Quite commonly, an upstream server uses the Host header in the request to determine which content to serve. If you get unexpected 404 errors from the server, or anything else that suggests it is serving the wrong content, this is the first thing to check. Then include the proxy_set_header directive in the configuration to set the appropriate value:

location / {
proxy_pass http://backend;

    # Rewrite the 'Host' header to the value in the client request
    # or primary server name
    proxy_set_header Host $host;

# Alternatively, put the value in the config:
#proxy_set_header Host www.example.com;

Advanced Load Balancing using NGINX Plus

A range of advanced features in NGINX Plus make it an ideal load balancer in front of farms of upstream servers:

For details about advanced load balancing and proxying, see the follow‑up post to this one, Load Balancing with NGINX and NGINX Plus, Part 2.

To try NGINX Plus, start your free 30‑day trial today or contact us for a live demo.

Infrastructure as Code
Get the latest on Microservices Design & Deployment