
High-performance load balancing

Scale out your applications with NGINX and NGINX Plus


Load balancing is the process of distributing a workload evenly across multiple servers. In the case of a web application, HTTP requests are load balanced across a pool of application servers. There are two main benefits to load balancing. One is to scale out and handle more users than you can with a single server. The second is redundancy – if one server fails, others are available to ensure the application stays online.


Both the open source NGINX software and NGINX Plus can load balance HTTP, TCP, and UDP traffic. NGINX Plus extends open source NGINX with enterprise‑grade load balancing that includes session persistence, active health checks, on‑the‑fly reconfiguration of load‑balanced server groups without a server restart, and additional metrics.

HTTP Load Balancing

When load balancing HTTP traffic, NGINX Plus terminates each HTTP connection and processes each request individually. You can strip out SSL encryption, inspect and manipulate the request, queue the request using rate limits, and then select the load‑balancing policy.

To improve performance, NGINX Plus can automatically apply a wide range of optimizations to an HTTP transaction, including HTTP upgrades, keepalive optimization, and transformations such as content compression and response caching.
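As one example of these optimizations, keepalive connections to upstream servers can be enabled with the keepalive directive. The sketch below uses placeholder hostnames:

```nginx
http {
    upstream my_upstream {
        server server1.example.com;

        # Keep up to 32 idle connections to upstream servers open for reuse
        keepalive 32;
    }

    server {
        listen 80;
        location / {
            # Keepalive to upstreams requires HTTP/1.1 and a cleared
            # Connection header
            proxy_http_version 1.1;
            proxy_set_header Connection "";
            proxy_pass http://my_upstream;
        }
    }
}
```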

Load balancing HTTP traffic is easy with NGINX Plus:

http {
    upstream my_upstream {
        server server1.example.com;
        server server2.example.com;
    }

    server {
        listen 80;
        location / {
            proxy_set_header Host $host;
            proxy_pass http://my_upstream;
        }
    }
}

First specify a virtual server with the server directive, and the port to listen on for traffic with the listen directive. Then match the URLs of client requests with a location block, set the Host header with the proxy_set_header directive, and include the proxy_pass directive to forward requests to an upstream group. (The upstream block defines the servers across which NGINX Plus load balances traffic.)

For more information, check out this introduction to load balancing with NGINX and NGINX Plus.

TCP and UDP Load Balancing

NGINX Plus can also load balance TCP applications such as MySQL, and UDP applications such as DNS and RADIUS. For TCP applications, NGINX Plus terminates the TCP connections and creates new connections to the backend.

stream {
    upstream my_upstream {
        server server1.example.com:1234;
        server server2.example.com:2345;
    }

    server {
        listen 1123;  # add the 'udp' parameter to load balance UDP traffic
        proxy_pass my_upstream;
    }
}

As with HTTP load balancing, you specify a virtual server using the server directive, and then listen for traffic on a port. You then proxy the request to an upstream group, which defines the servers across which NGINX Plus load balances traffic.

For more information, see the documentation on TCP and UDP load balancing with NGINX and NGINX Plus.

Load‑Balancing Methods

NGINX supports a number of application load‑balancing methods for HTTP, TCP, and UDP load balancing:

  • Round‑Robin (the default) – Requests are distributed in order across the list of servers.
  • Least Connections – Each request is sent to the server with the lowest number of active connections, taking into consideration the weights assigned to the servers.
  • Hash – Requests are distributed based on a user‑defined key such as the client IP address or URL. NGINX Plus can optionally apply a consistent hash to minimize redistribution of loads if the set of upstream servers changes.
  • IP Hash (HTTP only) – Requests are distributed based on the first three octets of the client IPv4 address (or the entire IPv6 address).

NGINX Plus has an additional algorithm:

  • Least Time – Requests are sent to upstream servers with the fastest response times and fewest active connections.
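Selecting a method is a one-line change inside the upstream block. A sketch with placeholder hostnames (when no method directive is present, Round Robin applies):

```nginx
upstream my_upstream {
    # Pick at most one method; omit all of them for Round-Robin
    least_conn;                      # Least Connections
    # hash $request_uri consistent;  # Hash on the request URI, consistent variant
    # ip_hash;                       # IP Hash (HTTP only)
    # least_time header;             # Least Time (NGINX Plus only)

    server server1.example.com weight=2;
    server server2.example.com;
}
```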

Connection Limiting with NGINX Plus

You can limit the number of connections NGINX Plus sends to upstream HTTP or TCP servers (for UDP, the number of sessions is limited). With connection limiting, NGINX Plus stops sending new connections or sessions to upstream servers when the number of concurrent connections (or sessions for UDP) on a server exceeds a defined limit.

In the configuration snippet below, the connection limit for webserver1 is 250 and for webserver2 is 150. When you include the queue directive, NGINX Plus places requests in excess of the limits in a queue. NGINX Plus then distributes the queued connections to servers as the number of active connections falls below the limit for each one.

upstream backend {
    zone backends 64k;
    queue 750 timeout=30s;

    server webserver1 max_conns=250;
    server webserver2 max_conns=150;
}

In this example, up to 750 requests can be queued for up to 30 seconds each. Limiting connections helps to ensure consistent, predictable servicing of client requests – even in the face of large traffic spikes – with fair distribution across users.

Session Persistence

NGINX Plus can identify user sessions and send all requests in a client session to the same upstream server. This can avoid fatal errors that might otherwise result when app servers store state locally and a load balancer sends an in‑progress user session to a different server. Session persistence can also improve performance when applications share information across a cluster.

There are various methods for implementing session persistence, one of which is sticky cookie. In this method, NGINX Plus adds a session cookie to the first response from the upstream group to a given client, identifying (in an encoded fashion) the server that generated the response. Subsequent requests from the client include the cookie value, and NGINX Plus uses it to route the request to the same upstream server:

upstream backend {
    server webserver1;
    server webserver2;

    sticky cookie srv_id expires=1h domain=.example.com path=/;
}

In this example, NGINX Plus inserts a cookie called srv_id in the initial client response, identifying the server the request was sent to. When a subsequent request contains the cookie, NGINX Plus forwards it to the same server.

NGINX Plus also supports sticky learn and sticky route persistence methods.
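With sticky learn, for example, NGINX Plus learns the session identifier from a cookie that the application itself sets; the cookie name EXAMPLECOOKIE below is a placeholder:

```nginx
upstream backend {
    server webserver1;
    server webserver2;

    # Learn sessions from the application's own cookie and store the
    # client-to-server mappings in a shared memory zone
    sticky learn
        create=$upstream_cookie_examplecookie
        lookup=$cookie_examplecookie
        zone=client_sessions:1m;
}
```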

Note: Session persistence is exclusive to NGINX Plus.

Active Health Checks

The open source NGINX software performs basic checks on responses from upstream servers, retrying failed requests where possible. NGINX Plus adds out‑of‑band application health checks (also known as synthetic transactions) and a slow‑start feature to gracefully add new and recovered servers into the load‑balanced group.

These features enable NGINX Plus to detect and work around a much wider variety of problems, significantly improving the reliability of your HTTP and TCP/UDP applications.

upstream my_upstream {
    zone my_upstream 64k;
    server server1.example.com slow_start=30s;
}

server {
    # ...
    location /health {
        internal;
        health_check interval=5s uri=/test.php match=statusok;
        proxy_set_header Host www.example.com;
        proxy_pass http://my_upstream;
    }
}

match statusok {
    # Used for /test.php health check
    status 200;
    header Content-Type = text/html;
    body ~ "Server[0-9]+ is alive";
}

In the example, NGINX Plus sends a request for /test.php every five seconds. The match block defines the conditions the response must meet for the upstream server to be considered healthy – in this case, a status code of 200 OK, a Content-Type header of text/html, and a response body that contains the text ServerN is alive.

Note: Active health checks are exclusive to NGINX Plus.

Service Discovery Using DNS

Normally, NGINX Plus servers resolve DNS names when they start up, and cache these resolved values persistently. When you use a domain name (such as example.com) to identify a group of upstream servers in the server directive and include the resolve parameter, NGINX Plus periodically re‑resolves the name in DNS. If the associated list of IP addresses has changed, NGINX Plus immediately starts load balancing across the updated group of servers.
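A minimal sketch of periodic re-resolution with the resolve parameter (the resolver address and domain name are placeholders):

```nginx
resolver 10.0.0.2 valid=30s;

upstream backend {
    zone backend 64k;  # shared memory zone, required for resolve
    server backend.example.com resolve;
}
```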

To let NGINX Plus know to use DNS SRV records, which contain the port number, you must include the resolver directive in the configuration, along with the service=http parameter to the server directive, as shown:

resolver 127.0.0.11 valid=10s;

upstream service1 {
    zone service1 64k;
    server service1 service=http resolve;
}

In the example, NGINX Plus queries 127.0.0.11 (the built‑in Docker DNS server) every 10 seconds to re‑resolve service1.

Note: Service discovery using DNS is exclusive to NGINX Plus.

NGINX Plus API

NGINX Plus offers an API that closely follows the REST architectural style and is unified under a single API endpoint. Use the NGINX Plus API to update upstream configurations and key-value stores on the fly with zero downtime. The following basic configuration snippet includes the api directive to enable the API endpoint at /api.

upstream backend {
    zone backends 64k;
    server 10.10.10.2:220 max_conns=250;
    server 10.10.10.4:220 max_conns=150;
}

server {
    listen 80;
    server_name www.example.org;

    location /api {
        api write=on;
    }
}

With the API enabled, you can run the following curl command, using the POST method to add a new server with IP address 192.168.78.66 to the existing upstream group called backend, using JSON encoding to set its weight to 200 and the maximum number of simultaneous connections to 150.

$ curl -iX POST -d '{"server":"192.168.78.66:80","weight":"200","max_conns":"150"}' http://localhost:80/api/1/http/upstreams/backend/servers/

When modifying the configuration of an existing server in an upstream group, you can identify it by its internal ID. The following command uses the PATCH method to reconfigure the server with ID 4 in the backend group to have IP address 192.168.78.55 and listen on port 80, with a weight of 500 and a connection limit of 350.

$ curl -iX PATCH -d '{"server":"192.168.78.55:80","weight":"500","max_conns":"350"}' http://localhost:80/api/1/http/upstreams/backend/servers/4

To display the complete configuration of all servers in an upstream group in JSON format, run this command:

$ curl -s http://localhost:80/api/1/http/upstreams/backend/servers/ | python -m json.tool
[
    {
        "backup": false,
        "down": false,
        "fail_timeout": "10s",
        "id": 0,
        "max_conns": 250,
        "max_fails": 1,
        "route": "",
        "server": "10.10.10.2:220",
        "slow_start": "0s",
        "weight": 1
    },
    {
        "backup": false,
        "down": false,
        "fail_timeout": "10s",
        "id": 1,
        "max_conns": 150,
        "max_fails": 1,
        "route": "",
        "server": "10.10.10.4:220",
        "slow_start": "0s",
        "weight": 1
    },
    {
        "backup": false,
        "down": false,
        "fail_timeout": "10s",
        "id": 2,
        "max_conns": 200,
        "max_fails": 1,
        "route": "",
        "server": "192.168.78.66:80",
        "slow_start": "0s",
        "weight": 200
    }
]

You can access the full Swagger documentation of the NGINX Plus API at https://demo.nginx.com/swagger-ui/.

Note: The NGINX Plus API is exclusive to NGINX Plus.

TRY NGINX PLUS!

Download a 30-day free trial and see what you've been missing.
