NGINX Plus extends the reverse proxy capabilities of the open source NGINX software with an additional application load‑balancing method, enhancements for multicore servers, and features such as session persistence, health checks, live activity monitoring, and on‑the‑fly reconfiguration of load‑balanced server groups.

NGINX Plus as an Application Load Balancer

Distribute traffic across groups of upstream servers when using NGINX Plus as an application load balancer.

NGINX Plus supports the same load‑balancing methods as the open source NGINX product (Round‑Robin, Least Connections, Generic Hash, and IP Hash), and adds the Least Time method. All load‑balancing methods are extended to operate more efficiently on multicore servers; the NGINX Plus worker processes share information about load balancing state so that traffic distribution and weights can be more accurately applied.
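The shared load‑balancing state mentioned above is enabled by placing a zone directive in the upstream block, which allocates a shared‑memory zone visible to all worker processes. A minimal sketch (the group name, hostnames, and zone size are illustrative):

```nginx
# Hypothetical upstream group; the "zone" directive allocates shared memory
# so that all worker processes share load-balancing state and counters.
upstream app_servers {
    zone app_servers 64k;        # name and size of the shared-memory zone

    server app1.example.com weight=3;
    server app2.example.com;
}
```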


HTTP, TCP, and UDP Load Balancing with NGINX Plus

NGINX Plus load balances a broad range of HTTP, TCP, and UDP applications.

NGINX Plus can optimize and load balance HTTP connections, TCP connections (to support high availability for applications such as MySQL, LDAP, and chat), and UDP traffic (for applications such as DNS, RADIUS, and syslog). You define settings for HTTP load balancing in the http configuration context, and settings for both TCP and UDP load balancing in the stream configuration context.
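The split between the two contexts can be sketched as follows; the hostnames, ports, and upstream names are illustrative:

```nginx
# HTTP load balancing lives in the http context.
http {
    upstream web_backend {
        server web1.example.com;
        server web2.example.com;
    }
    server {
        listen 80;
        location / {
            proxy_pass http://web_backend;
        }
    }
}

# TCP and UDP load balancing live in the stream context.
stream {
    # TCP: MySQL
    upstream mysql_backend {
        server db1.example.com:3306;
        server db2.example.com:3306;
    }
    server {
        listen 3306;
        proxy_pass mysql_backend;
    }

    # UDP: DNS (note the "udp" parameter on listen)
    upstream dns_backend {
        server ns1.example.com:53;
        server ns2.example.com:53;
    }
    server {
        listen 53 udp;
        proxy_pass dns_backend;
    }
}
```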

All application load‑balancing methods include inline and synthetic health checks. They provide performance data to NGINX Plus’ live activity monitoring module and can be controlled using on‑the‑fly reconfiguration of load‑balanced server groups.

HTTP Load Balancing

When load balancing HTTP traffic, NGINX Plus terminates each HTTP connection and processes each request individually. You can strip SSL encryption, inspect and manipulate the request, queue the request using rate limits, and then select the load‑balancing policy. To improve performance, NGINX Plus can automatically apply a wide range of optimizations to an HTTP transaction, including HTTP upgrades, Keep‑Alive optimization, and transformations such as content compression and response caching. For more information, check out this introduction to load balancing with NGINX and NGINX Plus.

NGINX Plus’ session persistence methods can be used to track user sessions based on data in the HTTP requests, pinning subsequent traffic from a client to the appropriate upstream server.
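One of these methods is cookie‑based persistence, configured with the sticky cookie directive in the upstream block. A sketch, with a hypothetical cookie name and hostnames:

```nginx
upstream app_backend {
    server app1.example.com;
    server app2.example.com;

    # NGINX Plus inserts the "srv_id" cookie in the first response;
    # later requests carrying it are pinned to the same upstream server.
    sticky cookie srv_id expires=1h domain=.example.com path=/;
}
```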

For more information, check out NGINX Load Balancing – HTTP Load Balancer in the NGINX Plus Admin Guide.

TCP and UDP Load Balancing

NGINX Plus terminates TCP connections, makes a load‑balancing decision, and then establishes a connection to the upstream server, relaying data between the client and server on demand. You can strip or add SSL encryption, queue connections, apply rate limits to new connections, and apply bandwidth limits to incoming and outgoing data. NGINX Plus delivers high availability using inline and synthetic health checks, slow‑start for recovered servers, concurrency control, and the ability to designate servers as active, backup, or down.

NGINX Plus receives UDP requests, makes a load‑balancing decision, and then forwards the requests to the upstream server, relaying data between the client and server on demand. You can strip or add SSL encryption, and apply rate and bandwidth limits to incoming and outgoing data. NGINX Plus delivers high availability using inline and synthetic health checks, slow‑start for recovered servers, concurrency control, and the ability to designate servers as active, backup, or down.
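Several of these high‑availability features map directly to configuration parameters in the stream context; a sketch with illustrative hostnames and values:

```nginx
stream {
    upstream backend_db {
        server db1.example.com:3306 slow_start=30s;  # ramp traffic up after recovery
        server db2.example.com:3306 max_conns=100;   # concurrency control
        server db3.example.com:3306 backup;          # used only when others fail
    }

    server {
        listen 3306;
        proxy_pass backend_db;
        health_check interval=10s;   # synthetic (active) health checks
    }
}
```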

For more information, check out NGINX Load Balancing – TCP and UDP Load Balancer in the NGINX Plus Admin Guide.

NGINX Plus Load‑Balancing Methods

Different applications and services perform best with different application load‑balancing methods, and NGINX Plus gives you the choice and control.

NGINX Plus supports a number of application load‑balancing methods for HTTP, TCP, and UDP load balancing:

  • Round‑Robin (the default) – Requests are distributed in order across the list of servers.
  • Least Connections – Each request is sent to the server with the lowest number of active connections, taking into consideration the weights assigned to the servers.
  • Least Time – Requests are sent to upstream servers with the fastest response times and fewest active connections.
  • Generic Hash – Requests are distributed based on a user‑defined key such as the URL. NGINX Plus can optionally apply a consistent hash to minimize redistribution of load.
  • IP Hash – A special case of the Generic Hash method, hardwired to use the client IP address.
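Each method is selected with a directive in the upstream block; Round‑Robin applies when no directive is present. A sketch showing one illustrative upstream group per method (only one method can apply to a given group):

```nginx
upstream least_conn_group {
    least_conn;                     # fewest active connections, weighted
    server app1.example.com;
    server app2.example.com;
}

upstream least_time_group {
    least_time header;              # NGINX Plus only; "header" or "last_byte"
    server app1.example.com;
    server app2.example.com;
}

upstream hash_group {
    hash $request_uri consistent;   # Generic Hash on the URL, consistent variant
    server app1.example.com;
    server app2.example.com;
}

upstream ip_hash_group {
    ip_hash;                        # hash hardwired to the client IP address
    server app1.example.com;
    server app2.example.com;
}
```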

Connection Limiting with NGINX Plus

You can limit the number of connections NGINX Plus sends to upstream HTTP or TCP (stream) servers, to prevent them from being overwhelmed by concurrent connections during periods of high load.

The optimizations and traffic acceleration techniques described in HTTP Load Balancing above significantly reduce the number of HTTP connections between NGINX Plus and upstream servers compared to the number of connection requests from clients to NGINX Plus. Even this reduced number, however, can be too many connections for some upstream servers and applications to handle.

In particular, servers that are thread‑ or process‑based typically have a hard limit on the number of concurrent connections they can manage without becoming overloaded. When the limit is reached, additional requests are placed in the operating system’s listen queue. There is no guarantee of prompt servicing, and existing concurrency slots can be occupied indefinitely by client keepalive connections or idle TCP connections. If resources such as memory and file descriptors become exhausted, the server can become unable to process any requests, a state from which it might never recover.

To avoid overwhelming upstream servers, you can include the max_conns parameter on server directives in HTTP, TCP, and UDP upstream contexts. When the number of concurrent connections (or sessions for UDP) on the server exceeds the defined limit, NGINX Plus stops sending new connections/sessions to it. In the following example, the limit is 250 connections for webserver1 and 150 connections for webserver2 (presumably because it has less capacity than webserver1):

upstream backend {
    zone backends 64k;
    queue 750 timeout=30s;

    server webserver1 max_conns=250;
    server webserver2 max_conns=150;
}

When the number of existing connections exceeds the max_conns limit on every server, NGINX Plus queues new connections, distributing them to the servers as the number of connections falls below the limit on each one. You can define the maximum number of queued requests with the queue directive as shown here. Its optional timeout parameter defines how long a request remains in the queue before NGINX Plus discards it and returns an error to the client; the default is 60 seconds. In the example, up to 750 requests can be queued for up to 30 seconds each.

You can include the max_conns parameter when the name in the server directive (such as webserver1 in the example) is a domain name that resolves to a list of server IP addresses in the Domain Name System (DNS). In this case, the value of max_conns applies to all the servers. For more information, see Configuring Load Balancing using DNS.
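A sketch of max_conns combined with a DNS‑resolved server list; the hostname, DNS server address, and limit are illustrative. The resolve parameter, which re‑resolves the name at the interval set by the resolver directive, requires a shared‑memory zone:

```nginx
resolver 10.0.0.2 valid=30s;    # DNS server used to re-resolve the hostname

upstream backend {
    zone backend 64k;           # shared-memory zone is required for "resolve"

    # max_conns=250 applies to every IP address the name resolves to.
    server backends.example.com max_conns=250 resolve;
}
```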

Limiting connections helps to ensure consistent, predictable servicing of client requests even in the face of large traffic spikes, with fair request distribution for all users.