This article describes how to configure and use HTTP health checks in NGINX Plus and NGINX Open Source.

Overview

NGINX can continually test your upstream servers, avoid the servers that have failed, and gracefully add the recovered servers into the load-balanced group.

Prerequisites

Passive Health Checks

With passive health checks, NGINX and NGINX Plus monitor transactions as they happen and try to resume any transaction that fails. If the transaction still cannot be resumed, NGINX marks the upstream server as unavailable and temporarily stops sending requests to it until it is considered active again.

The conditions under which an upstream server is considered unavailable are set for each server with parameters of the server directive in the upstream block:

  • fail_timeout – Sets the time during which a number of failed attempts must occur for the server to be considered unavailable, and also the length of time the server is then considered unavailable (default is 10 seconds).
  • max_fails – Sets the number of failed attempts that must occur during the fail_timeout period for the server to be considered unavailable (default is 1 attempt).

In the following example, if NGINX fails to send a request to backend2.example.com, or does not receive a response from it, 3 times within 30 seconds, it considers the server unavailable for the next 30 seconds:

upstream backend {
    server backend1.example.com;
    server backend2.example.com max_fails=3 fail_timeout=30s;
}
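
For proxied HTTP traffic, what counts as a failed attempt is governed by the proxy_next_upstream directive: errors and timeouts always count, while responses such as 502, 503, and 504 count only when they are listed. A minimal sketch, assuming a location that proxies to the backend group and an illustrative choice of status codes:

upstream backend {
    server backend1.example.com max_fails=3 fail_timeout=30s;
    server backend2.example.com max_fails=3 fail_timeout=30s;
}

server {
    location / {
        proxy_pass http://backend;
        # Assumption: besides errors and timeouts, also treat 502, 503, and 504
        # responses as failed attempts and pass the request to the next server
        proxy_next_upstream error timeout http_502 http_503 http_504;
    }
}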

Server Slow Start

A recently recovered server can easily be overwhelmed by connections, which may cause it to be marked as “failed” again. Slow start allows an upstream server to gradually recover its weight from zero to its nominal value after it has recovered or become available. This is done with the slow_start parameter of the upstream server directive:

upstream backend {
    server backend1.example.com slow_start=30s;
    server backend2.example.com;
    server 192.0.0.1 backup;
}

The time value (here, 30 seconds) sets the period over which the server recovers its weight.

Note that if there is only a single server in a group, the max_fails, fail_timeout, and slow_start parameters are ignored and the server is never considered unavailable.
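
Passive checks and slow start can be set on the same server. The following is a rough sketch with illustrative values (assumptions, not recommendations); the group has two servers so that the parameters are not ignored:

upstream backend {
    # Considered unavailable after 3 failed attempts within 30 seconds;
    # once available again, weight ramps up from zero over 30 seconds
    server backend1.example.com max_fails=3 fail_timeout=30s slow_start=30s;
    server backend2.example.com max_fails=3 fail_timeout=30s slow_start=30s;
}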

Active Health Checks

NGINX Plus can periodically check the health of upstream servers by sending special health check requests to each server and verifying the response.

To enable active health checks:

  1. In the location that passes requests to the upstream group (proxy_pass), specify the health_check directive:

    server {
        location / {
            proxy_pass http://backend;
            health_check;
        }
    }

  2. Specify a shared memory zone for the upstream server group with the zone directive:

    http {
        upstream backend {
            zone backend 64k;
    
            server backend1.example.com;
            server backend2.example.com;
            server backend3.example.com;
            server backend4.example.com;
        }
    }

    This configuration defines an upstream group named backend and a virtual server with a single location that passes all requests to that group. It also enables active health checks with the default parameters: every 5 seconds, NGINX Plus sends a request for “/” to each server in the backend group. If a communication error or timeout occurs, or if a proxied server responds with a status code other than 2xx or 3xx, the health check fails for that server. Any server that fails a health check is considered unhealthy, and NGINX Plus stops sending client requests to it until it passes a health check again.

    The zone directive defines a memory zone that is shared among worker processes and is used to store the configuration of the upstream group. This enables the worker processes to use the same set of counters to keep track of responses from the servers in the group. The zone directive also makes the group dynamically configurable.

    The default health check behavior can be overridden using the parameters of the health_check directive:

    location / {
        proxy_pass http://backend;
        health_check interval=10 fails=3 passes=2;
    }

    Here, the interval parameter increases the time between two consecutive health checks to 10 seconds. The fails parameter, set to 3, means a server is considered unhealthy after 3 consecutive failed health checks. Finally, the passes parameter means a server must pass 2 consecutive checks to be considered healthy again.

    It is possible to set a specific URI to request in a health check. Use the uri parameter for this purpose:

    location / {
        proxy_pass http://backend;
        health_check uri=/some/path;
    }

    The provided URI is appended to the domain name or IP address specified for the server in the upstream block. For example, for the first server in the backend group declared above, the health check requests http://backend1.example.com/some/path.

    Finally, it is possible to set custom conditions that a healthy response must satisfy. The conditions are specified in a match block, which is referenced by name in the match parameter of the health_check directive.

    http {
        ...
    
        match server_ok {
            status 200-399;
            body !~ "maintenance mode";
        }
    
        server {
            ...
    
            location / {
                proxy_pass http://backend;
                health_check match=server_ok;
            }
        }
    }

    Here, the health check passes if the response status is in the range 200 to 399 and the response body does not match the provided regular expression.
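
    The parameters shown so far can be combined in a single directive. A sketch that reuses the server_ok match block above, with illustrative values:

    location / {
        proxy_pass http://backend;
        # Request /some/path every 10 seconds; 3 consecutive failures mark a
        # server unhealthy, 2 consecutive passes restore it, and a response
        # passes only if it satisfies the server_ok match block
        health_check interval=10 fails=3 passes=2 uri=/some/path match=server_ok;
    }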

    The match directive allows NGINX Plus to check the status, header fields, and body of a response. With this directive it is possible to verify whether the status is in a specified range, whether a response includes a header, and whether a header or the body matches a regular expression. The match directive can contain one status condition, one body condition, and multiple header conditions. To satisfy the match block, a response must meet all of the conditions specified within it.

    For example, the following match directive looks for responses that have the status code 200, contain a “Content-Type” header whose value is exactly text/html, and have a body that includes the text “Welcome to nginx!”:

    match welcome {
        status 200;
        header Content-Type = text/html;
        body ~ "Welcome to nginx!";
    }

    The following example uses ! to negate conditions: it matches responses whose status is anything other than 301, 302, 303, or 307 and that do not include the “Refresh” header field.

    match not_redirect {
        status ! 301-303 307;
        header ! Refresh;
    }
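
    Several header conditions can appear in one block alongside the status and body conditions. A minimal sketch, in which the json_ok name, the header choices, and the body text are hypothetical and chosen only for illustration:

    match json_ok {
        # Any 2xx status
        status 200-299;
        # Content-Type must match the regular expression "json"
        header Content-Type ~ json;
        # The Retry-After header must be absent
        header ! Retry-After;
        # The body must contain the text "healthy"
        body ~ "healthy";
    }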

    Health checks can also be enabled for non-HTTP protocols, such as FastCGI, uwsgi, SCGI, memcached, TCP and UDP.
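
    For TCP, health checks are configured in the stream context rather than in http. The following is a minimal sketch, assuming a hypothetical stream_backend group and a port number chosen purely for illustration:

    stream {
        upstream stream_backend {
            # A shared memory zone is required here as well
            zone stream_backend 64k;

            server backend1.example.com:12345;
            server backend2.example.com:12345;
        }

        server {
            listen 12345;
            proxy_pass stream_backend;
            # With default parameters, the check passes if a TCP connection
            # to the server can be established
            health_check;
        }
    }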