
This tutorial is one of four that put into practice concepts from Microservices March 2022: Kubernetes Networking.

Want detailed guidance on using NGINX for even more Kubernetes networking use cases? Download our free eBook, Managing Kubernetes Traffic with NGINX: A Practical Guide.

Your organization just launched its first app and API in Kubernetes. You’ve been told to expect high traffic volumes (and already implemented autoscaling to ensure NGINX Ingress Controller can quickly route the traffic), but there are concerns that the API may be targeted by a malicious attack. If the API receives a high volume of HTTP requests – a possibility with brute‑force password guessing or DDoS attacks – then both the API and app might be overwhelmed or even crash.

But you’re in luck! Rate limiting is a traffic‑control technique, and a classic API gateway use case, that restricts the incoming request rate to a value typical for real users. You configure NGINX Ingress Controller to implement a rate‑limiting policy, which prevents the app and API from being overwhelmed by too many requests. Nice work!

Lab and Tutorial Overview

This blog accompanies the lab for Unit 2 of Microservices March 2022 – Exposing APIs in Kubernetes, demonstrating how to combine multiple NGINX Ingress Controllers with rate limiting to prevent apps and APIs from getting overwhelmed.

To run the tutorial, you need a machine with:

  • 2 CPUs or more
  • 2 GB of free memory
  • 20 GB of free disk space
  • Internet connection
  • Container or virtual machine manager, such as Docker, Hyperkit, Hyper-V, KVM, Parallels, Podman, VirtualBox, or VMware Fusion/Workstation
  • minikube installed
  • Helm installed
  • A configuration that allows you to launch a browser window. If that isn’t possible, you need to figure out how to access the relevant services via a browser.
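
Before you begin, you can optionally verify that the core tools are installed; these version commands are standard for each tool:

$ minikube version
$ helm version
$ kubectl version --client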

This tutorial uses these technologies: minikube, Podinfo, NGINX Ingress Controller (from F5 NGINX), and Locust.

The instructions for each challenge include the complete text of the YAML files used to configure the apps. You can also copy the text from our GitHub repo. A link to GitHub is provided along with the text of each YAML file.

This tutorial includes three challenges:

  1. Deploy a Cluster, App, API, and Ingress Controller
  2. Overwhelm Your App and API
  3. Save Your App and API with Dual Ingress Controllers and Rate Limiting

Challenge 1: Deploy a Cluster, App, API, and Ingress Controller

In this challenge, you deploy a minikube cluster and install Podinfo as a sample app and API. You then deploy NGINX Ingress Controller, configure traffic routing, and test the Ingress configuration.

Create a Minikube Cluster

Create a minikube cluster. After a few seconds, a message confirms the deployment was successful.

$ minikube start 
🏄  Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default 
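
If you want an extra check that the cluster is ready, list the nodes; the single minikube node should report Ready in the STATUS column:

$ kubectl get nodes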

Install the Podinfo App and Podinfo API

Podinfo is a “web application made with Go that showcases best practices of running microservices in Kubernetes”. We’re using it as a sample app and API because of its small footprint.

  1. Using the text editor of your choice, create a YAML file called 1-apps.yaml with the following contents (or copy from GitHub). It defines a Deployment and a Service for each of two Podinfo instances:

    • A web app (we’ll call it Podinfo Frontend) that renders an HTML page
    • An API (Podinfo API) that returns a JSON payload
    apiVersion: apps/v1 
    kind: Deployment 
    metadata: 
      name: api 
    spec: 
      selector: 
        matchLabels: 
          app: api 
      template: 
        metadata: 
          labels: 
            app: api 
        spec: 
          containers: 
            - name: api 
              image: stefanprodan/podinfo 
              ports: 
                - containerPort: 9898 
    --- 
    apiVersion: v1 
    kind: Service 
    metadata: 
      name: api 
    spec: 
      ports: 
        - port: 80 
          targetPort: 9898 
          nodePort: 30001 
      selector: 
        app: api 
      type: LoadBalancer 
    --- 
    apiVersion: apps/v1 
    kind: Deployment 
    metadata: 
      name: frontend 
    spec: 
      selector: 
        matchLabels: 
          app: frontend 
      template: 
        metadata: 
          labels: 
            app: frontend 
        spec: 
          containers: 
            - name: frontend 
              image: stefanprodan/podinfo 
              ports: 
                - containerPort: 9898 
    --- 
    apiVersion: v1 
    kind: Service 
    metadata: 
      name: frontend 
    spec: 
      ports: 
        - port: 80 
          targetPort: 9898 
          nodePort: 30002 
      selector: 
        app: frontend 
      type: LoadBalancer 
    
  2. Deploy the app and API:

    $ kubectl apply -f 1-apps.yaml
    deployment.apps/api created 
    service/api created 
    deployment.apps/frontend created 
    service/frontend created 
    
  3. Confirm that the pods for Podinfo API and Podinfo Frontend deployed successfully, as indicated by the value Running in the STATUS column.

    $ kubectl get pods  
    NAME                        READY   STATUS    RESTARTS  AGE 
    api-7574cf7568-c6tr6        1/1     Running   0         87s 
    frontend-6688d86fc6-78qn7   1/1     Running   0         87s 
    

Deploy NGINX Ingress Controller

The fastest way to install NGINX Ingress Controller is with Helm. Install it in a separate namespace (nginx) by following these steps:

  1. Create the namespace:

    $ kubectl create namespace nginx 
    
  2. Add the NGINX repository to Helm:

    $ helm repo add nginx-stable https://helm.nginx.com/stable 
    
  3. Download and install NGINX Ingress Controller in your cluster:

    $ helm install main nginx-stable/nginx-ingress \ 
     --set controller.watchIngressWithoutClass=true \ 
     --set controller.ingressClass=nginx \ 
     --set controller.service.type=NodePort \ 
     --set controller.service.httpPort.nodePort=30010 \ 
     --set controller.enablePreviewPolicies=true \ 
     --namespace nginx 
    
  4. Confirm that the NGINX Ingress Controller pod deployed, as indicated by the value Running in the STATUS column.

    $ kubectl get pods --namespace nginx 
    NAME                                  READY   STATUS    RESTARTS   AGE
    main-nginx-ingress-779b74bb8b-d4qtc   1/1     Running   0          92s
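
    You can also inspect the Service that the Helm chart created; because of the controller.service parameters set above, it is a NodePort Service publishing HTTP on node port 30010 (an optional check):

    $ kubectl get services --namespace nginx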
    

Route Traffic to Your App

  1. Using the text editor of your choice, create a YAML file called 2-ingress.yaml with the following contents (or copy from GitHub). It defines the Ingress manifest required to route traffic to the app and API.
    apiVersion: networking.k8s.io/v1 
    kind: Ingress 
    metadata: 
      name: first 
    spec: 
      ingressClassName: nginx 
      rules: 
        - host: "example.com" 
          http: 
            paths: 
              - backend: 
                  service: 
                    name: frontend 
                    port: 
                      number: 80 
                path: / 
                pathType: Prefix 
        - host: "api.example.com" 
          http: 
            paths: 
              - backend: 
                  service: 
                    name: api 
                    port: 
                      number: 80 
                path: / 
                pathType: Prefix 
    
  2. Deploy the Ingress resource:

    $ kubectl apply -f 2-ingress.yaml 
    ingress.networking.k8s.io/first created 
    

Test the Ingress Configuration

  1. To ensure your Ingress configuration is performing as expected, test it using a temporary pod. Launch a disposable BusyBox pod in the cluster:

    $ kubectl run -ti --rm=true busybox --image=busybox 
    If you don't see a command prompt, try pressing enter.
    / #
  2. Test Podinfo API by issuing a request to the NGINX Ingress Controller pod with the hostname api.example.com. The output shown indicates that the API is receiving traffic.

    / # wget --header="Host: api.example.com" -qO- main-nginx-ingress.nginx 
    { 
      "hostname": "api-687fd448f8-t7hqk", 
      "version": "6.0.3", 
      "revision": "", 
      "color": "#34577c", 
      "logo": "https://raw.githubusercontent.com/stefanprodan/podinfo/gh-pages/cuddle_clap.gif", 
      "message": "greetings from podinfo v6.0.3", 
      "goos": "linux", 
      "goarch": "arm64", 
      "runtime": "go1.16.9", 
      "num_goroutine": "6", 
      "num_cpu": "4" 
    } 
  3. Test Podinfo Frontend by issuing the following command in the same BusyBox pod to simulate a web browser and retrieve the web page. The output shown is the HTML code for the start of the web page.

    / # wget --header="Host: example.com" --header="User-Agent: Mozilla" -qO- main-nginx-ingress.nginx 
    <!DOCTYPE html> 
    <html> 
    <head> 
      <title>frontend-596d5c9ff4-xkbdc</title> 
      # ...
  4. In another terminal, open Podinfo Frontend in a browser. The greetings from podinfo page indicates Podinfo is running.

    $ minikube service frontend

    Congratulations! NGINX Ingress Controller is receiving requests and forwarding them to the app and API.

  5. In the original terminal, end the BusyBox session:

    / # exit 
    $
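
If the minikube network is reachable from your host (this depends on the driver; with the Docker driver on macOS or Windows it often isn’t), you can run the same tests from a regular terminal via the node port, 30010, that the Helm installation assigned to the Ingress controller:

$ curl -H "Host: api.example.com" http://$(minikube ip):30010/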

Challenge 2: Overwhelm Your App and API

In this challenge, you install Locust, an open source load‑generation tool, and use it to simulate a traffic surge that overwhelms the API and causes the app to crash.

Install Locust

  1. Using the text editor of your choice, create a YAML file called 3-locust.yaml with the following contents (or copy from GitHub).

    The ConfigMap object defines a script called locustfile.py which generates requests to be sent to the pod, complete with the correct headers. The traffic is not distributed evenly between the app and API – requests are skewed to Podinfo API, which receives five times as many requests as Podinfo Frontend.

    The Deployment and Service objects define the Locust pod.

    apiVersion: v1 
    kind: ConfigMap 
    metadata: 
      name: locust-script 
    data: 
      locustfile.py: |- 
        from locust import HttpUser, task, between 
    
        class QuickstartUser(HttpUser): 
            wait_time = between(0.7, 1.3) 
    
            @task(1) 
            def visit_website(self): 
                with self.client.get("/", headers={"Host": "example.com", "User-Agent": "Mozilla"}, timeout=0.2, catch_response=True) as response: 
                    if response.request_meta["response_time"] > 200: 
                        response.failure("Frontend failed") 
                    else: 
                        response.success() 
      
            @task(5) 
            def visit_api(self): 
                with self.client.get("/", headers={"Host": "api.example.com"}, timeout=0.2, catch_response=True) as response: 
                    if response.request_meta["response_time"] > 200: 
                        response.failure("API failed") 
                    else: 
                        response.success() 
    --- 
    apiVersion: apps/v1 
    kind: Deployment 
    metadata: 
      name: locust 
    spec: 
      selector: 
        matchLabels: 
          app: locust 
      template: 
        metadata: 
          labels: 
            app: locust 
        spec: 
          containers: 
            - name: locust 
              image: locustio/locust 
              ports: 
                - containerPort: 8089 
              volumeMounts: 
                - mountPath: /home/locust 
                  name: locust-script 
          volumes: 
            - name: locust-script 
              configMap: 
                name: locust-script 
    --- 
    apiVersion: v1 
    kind: Service 
    metadata: 
      name: locust 
    spec: 
      ports: 
        - port: 8089 
          targetPort: 8089 
          nodePort: 30015 
      selector: 
        app: locust 
      type: LoadBalancer
  2. Deploy Locust:

    $ kubectl apply -f 3-locust.yaml 
    configmap/locust-script created 
    deployment.apps/locust created 
    service/locust created 
    
  3. Verify the Locust deployment. In the following sample output, the verification command was run just a few seconds after the kubectl apply command, so the installation is still in progress, as indicated by the value ContainerCreating for the Locust pod in the STATUS field. Wait until the value is Running before continuing to the next section.

    $ kubectl get pods
    NAME                        READY   STATUS              RESTARTS   AGE 
    api-7574cf7568-c6tr6        1/1     Running             0          33m
    frontend-6688d86fc6-78qn7   1/1     Running             0          33m
    locust-77c699c94d-hc76t     0/1     ContainerCreating   0          4s

Simulate a Traffic Surge

    1. Open Locust in a browser.

      $ minikube service locust

    2. Enter the following values in the fields:

      • Number of users – 1000
      • Spawn rate – 30
      • Host – http://main-nginx-ingress.nginx
    3. Click the Start swarming button to send traffic to Podinfo API and Podinfo Frontend. Observe the traffic patterns on the Locust Charts and Failures tabs:

      • Charts – As the number of API requests increases, the Podinfo API response times worsen.
      • Failures – Because Podinfo API and Podinfo Frontend share an Ingress controller, the increasing number of API requests soon causes the web app to start returning errors.

This is problematic because a single bad actor using the API can take down not only the API, but all apps served by NGINX Ingress Controller!
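
If you prefer the command line to the browser UI, you can generate an equivalent load headlessly by exec’ing into the Locust pod. This is a sketch, not part of the lab; it assumes the standard Locust CLI flags (--headless, -u for users, -r for spawn rate, -H for host) and the script path mounted from the ConfigMap:

$ kubectl exec -it deploy/locust -- locust --headless -u 1000 -r 30 -H http://main-nginx-ingress.nginx -f /home/locust/locustfile.py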

Challenge 3: Save Your App and API with Dual Ingress Controllers and Rate Limiting

In the final challenge, you eliminate the limitations of the previous deployment: you create a separate namespace for each service, install dedicated NGINX Ingress Controller instances for Podinfo Frontend and Podinfo API, reconfigure Locust to direct app and API traffic to the respective Ingress controllers, and verify that rate limiting is effective.

First, let’s look at how to address the architectural problem. In the previous challenge, you overwhelmed NGINX Ingress Controller with API requests, which also impacted the app. This happened because a single Ingress controller was responsible for routing traffic to both the web app (Podinfo Frontend) and the API (Podinfo API).

Running a separate NGINX Ingress Controller pod for each of your services prevents your app from being impacted by too many API requests. This isn’t necessarily required for every use case, but in our simulation it’s easy to see the benefits of running multiple NGINX Ingress Controllers.

The second part of the solution, which prevents Podinfo API from getting overwhelmed, is to implement rate limiting by using NGINX Ingress Controller as an API gateway.

What Is Rate Limiting?

Rate limiting restricts the number of requests a user can make in a given time period. To mitigate a DDoS attack, for example, you can use rate limiting to limit the incoming request rate to a value typical for real users. When rate limiting is implemented with NGINX, clients that submit too many requests receive an error response, so they cannot negatively impact the API. Learn how this works in the NGINX Ingress Controller documentation.
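
Under the hood, NGINX implements rate limiting with the limit_req_zone and limit_req directives. As a rough sketch (simplified for intuition, not the literal configuration that NGINX Ingress Controller generates), the policy you create later in this challenge corresponds to NGINX configuration like this:

limit_req_zone $binary_remote_addr zone=rate_limit:10m rate=10r/s;

server {
    location / {
        limit_req zone=rate_limit;
    }
}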

What Is an API Gateway?

An API gateway routes API requests from clients to the appropriate services. A common misconception about this simple definition is that an API gateway is a unique piece of technology. It’s not. Rather, “API gateway” describes a set of use cases that can be implemented via different types of proxies – most commonly an ADC or load balancer and reverse proxy, and increasingly an Ingress controller or service mesh. Rate limiting is a common use case for deploying an API gateway. Learn more about API gateway use cases in Kubernetes in How Do I Choose? API Gateway vs. Ingress Controller vs. Service Mesh on our blog.

Prepare Your Cluster

Before you can implement the new architecture and rate limiting, you must delete the previous NGINX Ingress Controller configuration.

    1. Delete the NGINX Ingress Controller configuration:

      $ kubectl delete -f 2-ingress.yaml 
      ingress.networking.k8s.io "first" deleted 
      
    2. Create a namespace called nginx‑web for Podinfo Frontend:

      $ kubectl create namespace nginx-web 
      namespace/nginx-web created 
      
    3. Create a namespace called nginx‑api for Podinfo API:

      $ kubectl create namespace nginx-api 
      namespace/nginx-api created 
      

Install the NGINX Ingress Controller for Podinfo Frontend

    1. Install NGINX Ingress Controller:

      $ helm install web nginx-stable/nginx-ingress \ 
        --set controller.ingressClass=nginx-web \ 
        --set controller.service.type=NodePort \ 
        --set controller.service.httpPort.nodePort=30020 \ 
        --namespace nginx-web
    2. Create an Ingress manifest called 4-ingress-web.yaml for Podinfo Frontend (or copy from GitHub).

      apiVersion: networking.k8s.io/v1 
      kind: Ingress 
      metadata: 
        name: frontend 
      spec: 
        ingressClassName: nginx-web 
        rules: 
          - host: "example.com" 
            http: 
              paths: 
                - backend: 
                    service: 
                      name: frontend 
                      port: 
                        number: 80 
                  path: / 
                  pathType: Prefix 
      
    3. Deploy the new manifest:

      $ kubectl apply -f 4-ingress-web.yaml 
      ingress.networking.k8s.io/frontend created  
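
      To confirm the resource was created and picked up by the right controller, you can list Ingress resources; the frontend Ingress should show nginx-web in the CLASS column (an optional check):

      $ kubectl get ingress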
      

Install the NGINX Ingress Controller for Podinfo API

The manifest you created in the last section is exclusively for the NGINX Ingress Controller for Podinfo Frontend, as specified by the value nginx‑web in the ingressClassName field. Now you install an NGINX Ingress Controller for Podinfo API, including a rate‑limiting policy to prevent your API from getting overwhelmed.

There are two ways to configure rate limiting with NGINX Ingress Controller:

    • NGINX Ingress resources – NGINX Ingress resources are custom Kubernetes resources that serve as an alternative to the standard Ingress resource. They provide a native, type‑safe, and indented configuration style which simplifies implementation of Ingress load balancing capabilities, including:

      • Circuit breaking – For appropriate handling of application errors
      • Sophisticated routing – For A/B testing and blue‑green deployments
      • Header manipulation – For offloading application logic to NGINX Ingress Controller
      • Mutual TLS authentication (mTLS) – For zero‑trust or identity‑based security
      • Web application firewall (WAF) – For protection against HTTP vulnerability attacks
    • Snippets – Snippets insert raw NGINX configuration into different contexts of the configuration generated by NGINX Ingress Controller. While snippets are a possible approach, we recommend avoiding them whenever possible because they’re error‑prone, difficult to work with, lack fine‑grained control, and can create security issues.

    This tutorial uses the NGINX Policy resource with a rateLimit definition, which offers numerous configuration options. In this challenge, you use just the three required parameters:

    • rate – The maximum permitted rate of requests, expressed in requests per second (r/s) or requests per minute (r/m).
    • key – The characteristic by which each requester is uniquely identified, for example its IP address. The value can contain text, variables, or a combination.
    • zoneSize – The amount of shared memory allocated for the NGINX worker processes to keep track of requests, expressed in kilobytes (K) or megabytes (M).

    This example limits each requester to 10 requests per second, identifies requesters by IP address (captured in the NGINX variable ${binary_remote_addr}), and allocates 10 MB for the shared memory zone:

    rateLimit: 
      rate: 10r/s 
      key: ${binary_remote_addr} 
      zoneSize: 10M 
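
    For sizing intuition: with ${binary_remote_addr} as the key, each tracked client consumes on the order of 64 bytes of zone memory, so a 10 MB zone can track very roughly 160,000 unique addresses. The rateLimit policy also accepts optional parameters beyond the three required ones; this sketch adds two commonly useful ones, burst and rejectCode (parameter names as documented for the NGINX Ingress Controller Policy resource – verify them against your controller version):

    rateLimit: 
      rate: 10r/s 
      key: ${binary_remote_addr} 
      zoneSize: 10M 
      burst: 10        # allow short bursts to queue rather than be rejected outright 
      rejectCode: 429  # return 429 Too Many Requests instead of the default 503 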
    
    1. Install NGINX Ingress Controller:

      $ helm install api nginx-stable/nginx-ingress \ 
        --set controller.ingressClass=nginx-api \ 
        --set controller.service.type=NodePort \ 
        --set controller.service.httpPort.nodePort=30030 \ 
        --set controller.enablePreviewPolicies=true \ 
        --namespace nginx-api
    2. Create a manifest called 5-ingress-api.yaml for Podinfo API (or copy from GitHub). It defines the rate‑limiting Policy and a VirtualServer resource that applies the policy to traffic for api.example.com.

      apiVersion: k8s.nginx.org/v1 
      kind: Policy 
      metadata: 
        name: rate-limit-policy 
      spec: 
        rateLimit: 
          rate: 10r/s 
          key: ${binary_remote_addr} 
          zoneSize: 10M 
      --- 
      apiVersion: k8s.nginx.org/v1 
      kind: VirtualServer 
      metadata: 
        name: api-vs 
      spec: 
        ingressClassName: nginx-api 
        host: api.example.com 
        policies: 
        - name: rate-limit-policy 
        upstreams: 
        - name: api 
          service: api 
          port: 80 
        routes: 
        - path: / 
          action: 
            pass: api 
      
    3. Deploy the new manifest:

      $ kubectl apply -f 5-ingress-api.yaml 
      policy.k8s.nginx.org/rate-limit-policy created 
      virtualserver.k8s.nginx.org/api-vs created 
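
      You can confirm that both objects were accepted (an optional check; the resource names assume the k8s.nginx.org CRDs installed by the Helm chart, and with status reporting enabled the VirtualServer shows a STATE of Valid):

      $ kubectl get policies,virtualservers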
      

Reconfigure Locust

    Now, reconfigure Locust and verify that:

    • Podinfo API doesn’t get overloaded.
    • No matter how many requests are sent to Podinfo API, there is no impact on Podinfo Frontend.

    Perform these steps:

    1. Change the Locust script so that:

      • All requests to Podinfo Frontend are directed to the nginx‑web NGINX Ingress Controller at http://web-nginx-ingress.nginx-web
      • All requests to Podinfo API are directed to the nginx‑api NGINX Ingress Controller at http://api-nginx-ingress.nginx-api

      Because Locust supports just a single URL in the dashboard, hardcode the value in the Python script using the YAML file 6-locust.yaml with the following contents (or copy from GitHub). Take note of the URLs in each task.

      apiVersion: v1 
      kind: ConfigMap 
      metadata: 
        name: locust-script 
      data: 
        locustfile.py: |- 
          from locust import HttpUser, task, between 
      
          class QuickstartUser(HttpUser): 
              wait_time = between(0.7, 1.3) 
      
              @task(1) 
              def visit_website(self): 
                  with self.client.get("http://web-nginx-ingress.nginx-web/", headers={"Host": "example.com", "User-Agent": "Mozilla"}, timeout=0.2, catch_response=True) as response: 
                      if response.request_meta["response_time"] > 200: 
                          response.failure("Frontend failed") 
                      else: 
                          response.success() 
        
      
              @task(5) 
              def visit_api(self): 
                  with self.client.get("http://api-nginx-ingress.nginx-api/", headers={"Host": "api.example.com"}, timeout=0.2) as response: 
                      if response.request_meta["response_time"] > 200: 
                          response.failure("API failed") 
                      else: 
                          response.success() 
      --- 
      apiVersion: apps/v1 
      kind: Deployment 
      metadata: 
        name: locust 
      spec: 
        selector: 
          matchLabels: 
            app: locust 
        template: 
          metadata: 
            labels: 
              app: locust 
          spec: 
            containers: 
              - name: locust 
                image: locustio/locust 
                ports: 
                  - containerPort: 8089 
                volumeMounts: 
                  - mountPath: /home/locust 
                    name: locust-script 
            volumes: 
              - name: locust-script 
                configMap: 
                  name: locust-script 
      --- 
      apiVersion: v1 
      kind: Service 
      metadata: 
        name: locust 
      spec: 
        ports: 
          - port: 8089 
            targetPort: 8089 
            nodePort: 30015 
        selector: 
          app: locust 
        type: LoadBalancer 
      
    2. Deploy the new Locust configuration. The output confirms that the script changed but the other elements remain unchanged.

      $ kubectl apply -f 6-locust.yaml 
      configmap/locust-script configured 
      deployment.apps/locust unchanged 
      service/locust unchanged
    3. Delete the Locust pod to force a reload of the new ConfigMap. The argument to the kubectl delete pod command is a command substitution that extracts the Locust pod’s name from the list of all pods. (An alternative to this command appears after the list.)

      $ kubectl delete pod `kubectl get pods | grep locust | awk {'print $1'}` 
      
    4. Verify Locust has been reloaded (the value for the Locust pod in the AGE column is only a few seconds).

      $ kubectl get pods
      NAME                        READY   STATUS    RESTARTS   AGE 
      api-7574cf7568-jrlvd        1/1     Running   0          9m57s
      frontend-6688d86fc6-vd856   1/1     Running   0          9m57s
      locust-77c699c94d-6chsg     0/1     Running   0          6s
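
    As an alternative to deleting the pod by name in Step 3, you can have Kubernetes recreate it for you; the effect is the same for our purposes (kubectl rollout restart is standard in kubectl v1.15 and later):

      $ kubectl rollout restart deployment locust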

Verify Rate Limiting

    1. Return to Locust and change the parameters in these fields:

      • Number of users – 400
      • Spawn rate – 10
      • Host – http://main-nginx-ingress (because the URLs are hardcoded in the script, this value is effectively ignored)
    2. Click the Start swarming button to send traffic to Podinfo API and Podinfo Frontend.

      In the Locust title bar at top left, observe that as the number of users climbs in the STATUS column, so does the value in the FAILURES column. However, the errors are no longer coming from Podinfo Frontend but rather from Podinfo API, because the rate limit set for the API means excessive requests are being rejected. In the trace at lower right you can see that NGINX is returning the message 503 Service Temporarily Unavailable, which is part of the rate‑limiting feature and can be customized. The API is rate limited, and the web application is always available. Well done!
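
You can also observe the rate limit without Locust. The one‑off pod below (an illustrative sketch; it assumes the publicly available curlimages/curl image) sends 30 back‑to‑back requests to the API’s Ingress controller, so once the 10 r/s limit kicks in you should see 503 status codes mixed in with the 200s:

$ kubectl run -ti --rm curl-test --image=curlimages/curl --restart=Never --command -- sh -c 'for i in $(seq 1 30); do curl -s -o /dev/null -w "%{http_code}\n" -H "Host: api.example.com" http://api-nginx-ingress.nginx-api/; done'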

Next Steps

In the real world, rate limiting alone isn’t enough to protect your apps and APIs from bad actors. You typically also need to implement one or more of the following methods for protecting Kubernetes apps, APIs, and infrastructure:

  • Authentication and authorization
  • Web application firewall and DDoS protection
  • End-to-end encryption and Zero Trust
  • Compliance with industry regulations

We cover these topics and more in Unit 3 of Microservices March 2022 – Microservices Security Pattern in Kubernetes.

To try NGINX Ingress Controller for Kubernetes with NGINX Plus and NGINX App Protect, start your free 30-day trial today or contact us to discuss your use cases.

To try NGINX Ingress Controller with NGINX Open Source, you can obtain the release source code or download a prebuilt container from Docker Hub.


About The Author

Daniele Polencic


Managing Director
