Using NGINX as a DoT or DoH Gateway

There’s a lot of talk surrounding the Domain Name System (DNS) at the moment, with massive changes being proposed for the 36-year-old protocol. The Internet’s name service, which has its origins in ARPANET, has never broken backward compatibility since its inception. But new proposals to change the DNS transport mechanism may be about to change that.

In this post I look at two emerging technologies for securing DNS, DNS over TLS (DoT) and DNS over HTTPS (DoH), and show how to implement them using NGINX Open Source and NGINX Plus.

The History of DNS

DNS was the second attempt at creating a naming service for the Advanced Research Projects Agency’s (ARPA) early Internet, the first being the Internet Name Server protocol published by Jon Postel in 1979 as IEN-116. DNS was designed to be hierarchical and provide a structure for hostnames to be decentralized into zones and managed by many separate authorities. The first RFCs for DNS were published in 1983 (RFC 882 and 883), and while they have had several extensions over the years, a client written to the standards defined then would still work today.

So why change the protocol now? It clearly works as intended, justifying the confidence of its authors, who were so sure they had it right that they didn’t include a version number in the DNS packet – I can’t think of many other protocols that can make that claim. DNS was conceived in a more innocent time, when most protocols were in cleartext and often 7-bit ASCII, but the Internet of today is a much scarier place than the ARPANET of the 1980s. Today most protocols have embraced Transport Layer Security (TLS) for encryption and verification purposes. Critics of DNS argue that it is well overdue for some additional security protections.

So, in the same way that DNS was the second attempt at providing a name‑service protocol for the Internet, DNS over TLS (DoT) and DNS over HTTPS (DoH) are emerging as second attempts to secure the DNS protocol. The first attempt was an extension known as DNSSEC, and while most top‑level domains (TLDs) make use of DNSSEC, it was never intended to encrypt the data carried in the DNS packets; it only provides verification that the data was not tampered with. DoT and DoH are protocol extensions that wrap DNS inside a TLS tunnel, and if adopted they will end 36 years of backward compatibility.

DoT and DoH in More Detail

DoT, I think, is largely seen as a sensible extension. It has already been assigned its own port number (TCP/853) by the Internet Assigned Numbers Authority (IANA), and simply wraps TCP DNS packets inside a TLS‑encrypted tunnel. Many protocols have done this before: HTTPS is HTTP inside a TLS tunnel, and SMTPS, IMAPS, and LDAPS are secured versions of those protocols. DNS has always used UDP (or TCP in certain cases) as the transport protocol, so adding the TLS wrapper is not a massive change.
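To make the "TCP DNS packet inside a TLS tunnel" idea concrete, here is a minimal Python sketch of what actually travels inside that tunnel: an ordinary DNS query preceded by the two-byte length prefix that DNS over TCP uses. The helper names (build_query, frame, unframe) are hypothetical and chosen for illustration only; nothing here comes from the NGINX code.

```python
import struct

def build_query(qname: str, qtype: int = 1, msg_id: int = 0) -> bytes:
    """Build a minimal DNS query: one question, recursion desired."""
    # Header: ID, flags (RD bit = 0x0100), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0
    header = struct.pack("!HHHHHH", msg_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is length-prefixed and the name ends with a zero byte
    labels = qname.rstrip(".").split(".")
    name = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    # QTYPE (1 = A) followed by QCLASS (1 = IN)
    return header + name + struct.pack("!HH", qtype, 1)

def frame(message: bytes) -> bytes:
    """Prefix a DNS message with its 2-byte length, as DNS over TCP (and DoT) does."""
    return struct.pack("!H", len(message)) + message

def unframe(data: bytes) -> bytes:
    """Strip the 2-byte length prefix and return exactly one DNS message."""
    (length,) = struct.unpack("!H", data[:2])
    return data[2:2 + length]

query = build_query("www.example.com")
framed = frame(query)
print(len(query), framed[:2].hex())  # 33 0021
```

The same framed bytes are sent whether the TCP connection is plain (port 53) or wrapped in TLS (port 853), which is why a TLS-terminating proxy can translate between the two without touching the DNS payload.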

DoH, on the other hand, is a little more controversial. DoH takes a DNS packet and wraps it inside an HTTP GET or POST request, which is then sent using HTTP/2 or higher over an HTTPS connection. This appears to all intents and purposes to be just like any other HTTPS connection, and it’s impossible for enterprises or service providers to see what requests are being made. Mozilla and other supporters say that this practice increases users’ privacy by keeping the sites that they visit private from prying eyes.

But that’s not exactly true. Critics of DoH point out that it’s not quite the Tor of DNS because when the browser eventually makes the connection to the host, which it has looked up using DoH, the request will almost certainly use the TLS Server Name Indication (SNI) extension – and that includes the hostname and is sent in cleartext. Further, if the browser attempts to validate the server’s certificate using Online Certificate Status Protocol (OCSP), that process most likely occurs in cleartext as well. So, anyone with the ability to monitor DNS lookups also has the ability to read the SNI in the connection or the certificate name in the OCSP validation.

For many people, the biggest problem with DoH is that browser vendors choose the DNS servers to which DoH requests made by their users are sent by default (in the case of Firefox users in the US, for example, the DNS servers belong to Cloudflare). The operator of the DNS servers can see the user’s IP address and the domain names of the sites to which they make requests. That may not seem like much, but researchers at the University of Illinois found that it’s possible to deduce which websites a person has visited from nothing more than the destination addresses of requests for elements of a web page, the so‑called Page Load Fingerprint. That information can then be used to “profile and target the user for advertising”.

How Can NGINX Help?

DoT and DoH are not inherently evil, and there are use cases where they increase user privacy. However, there is a growing consensus that a public, centralized DoH service is bad for user privacy, and we recommend that you avoid using one at all costs.

In any case, for your sites and apps you likely manage your own DNS zones – some public, some private, and some with a split horizon. At some point you might decide you want to run your own DoT or DoH service. This is where NGINX can help.

The privacy enhancements provided by DoT offer some great advantages for DNS security, but what if your current DNS server doesn’t offer support for DoT? NGINX can help here by providing a gateway between DoT and standard DNS.

Or perhaps you like the firewall-busting potential of DoH for cases where the DoT port may be blocked. Again NGINX can help, by providing a DoH-to-DoT/DNS gateway.

Deploying a Simple DoT-DNS Gateway

The NGINX Stream (TCP/UDP) module supports SSL termination, and so it’s actually really simple to set up a DoT service. You can create a simple DoT gateway in just a few lines of NGINX configuration.

You need an upstream block for your DNS servers, and a server block for TLS termination:

stream {
    # DNS upstream pool
    upstream dns {
        zone dns 64k;
        server 8.8.8.8:53;
    }

    # DoT server for decryption
    server {
        listen 853 ssl;
        ssl_certificate /etc/nginx/ssl/certs/doh.local.pem;
        ssl_certificate_key /etc/nginx/ssl/private/doh.local.pem;
        proxy_pass dns;
    }
}
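One way to exercise a gateway like this is with a small DoT client. The following Python sketch is an illustration only: the function names are hypothetical, and the host name and CA file you pass in are assumptions that depend on your own certificate setup. It opens a TLS connection, sends one length-prefixed DNS message, and reads one length-prefixed reply.

```python
import socket
import ssl
import struct

def recv_exact(sock, count: int) -> bytes:
    """Read exactly `count` bytes; TCP and TLS reads may return short chunks."""
    buf = b""
    while len(buf) < count:
        chunk = sock.recv(count - len(buf))
        if not chunk:
            raise ConnectionError("connection closed before the full message arrived")
        buf += chunk
    return buf

def dot_exchange(message: bytes, host: str, port: int = 853, cafile=None) -> bytes:
    """Send one DNS message over TLS and return the raw DNS reply."""
    # cafile lets you trust a private CA; None uses the system trust store
    context = ssl.create_default_context(cafile=cafile)
    with socket.create_connection((host, port), timeout=5) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            # DNS over TCP/TLS framing: 2-byte big-endian length, then the message
            tls.sendall(struct.pack("!H", len(message)) + message)
            (length,) = struct.unpack("!H", recv_exact(tls, 2))
            return recv_exact(tls, length)
```

A call might look like dot_exchange(query_bytes, "doh.local", cafile="/path/to/ca.pem"), where both arguments are placeholders for your own values. The recv_exact helper matters because a reply can arrive split across several TLS records.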

Of course we can also go the other way and forward incoming DNS requests to an upstream DoT server. This is less useful, however, because most DNS traffic is UDP and NGINX can translate only between DoT and other TCP services, such as TCP‑based DNS.

stream {
    # DoT upstream pool
    upstream dot {
        zone dot 64k;
        server 8.8.8.8:853;
    }

    # DNS server for upstream encryption
    server {
        listen 53;
        proxy_ssl on;
        proxy_pass dot;
    }
}

A Simple DoH-DNS Gateway

Compared to a DoT gateway, the configuration of a simple DoH gateway is a little more complex. We need both an HTTPS service and a Stream service, and use JavaScript code and the NGINX JavaScript module (njs) to translate between the two protocols. The simplest configuration is:

http {
    # This is our upstream connection to the njs translation process
    upstream dohloop {
        zone dohloop 64k;
        server 127.0.0.1:8053;
    }

    # This virtual server accepts HTTP/2 over HTTPS
    server {
        listen 443 ssl http2;
        ssl_certificate /etc/nginx/ssl/certs/doh.local.pem;
        ssl_certificate_key /etc/nginx/ssl/private/doh.local.pem;

        # Return 404 for non-DoH requests
        location / {
            return 404 "404 Not Found\n";
        }

        # Here we downgrade the HTTP/2 request to HTTP/1.1 and forward it to
        # the DoH loop service
        location /dns-query {
            proxy_http_version 1.1;
            proxy_set_header Connection "";
            proxy_pass http://dohloop;
        }
    }
}

stream {
    # Import the JavaScript file that processes the DNS packets
    js_include /etc/nginx/njs.d/nginx_stream.js;

    # DNS upstream pool (can also be DoT)
    upstream dns {
        zone dns 64k;
        server 8.8.8.8:53;
    }

    # DNS over HTTPS (gateway) translation process
    # Upstream can be either DNS (TCP) or DoT 
    server {
        listen 127.0.0.1:8053;
        js_filter doh_filter_request;
        proxy_pass dns;
    }
}

This configuration does the minimum amount of processing required to send the packet on its way to the DNS service. This use case assumes that the upstream DNS server performs any other filtering, logging, or security functions.
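For reference, this is roughly what a client sends to the /dns-query location. The Python sketch below follows the two request forms defined in RFC 8484: a GET with the DNS message base64url-encoded (padding removed) in a dns parameter, and a POST with the raw message as an application/dns-message body. The URL https://doh.local/dns-query is an assumption based on the certificate name used in the configuration, and the helper names are hypothetical. The requests are only constructed here, not sent.

```python
import base64
import urllib.request

# A query for the A record of www.example.com with DNS ID 0
QUERY = bytes.fromhex(
    "000001000001000000000000"   # header: ID=0, RD set, one question
    "03777777"                   # label "www"
    "076578616d706c65"           # label "example"
    "03636f6d"                   # label "com"
    "00"                         # end of name
    "00010001"                   # QTYPE=A, QCLASS=IN
)

def doh_get_url(base_url: str, message: bytes) -> str:
    """GET form: base64url-encode the message and strip the '=' padding."""
    encoded = base64.urlsafe_b64encode(message).rstrip(b"=").decode("ascii")
    return f"{base_url}?dns={encoded}"

def doh_post_request(base_url: str, message: bytes) -> urllib.request.Request:
    """POST form: the raw DNS message is the request body."""
    return urllib.request.Request(
        base_url,
        data=message,
        method="POST",
        headers={
            "Content-Type": "application/dns-message",
            "Accept": "application/dns-message",
        },
    )

print(doh_get_url("https://doh.local/dns-query", QUERY))
# https://doh.local/dns-query?dns=AAABAAABAAAAAAAAA3d3dwdleGFtcGxlA2NvbQAAAQAB
```

Note that Python's urllib speaks HTTP/1.1 only, so a real DoH client would normally use an HTTP/2-capable library instead.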

The JavaScript script used in this configuration (nginx_stream.js) includes various DNS library module files. Inside the dns.js module, the dns_decode_level variable sets how much processing is done on the DNS packets. Processing DNS packets obviously hurts performance. If you are using a configuration like the one above, set dns_decode_level to 0.

A More Advanced DoH Gateway

Here at NGINX we’re really rather good at HTTP, so we think that using NGINX only as a simple DoH gateway is a wasted opportunity.

The JavaScript code used here can be set up to do full or partial decoding of the DNS packets. That allows us to build an HTTP content cache for the DoH queries with Expires and Cache-Control headers set based on the minimum TTLs of the DNS responses.
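The TTL-to-cache-lifetime idea can be sketched in a few lines of Python. This is not the njs code used by the gateway; it is a simplified illustration, with hypothetical function names, that reads only the answer section of a DNS response and turns the smallest TTL into a Cache-Control value. It assumes a well-formed message and ignores negative caching (responses with no answers).

```python
import struct

def skip_name(msg: bytes, offset: int) -> int:
    """Return the offset just past the (possibly compressed) name at `offset`."""
    while True:
        length = msg[offset]
        if length == 0:                 # root label terminates the name
            return offset + 1
        if length & 0xC0 == 0xC0:       # 2-byte compression pointer ends the name
            return offset + 2
        offset += 1 + length            # ordinary label: length byte + label bytes

def min_answer_ttl(msg: bytes):
    """Smallest TTL among the answer records, or None if there are no answers."""
    _id, _flags, qdcount, ancount, _ns, _ar = struct.unpack("!HHHHHH", msg[:12])
    offset = 12
    for _ in range(qdcount):
        offset = skip_name(msg, offset) + 4          # skip QTYPE and QCLASS
    ttls = []
    for _ in range(ancount):
        offset = skip_name(msg, offset)
        _type, _class, ttl, rdlength = struct.unpack("!HHIH", msg[offset:offset + 10])
        ttls.append(ttl)
        offset += 10 + rdlength                      # fixed fields + RDATA
    return min(ttls) if ttls else None

def cache_control(msg: bytes):
    ttl = min_answer_ttl(msg)
    return None if ttl is None else f"max-age={ttl}"

# Sample response: two A records for www.example.com with TTLs 300 and 60
question = b"\x03www\x07example\x03com\x00" + struct.pack("!HH", 1, 1)

def a_record(ttl: int, address: bytes) -> bytes:
    # Name is a pointer (0xc00c) back to the question name at offset 12
    return b"\xc0\x0c" + struct.pack("!HHIH", 1, 1, ttl, 4) + address

response = (struct.pack("!HHHHHH", 0, 0x8180, 1, 2, 0, 0) + question
            + a_record(300, bytes([192, 0, 2, 1]))
            + a_record(60, bytes([192, 0, 2, 2])))
print(cache_control(response))  # max-age=60
```

Using the minimum TTL means the cached HTTP response never outlives the shortest-lived record it contains.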

A more complete example follows, with additional connection optimization, and support for content caching and logging:

http {
    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;

    log_format  dns  '$remote_addr - $remote_user [$time_local] "$request" '
                     '[ $msec, $request_time, $upstream_response_time $pipe ] '
                     '$status $body_bytes_sent "-" "-" "$http_x_forwarded_for" '
                     '$upstream_http_x_dns_question $upstream_http_x_dns_type '
                     '$upstream_http_x_dns_result '
                     '$upstream_http_x_dns_ttl $upstream_http_x_dns_answers '
                     '$upstream_cache_status';

    access_log  /var/log/nginx/doh-access.log dns;

    upstream dohloop {
        zone dohloop 64k;
        server 127.0.0.1:8053;
        keepalive_timeout 60s;
        keepalive_requests 100;
        keepalive 10;
    }

    proxy_cache_path /var/cache/nginx/doh_cache levels=1:2 keys_zone=doh_cache:10m;
  
    server {

        listen 443 ssl http2;
        ssl_certificate /etc/nginx/ssl/certs/doh.local.pem;
        ssl_certificate_key /etc/nginx/ssl/private/doh.local.pem;
        ssl_session_cache shared:ssl_cache:10m;
        ssl_session_timeout 10m;

        proxy_cache_methods GET POST;

        location / {
            return 404 "404 Not Found\n";
        }

        location /dns-query {
            proxy_http_version 1.1;
            proxy_set_header Connection "";
            proxy_cache doh_cache;
            proxy_cache_key $scheme$proxy_host$uri$is_args$args$request_body;
            proxy_pass http://dohloop;
        }
    }
}

stream {
    js_include /etc/nginx/njs.d/nginx_stream.js;

    # DNS upstream pool
    upstream dns {
        zone dns 64k;
        server 8.8.8.8:53;
    }

    # DNS over TLS upstream pool
    upstream dot {
        zone dot 64k;
        server 8.8.8.8:853;
    }

    # DNS over HTTPS (gateway) service
    # This time we’ve used a DoT upstream
    server {
        listen 127.0.0.1:8053;
        js_filter doh_filter_request;
        proxy_ssl on;
        proxy_pass dot;
    }
}
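One detail worth keeping in mind with this cache key is that it includes the request body, and the body of a DoH POST starts with the DNS message ID. RFC 8484 recommends that DoH clients use an ID of 0 for exactly this reason. The Python sketch below imitates the key string built by the proxy_cache_key directive above and hashes it with MD5 (which NGINX uses to derive cache file names); the helper names are hypothetical and the sketch is only meant to show the effect of the ID on cache hits.

```python
import hashlib
import struct

def build_query(qname: str, msg_id: int) -> bytes:
    """Minimal A-record query with the given DNS message ID."""
    header = struct.pack("!HHHHHH", msg_id, 0x0100, 1, 0, 0, 0)
    name = b"".join(bytes([len(p)]) + p.encode("ascii") for p in qname.split(".")) + b"\x00"
    return header + name + struct.pack("!HH", 1, 1)

def cache_key(body: bytes, uri: str = "/dns-query", args: str = "",
              scheme: str = "https", proxy_host: str = "dohloop") -> str:
    """Imitate $scheme$proxy_host$uri$is_args$args$request_body, then MD5 it."""
    is_args = "?" if args else ""
    key = f"{scheme}{proxy_host}{uri}{is_args}{args}".encode("ascii") + body
    return hashlib.md5(key).hexdigest()

# Two clients asking the same question with random IDs: different keys, no cache hit
random_ids_match = (cache_key(build_query("www.example.com", 1234))
                    == cache_key(build_query("www.example.com", 4321)))

# The same question with ID 0 from both clients: identical keys, cacheable
zero_ids_match = (cache_key(build_query("www.example.com", 0))
                  == cache_key(build_query("www.example.com", 0)))

print(random_ids_match, zero_ids_match)  # False True
```

If your clients do not send ID 0, one option is to normalize the ID in the JavaScript layer before the request reaches the cache, though that is a design choice beyond what the configuration above does.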

An Advanced DNS Filter Using NGINX Plus Features

If you have an NGINX Plus subscription, then you can combine the examples above with some of the more advanced NGINX Plus features, such as active health checks and high availability, or even use the cache purging API for managing cached DoH responses.

It’s also possible to use the NGINX Plus key‑value store to build a DNS filtering system that protects users from malicious domains by returning a DNS response that effectively prevents access to them. You manage the contents of the key‑value store dynamically with the RESTful NGINX Plus API.

We define two categories of malicious domains, “blocked” and “blackhole”. The DNS filtering system handles a DNS query about a domain differently depending on its category:

  • For blocked domains, it returns the NXDOMAIN response (indicating the domain does not exist)
  • For blackhole domains, it returns a single DNS A record of 0.0.0.0 in response to requests for A records, or a single AAAA record of :: in response to requests for AAAA (IPv6) records; for other record types, it returns a response containing zero answers
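The two response types can be sketched in Python as follows. This is an illustration rather than the gateway's njs code: the function name is hypothetical, the 300-second TTL and the authoritative-answer flag are assumptions borrowed from the sample dig output later in this post, and the query is assumed to contain a single uncompressed question.

```python
import struct

def scrub_response(query: bytes, action: str) -> bytes:
    """Build a reply to `query` for a 'blocked' or 'blackhole' domain."""
    msg_id, flags = struct.unpack("!HH", query[:4])
    # Walk the labels of the question name to find where it ends
    offset = 12
    while query[offset] != 0:
        offset += 1 + query[offset]
    offset += 1
    qtype, _qclass = struct.unpack("!HH", query[offset:offset + 4])
    question = query[12:offset + 4]

    # QR=1 (response), AA=1 (assumed), and copy the client's RD bit
    resp_flags = 0x8000 | 0x0400 | (flags & 0x0100)
    answer = b""
    if action == "blocked":
        resp_flags |= 3                               # RCODE 3 = NXDOMAIN
    elif action == "blackhole":
        # A -> 0.0.0.0, AAAA -> ::, anything else -> zero answers
        rdata = {1: bytes(4), 28: bytes(16)}.get(qtype)
        if rdata is not None:
            answer = (b"\xc0\x0c"                     # pointer to the question name
                      + struct.pack("!HHIH", qtype, 1, 300, len(rdata)) + rdata)
    else:
        raise ValueError(f"unknown action: {action}")

    header = struct.pack("!HHHHHH", msg_id, resp_flags, 1, 1 if answer else 0, 0, 0)
    return header + question + answer

# Sample query: A record for www.foo.com
query = (struct.pack("!HHHHHH", 58558, 0x0100, 1, 0, 0, 0)
         + b"\x03www\x03foo\x03com\x00" + struct.pack("!HH", 1, 1))
print(len(scrub_response(query, "blackhole")))  # 45
```

The 45-byte blackhole reply for www.foo.com is the same size as the message reported by dig in the test at the end of this post.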

When our DNS filter receives a request, it first checks for a key matching the exact queried FQDN. If it finds one, it “scrubs” (blocks or blackholes) the request as specified by the associated value. If there’s no exact match, it looks for the domain name in two lists – one of blocked domains and the other of blackholed domains – and scrubs the request in the appropriate way if there’s a match.

If the queried domain is not malicious, the system forwards it to a Google DNS server for regular processing.
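The lookup order described above can be expressed as a short Python function. This is a sketch of the logic only, using a plain dictionary in place of the key‑value store; the function name is hypothetical, and checking the blocked list before the blackhole list is an assumption.

```python
def scrub_action(qname: str, store: dict):
    """Return 'blocked', 'blackhole', or None for a queried name."""
    name = qname.rstrip(".").lower()

    # 1. Exact FQDN match takes priority
    exact = store.get(name)
    if exact in ("blocked", "blackhole"):
        return exact

    # 2. Otherwise look for the name in, or underneath, a listed domain
    for action, key in (("blocked", "blocked_domains"),
                        ("blackhole", "blackhole_domains")):
        domains = [d.strip().lower() for d in store.get(key, "").split(",") if d.strip()]
        if any(name == d or name.endswith("." + d) for d in domains):
            return action

    # 3. Not malicious: the query is forwarded upstream
    return None

store = {
    "www.some.bad.host": "blocked",
    "www.some.other.host": "blackhole",
    "blocked_domains": "bar.com,baz.com",
    "blackhole_domains": "foo.com,nginx.com",
}
print(scrub_action("www.foo.com", store))        # blackhole
print(scrub_action("www.some.bad.host", store))  # blocked
print(scrub_action("www.example.org", store))    # None
```

Matching on "." + domain rather than a bare suffix avoids false positives such as notfoo.com being treated as part of foo.com.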

Setting Up the Key-Value Store

We define the domains we consider malicious in two kinds of entries in the NGINX Plus key‑value store:

  • Individual entries for fully qualified domain names (FQDNs), each with a value of either blocked or blackhole
  • Two lists of domains, with keys called blocked_domains and blackhole_domains, each mapped to a list of domains in comma‑separated value (CSV) format

The following configuration for the key‑value store goes in the stream context. The keyval_zone directive allocates a block of memory for the key‑value store, called dns_config. The first keyval directive loads in any matching FQDN key‑value pair which has been set, while the second and third define the two lists of domains:

# Key-value store for blocking domains (NGINX Plus only)
keyval_zone zone=dns_config:64k state=/etc/nginx/zones/dns_config.zone;
keyval $dns_qname $scrub_action zone=dns_config;
keyval "blocked_domains" $blocked_domains zone=dns_config;
keyval "blackhole_domains" $blackhole_domains zone=dns_config;

We can then use the NGINX Plus API to assign the value blocked or blackhole to any FQDN key that we explicitly want to scrub, or modify the CSV‑formatted list associated with the blocked_domains or blackhole_domains key. We can modify or remove exact FQDNs, or modify the set of domains in either list, at any time and the DNS filter is instantly updated with the changes.
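As a rough illustration of those API calls, the Python sketch below builds (but does not send) the HTTP requests. The endpoint URL is the one used in the test later in this post; the helper name is hypothetical. It relies on my understanding of the NGINX Plus key‑value API, in which POST adds new keys, PATCH changes existing ones, and a PATCH with a null value removes a key.

```python
import json
import urllib.request

API = "http://localhost:8080/api/5/stream/keyvals/dns_config"

def keyval_request(method: str, entries: dict, url: str = API) -> urllib.request.Request:
    """Build a JSON request against the key-value zone; nothing is sent here."""
    body = json.dumps(entries).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method=method,
        headers={"Content-Type": "application/json"},
    )

# Add a new FQDN entry
add_fqdn = keyval_request("POST", {"www.some.bad.host": "blocked"})

# Replace the CSV list stored under an existing key
update_list = keyval_request("PATCH", {"blocked_domains": "bar.com,baz.com"})

# Remove an FQDN entry by setting its value to null
remove_fqdn = keyval_request("PATCH", {"www.some.bad.host": None})

print(add_fqdn.get_method(), add_fqdn.data.decode())
# POST {"www.some.bad.host": "blocked"}
```

Passing any of these objects to urllib.request.urlopen() would send the request, assuming the API is enabled and reachable at that address.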

Choosing the Upstream Server to Handle the Request

The following config loads the $dns_response variable, which is populated by a js_preread directive in the server blocks defined in Configuring the Servers that Listen for DNS Queries below. When the invoked JavaScript code determines that the request needs to be scrubbed, it sets the variable to blocked or blackhole as appropriate.

We use a map directive to assign the value of the $dns_response variable to the $upstream_pool variable, thereby controlling which of the upstream groups in Defining the Upstream Servers below handles the request. Queries for non‑malicious domains get forwarded to the default Google DNS server, while queries for blocked or blackholed domains are handled by the upstream server for those categories.

# The DNS response packet; if we're scrubbing the domain, this gets set
js_set $dns_response dns_get_response;

# Set upstream to the Google DNS server if $dns_response is empty, otherwise
# to 'blocked' or 'blackhole'
map $dns_response $upstream_pool {
    "blocked" blocked;
    "blackhole" blackhole;
    default google;
}

Defining the Upstream Servers

This config defines the blocked, blackhole, and google groups of upstream servers, which handle requests for blocked, blackhole, and non‑malicious domains respectively.

# Upstream pool for blocked requests
upstream blocked {
    zone blocked 64k;
    server 127.0.0.1:9953;
}

# Upstream pool for blackholed requests
upstream blackhole {
    zone blackhole 64k;
    server 127.0.0.1:9853;
}

# Upstream pool for standard (Google) DNS
upstream google {
    zone dns 64k;
    server 8.8.8.8:53;
}

Configuring the Servers that Listen for DNS Queries

Here we define the servers in the stream context that listen for incoming DNS requests. The js_preread directive invokes the JavaScript code that decodes the DNS packet, retrieves the domain name from the packet’s NAME field, and looks up the domain in the key‑value store. If the domain name matches an FQDN key or is within the zone of one of the domains in either the blocked_domains or blackhole_domains list, it gets scrubbed. Otherwise it is sent to the Google DNS server for resolution.

# DNS (TCP) server
server {
    listen 53;
    js_preread dns_preread_dns_request;
    proxy_pass $upstream_pool;
}

# DNS (UDP) server
server {
    listen 53 udp;
    js_preread dns_preread_dns_request;
    proxy_responses 1;
    proxy_pass $upstream_pool;
}
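The decoding step performed during the preread phase can be illustrated in Python. This is a simplified stand-in for the njs code, with a hypothetical function name: it extracts the queried name and record type from the first question of a DNS message. For the TCP listener, the two-byte length prefix would need to be stripped first.

```python
import struct

def decode_question(msg: bytes):
    """Return (qname, qtype) from the first question of a DNS message."""
    offset = 12                      # the question follows the 12-byte header
    labels = []
    while True:
        length = msg[offset]
        if length == 0:              # zero-length label marks the end of the name
            offset += 1
            break
        if length & 0xC0:            # questions should not use compression pointers
            raise ValueError("unexpected label type in question name")
        labels.append(msg[offset + 1:offset + 1 + length].decode("ascii"))
        offset += 1 + length
    qtype, _qclass = struct.unpack("!HH", msg[offset:offset + 4])
    return ".".join(labels).lower(), qtype

# Sample UDP payload: A-record query for www.foo.com
packet = (struct.pack("!HHHHHH", 58558, 0x0100, 1, 0, 0, 0)
          + b"\x03www\x03foo\x03com\x00" + struct.pack("!HH", 1, 1))
print(decode_question(packet))  # ('www.foo.com', 1)
```

The decoded name is what gets looked up in the key‑value store, and the record type determines which kind of blackhole answer, if any, is appropriate.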

That’s almost it – we just need to add our final server block, which responds to queries with either the blackholed or blocked response.

# Server for responding to blocked/blackholed requests
server {
    listen 127.0.0.1:9953;
    listen 127.0.0.1:9853;
    listen 127.0.0.1:9953 udp;
    listen 127.0.0.1:9853 udp;
    js_preread dns_preread_dns_request;
    return $dns_response;
}

Note that the servers in this block are almost identical to the actual DNS servers just above them, the main difference being that these servers send a response packet to the client instead of forwarding the request to an upstream group of DNS servers. Our JavaScript code detects when it is being run from port 9953 or 9853, and instead of setting a flag to indicate that the packet should be blocked, it populates $dns_response with the actual response packet. That’s all there is to it.

Of course, we could also have applied this filtering to DoT and DoH services, but we used standard DNS in order to keep things simple. Merging the DNS filtering with the DoH gateway is left as an exercise for the reader.

Testing the Filter

We have previously used the NGINX Plus API to add entries for two FQDNs, and some domains to the two lists of domains:

$ curl -s http://localhost:8080/api/5/stream/keyvals/dns_config | jq
{
   "www.some.bad.host": "blocked",
   "www.some.other.host": "blackhole",
   "blocked_domains": "bar.com,baz.com",
   "blackhole_domains": "foo.com,nginx.com"
}

So when we request resolution of www.foo.com (a host within the blackholed foo.com domain), we get an A record with IP address 0.0.0.0:

$ dig @localhost www.foo.com

; <<>> DiG 9.11.3-1ubuntu1.9-Ubuntu <<>> @localhost www.foo.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 58558
;; flags: qr aa rd ad; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available

;; QUESTION SECTION:
; www.foo.com.                  IN      A

;; ANSWER SECTION:
www.foo.com.           300      IN      A      0.0.0.0

;; Query time: 0 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Mon Dec 2 14:31:35 UTC 2019
;; MSG SIZE  rcvd: 45

Accessing the Code

The JavaScript code and NGINX configuration files discussed in this blog are available in my GitHub repo, in the examples folder.
