When Every Millisecond Counts, Team Internet Uses NGINX Plus on AWS

NGINX Plus Enables Autoscaling on Every Tier with Automation and APM Integration

 


Situation

Team Internet is a leading provider of services in the online advertising space. The company offers a variety of ad types that match advertisers with online publishers in order to provide high value, targeted traffic for its advertisers and relevant, useful content for Internet users.

One of Team Internet’s products is TONIC., a traffic marketplace that enables advertisers to drive highly relevant traffic to their websites. Advertisers are able to choose from a range of detailed targeting options to ensure maximum relevancy of their ads. Despite the complexity, the whole process from bidding to ad delivery needs to happen in 100-200 milliseconds. Top performance of the platform is critical in order to satisfy advertisers – who are Team Internet’s customers – and Internet users themselves.

We are very, very latency sensitive. We need to answer requests very fast, and we often receive more than 25,000 requests per second. Each one of those needs to be answered within 100-200 milliseconds. For us, every millisecond counts.
– Markus Ostertag, Head of Development (Advertiser Products) at Team Internet

Team Internet started out using open source NGINX hosted on AWS to manage its TONIC. traffic. For several years, this worked well. The team compiled and packaged its own open source NGINX software and a separate third-party plug-in for health checks. However, compiling a customized version of open source NGINX was becoming time consuming. In addition, the third-party plug-in they had been using was not available for the latest version of NGINX.

“We wanted to upgrade to the latest version of open source NGINX to be on the most secure version available, but there was no patch available for the third-party health check plug-in that we had been compiling. We could have written it ourselves, but that would involve a lot of work, which costs money, and we could run the risk of introducing potential security issues with our own patch. Or we could look into NGINX Plus, which is tested and packaged by engineers who work at NGINX, has health checks built in, and is cheaper than creating and maintaining our own open source distribution,” explains Ostertag.

Around the same time, adoption of the TONIC. platform was growing. Onboarding a new publisher could mean adding – within minutes – several thousand requests per second to its traffic load, sometimes requiring a sudden increase in backend server capacity. To achieve the necessary flexibility, Team Internet needed to be able to autoscale its NGINX instances along with the rest of its infrastructure.

“We came to a point where we needed to increase the number of NGINX load balancers we were running and autoscale them. In order to autoscale, we needed a centralized way of adding and removing our backend servers. I tried to find out how this would be possible to do without needing to restart or reload the whole nginx process. Then I stumbled upon the DNS service discovery feature of NGINX Plus that would enable us to autoscale easily and help us reach our goals, so we decided to take a deeper look,” notes Ostertag.

Solution

Team Internet started a free 30-day evaluation of NGINX Plus. NGINX Plus is a complete application delivery platform with health checks, autoscaling support, and additional enterprise-ready features, all in one easy-to-deploy software package. Team Internet generated a new Amazon Machine Image (AMI) and ran two instances of NGINX Plus in parallel with their open source instances.

Within just a few days, the NGINX Plus deployment on AWS was processing more requests, more efficiently, than Team Internet’s old architecture based on open source NGINX.

The combination of having NGINX Plus already neatly packaged with the features we need, having the better way of doing health checks against backends, having the ability to easily autoscale, and having better performance was powerful. These features helped convince my CTO to purchase NGINX Plus.
– Markus Ostertag, Head of Development (Advertiser Products) at Team Internet

Now, Team Internet uses NGINX Plus in conjunction with Amazon Elastic Load Balancer (ELB), Datadog, and AWS Lambda to autoscale and deliver its Node.js application. The development team set up NGINX Plus servers as an autoscaling group connected to ELB. If there is a spike in traffic, ELB will detect this and automatically bring up new NGINX Plus instances to handle additional load. When the traffic spike dies down, the additional instances are destroyed to save Team Internet money.

NGINX Plus also helps Team Internet determine when to scale backend instances up and down by providing information about traffic flow via its extended status API. The extended status API provides a wealth of metrics about the performance of backend instances. Team Internet feeds this data into Datadog using the pre-built plugin made available by the NGINX team. When traffic load passes a certain threshold, an alert is triggered and additional Node.js servers are started to handle the traffic. Just as with the NGINX Plus tier, when the traffic level dies down, the additional instances are gracefully removed to help reduce costs.
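The threshold-based scaling described above can be sketched roughly as follows. This is a minimal illustration and not Team Internet’s actual automation: the `Sample` shape, the per-server capacity figure, the minimum and maximum bounds, and all function names are assumptions made for the example. It assumes a monitoring system supplies two readings of a cumulative request counter, derives a request rate from them, and converts that rate into a target number of backend servers.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One reading of a cumulative request counter (hypothetical shape)."""
    timestamp: float      # seconds
    total_requests: int   # requests served since the counter started


def requests_per_second(previous: Sample, current: Sample) -> float:
    """Derive the average request rate between two counter readings."""
    elapsed = current.timestamp - previous.timestamp
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    return (current.total_requests - previous.total_requests) / elapsed


def desired_backends(rate: float, per_server_capacity: float,
                     minimum: int = 2, maximum: int = 50) -> int:
    """Backend servers needed for the given rate, clamped to [minimum, maximum]."""
    needed = math.ceil(rate / per_server_capacity)
    return max(minimum, min(maximum, needed))


if __name__ == "__main__":
    # Assumed numbers: 250,000 requests in 10 seconds, 2,000 req/s per server.
    before = Sample(timestamp=0.0, total_requests=1_000_000)
    after = Sample(timestamp=10.0, total_requests=1_250_000)
    rate = requests_per_second(before, after)
    print(rate, desired_backends(rate, per_server_capacity=2_000))
```

Clamping to a minimum and maximum keeps a faulty reading (for example, a counter reset) from scaling the tier to zero or launching an unbounded number of instances.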

Ostertag adds, “If ELB tells us that it has a queue of requests, we scale up NGINX Plus and work against this queue. If NGINX Plus tells us that it needs more resources, then we scale out the backend servers. We are scaling on every tier.”

Results

> Autoscaling with On-the-Fly Reconfiguration

With traffic flows that can vary widely from one day to the next, the ability to scale up and down automatically is essential to Team Internet’s business.

By upgrading to NGINX Plus and taking advantage of the on-the-fly reconfiguration feature, Team Internet can dynamically add and remove backend instances from its load-balanced server groups. Specifically, Team Internet uses the information in DNS A records to change the configuration of upstream server groups. The benefit is that the development team can manage the servers in the upstream group without modifying the NGINX Plus configuration directly, which allows them to autoscale easily and without manual intervention. Best of all, setting up the autoscaling infrastructure with NGINX Plus took just two hours of work time.
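The idea of driving an upstream group from DNS A records can be illustrated with a short sketch. NGINX Plus performs this internally, so the code below is not its implementation; it only shows the reconciliation step conceptually. The function names and the example addresses are hypothetical.

```python
from __future__ import annotations

import socket


def resolve_a_records(hostname: str, port: int = 80) -> set[str]:
    """Return the IPv4 addresses that DNS currently lists for hostname."""
    infos = socket.getaddrinfo(hostname, port, family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, (address, port)).
    return {info[4][0] for info in infos}


def reconcile(current: set[str], resolved: set[str]) -> tuple[set[str], set[str]]:
    """Compute which servers to add to and remove from the upstream group."""
    to_add = resolved - current
    to_remove = current - resolved
    return to_add, to_remove


if __name__ == "__main__":
    upstream = {"10.0.0.1", "10.0.0.2"}
    # Stand-in for resolve_a_records("backend.example.internal"), an assumed name.
    dns_answer = {"10.0.0.2", "10.0.0.3"}
    add, remove = reconcile(upstream, dns_answer)
    print(sorted(add), sorted(remove))
```

Because DNS is treated as the single source of truth, an autoscaling system only has to update the A records when instances start or stop; the load balancer configuration itself never needs to be edited.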

“The on-the-fly reconfiguration feature of NGINX Plus is really huge for us. It gives us the flexibility we want and need to have at our load-balancing layer in order to support our business,” notes Ostertag.

Figure: Autoscaling with NGINX Plus on AWS. Team Internet autoscales on every tier.

> Saving Time with Built-In Application Health Checks

With NGINX Plus, Team Internet enjoys built-in, intelligent application health checks. Having the health checks built in saves Team Internet time, and the NGINX Plus checks also work better than the open source health check patch it used in the past.

“With NGINX Plus, we don’t have the hassle of compiling our own NGINX version from source. It’s time consuming, and someone has to manage it. We try to keep overhead on operations very low. With NGINX Plus, we no longer need to spend our time and resources building and customizing NGINX because we have a full-featured product with everything we need built in and the experience and support we need from the experts at NGINX,” says Ostertag.

> Up and Running in Under 6 Hours

When switching to NGINX Plus, Team Internet’s developers wanted full control over their AWS deployment, so they installed NGINX Plus on their own AMI rather than using the prebuilt NGINX Plus AMI. Even so, the entire process took no more than six hours.

“We spent a few hours generating a new AMI and installing it. This was very easy because of the great documentation available from the NGINX team on how to install it. After following the instructions, I started up my own AMI and it worked out of the box, with even better performance than our previous setup. After our testing period, we switched over from the evaluation license to the NGINX Plus licenses we purchased, and then set up our autoscaling group which took another couple of hours. So to switch over the whole thing – new AMI with NGINX Plus, new configuration, new licensing, and autoscaling – in total took no more than four to six effective working hours,” explains Ostertag.

> Great Out of the Box Performance

The TONIC. platform brings together 30+ publishers who are selling traffic with more than 15,000 internal advertisers, who together run 100,000+ campaigns and execute many millions of different bids. Behind the scenes, a lot of server-to-server calls and requests to databases are needed to make this happen. With NGINX Plus in its environment, Team Internet can process more requests per server than before.

“We’re happy with how NGINX Plus performs for us. With NGINX Plus doing the heavy lifting for us, we can focus on other parts of the business, like developing new features, monitoring our systems, and finding ways to continue improving and outperforming the competition.”

 


About Team Internet

Team Internet was founded in 2010 with the belief that Ideas. Change. Markets. Located in Munich, Germany, Team Internet is a leading provider of online advertising services. The company’s products currently reach more than 1 billion visitors globally per month.

With an estimated market size of US $1 billion per year, direct navigation traffic has a 4.23% conversion-to-sale rate, significantly outperforming other traffic sources such as search engines (2.30%) and Internet links (0.96%). Direct navigation is the preferred way for users to locate a site for the first time, which makes this traffic source a prime vehicle for user acquisition.

For more information, visit http://www.teaminternet.com/
