NGINX at WordPress.com

Scaling to Tens of Thousands of Simultaneous Connections

WordPress.com is the cloud version of WordPress that is hosted and supported by Automattic.

WordPress.com serves more than 33 million sites, attracting over 339 million people who view 3.4 billion pages each month. Since April 2008, WordPress.com page views have grown about 4.4 times. WordPress.com VIP hosts many popular sites including CNN’s Political Ticker, the NFL, Time Inc’s The Page, People Magazine’s Style Watch, corporate blogs for Flickr and KROQ, and many more. Automattic operates 2,000 servers in 12 globally distributed data centers. WordPress.com customer data is instantly replicated across different locations to provide an extremely reliable and fast web experience for hundreds of millions of visitors.

Problem

WordPress.com, which began in 2005, started on shared hosting, much like all of the WordPress.org sites. It was soon moved to a single dedicated server and then to two servers. In late 2005, WordPress.com opened to the public and by early 2006 had expanded to four web servers, with traffic being distributed using round‑robin DNS. Soon thereafter WordPress.com expanded to a second data center and then to a third. It quickly became apparent that round‑robin DNS wasn’t a viable long‑term solution.

While hardware appliances like F5 BIG‑IP offered many features that WordPress.com required, Automattic decided to evaluate different options built on existing open source software. Running open source software on commodity hardware offers a high degree of flexibility and also saves money.

Purchasing a pair of capable hardware appliances in a failover configuration for a single data center is already somewhat expensive, and purchasing and servicing 10 sets for 10 data centers soon becomes very expensive.

At first, the WordPress.com team chose Pound as a software load balancer because of its ease of use and built‑in SSL support. After using Pound for about two years, WordPress.com required additional functionality and scalability, namely:

  • On‑the‑fly reconfiguration capabilities, without interruption to live traffic.
  • Better health‑check mechanisms, enabling smooth and gradual recovery from a backend failure, without overloading the application infrastructure with an unexpected flood of requests.
  • Better scalability as measured both in requests per second and the number of concurrent connections. Pound’s thread‑based model wasn’t able to reliably handle over 1,000 requests per second per instance.
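To make the "smooth and gradual recovery" requirement concrete, here is a minimal sketch of one common way to achieve it: ramping a recovered backend's weight up linearly instead of restoring full traffic at once. This is an illustration only, not Automattic's or NGINX's implementation; the names `Backend`, `effective_weight`, `traffic_share`, and the 30-second ramp are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Backend:
    """Hypothetical backend record used only for this illustration."""
    name: str
    weight: int                           # full weight when healthy
    recovered_at: Optional[float] = None  # time it came back after a failure


def effective_weight(backend: Backend, now: float, ramp_seconds: float = 30.0) -> float:
    """Ramp the weight linearly from 0 to full over ramp_seconds after recovery."""
    if backend.recovered_at is None:
        return float(backend.weight)      # never failed: full weight
    elapsed = max(now - backend.recovered_at, 0.0)
    if elapsed >= ramp_seconds:
        return float(backend.weight)      # ramp finished: full weight again
    return backend.weight * elapsed / ramp_seconds


def traffic_share(backends: List[Backend], now: float,
                  ramp_seconds: float = 30.0) -> Dict[str, float]:
    """Fraction of new requests each backend would receive at time `now`."""
    weights = {b.name: effective_weight(b, now, ramp_seconds) for b in backends}
    total = sum(weights.values())
    return {name: (w / total if total else 0.0) for name, w in weights.items()}


if __name__ == "__main__":
    pool = [Backend("web1", 10), Backend("web2", 10, recovered_at=100.0)]
    for t in (100.0, 115.0, 130.0):
        print(t, traffic_share(pool, now=t))
```

In this sketch, a backend that has just recovered receives no new traffic at first, a third of it halfway through the ramp, and an equal share once the ramp completes, so its caches and connection pools can warm up before it carries full load.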

Solution

In April 2008, Automattic converted all WordPress.com load balancers from Pound to NGINX. Automattic engineers had already been using NGINX for Gravatar for a few months and were impressed by its performance and scalability, so moving WordPress.com over was the natural next step. Before switching WordPress.com to NGINX, Automattic evaluated several other products, including HAProxy and LVS. Here are some of the reasons why NGINX was chosen:

  • Easy, flexible and logical configuration.
  • Ability to reconfigure and upgrade NGINX instances on‑the‑fly, without dropping user requests.
  • Application request routing via FastCGI, uwsgi, or SCGI protocols; NGINX can also serve static content directly from storage for additional performance optimization.
  • The only software tested that was capable of reliably handling over 10,000 requests per second of live traffic to WordPress applications from a single server.
  • NGINX’s memory and CPU footprints are minimal and predictable. After switching to NGINX the CPU usage on the load balancing servers dropped by a factor of three.

Results

As of 2012, WordPress.com was serving an average of 70,000 requests per second and over 15 Gbps of traffic from its 36 NGINX load balancers, with plenty of room to grow. Most of the NGINX load balancers serve about 5,000 requests per second, sometimes peaking at 20,000 requests per second, and have about 50,000 established connections.
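As a rough back‑of‑the‑envelope reading of these figures, the sketch below derives the average amount of traffic per request and the ratio of peak to typical load on a balancer. It assumes "over 15 Gbps" is treated as exactly 15 × 10^9 bits per second and that all of that traffic is attributable to the 70,000 requests per second; the function names are hypothetical and the results are estimates, not measured values.

```python
def avg_bytes_per_request(bits_per_second: float, requests_per_second: float) -> float:
    """Average bytes transferred per request, given aggregate rates."""
    return bits_per_second / 8 / requests_per_second


def peak_to_typical_ratio(peak_rps: float, typical_rps: float) -> float:
    """How many times the typical request rate a balancer absorbs at peak."""
    return peak_rps / typical_rps


if __name__ == "__main__":
    # Figures quoted in the text; 15 Gbps is a lower bound ("over 15 Gbps").
    per_request = avg_bytes_per_request(15e9, 70_000)
    print(f"about {per_request / 1000:.1f} kB per request")
    print(f"peak is {peak_to_typical_ratio(20_000, 5_000):.0f}x typical load")
```

Under those assumptions the average works out to roughly 27 kB per request, and a balancer at its occasional peak handles about four times its usual request rate.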

A typical hardware configuration is Dual Xeon 5620 8‑core CPUs with hyper‑threading, 8–12 GB of RAM, running Debian Linux 6.0. As part of its high availability setup, WordPress.com previously used Wackamole/Spread but has recently started to migrate to Keepalived. Inbound requests are distributed evenly across the NGINX‑based web‑acceleration and load‑balancing layer using round‑robin DNS.

Following a successful deployment of NGINX as a web acceleration, load balancing, and traffic management solution, WordPress.com recently migrated from LiteSpeed to NGINX across all application backend servers.

NGINX combined with the FastCGI Process Manager (FPM) for PHP allows greater control, easier configuration, and no additional maintenance overhead for the 5‑member Automattic Systems Team.

References