
Building Microservices: Using an API Gateway

Editor – This seven‑part series of articles is now complete:

  1. Introduction to Microservices
  2. Building Microservices: Using an API Gateway (this article)
  3. Building Microservices: Inter‑Process Communication in a Microservices Architecture
  4. Service Discovery in a Microservices Architecture
  5. Event-Driven Data Management for Microservices
  6. Choosing a Microservices Deployment Strategy
  7. Refactoring a Monolith into Microservices

You can also download the complete set of articles, plus information about implementing microservices using NGINX Plus, as an ebook – Microservices: From Design to Deployment. Also, please look at the new Microservices Solutions page.

The first article in this seven‑part series about designing, building, and deploying microservices introduced the Microservices Architecture pattern. It discussed the benefits and drawbacks of using microservices and how, despite the complexity of microservices, they are usually the ideal choice for complex applications. This is the second article in the series and will discuss building microservices using an API Gateway.

When you choose to build your application as a set of microservices, you need to decide how your application’s clients will interact with the microservices. With a monolithic application there is just one set of (typically replicated, load‑balanced) endpoints. In a microservices architecture, however, each microservice exposes a set of what are typically fine‑grained endpoints. In this article, we examine how this impacts client‑to‑application communication and propose an approach that uses an API Gateway.

Introduction

Let’s imagine that you are developing a native mobile client for a shopping application. It’s likely that you need to implement a product details page, which displays information about any given product.

For example, the following diagram shows what you will see when scrolling through the product details in Amazon’s Android mobile application.

Indexed elements of Amazon's mobile app for Android, as they appear on a mobile phone screen

Even though this is a smartphone application, the product details page displays a lot of information. For example, not only is there basic product information (such as name, description, and price) but this page also shows:

  • Number of items in the shopping cart
  • Order history
  • Customer reviews
  • Low inventory warning
  • Shipping options
  • Various recommendations, including other products this product is frequently bought with, other products bought by customers who bought this product, and other products viewed by customers who bought this product
  • Alternative purchasing options

When using a monolithic application architecture, a mobile client would retrieve this data by making a single REST call (GET api.company.com/productdetails/productId) to the application. A load balancer routes the request to one of N identical application instances. The application would then query various database tables and return the response to the client.

In contrast, when using the microservices architecture the data displayed on the product details page is owned by multiple microservices. Here are some of the potential microservices that own data displayed on the example product details page:

  • Shopping Cart Service – Number of items in the shopping cart
  • Order Service – Order history
  • Catalog Service – Basic product information, such as its name, image, and price
  • Review Service – Customer reviews
  • Inventory Service – Low inventory warning
  • Shipping Service – Shipping options, deadlines, and costs drawn separately from the shipping provider’s API
  • Recommendation Service(s) – Suggested items

Mobile client of ecommerce app needs a way to access the RESTful APIs of the 7 microservices

We need to decide how the mobile client accesses these services. Let’s look at the options.

Direct Client‑to‑Microservice Communication

In theory, a client could make requests to each of the microservices directly. Each microservice would have a public endpoint (https://serviceName.api.company.name). This URL would map to the microservice’s load balancer, which distributes requests across the available instances. To retrieve the product details, the mobile client would make requests to each of the services listed above.

Unfortunately, there are challenges and limitations with this option. One problem is the mismatch between the needs of the client and the fine‑grained APIs exposed by each of the microservices. The client in this example has to make seven separate requests. In more complex applications it might have to make many more. For example, Amazon describes how hundreds of services are involved in rendering their product page. While a client could make that many requests over a LAN, it would probably be too inefficient over the public Internet and would definitely be impractical over a mobile network. This approach also makes the client code much more complex.

Another problem with the client directly calling the microservices is that some might use protocols that are not web‑friendly. One service might use Thrift binary RPC while another service might use the AMQP messaging protocol. Neither protocol is particularly browser‑ or firewall‑friendly, and both are best suited to internal use. An application should use protocols such as HTTP and WebSocket outside of the firewall.

Another drawback with this approach is that it makes it difficult to refactor the microservices. Over time we might want to change how the system is partitioned into services. For example, we might merge two services or split a service into two or more services. If, however, clients communicate directly with the services, then performing this kind of refactoring can be extremely difficult.

Because of these kinds of problems it rarely makes sense for clients to talk directly to microservices.

Using an API Gateway

Usually a much better approach is to use what is known as an API Gateway. An API Gateway is a server that is the single entry point into the system. It is similar to the Facade pattern from object‑oriented design. The API Gateway encapsulates the internal system architecture and provides an API that is tailored to each client. It might have other responsibilities such as authentication, monitoring, load balancing, caching, request shaping and management, and static response handling.

The following diagram shows how an API Gateway typically fits into the architecture:

An API gateway enables mobile clients of ecommerce app to access the RESTful APIs of its 7 microservices

The API Gateway is responsible for request routing, composition, and protocol translation. All requests from clients first go through the API Gateway. It then routes requests to the appropriate microservice. The API Gateway will often handle a request by invoking multiple microservices and aggregating the results. It can translate between web protocols such as HTTP and WebSocket and web‑unfriendly protocols that are used internally.

The API Gateway can also provide each client with a custom API. It typically exposes a coarse‑grained API for mobile clients. Consider, for example, the product details scenario. The API Gateway can provide an endpoint (/productdetails?productid=xxx) that enables a mobile client to retrieve all of the product details with a single request. The API Gateway handles the request by invoking the various services – product info, recommendations, reviews, etc. – and combining the results.
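
To make this concrete, here is a minimal sketch in Java of what such an aggregating endpoint might look like inside the gateway. The client interfaces and value types are hypothetical placeholders rather than any particular framework’s API; the point is simply that one coarse‑grained request fans out to several backend services and comes back as a single combined result. (The calls run sequentially here for clarity; the reactive model discussed below shows how to issue the independent ones concurrently.)

import java.util.List;

// Hypothetical client interfaces, one per backend microservice.
interface ProductInfoClient    { ProductInfo findById(String productId); }
interface RecommendationClient { List<Recommendation> forProduct(String productId); }
interface ReviewClient         { List<Review> forProduct(String productId); }

// Hypothetical value types carried in the combined response.
class ProductInfo {}
class Recommendation {}
class Review {}

class ProductDetails {
    final ProductInfo info;
    final List<Recommendation> recommendations;
    final List<Review> reviews;

    ProductDetails(ProductInfo info, List<Recommendation> recommendations, List<Review> reviews) {
        this.info = info;
        this.recommendations = recommendations;
        this.reviews = reviews;
    }
}

// Gateway handler for GET /productdetails?productid=xxx: one client request
// fans out to several backend services and returns a single merged payload.
class ProductDetailsHandler {
    private final ProductInfoClient products;
    private final RecommendationClient recommendations;
    private final ReviewClient reviews;

    ProductDetailsHandler(ProductInfoClient products,
                          RecommendationClient recommendations,
                          ReviewClient reviews) {
        this.products = products;
        this.recommendations = recommendations;
        this.reviews = reviews;
    }

    ProductDetails getProductDetails(String productId) {
        return new ProductDetails(
            products.findById(productId),
            recommendations.forProduct(productId),
            reviews.forProduct(productId));
    }
}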

A great example of an API Gateway is the Netflix API Gateway. The Netflix streaming service is available on hundreds of different kinds of devices including televisions, set‑top boxes, smartphones, gaming systems, tablets, etc. Initially, Netflix attempted to provide a one‑size‑fits‑all API for their streaming service. However, they discovered that it didn’t work well because of the diverse range of devices and their unique needs. Today, they use an API Gateway that provides an API tailored for each device by running device‑specific adapter code. An adapter typically handles each request by invoking on average six to seven backend services. The Netflix API Gateway handles billions of requests per day.

Benefits and Drawbacks of an API Gateway

As you might expect, using an API Gateway has both benefits and drawbacks. A major benefit of using an API Gateway is that it encapsulates the internal structure of the application. Rather than having to invoke specific services, clients simply talk to the gateway. The API Gateway provides each kind of client with a specific API. This reduces the number of round trips between the client and application. It also simplifies the client code.

The API Gateway also has some drawbacks. It is yet another highly available component that must be developed, deployed, and managed. There is also a risk that the API Gateway becomes a development bottleneck. Developers must update the API Gateway in order to expose each microservice’s endpoints. It is important that the process for updating the API Gateway be as lightweight as possible. Otherwise, developers will be forced to wait in line in order to update the gateway. Despite these drawbacks, however, for most real‑world applications it makes sense to use an API Gateway.

Implementing an API Gateway

Now that we have looked at the motivations and the trade‑offs for using an API Gateway, let’s look at various design issues you need to consider.

Performance and Scalability

Only a handful of companies operate at the scale of Netflix and need to handle billions of requests per day. However, for most applications the performance and scalability of the API Gateway is usually very important. It makes sense, therefore, to build the API Gateway on a platform that supports asynchronous, nonblocking I/O. There are a variety of different technologies that can be used to implement a scalable API Gateway. On the JVM you can use one of the NIO‑based frameworks such as Netty, Vert.x, Spring Reactor, or JBoss Undertow. One popular non‑JVM option is Node.js, which is a platform built on Chrome’s JavaScript engine. Another option is to use NGINX Plus. NGINX Plus offers a mature, scalable, high‑performance web server and reverse proxy that is easily deployed, configured, and programmed. NGINX Plus can manage authentication, access control, request load balancing, and response caching, and provides application‑aware health checks and monitoring.

Using a Reactive Programming Model

The API Gateway handles some requests by simply routing them to the appropriate backend service. It handles other requests by invoking multiple backend services and aggregating the results. With some requests, such as a product details request, the requests to backend services are independent of one another. In order to minimize response time, the API Gateway should perform independent requests concurrently. Sometimes, however, there are dependencies between requests. The API Gateway might first need to validate the request by calling an authentication service, before routing the request to a backend service. Similarly, to fetch information about the products in a customer’s wish list, the API Gateway must first retrieve the customer’s profile containing that information, and then retrieve the information for each product. Another interesting example of API composition is the Netflix Video Grid.

Writing API composition code using the traditional asynchronous callback approach quickly leads you to callback hell. The code will be tangled, difficult to understand, and error‑prone. A much better approach is to write API Gateway code in a declarative style using a reactive approach. Examples of reactive abstractions include Future in Scala, CompletableFuture in Java 8, and Promise in JavaScript. There is also Reactive Extensions (also called Rx or ReactiveX), which was originally developed by Microsoft for the .NET platform. Netflix created RxJava for the JVM specifically to use in their API Gateway. There is also RxJS for JavaScript, which runs in both the browser and Node.js. Using a reactive approach will enable you to write simple yet efficient API Gateway code.
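
As an illustration, here is a minimal sketch using Java 8’s CompletableFuture. The fetch methods simulate asynchronous backend calls and are purely hypothetical; what matters is the composition: independent calls are started together and combined with thenCombine, while a dependent call is chained with thenCompose.

import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;

class ReactiveCompositionSketch {

    // Hypothetical asynchronous backend calls; each returns immediately with a
    // CompletableFuture that completes when the (simulated) service responds.
    static CompletableFuture<String> fetchProductInfo(String productId) {
        return CompletableFuture.supplyAsync(() -> "info for " + productId);
    }

    static CompletableFuture<List<String>> fetchRecommendations(String productId) {
        return CompletableFuture.supplyAsync(() -> Arrays.asList("rec-1", "rec-2"));
    }

    static CompletableFuture<String> fetchCustomerProfile(String customerId) {
        return CompletableFuture.supplyAsync(() -> "wish list for " + customerId);
    }

    static CompletableFuture<String> fetchWishListProducts(String profile) {
        return CompletableFuture.supplyAsync(() -> "products in " + profile);
    }

    public static void main(String[] args) {
        // Independent requests: both calls are started at once and their
        // results are combined when the slower of the two completes.
        CompletableFuture<String> productDetails =
            fetchProductInfo("p-1").thenCombine(
                fetchRecommendations("p-1"),
                (info, recs) -> info + ", recommendations: " + recs);

        // Dependent requests: the second call needs the result of the first,
        // so it is chained with thenCompose rather than run in parallel.
        CompletableFuture<String> wishList =
            fetchCustomerProfile("c-7")
                .thenCompose(ReactiveCompositionSketch::fetchWishListProducts);

        System.out.println(productDetails.join());
        System.out.println(wishList.join());
    }
}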

Service Invocation

A microservices‑based application is a distributed system and must use an inter‑process communication mechanism. There are two styles of inter‑process communication. One option is to use an asynchronous, messaging‑based mechanism; some implementations use a message broker, such as one based on JMS or AMQP, while others, such as ZeroMQ, are brokerless and the services communicate directly. The other style of inter‑process communication is a synchronous mechanism such as HTTP or Thrift. A system will typically use both asynchronous and synchronous styles, and might even use multiple implementations of each style. Consequently, the API Gateway will need to support a variety of communication mechanisms.

Service Discovery

The API Gateway needs to know the location (IP address and port) of each microservice with which it communicates. In a traditional application, you could probably hardwire the locations, but in a modern, cloud‑based microservices application this is a nontrivial problem. Infrastructure services, such as a message broker, will usually have a static location, which can be specified via OS environment variables. However, determining the location of an application service is not so easy. Application services have dynamically assigned locations. Also, the set of instances of a service changes dynamically because of autoscaling and upgrades. Consequently, the API Gateway, like any other service client in the system, needs to use the system’s service discovery mechanism: either Server‑Side Discovery or Client‑Side Discovery. A later article will describe service discovery in more detail. For now, it is worthwhile to note that if the system uses Client‑Side Discovery then the API Gateway must be able to query the Service Registry, which is a database of all microservice instances and their locations.
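
For example, with Client‑Side Discovery the gateway’s lookup logic might look something like the sketch below. The ServiceRegistry interface and ServiceInstance type are hypothetical stand‑ins for whatever registry the system actually uses (such as Netflix Eureka, etcd, or Consul), and picking an instance at random is a deliberately naive placeholder for real client‑side load balancing.

import java.util.List;
import java.util.Random;

// Hypothetical view of the Service Registry: a database of live microservice
// instances and their locations, kept up to date by the platform.
interface ServiceRegistry {
    List<ServiceInstance> instancesOf(String serviceName);
}

class ServiceInstance {
    final String host;
    final int port;

    ServiceInstance(String host, int port) {
        this.host = host;
        this.port = port;
    }
}

class ClientSideDiscovery {
    private final ServiceRegistry registry;
    private final Random random = new Random();

    ClientSideDiscovery(ServiceRegistry registry) {
        this.registry = registry;
    }

    // Resolve a logical service name to a concrete URL: query the registry,
    // pick one of the currently registered instances, and build the address.
    String resolve(String serviceName, String path) {
        List<ServiceInstance> instances = registry.instancesOf(serviceName);
        if (instances.isEmpty()) {
            throw new IllegalStateException("No live instances of " + serviceName);
        }
        ServiceInstance chosen = instances.get(random.nextInt(instances.size()));
        return "http://" + chosen.host + ":" + chosen.port + path;
    }
}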

Handling Partial Failures

Another issue you have to address when implementing an API Gateway is the problem of partial failure. This issue arises in all distributed systems whenever one service calls another service that is either responding slowly or is unavailable. The API Gateway should never block indefinitely waiting for a downstream service. However, how it handles the failure depends on the specific scenario and which service is failing. For example, if the recommendation service is unresponsive in the product details scenario, the API Gateway should return the rest of the product details to the client since they are still useful to the user. The recommendations could either be empty or replaced by, for example, a hardwired top ten list. If, however, the product information service is unresponsive, then the API Gateway should return an error to the client.

The API Gateway could also return cached data if that was available. For example, since product prices change infrequently, the API Gateway could return cached pricing data if the pricing service is unavailable. The data can be cached by the API Gateway itself or be stored in an external cache such as Redis or Memcached. By returning either default data or cached data, the API Gateway ensures that system failures do not impact the user experience.

Netflix Hystrix is an incredibly useful library for writing code that invokes remote services. Hystrix times out calls that exceed the specified threshold. It implements a circuit breaker pattern, which stops the client from waiting needlessly for an unresponsive service. If the error rate for a service exceeds a specified threshold, Hystrix trips the circuit breaker and all requests will fail immediately for a specified period of time. Hystrix lets you define a fallback action when a request fails, such as reading from a cache or returning a default value. If you are using the JVM you should definitely consider using Hystrix. And, if you are running in a non‑JVM environment, you should use an equivalent library.
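
As a rough illustration, here is a minimal sketch of wrapping a backend call in a Hystrix command. The RecommendationClient interface and the empty‑list fallback are hypothetical; the pattern of subclassing HystrixCommand, putting the remote call in run(), and putting the degraded response in getFallback() is how the library is typically used.

import com.netflix.hystrix.HystrixCommand;
import com.netflix.hystrix.HystrixCommandGroupKey;

import java.util.Collections;
import java.util.List;

// Hypothetical client for the recommendation microservice.
interface RecommendationClient {
    List<String> forProduct(String productId);
}

// Wraps the remote call so that a timeout, an error, or a tripped circuit
// breaker falls through to getFallback() instead of blocking the gateway.
class RecommendationsCommand extends HystrixCommand<List<String>> {
    private final RecommendationClient client;
    private final String productId;

    RecommendationsCommand(RecommendationClient client, String productId) {
        super(HystrixCommandGroupKey.Factory.asKey("RecommendationService"));
        this.client = client;
        this.productId = productId;
    }

    @Override
    protected List<String> run() {
        // The actual remote call; Hystrix times it out if it takes too long.
        return client.forProduct(productId);
    }

    @Override
    protected List<String> getFallback() {
        // Degrade gracefully: the product page is still useful without recommendations.
        return Collections.emptyList();
    }
}

// Usage: List<String> recs = new RecommendationsCommand(client, "p-1").execute();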

Summary

For most microservices‑based applications, it makes sense to implement an API Gateway, which acts as a single entry point into a system. The API Gateway is responsible for request routing, composition, and protocol translation. It provides each of the application’s clients with a custom API. The API Gateway can also mask failures in the backend services by returning cached or default data. In the next article in the series, we will look at communication between services.

Editor – This seven‑part series of articles is now complete:

  1. Introduction to Microservices
  2. Building Microservices: Using an API Gateway (this article)
  3. Building Microservices: Inter‑Process Communication in a Microservices Architecture
  4. Service Discovery in a Microservices Architecture
  5. Event-Driven Data Management for Microservices
  6. Choosing a Microservices Deployment Strategy
  7. Refactoring a Monolith into Microservices

You can also download the complete set of articles, plus information about implementing microservices using NGINX Plus, as an ebook – Microservices: From Design to Deployment.

For a detailed look at additional use cases, see our three‑part blog series, Deploying NGINX Plus as an API Gateway:

  • Part 1 provides detailed configuration instructions for several use cases.
  • Part 2 extends those use cases and looks at a range of safeguards that can be applied to protect and secure backend API services in production.
  • Part 3 explains how to deploy NGINX Plus as an API gateway for gRPC services.

Guest blogger Chris Richardson is the founder of the original CloudFoundry.com, an early Java PaaS (Platform as a Service) for Amazon EC2. He now consults with organizations to improve how they develop and deploy applications. He also blogs regularly about microservices at http://microservices.io.

Delivering Video at Scale with NGINX Plus

Whether you’re promoting a product or training a new user, video is a powerful platform for getting your message out. Delivering video at a scale that satisfies user demand, however, can be a challenge, and sites like Netflix set a high bar. Users expect videos to begin playing instantly on whatever device they use.

NGINX and NGINX Plus provide a complete application delivery platform with features specifically designed for streaming media on your site. NGINX can stream MP4 and Flash video on demand (VOD), as well as live video using the Real‑Time Messaging Protocol (RTMP), Apple’s HTTP Live Streaming (HLS), and Dynamic Adaptive Streaming over HTTP (DASH).

NGINX Plus adds additional enterprise‑ready streaming media features, such as adaptive bitrate streaming for VOD, bandwidth controls for MP4 streaming, and enhanced session logging.

For complete details about how you can use NGINX and NGINX Plus as a streaming media server, check out this whitepaper, Serving Media with NGINX Plus.

Inside NGINX: How We Designed for Performance & Scale

NGINX leads the pack in web performance, and it’s all due to the way the software is designed. Whereas many web servers and application servers use a simple threaded or process‑based architecture, NGINX stands out with a sophisticated event‑driven architecture that enables it to scale to hundreds of thousands of concurrent connections on modern hardware.

The Inside NGINX infographic drills down from the high‑level process architecture to illustrate how NGINX handles multiple connections within a single process. This blog explains how it all works in further detail.

Setting the Scene – The NGINX Process Model

The NGINX (and NGINX Plus) master process spawns three types of child process: worker, cache manager, and cache loader. They use shared memory for caching, session persistence, rate limits, and logging.

To better understand this design, you need to understand how NGINX runs. NGINX has a master process (which performs the privileged operations such as reading configuration and binding to ports) and a number of worker and helper processes.

# service nginx restart
* Restarting nginx
# ps -ef --forest | grep nginx
root     32475     1  0 13:36 ?        00:00:00 nginx: master process /usr/sbin/nginx 
                                                -c /etc/nginx/nginx.conf
nginx    32476 32475  0 13:36 ?        00:00:00  _ nginx: worker process
nginx    32477 32475  0 13:36 ?        00:00:00  _ nginx: worker process
nginx    32479 32475  0 13:36 ?        00:00:00  _ nginx: worker process
nginx    32480 32475  0 13:36 ?        00:00:00  _ nginx: worker process
nginx    32481 32475  0 13:36 ?        00:00:00  _ nginx: cache manager process
nginx    32482 32475  0 13:36 ?        00:00:00  _ nginx: cache loader process

On this four‑core server, the NGINX master process creates four worker processes and a couple of cache helper processes which manage the on‑disk content cache.

Why Is Architecture Important?

The fundamental basis of any Unix application is the thread or process. (From the Linux OS perspective, threads and processes are mostly identical; the major difference is the degree to which they share memory.) A thread or process is a self‑contained set of instructions that the operating system can schedule to run on a CPU core. Most complex applications run multiple threads or processes in parallel for two reasons:

  • They can use more compute cores at the same time.
  • Threads and processes make it very easy to do operations in parallel (for example, to handle multiple connections at the same time).

Processes and threads consume resources. They each use memory and other OS resources, and they need to be swapped on and off the cores (an operation called a context switch). Most modern servers can handle hundreds of small, active threads or processes simultaneously, but performance degrades seriously once memory is exhausted or when high I/O load causes a large volume of context switches.

The common way to design network applications is to assign a thread or process to each connection. This architecture is simple and easy to implement, but it does not scale when the application needs to handle thousands of simultaneous connections.

How Does NGINX Work?

NGINX uses a predictable process model that is tuned to the available hardware resources:

  • The master process performs the privileged operations such as reading configuration and binding to ports, and then creates a small number of child processes (the next three types).
  • The cache loader process runs at startup to load the disk‑based cache into memory, and then exits. It is scheduled conservatively, so its resource demands are low.
  • The cache manager process runs periodically and prunes entries from the disk caches to keep them within the configured sizes.
  • The worker processes do all of the work! They handle network connections, read and write content to disk, and communicate with upstream servers.

The NGINX configuration recommended in most cases – running one worker process per CPU core – makes the most efficient use of hardware resources. You configure it by setting the auto parameter on the worker_processes directive:

worker_processes auto;

When an NGINX server is active, only the worker processes are busy. Each worker process handles multiple connections in a nonblocking fashion, reducing the number of context switches.

Each worker process is single‑threaded and runs independently, grabbing new connections and processing them. The processes can communicate using shared memory for shared cache data, session persistence data, and other shared resources.

Inside the NGINX Worker Process

The NGINX worker process is a nonblocking, event-driven engine for processing requests from web clients.

Each NGINX worker process is initialized with the NGINX configuration and is provided with a set of listen sockets by the master process.

The NGINX worker processes begin by waiting for events on the listen sockets, using accept_mutex or kernel socket sharding to distribute new connections among the workers. Events are initiated by new incoming connections. These connections are assigned to a state machine – the HTTP state machine is the most commonly used, but NGINX also implements state machines for stream (raw TCP) traffic and for a number of mail protocols (SMTP, IMAP, and POP3).

To process an incoming client request, NGINX reads the HTTP headers, applies limits if configured, makes internal redirects and subrequests as required, forwards to backend services, applies filters, and logs its actions.

The state machine is essentially the set of instructions that tell NGINX how to process a request. Most web servers that perform the same functions as NGINX use a similar state machine – the difference lies in the implementation.

Scheduling the State Machine

Think of the state machine like the rules for chess. Each HTTP transaction is a chess game. On one side of the chessboard is the web server – a grandmaster who can make decisions very quickly. On the other side is the remote client – the web browser that is accessing the site or application over a relatively slow network.

However, the rules of the game can be very complicated. For example, the web server might need to communicate with other parties (proxying to an upstream application) or talk to an authentication server. Third‑party modules in the web server can even extend the rules of the game.

A Blocking State Machine

Recall our description of a process or thread as a self‑contained set of instructions that the operating system can schedule to run on a CPU core. Most web servers and web applications use a process‑per‑connection or thread‑per‑connection model to play the chess game. Each process or thread contains the instructions to play one game through to the end. During the time the process is run by the server, it spends most of its time ‘blocked’ – waiting for the client to complete its next move.

Most web application platforms use blocking I/O, meaning each worker (thread or process) can handle only one active connection at a time.

  1. The web server process listens for new connections (new games initiated by clients) on the listen sockets.
  2. When it gets a new game, it plays that game, blocking after each move to wait for the client’s response.
  3. Once the game completes, the web server process might wait to see if the client wants to start a new game (this corresponds to a keepalive connection). If the connection is closed (the client goes away or a timeout occurs), the web server process returns to listening for new games.

The important point to remember is that every active HTTP connection (every chess game) requires a dedicated process or thread (a grandmaster). This architecture is simple and easy to extend with third‑party modules (‘new rules’). However, there’s a huge imbalance: the rather lightweight HTTP connection, represented by a file descriptor and a small amount of memory, maps to a separate thread or process, a very heavyweight operating system object. It’s a programming convenience, but it’s massively wasteful.

NGINX is a True Grandmaster

Perhaps you’ve heard of simultaneous exhibition games, where one chess grandmaster plays dozens of opponents at the same time?

Kiril Georgiev played 360 people simultaneously in Sofia, Bulgaria. His final score was 284 wins, 70 draws and 6 losses.

That’s how an NGINX worker process plays “chess.” Each worker (remember – there’s usually one worker for each CPU core) is a grandmaster that can play hundreds (in fact, hundreds of thousands) of games simultaneously.

NGINX uses an event-driven architecture with nonblocking I/O, so it can handle hundreds of thousands of simultaneous connections.

  1. The worker waits for events on the listen and connection sockets.
  2. Events occur on the sockets and the worker handles them:
    • An event on the listen socket means that a client has started a new chess game. The worker creates a new connection socket.
    • An event on a connection socket means that the client has made a new move. The worker responds promptly.

A worker never blocks on network traffic, waiting for its “opponent” (the client) to respond. When it has made its move, the worker immediately proceeds to other games where moves are waiting to be processed, or welcomes new players in the door.

Why Is This Faster than a Blocking, Multiprocess Architecture?

NGINX scales very well to support hundreds of thousands of connections per worker process. Each new connection creates another file descriptor and consumes a small amount of additional memory in the worker process. There is very little additional overhead per connection. NGINX processes can remain pinned to CPUs. Context switches are relatively infrequent and occur when there is no work to be done.

In the blocking, connection‑per‑process approach, each connection requires a large amount of additional resources and overhead, and context switches (swapping from one process to another) are very frequent.

For a more detailed explanation, check out this article about NGINX architecture, by Andrew Alexeev, VP of Corporate Development and Co‑Founder at NGINX, Inc.

With appropriate system tuning, NGINX can scale to handle hundreds of thousands of concurrent HTTP connections per worker process, and can absorb traffic spikes (an influx of new games) without missing a beat.

Updating Configuration and Upgrading NGINX

NGINX’s process architecture, with a small number of worker processes, makes for very efficient updating of the configuration and even the NGINX binary itself.

NGINX reloads its configuration without any downtime (interruption of request processing).

Updating NGINX configuration is a very simple, lightweight, and reliable operation. It typically just means running the nginx -s reload command, which checks the configuration on disk and sends the master process a SIGHUP signal.

When the master process receives a SIGHUP, it does two things:

  1. Reloads the configuration and forks a new set of worker processes. These new worker processes immediately begin accepting connections and processing traffic (using the new configuration settings).
  2. Signals the old worker processes to gracefully exit. The worker processes stop accepting new connections. As soon as each current HTTP request completes, the worker process cleanly shuts down the connection (that is, there are no lingering keepalives). Once all connections are closed, the worker processes exit.

This reload process can cause a small spike in CPU and memory usage, but it’s generally imperceptible compared to the resource load from active connections. You can reload the configuration multiple times per second (and many NGINX users do exactly that). Very rarely, issues arise when there are many generations of NGINX worker processes waiting for connections to close, but even those are quickly resolved.

NGINX’s binary upgrade process achieves the Holy Grail of high availability – you can upgrade the software on the fly, without any dropped connections, downtime, or interruption in service.

NGINX reloads its binary without any downtime (interruption of request processing).

The binary upgrade process is similar in approach to the graceful reload of configuration. A new NGINX master process runs in parallel with the original master process, and they share the listening sockets. Both processes are active, and their respective worker processes handle traffic. You can then signal the old master and its workers to gracefully exit.

The entire process is described in more detail in Controlling NGINX.

Conclusion

The Inside NGINX infographic provides a high‑level overview of how NGINX functions, but behind this simple explanation is over ten years of innovation and optimization that enable NGINX to deliver the best possible performance on a wide range of hardware while maintaining the security and reliability that modern web applications require.

If you’d like to read more about the optimizations in NGINX, check out these great resources:

Meet the NGINX Team at Red Hat Summit and DevNation

NGINX, Inc. is excited to sponsor the Red Hat Summit, taking place in Boston from June 23rd to 26th. Join us as we venture off to this open source event to meet some of the industry’s top leaders and witness the latest innovations in cloud computing, platform, virtualization, and more!


Come meet the team and learn why NGINX is the secret heart of the modern web. We’d love the opportunity to show you how to deliver your sites and applications with performance, reliability, security, and scale.

  • Attend Making the Case for Microservices: Faster Application Development and Repair in room 204 on Wednesday, June 24th @ 4:50 – 5:50PM. In this DevNation talk, NGINX’s Head of Developer Relations, Sarah Novotny, will teach you how to move from a monolithic architecture to one that gives you more flexibility, makes it easier to innovate, responds fast, and more importantly gives your users a better experience.
  • Come meet the NGINX team at Booth #804. Pick up some cool NGINX swag and chat with our tech experts about how NGINX best fits into your app architecture.

If you haven’t signed up already, register with discount code NGINX995 to get $500 off the entry fee.

We hope to see you there!

Did You Hear? NGINX Will Be at DockerCon, June 22 – 23

That’s right! We hope you reserved your tickets to this sold-out conference, because we want to meet you there! Visit us at DockerCon from June 22nd to 23rd in San Francisco to learn how Docker and NGINX work together, enabling you to easily scale and deploy your containerized applications.

More awesome reasons to go:

  • Attend Interconnecting containers at scale with NGINX on Monday, June 22, 4:50 – 5:30PM. Learn from Sarah Novotny, NGINX’s Head of Developer Relations, how to properly route and accelerate HTTP and TCP traffic, how to manage traffic across your distributed microservice architecture, and so much more.
  • Meet the NGINX team at Booth G16: Come learn why NGINX is the standard for delivering web apps. We will show you how easy it is to deploy NGINX and NGINX Plus with Docker.


Socket Sharding in NGINX Release 1.9.1

NGINX 1.9.1 introduces a new feature that enables use of the SO_REUSEPORT socket option, which is available in newer versions of many operating systems, including DragonFly BSD and Linux (kernel version 3.9 and later). This socket option allows multiple sockets to listen on the same IP address and port combination. The kernel then load balances incoming connections across the sockets.

Editor – For NGINX Plus users, this feature is supported in NGINX Plus Release 7 (R7) and later. For an overview of all the new features in that release, see Announcing NGINX Plus R7 on our blog.

The SO_REUSEPORT socket option has many potential real‑world applications. Other services can use it for easy rolling upgrades of executables (NGINX already supports rolling upgrades through different means). For NGINX, enabling this socket option can improve performance in certain scenarios by reducing lock contention.

As depicted in the figure, when the SO_REUSEPORT option is not enabled, a single listening socket notifies workers about incoming connections, and each worker tries to take a connection.

With the SO_REUSEPORT option enabled, there are multiple socket listeners for each IP address and port combination, one for each worker process. The kernel determines which available socket listener (and by implication, which worker) gets the connection. This can reduce lock contention between workers accepting new connections, and improve performance on multicore systems. However, it can also mean that when a worker is stalled by a blocking operation, the block affects not only connections that the worker has already accepted, but also connection requests that the kernel has assigned to the worker since it became blocked.

Configuring Socket Sharding

To enable the SO_REUSEPORT socket option, add the new reuseport parameter to the listen directive for HTTP or TCP (stream module) traffic, as in these examples:

http {
     server {
          listen 80 reuseport;
          server_name  localhost;
          # ...
     }
}

stream {
     server {
          listen 12345 reuseport;
          # ...
     }
}

Including the reuseport parameter also disables the accept_mutex directive for the socket, because the mutex is redundant with reuseport. It can still be worth setting accept_mutex if there are ports on which you don’t set reuseport.

Benchmarking Performance with reuseport

I ran a wrk benchmark with 4 NGINX workers on a 36‑core AWS instance. To eliminate network effects, I ran both client and NGINX on localhost, and also had NGINX return the string OK instead of a file. I compared three NGINX configurations: the default (equivalent to accept_mutex on), with accept_mutex off, and with reuseport. As shown in the figure, reuseport increases requests per second by 2 to 3 times, and reduces both latency and the standard deviation for latency.


I also ran a related benchmark with the client and NGINX on separate hosts and with NGINX returning an HTML file. As shown in the following table, with reuseport the decrease in latency was similar to the previous benchmark, and the standard deviation decreased even more dramatically (almost ten‑fold). Other results (not shown in the table) were also encouraging. With reuseport, the load was spread evenly across the worker processes. In the default condition (equivalent to accept_mutex on), some workers got a higher percentage of the load, and with accept_mutex off all workers experienced high load.

                     Latency (ms)   Latency stdev (ms)   CPU Load
Default                  15.65            26.59             0.3
accept_mutex off         15.59            26.48            10
reuseport                12.35             3.15             0.3

In these benchmarks, the rate of connection requests is high but the requests don’t require extensive processing. Other preliminary testing also indicates that reuseport improves performance the most when traffic matches this profile. (The reuseport parameter is not available on the listen directive in the mail context, for example, because email traffic definitely does not match the profile.) We encourage you to test reuseport to determine whether it improves performance in your NGINX deployment, rather than applying it wholesale. For some tips on testing NGINX performance, check out Konstantin Pavlov’s talk at nginx.conf 2014.

Acknowledgments

Thanks to Yingqi Lu at Intel and Sepherosa Ziehau, who each contributed a solution to the NGINX project that enables use of the SO_REUSEPORT socket option. The NGINX team combined ideas from both contributions to create what we believe is an ideal solution.

NGINX + Velocity = May 27-29, 2015


NGINX will be back in Santa Clara, May 27–29 at Velocity! Join us there to learn from some of the industry’s top experts and return home with tools and techniques to optimize your websites and build resilient systems at scale.

Velocity is a nexus of industry expertise relating to web performance and DevOps culture. At NGINX, we’re all about high‑performance application delivery and offer DevOps‑friendly tools and benefits, so we’re delighted to take part in this event!

Visit NGINX at booth 812 and:

  • Meet the team and learn why NGINX is the secret heart of the modern web
  • Provide feedback about your experiences with NGINX
  • Chat with our tech experts about how NGINX and NGINX Plus can help you improve web and app performance
  • Discover the latest information and trends around modern application architecture and microservices

As a Silver sponsor of Velocity, we are happy to share with you a 25% discount code (Nginx25).

Even if you can’t attend Velocity, you can still find out how NGINX can help you deliver your sites and apps fast and flawlessly.

However, we hope to see you at Velocity.

Let Us Help Share Your NGINX Story

We’re looking for people to share their story and passion for NGINX with the community. Do you have an interesting use case to share? Excited about your implementation? Have you written or used a module you enjoy, or found helpful?

Share your insights and stretch your speaking skills by letting us know you want to be an NGINX community speaker! Whether you’re new to NGINX or you’re a regular contributor, we want to hear your voice and help you find speaking opportunities.

There are two ways to share your story. We recently announced the open call for proposals for nginx.conf 2015 and we’re looking to spotlight community members in lightning talks at upcoming Summit and Training events and local meetups.

Summit + Training Events and Meetups

Don’t see an event in your area? Submit a proposal with your location and we’ll let you know of upcoming community events near you.

nginx.conf 2015, September 22–24 in San Francisco

We’re excited to host nginx.conf 2015, our second annual user conference. Submit your talk idea in the call for proposals.

If you want some inspiration or education, check out presentations by members of the NGINX community at nginx.conf 2014 and previous Summits.

We look forward to sharing your story!

Load Balancing Microsoft Exchange with NGINX Plus

Last month we announced the release of NGINX Plus R6 and with it a number of new features that were on our customers’ wish lists. Several of the features, including full‑featured TCP load balancing and enhancements to our live activity monitoring dashboard, extend the benefits of NGINX Plus to enterprises and other organizations that rely on applications running over TCP rather than HTTP.

In particular, many customers have asked about using NGINX Plus to proxy and load balance Microsoft Exchange traffic. Exchange is widely deployed in all types of environments – from on‑premises data centers to cloud environments like Amazon Web Services (AWS), the Google Cloud Platform, and Microsoft Azure – as a mail server, calendar, and contacts management application.

Editor – The deployment guide announced in this blog, Load Balancing Microsoft Exchange Servers with NGINX Plus, was updated for NGINX Plus Release 7, which introduced support for NT LAN Manager (NTLM) authentication with Microsoft applications.

How NGINX Plus R6 Load Balances Microsoft Exchange Traffic

Here’s how the new features in R6 enable NGINX Plus to proxy and load balance Exchange traffic:

  • TCP load balancing – In addition to HTTP and HTTPS, Exchange uses ports and protocols that run over TCP, including Internet Message Access Protocol (IMAP) and Simple Mail Transfer Protocol (SMTP). With the comprehensive TCP load balancing and reverse proxy capabilities in NGINX Plus R6, enterprises can now benefit from improved performance, availability, and security for all types of Exchange traffic.

    Editor – NGINX Plus R9 and later extends load balancing support to UDP‑based protocols as well.

  • Monitoring and health checks – It can be a challenge to keep on top of all the moving parts in a multi‑protocol application like Exchange. With NGINX Plus’ live activity monitoring dashboard, you can monitor dozens of performance metrics in real time, spotting potential problems before they affect users. NGINX Plus R6 introduces a full range of TCP counters to complement the existing set of HTTP counters. You can also configure NGINX Plus to proactively check the health of TCP‑based servers. It automatically stops sending requests to downed servers, greatly reducing the number of “page not found” and other errors that your users experience.
  • Unbuffered upload – Microsoft Outlook uses Microsoft’s RPC over HTTP technology for communication. It opens two simultaneous connections, one for constant upload and the other for constant download, using the HTTP methods RPC_IN_DATA and RPC_OUT_DATA. NGINX Plus R6 (and NGINX Open Source 1.7.11 and later) support RPC over HTTP traffic with the unbuffered upload feature, which you configure with the proxy_request_buffering directive.

Our Deployment Guide Gets You Started Quickly

To get you started with load balancing your Exchange servers as quickly as possible, we’ve published a deployment guide, Load Balancing Microsoft Exchange Servers with NGINX Plus. It explains step by step how to configure all components – DNS, Exchange, firewalls, and NGINX Plus – on premises or in a cloud environment. You can choose between basic load balancing and an enhanced configuration that improves performance and makes your deployment easier to manage.

We know that many of our customers are long‑time and expert NGINX and NGINX Plus users, and for their convenience we’re also publishing plain NGINX Plus configuration files for both basic and enhanced load balancing of Exchange.

Experienced with Exchange but new to NGINX Plus? Take advantage of our free 30-day trial and see for yourself how NGINX Plus can boost the performance and manageability of your Exchange deployment.

Related Resources

  • Load Balancing Microsoft Exchange Servers with NGINX Plus
  • Configuration file for basic load balancing (for experienced NGINX Plus users)
  • Configuration file for enhanced load balancing (for experienced NGINX Plus users)

Introduction to Microservices


Editor – This seven‑part series of articles is now complete:

  1. Introduction to Microservices (this article)
  2. Building Microservices: Using an API Gateway
  3. Building Microservices: Inter-Process Communication in a Microservices Architecture
  4. Service Discovery in a Microservices Architecture
  5. Event-Driven Data Management for Microservices
  6. Choosing a Microservices Deployment Strategy
  7. Refactoring a Monolith into Microservices

You can also download the complete set of articles, plus information about implementing microservices using NGINX Plus, as an ebook – Microservices: From Design to Deployment.

Also check out our Microservices Solutions page.

Microservices are currently getting a lot of attention: articles, blogs, discussions on social media, and conference presentations. They are rapidly heading towards the peak of inflated expectations on the Gartner Hype cycle. At the same time, there are skeptics in the software community who dismiss microservices as nothing new. Naysayers claim that the idea is just a rebranding of SOA. However, despite both the hype and the skepticism, the Microservices Architecture pattern has significant benefits – especially when it comes to enabling the agile development and delivery of complex enterprise applications.

This blog post is the first in a seven‑part series about designing, building, and deploying microservices. You will learn about the approach and how it compares to the more traditional Monolithic Architecture pattern. This series will describe the various elements of a microservices architecture. You will learn about the benefits and drawbacks of the Microservices Architecture pattern, whether it makes sense for your project, and how to apply it.

Let’s first look at why you should consider using microservices.

Building Monolithic Applications

Let’s imagine that you were starting to build a brand new taxi‑hailing application intended to compete with Uber and Hailo. After some preliminary meetings and requirements gathering, you would create a new project either manually or by using a generator that comes with Rails, Spring Boot, Play, or Maven. This new application would have a modular hexagonal architecture, like in the following diagram:

Modular, but still monolithic, architecture used as basis for sample microservices application

At the core of the application is the business logic, which is implemented by modules that define services, domain objects, and events. Surrounding the core are adapters that interface with the external world. Examples of adapters include database access components, messaging components that produce and consume messages, and web components that either expose APIs or implement a UI.

Despite having a logically modular architecture, the application is packaged and deployed as a monolith. The actual format depends on the application’s language and framework. For example, many Java applications are packaged as WAR files and deployed on application servers such as Tomcat or Jetty. Other Java applications are packaged as self‑contained executable JARs. Similarly, Rails and Node.js applications are packaged as a directory hierarchy.

Applications written in this style are extremely common. They are simple to develop since our IDEs and other tools are focused on building a single application. These kinds of applications are also simple to test. You can implement end‑to‑end testing by simply launching the application and testing the UI with Selenium. Monolithic applications are also simple to deploy. You just have to copy the packaged application to a server. You can also scale the application by running multiple copies behind a load balancer. In the early stages of the project it works well.

Marching Towards Monolithic Hell

Unfortunately, this simple approach has a huge limitation. Successful applications have a habit of growing over time and eventually becoming huge. During each sprint, your development team implements a few more stories, which, of course, means adding many lines of code. After a few years, your small, simple application will have grown into a monstrous monolith. To give an extreme example, I recently spoke to a developer who was writing a tool to analyze the dependencies between the thousands of JARs in their multi‑million line of code (LOC) application. I’m sure it took the concerted effort of a large number of developers over many years to create such a beast.

Once your application has become a large, complex monolith, your development organization is probably in a world of pain. Any attempts at agile development and delivery will flounder. One major problem is that the application is overwhelmingly complex. It’s simply too large for any single developer to fully understand. As a result, fixing bugs and implementing new features correctly becomes difficult and time consuming. What’s more, this tends to be a downwards spiral. If the codebase is difficult to understand, then changes won’t be made correctly. You will end up with a monstrous, incomprehensible big ball of mud.

The sheer size of the application will also slow down development. The larger the application, the longer the start‑up time is. For example, in a recent survey some developers reported start‑up times as long as 12 minutes. I’ve also heard anecdotes of applications taking as long as 40 minutes to start up. If developers regularly have to restart the application server, then a large part of their day will be spent waiting around and their productivity will suffer.

Another problem with a large, complex monolithic application is that it is an obstacle to continuous deployment. Today, the state of the art for SaaS applications is to push changes into production many times a day. This is extremely difficult to do with a complex monolith since you must redeploy the entire application in order to update any one part of it. The lengthy start‑up times that I mentioned earlier won’t help either. Also, since the impact of a change is usually not very well understood, it is likely that you have to do extensive manual testing. Consequently, continuous deployment is next to impossible to do.

Monolithic applications can also be difficult to scale when different modules have conflicting resource requirements. For example, one module might implement CPU‑intensive image processing logic and would ideally be deployed in Amazon EC2 Compute Optimized instances. Another module might be an in‑memory database and best suited for EC2 Memory‑optimized instances. However, because these modules are deployed together you have to compromise on the choice of hardware.

Another problem with monolithic applications is reliability. Because all modules are running within the same process, a bug in any module, such as a memory leak, can potentially bring down the entire process. Moreover, since all instances of the application are identical, that bug will impact the availability of the entire application.

Last but not least, monolithic applications make it extremely difficult to adopt new frameworks and languages. For example, let’s imagine that you have 2 million lines of code written using the XYZ framework. It would be extremely expensive (in both time and cost) to rewrite the entire application to use the newer ABC framework, even if that framework was considerably better. As a result, there is a huge barrier to adopting new technologies. You are stuck with whatever technology choices you made at the start of the project.

To summarize: you have a successful business‑critical application that has grown into a monstrous monolith that very few, if any, developers understand. It is written using obsolete, unproductive technology that makes hiring talented developers difficult. The application is difficult to scale and is unreliable. As a result, agile development and delivery of applications is impossible.

So what can you do about it?

Microservices – Tackling the Complexity

Many organizations, such as Amazon, eBay, and Netflix, have solved this problem by adopting what is now known as the Microservices Architecture pattern. Instead of building a single monstrous, monolithic application, the idea is to split your application into a set of smaller, interconnected services.

A service typically implements a set of distinct features or functionality, such as order management, customer management, etc. Each microservice is a mini‑application that has its own hexagonal architecture consisting of business logic along with various adapters. Some microservices would expose an API that’s consumed by other microservices or by the application’s clients. Other microservices might implement a web UI. At runtime, each instance is often a cloud VM or a Docker container.

For example, a possible decomposition of the system described earlier is shown in the following diagram:

Microservices architecture for a sample ride-for-hire app, with each microservice presenting a RESTful API

Each functional area of the application is now implemented by its own microservice. Moreover, the web application is split into a set of simpler web applications (such as one for passengers and one for drivers in our taxi‑hailing example). This makes it easier to deploy distinct experiences for specific users, devices, or specialized use cases.

Each backend service exposes a REST API and most services consume APIs provided by other services. For example, Driver Management uses the Notification server to tell an available driver about a potential trip. The UI services invoke the other services in order to render web pages. Services might also use asynchronous, message‑based communication. Inter‑service communication will be covered in more detail later in this series.
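
As a rough sketch of one such interaction, the following Java snippet shows how Driver Management might notify an available driver through the Notification service’s REST API using Spring’s RestTemplate. The endpoint URL and the TripProposal payload are hypothetical; they stand in for whatever contract the Notification service actually exposes.

import org.springframework.web.client.RestTemplate;

// Hypothetical payload describing the trip being offered to a driver.
class TripProposal {
    public String driverId;
    public String tripId;
    public String pickupLocation;

    TripProposal(String driverId, String tripId, String pickupLocation) {
        this.driverId = driverId;
        this.tripId = tripId;
        this.pickupLocation = pickupLocation;
    }
}

class DriverNotifier {
    private final RestTemplate restTemplate = new RestTemplate();

    // Driver Management calls the Notification service's REST API to tell an
    // available driver about a potential trip.
    void notifyDriver(String driverId, String tripId, String pickupLocation) {
        TripProposal proposal = new TripProposal(driverId, tripId, pickupLocation);
        restTemplate.postForObject(
            "http://notification-service/notifications",  // hypothetical endpoint
            proposal,
            Void.class);
    }
}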

Some REST APIs are also exposed to the mobile apps used by the drivers and passengers. The apps don’t, however, have direct access to the backend services. Instead, communication is mediated by an intermediary known as an API Gateway. The API Gateway is responsible for tasks such as load balancing, caching, access control, API metering, and monitoring, and can be implemented effectively using NGINX. Later articles in the series will cover the API gateway.

The 'Scale Cube' with functional decomposition into microservices on the Y-axis

The Microservices Architecture pattern corresponds to the Y‑axis scaling of the Scale Cube, which is a 3D model of scalability from the excellent book The Art of Scalability. The other two scaling axes are X‑axis scaling, which consists of running multiple identical copies of the application behind a load balancer, and Z‑axis scaling (or data partitioning), where an attribute of the request (for example, the primary key of a row or identity of a customer) is used to route the request to a particular server.

Applications typically use the three types of scaling together. Y‑axis scaling decomposes the application into microservices as shown above in the first figure in this section. At runtime, X‑axis scaling runs multiple instances of each service behind a load balancer for throughput and availability. Some applications might also use Z‑axis scaling to partition the services. The following diagram shows how the Trip Management service might be deployed with Docker running on Amazon EC2.

Sample microservices app for ride-for-hire service, deployed in Docker containers and fronted by a load balancer

At runtime, the Trip Management service consists of multiple service instances. Each service instance is a Docker container. To be highly available, the containers run on multiple cloud VMs. In front of the service instances is a load balancer such as NGINX that distributes requests across the instances. The load balancer might also handle other concerns such as caching, access control, API metering, and monitoring.

The Microservices Architecture pattern significantly impacts the relationship between the application and the database. Rather than sharing a single database schema with other services, each service has its own database schema. This approach is at odds with the idea of an enterprise‑wide data model, and it often results in some duplication of data. However, a database schema per service is essential if you want to benefit from microservices, because it ensures loose coupling. The following diagram shows the database architecture for the example application.

Database architecture in sample microservices application for ride service

Each of the services has its own database. Moreover, a service can use a type of database that is best suited to its needs, the so‑called polyglot persistence architecture. For example, Driver Management, which finds drivers close to a potential passenger, must use a database that supports efficient geo‑queries.
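
As an illustration, here is a hypothetical sketch of the kind of geo‑query Driver Management might run, assuming it owns a PostgreSQL database with the PostGIS extension; any datastore with good geospatial support would make the same point. The table and column names are assumptions.

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.util.ArrayList;
    import java.util.List;

    // Hypothetical sketch of the kind of geo-query Driver Management might run,
    // assuming its own PostgreSQL database with the PostGIS extension and a
    // driver_locations table whose location column is of type geography. Table
    // and column names are assumptions for illustration only.
    public class NearbyDriverRepository {

        private final Connection connection;

        public NearbyDriverRepository(Connection connection) {
            this.connection = connection;
        }

        public List<String> findDriversNear(double latitude, double longitude,
                                            double radiusMeters) throws Exception {
            String sql =
                "SELECT driver_id FROM driver_locations " +
                "WHERE ST_DWithin(location, " +
                "ST_SetSRID(ST_MakePoint(?, ?), 4326)::geography, ?)";

            List<String> driverIds = new ArrayList<>();
            try (PreparedStatement stmt = connection.prepareStatement(sql)) {
                stmt.setDouble(1, longitude);   // PostGIS points are (longitude, latitude)
                stmt.setDouble(2, latitude);
                stmt.setDouble(3, radiusMeters);
                try (ResultSet rs = stmt.executeQuery()) {
                    while (rs.next()) {
                        driverIds.add(rs.getString("driver_id"));
                    }
                }
            }
            return driverIds;
        }
    }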

On the surface, the Microservices Architecture pattern is similar to SOA. With both approaches, the architecture consists of a set of services. However, one way to think about the Microservices Architecture pattern is that it’s SOA without the commercialization and perceived baggage of web service specifications (WS‑*) and an Enterprise Service Bus (ESB). Microservice‑based applications favor simpler, lightweight protocols such as REST rather than WS‑*. They also avoid ESBs, instead implementing ESB‑like functionality in the microservices themselves. The Microservices Architecture pattern also rejects other parts of SOA, such as the concept of a canonical schema.

The Benefits of Microservices

The Microservices Architecture pattern has a number of important benefits. First, it tackles the problem of complexity. It decomposes what would otherwise be a monstrous monolithic application into a set of services. While the total amount of functionality is unchanged, the application has been broken up into manageable chunks or services. Each service has a well‑defined boundary in the form of an RPC‑ or message‑driven API. The Microservices Architecture pattern enforces a level of modularity that in practice is extremely difficult to achieve with a monolithic code base. Consequently, individual services are much faster to develop, and much easier to understand and maintain.

Second, this architecture enables each service to be developed independently by a team that is focused on that service. The developers are free to choose whatever technologies make sense, provided that the service honors the API contract. Of course, most organizations would want to avoid complete anarchy and limit technology options. However, this freedom means that developers are no longer obligated to use the possibly obsolete technologies that existed at the start of a new project. When writing a new service, they have the option of using current technology. Moreover, since services are relatively small it becomes feasible to rewrite an old service using current technology.

Third, the Microservices Architecture pattern enables each microservice to be deployed independently. Developers never need to coordinate the deployment of changes that are local to their service. These kinds of changes can be deployed as soon as they have been tested. The UI team can, for example, perform A/B testing and rapidly iterate on UI changes. The Microservices Architecture pattern makes continuous deployment possible.

Finally, the Microservices Architecture pattern enables each service to be scaled independently. You can deploy just the number of instances of each service that satisfies its capacity and availability constraints. Moreover, you can use the hardware that best matches a service’s resource requirements. For example, you can deploy a CPU‑intensive image processing service on EC2 Compute Optimized instances and deploy an in‑memory database service on EC2 Memory Optimized instances.

The Drawbacks of Microservices

As Fred Brooks wrote almost 30 years ago, there are no silver bullets. Like every other technology, the Microservices Architecture pattern has drawbacks. One drawback is the name itself. The term microservice places excessive emphasis on service size; in fact, some developers advocate building extremely fine‑grained services of 10–100 lines of code. While small services are preferable, it’s important to remember that they are a means to an end and not the primary goal. The goal of microservices is to sufficiently decompose the application in order to facilitate agile application development and deployment.

Another major drawback of microservices is the complexity that arises from the fact that a microservices application is a distributed system. Developers need to choose and implement an inter‑process communication mechanism based on either messaging or RPC. Moreover, they must also write code to handle partial failure since the destination of a request might be slow or unavailable. While none of this is rocket science, it’s much more complex than in a monolithic application where modules invoke one another via language‑level method/procedure calls.
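
The following sketch illustrates one common way to handle partial failure: bound the remote call with a timeout and fall back to a degraded result if the other service is slow or unavailable. It assumes the passenger web UI is calling the Trip Management service, with an invented URL and endpoint.

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;
    import java.time.Duration;
    import java.util.Collections;
    import java.util.List;

    // A minimal sketch of handling partial failure: the passenger web UI bounds
    // its call to the Trip Management service with a timeout and degrades
    // gracefully (returning an empty list) if the service is slow or
    // unavailable. The URL and endpoint are assumptions.
    public class TripHistoryClient {

        private final HttpClient client = HttpClient.newBuilder()
            .connectTimeout(Duration.ofMillis(500))
            .build();

        public List<String> tripsFor(String passengerId) {
            HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://trip-management/api/passengers/" + passengerId + "/trips"))
                .timeout(Duration.ofMillis(500))   // don't let a slow dependency stall the UI
                .GET()
                .build();
            try {
                HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
                if (response.statusCode() == 200) {
                    return parseTripIds(response.body());
                }
            } catch (Exception e) {
                // The destination is slow or unavailable: fall through to the fallback.
            }
            return Collections.emptyList();   // fallback: render the page without trip history
        }

        private List<String> parseTripIds(String json) {
            // JSON parsing elided; a real client would use a JSON library here.
            return Collections.emptyList();
        }
    }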

Another challenge with microservices is the partitioned database architecture. Business transactions that update multiple business entities are fairly common. These kinds of transactions are trivial to implement in a monolithic application because there is a single database. In a microservices‑based application, however, you need to update multiple databases owned by different services. Using distributed transactions is usually not an option, and not only because of the CAP theorem; they are simply not supported by many of today’s highly scalable NoSQL databases and messaging brokers. You end up having to use an approach based on eventual consistency, which is more challenging for developers.
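
Here is a minimal sketch of the eventual‑consistency approach: the service commits a change to its own database and then publishes an event that other services consume to update theirs. The TripRepository and MessageBroker interfaces are hypothetical, introduced only for illustration.

    // A minimal sketch of the eventual-consistency approach: the service updates
    // only its own database inside a local transaction, then publishes an event
    // that other services consume to bring their own databases up to date.
    // TripRepository and MessageBroker are hypothetical interfaces introduced
    // only for illustration.
    public class TripService {

        public interface TripRepository {
            void save(String tripId, String passengerId, String status);
        }

        public interface MessageBroker {
            void publish(String channel, String payload);
        }

        private final TripRepository trips;
        private final MessageBroker broker;

        public TripService(TripRepository trips, MessageBroker broker) {
            this.trips = trips;
            this.broker = broker;
        }

        public void createTrip(String tripId, String passengerId) {
            // 1. Local transaction against this service's own database only.
            trips.save(tripId, passengerId, "REQUESTED");

            // 2. Publish an event; Driver Management and other interested services
            //    update their own data when they consume it, so the system becomes
            //    consistent eventually rather than atomically.
            broker.publish("trips",
                "{\"event\":\"TripCreated\",\"tripId\":\"" + tripId + "\"}");
        }
    }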

Testing a microservices application is also much more complex. For example, with a modern framework such as Spring Boot it is trivial to write a test class that starts up a monolithic web application and tests its REST API. In contrast, a similar test class for a service would need to launch that service and any services that it depends upon (or at least configure stubs for those services). Once again, this is not rocket science but it’s important to not underestimate the complexity of doing this.
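
For comparison, here is a minimal sketch of that kind of monolith‑style test, assuming a Spring Boot application with an example /api/trips/1 endpoint. The whole application starts on a random port and the test exercises its REST API directly; a comparable test for a single microservice would also need to launch or stub the services it depends on.

    import org.junit.jupiter.api.Test;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.boot.test.context.SpringBootTest;
    import org.springframework.boot.test.web.client.TestRestTemplate;
    import org.springframework.http.ResponseEntity;

    import static org.junit.jupiter.api.Assertions.assertEquals;

    // A sketch of the "trivial" monolith-style test described above: Spring Boot
    // starts the whole application on a random port and the test exercises its
    // REST API. The /api/trips/1 endpoint is an assumed example; testing a single
    // microservice this way would also require launching or stubbing the services
    // it depends on.
    @SpringBootTest(webEnvironment = SpringBootTest.WebEnvironment.RANDOM_PORT)
    class TripApiTest {

        @Autowired
        private TestRestTemplate restTemplate;

        @Test
        void returnsTripDetails() {
            ResponseEntity<String> response =
                restTemplate.getForEntity("/api/trips/1", String.class);
            assertEquals(200, response.getStatusCode().value());
        }
    }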

Another major challenge with the Microservices Architecture pattern is implementing changes that span multiple services. For example, let’s imagine that you are implementing a story that requires changes to services A, B, and C, where A depends upon B and B depends upon C. In a monolithic application you could simply change the corresponding modules, integrate the changes, and deploy them in one go. In contrast, in a Microservices Architecture pattern you need to carefully plan and coordinate the rollout of changes to each of the services. For example, you would need to update service C, followed by service B, and then finally service A. Fortunately, most changes typically impact only one service and multi‑service changes that require coordination are relatively rare.

Deploying a microservices‑based application is also much more complex. A monolithic application is simply deployed on a set of identical servers behind a traditional load balancer. Each application instance is configured with the locations (host and ports) of infrastructure services such as the database and a message broker. In contrast, a microservice application typically consists of a large number of services. For example, Hailo has 160 different services and Netflix has over 600 according to Adrian Cockcroft [Editor – Hailo has been acquired by MyTaxi.]. Each service will have multiple runtime instances. That’s many more moving parts that need to be configured, deployed, scaled, and monitored. In addition, you will also need to implement a service discovery mechanism (discussed in a later post) that enables a service to discover the locations (hosts and ports) of any other services it needs to communicate with. Traditional trouble ticket‑based and manual approaches to operations cannot scale to this level of complexity. Consequently, successfully deploying a microservices application requires greater control of deployment methods by developers, and a high level of automation.

One approach to automation is to use an off‑the‑shelf PaaS such as Cloud Foundry. A PaaS provides developers with an easy way to deploy and manage their microservices. It insulates them from concerns such as procuring and configuring IT resources. At the same time, the systems and network professionals who configure the PaaS can ensure compliance with best practices and with company policies. Another way to automate the deployment of microservices is to develop what is essentially your own PaaS. One typical starting point is to use a clustering solution, such as Kubernetes, in conjunction with a technology such as Docker. Later in this series we will look at how software‑based application delivery approaches like NGINX Plus, which easily handles caching, access control, API metering, and monitoring at the microservice level, can help solve this problem.

Summary

Building complex applications is inherently difficult. A monolithic architecture only makes sense for simple, lightweight applications; you will end up in a world of pain if you use it for a complex application. The Microservices Architecture pattern is the better choice for complex, evolving applications, despite its drawbacks and implementation challenges.

In later blog posts, I’ll dive into the details of various aspects of the Microservices Architecture pattern and discuss topics such as service discovery, service deployment options, and strategies for refactoring a monolithic application into services.

Stay tuned…

Editor – This seven‑part series of articles is now complete:

  1. Introduction to Microservices (this article)
  2. Building Microservices: Using an API Gateway
  3. Building Microservices: Inter-Process Communication in a Microservices Architecture
  4. Service Discovery in a Microservices Architecture
  5. Event-Driven Data Management for Microservices
  6. Choosing a Microservices Deployment Strategy
  7. Refactoring a Monolith into Microservices

You can also download the complete set of articles, plus information about implementing microservices using NGINX Plus, as an ebook – Microservices: From Design to Deployment.

Guest blogger Chris Richardson is the founder of the original CloudFoundry.com, an early Java PaaS (Platform as a Service) for Amazon EC2. He now consults with organizations to improve how they develop and deploy applications.