OpenTelemetry (OTel) is an open source project that provides a vendor-neutral standard for collecting, processing, and exporting telemetry data from distributed systems (such as a microservices architecture). This unified approach to observability makes it easier for developers to analyze their software’s performance and behavior, and so to diagnose and debug issues in their applications. OTel collects three types of telemetry data: traces, metrics, and logs.
Types of Data Generated by OTel
A trace records the events that happen during an operation such as the handling of a single request. The trace is divided into a series of spans, each of them representing a unit of work.
For example, the trace for a web request might include three spans:
- Accepting the request
- Querying the database
- Sending a response
A trace slices up a data flow, which may span multiple services, into a series of chronologically ordered chunks to help you understand:
- All the steps that happened in each chunk
- The order in which the chunks executed
- How long each step lasted
- Metadata about each step
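To make these ideas concrete, here is a minimal sketch in plain Python of a trace modeled as a list of spans. It is a toy model, not the OTel API: the `Span` class, its field names, the attribute keys, and the timing values are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    """Toy stand-in for a span: one unit of work inside a trace."""
    name: str
    start_ms: int  # start time, in ms since the request arrived
    end_ms: int    # end time, in ms since the request arrived
    attributes: dict = field(default_factory=dict)  # metadata about the step

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms


# Hypothetical spans for the web request example, deliberately out of order.
trace = [
    Span("send_response", 45, 50, {"http.status_code": 200}),
    Span("accept_request", 0, 5, {"http.method": "GET"}),
    Span("query_database", 5, 45, {"db.system": "postgresql"}),
]

# Sort by start time to recover the order in which the steps executed.
ordered = sorted(trace, key=lambda s: s.start_ms)


def summarize(spans):
    """Return (name, duration) pairs in the order given."""
    return [(s.name, s.duration_ms) for s in spans]


for name, duration in summarize(ordered):
    print(f"{name}: {duration} ms")
```

Sorting by start time answers "in what order did things run", the duration answers "how long did each step last", and the attributes carry the metadata about each step.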
Once OTel has generated traces, the next step is to export them into a tracing backend or tool for analysis. OTel provides a set of exporters for popular backends such as Jaeger, Zipkin, and AWS X‑Ray. These services provide tools for analyzing and visualizing trace data.
In OTel, metrics are measurements of specific aspects of a system’s behavior, collected over time. Each measurement carries key‑value pairs (known as metric labels) that provide context about it. For example, a metric for the response time of a web service might include labels for the HTTP status code, the endpoint, and the HTTP method. All metrics are also timestamped, again to enable chronological ordering.
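As a rough illustration of how labels give context to a measurement, the following sketch stores timestamped response times grouped by their label set. It uses only the Python standard library and is not the OTel metrics API; the `record` and `average` helpers, the metric name, and the label keys are hypothetical.

```python
import time
from collections import defaultdict

# Each series is identified by its metric name plus a sorted tuple of labels.
series = defaultdict(list)


def record(name, value, labels, timestamp=None):
    """Store one timestamped measurement under its label set."""
    ts = time.time() if timestamp is None else timestamp
    key = (name, tuple(sorted(labels.items())))
    series[key].append((ts, value))
    return key


# Hypothetical response-time measurements for a web service.
ok = {"http.status_code": "200", "endpoint": "/messages", "http.method": "GET"}
err = {"http.status_code": "500", "endpoint": "/messages", "http.method": "GET"}
record("response_time_ms", 120, ok, timestamp=1.0)
record("response_time_ms", 80, ok, timestamp=2.0)
record("response_time_ms", 950, err, timestamp=3.0)


def average(name, labels):
    """Average value of the series that matches the label set exactly."""
    points = series[(name, tuple(sorted(labels.items())))]
    return sum(value for _, value in points) / len(points)


print(average("response_time_ms", ok))
```

Because the status code is part of the label set, successful and failed requests end up in separate series and can be analyzed independently.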
Logs are the oldest and most common method for getting insight into what is going on with a given service. They are generally produced as text and must be parsed to generate insights. At the time of writing, support for logs in OTel is still experimental.
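The parsing step can be sketched as follows. The log format, the sample lines, and the `parse_line` helper are assumptions made for this example; real services use many different formats.

```python
import re

# Hypothetical log format: "<ISO timestamp> <LEVEL> <service>: <message>"
LOG_PATTERN = re.compile(
    r"^(?P<timestamp>\S+) (?P<level>[A-Z]+) (?P<service>[\w-]+): (?P<message>.*)$"
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


lines = [
    "2023-03-01T12:00:00Z INFO messenger: message stored",
    "2023-03-01T12:00:01Z ERROR notifier: dispatch failed",
    "not a structured line",
]

# Keep only the lines that match the expected format.
records = [r for r in map(parse_line, lines) if r is not None]
errors = [r for r in records if r["level"] == "ERROR"]
print(errors)
```

Lines that do not match the expected format are silently skipped here, which is one reason free-text logs are harder to analyze than traces or metrics.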
To learn more about what our solution architects discovered when they compared the observability feature sets in OTel against other observability tools, see Integrating OpenTelemetry into the Modern Apps Reference Architecture – A Progress Report on our blog.
When setting up telemetry instrumentation, it’s best to start with goals that are more specific than “send everything and hope for insights”. While it is true that you can’t know the full extent of what’s possible until you view the data, setting some minimal requirements helps ensure the smooth operation and maintenance of your services.
These goals can be technical concerns like:
- I want to know when my service is under pressure and needs scaling.
- I want to know if my service is restarting often.
But they can also be product and user experience‑related concerns like:
- I want users to see new messages in the system within five seconds.
- I want notifications to be dispatched within one minute of a message being sent.
As an example from our tutorial How to Use OpenTelemetry Tracing to Understand Your Microservices, you might define the following as the key goals:
- Understand all the steps a request takes to accomplish the new message flow.
- Check that the user flow completed successfully.
- Have confidence that the user flow completes in under five seconds from end to end (under “normal” circumstances).
- Learn whether the notifier service is processing the event (dispatched by the messenger service) in a timely manner.
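Once span timings are available, goals like these can be checked mechanically. The sketch below assumes each span has been reduced to a `(service, name, start, end)` tuple with times in seconds; the span names and timing values are invented for illustration.

```python
# Hypothetical spans for one "new message" flow: (service, name, start_s, end_s).
spans = [
    ("messenger", "store_message", 0.0, 0.4),
    ("messenger", "dispatch_event", 0.4, 0.5),
    ("notifier", "process_event", 1.2, 1.9),
]

MAX_FLOW_SECONDS = 5.0  # goal from the text: under five seconds end to end


def flow_duration(spans):
    """Time from the earliest span start to the latest span end."""
    first_start = min(start for _, _, start, _ in spans)
    last_end = max(end for _, _, _, end in spans)
    return last_end - first_start


def handoff_delay(spans, producer, consumer):
    """Gap between the producer's last span ending and the consumer's first span starting."""
    produced = max(end for svc, _, _, end in spans if svc == producer)
    consumed = min(start for svc, _, start, _ in spans if svc == consumer)
    return consumed - produced


print(flow_duration(spans) < MAX_FLOW_SECONDS)
print(round(handoff_delay(spans, "messenger", "notifier"), 3))
```

The first check covers the end-to-end latency goal, and the second measures how long the notifier service took to pick up the event dispatched by the messenger service.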
OTel provides developers with a single set of application programming interfaces (APIs), software development kits (SDKs), and instrumentation libraries they can use to instrument their applications in a consistent and standardized way.
Because the format of the data produced by OTel is considered an industry standard, multiple telemetry aggregation and visualization solutions accept it. You can choose an on‑premises solution, like Jaeger (as we did in the tutorial), or opt for a Software-as-a-Service (SaaS) solution, like Sumo Logic or SigNoz.
Without OTel, managing all three types of telemetry requires a combination of multiple tools. This adds even more complexity on top of the inherent complexity involved with running a microservices architecture and infrastructure.
APIs define the methods, functions, and protocols used by software components to interact with each other. The OTel APIs define a standard set of methods and protocols that developers can use to instrument their applications and collect telemetry data.
SDKs are software development tools provided by the author of a standard or application that make it easier for developers to build applications that conform to the standard or interact with the app. SDKs typically include libraries, code samples, documentation, and tools for testing, debugging, and performance tuning. OTel provides SDKs for tracing, metrics, and resource management.
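The split between an API and an SDK can be illustrated with a small sketch: application code calls an abstract interface, and a concrete implementation is plugged in separately. Every name below (`Tracer`, `NoOpTracer`, `InMemoryTracer`, `set_tracer`, `get_tracer`) is hypothetical and only mirrors the general pattern, not OTel's actual classes.

```python
from abc import ABC, abstractmethod


class Tracer(ABC):
    """Hypothetical 'API' layer: the interface that application code calls."""

    @abstractmethod
    def record_span(self, name: str) -> None: ...


class NoOpTracer(Tracer):
    """Default used when no SDK is configured: calls are accepted and dropped."""

    def record_span(self, name: str) -> None:
        pass


class InMemoryTracer(Tracer):
    """Hypothetical 'SDK' layer: a concrete implementation that keeps the data."""

    def __init__(self):
        self.finished = []

    def record_span(self, name: str) -> None:
        self.finished.append(name)


_tracer: Tracer = NoOpTracer()


def set_tracer(tracer: Tracer) -> None:
    """Install a concrete implementation behind the API."""
    global _tracer
    _tracer = tracer


def get_tracer() -> Tracer:
    return _tracer


def handle_request():
    """Application code depends only on the API, never on a specific SDK."""
    get_tracer().record_span("handle_request")


handle_request()  # no SDK installed: nothing is recorded
sdk = InMemoryTracer()
set_tracer(sdk)
handle_request()  # SDK installed: the span name is recorded
print(sdk.finished)
```

With this separation, instrumented code keeps working whether or not an implementation is installed, and the implementation can be swapped without touching the instrumentation.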