Grafana Tempo is an open source, easy-to-use, and high-scale distributed tracing backend. Tempo is cost-efficient, requiring only object storage to operate, and is deeply integrated with Grafana, Prometheus, and Loki. Tempo can ingest common open source tracing protocols, including Jaeger, Zipkin, and OpenTelemetry.
Why distributed tracing?
Sometimes, when we encounter an issue, metrics and logs alone can’t pinpoint the problem.
Metrics are good for aggregations but lack fine-grained information. Logs are good at revealing what happened sequentially in an application, or maybe even across applications, but they don’t show how a single request behaves as it passes through a service.
This is where tracing comes in. Distributed tracing is a way to track and log a single request as it crosses through all of the services in your infrastructure.
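To make the idea concrete, here is a minimal sketch of how a trace can be modelled: every unit of work is a span, all spans of one request share a trace ID, and each span points at its parent. The service and span names below are hypothetical and only serve as an illustration; this is not Tempo's internal data model.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    trace_id: str             # shared by every span of one request
    span_id: str              # unique per unit of work
    parent_id: Optional[str]  # None for the root span
    service: str
    name: str


def render_trace(spans: list[Span]) -> list[str]:
    """Return the spans of one trace as an indented call tree."""
    children: dict[Optional[str], list[Span]] = {}
    for span in spans:
        children.setdefault(span.parent_id, []).append(span)

    lines: list[str] = []

    def walk(parent_id: Optional[str], depth: int) -> None:
        for span in children.get(parent_id, []):
            lines.append("  " * depth + f"{span.service}: {span.name}")
            walk(span.span_id, depth + 1)

    walk(None, 0)
    return lines


# Hypothetical request crossing several services.
spans = [
    Span("t1", "a", None, "api-gateway", "GET /messages"),
    Span("t1", "b", "a", "message-service", "load messages"),
    Span("t1", "c", "b", "postgres", "SELECT"),
    Span("t1", "d", "a", "auth-service", "verify token"),
]

for line in render_trace(spans):
    print(line)
```

The indentation in the printed tree mirrors what a trace view shows: which service called which, in the context of a single request.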
Why Grafana Tempo?
Tempo speeds up debugging and troubleshooting by letting you move quickly from metrics to the relevant traces, and from there to the specific logs that recorded an issue.
Tempo lets you scale tracing with less operational cost and complexity. Tempo’s only dependency is object storage, and it is built around lookup by trace ID (the Search tab described later adds filtering by service, span, tags, and duration). Unlike other tracing backends, Tempo can reach massive scale without a difficult-to-manage Elasticsearch or Cassandra cluster.
See Get started with Grafana Tempo for more details.
How does Grafana Tempo work?
Deploying Tempo
Tempo can be deployed with a number of tools.
One example is Helm, using the Grafana chart repository at https://grafana.github.io/helm-charts.
See our configured tempo chart for details.
Client instrumentation
To build a tracing pipeline, you need four major components: client instrumentation, pipeline, backend, and visualization.
Client instrumentation is the first building block to a functioning distributed tracing visualization pipeline. It is the process of adding instrumentation points in the application that create and offload spans.
Most of the popular client instrumentation frameworks have SDKs in the most commonly used programming languages. You should pick one according to your application needs.
Using OpenTelemetry instrumentation for Java
OpenTelemetry instrumentation for Java provides a Java agent JAR that can be attached to any Java 8+ application and dynamically injects bytecode to capture telemetry from a number of popular libraries and frameworks.
You can export the telemetry data in a variety of formats. You can also configure the agent and exporter via command line arguments or environment variables. The net result is the ability to gather telemetry data from a Java application without code changes.
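The two configuration styles are related by a simple naming rule: a setting given as a system property on the command line (for example `-Dotel.traces.exporter=otlp`) has an environment variable equivalent obtained by upper-casing the name and replacing `.` and `-` with `_`. The sketch below only illustrates that rule; the helper name `to_env_var` is ours and is not part of the agent.

```python
def to_env_var(property_name: str) -> str:
    """Convert an agent system property name to its environment variable name."""
    return property_name.upper().replace(".", "_").replace("-", "_")


for prop in (
    "otel.traces.exporter",
    "otel.metrics.exporter",
    "otel.exporter.otlp.endpoint",
):
    print(f"-D{prop}=...  <->  {to_env_var(prop)}=...")
```

Environment variables are usually the more convenient form on Kubernetes, because they can be supplied through a ConfigMap without changing the JVM flags.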
Adding dependencies and configuration
To enable automatic instrumentation, one or more dependencies need to be added. How dependencies are added is language specific.
As we are using Kotlin with Gradle, update the target micro-service Gradle file (build.gradle.kts) with the appropriate dependencies and jib configuration:
dependencies {
    // others omitted for brevity

    // tracing
    runtimeOnly("io.opentelemetry.javaagent:opentelemetry-javaagent:1.19.0")
}

jib {
    // others omitted for brevity
    container {
        jvmFlags = listOf(
            "-javaagent:/app/libs/opentelemetry-javaagent-1.19.0.jar"
        )
    }
}
See response-message-manager/build.gradle.kts for an example implementation.
The rest is already pre-configured in the kotlin base chart of our micro-services.
# -- Java Agent OpenTelemetry Integration
tracing:
  # -- Enable creation of OTEL env variables
  enabled: true
  # -- OTEL trace exporter
  traces_exporter: otlp
  # -- OTEL metrics exporter
  metrics_exporter: none
  # -- OTEL exporter endpoint
  endpoint: https://tempo.monitoring.dev.safibank.online

The chart's ConfigMap template turns the tracing values into OTEL environment variables:

apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ include "kotlin.fullname" . }}
  labels:
    {{- include "kotlin.labels" . | nindent 4 }}
data:
  {{- range $key, $value := .Values.env }}
  {{ $key }}: {{ $value | quote }}
  {{- end }}
  TM_KAFKA_CONSUMER_GROUP: {{ .Release.Name | quote }}
  {{- if .Values.tracing.enabled }}
  OTEL_EXPORTER_OTLP_ENDPOINT: {{ .Values.tracing.endpoint | quote }}
  OTEL_METRICS_EXPORTER: {{ .Values.tracing.metrics_exporter | quote }}
  OTEL_RESOURCE_ATTRIBUTES: "service.name={{ .Release.Name }}"
  OTEL_TRACES_EXPORTER: {{ .Values.tracing.traces_exporter | quote }}
  {{- end }}

See the tracing section of SaFiMono/devops/charts/kotlin/README.md for details.
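As a rough illustration of what the chart produces, the sketch below mirrors the tracing part of the ConfigMap template in plain Python. It is not how Helm renders the chart; the function name `otel_env` is ours, and using `response-message-manager` as the release name is an assumption made for the example.

```python
def otel_env(release_name: str, tracing: dict) -> dict:
    """Mirror the OTEL_* entries the ConfigMap template emits for one release."""
    if not tracing.get("enabled"):
        # The template skips the whole OTEL block when tracing is disabled.
        return {}
    return {
        "OTEL_EXPORTER_OTLP_ENDPOINT": tracing["endpoint"],
        "OTEL_METRICS_EXPORTER": tracing["metrics_exporter"],
        "OTEL_RESOURCE_ATTRIBUTES": f"service.name={release_name}",
        "OTEL_TRACES_EXPORTER": tracing["traces_exporter"],
    }


# Default values from the kotlin base chart shown above.
tracing_values = {
    "enabled": True,
    "traces_exporter": "otlp",
    "metrics_exporter": "none",
    "endpoint": "https://tempo.monitoring.dev.safibank.online",
}

# Assumed release name, for illustration only.
for key, value in otel_env("response-message-manager", tracing_values).items():
    print(f"{key}={value}")
```

Because `service.name` is taken from the release name, each micro-service shows up under its own name when you search for traces in Grafana.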
Viewing traces and visualization
Grafana is the last building block of a tracing pipeline and has a built-in Tempo datasource that can be used to query Tempo and visualize traces.
View trace by ID
The most basic functionality is to visualize a trace using its ID. If you have a Trace ID (Identifier for the entire trace), you can jump directly to it. You can query and display traces from Tempo via Explore.
Select the Trace ID tab and enter the ID to view it. This functionality is enabled by default.
See here for the trace view explanation.
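Outside Grafana, the same lookup can be done against Tempo's HTTP API, which exposes a trace at `/api/traces/<traceID>`. The sketch below only builds the URL. The base URL `http://localhost:3200`, the helper name `trace_url`, and the sample ID are assumptions for illustration; use the address of your own Tempo query endpoint.

```python
import re

# Trace IDs are hexadecimal strings of at most 32 characters (128 bits).
TRACE_ID = re.compile(r"[0-9a-fA-F]{1,32}")


def trace_url(base_url: str, trace_id: str) -> str:
    """Build the Tempo HTTP API URL that returns a single trace by its ID."""
    if not TRACE_ID.fullmatch(trace_id):
        raise ValueError(f"not a valid trace ID: {trace_id!r}")
    return f"{base_url.rstrip('/')}/api/traces/{trace_id}"


# Assumed local Tempo address and a made-up trace ID.
print(trace_url("http://localhost:3200", "2f3e0cee77ae5dc9c17ade3689eb2e54"))
```

Validating the ID before building the URL catches copy-paste mistakes early, for example an ID pasted together with surrounding log text.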
View by service, span and others
Traces can be searched for data originating from a specific service, duration range, span, or process-level attributes included in your application’s instrumentation, such as HTTP status code and customer ID.
From the Search tab, you can select the service name to search from, the span name, tags, a minimum and maximum duration, and a limit on the number of search results.
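The same filters can be sent to Tempo's search endpoint, `/api/search`, which accepts `tags` (logfmt-encoded key=value pairs), `minDuration`, `maxDuration`, and `limit`. The sketch below only builds the query URL; the base URL and the helper name `search_url` are assumptions for illustration.

```python
from typing import Optional
from urllib.parse import urlencode


def search_url(
    base_url: str,
    tags: dict,
    min_duration: Optional[str] = None,
    max_duration: Optional[str] = None,
    limit: int = 20,
) -> str:
    """Build a Tempo search URL from tag filters and a duration range."""
    # Tags are logfmt-encoded: space-separated key=value pairs,
    # with values that contain spaces wrapped in double quotes.
    pairs = []
    for key, value in tags.items():
        value = str(value)
        if " " in value:
            value = f'"{value}"'
        pairs.append(f"{key}={value}")

    params = {"tags": " ".join(pairs)}
    if min_duration is not None:
        params["minDuration"] = min_duration
    if max_duration is not None:
        params["maxDuration"] = max_duration
    params["limit"] = str(limit)
    return f"{base_url.rstrip('/')}/api/search?{urlencode(params)}"


# Assumed local Tempo address; the service name is taken from our example service.
print(search_url(
    "http://localhost:3200",
    {"service.name": "response-message-manager"},
    min_duration="100ms",
))
```

Durations use Go-style strings such as `100ms` or `5s`, and the limit caps how many matching traces are returned.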
View by Service Graph (Node graph)
A service graph is a visual representation of the interrelationships between various services. Service graphs help to understand the structure of a distributed system, and the connections and dependencies between its components.
Service graphs infer the topology of a distributed system, provide a high level overview of the health of your system, and a historic view of a system’s topology. Service graphs show error rates and latencies, among other relevant data.
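The inference can be sketched as follows: whenever a span's parent belongs to a different service, that pair of services becomes an edge in the graph, and counting calls and errors per edge gives the figures shown on it. This is a simplified illustration with hypothetical service names, not Tempo's actual implementation.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    span_id: str
    parent_id: Optional[str]
    service: str
    error: bool = False


def service_edges(spans: list[Span]) -> dict:
    """Map (caller, callee) to (call count, error rate) for cross-service calls."""
    by_id = {span.span_id: span for span in spans}
    calls: Counter = Counter()
    errors: Counter = Counter()
    for span in spans:
        parent = by_id.get(span.parent_id)
        # Skip root spans and calls that stay inside one service.
        if parent is None or parent.service == span.service:
            continue
        edge = (parent.service, span.service)
        calls[edge] += 1
        if span.error:
            errors[edge] += 1
    return {edge: (calls[edge], errors[edge] / calls[edge]) for edge in calls}


# Hypothetical spans from one request.
spans = [
    Span("a", None, "api-gateway"),
    Span("b", "a", "message-service"),
    Span("c", "b", "message-service"),        # internal call, no edge
    Span("d", "a", "auth-service", error=True),
    Span("e", "a", "auth-service"),
]

for (caller, callee), (count, error_rate) in service_edges(spans).items():
    print(f"{caller} -> {callee}: {count} calls, {error_rate:.0%} errors")
```

Aggregating over many traces rather than one is what turns these edges into the topology and health overview described above.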
Select Service Graph, then click the Run query button (upper right). Select a node for more details.
Clicking a node on the service graph reveals specific details based on your selection, as shown below.