Tracing is the process of following an application's code or data flow (for example, a request made to the app by a user) to understand performance bottlenecks or find the root cause of an error. Tracing a single service's activity in isolation yields far less insight than following a request across every service it touches. Tracing requests across multiple services is referred to as distributed tracing.
Modern systems are often built as microservices instead of monolithic applications. The downsides of a microservices architecture are its complexity and the way it can hide events, such as the reason a request failed. Distributed tracing helps DevOps teams and SREs track these events and quickly detect what's gone wrong by monitoring requests and how data is exchanged between services. In this article, we will cover how tracing is done in applications that run on Kubernetes, a widely adopted platform for microservices.
Effective tracing is an important aspect of monitoring and debugging applications deployed on Kubernetes. It helps developers keep tabs on their application performance, identify issues, and troubleshoot problems.
One of the keys to effective tracing is using a distributed tracing tool, such as Zipkin or Jaeger. These tools present the complete flow from the initial request to the final response in the application. They provide detailed information about the various components and services involved in the request-response flow, including performance and any errors or exceptions that may have occurred.
Therefore, to initiate tracing on Kubernetes, you must install and configure a distributed tracing tool. This typically involves deploying the tool as a containerized application on your Kubernetes cluster and configuring your application to send trace data to the tool. You can then view and analyze the trace data using the tool's web interface or API.
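For instance, one minimal way to run Jaeger inside a cluster is to deploy its all-in-one image and expose the UI. This is a sketch for experimentation rather than a production setup, and the image tag and service name here are assumptions:

kubectl create deployment jaeger --image=jaegertracing/all-in-one:1.41
kubectl expose deployment jaeger --name=jaeger-ui --port=16686 --target-port=16686
kubectl port-forward service/jaeger-ui 16686:16686

Later in this article we take an even simpler route for demonstration purposes and run the same all-in-one image locally via Docker.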
Another critical element of effective tracing is to instrument the application code to generate trace spans. A span is a unit of work in an application, such as a database query or an HTTP request. A trace consists of multiple spans, and each span can have any number of child spans under it.
If you want to know how a service or application is performing, observe the trail of the spans comprising it, along with their metadata.
To instrument your code with trace spans, use a tracing library like OpenTelemetry. These libraries provide APIs for creating trace spans and adding metadata to them. With this metadata, you can then filter and analyze your trace data.
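To illustrate, here is a minimal sketch of creating a span and attaching metadata with OpenTelemetry's Go SDK. The tracer name "checkout" and the attribute key "order.id" are hypothetical names chosen for this example:

package main

import (
	"context"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
)

func processOrder(ctx context.Context, orderID string) {
	// Obtain a tracer from the globally registered provider.
	tr := otel.Tracer("checkout")

	// Start a span; it becomes a child of any span already in ctx.
	ctx, span := tr.Start(ctx, "process-order")
	defer span.End()

	// Attach metadata so the span can be filtered in the backend.
	span.SetAttributes(attribute.String("order.id", orderID))

	_ = ctx // pass ctx to downstream calls so their spans nest under this one
}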
In this section, we will cover popular tools for implementing distributed tracing. Most of these are open source, but they are also well-maintained and used in industrial applications. Open-source instrumentation provides interoperability with other open-source software. It is also well-maintained by a community of developers who actively adopt it in production environments, and it’s easy to learn via documentation.
OpenTelemetry was created by merging the OpenTracing and OpenCensus projects.
OpenTracing is an open-source specification for distributed tracing that defines a standard set of APIs for instrumenting applications and libraries for various programming languages. It aims to provide a vendor-neutral way to implement distributed tracing in microservices-based applications.
While OpenTelemetry targets observability in general, it is among the best choices for implementing tracing. It provides standardized, vendor-agnostic software development kits (SDKs), APIs, and tools to collect telemetry data and send it to your preferred observability backend. One benefit of using OpenTelemetry over bespoke tracing tooling is that it is a single, robust library: you don't need to install separate software for each need, and its developers have kept dependencies on third-party software to a minimum.
Zipkin was developed by Twitter under an open-source software license. It uses a Dapper-style tracing technique, which involves adding unique identifiers, called "trace IDs", to requests as they flow through the different services in a system (see the sketch after this list). Zipkin comes with the following components:
- A collector that receives and validates trace data sent by instrumented services
- A storage backend (such as Cassandra or Elasticsearch) that persists the traces
- A query service (API) that retrieves traces from storage
- A web UI for searching and visualizing traces
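To make the trace-ID mechanism concrete, here is a hedged Go sketch of how a trace context travels with an outgoing request. It uses OpenTelemetry's propagator, which injects W3C traceparent headers by default (Zipkin itself uses B3 headers, but the mechanism is the same); the downstream URL "http://inventory:8080/stock" is a hypothetical service:

package main

import (
	"context"
	"net/http"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/propagation"
)

func callDownstream(ctx context.Context) error {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, "http://inventory:8080/stock", nil)
	if err != nil {
		return err
	}
	// Copy the current trace context into the outgoing headers so the
	// next service can continue the same trace. This assumes a propagator
	// has been registered via otel.SetTextMapPropagator; the default is a no-op.
	otel.GetTextMapPropagator().Inject(ctx, propagation.HeaderCarrier(req.Header))
	_, err = http.DefaultClient.Do(req)
	return err
}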
Jaeger is another open-source distributed tracing tool that you can use to store, visualize, and filter distributed traces. We will use Jaeger as the distributed tracing client in this article to demonstrate the adoption of tracing in Kubernetes applications.
Jaeger is an Uber Technologies project that is now a CNCF graduate project.
It has components such as:
- Client libraries that instrument applications (now superseded by the OpenTelemetry SDKs)
- An agent that listens for spans and forwards them to the collector
- A collector that receives, processes, and stores traces
- A query service and web UI for retrieving and visualizing traces
Jaeger can be used for the following:
- Distributed context propagation
- Distributed transaction monitoring
- Root cause analysis
- Service dependency analysis
- Performance and latency optimization
Pixie is yet another open-source Kubernetes observability tool. It is a CNCF sandbox project, and it makes use of eBPF to automatically capture telemetry data with no dependence on manual instrumentation.
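If you want to try it, Pixie's documentation describes deployment as a short CLI workflow, roughly as follows (assuming the px CLI is installed and your kubeconfig points at a running cluster):

px deploy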
Site24x7 Tracing is a commercial distributed tracing tool for Kubernetes applications. The main advantages of using Site24x7 Tracing over open-source tools are described below.
Open-source tools are great for customization, development setups, and experimentation. However, they can be a nightmare to work with if you want to combine several of them to get a complete solution. Using a professional tool designed for enterprises with round-the-clock tech support spares users the constant hassle of having to find and fix bugs that come with integrating multiple tools.
Site24x7 will get you started with monitoring within minutes of signing up, with 24/7 tech support and an active community to help users with any questions or problems they might have. You can customize your workflow from the data being sent to the dashboards you want to view.
To demonstrate how to implement tracing with Jaeger and OpenTelemetry in a Kubernetes application, we'll follow these steps:
- Build a Go web application instrumented with OpenTelemetry
- Run Jaeger locally via Docker
- Containerize the application and push the image to a registry
- Deploy the application on Kubernetes with minikube
- View the traces in the Jaeger UI
For our example, we will use Go to build a web application and name it k8sTrace, then create a main.go file with the contents below:
package main

import (
	"context"
	"fmt"
	"log"
	"net/http"
	"os"
	"time"

	"go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp"
	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/exporters/jaeger"
	"go.opentelemetry.io/otel/sdk/resource"
	traceSDK "go.opentelemetry.io/otel/sdk/trace"
	semConv "go.opentelemetry.io/otel/semconv/v1.7.0"
)

const (
	service     = "k8sTrace"
	environment = "development"
	id          = 1
)

func tracerProvider(url string) (*traceSDK.TracerProvider, error) {
	// Create the Jaeger exporter.
	exp, err := jaeger.New(jaeger.WithCollectorEndpoint(jaeger.WithEndpoint(url)))
	if err != nil {
		return nil, err
	}
	tp := traceSDK.NewTracerProvider(
		// Always be sure to batch in production.
		traceSDK.WithBatcher(exp),
		// Record information about this application in a Resource.
		traceSDK.WithResource(resource.NewWithAttributes(
			semConv.SchemaURL,
			semConv.ServiceNameKey.String(service),
			attribute.String("environment", environment),
			attribute.Int64("ID", id),
		)),
	)
	return tp, nil
}

func main() {
	// Tracer destination.
	tp, err := tracerProvider("http://localhost:14268/api/traces")
	if err != nil {
		log.Fatal(err)
	}

	// Register our TracerProvider as the global so any imported
	// instrumentation in the future will default to using it.
	otel.SetTracerProvider(tp)

	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	// Cleanly shut down and flush telemetry when the application exits.
	defer func(ctx context.Context) {
		// Do not make the application hang when it is shut down.
		ctx, cancel = context.WithTimeout(ctx, time.Second*5)
		defer cancel()
		if err := tp.Shutdown(ctx); err != nil {
			log.Fatal(err)
		}
	}(ctx)

	tr := tp.Tracer("component-main")
	ctx, span := tr.Start(ctx, "hello")
	defer span.End()

	// HTTP handlers.
	helloHandler := func(w http.ResponseWriter, r *http.Request) {
		// Use the global TracerProvider; the request context already carries
		// the span started by the otelhttp middleware, so this span nests under it.
		tr := otel.Tracer("hello-handler")
		_, span := tr.Start(r.Context(), "hello")
		defer span.End()
		span.SetAttributes(attribute.Key("testset").String("value"))

		yourName := os.Getenv("MY_NAME")
		fmt.Fprintf(w, "Hello %q!", yourName)
	}

	otelHandler := otelhttp.NewHandler(http.HandlerFunc(helloHandler), "Hello")
	http.Handle("/", otelHandler)

	// Minimal endpoints so the Kubernetes liveness and readiness probes
	// defined later in this article succeed.
	http.HandleFunc("/health", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	http.HandleFunc("/readiness", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})

	log.Println("Listening on localhost:3000")
	log.Fatal(http.ListenAndServe(":3000", nil))
}
Next, we will test the code locally. Initialize a Go module and download the dependencies, provide an environment variable to satisfy MY_NAME, and then run the code:
go mod init k8strace && go mod tidy
export MY_NAME="Alice" ; go run main.go
To confirm that the application works locally, visit http://localhost:3000 in a web browser; it should respond with Hello "Alice"!.
Next, we’ll set up distributed tracing with Jaeger.
We need to start Jaeger before running the application on Kubernetes. Run the command below while the Docker engine is running:
$ docker run -d --name jaeger \
  -e COLLECTOR_ZIPKIN_HTTP_PORT=9411 \
  -p 5775:5775/udp \
  -p 6831:6831/udp \
  -p 6832:6832/udp \
  -p 5778:5778 \
  -p 16686:16686 \
  -p 14268:14268 \
  -p 9411:9411 \
  jaegertracing/all-in-one:1.6
Visit the Jaeger UI at http://localhost:16686 to see the default home page:
Now that we have the Jaeger UI running as a daemon via Docker, we will deploy the application on Kubernetes and run it.
To deploy the application on Kubernetes, we’ll first build the image for the application. Write a Dockerfile for the application. The Dockerfile content used in this article is as follows:
FROM golang:1.19-alpine AS build
WORKDIR /src
COPY . .
RUN go mod download
RUN GOOS=linux GOARCH=amd64 go build -v -o k8strace

FROM alpine:3.17.2
ENV VERSION=1.1.4
COPY --from=build /src/k8strace /usr/local/bin/k8strace
RUN chmod +x /usr/local/bin/k8strace
EXPOSE 3000
CMD ["k8strace"]
Next, run the build command:
docker build -t k8strace .
This will create a Docker image for the application based on the Dockerfile content.
Next, push the Docker image to a container registry such as Docker Hub (replace your_name with your registry username). If you're deploying on AWS or GCP, you can push the image to the container registry on those platforms instead.
docker tag k8strace your_name/k8strace:1.0
docker push your_name/k8strace:1.0
This will upload the Docker image to the Docker container registry where Kubernetes can access it.
Now we'll create a Kubernetes deployment manifest for the application. The contents of the file, which we'll name k8s-deployment.yaml, are as follows:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: k8strace
spec:
  replicas: 1
  selector:
    matchLabels:
      app: k8strace
  template:
    metadata:
      labels:
        app: k8strace
    spec:
      containers:
        - name: k8strace
          image: your_name/k8strace:1.0
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 3000
          env:
            - name: MY_NAME
              value: "Bob"
          livenessProbe:
            httpGet:
              path: /health
              port: 3000
              scheme: HTTP
            initialDelaySeconds: 5
            periodSeconds: 15
            timeoutSeconds: 5
          readinessProbe:
            httpGet:
              path: /readiness
              port: 3000
              scheme: HTTP
            initialDelaySeconds: 5
            timeoutSeconds: 1
The manifest will create a deployment with a single replica of the application. The env field specifies the environment variable needed by the application (i.e., MY_NAME set to "Bob").
Make sure minikube is started:
minikube start
Apply the Kubernetes deployment manifest with this command:
kubectl apply -f k8s-deployment.yaml
This will create the Kubernetes deployment using the manifest written above. Once the application is deployed, Kubernetes will automatically create a pod to run the application. Expose the application for external access using the command below:
kubectl expose deployment k8strace --type=LoadBalancer --port=80 --target-port=3000
You can access the application by navigating to the load balancer's external IP address in a web browser. To get the IP address, run the following (on minikube, the external IP may remain pending unless you run minikube tunnel in a separate terminal):
kubectl get services k8strace -o wide
Because Jaeger is running locally via Docker, its UI is already reachable at http://localhost:16686; there is no need to port-forward anything for it. (Had you deployed Jaeger inside the cluster instead, you could forward its UI port with kubectl port-forward deployment/jaeger 16686:16686.) One caveat: inside a pod, localhost no longer refers to your machine, so for spans to arrive from the deployed application, the exporter URL passed to tracerProvider must point at a Jaeger collector the pod can reach, such as a Jaeger instance running in the cluster, as sketched earlier. Generate a few requests against the application, then search the Jaeger UI for the service k8sTrace. Once you find its traces, you can view the details of individual spans, including the duration, tags, and logs associated with each span.
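If you prefer the command line, Jaeger's query service also exposes an HTTP endpoint you can hit directly; note that this API is internal to Jaeger and not a stable public contract, so treat this as a convenience rather than a supported interface:

curl "http://localhost:16686/api/traces?service=k8sTrace&limit=5"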
Effective tracing is key to monitoring and debugging applications deployed on Kubernetes. You can gain deeper insight into application performance by using a distributed tracing tool, instrumenting your code with trace spans, and analyzing the results in a backend like Jaeger.
It is advisable to regularly review and analyze your trace data to identify trends and patterns. Adopting this practice can help you identify potential issues before they grow into major problems and ensure that your applications continue to run smoothly.