Distributed Tracing
Distributed Tracing lets you follow a transaction as it moves from one application to another. You can monitor the calls made between applications and identify the source of a problem.
Trace
A trace is the execution flow of a transaction, including method calls, database queries, and exceptions. When a transaction exceeds a certain threshold value, it is captured as a trace.
Distributed tracing
Modern applications have evolved from monoliths into microservices and distributed systems. Analyzing issues in such environments means correlating data from multiple services spread across the system, which calls for a more sophisticated tracing mechanism to diagnose and troubleshoot problems. Distributed tracing meets this need by tracing the entire path of a web transaction across application or service boundaries.
Prerequisites to enable Distributed Tracing
Make sure you have installed APM Insight agents across all your applications that need to be monitored.
Ensure that your agents are updated to the following versions or higher:
- Java agent version 4.5.1
- .NET agent version 4.5.2
- Node.js agent version 1.7
- PHP agent version 3.2
- Python agent version 1.2.1
Distributed traces can be viewed only for applications under the same account. If your applications are spread across multiple accounts, you cannot get a consolidated view of traces across those accounts.
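As a rough aid for checking the version prerequisite, the sketch below compares an installed agent version against the minimums listed above. The agent keys (`java`, `dotnet`, and so on) and the function names are hypothetical, and how you obtain the installed version string is left to you; only the version numbers come from this page.

```python
# Minimum agent versions from the prerequisites list.
# The dictionary keys are illustrative labels, not official identifiers.
MIN_VERSIONS = {
    "java": (4, 5, 1),
    "dotnet": (4, 5, 2),
    "nodejs": (1, 7),
    "php": (3, 2),
    "python": (1, 2, 1),
}


def parse_version(text):
    """Turn a dotted version string such as '4.5.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def pad(version, width):
    """Right-pad a version tuple with zeros so '1.7' compares equal to '1.7.0'."""
    return version + (0,) * (width - len(version))


def meets_minimum(agent, installed):
    """Return True if the installed version is at or above the listed minimum."""
    required = MIN_VERSIONS[agent]
    found = parse_version(installed)
    width = max(len(required), len(found))
    return pad(found, width) >= pad(required, width)


if __name__ == "__main__":
    for agent, installed in [("java", "4.5.0"), ("nodejs", "1.10")]:
        status = "ok" if meets_minimum(agent, installed) else "needs update"
        print(f"{agent} {installed}: {status}")
```

Versions are compared numerically, part by part, so `1.10` is treated as newer than `1.7`. Version strings with non-numeric suffixes are not handled in this sketch.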
How Distributed Tracing works in Site24x7
1. When one application or service makes an API request to another application or service, Site24x7 APM Insight adds an HTTP header containing the metadata required to track the service request.
2. If Site24x7 APM Insight is enabled in the receiving service, it processes the HTTP header and returns the tracking information to the caller via the response headers.
3. If the other service, in turn, calls another service, the header propagation continues, thus tracing all paths taken by the original application or service across the distributed system.
- Traces are captured only when a request exceeds the specified threshold level in any of the services.
- If your application uses any proxy service, ensure that the corresponding keys, X-Site24x7-Id and X-Site24x7-DT-Data, are added to request and response headers for the agent to track the transaction.
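The proxy note above can be illustrated with a minimal sketch of a header pass-through helper. It assumes the proxy exposes request and response headers as plain dictionaries; the function name and the surrounding proxy logic are hypothetical. Only the two header names come from this page, and their values are treated as opaque strings.

```python
# Header names from the note above. Everything else here is illustrative.
TRACE_HEADERS = ("X-Site24x7-Id", "X-Site24x7-DT-Data")


def copy_trace_headers(source, target):
    """Return a copy of `target` with the tracing headers from `source` added.

    Header names are matched case-insensitively, since HTTP header names
    are not case-sensitive. Values are copied unchanged.
    """
    wanted = {name.lower(): name for name in TRACE_HEADERS}
    result = dict(target)
    for key, value in source.items():
        canonical = wanted.get(key.lower())
        if canonical is not None:
            result[canonical] = value
    return result


if __name__ == "__main__":
    # Request direction: client -> proxy -> backend service.
    incoming = {"Accept": "*/*", "x-site24x7-id": "example-id"}
    upstream_request = copy_trace_headers(incoming, {"Host": "backend.internal"})
    print(upstream_request)

    # Response direction: backend service -> proxy -> client.
    backend_response = {"Content-Type": "text/html", "X-Site24x7-DT-Data": "example-data"}
    client_response = copy_trace_headers(backend_response, {"Content-Type": "text/html"})
    print(client_response)
```

The same helper is applied in both directions because the agent relies on the headers in the request as well as in the response.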
Viewing the data
To view distributed traces, follow the steps below:
- Log in to your Site24x7 account and go to APM > APM Insight.
- Click the application you want to inspect and open the Traces tab.
- Open the Distributed tab to see the list of traces that flow to other applications, along with the corresponding transactions.
- Click a trace to view its complete path.
Metrics and inference
The Distributed Tracing tab provides various metrics to help you analyze the results of a trace. Consider the following example:
- httpURL.do is the parent transaction from the application Zylker.
- EmpSite_EmpApp and NodeApp are the other two applications that the parent transaction calls to complete its request.
- dynamodb/createtable, oracle/viewdetails, and customer/add are the corresponding transactions called by the parent transaction, httpURL.do.
The top bar displays detailed metrics such as the total response time taken by the transactions (in this case, 8,883 ms), the partial response time contributed by one of the transactions (3,623 ms), memory usage, and more. Click Expand Tree View to the right of the application name to view the method calls and code blocks that took a significant amount of time to complete the request.
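To put the two response times from this example in proportion, the short sketch below computes the share of the total that one transaction contributed. The function name is hypothetical; the figures are the ones quoted above.

```python
def response_time_share(partial_ms, total_ms):
    """Return the fraction of the total response time spent in one transaction."""
    if total_ms <= 0:
        raise ValueError("total_ms must be positive")
    return partial_ms / total_ms


if __name__ == "__main__":
    share = response_time_share(3623, 8883)
    print(f"{share:.1%}")  # prints 40.8%
```

In this example, a single downstream transaction accounts for roughly two fifths of the total response time, which makes it a natural first place to look.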
Without distributed tracing, these transaction traces are viewed separately and without context. With it, you can see precisely which method calls a transaction made from a particular application and how they exceeded the threshold level, which simplifies debugging.