Distributed tracing
- Jaeger 101 -
About me
● Worked at eBay
● Worked at Forter as a backend engineer.
● Joined Rookout as the first developer and production engineer
● @itielshwartz on both GitHub and Twitter
● Also have a personal blog at: https://etlsh.com
Agenda
Intro:
1. State of mind for this Meetup (Super important!)
2. What is distributed tracing, do I need it?
3. What is OpenTracing?
4. What is Jaeger?
Zero to hero using Jaeger:
1. hello-world example
2. Jaeger terminology
3. Full blown distributed app
Wrap up
1. Demo wrap up
2. Jaeger architecture
3. OpenTracing's secret ability
Before we begin (State of mind)
● The system will fail
● Your code is not perfect
● Other people's code is even less perfect
● Practice new tools during the day, don’t start using them in crisis mode
● The system will fail
● Each minute you spend adding logs and metrics can reduce your Mean Time to Resolve (MTTR)
● Keep in mind that the developer who’s going to get paged isn’t the one who wrote the code
● Try to be nice to them - they are going to need it
● The system will fail
As you can probably see, I (tried to) emphasize the fact that your system is going to fail. This DOESN'T mean I think you
write bad code - only that we usually have much more trust in our code/infra than we should :)
What is distributed tracing?
With distributed tracing, we can track requests as they pass through multiple services, emitting timing and other metadata
throughout, and this information can then be reassembled to provide a complete picture of the application’s behavior at
runtime - buoyant
Mental model of distributed tracing - Opentracing
Do I need distributed tracing?
As companies move from monolithic to multi-service architectures, existing techniques for debugging and profiling begin to
break down.
Previously, troubleshooting could be accomplished by isolating a single instance of the monolith and reproducing the
problem.
With microservices, this approach is no longer feasible, because no single service provides a complete picture of the
performance or correctness of the application as a whole.
We need new tools to help us manage the real complexity of operating distributed systems at scale. - buoyant
What is OpenTracing?
The problem is that distributed tracing has long harbored a dirty secret: the necessary source code instrumentation has
been complex, fragile, and difficult to maintain.
This is the problem that OpenTracing solves.
Through standard, consistent APIs in many languages (Java, Javascript, Go, Python, C#, others), the OpenTracing project
gives developers clean, declarative, testable, and vendor-neutral instrumentation.
OpenTracing has focused on standards for explicit software instrumentation.
What is Jaeger?
Jaeger, inspired by Dapper and OpenZipkin, is a distributed tracing system released as open source by
Uber Technologies.
It can be used for monitoring microservices-based distributed systems:
● Distributed context propagation
● Distributed transaction monitoring
● Root cause analysis
● Service dependency analysis
● Performance / latency optimization
Getting started - The Monolith
https://github.com/itielshwartz/jaeger-hello-world/tree/step-1-the-monolith
Getting started - Monolith going wild
https://github.com/itielshwartz/jaeger-hello-world/tree/step-2-the-monolith-going-wild
Jaeger terminology - Span/ Trace
Span
A span represents a logical unit of work in Jaeger that has an operation name, the start time of the operation, and the
duration. Spans may be nested and ordered to model causal relationships.
Trace
A trace is a data/execution path through the system, and can be thought of as a directed acyclic graph of spans.
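To make the two definitions concrete, here is a minimal sketch in plain Python. It is a toy model, not the Jaeger or OpenTracing API: the `Span` class and `render` helper are made up for illustration, and the operation names are simply borrowed from the demo app.

```python
import time
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Span:
    """Toy span: an operation name, a start time, and a duration."""
    operation_name: str
    parent: Optional["Span"] = field(default=None, repr=False, compare=False)
    start_time: float = field(default_factory=time.time)
    duration: Optional[float] = None
    children: List["Span"] = field(default_factory=list)

    def __post_init__(self):
        # Nesting: a child span registers itself under its parent.
        if self.parent is not None:
            self.parent.children.append(self)

    def finish(self):
        self.duration = time.time() - self.start_time

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.finish()
        return False


def render(span, depth=0):
    """Return the trace rooted at `span` as indented operation names."""
    lines = ["  " * depth + span.operation_name]
    for child in span.children:
        lines.extend(render(child, depth + 1))
    return lines


# One trace: a root span with two nested child spans.
with Span("handle_request") as root:
    with Span("get_repo_contributors", parent=root):
        pass
    with Span("clean_github_data", parent=root):
        pass

trace_lines = render(root)
print("\n".join(trace_lines))
```

The trace here is just the tree hanging off the root span; a real tracer also assigns trace and span ids so the pieces can be reassembled across processes.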
Getting started - Adding Jaeger
https://github.com/itielshwartz/jaeger-hello-world/tree/step-3-adding-jaeger
Config Jaeger part II - Multiple spans
https://github.com/itielshwartz/jaeger-hello-world/tree/step-4-multiple-spans
Jaeger terminology - Tag / Log
The recommended solution is to annotate spans with tags or logs.
Tag:
A tag is a key-value pair that provides certain metadata about the span.
Log:
A log is similar to a regular log statement: it contains a timestamp and some data, but it is associated with the span from which
it was logged.
When and why?
When should we use tags vs. logs? The tags are meant to describe attributes of the span that apply to the whole duration of
the span. For example, if a span represents an HTTP request, then the URL of the request should be recorded as a tag
because it does not make sense to think of the URL as something that's only relevant at different points in time on the span.
On the other hand, if the server responded with a redirect URL, logging it would make more sense since there is a clear
timestamp associated with such event. The OpenTracing Specification provides guidelines called Semantic Conventions for
recommended tags and log fields.
https://github.com/yurishkuro/opentracing-tutorial/tree/master/python/lesson01#annotate-the-trace-with-tags-and-logs
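The tag vs. log distinction can be sketched with a toy span in plain Python. The method names `set_tag` and `log_kv` mirror the OpenTracing Python API, but the class itself is a stand-in written for this example, and the URLs are placeholders.

```python
import time


class Span:
    """Toy span that keeps tags (whole-span metadata) and logs (timestamped events)."""

    def __init__(self, operation_name):
        self.operation_name = operation_name
        self.tags = {}   # key -> value, valid for the whole span
        self.logs = []   # list of (timestamp, fields) tuples

    def set_tag(self, key, value):
        self.tags[key] = value
        return self

    def log_kv(self, key_values, timestamp=None):
        when = timestamp if timestamp is not None else time.time()
        self.logs.append((when, dict(key_values)))
        return self


span = Span("http_request")

# The request URL applies to the whole span, so it is a tag.
span.set_tag("http.url", "http://example.com/contributors")

# A redirect happens at one moment in time, so it is a log.
span.log_kv({"event": "redirect", "location": "http://example.com/login"})

print(span.tags)
print(span.logs)
```

`http.url` is one of the tag names recommended by the Semantic Conventions; the log field names here are only illustrative.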
Config Jaeger part III - Tags and Logs
https://github.com/itielshwartz/jaeger-hello-world/tree/step-5-tags-and-logs
Going distributed
Until now we had a single server (which kind of defeats the purpose of distributed tracing).
Now let’s split our monolith into small parts - we will still have a main server (customer facing), but now we will split
get_repo_contributors and clean_github_data into two different services.
get_repo_contributors - will be a Flask server (same as our main)
clean_github_data - will consume data from Redis (pushed to it by the main server)
So basically it’s going to look like this:
[Diagram: Main calls get_repo_contributors; Main writes to Redis, and clean_github_data polls Redis]
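Once the work is split across processes, each service has to hand its trace context to the next one. Below is a minimal sketch of that hand-off in plain Python, with no tracer library. The `uber-trace-id` header name and its `trace-id:span-id:parent-id:flags` layout follow Jaeger's propagation format; the helper functions and the dict-based "span context" are assumptions made for this example.

```python
import random

TRACE_HEADER = "uber-trace-id"  # Jaeger's trace context header


def new_id():
    return random.getrandbits(64)


def start_span(parent=None):
    """Create a toy span context; a child keeps the parent's trace id."""
    if parent is None:
        return {"trace_id": new_id(), "span_id": new_id(), "parent_id": 0, "sampled": True}
    return {
        "trace_id": parent["trace_id"],
        "span_id": new_id(),
        "parent_id": parent["span_id"],
        "sampled": parent["sampled"],
    }


def inject(ctx, carrier):
    """Write the context into a dict carrier (HTTP headers or a queue message)."""
    flags = 1 if ctx["sampled"] else 0
    carrier[TRACE_HEADER] = "{:x}:{:x}:{:x}:{:x}".format(
        ctx["trace_id"], ctx["span_id"], ctx["parent_id"], flags
    )
    return carrier


def extract(carrier):
    """Read the context back on the receiving side, or None if absent."""
    raw = carrier.get(TRACE_HEADER)
    if raw is None:
        return None
    trace_id, span_id, parent_id, flags = (int(part, 16) for part in raw.split(":"))
    return {"trace_id": trace_id, "span_id": span_id,
            "parent_id": parent_id, "sampled": bool(flags & 1)}


# Main server: one span, handed to both downstream services.
main_span = start_span()
http_headers = inject(main_span, {})                               # call to get_repo_contributors
queue_message = inject(main_span, {"payload": "raw github data"})  # message pushed to Redis

# Downstream services continue the same trace.
contributors_span = start_span(parent=extract(http_headers))
cleaner_span = start_span(parent=extract(queue_message))

print(contributors_span["trace_id"] == main_span["trace_id"])
```

The same carrier idea works for both transports: HTTP headers for the Flask service, and extra fields on the message for the Redis consumer.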
Going distributed - Single span
https://github.com/itielshwartz/jaeger-hello-world/tree/step-6-distribute-single-span
Going distributed - Multiple spans
https://github.com/itielshwartz/jaeger-hello-world/tree/step-7-distribute-multiple-spans
Demo wrap up
We have now successfully transformed a monolithic beast into a set of small microservices - without losing visibility.
The nice thing about OpenTracing is that it allows us to move from Jaeger to Datadog to another solution without (almost)
needing to rewrite our code.
The other cool thing about it is that you don’t need to do everything I just did in this demo!
There are official wrappers for most of the common frameworks. These tools let you integrate with OpenTracing and
Jaeger without needing to think about “how do I pass the headers inside the request?” or “how do I read the headers to start
a new span?”
Examples:
● urllib2
● requests
● SQLAlchemy
● MySQLdb
● Tornado HTTP client
● redis
● Flask
● Django
● More
Jaeger Architecture
Agent
The Jaeger agent is a network daemon that listens for spans sent over UDP, which it batches and sends to the collector. It
is designed to be deployed to all hosts as an infrastructure component. The agent abstracts the routing and discovery of the
collectors away from the client.
Collector
The Jaeger collector receives traces from Jaeger agents and runs them through a processing pipeline. Currently our
pipeline validates traces, indexes them, performs any transformations, and finally stores them.
Jaeger’s storage is a pluggable component which currently supports Cassandra and Elasticsearch.
Query
Query is a service that retrieves traces from storage and hosts a UI to display them.
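The flow between the three components can be sketched as a small in-memory simulation. This is only a mental model under loud assumptions: the class names echo the components above, but the dict-based spans, the batch size, and the validation rule are invented for the example, and a plain dict stands in for Cassandra/Elasticsearch.

```python
class Collector:
    """Validates incoming spans and stores them, grouped by trace id."""

    REQUIRED_FIELDS = ("trace_id", "operation_name", "duration")

    def __init__(self):
        self.storage = {}  # trace_id -> list of spans (stand-in for real storage)

    def receive(self, batch):
        for span in batch:
            # Validation step: drop spans that miss a required field.
            if all(name in span for name in self.REQUIRED_FIELDS):
                self.storage.setdefault(span["trace_id"], []).append(span)


class Agent:
    """Buffers spans from the client and forwards them to the collector in batches."""

    def __init__(self, collector, batch_size=2):
        self.collector = collector
        self.batch_size = batch_size
        self.buffer = []

    def report(self, span):
        self.buffer.append(span)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.buffer:
            self.collector.receive(self.buffer)
            self.buffer = []


class Query:
    """Reads traces back from storage."""

    def __init__(self, collector):
        self.storage = collector.storage

    def get_trace(self, trace_id):
        return self.storage.get(trace_id, [])


collector = Collector()
agent = Agent(collector, batch_size=2)
query = Query(collector)

agent.report({"trace_id": "t1", "operation_name": "main", "duration": 0.12})
agent.report({"trace_id": "t1", "operation_name": "get_repo_contributors", "duration": 0.08})
agent.report({"operation_name": "broken span"})  # buffered, then dropped by validation
agent.flush()

operations = [span["operation_name"] for span in query.get_trace("t1")]
print(operations)
```

The point of the agent layer shows up even in the toy version: the client only ever talks to something local and never needs to know where the collectors live.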
OpenTracing's secret ability
Context propagation
With OpenTracing instrumentation in place, we can support general purpose distributed context propagation where we
associate some metadata with the transaction and make that metadata available anywhere in the distributed call graph. In
OpenTracing this metadata is called baggage, to highlight the fact that it is carried over in-band with all RPC requests, just
like baggage. - opentracing-tutorial
The client may use the Baggage to pass additional data to the server and any other downstream server it might call.
# client side
span.set_baggage_item('auth-token', '.....')
# server side (one or more levels down from the client)
token = span.get_baggage_item('auth-token')
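To show what "carried over in-band" means, here is a rough sketch of baggage travelling through two hops as plain headers. The `uberctx-` prefix is the one Jaeger uses for baggage headers; the helper functions, the service functions, and the token value are hypothetical, and value encoding is left out to keep the example short.

```python
BAGGAGE_PREFIX = "uberctx-"  # Jaeger's header prefix for baggage items


def inject_baggage(baggage, headers):
    """Copy every baggage item into the outgoing headers."""
    for key, value in baggage.items():
        headers[BAGGAGE_PREFIX + key] = value
    return headers


def extract_baggage(headers):
    """Collect baggage items from incoming headers, ignoring everything else."""
    return {
        key[len(BAGGAGE_PREFIX):]: value
        for key, value in headers.items()
        if key.startswith(BAGGAGE_PREFIX)
    }


def service_b(headers):
    # Two levels down from the client, the item is still there.
    return extract_baggage(headers).get("auth-token")


def service_a(headers):
    # Service A does not use the token itself; it just passes baggage along.
    baggage = extract_baggage(headers)
    outgoing = inject_baggage(baggage, {"content-type": "application/json"})
    return service_b(outgoing)


client_headers = inject_baggage({"auth-token": "example-token"}, {})
token_seen_downstream = service_a(client_headers)
print(token_seen_downstream)
```

Because every hop re-sends every item, baggage is best kept small; each key adds bytes to every downstream request.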
Questions?
