© 2017 InfluxData. All rights reserved.
Distributed Tracing
Frequently Asked Questions
Gianluca Arbezzano
Site Reliability Engineer @InfluxData
● https://gianarb.it
● @gianarb
What I like:
● I make dirty hacks that look awesome
● I grow my vegetables 🍅🌻🍆
● Travel for fun and work
Why do I need distributed
tracing?
It is a way to describe the
complexity of a distributed system
In practice it is a different
aggregation for the well-known
logs and stats.
To tell the story of our
distributed system
What does a trace look like?
A span is the smallest unit in
a trace.
It describes a single action executed by a program:
● A single HTTP request.
● A database query.
● A message execution in a queue system.
● A lookup from a key/value store.
A span is described via:
● span_id: the unique identifier of the span within a trace
● trace_id: identifies the trace the span belongs to
● parent_id: describes the hierarchy between spans
● labels: a set of key/value pairs
● Span Context: a set of values that are propagated along the trace
● Logs
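As a rough illustration, the fields above could map to a Go struct like the one below. The names `Span`, `LogEntry`, and `NewChild` are assumptions made for this sketch; they are not the data model of any specific tracer.

```go
package main

import (
	"fmt"
	"time"
)

// LogEntry is a timestamped, structured event recorded on a span.
type LogEntry struct {
	Time   time.Time
	Fields map[string]string
}

// Span is a hypothetical in-memory representation of the fields listed above.
type Span struct {
	SpanID   string            // unique identifier within the trace
	TraceID  string            // identifies the trace this span belongs to
	ParentID string            // empty for the root span
	Name     string            // operation name, e.g. "user_exists"
	Labels   map[string]string // key/value pairs describing the span
	Baggage  map[string]string // span context values propagated along the trace
	Logs     []LogEntry
}

// NewChild creates a span in the same trace that points back at its parent
// and inherits a copy of the parent's baggage.
func (s *Span) NewChild(spanID, name string) *Span {
	baggage := make(map[string]string, len(s.Baggage))
	for k, v := range s.Baggage {
		baggage[k] = v
	}
	return &Span{
		SpanID:   spanID,
		TraceID:  s.TraceID,
		ParentID: s.SpanID,
		Name:     name,
		Labels:   map[string]string{},
		Baggage:  baggage,
	}
}

func main() {
	// All identifiers below are made up for the example.
	root := &Span{
		SpanID:  "a1",
		TraceID: "t1",
		Name:    "post: /users",
		Labels:  map[string]string{"service": "nginx"},
		Baggage: map[string]string{"request_id": "42"},
	}
	child := root.NewChild("b2", "handle.create_user")
	child.Logs = append(child.Logs, LogEntry{
		Time:   time.Now(),
		Fields: map[string]string{"event": "started"},
	})
	fmt.Println(child.TraceID, child.ParentID, child.Baggage["request_id"])
}
```

The child shares the parent's trace_id and stores the parent's span_id as its parent_id, which is all a backend needs to rebuild the hierarchy later.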
A single trace (span name, with the service that emits it):
● post: /users (nginx)
● handle.create_user (sA)
● user_exists (mysql)
● insert_user (mysql)
● send_email (worker)
Detail of the user_exists span:
Service Name: mysql
Trace ID: 34ytsy5hs45gs46hs5g
Span ID: se5hs5s5hs45gs45gs
Span Name: user_exists
Duration: 1.2s
Logs:
  query: "select * from tb_user where id = 345"
  user: sa_service
How do I follow a
request?
The implementation changes based on what you are instrumenting:
● For HTTP services, propagate the trace context via request headers
● The same applies to gRPC
● For queue systems, you can pass it as part of the message payload
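For queue systems, one way to carry the trace along is to wrap the business payload in an envelope. The `Envelope` type, its JSON field names, and the `Wrap`/`Unwrap` helpers are assumptions for this sketch; some brokers also offer message headers or attributes that could be used instead.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Envelope is a hypothetical message format: tracing identifiers travel
// next to the business payload.
type Envelope struct {
	TraceID      string          `json:"trace_id"`
	ParentSpanID string          `json:"parent_span_id"`
	Payload      json.RawMessage `json:"payload"`
}

// Wrap is called by the producer before publishing the message.
func Wrap(traceID, spanID string, payload interface{}) ([]byte, error) {
	raw, err := json.Marshal(payload)
	if err != nil {
		return nil, err
	}
	return json.Marshal(Envelope{TraceID: traceID, ParentSpanID: spanID, Payload: raw})
}

// Unwrap is called by the consumer, which can then start a child span that
// points at ParentSpanID within the same trace.
func Unwrap(msg []byte) (Envelope, error) {
	var env Envelope
	err := json.Unmarshal(msg, &env)
	return env, err
}

func main() {
	// The IDs and the payload are example values.
	msg, err := Wrap("80f198ee56343ba864fe8b2a57d3eff7", "e457b5a2e4d86bd1",
		map[string]string{"email": "user@example.com"})
	if err != nil {
		panic(err)
	}
	env, err := Unwrap(msg)
	if err != nil {
		panic(err)
	}
	fmt.Println(env.TraceID, env.ParentSpanID, string(env.Payload))
}
```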
B3-Propagation
https://github.com/openzipkin/b3-propagation
X-B3-TraceId: 80f198ee56343ba864fe8b2a57d3eff7
X-B3-ParentSpanId: 05e3ac9a4f6e3b90
X-B3-SpanId: e457b5a2e4d86bd1
X-B3-Sampled: 1
Do I need a standard
for tracing?
YES
1. Applications can be written in different languages, but in the end you need to
build a single trace, so they need to agree on a common
standard/protocol.
2. If you use a widely supported standard you can avoid vendor lock-in.
[Figure: a parent span with child spans, logs attached to the spans, and the span context / baggage propagated between them]
● Spans: basic unit of timing and causality. Can be tagged with key/value pairs.
● Logs: structured data recorded on a span.
● Span Context: serializable format for linking spans across network boundaries. Carries baggage, such as request and client IDs.
● Tracers: anything that plugs into the OpenTracing API to record information.
  ● Zipkin, Jaeger, LightStep, others
  ● Also metrics (Prometheus) and logging
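To make the "anything that plugs into the API" idea concrete, the sketch below separates the interface that application code is instrumented against from a pluggable implementation. This is not the OpenTracing API: `Tracer`, `Span`, `memoryTracer`, and `createUser` are hypothetical names used only to show the shape of the design.

```go
package main

import "fmt"

// Span and Tracer are the only types application code depends on.
type Span interface {
	SetTag(key, value string)
	Finish()
}

type Tracer interface {
	StartSpan(operation string) Span
}

// memoryTracer is a toy implementation that records the names of finished
// spans in memory. A real backend would ship them to a tracing system.
type memoryTracer struct {
	finished []string
}

type memorySpan struct {
	tracer    *memoryTracer
	operation string
	tags      map[string]string
}

func (t *memoryTracer) StartSpan(operation string) Span {
	return &memorySpan{tracer: t, operation: operation, tags: map[string]string{}}
}

func (s *memorySpan) SetTag(key, value string) { s.tags[key] = value }

func (s *memorySpan) Finish() {
	s.tracer.finished = append(s.tracer.finished, s.operation)
}

// createUser is instrumented against the interface, not a concrete tracer.
func createUser(t Tracer) {
	span := t.StartSpan("handle.create_user")
	defer span.Finish()
	span.SetTag("component", "sA")
}

func main() {
	tracer := &memoryTracer{}
	createUser(tracer)
	fmt.Println(tracer.finished)
}
```

Because `createUser` only sees the interface, swapping the tracing backend means changing the value passed in, not the instrumented code.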
1.5 years old! 🎂
Tracer implementations: Zipkin, Jaeger, LightStep, SkyWalking, others
All sorts of companies use OpenTracing:
Rapidly growing OSS and vendor adoption
JDBI, Java Webservlet, Jaxr
OpenTracing: Configure the GlobalTracer
https://github.com/opentracing/opentracing-go

import (
	"github.com/opentracing/opentracing-go"
	".../some_tracing_impl"
)

func main() {
	opentracing.SetGlobalTracer(
		// tracing impl specific:
		some_tracing_impl.New(...),
	)
	...
}
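OpenTracing: Create a Span from the Context
https://github.com/opentracing/opentracing-go

import (
	"context"
	"github.com/opentracing/opentracing-go"
	"github.com/opentracing/opentracing-go/log"
)

func xyz(ctx context.Context, ...) {
	...
	// Start a span as a child of the span stored in ctx, if there is one.
	span, ctx := opentracing.StartSpanFromContext(ctx, "operation_name")
	defer span.Finish()
	// Attach a structured log event to the span.
	span.LogFields(
		log.String("event", "soft error"),
		log.String("type", "cache timeout"),
		log.Int("waited.millis", 1500))
	...
}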
OpenTracing: Create a Child Span
https://github.com/opentracing/opentracing-go

import "github.com/opentracing/opentracing-go"

func xyz(parentSpan opentracing.Span, ...) {
	...
	// Link the new span to its parent through the parent's SpanContext.
	sp := opentracing.StartSpan(
		"operation_name",
		opentracing.ChildOf(parentSpan.Context()))
	defer sp.Finish()
	...
}
OpenCensus: instrumentation spec and libraries by Google
● Common interface to get stats and traces from your apps
● Different exporters to persist your data
What does a tracing infrastructure
look like?
[Figure: inside a microservice process, application logic, µ-service frameworks, Lambda functions, RPC & control-flow frameworks, and existing instrumentation use the OpenTracing API; main() connects it to a tracing infrastructure such as Instana or Jaeger]
Can I have a tracing
infrastructure on-prem?
There are different open source alternatives:
● Zipkin
  ● Java
  ● Sponsored by Twitter
  ● Supported backends: ElasticSearch, MySQL, Cassandra
● Jaeger
  ● Go
  ● Sponsored by Uber and part of the CNCF
  ● Supported backends: ElasticSearch, Cassandra
Are there tracing infrastructures
as a service?
● New Relic
● Honeycomb
● LightStep
● AWS X-Ray
● Google Stackdriver
Can I store traces anywhere?
Short answer: YES.
At your own risk…
● Really high cardinality
● High write throughput
Databases like InfluxDB, Cassandra, or MongoDB are probably a better option than
MySQL or Postgres, but it always depends on your traffic and the amount of data.
Reach out:
@gianarb
gianluca@influxdb.com
Any questions?

OSMC 2018 | Distributed Tracing FAQ by Gianluca Arbezzano
