© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
AWS re:INVENT
Tracing and Debugging for Containerized Services
Calvin French-Owen
Co-founder, Segment
It’s 2am, I’m getting paged…
…now what?
Segment by the numbers
- 140 billion monthly events
- 160k events/second peak
- 16,000 containers
- 350 ECS services
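
As a rough sanity check on these figures, the monthly volume can be converted into an average per-second rate and compared with the stated peak. This is only a back-of-the-envelope sketch; the 30-day month is an assumption, not something the slide states.

```python
# Figures taken from the slide; the month length is an assumption.
MONTHLY_EVENTS = 140e9
PEAK_EVENTS_PER_SEC = 160_000
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumed 30-day month


def average_events_per_sec(monthly_events: float = MONTHLY_EVENTS) -> float:
    """Average ingest rate implied by the monthly event count."""
    return monthly_events / SECONDS_PER_MONTH


def peak_to_average_ratio() -> float:
    """How far the stated peak sits above the implied average."""
    return PEAK_EVENTS_PER_SEC / average_events_per_sec()
```

Under that assumption the average works out to roughly 54k events/second, so the quoted peak is about three times the average load.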
Three steps to the (debugging) epiphany
1. build your mental model
2. dig in
3. lean into the cloud
1. Build your mental model
Step 1: Build your mental model
- How does the system fit together?
- How is my service configured?
- What should I even be looking at?
Step 1: Build your mental model
- How does the system fit together? specs
- How is my service configured?
- What should I even be looking at?
Running Specs
$ docker run segment/specs
https://github.com/segmentio/specs
Step 1: Build your mental model
- How does the system fit together? specs
- How is my service configured? specs & terraform
- What should I even be looking at?
A terraform ECS service
Terraform service - under the hood
Step 1: Build your mental model
- How does the system fit together? specs
- How is my service configured? specs & terraform
- What should I even be looking at? cloudwatch & datadog
2. Dig in
Step 2: Dig in
• Stats?
• Logging?
• Tracing?
Step 2: Dig in
• Stats? cloudwatch & statsd/veneur
• Logging?
• Tracing?
[Diagram: stats pipeline. Components shown: service (x3), veneur, datadog-agent, api.datadog.com, cloudwatch]
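
To make the service side of this pipeline concrete, here is a minimal sketch of how a service might emit a metric in the DogStatsD wire format, which statsd-compatible collectors such as veneur and the datadog-agent accept over UDP. The metric names, tags, and the local address 127.0.0.1:8125 are illustrative assumptions, not details from the talk.

```python
import socket
from typing import Mapping, Optional


def format_metric(name: str, value: float, metric_type: str = "c",
                  tags: Optional[Mapping[str, str]] = None) -> str:
    """Render one DogStatsD datagram: name:value|type|#key:val,key:val."""
    line = f"{name}:{value}|{metric_type}"
    if tags:
        # Sort tags so the output is deterministic.
        tag_str = ",".join(f"{k}:{v}" for k, v in sorted(tags.items()))
        line += f"|#{tag_str}"
    return line


def send_metric(line: str, host: str = "127.0.0.1", port: int = 8125) -> None:
    """Fire-and-forget UDP send; host and port are assumed defaults."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("utf-8"), (host, port))


# Hypothetical usage:
#   send_metric(format_metric("api.requests", 1, "c", {"service": "api"}))
```

Because the send is UDP, a slow or missing collector does not block the service, which is part of why this style of stats emission is cheap to adopt.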
Step 2: Dig in
• Stats? cloudwatch & statsd/veneur
• Logging? cloudwatch, ecs-logs, cwlogs
• Tracing?
journald.conf
…
RateLimitIntervalSec=1m
RateLimitBurst=200000
…
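
The effect of these two settings can be sketched with a small simulation: within each interval, at most the burst number of messages is kept and the rest are dropped until the interval ends. This is a simplified model for intuition only. It ignores that journald applies the limit per service and scales the burst with free disk space, and the function name is hypothetical.

```python
from typing import Iterable, List


def apply_rate_limit(timestamps: Iterable[float], interval_sec: float,
                     burst: int) -> List[bool]:
    """Return True for each message kept and False for each one dropped.

    Simplified fixed-window model: a window opens at the first message,
    up to `burst` messages are kept inside it, and a new window opens
    once `interval_sec` has elapsed since the window started.
    """
    kept: List[bool] = []
    window_start = None
    count = 0
    for ts in timestamps:
        if window_start is None or ts - window_start >= interval_sec:
            window_start = ts  # start a fresh window
            count = 0
        count += 1
        kept.append(count <= burst)
    return kept
```

With the values on the slide (a 1 minute interval and a burst of 200000), a service would have to log more than 200000 lines within a minute before anything is dropped.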
Logging
• github.com/segmentio/ecs-logs
• github.com/segmentio/ecs-logs-go
• github.com/segmentio/ecs-logs-js
• github.com/segmentio/rate-limiting-log-proxy
• github.com/segmentio/cwlogs
Step 2: Dig in
• Stats? cloudwatch & Statsd/Veneur
• Logging? cloudwatch, ecs-logs, cwlogs
• Tracing? bcc, pprof-server
BCC
• BPF Compiler Collection
• No kernel modules, no instrumentation
• Lots of very useful tools
• https://github.com/iovisor/bcc
pprof-server
• pprof is exposed by the Go standard library (net/http/pprof)
• gives you profiling, heatmaps, memory dumps and more
• nice visualizations, one URL click away
• server exposed by consul
• github.com/segmentio/pprof-server
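
For a sense of what "one URL click away" refers to, the sketch below builds the standard endpoint URLs that Go's net/http/pprof package serves under /debug/pprof. The host, port, and helper name are hypothetical; only the path layout and the seconds parameter come from the Go standard library.

```python
from urllib.parse import urlencode

PPROF_PREFIX = "/debug/pprof"  # default mount point of Go's net/http/pprof


def pprof_url(host: str, port: int, profile: str, seconds: int = 0) -> str:
    """Build the URL for one pprof profile, e.g. 'profile', 'heap', 'goroutine'."""
    url = f"http://{host}:{port}{PPROF_PREFIX}/{profile}"
    if seconds > 0:
        # 'seconds' sets the sampling duration, e.g. for the CPU profile.
        url += "?" + urlencode({"seconds": seconds})
    return url


# Hypothetical usage against a service listening on 10.0.0.5:6060:
#   pprof_url("10.0.0.5", 6060, "profile", seconds=30)
```

A tool like pprof-server saves you from assembling these addresses by hand for every container.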
pprof-server
$ docker run -it --rm -p 6061:6061 \
    segment/pprof-server \
    -registry consul://172.17.0.1:8500
github.com/segmentio/pprof-server
3. Lean into the cloud
Step 3: Lean into the cloud
- cattle, not pets
- auto-scale and pre-scale
Cattle, not pets
- reproducible machine images
- built with packer
- run via systemd
- out-of-the-box autoscaling
- created with terraform
- github.com/segmentio/roll-instances
Autoscaling Everywhere
- comes default with any service
- tuned for systems like dynamo
- no autoscaling == not ready for prod
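
One way to picture a default scaling policy is a proportional rule: scale the task count by the ratio of the observed metric to its target, then clamp to the configured bounds. The sketch below is a simplified illustration of that idea, not the exact algorithm AWS or Segment uses, and the function and parameter names are hypothetical.

```python
import math


def desired_capacity(current: int, metric: float, target: float,
                     min_size: int, max_size: int) -> int:
    """Proportional scaling: current * metric / target, clamped to bounds."""
    if target <= 0:
        raise ValueError("target must be positive")
    raw = math.ceil(current * metric / target)  # round up to stay under target
    return max(min_size, min(max_size, raw))
```

For example, 10 tasks running at 90% utilization against a 60% target would scale out to 15, while the same service at 30% would scale in to 5.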
Lessons Learned
• Build tools to surface actionable information first
• Auto-scaling a huge win
• Give developers alerting + scaling policies out of the box
• Passive tools are easy to build adoption around
THANK YOU!

re:Invent CON320 Tracing and Debugging for Containerized Services
