Why Loggly Loves Apache Kafka, and How We Use Its Unbreakable Messaging for Better Log Management 
Infrastructure Engineering Team 
June 2014 
| Log management as a service Simplify Log Management
What Loggly Does 
World’s most popular cloud-based 
log management service 
§ More than 5,000 customers 
§ Near real-time indexing of events 
Distributed architecture, built on AWS 
Initial production services in 2011 
§ Loggly Generation 2 released in Sept 2013 
Loggly: Addressing the first big data 
problem every company faces 
§ Centralized logging 
and archival 
§ Real-time processing, 
analysis and 
visualization 
§ Monitoring, alerting 
and troubleshooting 
Agenda for this Presentation 
§ The challenges of Log 
Management at scale 
§ Overview of Loggly’s 
processing pipeline 
§ Alternative technologies 
considered 
§ Why we love Apache Kafka 
§ How Kafka has added 
flexibility to our pipeline 
The Challenges of Log Management at Scale 
§ Big data 
– >750 billion events logged to 
date 
– Sustained bursts of 100,000+ 
events per second 
– Data space measured in 
petabytes 
§ Need for high fault tolerance 
§ Near real-time indexing 
requirements 
§ Time-series index 
management 
Log Management Processing Pipeline: 
Overview 
[Pipeline diagram; labels: Load Balancing, Kafka, Stage 2, Loggly Custom Module]
Collectors Can Easily Outpace 
Downstream Processes 
[Pipeline diagram; labels: Load Balancing, Kafka, Stage 2, Loggly Custom Module]
§ Written in C++ 
§ Designed to ingest 
massive data volumes 
§ Need to collect 
regardless of what’s 
happening 
downstream 
Solution: 
Queue That’s External to Collector 
[Pipeline diagram; labels: Load Balancing, Kafka, Stage 2, Loggly Custom Module]
§ Based on Apache 
Kafka 
§ Highly performant 
and reliable 
Alternate/Supplementary Approaches Considered 
§ Internal buffering in collectors 
– Added complexity 
§ Cassandra 
– Not as good a queue as Kafka 
§ Apache Storm 
– In initial Gen2 architecture, removed after launch 
The Secret to Log Management at Scale: 
Keep It Simple, Stupid 
Results: 
§ Can process sustained rates of 
100,000+ events per second per cluster 
§ Average message 300 bytes 
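As a rough back-of-the-envelope check on the two figures above, the sketch below multiplies the stated event rate by the stated average message size. It assumes decimal units (1 MB = 10^6 bytes) and ignores replication, protocol overhead and compression, none of which the slides quantify.

```python
# Back-of-the-envelope throughput from the two figures on this slide.
EVENTS_PER_SECOND = 100_000   # sustained rate per cluster (from the slide)
AVG_MESSAGE_BYTES = 300       # average message size (from the slide)
SECONDS_PER_DAY = 24 * 60 * 60


def bytes_per_second(events_per_second: int, avg_bytes: int) -> int:
    """Raw payload bytes entering the queue each second."""
    return events_per_second * avg_bytes


def terabytes_per_day(events_per_second: int, avg_bytes: int) -> float:
    """Raw payload per day in decimal terabytes (an assumed unit choice)."""
    return bytes_per_second(events_per_second, avg_bytes) * SECONDS_PER_DAY / 1e12


rate = bytes_per_second(EVENTS_PER_SECOND, AVG_MESSAGE_BYTES)
print(f"{rate / 1e6:.0f} MB/s")  # 30 MB/s
print(f"{terabytes_per_day(EVENTS_PER_SECOND, AVG_MESSAGE_BYTES):.2f} TB/day")  # 2.59 TB/day
```

At the sustained rate this works out to roughly 30 MB/s of raw payload per cluster, or a few terabytes per day.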
Why We Love 
Kafka 
What Attracted Us in the First Place 
No single point of failure 
• Terabytes of data move through our Kafka cluster every day without losing a single event 
• We use age-based retention to purge old data on disks 
Low latency 
• 99.99999% of the time our data is coming from disk cache and RAM; only very rarely do we hit disk 
Performance 
• Crazy good! 
• We currently have a bunch of Kafka brokers running on m2.xlarge instances backed by provisioned IOPS 
• One consumer group (eight threads), which maps a log to a customer, can process about 200,000 events per second draining from 192 partitions spread across three brokers 
Scalability 
• Ability to increase partition count per topic and downstream consumer threads provides flexibility to increase throughput when desired 
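To make the performance and scalability figures concrete, here is a minimal sketch of how 192 partitions might be divided among the eight consumer threads. The round-robin assignment and the even spread across brokers are assumptions for illustration, not a description of Loggly's or Kafka's actual assignment strategy. The point it illustrates is that each partition is read by only one thread in a consumer group, so the partition count caps how many threads can usefully consume in parallel.

```python
# Figures from the slide.
PARTITIONS = 192
BROKERS = 3
CONSUMER_THREADS = 8
GROUP_EVENTS_PER_SECOND = 200_000


def assign_partitions(num_partitions: int, num_threads: int) -> dict:
    """Round-robin partitions over threads (an assumed, illustrative strategy)."""
    assignment = {thread: [] for thread in range(num_threads)}
    for partition in range(num_partitions):
        assignment[partition % num_threads].append(partition)
    return assignment


assignment = assign_partitions(PARTITIONS, CONSUMER_THREADS)
per_thread = len(assignment[0])
print(per_thread)                                    # 24 partitions per thread
print(PARTITIONS // BROKERS)                         # 64 partitions per broker if spread evenly
print(GROUP_EVENTS_PER_SECOND // CONSUMER_THREADS)   # 25000 events/s per thread

# With more threads than partitions, the extra threads get nothing to read.
idle = [t for t, parts in assign_partitions(4, 6).items() if not parts]
print(idle)                                          # [4, 5]
```

Raising the partition count first and the thread count second is what lets throughput grow, which matches the scalability bullet above.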
How Our Kafka Crush Has Deepened 
Distributed log collection 
• Local pods and collectors spread all over the Internet with local Kafka deployments to collect data from customers located all over the world 
• Can collect logs even when we lose connectivity 
• When network comes back, Kafka sends the logs downstream to the rest of the pipeline 
More efficient, effective DevOps 
• Deploying Kafka throughout pipeline makes it easy to disable certain parts of system (for troubleshooting or upgrades) 
• No worrying that we will lose customer data 
• Example: Add support for new log type into our automatic parsing capabilities by turning off existing parser, deploying new one, and processing logs that Kafka has queued up 
Controlling resource utilization 
• Keep collectors as simple as possible for resilience and reliability reasons 
• Add intelligence into our pipelines using Kafka 
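The parser-upgrade example above can be sketched with a toy in-memory log. This is not Kafka's API: `ToyLog`, `ParserConsumer` and both parse functions are hypothetical names invented for illustration. It only shows the idea that a durable queue plus a remembered read offset lets a stage be switched off and replaced without dropping events.

```python
class ToyLog:
    """In-memory stand-in for one partition of a durable queue."""

    def __init__(self):
        self._records = []

    def append(self, record):
        self._records.append(record)

    def read_from(self, offset):
        return self._records[offset:]


class ParserConsumer:
    """Reads from the log starting at a given offset and tracks its position."""

    def __init__(self, log, parse, offset=0):
        self.log = log
        self.parse = parse
        self.offset = offset

    def drain(self):
        batch = self.log.read_from(self.offset)
        parsed = [self.parse(record) for record in batch]
        self.offset += len(batch)
        return parsed


def old_parse(line):
    return {"raw": line}


def new_parse(line):
    # Hypothetical new log type: whitespace-separated key=value pairs.
    fields = dict(pair.split("=", 1) for pair in line.split() if "=" in pair)
    return {"raw": line, "fields": fields}


log = ToyLog()
log.append("status=200 path=/a")
log.append("status=500 path=/b")

old = ParserConsumer(log, old_parse)
old.drain()
committed = old.offset            # position reached before the parser is turned off

# Parser is offline; collectors keep appending to the queue.
log.append("status=404 path=/c")
log.append("plain text line")

# New parser is deployed and resumes from the remembered offset.
new = ParserConsumer(log, new_parse, offset=committed)
backlog = new.drain()
print(len(backlog))               # 2
print(backlog[0]["fields"])       # {'status': '404', 'path': '/c'}
```

Because the collectors only ever append, they never need to know that the downstream parser was offline.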
Resource Utilization Example: 
“Noisy Neighbors” 
“Noisy Neighbors” are 
Inherent to SaaS 
§ Sending many times their “normal” level of 
logging volume, inadvertently or because their 
application is in big trouble 
§ Routing logs to separate queue minimizes 
impact on other customers 
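A minimal sketch of the routing idea above, under stated assumptions: the topic names, the `NOISY_MULTIPLE` threshold and the per-window baseline table are all hypothetical, and the slides do not say how Loggly detects a noisy customer. The sketch only shows events from a customer far above its normal volume being diverted to a separate queue.

```python
from collections import Counter

SHARED_TOPIC = "events.shared"   # hypothetical topic name
NOISY_MULTIPLE = 10              # hypothetical: "many times normal" means more than 10x baseline


def route(events, baseline_per_window, noisy_multiple=NOISY_MULTIPLE):
    """Group one time window of (customer_id, message) events by target topic.

    A customer whose event count in the window exceeds noisy_multiple times
    its baseline is routed to its own topic; everyone else shares one topic.
    """
    counts = Counter(customer_id for customer_id, _ in events)
    routed = {SHARED_TOPIC: []}
    for customer_id, message in events:
        baseline = baseline_per_window.get(customer_id, 1)
        if counts[customer_id] > noisy_multiple * baseline:
            topic = f"events.noisy.{customer_id}"
        else:
            topic = SHARED_TOPIC
        routed.setdefault(topic, []).append((customer_id, message))
    return routed


baselines = {"acme": 2, "bob": 5}
window = [("acme", f"error {i}") for i in range(50)] + [("bob", f"info {i}") for i in range(3)]
routed = route(window, baselines)
print(sorted(routed))                     # ['events.noisy.acme', 'events.shared']
print(len(routed["events.noisy.acme"]))   # 50
print(len(routed[SHARED_TOPIC]))          # 3
```

Consumers of the shared topic then see only the well-behaved traffic, while the noisy customer's backlog can be drained at whatever pace resources allow.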
Kafka Queues Add Flexibility to Loggly 
Pipeline 
§ Because Kafka topics are very cheap from a 
performance and overhead standpoint, we 
can create as many queues as we want 
§ Scaled to the performance we want 
§ Optimizing resource utilization across the system 
§ Because they can be created dynamically, we 
can make business rules very flexible 
§ Makes us confident that pipeline will scale as 
customer data volumes do 
Conclusion: 
Kafka Frees Our Development Team 
to Build Differentiating Features 
§ Kafka deployment working without us thinking 
about it 
§ Plenty of other things to do to keep our 
position as the world’s most popular cloud-based 
log management service! 
Does Log Management 
Sound Hard? It Should! 
Let us do the heavy lifting for you! 
Try Loggly FREE for 30 days 
About Us: 
Loggly is the world’s most popular cloud-based log management solution, used by 
more than 5,000 happy customers to effortlessly spot problems in real-time, easily 
pinpoint root causes and resolve issues faster to ensure application success. 
Visit us at loggly.com or follow @loggly on Twitter. 
Did you like this presentation? 
Head over to our blog for 
more great content! 
Take me to the Loggly Blog 

