Apache Storm 
A Brief Introduction
What Is Storm 
● Distributed 
● Stream Oriented 
● Real Time* 
● Scalable 
● Reliable**
Topologies 
● Storm’s equivalent of an application 
○ Consists of Components (Spouts and Bolts) 
■ Each Component runs with a configurable degree of parallelism 
■ Data represented as streams of tuples
Spouts 
● Spouts 
○ Expose an external input source 
■ An unbounded stream of data 
○ Are polled by Storm for their next Tuple 
○ Produce one or more streams of Tuples 
○ Notified when a Tuple is completely processed 
○ Notified when a Tuple fails to be processed
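The spout contract above (polled for the next Tuple, then notified of success or failure) can be sketched in plain Java. This is only an illustration, not Storm's own classes: `ReplayingSpoutSketch`, its in-memory queue, and the replay-on-failure policy are assumptions made for the example. In Storm itself the corresponding callbacks on a spout are `nextTuple()`, `ack(Object msgId)`, and `fail(Object msgId)`.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Plain-Java sketch of the spout contract; a hypothetical class, not part of Storm.
class ReplayingSpoutSketch {
    private final Queue<String> source = new ArrayDeque<>();   // stands in for the external input
    private final Map<Long, String> pending = new HashMap<>(); // emitted but not yet acked
    private long nextId = 0;

    ReplayingSpoutSketch(String... messages) {
        for (String m : messages) {
            source.add(m);
        }
    }

    /** Polled by the framework: emits the next message and returns its id, or -1 if none is ready. */
    long nextTuple() {
        String msg = source.poll();
        if (msg == null) {
            return -1;
        }
        long id = nextId++;
        pending.put(id, msg);
        return id;
    }

    /** Notification that the tuple was completely processed: forget it. */
    void ack(long id) {
        pending.remove(id);
    }

    /** Notification that processing failed: queue the message for another attempt (assumed policy). */
    void fail(long id) {
        String msg = pending.remove(id);
        if (msg != null) {
            source.add(msg);
        }
    }

    int pendingCount() {
        return pending.size();
    }

    String pendingMessage(long id) {
        return pending.get(id);
    }
}
```

Whether a failed message is replayed, dropped, or logged is up to the spout author; replaying is just one common choice.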
Bolts 
● Bolts 
○ Subscribe to one or more Streams 
■ Grouped by all, randomly, or by Field values 
○ Produce zero or more Streams 
○ Run single threaded within each instance
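The three grouping styles listed above decide which instance of a subscribing bolt receives each Tuple. A rough plain-Java sketch follows; `GroupingSketch` and its method bodies are assumptions made for illustration, and Storm's real implementations differ in detail (for example in how they balance load).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical helper that picks target instances; not part of Storm.
class GroupingSketch {
    /** "All": every instance of the subscribing bolt receives a copy. */
    static List<Integer> allGrouping(int numInstances) {
        List<Integer> targets = new ArrayList<>();
        for (int i = 0; i < numInstances; i++) {
            targets.add(i);
        }
        return targets;
    }

    /** "Randomly": a single instance chosen at random. */
    static int shuffleGrouping(int numInstances, Random random) {
        return random.nextInt(numInstances);
    }

    /** "By Field values": equal field values always map to the same instance. */
    static int fieldsGrouping(Object fieldValue, int numInstances) {
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(fieldValue.hashCode(), numInstances);
    }
}
```

The property that matters for stateful bolts is the last one: because the target depends only on the field value, all Tuples carrying the same value meet at the same instance.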
An Example 
Classic word count: 
Let's assume we have an unbounded incoming stream of 
sentences and we want to have a constantly updated count 
of the words found in the text. 
Code: 
https://github.com/apache/storm/blob/master/examples/storm-starter/src/jvm/storm/starter/WordCountTopology.java
Building a topology 
We provide a description of the topology to be 
deployed via the fluent TopologyBuilder class. 
TopologyBuilder builder = new TopologyBuilder();
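To make "a description of the topology" concrete, here is a small plain-Java model of what a fluent builder accumulates: component ids, parallelism, and subscriptions. `TopologySketch` and everything in it are hypothetical stand-ins, not Storm's `TopologyBuilder`; only the component ids and parallelism values in the usage note come from the word count example in these slides.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a topology description; not Storm's TopologyBuilder.
class TopologySketch {
    static class Component {
        final int parallelism;
        String source;   // id of the upstream component, null for a spout
        String grouping; // how the upstream stream is partitioned

        Component(int parallelism) {
            this.parallelism = parallelism;
        }

        // Fluent setters return this so that calls can be chained.
        Component shuffleGrouping(String sourceId) {
            this.source = sourceId;
            this.grouping = "shuffle";
            return this;
        }

        Component fieldsGrouping(String sourceId, String field) {
            this.source = sourceId;
            this.grouping = "fields:" + field;
            return this;
        }
    }

    final Map<String, Component> components = new LinkedHashMap<>();

    /** Registers a component under an id and returns it for further chaining. */
    Component add(String id, int parallelism) {
        Component c = new Component(parallelism);
        components.put(id, c);
        return c;
    }

    /** Total number of instances the description asks for. */
    int totalParallelism() {
        int total = 0;
        for (Component c : components.values()) {
            total += c.parallelism;
        }
        return total;
    }
}
```

With this model, `t.add("spout", 5); t.add("split", 8).shuffleGrouping("spout"); t.add("count", 12).fieldsGrouping("split", "word");` describes the word count topology as pure data; nothing runs until the description is handed to a cluster.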
Spout 
First we’ll create a spout to generate random 
sentences and add it to the topology 
public class RandomSentenceSpout extends BaseRichSpout {
    …
    // Storm calls nextTuple() repeatedly; each call emits one randomly chosen sentence.
    @Override
    public void nextTuple() {
        String[] sentences = new String[]{
            "the cow jumped over the moon",
            "an apple a day keeps the doctor away",
            "four score and seven years ago",
            "snow white and the seven dwarfs",
            "i am at two with nature" };
        String sentence = sentences[_rand.nextInt(sentences.length)];
        _collector.emit(new Values(sentence));
    }
    …
}
// Register the spout under the id "spout" with a parallelism hint of 5.
builder.setSpout("spout", new RandomSentenceSpout(), 5);
Split Messages 
Next we’ll split each sentence into words. 
The storm-starter example implements this bolt in Python; here it is in Java: 
public static class SplitSentence extends BaseBasicBolt {
    ...
    // Emit one tuple per whitespace-separated word in the incoming sentence.
    @Override
    public void execute(Tuple tuple, BasicOutputCollector collector) {
        String sentence = tuple.getString(0);
        for (String word : sentence.split("\\s+")) {
            collector.emit(new Values(word));
        }
    }
    ...
}
// 8 instances; sentences from "spout" are distributed randomly among them.
builder.setBolt("split", new SplitSentence(), 8).shuffleGrouping("spout");
Count Words 
Now we can count the words 
public static class WordCount extends BaseBasicBolt {
    // In-memory running totals, private to this bolt instance.
    Map<String, Integer> counts = new HashMap<String, Integer>();

    @Override
    public void execute(Tuple tuple, BasicOutputCollector collector) {
        String word = tuple.getString(0);
        Integer count = counts.get(word);
        if (count == null) {
            count = 0;
        }
        count++;
        counts.put(word, count);
        // Emit the updated count downstream.
        collector.emit(new Values(word, count));
    }
    ...
}
// 12 instances; grouping on the "word" field sends a given word to the same instance every time.
builder.setBolt("count", new WordCount(), 12).fieldsGrouping("split", new Fields("word"));
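Why the counting bolt can keep its totals in a plain in-memory map becomes clearer with a small simulation. The sketch below runs the split and count steps in ordinary Java with no Storm involved; `WordCountPipelineSketch`, its hash-based routing, and the final merge step are assumptions made for illustration. Each counter holds a private map, as each `WordCount` instance does, and words are routed by value so that no word is ever counted in two places.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java simulation of split -> fields-grouped count; hypothetical, not Storm code.
class WordCountPipelineSketch {
    /** One counter with its own private in-memory map, like one WordCount instance. */
    static class Counter {
        final Map<String, Integer> counts = new HashMap<>();

        int count(String word) {
            // merge() inserts 1 for a new word or adds 1 to the existing total.
            return counts.merge(word, 1, Integer::sum);
        }
    }

    /** Runs the sentences through the pipeline and returns the combined totals. */
    static Map<String, Integer> run(List<String> sentences, int numCounters) {
        List<Counter> counters = new ArrayList<>();
        for (int i = 0; i < numCounters; i++) {
            counters.add(new Counter());
        }

        for (String sentence : sentences) {
            for (String word : sentence.trim().split("\\s+")) {
                if (word.isEmpty()) {
                    continue; // an empty sentence splits into a single empty string
                }
                // Route by word value so the same word always reaches the same counter.
                int target = Math.floorMod(word.hashCode(), numCounters);
                counters.get(target).count(word);
            }
        }

        // Each word lives in exactly one counter, so the per-counter maps never overlap.
        Map<String, Integer> totals = new HashMap<>();
        for (Counter c : counters) {
            totals.putAll(c.counts);
        }
        return totals;
    }
}
```

If words were routed randomly instead, each counter would hold only a partial total for a word and the per-instance maps would have to be summed rather than simply combined.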
Discussion 
● Component Lifecycle Events 
● In Memory State 
● Deployment view
Ecosystem 
● Trident 
● SummingBird
Alternatives 
● Spark Streaming
Questions? 
Comments, concerns, or criticisms?

