Flink for Everyone: Self-Service Data Analytics with StreamPipes
Patrick Wiener, Philipp Zehnder
Flink Forward Europe 2019, Berlin, 2019-10-08
www.streampipes.org | @streampipes | github.com/streampipes
What's StreamPipes?
"A self-service IoT toolbox to enable non-technical users to connect, analyze and explore IoT data streams"
• Big Data / Edge Infrastructure
• Execute
• Reusable algorithm toolbox
• Install
• Model pipelines
About us
• Dominik Riemer, Senior Research Scientist
• Philipp Zehnder, Research Scientist
• Patrick Wiener, Research Scientist
FZI Research Center for Information Technology, Karlsruhe, Germany
Stream Processing, Data Management, Machine Learning
Non-profit research center for applied ICT research (250 employees)
Started StreamPipes in 2014, first OSS release 2018
Agenda
1. The need for self-service IoT data analytics
2. StreamPipes: Technical Overview and Demo
3. Lessons Learned w/ Flink & Getting Started
1. The need for self-service IoT data analytics
Industrial Internet of Things
Data streams everywhere
• Conveyor belts
• Pressure
• Oil temperature
• Dust particles
• Production plans
• Environmental data
• Gear box drive
• Energy consumption
• Telematics
Industrial Internet of Things
Typical application scenarios
• Continuous Monitoring: Flexible data integration from heterogeneous sources and monitoring of current system states
• Situational Awareness: Detect time-critical situations, e.g., by means of rules or ML approaches
• Continuous Data Harmonization: Continuous pre-processing and transformation of input streams for third-party systems
StreamPipes
Open Source framework to easily manage IoT data
• Data Access: Generic adapters, Specific adapters, Metadata, Data streams & sets
• Data analytics & harmonization: Pre-processing, Filter/Aggregation, Pattern Detection, ML
• Data exploration & exploitation: Situation detection, Harmonized data sets, Visualizations, Third-party systems
2. Technical Overview
High-level architecture
Components: Data Sources, Adapter Library, Data Integration, Pipeline Editor, Analytics Microservices, Streaming Engine
Data Access
StreamPipes Connect: Easily connect IoT sources
Data Access
Machine-interpretable metadata
Data stream (example event):
{
  "tstamp": 1453478160,
  "machineId": "ID5",
  "temperature": 73.5,
  "flowRate": 4.2
}
Semantic metadata
• Schema: Data type, runtime name, semantic type
• Quality: Frequency, latency, measurement unit
• Grounding: Format, Protocol
Data Access
Machine-interpretable metadata: Example
Semantic metadata for the "temperature" field of the example event:
• Semantic type: schema.org/temperature
• Measurement unit: schema.org/degreeCelsius
• Data type: xsd:float
• Value range: [0,80]
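The temperature example can be turned into a small check that an event matches its description. The following is a minimal sketch, not the StreamPipes metadata model: the class `PropertyDescription`, its field names, and the function `conforms` are hypothetical names chosen for illustration, and the range is assumed to be inclusive.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class PropertyDescription:
    """Hypothetical description of one event field (not the StreamPipes model)."""
    runtime_name: str    # key used in the event, e.g. "temperature"
    semantic_type: str   # e.g. "schema.org/temperature"
    unit: str            # e.g. "schema.org/degreeCelsius"
    data_type: str       # e.g. "xsd:float"
    value_range: Optional[Tuple[float, float]] = None  # assumed inclusive


def conforms(event: dict, prop: PropertyDescription) -> bool:
    """Return True if the event carries the field with a matching type and range."""
    if prop.runtime_name not in event:
        return False
    value = event[prop.runtime_name]
    # bool is a subclass of int in Python, so exclude it explicitly.
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    if prop.data_type == "xsd:float" and not is_number:
        return False
    if prop.value_range is not None:
        if not is_number:
            return False
        low, high = prop.value_range
        return low <= value <= high
    return True


# Values taken from the slides.
TEMPERATURE = PropertyDescription(
    runtime_name="temperature",
    semantic_type="schema.org/temperature",
    unit="schema.org/degreeCelsius",
    data_type="xsd:float",
    value_range=(0.0, 80.0),
)
EVENT = {"tstamp": 1453478160, "machineId": "ID5", "temperature": 73.5, "flowRate": 4.2}

print(conforms(EVENT, TEMPERATURE))
```

A description like this lets a tool reject an out-of-range or wrongly typed reading before it reaches a pipeline.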
Data Access
StreamPipes Connect: Architecture
• Connect Master
• Connect Worker 1, Connect Worker 2, …, Connect Worker n (Edge Worker, Cloud Worker): register capabilities with the Connect Master
• Sources and protocols: MySQL, REST, ROS, OPC-UA, PLC, MQTT
• Messaging
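The "register capabilities" step between workers and the master might look roughly like the sketch below. It is a toy in-memory registry, not the StreamPipes Connect implementation: the class `ConnectMaster`, its methods, the worker ids, and the assignment of protocols to the edge and cloud worker are all assumptions made for illustration.

```python
from typing import Dict, Optional, Set


class ConnectMaster:
    """Toy registry that remembers which worker announced which protocols."""

    def __init__(self) -> None:
        self._capabilities: Dict[str, Set[str]] = {}

    def register(self, worker_id: str, protocols: Set[str]) -> None:
        """Called by a worker on startup to announce what it can connect to."""
        self._capabilities[worker_id] = set(protocols)

    def find_worker(self, protocol: str) -> Optional[str]:
        """Return the first worker (in id order) that supports the protocol."""
        for worker_id in sorted(self._capabilities):
            if protocol in self._capabilities[worker_id]:
                return worker_id
        return None


master = ConnectMaster()
# Which protocols live on which worker is an assumption for this example.
master.register("edge-worker", {"MQTT", "OPC-UA", "PLC", "ROS"})
master.register("cloud-worker", {"MySQL", "REST"})

print(master.find_worker("MQTT"))
```

Keeping the registry on the master means a new adapter request only needs to name a protocol; the master decides which worker runs it.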
Demo
Introduction to StreamPipes
Connecting and visualizing flow rate measurements of a multi tank system
Elements: Flow Sensor, MQTT, StreamPipes Connect, Aggregate data, Visualize
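To make the "Aggregate data" step concrete, here is a minimal sketch of averaging flow rates per fixed time window. It works on a finished list of events rather than a live stream, and the function name `tumbling_average`, the output field names, and the 10-second window are assumptions; the input field names `tstamp` and `flowRate` come from the example event in the slides.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List


def tumbling_average(events: Iterable[dict], window_seconds: int) -> List[dict]:
    """Group events into fixed, non-overlapping windows and average flowRate."""
    buckets: Dict[int, List[float]] = defaultdict(list)
    for event in events:
        # Align the timestamp to the start of its window.
        window_start = event["tstamp"] - event["tstamp"] % window_seconds
        buckets[window_start].append(event["flowRate"])
    return [
        {"windowStart": start, "avgFlowRate": mean(values)}
        for start, values in sorted(buckets.items())
    ]


readings = [
    {"tstamp": 1453478160, "flowRate": 4.0},
    {"tstamp": 1453478163, "flowRate": 5.0},
    {"tstamp": 1453478171, "flowRate": 6.0},
]
for row in tumbling_average(readings, window_seconds=10):
    print(row)
```

A streaming engine would emit each window as soon as it closes instead of collecting everything first; the grouping logic stays the same.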
Analytics microservices
Extensible toolbox
Features
• Extensible toolbox for pre-processing & analytics
• Semantics-based consistency checking
• Exchangeable run-time wrappers
• Stateful/stateless
• Inclusion of ML models possible
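Semantics-based consistency checking can be pictured as comparing what a stream offers with what a processing element requires. The sketch below assumes a stream schema is a plain mapping from runtime name to semantic type; the function `missing_requirements` and the identifiers `example.org/flowRate` and `example.org/pressure` are hypothetical and do not come from StreamPipes.

```python
from typing import Dict, List


def missing_requirements(stream_schema: Dict[str, str],
                         required_types: List[str]) -> List[str]:
    """Return the required semantic types that the stream does not provide.

    stream_schema maps each runtime name to its semantic type.
    An empty result means the stream and the element are compatible.
    """
    offered = set(stream_schema.values())
    return [t for t in required_types if t not in offered]


stream = {
    "temperature": "schema.org/temperature",
    "flowRate": "example.org/flowRate",  # hypothetical semantic type
}

print(missing_requirements(stream, ["schema.org/temperature"]))
print(missing_requirements(stream, ["example.org/pressure"]))
```

A pipeline editor could run such a check whenever two elements are connected and refuse the connection if the list is not empty.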
Analytics microservices
Anatomy of a processing element
A processing element (example: Aggregation) consists of a Controller and a Runtime. The Runtime receives input events and produces output events.
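The split into Controller and Runtime can be sketched as two small classes. This is an illustration only and not the StreamPipes SDK: the class names, the methods `describe`, `start`, and `on_event`, and the count-based window are assumptions.

```python
from typing import Dict, List, Optional


class AggregationRuntime:
    """Consumes input events and emits one output event per full count window."""

    def __init__(self, field: str, window_size: int) -> None:
        self.field = field
        self.window_size = window_size
        self._buffer: List[float] = []

    def on_event(self, event: dict) -> Optional[dict]:
        """Buffer the value; return an output event once the window is full."""
        self._buffer.append(event[self.field])
        if len(self._buffer) < self.window_size:
            return None
        average = sum(self._buffer) / len(self._buffer)
        self._buffer.clear()
        return {**event, "average": average}


class AggregationController:
    """Describes the element and creates a configured runtime (hypothetical API)."""

    def describe(self) -> Dict[str, object]:
        return {"name": "Aggregation", "user_config": ["field", "window_size"]}

    def start(self, field: str, window_size: int) -> AggregationRuntime:
        return AggregationRuntime(field, window_size)


runtime = AggregationController().start(field="flowRate", window_size=2)
first = runtime.on_event({"flowRate": 4.0})
second = runtime.on_event({"flowRate": 5.0})
print(first, second)
```

The controller holds only the description and configuration, so the same runtime logic can be started many times with different settings.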
Analytics microservices
How to implement a new processing element
1. Select wrapper: Maven Archetype
2. Implement runtime: StreamPipes SDK
3. Describe controller: StreamPipes SDK
4. Build / Install: SDK, Docker, UI
Analytics microservices
Runtime Wrapper
• Standalone/Edge Wrapper
• Kafka Streams Wrapper
• Python Wrapper
• Flink Wrapper
Analytics microservices
SDK: Runtime
Analytics microservices
Processing Element Description
Semantic metadata:
• Input Requirements: Schema, Quality, Protocol, Format
• User Configuration: Text Input, Selections, Domain Knowledge, …
• Output Strategy: Keep, Custom, Transform, Append, …
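An output strategy tells the platform what the output schema looks like given the input schema. The sketch below illustrates the idea for two of the strategies named on the slide, Keep and Append. It assumes a schema is a mapping from runtime name to data type; the function names and the clash handling are assumptions, not the StreamPipes SDK.

```python
from typing import Dict

Schema = Dict[str, str]  # runtime name -> data type


def keep(input_schema: Schema) -> Schema:
    """Keep: the output schema equals the input schema."""
    return dict(input_schema)


def append(input_schema: Schema, new_fields: Schema) -> Schema:
    """Append: the output schema is the input schema plus new fields."""
    clashes = input_schema.keys() & new_fields.keys()
    if clashes:
        raise ValueError(f"fields already present: {sorted(clashes)}")
    return {**input_schema, **new_fields}


flow_schema = {"tstamp": "xsd:long", "flowRate": "xsd:float"}

print(keep(flow_schema))
print(append(flow_schema, {"average": "xsd:float"}))
```

Because the output schema is computed before the pipeline runs, the next element can be checked against it at modeling time.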
Analytics microservices
Development: Maven Archetypes & SDK
Description sections: Input, User Config, Output
Analytics microservices
Flink Deployment
• Pipeline Management and Controller: register, start
• Runtime (Aggregation) and RemoteEnvironment: Upload jar, Submit execution graph
• Flink Cluster: Aggregation Job, reading input events from a Kafka Source and writing output events to a Kafka Sink
Demo
Condition monitoring + StreamPipes
Rule-based monitoring of flow rate measurements in a multi tank system
Elements: Flow Sensor, MQTT, StreamPipes Connect, Aggregate data, Calculate Statistics, Detect Leakage, Notify, IoTDB
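The slides do not state which rule the demo uses to detect a leak. As one possible illustration of rule-based monitoring, the sketch below flags a leak when the flow rate stays below a minimum for several consecutive readings. The function `detect_leakage`, the threshold of 3.5, and the count of 3 are all assumptions.

```python
from typing import Iterable, Iterator


def detect_leakage(flow_rates: Iterable[float],
                   min_expected: float,
                   consecutive: int) -> Iterator[int]:
    """Yield the index at which the flow rate has been below min_expected
    for exactly `consecutive` readings in a row."""
    below = 0
    for index, rate in enumerate(flow_rates):
        # Count the current run of low readings; any normal reading resets it.
        below = below + 1 if rate < min_expected else 0
        if below == consecutive:
            yield index


rates = [4.2, 4.1, 3.0, 2.9, 2.8, 4.0, 2.5]
alerts = list(detect_leakage(rates, min_expected=3.5, consecutive=3))
print(alerts)
```

Requiring several low readings in a row avoids raising a notification on a single noisy measurement.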
3. Lessons Learned & Getting Started
Flink + StreamPipes
Lessons learned
✓ Potentially huge stream of sensor data needs scalability
✓ Remote Environment eased the implementation of Flink Wrapper
✓ Clean & intuitive Flink API enables fast processor development
✓ Simple setup for development (mini cluster) and deployment
✓ Easy to configure & monitor
✓ Good integration with Apache Kafka
How to start
Setting up StreamPipes
Docker-based installation (streampipes.org/en/download)
1. Download installer from GitHub
2. ./streampipes start
3. Finish installation in browser
What's next?
New features: Current work-in-progress
• Data Access: Metadata recognition, PLC4X
• Data analytics & harmonization: Flink fault tolerance, Python wrapper, AutoML
• Data exploration & exploitation: Historical data explorer
• Infrastructure (Edge / Fog)
Let's connect!
…and if you like StreamPipes, star us on GitHub :)
streampipes.org
docs.streampipes.org
github.com/streampipes/streampipes
twitter.com/streampipes
feedback@streampipes.org
