Introduction to WSO2 Analytics Platform
Srinath Perera
VP Research
WSO2 Inc.
Analytics is Growing Up
▪ It is no longer about doing your first analytics use case.
▪ It is about
- How to do it every day, efficiently?
- How to recover?
- How to make decisions?
- How to do other forms, like realtime, interactive, and predictive analytics?
Analytics 2.0 Platform
▪ One platform for all four forms of analytics
▪ Single consistent programming model
▪ One analytics archive format
▪ Support for the lifecycle of analytics apps
Integrate well with the rest of the enterprise!
Introduction to WSO2 Analytics Platform: 2016 Q2 Update
Collect Data
▪ One Sensor API to publish events
- REST, Thrift, JMS, Kafka
- Java clients, JavaScript clients*
▪ First you define streams (think of a stream as an infinite table in a SQL database)
▪ Then send events via the Sensor API
Events can be sent to the batch pipeline, the realtime pipeline, or both via configuration!
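The routing idea on this slide can be pictured with a small sketch. It is not the WSO2 Sensor API: `EventRouter`, `Pipeline`, and the in-memory lists standing in for the two pipelines are hypothetical names, and the per-stream configuration map is an assumption made only for illustration.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: route each event to the batch pipeline, the realtime
// pipeline, or both, based on a per-stream configuration entry.
final class EventRouter {
    enum Pipeline { BATCH, REALTIME }

    // Stream name -> pipelines that should receive its events (the "configuration").
    private final Map<String, Set<Pipeline>> routes = new HashMap<>();
    // In-memory stand-ins for the two pipelines.
    final List<String> batchSink = new ArrayList<>();
    final List<String> realtimeSink = new ArrayList<>();

    void configure(String stream, Set<Pipeline> pipelines) {
        Set<Pipeline> copy = EnumSet.noneOf(Pipeline.class);
        copy.addAll(pipelines);
        routes.put(stream, copy);
    }

    // Delivers the payload and returns the number of pipelines that received it.
    int publish(String stream, String payload) {
        Set<Pipeline> targets = routes.getOrDefault(stream, EnumSet.noneOf(Pipeline.class));
        if (targets.contains(Pipeline.BATCH)) batchSink.add(payload);
        if (targets.contains(Pipeline.REALTIME)) realtimeSink.add(payload);
        return targets.size();
    }

    public static void main(String[] args) {
        EventRouter router = new EventRouter();
        router.configure("sensor.readings", EnumSet.of(Pipeline.BATCH, Pipeline.REALTIME));
        router.configure("audit.log", EnumSet.of(Pipeline.BATCH));
        router.publish("sensor.readings", "temp=21.5");
        router.publish("audit.log", "login ok");
        System.out.println("batch=" + router.batchSink + " realtime=" + router.realtimeSink);
    }
}
```

The point of the sketch is that the publisher code stays the same; only the configuration entry decides where an event ends up.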
Collecting Data: Example
▪ Java example: create and send events
▪ Events are sent asynchronously
▪ See the client given in http://goo.gl/vIJzqc for more info
// Initialize Agent
Agent agent = new Agent(agentConfiguration);
AsyncDataPublisher publisher = new AsyncDataPublisher("tcp://hostname:7612", .. );

// Define Stream
StreamDefinition definition = new StreamDefinition(STREAM_NAME, VERSION);
definition.addPayloadData("sid", STRING);
...
publisher.addStreamDefinition(definition);
...

// Send event
Event event = new Event();
event.setPayloadData(eventData);
publisher.publish(STREAM_NAME, VERSION, event);
Analysis: Batch Analytics
Complex Event Processing
Analytics Logic with SQL-like Queries
▪ Both BAM and CEP provide a SQL-like data processing language
▪ Since many people understand SQL, these languages have made large-scale (Big Data) processing accessible to many
▪ Expressive, short, and sweet
▪ Define core operations that cover 90% of problems
▪ Let experts dig in when they like (via user-defined functions)
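To make the "core operations" concrete, here is a minimal sketch of what a short SQL-like streaming query (a filter followed by an average over the last N matching events) actually computes. It is plain Java, not the query language of WSO2 CEP or BAM; `WindowedAverageQuery` and its parameters are hypothetical names chosen for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of the work behind a query such as
// "average of the last N readings whose value exceeds a threshold".
final class WindowedAverageQuery {
    private final double threshold;   // the filter: value > threshold
    private final int windowSize;     // the window: last N matching events
    private final Deque<Double> window = new ArrayDeque<>();
    private double sum = 0.0;

    WindowedAverageQuery(double threshold, int windowSize) {
        if (windowSize <= 0) throw new IllegalArgumentException("windowSize must be positive");
        this.threshold = threshold;
        this.windowSize = windowSize;
    }

    // Feeds one event and returns the current windowed average,
    // or NaN if no event has passed the filter yet.
    double onEvent(double value) {
        if (value > threshold) {
            window.addLast(value);
            sum += value;
            if (window.size() > windowSize) {
                sum -= window.removeFirst();   // evict the oldest event
            }
        }
        return window.isEmpty() ? Double.NaN : sum / window.size();
    }

    public static void main(String[] args) {
        WindowedAverageQuery query = new WindowedAverageQuery(10.0, 2);
        for (double reading : new double[] {5.0, 20.0, 30.0, 40.0}) {
            System.out.println(reading + " -> " + query.onEvent(reading));
        }
    }
}
```

A query language lets the user state the filter and the window in one or two lines and leaves this bookkeeping to the engine.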
Scaling CEP Queries on Top of Storm
▪ Accepts CEP queries with hints about how to partition streams
▪ Partitions the streams, builds an Apache Storm topology running CEP nodes as Storm Spouts, and runs it (see http://goo.gl/pP3kdX )
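Partitioning a stream by key is what lets a query run on several nodes at once. The sketch below shows one common way to do it, hashing the partition key to pick a node, similar in spirit to a fields grouping in Storm. It does not use the Storm or WSO2 CEP APIs; `StreamPartitioner` and the key format are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: assign each event to one of N processing nodes by hashing
// its partition key, so all events with the same key reach the same node.
final class StreamPartitioner {
    private final int nodeCount;

    StreamPartitioner(int nodeCount) {
        if (nodeCount <= 0) throw new IllegalArgumentException("nodeCount must be positive");
        this.nodeCount = nodeCount;
    }

    int nodeFor(String partitionKey) {
        // floorMod keeps the index non-negative even when hashCode() is negative.
        return Math.floorMod(partitionKey.hashCode(), nodeCount);
    }

    // Splits keys into per-node lists, preserving arrival order within each node.
    List<List<String>> partition(List<String> keys) {
        List<List<String>> nodes = new ArrayList<>();
        for (int i = 0; i < nodeCount; i++) nodes.add(new ArrayList<>());
        for (String key : keys) nodes.get(nodeFor(key)).add(key);
        return nodes;
    }

    public static void main(String[] args) {
        StreamPartitioner partitioner = new StreamPartitioner(4);
        List<String> keys = Arrays.asList("house-1", "house-2", "house-1", "house-3");
        System.out.println(partitioner.partition(keys));
    }
}
```

Because the mapping depends only on the key, stateful operators such as per-key windows can keep their state local to one node.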
Predictive Analytics
▪ Predictive analytics learns a decision function (a model) from examples
- Is this fraud?
- How to drive?
- Handwritten text
▪ Build models with the WSO2 Machine Learner product (2015 Q3) and use them with WSO2 CEP, BAM, and ESB
▪ Build models using R, export them as PMML, and use them within WSO2 CEP
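As a toy illustration of "learning a decision function from examples", the sketch below learns a single threshold on one feature from labeled transactions. It is a deliberately simple stand-in, not the algorithm used by WSO2 Machine Learner, R, or any PMML model; `ThresholdLearner` and the sample data are hypothetical.

```java
import java.util.Arrays;

// Hypothetical sketch: learn a one-feature decision function
// "flag as fraud when amount > threshold" from labeled examples.
final class ThresholdLearner {
    // Returns the candidate threshold with the fewest misclassified training examples.
    static double learn(double[] amounts, boolean[] isFraud) {
        if (amounts.length == 0 || amounts.length != isFraud.length) {
            throw new IllegalArgumentException("need equally sized, non-empty inputs");
        }
        double[] candidates = amounts.clone();
        Arrays.sort(candidates);
        double best = candidates[0];
        int bestErrors = Integer.MAX_VALUE;
        for (double candidate : candidates) {
            int errors = 0;
            for (int i = 0; i < amounts.length; i++) {
                if (predict(candidate, amounts[i]) != isFraud[i]) errors++;
            }
            if (errors < bestErrors) {   // keep the first threshold with the lowest error
                bestErrors = errors;
                best = candidate;
            }
        }
        return best;
    }

    // The learned model: a decision function over one feature.
    static boolean predict(double threshold, double amount) {
        return amount > threshold;
    }

    public static void main(String[] args) {
        double[] amounts = {10.0, 20.0, 500.0, 900.0};
        boolean[] isFraud = {false, false, true, true};
        double threshold = learn(amounts, isFraud);
        System.out.println("threshold=" + threshold + " flag 700? " + predict(threshold, 700.0));
    }
}
```

Real models use many features and more robust training, but the workflow is the same: fit on examples once, then apply the resulting function to new events.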
WSO2 Machine Learner
▪ A wizard to sample, explore, and understand data through visualizations
▪ A wizard to configure and train machine learning models, and select the best model
▪ Find and use those models with WSO2 CEP, BAM, and ESB
▪ Powered by Apache Spark MLlib
Communicate: Dashboards
▪ The idea is to give an overall picture at a glance (e.g., a car dashboard)
▪ Supports personalization: you can build your own dashboard
▪ Also the entry point for drill-down
▪ How to build?
- Dashboard via Google Gadget and content via HTML5 + JavaScript
- Use charting libraries like Vega or D3
Communicate: Alerts
▪ Detecting conditions can be done via CEP queries
▪ The key is the “last mile”
- Email
- SMS
- Push notifications to a UI
- Pager
- Trigger a physical alarm
▪ How?
- Select the email sender “Output Adaptor” in CEP, or send from CEP to ESB, which has many connectors
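The split between detecting a condition and delivering it over the "last mile" can be sketched as follows. This is not the CEP output adaptor API: `AlertDispatcher` and `Channel` are hypothetical names, and the channel here only prints, standing in for email, SMS, or push delivery.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.DoublePredicate;

// Hypothetical sketch: a detected condition fans out to pluggable "last mile" channels.
final class AlertDispatcher {
    // Stand-in for an output adaptor (email, SMS, push notification, ...).
    interface Channel {
        void send(String message);
    }

    private final DoublePredicate condition;
    private final List<Channel> channels = new ArrayList<>();

    AlertDispatcher(DoublePredicate condition) {
        this.condition = condition;
    }

    void addChannel(Channel channel) {
        channels.add(channel);
    }

    // Returns true if the reading met the condition and was sent to every channel.
    boolean onReading(String source, double value) {
        if (!condition.test(value)) return false;
        String message = "ALERT from " + source + ": value=" + value;
        for (Channel channel : channels) channel.send(message);
        return true;
    }

    public static void main(String[] args) {
        AlertDispatcher dispatcher = new AlertDispatcher(v -> v > 100.0);
        dispatcher.addChannel(message -> System.out.println("[email] " + message));
        dispatcher.onReading("boiler-7", 42.0);
        dispatcher.onReading("boiler-7", 150.0);
    }
}
```

Keeping the channel behind an interface means a new delivery method can be added without touching the detection logic.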
Communicate: APIs
▪ With mobile apps, most data is exposed and shared with end users as APIs (REST/JSON)
▪ Analytics results need to be exposed as APIs
▪ Some of the challenges:
- Security and permissions
- API discovery
- Billing, throttling, quotas, and SLAs
▪ How?
- Write data to a database from CEP event tables
- Build services via WSO2 Data Service
- Expose them as APIs via API Manager
Event Stream Store
▪ One-stop place for all event stream definitions
▪ Lets users
- Publish and consume through multiple protocols like REST, JMS, Thrift, WebSockets, etc.
- Discover event streams
▪ Enforce security and authorization
▪ Per-pay subscriptions
▪ Effectively an event stream marketplace!
▪ This will automate the API creation discussed on the previous slide.
What is it good for?
▪ Batch Analytics
▪ Realtime Streaming analytics
▪ Realtime Interactive analytics
▪ Lambda Architecture
▪ Train and use a ML model
▪ Selective Detailed Analysis
http://tinybuddha.com/blog/a-simple-technique-to-solve-problems-before-they-get-bigger/
Selective Detailed Analysis
• It is too expensive to do detailed analysis on all the data
• Instead, detect the condition and then dig into the related data
• Fraud toolbox
• Other use cases
– Dynamic offers at a retail site
– Weather
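The "detect first, then dig in" pattern can be sketched like this: a cheap check runs on every event, and the costlier analysis runs only when that check fires and only over data related to the trigger. The class name, the trigger amount, and the "ten times the account average" rule are all hypothetical and stand in for a real fraud toolbox.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: a cheap per-event check decides when to run a costlier
// analysis, and that analysis only looks at events related to the trigger.
final class SelectiveAnalyzer {
    private final double triggerAmount;
    private final Map<String, List<Double>> historyByAccount = new HashMap<>();
    int detailedRuns = 0;   // how many times the expensive path ran

    SelectiveAnalyzer(double triggerAmount) {
        this.triggerAmount = triggerAmount;
    }

    // Returns true when the detailed analysis flags the transaction as suspicious.
    boolean onTransaction(String account, double amount) {
        List<Double> history = historyByAccount.computeIfAbsent(account, k -> new ArrayList<>());
        boolean suspicious = false;
        if (amount > triggerAmount) {            // cheap detection on every event
            detailedRuns++;
            suspicious = detailedAnalysis(history, amount);
        }
        history.add(amount);
        return suspicious;
    }

    // Stand-in for an expensive analysis: compare with the account's own average.
    private static boolean detailedAnalysis(List<Double> history, double amount) {
        if (history.isEmpty()) return true;      // no baseline: treat as suspicious
        double sum = 0.0;
        for (double past : history) sum += past;
        return amount > 10 * (sum / history.size());
    }

    public static void main(String[] args) {
        SelectiveAnalyzer analyzer = new SelectiveAnalyzer(1000.0);
        analyzer.onTransaction("acc-1", 50.0);
        analyzer.onTransaction("acc-1", 60.0);
        System.out.println("suspicious? " + analyzer.onTransaction("acc-1", 5000.0));
        System.out.println("detailed runs: " + analyzer.detailedRuns);
    }
}
```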
Lambda Architecture
• Same code in both the batch and realtime layers
• The idea is to fill the time between two batch runs
• The batch layer writes its data to a DB
• The realtime layer merges with the batch data via Event Tables
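A minimal sketch of the merge step, assuming a simple per-key event count: the batch layer periodically publishes totals, and the realtime layer covers the events that arrived since the last batch run. `LambdaCounter` is a hypothetical in-memory stand-in for the DB and Event Tables, and it assumes each batch run accounts for every event seen up to the moment it completes.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a Lambda-style serving view for per-key counts.
final class LambdaCounter {
    private Map<String, Long> batchView = new HashMap<>();            // written by the batch run
    private final Map<String, Long> realtimeDelta = new HashMap<>();  // events since the last batch run

    // Realtime layer: count an event the batch layer has not processed yet.
    void onEvent(String key) {
        realtimeDelta.merge(key, 1L, Long::sum);
    }

    // Batch layer: publish fresh totals that include everything seen so far,
    // so the realtime delta can start again from zero.
    void onBatchRunCompleted(Map<String, Long> totals) {
        batchView = new HashMap<>(totals);
        realtimeDelta.clear();
    }

    // Query: batch result plus whatever arrived between batch runs.
    long count(String key) {
        return batchView.getOrDefault(key, 0L) + realtimeDelta.getOrDefault(key, 0L);
    }

    public static void main(String[] args) {
        LambdaCounter counter = new LambdaCounter();
        counter.onEvent("page-a");
        counter.onEvent("page-a");
        counter.onBatchRunCompleted(Collections.singletonMap("page-a", 2L));
        counter.onEvent("page-a");
        System.out.println("page-a count: " + counter.count("page-a"));
    }
}
```

Queries always read the sum of both layers, so results stay current between batch runs while the batch totals remain the source of truth.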
Real Life Use Cases
▪ Health, Smart Parking solutions
▪ Financial Monitoring
▪ Smart City project, Vehicle tracking, Building monitoring
▪ Railway monitoring
▪ Throttling and Anomaly Detection
▪ API Analytics (13+ customers)
▪ Connected Car
Case Study: DEBS Grand Challenges
▪ The DEBS (Distributed Event-Based Systems) Grand Challenge is a yearly event processing challenge.
▪ 2014 Challenge:
- Smart home electricity data: 2000 sensors, 40 houses, 4 billion events.
- We posted 400K events/sec, and close to one million events/sec of distributed throughput with 4 nodes.
- One of the four finalists.
▪ 2015 Challenge:
- Based on taxi activity collected from New York City over the year 2013: 14,144 taxis, 173 million taxi trip records.
- We posted 300K events/sec on a single node and were one of the finalists.
https://www.flickr.com/photos/shedboy/3681317392/
Case Study: Realtime Soccer Analysis
Watch at:
https://www.youtube.com/watch?v=nRI6buQ0NOM
Case Study: TFL Traffic Analysis
Built using TFL (Transport for London) open data feeds.
http://goo.gl/04tX6k
http://goo.gl/9xNiCm
Select the Product
Product                               Features
WSO2 Data Analytics Server (DAS)      Everything: Batch, Realtime, Interactive, and Predictive Analytics
WSO2 Complex Event Processor (CEP)    Realtime Analytics only
WSO2 Machine Learner                  Predictive Analytics only
Questions?
Thank You
