Findability Day 2016 - Big data analytics and machine learning
Kai Wähner
Technology Evangelist
kontakt@kai-waehner.de
LinkedIn
@KaiWaehner
www.kai-waehner.de
Findability Day 2016 (Stockholm, Sweden)
How to Leverage Machine Learning to Find Insights in Historical Data
© Copyright 2000-2016 TIBCO Software Inc.
Apply Big Data Analytics to Real Time Processing
Analyze and Act on Critical Business Moments
Agenda
1) Machine Learning and Big Data Analytics
2) Building an Analytic Model
3) Real Time Processing
4) Real World Scenario
1) Machine Learning and Big Data Analytics
Machine Learning … allows computers to find hidden insights without being explicitly programmed where to look.
Real World Examples of Machine Learning
Spam Detection
Search Results +
Product Recommendation
Picture Detection
(Friends, Locations, Products)
Machine Learning is already present in daily life…
Now, every enterprise is beginning to leverage it!
The Next Disruption:
Google Beats Go Champion
Analytics Maturity Model
A good Big Data Analytics platform can provide value to the organization across the full spectrum of use cases
[Figure: Analytics Maturity Model. Axes: Value to the Organization (Immediate to Long-Term Competitive Advantage) and Analytics Maturity. Maturity stages: Measure, Diagnose, Predict, Optimize, Alert, Automate. Capabilities: Self-service Dashboards, Visual Analytics, Advanced Analytics, Event Processing.]
The first task in a new analytics project is to define a Business Case!
2) Building an Analytic Model
Analytical Pipeline
Data Acquisition
Data Munging - Transformations

Input: order records

     cust_id  dept  sku    dollar  gift   date
1    104      C     12003   2.40   FALSE  2016-10-17
2    105      A     12005  62.85   FALSE  2016-10-17
3    102      C     12007  69.23   TRUE   2016-10-17
4    104      B     12004   9.33   FALSE  2016-10-18
5    105      C     12010  14.16   TRUE   2016-10-18
6    101      B     12003  90.43   FALSE  2016-10-19
7    103      C     12005  90.97   FALSE  2016-10-19
n    …        …     …      …       …      …

Output: one row per customer

     cust_id  A      B      C      total   # orders  first_date  last_date
1    100      21.76  23.67   0.00   45.43  2         2016-10-19  2016-10-20
2    101       0.01  74.65   0.00   74.66  3         2016-10-19  2016-10-20
3    102       0.00  60.92  50.29  111.21  6         2016-10-17  2016-10-20
4    103       0.00   0.00  52.30   52.30  2         2016-10-19  2016-10-20
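The transformation on this slide, from one row per order to one row per customer, can be sketched in plain Python. This is a minimal sketch, not the tooling used in the talk: the function name `summarize_customers` and the field name `n_orders` are hypothetical. The sample data are the seven input rows from the slide; the slide's output table covers other customers, so the numbers produced here differ from it.

```python
from datetime import date

# The seven sample order rows from the slide:
# (cust_id, dept, sku, dollar, gift, date)
ORDERS = [
    (104, "C", 12003, 2.40, False, date(2016, 10, 17)),
    (105, "A", 12005, 62.85, False, date(2016, 10, 17)),
    (102, "C", 12007, 69.23, True, date(2016, 10, 17)),
    (104, "B", 12004, 9.33, False, date(2016, 10, 18)),
    (105, "C", 12010, 14.16, True, date(2016, 10, 18)),
    (101, "B", 12003, 90.43, False, date(2016, 10, 19)),
    (103, "C", 12005, 90.97, False, date(2016, 10, 19)),
]


def summarize_customers(orders, depts=("A", "B", "C")):
    """Aggregate order rows into one summary record per customer."""
    summary = {}
    for cust_id, dept, _sku, dollar, _gift, day in orders:
        # Create an empty record the first time a customer is seen.
        rec = summary.setdefault(cust_id, {
            **{d: 0.0 for d in depts},
            "total": 0.0,
            "n_orders": 0,
            "first_date": day,
            "last_date": day,
        })
        rec[dept] += dollar          # spend per department
        rec["total"] += dollar       # spend across all departments
        rec["n_orders"] += 1
        rec["first_date"] = min(rec["first_date"], day)
        rec["last_date"] = max(rec["last_date"], day)
    return summary


if __name__ == "__main__":
    for cust_id, rec in sorted(summarize_customers(ORDERS).items()):
        print(cust_id, rec)
```

Keeping the department columns as a parameter makes the pivot explicit: every customer gets a value for every department, including 0.00 where nothing was bought, which is the shape the output table on the slide has.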
“The greatest value of a picture
is when it forces us to notice
what we never expected to see”
John W. Tukey, 1977
Exploratory Data Analysis
Visual Analytics - Interactive Brush-Linked
Which picture represents a model?
A model is a simplification of the truth that helps you with decision making.
Model Building
Employees who write longer emails earn higher salaries!
Model Improvement
Managers
Staff
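One reading of the "Managers" and "Staff" labels is that splitting employees by role changes the picture: the pooled trend between email length and salary may largely reflect the role rather than the emails. The sketch below illustrates that idea with made-up numbers (they are not from the talk), and `slope` is a hypothetical helper that fits an ordinary least-squares line.

```python
def slope(points):
    """Ordinary least-squares slope of y on x for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in points)
    var = sum((x - mean_x) ** 2 for x, _ in points)
    return cov / var


# Illustrative data only: (average email length in words, salary in k€).
STAFF = [(50, 40), (60, 42), (70, 40), (80, 42)]
MANAGERS = [(150, 90), (160, 92), (170, 90), (180, 92)]

pooled = slope(STAFF + MANAGERS)    # naive model: one line through everyone
within_staff = slope(STAFF)         # refined model: one line per role
within_managers = slope(MANAGERS)

if __name__ == "__main__":
    print(f"pooled slope:  {pooled:.3f} k€ per word")
    print(f"staff slope:   {within_staff:.3f} k€ per word")
    print(f"manager slope: {within_managers:.3f} k€ per word")
```

In this toy data the pooled slope is much steeper than either within-group slope, which is the kind of effect a model improvement step is meant to expose.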
Model Validation
How is the IQ of a kid related to the IQ of his / her mum?
Frameworks and Tooling
Smart Data Discovery (for the Business User)
“…as a next-generation data discovery capability that automatically finds and explains insights from advanced analytics to business users or citizen data scientists”
Leverage Machine Learning without the help of a Data Scientist
Advanced Analytics and Big Data Tools (for Data Scientists)
Many more ….
TIBCO Spotfire with R / TERR Integration
Let the business user leverage Analytic Models (created by the Data Scientist) to find insights!
Example: Customer Churn with Random Forest Algorithm
• Behind the ‘refresh model’ button lives a random forest algorithm
• It makes few a priori assumptions about the data and usually works well without tuning
• The business user doesn’t need to know what random forest is to be empowered by it
Select variables for the model
3) Real Time Processing
Operational Intelligence and Human Interaction
Actions by Operations: Human decisions in real time informed by up to date information
Machine-to-Machine Automation: Automated action based on models of history combined with live context and business rules
Visual Coding for Streaming Analytics with TIBCO StreamBase
• Streaming Operators
• Connectivity
• Visual Development
• Testing & Simulation
• Mature Tooling / Support
• Middleware Integration
Live Visual Analytics UI with TIBCO Live Datamart
Dynamic aggregation
Live visualization
Ad-hoc continuous query
Alerts
Action
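A continuous query differs from a one-off query in that its result is updated as each new event arrives. The sketch below shows the idea with a sliding-window average and an alert threshold in plain Python. It does not use the Live Datamart API; the class name `SlidingAverage`, the window size and the threshold are assumptions for illustration.

```python
from collections import deque


class SlidingAverage:
    """Keeps the average of the last `size` values and flags threshold breaches."""

    def __init__(self, size, alert_above):
        self.window = deque(maxlen=size)   # oldest values fall out automatically
        self.alert_above = alert_above

    def update(self, value):
        """Add one event and return (current average, alert flag)."""
        self.window.append(value)
        average = sum(self.window) / len(self.window)
        return average, average > self.alert_above


if __name__ == "__main__":
    query = SlidingAverage(size=3, alert_above=10.0)
    for reading in [8.0, 9.0, 10.0, 14.0, 15.0]:
        average, alert = query.update(reading)
        print(f"reading={reading} average={average:.2f} alert={alert}")
```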
How to apply analytic models to real time processing without redevelopment?
[Figure: TIBCO StreamBase shown together with analytic model sources: H2O.ai, open source R, TERR, Spark ML, MATLAB, SAS, PMML]
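The idea of reusing a model without redevelopment can be sketched as follows: the model is trained offline and exported as a portable artifact, and the streaming side only loads and applies it. The example uses a logistic-regression model stored as JSON as a stand-in for exchange formats such as PMML. The coefficients, the feature names and the `score` function are hypothetical and do not reflect the StreamBase API.

```python
import json
import math

# Hypothetical artifact exported by the data scientist's tooling.
MODEL_JSON = '{"intercept": -2.0, "weights": {"rework_count": 1.5, "temp_dev": 0.8}}'


def load_model(text):
    """Parse the exported model; no training code is needed at scoring time."""
    return json.loads(text)


def score(model, event):
    """Return the predicted probability for one incoming event."""
    z = model["intercept"] + sum(
        weight * event.get(name, 0.0)
        for name, weight in model["weights"].items()
    )
    return 1.0 / (1.0 + math.exp(-z))   # logistic function


if __name__ == "__main__":
    model = load_model(MODEL_JSON)
    stream = [
        {"rework_count": 0, "temp_dev": 0.1},
        {"rework_count": 2, "temp_dev": 1.0},
    ]
    for event in stream:
        print(event, round(score(model, event), 3))
```

Because the scoring side depends only on the exported artifact, the model can be retrained and swapped without touching the event-processing logic.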
TIBCO StreamBase Connector for R and TERR
4) Real World Scenario
Scenario: Predictive Scrapping of Parts in an Assembly Line
Goal: Scrap parts as early as possible automatically to reduce costs in a manufacturing process.
Question: When to scrap a part in Station 1 instead of doing re-work or sending it to Station 2?
Cost before Station 1: 9€
Cost of Station 1: 7€
Cost of Station 2: 13€
Total cost: 29€ (or more)
Decision points: scrap at Station 1? Scrap at Station 2?
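The cost figures on this slide can be turned into a simple decision rule. The sketch below assumes that 9€ is spent before Station 1, 7€ at Station 1 and 13€ at Station 2 (which adds up to the 29€ total), and that a model supplies a scrap probability for each part. The rule itself, scrapping when the expected wasted downstream cost exceeds the expected loss from discarding a good part, is an assumption for illustration and not the rule used in the talk.

```python
# Processing cost per stage in €, in the order a part passes through them.
STAGE_COSTS = [("before", 9.0), ("station_1", 7.0), ("station_2", 13.0)]


def sunk_and_remaining(completed_stages):
    """Split the total cost into what is already spent and what is still to come."""
    sunk = sum(cost for _, cost in STAGE_COSTS[:completed_stages])
    remaining = sum(cost for _, cost in STAGE_COSTS[completed_stages:])
    return sunk, remaining


def should_scrap(p_scrap, completed_stages):
    """Scrap now if the expected wasted future cost exceeds the expected loss of a good part."""
    sunk, remaining = sunk_and_remaining(completed_stages)
    expected_waste_if_continue = p_scrap * remaining
    expected_loss_if_scrap = (1.0 - p_scrap) * sunk
    return expected_waste_if_continue > expected_loss_if_scrap


if __name__ == "__main__":
    # completed_stages=1 means the part is about to enter Station 1.
    for p in (0.2, 0.5, 0.8):
        print(p, should_scrap(p, completed_stages=1))
```

A real rule would also account for re-work cost and the margin lost on a good part; both are left out here to keep the arithmetic visible.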
TIBCO Spotfire with H2O Integration
Data Discovery / Data Mining (“Are parts that repeat a station more likely scrap parts?”)
TIBCO Live Datamart
Operational Intelligence (“Monitor the manufacturing process and change rules in real time!”)
Live Datamart Desktop Client
Live Datamart Web API
TIBCO Accelerator for Apache Spark
The TIBCO Accelerator for Spark is a TIBCO engineered, light-weight open-source fast-start for systems to stream data into Spark, discover patterns in Spark with Spotfire, and operationalize the insights on Big Data.
FUNCTIONAL COMPONENTS
1. Fast Data Preparation for IoT
Dozens of enterprise and IoT data preparation adapters: MQTT, Databases; inbound creation of HDFS, Parquet, HBase, Avro…
2. Spotfire Model Discovery Template
Use Spotfire to explore Spark data lake, create predictive model, train in H2O, and deploy to Streaming Analytics.
3. Operationalize Predictive Models
Zookeeper deployment to StreamBase nodes living in Spark cluster via H2O, PMML, TERR models
4. Streaming Analytics for Automation
Automate action based on predictive models – make offers to customers, stop fraudulent transactions, alert.
5. Monitor & Retrain Model
Monitor behavior of model, retrain when necessary.
6. Drag & Drop for Business Solution Developers
Code-free development environment for work with H2O, HDFS, Avro, TERR
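Step 5 (monitor and retrain) can be sketched as a rolling check on recent prediction errors. This is a minimal illustration in plain Python and not part of the accelerator; the class name `ModelMonitor`, the window size and the error threshold are assumptions.

```python
from collections import deque


class ModelMonitor:
    """Tracks recent prediction outcomes and signals when retraining looks necessary."""

    def __init__(self, window=100, max_error_rate=0.2):
        self.errors = deque(maxlen=window)   # 1 = wrong prediction, 0 = correct
        self.max_error_rate = max_error_rate

    def record(self, predicted, actual):
        """Store whether the latest prediction matched the observed outcome."""
        self.errors.append(0 if predicted == actual else 1)

    def error_rate(self):
        """Share of wrong predictions in the current window (0.0 if empty)."""
        return sum(self.errors) / len(self.errors) if self.errors else 0.0

    def needs_retraining(self):
        """True once the window is full and the error rate exceeds the threshold."""
        full = len(self.errors) == self.errors.maxlen
        return full and self.error_rate() > self.max_error_rate


if __name__ == "__main__":
    monitor = ModelMonitor(window=5, max_error_rate=0.2)
    outcomes = [(1, 1), (0, 0), (1, 0), (0, 1), (1, 1)]
    for predicted, actual in outcomes:
        monitor.record(predicted, actual)
    print(monitor.error_rate(), monitor.needs_retraining())
```

Waiting for a full window before raising the flag avoids retraining on the first few unlucky predictions.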
Key Take-Aways
• Insights are hidden in Historical Data on Big Data Platforms
• Machine Learning and Big Data Analytics find these Insights by building Analytics Models
• Event Processing uses these Models (without Redevelopment) to take Action in Real Time
Questions? Please contact me!
Kai Wähner
Technology Evangelist
kontakt@kai-waehner.de
@KaiWaehner
www.kai-waehner.de
LinkedIn
