Data Science
in the Cloud
Margriet Groenendijk
IP EXPO Nordic
Stockholm - 20 September 2017
What is…
Data Science?
@MargrietGr
What is Data Science?
http://visual.ly/exports-and-imports-scotland
Data Science is… Big Data?
Data Science is… extracting new insights from data
Data Science is… using the Scientific Method
Scientific Method
Observations, Hypothesis, Predictions, Tests
The ideal workflow
Discover Data, Use Data, Publish Data, Socialize Data
In reality…
https://medium.com/towards-data-science/the-ten-fallacies-of-data-science-9b2af78a1862
Data Science
in the Cloud
The Weather Company
Normal Day: 325 TB per day, 2.4 billion API requests every 15 minutes, 50,000 videos played
Hurricane Harvey: 500 TB per day, 3.4 billion API requests every 15 minutes, 750,000 videos played
Example: Traffic Collisions and Weather
www.pexels.com/photo/blur-cars-dew-drops-125510/
Collect Data → Store Data → Explore Data → Publish Results
NYPD Traffic Collisions
812,526 collisions since April 2014
https://data.cityofnewyork.us/Public-Safety/NYPD-Motor-Vehicle-Collisions/h9gi-nx95
Weather
Historic Weather
https://business.weather.com/products/weather-data-packages
Collect Data (Traffic Collisions, Historic Weather) → Store Data → Explore Data → Publish Results
Store Data
Object Store: static data, unstructured data
Relational Database: tables with a fixed schema
NoSQL Database: everything else
Collect Data (Traffic Collisions, Historic Weather) → Store Data (Object Store) → Explore Data → Publish Results
https://stackoverflow.blog/2017/09/06/incredible-growth-python/
https://stackoverflow.blog/2017/09/14/python-growing-quickly/
Explore data in Jupyter notebooks
Notebooks in the Cloud
https://datascience.ibm.com/
Organized in Collaborative Projects
Explore data in notebooks in the Cloud
ibm.co/pixiedust
Collisions Data
Load into Pandas DataFrame
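The loading step might look roughly like the sketch below, assuming the collisions export is a CSV file. The column names and the inline sample rows are illustrative assumptions, not the real NYPD data, and `load_collisions` is a hypothetical helper. In practice the source would be a file downloaded from the open data portal or read from object storage.

```python
import io

import pandas as pd

# Hypothetical sample standing in for the NYPD collisions CSV export.
SAMPLE_CSV = """DATE,BOROUGH,NUMBER OF PERSONS INJURED
09/18/2017,BROOKLYN,1
09/18/2017,QUEENS,0
09/19/2017,BROOKLYN,2
"""


def load_collisions(source):
    """Read a collisions CSV (path or file-like object) into a DataFrame."""
    df = pd.read_csv(source)
    # Parse the date column so it can be used for grouping and joins later.
    df["DATE"] = pd.to_datetime(df["DATE"], format="%m/%d/%Y")
    return df


collisions = load_collisions(io.StringIO(SAMPLE_CSV))
print(collisions.dtypes)
print(collisions.head())
```

Parsing the date on load keeps later steps, such as daily aggregation and joining with weather, free of string handling.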
Explore Data with PixieDust
Explore Weather Data with PixieDust
Hypothesis
The number of traffic collisions is influenced by the weather
Combine collisions and weather data
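One way to combine the two sources, sketched under assumptions: the slides do not show the join, so the daily granularity, the column names (`precip_mm`, `temp_c`) and the sample values here are all hypothetical. The idea is to aggregate collisions to one row per day and then join the daily weather on the date.

```python
import pandas as pd

# Hypothetical per-collision records (one row per collision).
collisions = pd.DataFrame({
    "DATE": pd.to_datetime(["2017-09-18", "2017-09-18", "2017-09-19"]),
    "BOROUGH": ["BROOKLYN", "QUEENS", "BROOKLYN"],
})

# Hypothetical daily weather observations (one row per day).
weather = pd.DataFrame({
    "DATE": pd.to_datetime(["2017-09-18", "2017-09-19"]),
    "precip_mm": [0.0, 12.5],
    "temp_c": [21.0, 17.5],
})

# Aggregate collisions to one count per day so both tables share a key.
daily = collisions.groupby("DATE").size().reset_index(name="collisions")

# A left join keeps every day that has collisions, even if weather is missing.
combined = daily.merge(weather, on="DATE", how="left")
print(combined)
```

A left join is used so that days without a weather record show up as missing values rather than silently disappearing.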
Analysis with scikit-learn
Get the data ready for analysis
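The slides do not say which model was fitted first, so the following is only a sketch that assumes a plain linear regression of daily collisions on weather features. The data is synthetic and the relationship between rain and collisions is made up for illustration; it is not a result from the talk.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the combined daily table; not real data.
rng = np.random.default_rng(0)
n_days = 200
precip_mm = rng.uniform(0, 30, n_days)
temp_c = rng.uniform(-5, 30, n_days)
# Assumed relationship for illustration only: more rain, more collisions.
collisions = 500 + 4.0 * precip_mm + rng.normal(0, 10, n_days)

X = np.column_stack([precip_mm, temp_c])  # features, one row per day
y = collisions                            # target

# Hold out a quarter of the days to check the fit on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = LinearRegression().fit(X_train, y_train)
r2 = model.score(X_test, y_test)  # R^2 on the held-out days
print("coefficients:", model.coef_)
print("R^2 on test set:", round(r2, 3))
```

Getting the data ready here means arranging it as a numeric feature matrix with one row per day and a matching target vector, which is the shape scikit-learn estimators expect.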
Other models
Dimensionality Reduction (PCA)
Stepwise Linear Regression
Decision Trees
Random Forest
K-means clustering
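As a rough illustration of trying two of the listed models, the sketch below compares a decision tree and a random forest with cross-validation. The data is synthetic and the hyperparameters (`max_depth=4`, `n_estimators=50`) are arbitrary assumptions, not values from the talk.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

# Synthetic features and target; an assumed relationship, not real data.
rng = np.random.default_rng(1)
X = rng.uniform(0, 30, size=(200, 2))
y = 500 + 4.0 * X[:, 0] + rng.normal(0, 10, 200)

models = {
    "decision tree": DecisionTreeRegressor(max_depth=4, random_state=0),
    "random forest": RandomForestRegressor(n_estimators=50, random_state=0),
}

# Mean R^2 over 5 folds gives a like-for-like comparison of the models.
scores = {}
for name, model in models.items():
    scores[name] = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    print(name, round(scores[name], 3))
```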
Collect Data (Traffic Collisions, Historic Weather) → Store Data (Object Store) → Explore Data → Publish Results
Explore Data: new hypothesis, add more data, aggregate and summarize in different ways
Publish Results
Use weather forecast data
Model as a RESTful API: Flask app deployed in Bluemix
Watson Machine Learning Service
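Serving a model as a RESTful API with Flask could be sketched as follows. The `/predict` route, the JSON field names and the `predict_collisions` formula are hypothetical placeholders; the talk's actual app and trained model are not shown in the slides, and deployment configuration is left out.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def predict_collisions(precip_mm, temp_c):
    """Placeholder model: a hypothetical linear formula, not a trained model."""
    return 500.0 + 4.0 * precip_mm - 0.5 * temp_c


@app.route("/predict", methods=["POST"])
def predict():
    # Accept a JSON body with forecast values; reject incomplete input.
    payload = request.get_json(silent=True) or {}
    try:
        precip_mm = float(payload["precip_mm"])
        temp_c = float(payload["temp_c"])
    except (KeyError, TypeError, ValueError):
        return jsonify(error="precip_mm and temp_c are required numbers"), 400
    return jsonify(predicted_collisions=predict_collisions(precip_mm, temp_c))


# Exercise the endpoint in-process with Flask's test client (no server needed).
with app.test_client() as client:
    response = client.post("/predict", json={"precip_mm": 10, "temp_c": 20})
    result = response.get_json()
    print(result)
```

In a real deployment the placeholder function would be replaced by a call to the fitted model, with the weather forecast supplying the input values.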
Publish Results – PixieApp in a notebook
PixieApps: dashboards within a notebook
Why move Data Science to the Cloud?
Scales up – unlimited resources
All tools connected
GitHub integration
Local development, easy to move
Collaboration!
Thank you!
Dr. Margriet Groenendijk
Developer Advocate
mgroenen@uk.ibm.com
@MargrietGr
Slides
https://www.slideshare.net/MargrietGroenendijk/presentations
Blog
https://medium.com/ibm-watson-data-lab
IBM Data Science Experience
https://datascience.ibm.com
PixieDust
https://ibm-cds-labs.github.io/pixiedust/
Notebooks
https://github.com/ibm-cds-labs/python-notebooks
Weather Data
https://business.weather.com/products/weather-data-packages
IBM Bluemix
https://console.ng.bluemix.net/
