CFSummit: Data Science on Cloud Foundry
Data Science on Cloud Foundry
Ian Huston @ianhuston
Alexander Kagoshima @akagoshima
Who are we?
•  Data Scientists at Pivotal Labs
•  Using Cloud Foundry since 2013
•  Working with enterprises to get value out of their data
Image by Drew Conway: http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram
Data Scientist (n.): 

Person who is better at statistics than any
software engineer and better at software
engineering than any statistician.

- Josh Wills
Typical Projects
•  Risk Analysis
•  Predictive Maintenance
•  Understanding Your Customer
Data Services
Easy control of incoming data
Data Services
Bind and scale system services
–  Databases, NoSQL, message queues etc.
$ cf create-service rediscloud PLAN_NAME INSTANCE_NAME
$ cf bind-service APP_NAME INSTANCE_NAME
Add User Provided Services
–  Standalone Hadoop or Apache Spark cluster, Big Data System
$ cf cups SERVICE_INSTANCE -p "host, port, username, password"
[Diagram: a Data Service alongside several App instances]
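Once a service instance is bound with cf bind-service, Cloud Foundry exposes its credentials to the app through the VCAP_SERVICES environment variable, with user-provided services listed under the "user-provided" key. The following is a minimal Python sketch of how an app might look them up. The instance names (my-redis, my-hadoop), hostnames, and credential values are hypothetical, and the credential field names of a real service depend on its provider.

```python
import json
import os


def find_credentials(instance_name, environ=os.environ):
    """Return the credentials dict of the bound service instance with this name."""
    services = json.loads(environ.get("VCAP_SERVICES", "{}"))
    # VCAP_SERVICES maps a service label (e.g. "rediscloud" or "user-provided")
    # to a list of bound instances, each with a "name" and "credentials".
    for instances in services.values():
        for instance in instances:
            if instance.get("name") == instance_name:
                return instance.get("credentials", {})
    raise KeyError(f"no bound service instance named {instance_name!r}")


# Hypothetical VCAP_SERVICES content, for illustration only.
EXAMPLE_ENV = {
    "VCAP_SERVICES": json.dumps({
        "rediscloud": [{
            "name": "my-redis",
            "credentials": {
                "hostname": "redis.example.com",
                "port": "6379",
                "password": "secret",
            },
        }],
        "user-provided": [{
            "name": "my-hadoop",
            "credentials": {
                "host": "hadoop.example.com",
                "port": "8020",
                "username": "analyst",
                "password": "secret",
            },
        }],
    })
}

redis_creds = find_credentials("my-redis", EXAMPLE_ENV)
hadoop_creds = find_credentials("my-hadoop", EXAMPLE_ENV)
print(redis_creds["hostname"], hadoop_creds["host"])
```

Inside a deployed app the second argument would be omitted so that the real environment is read. Looking instances up by name rather than by service label keeps the same code working for marketplace and user-provided services.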
Deploy a Model Prediction API
Control distributed computation
https://github.com/ihuston/python-conda-buildpack
Install PyData packages with binary builds using conda
https://github.com/alexkago/cf-buildpack-r
R interpreter and package setup, ready for RShiny
Siloed Data
Siloed Systems
Distributed Big Data Platform
HOW TO DEPLOY MODELS?
Data Extract
(Model development happens here!)
(Business needs model predictions here!)
[Diagram: Apps, Big Data Platform, Big Data Storage]
PREDICTION API ARCHITECTURE
REST API
Send data as JSON
Data Ingest
Model
Create Model
Redis
Kicking off periodic retraining
Save training data
Save model object
Send JSON data without label
Receive prediction from trained model instance
Deployed at: http://dsoncf.cfapps.io
Code: https://github.com/pivotalsoftware/ds-cfpylearning
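The flow on this slide (create a model, ingest labelled JSON, retrain, save the model object, predict from unlabelled JSON) can be sketched in a few lines of Python. This is an illustrative sketch only and not the code in the ds-cfpylearning repository: the class and method names are hypothetical, a plain dict stands in for Redis, and a one-variable least-squares line stands in for a real model.

```python
import json
import pickle


def fit_line(rows):
    """Ordinary least squares for label = intercept + slope * x."""
    xs = [row["x"] for row in rows]
    ys = [row["label"] for row in rows]
    if len(set(xs)) < 2:
        raise ValueError("need at least two distinct x values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


class PredictionService:
    """Hypothetical service; `store` is a dict standing in for Redis."""

    def __init__(self, store):
        self.store = store

    def create_model(self, name):
        self.store[name + ":data"] = []
        self.store[name + ":model"] = None

    def ingest(self, name, payload):
        # Save one labelled JSON record as training data.
        self.store[name + ":data"].append(json.loads(payload))

    def retrain(self, name):
        # In the architecture above this step is kicked off periodically.
        model = fit_line(self.store[name + ":data"])
        self.store[name + ":model"] = pickle.dumps(model)  # save model object

    def predict(self, name, payload):
        blob = self.store[name + ":model"]
        if blob is None:
            raise RuntimeError("model has not been trained yet")
        model = pickle.loads(blob)
        x = json.loads(payload)["x"]  # incoming record carries no label
        return json.dumps({"prediction": model["intercept"] + model["slope"] * x})


service = PredictionService({})
service.create_model("demo")
for x, y in [(1, 2.0), (2, 4.0), (3, 6.0)]:
    service.ingest("demo", json.dumps({"x": x, "label": y}))
service.retrain("demo")
result = service.predict("demo", json.dumps({"x": 4}))
print(result)
```

Keeping both the training data and the serialized model in the shared store means any app instance can serve predictions, which is what lets the API scale out across instances.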
$ cf create-service rediscloud PLAN_NAME INSTANCE_NAME
MODEL INTERFACE
Data Driven Applications
SIMPLE HTML + JS: MODEL PREDICTIONS
http://ds-demo-transport.cfapps.io
RSHINY APP: INTERACTIVE EXPLORATION
https://ak-insurance-demo.cfapps.io:4443/
Show off your data science related Cloud Foundry apps:

Twitter: @dsoncf
http://dsoncf.com
@ianhuston
@akagoshima
Visualization
PREDICTION API
ARCHITECTURE
