Scalable Prediction Services with R
#RstatsNYC @Socure
• Real-time fraud detection service using social and online data.
• Predictive R models.
• Latency SLA with customers.
• Model versioning.
• Zero-downtime updates.
Challenges
• R not dev-ops friendly.
• Enterprise prediction services a large commitment.
• Enterprise prediction services offer limited model types.
• Transferability and transparency of models.
• Vendor lock-in.
Solution
• Embed R models within dev-ops-friendly middleware.
• Management, deployment, and integration leverage existing dev-ops
processes.
• Service scaling using established strategies and methods.
[Diagram: a model object (name = generic, version = 20150215) is written to the file gen_20150215.rds with saveRDS() and loaded back with readRDS().]
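The save and restore round trip on this slide can be sketched in base R. This is a minimal illustration, not the code from the talk: the glm() fit on mtcars is a stand-in model, the list wrapper holding name and version is an assumed layout, and the temporary directory is chosen only so the sketch runs anywhere.

```r
# Stand-in model; the slides do not say which model type was deployed.
fit <- glm(am ~ wt + hp, data = mtcars, family = binomial())

# Assumed wrapper: the fit plus the name/version metadata shown on the slide.
model <- list(name = "generic", version = "20150215", fit = fit)

# File name follows the slide's pattern gen_<version>.rds
# (the temporary directory is an assumption of this sketch).
path <- file.path(tempdir(), sprintf("gen_%s.rds", model$version))
saveRDS(model, path)

# Later, possibly in a different R process, restore the same object.
restored <- readRDS(path)

# Score a few new rows with the restored model.
new_rows <- mtcars[1:3, c("wt", "hp")]
scores <- predict(restored$fit, newdata = new_rows, type = "response")
```

Because readRDS() returns the object instead of restoring it under a fixed name, the service can hold several versions of the same model side by side.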
[Diagram: a Rook application serves http://…./model/20150215. A Model Map (columns: name, version; row: model, 20150215) points to the model (name = generic, version = 20150215), which is stored in gen_20150215.rds via saveRDS() and loaded via readRDS(). The final build adds a predict() call on the loaded model and JSON as the data format.]
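One way the Model Map lookup and the predict() step could fit together is sketched below in base R. The names model_map, loaded, and score_request are hypothetical, the lm() fit is a stand-in model, and the path form generic/20150215 is taken from the later slides. The Rook wiring and the JSON encoding of the result are left out.

```r
# Hypothetical registry: one row per deployed model, like the Model Map.
model_map <- data.frame(
  name    = "generic",
  version = "20150215",
  file    = file.path(tempdir(), "gen_20150215.rds"),
  stringsAsFactors = FALSE
)

# Stand-in model written to the mapped file so the sketch is runnable.
saveRDS(lm(mpg ~ wt + hp, data = mtcars), model_map$file[1])

# Cache of loaded models, so readRDS() runs once per version
# instead of once per request.
loaded <- new.env()

score_request <- function(path, newdata) {
  # Split "name/version" (an optional leading slash is dropped).
  parts <- strsplit(sub("^/", "", path), "/", fixed = TRUE)[[1]]
  if (length(parts) != 2) stop("expected a path of the form name/version")

  # Resolve the request against the model map.
  hit <- model_map[model_map$name == parts[1] &
                     model_map$version == parts[2], ]
  if (nrow(hit) != 1) stop("unknown model: ", path)

  # Load the model on first use, then reuse the cached copy.
  key <- paste(parts, collapse = "/")
  if (!exists(key, envir = loaded, inherits = FALSE)) {
    assign(key, readRDS(hit$file), envir = loaded)
  }
  fit <- get(key, envir = loaded)

  list(name = parts[1], version = parts[2],
       prediction = unname(predict(fit, newdata = newdata)))
}

result <- score_request("generic/20150215", data.frame(wt = 2.5, hp = 110))
```

Under this layout, adding a row to the map for a new version leaves requests for older versions untouched, which is one plausible route to the versioning and zero-downtime goals listed at the start of the talk.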
[Diagram: POST generic/20150215 requests are distributed across multiple Rook processes created with fork().]
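The idea of forking several R workers from one parent can be illustrated with the parallel package that ships with R. This is a stand-in for the long-lived Rook processes on the slide, not a reproduction of them: mclapply() forks short-lived children, the lm() fit is a placeholder model, and the four request batches are invented for the sketch.

```r
library(parallel)

# Load the model once in the parent; forked children inherit it
# without reading the file again.
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Four hypothetical request payloads, one per worker.
requests <- split(mtcars[, c("wt", "hp")],
                  rep(1:4, length.out = nrow(mtcars)))

# mclapply() forks the current R process. Forking is unavailable on
# Windows, so fall back to a single core there.
cores <- if (.Platform$OS.type == "windows") 1L else 4L

responses <- mclapply(requests, function(batch) {
  list(pid = Sys.getpid(),
       prediction = unname(predict(fit, newdata = batch)))
}, mc.cores = cores)
```

On a Unix-like system the pid fields generally differ between batches, showing that the predictions were computed in separate child processes.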
[Diagram: PMML variant. Labels: pmml.gbm() and pmml on the R side; http://…./generic/20150215 handled by a Servlet through doPost(), unmarshalPMML(), and evaluate(), using org.jpmml.evaluator ModelEvaluator.]
[Diagram: POST generic/20150215 requests are distributed across multiple Servlet instances.]
[Diagram: deployment. Labels: Virtual Machine, Docker, Public Repository, ECS, ElasticBeanStalk, and multiple R instances.]
[Diagram: http://…./generic/20150215 served through ElasticBeanStalk by three groups of three Prediction Service instances; each group is labeled US-EAST-1A.]
Conclusions
• Rapid deployment of R models in a scalable, robust environment.
• Directly leverage R models developed by data scientists and
analysts.
• Apply existing dev-ops processes for testing, monitoring, scaling,
and alerting of predictive models.
• Possible use of PMML to serialize models in the future for compliance.
GitHub
https://github.com/Socure/moduleR
We’re Hiring
http://www.socure.com/hiring
Director of Data Science
Senior Data Scientist
Director of Engineering
