PyData NYC 2015
November 10th 2015
Karim Chine
karim.chine@rosettahub.com
Towards a universal platform for data science on public and private clouds

A universal open platform for data science
Computational Components: R packages; wrapped C, C++, and Fortran code; Python modules; Matlab toolkits… Open source or commercial.
Computational Resources: clusters, grids, private or public clouds. Free or pay-per-use.
Computational GUIs: HTML5 and desktop workbench. Built-in views / plugins / collaborative views. Open source or commercial.
Computational Scripts: R / Python / Matlab / Groovy
Computational APIs: Java / SOAP / REST, stateless and stateful
Computational Storage: local, NFS, FTP, Amazon S3, EBS
Generated Computational Web Services: stateful or stateless, mapping of R objects/functions
Elastic-R
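
The "stateless and stateful" distinction in the API and web-service rows above can be illustrated with a small sketch. It is not Elastic-R's actual API: the names `stateless_call`, `StatefulSession`, and `mean` are hypothetical, and everything runs in one process rather than over Java, SOAP, or REST. The only point is the difference between a request that carries its own data and a session that keeps a workspace between requests.

```python
from typing import Any, Callable, Dict


def stateless_call(func: Callable[..., Any], *args: Any) -> Any:
    """Stateless style: each request carries everything it needs."""
    return func(*args)


class StatefulSession:
    """Stateful style: a session keeps a workspace between requests."""

    def __init__(self) -> None:
        self.workspace: Dict[str, Any] = {}

    def assign(self, name: str, value: Any) -> None:
        self.workspace[name] = value

    def call(self, func: Callable[..., Any], *names: str) -> Any:
        # Arguments are looked up by name in the session workspace.
        return func(*(self.workspace[n] for n in names))


def mean(values):
    """Hypothetical stand-in for a function exposed as a service."""
    return sum(values) / len(values)


# Stateless: the data travels with the call.
stateless_result = stateless_call(mean, [1.0, 2.0, 3.0])

# Stateful: the data is stored once, then referenced by name.
session = StatefulSession()
session.assign("x", [1.0, 2.0, 3.0])
stateful_result = session.call(mean, "x")
```

In the stateful case a second call can reuse `x` without resending it, which is the practical reason to keep a session alive for large data sets.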

Infrastructure federation: rosetta virtual cloud
Public Clouds
Private Cloud
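
One way to picture a virtual cloud that federates public and private infrastructure is a single entry point in front of several backends. The sketch below is purely illustrative: `Backend`, `VirtualCloud`, the capacity numbers, and the "try backends in order" policy are all assumptions made for the example, not a description of how the rosetta virtual cloud is implemented.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Machine:
    provider: str  # which backend supplied the machine
    name: str


class Backend:
    """Hypothetical backend standing in for one public or private cloud."""

    def __init__(self, provider: str, capacity: int) -> None:
        self.provider = provider
        self.capacity = capacity
        self.launched = 0

    def launch(self, name: str) -> Machine:
        if self.launched >= self.capacity:
            raise RuntimeError(f"{self.provider} is at capacity")
        self.launched += 1
        return Machine(self.provider, name)


class VirtualCloud:
    """Single entry point; backends are tried in the order given (assumed policy)."""

    def __init__(self, backends: List[Backend]) -> None:
        self.backends = backends

    def launch(self, name: str) -> Machine:
        for backend in self.backends:
            try:
                return backend.launch(name)
            except RuntimeError:
                continue  # fall through to the next backend
        raise RuntimeError("no backend has capacity left")


cloud = VirtualCloud([Backend("private", capacity=1), Backend("public", capacity=2)])
machines = [cloud.launch(f"node-{i}") for i in range(3)]
providers = [m.provider for m in machines]
```

Here the private backend fills first and later requests spill over to the public one; the caller only ever talks to `cloud`.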

AWS: programmable infrastructure
Command Line
Web Console
SDK
API
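
As a concrete illustration of the SDK route, the sketch below builds the arguments for launching EC2 instances from Python. The helper `run_instances_params` and the identifiers `ami-00000000` and `t2.micro` are placeholders chosen for the example. The block itself makes no network call; the commented lines show how the same dictionary would be passed to boto3's `run_instances`, assuming boto3 is installed and AWS credentials are configured.

```python
from typing import Any, Dict


def run_instances_params(image_id: str, instance_type: str, count: int = 1) -> Dict[str, Any]:
    """Build keyword arguments in the shape accepted by boto3's EC2 run_instances."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return {
        "ImageId": image_id,
        "InstanceType": instance_type,
        "MinCount": count,
        "MaxCount": count,
    }


# Placeholder identifiers, not real resources.
params = run_instances_params("ami-00000000", "t2.micro", count=2)

# With boto3 installed and credentials configured, the SDK call would be:
#   import boto3
#   ec2 = boto3.client("ec2")
#   response = ec2.run_instances(**params)
```

Keeping the parameters as plain data means the same request can be logged, reviewed, or replayed, which is much of what "programmable infrastructure" buys in practice.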

rosettaHUB: programming with data and infrastructure
Command Line
Web Console
SDK
API

Google Docs-like real-time collaboration

Traceable and Reproducible data science
Elastic-R Amazon Machine Images:
Elastic-R AMI 1: R 2.10, BioC 2.5
Elastic-R AMI 2: R 2.9, BioC 2.3
Elastic-R AMI 3: R 2.8, BioC 2.0
Amazon Elastic Block Stores:
Elastic-R EBS 1: Data Set XXX
Elastic-R EBS 2: Data Set YYY
Elastic-R EBS 3: Data Set ZZZ
Elastic-R EBS 4: Data Set VVV
Elastic-R.org: Elastic-R AMI 2 (R 2.9, BioC 2.3) with Elastic-R EBS 4 (Data Set VVV)
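
The pairing of a versioned machine image with a versioned data volume can be recorded explicitly so that an analysis can be traced back to its exact environment and data. The sketch below is one possible way to do that, not something shown on the slide: the classes `Environment` and `AnalysisRecord` and the `fingerprint` method are hypothetical, and the labels are taken from the slide purely as example values.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Environment:
    ami: str           # machine image label
    r_version: str
    bioc_version: str


@dataclass(frozen=True)
class AnalysisRecord:
    environment: Environment
    ebs_volume: str
    data_set: str

    def fingerprint(self) -> str:
        """Deterministic ID: the same environment and data give the same hash."""
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


ami2 = Environment(ami="Elastic-R AMI 2", r_version="2.9", bioc_version="2.3")
record = AnalysisRecord(environment=ami2, ebs_volume="Elastic-R EBS 4", data_set="VVV")
same_again = AnalysisRecord(environment=ami2, ebs_volume="Elastic-R EBS 4", data_set="VVV")
other_data = AnalysisRecord(environment=ami2, ebs_volume="Elastic-R EBS 3", data_set="ZZZ")
```

Two records with the same image and volume produce the same fingerprint, while changing either one produces a different fingerprint, so a result can be tagged with the ID of the setup that produced it.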

Architecture [diagrams, two slides]

Data science universal engine
• Remote Java/R Processes
• Event-driven Remote Objects/Engines
• R, Python, Mathematica, Matlab, Scilab, ...
• Collaborative Spreadsheets
• Collaborative Scientific Graphics Canvas
• Collaborative Dashboard with collaborative widgets
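
The event-driven idea behind the engine and its collaborative views can be sketched with a plain observer pattern. This is an in-process illustration only: `Engine`, `SpreadsheetView`, and their methods are hypothetical names, and the remoting layer that a real remote engine would need is left out entirely.

```python
from typing import Any, Callable, Dict, List

Listener = Callable[[str, Any], None]


class Engine:
    """Hypothetical engine that notifies listeners when a variable changes."""

    def __init__(self) -> None:
        self._variables: Dict[str, Any] = {}
        self._listeners: List[Listener] = []

    def subscribe(self, listener: Listener) -> None:
        self._listeners.append(listener)

    def assign(self, name: str, value: Any) -> None:
        self._variables[name] = value
        for listener in self._listeners:
            listener(name, value)  # push the change to every subscriber


class SpreadsheetView:
    """Stand-in for one collaborator's view of the shared state."""

    def __init__(self, owner: str) -> None:
        self.owner = owner
        self.cells: Dict[str, Any] = {}

    def on_change(self, name: str, value: Any) -> None:
        self.cells[name] = value


engine = Engine()
alice, bob = SpreadsheetView("alice"), SpreadsheetView("bob")
engine.subscribe(alice.on_change)
engine.subscribe(bob.on_change)
engine.assign("A1", 42)
```

After the single `assign`, both views hold the same cell value without either one polling the engine, which is the property that collaborative spreadsheets and dashboards depend on.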
www.rosettahub.com
