#Py2SAIS
Spark from Notebook to Cloud Native Application
Rebecca Simmonds
Senior Software Engineer
rsimmond@redhat.com
@becky_simmonds
To empower others with the tools and tips to go from prototype to production using Apache Spark
Aim
Prototype
1. Use case
2. Problem domain
3. Data set
4. Tools and techniques
Requirements
Use Case
Variety             Country  Points  Region
Tinta de Toro       Spain    98      Toro
Cabernet Sauvignon  US       70      Napa Valley
Macauley            US       50      Knights Valley
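To make the shape of the data set concrete, here is a minimal sketch in plain Python (not Spark) that holds the three sample rows and aggregates them. The aggregation, average points per country, is an assumption chosen for illustration; the slides do not say which calculation the talk uses, and the function name is hypothetical.

```python
from collections import defaultdict

# The three sample rows shown on the slide: (variety, country, points, region).
ROWS = [
    ("Tinta de Toro", "Spain", 98, "Toro"),
    ("Cabernet Sauvignon", "US", 70, "Napa Valley"),
    ("Macauley", "US", 50, "Knights Valley"),
]


def average_points_by_country(rows):
    """Return {country: mean points} for rows shaped like ROWS (assumed aggregation)."""
    totals = defaultdict(lambda: [0, 0])  # country -> [sum of points, row count]
    for _variety, country, points, _region in rows:
        totals[country][0] += points
        totals[country][1] += 1
    return {country: total / count for country, (total, count) in totals.items()}


averages = average_points_by_country(ROWS)
print(averages)  # {'Spain': 98.0, 'US': 60.0}
```

In a notebook the same grouping would more likely be expressed on a Spark DataFrame; plain Python is used here only so the sketch runs without a Spark installation.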
● Open-source web application
● Create and share live code examples
● Python code
● It empowers users with visualisation tools
Jupyter Notebook
Spark: Driver Process, Executors
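As a rough analogy for the driver and executor split, the sketch below uses a thread pool from the Python standard library. It is not Spark code and does not use any Spark API: the "driver" function partitions the data and combines partial results, while each "executor" task works on one partition. All function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(items, num_partitions):
    """Driver-side step: divide the data into roughly equal partitions."""
    return [items[i::num_partitions] for i in range(num_partitions)]


def executor_task(partition):
    """Executor-side step: compute a partial (sum, count) for one partition."""
    return sum(partition), len(partition)


def driver_mean(items, num_executors=3):
    """Driver: hand partitions to workers, then combine their partial results."""
    partitions = split_into_partitions(items, num_executors)
    with ThreadPoolExecutor(max_workers=num_executors) as pool:
        partials = list(pool.map(executor_task, partitions))
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count  # raises ZeroDivisionError for empty input


print(driver_mean([98, 70, 50]))
```

The point of the analogy is only the division of labour: one coordinating process plans and merges, and several workers do the per-partition computation in parallel.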
Demo
● Easy to set up and get going
● Lots of visualisations to practise with
● Great method for proof of concept
Conclusions
Production
1. Cloud-based for scale and portability
2. Tooling and techniques
3. Database/more robust store
4. Testing
Next Steps
Applications that are:
1. designed to run in the cloud
2. scalable
3. modular
4. and resilient
Cloud Native Applications
● Allow you to package and isolate a runtime environment
● Easily portable to different environments
● Scalable
● Quick and easy to deploy
Containers
Monolithic Architecture
User Interface
Business Logic
Data Access Layer
Database
Microservices Architecture
User Interface
Microservice, Microservice, Microservice
Database, Database, Database
Kubernetes
[Diagram labels: kubectl, apiserver, scheduler, replication controller, kubelet, proxy, three pods, three nodes]
An open source community working to empower intelligent applications on Kubernetes
Projects and tutorials to empower developers with machine learning techniques
radanalytics.io
Oshinko
Oshinko Deployment
Oshinko source to image
Architecture
[Diagram components: Web Browser, Wine Map Application, Spark (three instances), Postgresql]
[Diagram arrow labels: Request, Response, Job, Load and Calculate]
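One possible reading of the architecture diagram can be sketched in a few lines of Python. Everything here is an assumption made for illustration: an in-memory SQLite database stands in for PostgreSQL, a plain function stands in for the Spark "Load and Calculate" job, another function stands in for the web application answering a browser request, and the average-points-per-country calculation and all names are hypothetical.

```python
import json
import sqlite3


def spark_job_stand_in(conn):
    """Stand-in for the Spark job: load the wine rows and calculate averages."""
    conn.execute("DROP TABLE IF EXISTS country_scores")
    conn.execute(
        "CREATE TABLE country_scores AS "
        "SELECT country, AVG(points) AS avg_points FROM wines GROUP BY country"
    )
    conn.commit()


def handle_request(conn):
    """Stand-in for the web application: answer a browser request as JSON."""
    rows = conn.execute(
        "SELECT country, avg_points FROM country_scores ORDER BY country"
    ).fetchall()
    return json.dumps([{"country": c, "avg_points": p} for c, p in rows])


# In-memory SQLite stands in for the PostgreSQL store (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE wines (variety TEXT, country TEXT, points INTEGER, region TEXT)"
)
conn.executemany(
    "INSERT INTO wines VALUES (?, ?, ?, ?)",
    [
        ("Tinta de Toro", "Spain", 98, "Toro"),
        ("Cabernet Sauvignon", "US", 70, "Napa Valley"),
        ("Macauley", "US", 50, "Knights Valley"),
    ],
)
spark_job_stand_in(conn)
response = handle_request(conn)
print(response)
```

Keeping the calculation in a separate job and the request handling in the application mirrors the separation of components that the diagram lists, whatever the real data flow between them is.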
Demo
# test command: keep retrying until the output contains the expected text
# first argument: what to test
# second argument: expected result
os::cmd::try_until_text \
  'oc new-app --template=oshinko-python-spark-build-dc -p APPLICATION_NAME=winemap -p GIT_URI=https://github.com/radanalyticsio/winemap.git' \
  'Success'
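The retry-until-text pattern behind this test can be sketched in Python as follows. This is not the shell helper used on the slide; it is a hypothetical equivalent built on the standard library, and the command being polled is a stand-in that simply prints a success line so the sketch runs without an OpenShift cluster.

```python
import subprocess
import sys
import time


def try_until_text(command, expected, timeout=10.0, interval=0.2):
    """Re-run `command` until its output contains `expected` or time runs out.

    `command` is an argument list for subprocess.run. Returns True on a match,
    False if the timeout expires first.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = subprocess.run(command, capture_output=True, text=True)
        if expected in result.stdout + result.stderr:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Hypothetical stand-in command so the sketch runs without an OpenShift cluster.
fake_deploy = [sys.executable, "-c", "print('--> Success')"]
print(try_until_text(fake_deploy, "Success"))  # True
```

Polling with a deadline suits deployment tests because a cluster may take a while to reach the expected state, and a single immediate check would fail spuriously.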
● Jupyter notebook for prototyping
● Visit radanalytics.io
● Deploy your own cloud native applications
@becky_simmonds
rsimmond@redhat.com
https://radanalytics.io/applications/wine-map
Conclusion

Apache Spark from Notebook to Cloud Native Application with Rebecca Simmonds