Data Science Apps: Beyond Notebooks
Natalino Busa
2 Natalino Busa - @natbusa
LinkedIn + Twitter + GitHub:
@natbusa
DBS
Teradata
Cognitive Finance
ING Group
O’Reilly
Philips
Icons made by Gregor Cresnar from www.flaticon.com, licensed under CC
Learning: The Scientific Method
Ørsted's "First Introduction to General Physics" (1811)
https://en.m.wikipedia.org/wiki/History_of_scientific_method
observation hypothesis deduction synthesis
Hans Christian Ørsted
experiment
Data Scientist Experience
Cloud, Tools, Math, Humans
The Jupyter Project
http://jupyter.org
Jupyter notebook: what is it?
The Jupyter Notebook
The Jupyter Notebook is a web application that
allows you to create and share documents that
contain live code, equations, visualizations and
explanatory text.
Uses include: data cleaning and
transformation, numerical simulation,
statistical modeling, machine learning and
much more.
Credit: Jupyter project
Extracted from http://jupyter.org/index.html
Jupyter notebook: why?
Language of choice
The Notebook has support for
over 40 programming
languages, including those
popular in Data Science such as
Python, R, Julia and Scala.
Share notebooks
Notebooks can be shared with
others using email, Dropbox,
GitHub and the Jupyter
Notebook Viewer.
Interactive widgets
Code can produce rich output
such as images, videos, LaTeX,
and JavaScript. Interactive
widgets can be used to
manipulate and visualize data in
realtime.
Big data integration
Leverage big data tools, such as
Apache Spark, from Python, R
and Scala. Explore that same
data with pandas, scikit-learn,
ggplot2, dplyr, etc.
Credit: Jupyter project
Extracted from http://jupyter.org/index.html
Text Cell
Code Cell
Cell Input
Cell Output
Edit, Run, Kernel, Widgets menus
Kernel Type
Cell output: ASCII, HTML, image, etc.
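The cell anatomy above maps onto the notebook file itself. A minimal sketch, assuming the nbformat 4 JSON layout; the cell contents and the kernelspec values are made up for illustration, and the helper names are hypothetical:

```python
import json


def markdown_cell(text):
    """A text cell: Markdown source and no outputs."""
    return {"cell_type": "markdown", "metadata": {}, "source": text}


def code_cell(source, stdout, execution_count):
    """A code cell: the cell input plus the output captured when it ran."""
    return {
        "cell_type": "code",
        "execution_count": execution_count,
        "metadata": {},
        "source": source,
        "outputs": [
            {"output_type": "stream", "name": "stdout", "text": stdout}
        ],
    }


# Illustrative notebook: one text cell followed by one code cell.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {
        # The kernel type shown in the notebook UI comes from this entry.
        "kernelspec": {
            "name": "python3",
            "display_name": "Python 3",
            "language": "python",
        }
    },
    "cells": [
        markdown_cell("# Demo\nA text cell explaining the code below."),
        code_cell("print(1 + 1)", "2\n", 1),
    ],
}

# An .ipynb file is this structure serialized as JSON.
notebook_json = json.dumps(notebook, indent=1)
print(notebook_json)
```

Saving `notebook_json` to a file ending in `.ipynb` should give something the notebook web app can open, though that step is left out here.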
Architecture of a Jupyter Notebook
Diagram: Web Browser (Jupyter Notebook Web App) <-> HTTP / WebSockets <-> Jupyter Notebook Server <-> ØMQ <-> Kernel; the server reads and writes the notebook files.
https://jupyter.readthedocs.io/en/latest/architecture/how_jupyter_ipython_work.html
Architecture of a Jupyter Notebook
• Modular architecture:
Web App, Server, Kernel
• Kernels:
Python, R, Scala, Bash, SQL
• Web App:
Asynchronous, rich editing, syntax highlighting, export and share
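To make the Web App / Server / Kernel split concrete, here is a sketch of the kind of message the web app sends when it asks a kernel to run code. It assumes the `execute_request` shape from the Jupyter messaging specification (version 5); the `demo` user name and the session id are placeholders, and nothing is sent over the wire here.

```python
import json
import uuid
from datetime import datetime, timezone


def execute_request(code, session_id):
    """Build an execute_request message as a plain dictionary."""
    return {
        "header": {
            "msg_id": uuid.uuid4().hex,      # unique id for this message
            "session": session_id,           # identifies the client session
            "username": "demo",              # placeholder user name
            "date": datetime.now(timezone.utc).isoformat(),
            "msg_type": "execute_request",
            "version": "5.0",
        },
        "parent_header": {},                 # empty: this message starts a new exchange
        "metadata": {},
        "content": {
            "code": code,                    # the cell input
            "silent": False,
            "store_history": True,
            "user_expressions": {},
            "allow_stdin": False,
            "stop_on_error": True,
        },
        # Over the server's websocket, messages are tagged with their channel.
        "channel": "shell",
    }


message = execute_request("print(1 + 1)", session_id="placeholder-session")
print(json.dumps(message, indent=2))
```

The kernel replies with messages whose `parent_header` echoes this header, which is how a front end can match cell outputs to the cell that produced them.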
Jupyter Notebook
● Narratives and Use Cases
Narratives are collaborative, shareable, publishable, and reproducible. We believe that
Narratives help both yourself and other researchers by sharing your use of Jupyter
projects, technical specifics of your deployment, and installation and configuration tips so
that others can learn from your experiences.
From https://jupyter.readthedocs.io/en/latest/use-cases/content-user.html
Jupyter is more than Notebooks
“ What if I told you that the notebook
is NOT the only sort of narrative that
you can create with the Jupyter
project? ”
Examples of Jupyter powered narratives
● O’Reilly Orioles
● Examples - build your own!
Orioles: A powerful educational narrative
Geolocated clustering and prediction
services with scikit-learn
Learn how to build a venue
recommender and a geofencing
alerting engine using geolocated data,
ML clustering algorithms, and
scikit-learn
Build your own narrative!
What do you need?
1. Understand how to communicate with the Jupyter server
Two ways: WebSockets or HTTP API endpoints
2. Build your own web application
Many ways: e.g. Angular, Polymer, Dart, etc.
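As a rough sketch of the first step, these are the two kinds of endpoint a custom web app would talk to, assuming a Jupyter server that exposes the standard `/api/kernels` REST resource and the per-kernel `channels` websocket. The base URL and token are placeholders and the helper names are hypothetical. The request is only built, not sent.

```python
import json
import urllib.request


def kernels_url(base_url):
    """HTTP endpoint used to list or start kernels."""
    return base_url.rstrip("/") + "/api/kernels"


def channels_url(base_url, kernel_id):
    """Websocket endpoint carrying messages to and from one kernel."""
    # http -> ws and https -> wss: only the first occurrence is replaced.
    ws_base = base_url.rstrip("/").replace("http", "ws", 1)
    return f"{ws_base}/api/kernels/{kernel_id}/channels"


def start_kernel_request(base_url, token, kernel_name="python3"):
    """Build (but do not send) the POST request that starts a kernel."""
    body = json.dumps({"name": kernel_name}).encode("utf-8")
    return urllib.request.Request(
        kernels_url(base_url),
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"token {token}",
        },
    )


request = start_kernel_request("http://localhost:8888", token="placeholder-token")
print(request.get_method(), request.full_url)
print(channels_url("http://localhost:8888", "placeholder-kernel-id"))
```

Passing `request` to `urllib.request.urlopen` would send it to a running server; the JSON reply is expected to contain the kernel id needed for the websocket URL.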
Demos: kernel gateway
Purpose:
- Understand how to expose API endpoints
- Build your own narrative!
- Productivity gain: faster app prototyping
Jupyter Gateway: expose API endpoints
Declare the endpoint
Declare the MIME type, headers, status
GET http://localhost:8800/counters/my_counter
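A sketch of what the two declarations could look like, assuming the gateway's notebook-http mode, where a cell starting with the comment `# GET /counters/:name` handles the request and a matching `# ResponseInfo GET /counters/:name` cell prints the MIME type, headers and status as JSON. The `counters` store and the function names are hypothetical. In a real notebook the gateway injects `REQUEST` before running the cell; here a stand-in value is defined so the code runs on its own.

```python
import json

# Hypothetical in-memory store, used only for this sketch.
counters = {"my_counter": 3}


def get_counter(request_json):
    """Logic for a cell annotated with `# GET /counters/:name`."""
    request = json.loads(request_json)
    name = request["path"]["name"]          # value bound to :name in the URL
    return json.dumps({"name": name, "value": counters.get(name, 0)})


def counter_response_info():
    """Logic for a cell annotated with `# ResponseInfo GET /counters/:name`."""
    return json.dumps(
        {"headers": {"Content-Type": "application/json"}, "status": 200}
    )


# Stand-in for the REQUEST string the gateway would provide.
REQUEST = json.dumps(
    {"body": "", "args": {}, "path": {"name": "my_counter"}, "headers": {}}
)

# In the notebook, whatever each cell prints becomes the response.
print(get_counter(REQUEST))       # prints {"name": "my_counter", "value": 3}
print(counter_response_info())
```

Keeping the handler logic in functions is a choice made for this sketch so it can be checked outside the gateway; in the notebook the same code can sit directly in the annotated cells.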
Jupyter: docker stacks
Docker container:
jupyter notebook + apache toree
https://github.com/jupyter/docker-stacks
Dockerize your jupyter gateway api
# Shell: build the image, then start the kernel gateway inside the container
IMAGE=demos/kernel_gateway_demo
docker build -t "$IMAGE" .
docker run -p 8888:8888 "$IMAGE" \
  jupyter kernelgateway \
  --KernelGatewayApp.ip=0.0.0.0 \
  --KernelGatewayApp.port=8888 \
  --KernelGatewayApp.api=notebook-http \
  --KernelGatewayApp.seed_uri=/srv/notebooks/autoscience.ipynb
Big Data apps:
Dockerize your jupyter gateway api with Toree
Diagram: Web Browser (your own Web App) <-> HTTP REST API <-> Jupyter Kernel Gateway <-> ØMQ <-> Toree Kernel, with the gateway reading the notebook files. Labels: Docker Containers; one web session = one server on a cloud.
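From the web app's side, the dockerized gateway is just an HTTP API. A minimal client sketch, assuming the gateway is reachable on port 8888 and serves a `/counters/<name>` endpoint that returns JSON; both are assumptions, the helper names are hypothetical, and `fetch_counter` is not called here because it needs a running server.

```python
import json
import urllib.parse
import urllib.request


def counter_url(base_url, name):
    """Build the endpoint URL, escaping the counter name for use in a path."""
    quoted = urllib.parse.quote(name, safe="")
    return f"{base_url.rstrip('/')}/counters/{quoted}"


def fetch_counter(base_url, name, timeout=5.0):
    """GET the counter and decode the JSON body (requires a running gateway)."""
    with urllib.request.urlopen(counter_url(base_url, name), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


print(counter_url("http://localhost:8888", "my_counter"))
```

A browser front end would make the same call with its own HTTP client; the point is that the notebook behind the gateway stays invisible to the app.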
Summary
• Jupyter Notebook is a great way to create and share
data-driven use cases and projects
• Jupyter is more than notebooks
– gateway, kernels, hub, etc.
• Narratives powered by Jupyter
– O’Reilly Orioles
– build your own narrative
Resources
Jupyter
http://jupyter.org/index.html
https://jupyter.readthedocs.io/en/latest/index.html#
Jupyter Kernel Gateway
https://github.com/jupyter/kernel_gateway
http://jupyter-kernel-gateway.readthedocs.io/en/latest/
JupyterCon (first of its kind!)
https://conferences.oreilly.com/jupyter/jup-ny
Apache Toree (Spark Kernel)
https://toree.apache.org/
Web application dev
https://angular.io/
https://www.polymer-project.org/1.0/
Docker
https://github.com/jupyter/docker-stacks
https://www.docker.com/
LinkedIn and Twitter:
@natbusa
