Data Visualization in Python
Marc Garcia - @datapythonista
Data Visualisation Summit - London, 2017
1 / 34
Data Visualization in Python - @datapythonista
About me
http://datapythonista.github.io
Python for data science
Why Python?
Python is the favorite of many:
Fast to write: Batteries included
Easy to read: Readability is KEY
Excellent community: Conferences, local groups, Stack Overflow...
Ubiquitous: Present on all major platforms
Easy to integrate: Implements main protocols and formats
Easy to extend: C extensions for low-level operations
Python performance
Is Python fast for data science?
Short answer: No
Long answer: Yes
numpy
Cython
C extensions
Numba
etc.
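A minimal sketch of the "short answer / long answer" point, assuming a toy computation (mean of squares) that is not from the talk. The function names are hypothetical, and timings will vary by machine, so none are quoted here.

```python
import timeit

import numpy


def mean_square_python(values):
    """Mean of squares using a plain Python loop."""
    total = 0.0
    for value in values:
        total += value * value
    return total / len(values)


def mean_square_numpy(values):
    """Same computation, delegated to numpy's compiled routines."""
    return float(numpy.mean(values * values))


data = numpy.random.default_rng(0).normal(size=100_000)
data_list = data.tolist()

result_python = mean_square_python(data_list)
result_numpy = mean_square_numpy(data)

# Time five repetitions of each version
time_python = timeit.timeit(lambda: mean_square_python(data_list), number=5)
time_numpy = timeit.timeit(lambda: mean_square_numpy(data), number=5)
print(f"pure Python: {time_python:.4f}s, numpy: {time_numpy:.4f}s")
```

Both versions return the same value up to floating-point rounding; only the place where the loop runs differs.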
Python is great for data science
A whole ecosystem exists:
numpy
scipy
pandas
statsmodels
scikit-learn
etc.
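As a rough illustration of how these libraries fit together, the sketch below generates synthetic data with numpy and summarizes it with pandas. The column names and values are made up for the example.

```python
import numpy
import pandas

rng = numpy.random.default_rng(42)

# Synthetic data: 300 measurements spread over three groups
frame = pandas.DataFrame({
    "group": rng.choice(["a", "b", "c"], size=300),
    "value": rng.normal(loc=10.0, scale=2.0, size=300),
})

# Per-group summary statistics
summary = frame.groupby("group")["value"].agg(["count", "mean", "std"])
print(summary)
```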
Python environment
One ring to rule them all:
Python platform
Jupyter notebook
Python for visualization
Main libraries:
Matplotlib
Seaborn
Bokeh
HoloViews
Datashader
Domain-specific
Folium: maps
yt: volumetric data
Visualization tools
Matplotlib
First Python visualization tool
Still a de facto standard
Replicates the MATLAB plotting API
Supports many backends
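A small sketch of the backend point, assuming the non-interactive Agg backend and an in-memory PNG as the output target. Neither choice comes from the talk; any other backend or file format would work the same way.

```python
import io

import matplotlib

matplotlib.use("Agg")  # select a non-interactive backend before creating figures

import numpy
from matplotlib import pyplot

x = numpy.linspace(0.0, 10.0, 200)

# Object-oriented interface: explicit figure and axes objects
figure, axes = pyplot.subplots(figsize=(6, 4))
axes.plot(x, numpy.sin(x), label="sin(x)")
axes.set_xlabel("x")
axes.set_ylabel("sin(x)")
axes.legend()

# Render to PNG bytes in memory instead of opening a window
buffer = io.BytesIO()
figure.savefig(buffer, format="png")
png_bytes = buffer.getvalue()
pyplot.close(figure)
```

The plotting calls do not change when the backend changes, which is what makes the same script usable in a notebook, a desktop window or a batch job.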
import numpy
from matplotlib import pyplot

# 1001 evenly spaced time points; a linear trend plus Gaussian noise
x = numpy.linspace(0., 100., 1001)
y = x + numpy.random.randn(1001) * 5

pyplot.plot(x, y)
pyplot.xlabel('time (seconds)')
pyplot.ylabel('some noisy signal')
pyplot.title('A simple plot in matplotlib')
pyplot.show()
import numpy
from matplotlib import pyplot

# Two noisy signals over the same time axis
x = numpy.linspace(0., 100., 1001)
y1 = x + numpy.random.randn(1001) * 3
y2 = 45 + x * .4 + numpy.random.randn(1001) * 7

# The label of each line is what the legend displays
pyplot.plot(x, y1, label='Our previous signal')
pyplot.plot(x, y2, color='orange', label='A new signal')
pyplot.xlabel('time (seconds)')
pyplot.ylabel('some noisy signal')
pyplot.title('A simple plot in matplotlib')
pyplot.legend()
pyplot.show()
Seaborn
Matplotlib wrapper
Built-in themes
Higher level plots:
Heatmap
Violin plot
Pair plot
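One of the higher-level plots listed above, the violin plot, can be sketched as follows. The data is synthetic and the group names are invented for the example; the Agg backend is chosen only so the sketch runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: no display available

import numpy
import pandas
import seaborn

rng = numpy.random.default_rng(1)

# Two synthetic groups with different location and spread
samples = pandas.DataFrame({
    "group": numpy.repeat(["control", "treatment"], 200),
    "value": numpy.concatenate([
        rng.normal(0.0, 1.0, 200),
        rng.normal(1.0, 1.5, 200),
    ]),
})

seaborn.set_theme(style="whitegrid")  # one of the built-in themes
axes = seaborn.violinplot(data=samples, x="group", y="value")
axes.set_title("Distribution of a synthetic measurement by group")
```

The call returns a regular matplotlib axes object, so the usual matplotlib methods remain available for further tweaking.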
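from matplotlib import pyplot
import seaborn

# Example data set provided by seaborn (fetched on first use)
flights_flat = seaborn.load_dataset('flights')
# Reshape to a month x year table; recent pandas versions require keyword arguments
flights = flights_flat.pivot(index='month', columns='year', values='passengers')
seaborn.heatmap(flights, annot=True, fmt='d')
pyplot.title('Number of flight passengers (thousands)')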
Bokeh
Client-server architecture: JavaScript front-end
Interactive
Drawing shapes to generate plots
Demo
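The demo itself is not part of the slides. As a stand-in, here is a minimal sketch of the "drawing shapes" idea: a line glyph and a marker glyph added to one figure, then exported as standalone HTML. The data, titles and tool selection are assumptions made for the example.

```python
import numpy
from bokeh.embed import file_html
from bokeh.plotting import figure
from bokeh.resources import CDN

x = numpy.linspace(0.0, 10.0, 50)
y = numpy.sin(x)

# Each call below adds one kind of shape (glyph) to the plot
plot = figure(title="Glyphs in Bokeh", tools="pan,wheel_zoom,reset,hover")
plot.line(x, y, line_width=2, legend_label="sin(x)")
plot.scatter(x, y, size=6, color="orange", legend_label="samples")

# Standalone HTML document; the JavaScript front-end is loaded from a CDN
html = file_html(plot, CDN, "Bokeh sketch")
```

Opening the resulting HTML in a browser gives the interactive pan, zoom and hover tools without a running Python process.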
HoloViews
Bokeh wrapper
Higher level plots
Mainly for Bokeh, but other backends supported
import holoviews as hv
# Bokeh sample data sets; they may need to be downloaded separately
from bokeh.sampledata.us_counties import data as counties
from bokeh.sampledata.unemployment import data as unemployment

hv.extension('bokeh')

# Keep only the counties of Texas
counties = {code: county for code, county in counties.items()
            if county['state'] == 'tx'}
county_xs = [county['lons'] for county in counties.values()]
county_ys = [county['lats'] for county in counties.values()]
county_names = [county['name'] for county in counties.values()]
county_rates = [unemployment[county_id] for county_id in counties]

# One polygon per county, colored by its unemployment rate
county_polys = {name: hv.Polygons((xs, ys), level=rate, vdims=['Unemployment'])
                for name, xs, ys, rate in zip(county_names, county_xs,
                                              county_ys, county_rates)}
choropleth = hv.NdOverlay(county_polys, kdims=['County'])

plot_opts = dict(logz=True, tools=['hover'], xaxis=None, yaxis=None,
                 show_grid=False, show_frame=False, width=500, height=500)
style = dict(line_color='white')
# Call-style options, as supported by HoloViews at the time of the talk;
# later releases favor the .opts() method
choropleth({'Polygons': {'style': style, 'plot': plot_opts}})
Datashader
Rasterizes data into images, typically displayed through Bokeh
Built for big data
Advanced subsampling and binning techniques
Folium
Visualization of maps
Compatible with Google Maps and OpenStreetMap
Visualization of markers, paths and polygons
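import folium

# Base map centered near Mt. Hood; newer folium releases may need an
# explicit tile URL and attribution for the Stamen tiles
m = folium.Map(location=[45.372, -121.6972],
               zoom_start=12,
               tiles='Stamen Terrain')

# Three markers with different icons and colors
folium.Marker(location=[45.3288, -121.6625],
              popup='Mt. Hood Meadows',
              icon=folium.Icon(icon='cloud')).add_to(m)
folium.Marker(location=[45.3311, -121.7113],
              popup='Timberline Lodge',
              icon=folium.Icon(color='green')).add_to(m)
folium.Marker(location=[45.3300, -121.6823],
              popup='Some Other Location',
              icon=folium.Icon(color='red', icon='info-sign')).add_to(m)

# In a Jupyter notebook, evaluating the map displays it
m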
yt
Visualization of volumetric data
Compatible with many formats
Projects multidimensional data to a 2-D plane
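The idea of projecting a volume onto a plane can be sketched with numpy alone. This is not yt's API, only an illustration of the operation on a synthetic 3-D array with a made-up grid size.

```python
import numpy

rng = numpy.random.default_rng(0)

# Synthetic volumetric field sampled on a regular 32 x 32 x 32 grid
density = rng.random((32, 32, 32))

# Projection along the z axis: accumulate the field over that axis
projection = density.sum(axis=2)

# A slice, by contrast, keeps a single plane of the volume
middle_slice = density[:, :, 16]

print(projection.shape, middle_slice.shape)
```

Both results are 2-D arrays that any of the plotting libraries above can display as an image.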
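import yt

# Load a sample data set and create a scene for volume rendering
ds = yt.load('MOOSE_sample_data/out.e-s010')
sc = yt.create_scene(ds)

# Color map of the rendered source
ms = sc.get_source()
ms.cmap = 'Eos A'

# Position the camera and set its orientation
cam = sc.camera
cam.focus = ds.arr([0.0, 0.0, 0.0], 'code_length')
cam_pos = ds.arr([-3.0, 3.0, -3.0], 'code_length')
north_vector = ds.arr([0.0, -1.0, -1.0], 'dimensionless')
cam.set_position(cam_pos, north_vector)
cam.resolution = (800, 800)

# Render the scene and write the image to disk
sc.save()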
Conclusions
Python is great as a programming language
And is great for data science
Plenty of options for visualization:
Standard plots
Ad-hoc plots
Interactive
3D plots
Maps
Big data
Specialized
Questions?
@datapythonista