Data Visualisation for
Beginners
Sara-Jayne Terp
2nd October 2014
Introduction
(XKCD.com: data science humor)
Talking About
Source: wordle.net
Not Talking About
• Algorithm details
• Machine learning
• Big data techniques
• Koalas
Cogs image from Worradmu on freedigitalphotos.net; Koala from koalastothemax.com
Data Visualisation
Images: Hans Rosling Ted Talk; erm.. something about India and women
Data Science
The Process
Storytelling
Ask Good Questions
“Data science is all about asking questions. You
engage in it whenever you interactively and
iteratively search for deep, hidden patterns.” –
James Kobielus
• Do people have more phones than toilets?
• How is Ebola spreading?
• Is using wood fires sustainable here?
• Can we feed 9 billion people?
(Simple, Actionable, Incremental)
Get the data
An Aside: Big Data
Volume
Velocity
Variety
CSV, json, xml,
excel, pdf,
text,
webpages, rss,
scanned pages,
images, videos,
audiofiles,
maps,
proprietary
formats etc.
Veracity
The “3 other Vs” are Viability, Validation, Verification.
Validation, checking that inputs are real, is a big deal for development data.
Getting more data
Explore your Data
• Spend time with your dataset:
  • Understand where it came from - can you live with
    the assumptions the data collectors made?
  • Look at it
  • Plot it
  • Where are there holes? Inconsistencies? Anomalies?
• Clean your data, find better datasets, get more data
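A minimal sketch of "look at it" and "where are there holes?", using only the Python standard library. The sample rows, the column names and the `profile` helper are invented for illustration; they are not from the talk.

```python
import csv
import io
from collections import Counter

# Hypothetical sample data, invented for illustration only.
SAMPLE_CSV = """country,year,phones_per_100
Kenya,2012,71
Kenya,2013,
kenya,2013,72
Ghana,2012,101
Ghana,2013,n/a
"""


def profile(rows):
    """Count empty cells per column and collect the distinct values seen."""
    missing = Counter()
    distinct = {}
    for row in rows:
        for column, value in row.items():
            value = (value or "").strip()
            if value == "":
                missing[column] += 1  # a hole in the data
            else:
                distinct.setdefault(column, set()).add(value)
    return missing, distinct


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
missing, distinct = profile(rows)

print("rows:", len(rows))
print("missing per column:", dict(missing))
for column, values in distinct.items():
    # Reading the distinct values by eye surfaces inconsistencies,
    # e.g. "Kenya" vs "kenya", or "n/a" in a numeric column.
    print(column, "->", sorted(values))
```

Listing distinct values is a cheap first pass; it tends to show spelling variants and placeholder strings before any plotting is done.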
Data is Often Inconvenient
(plus often sitting on someone’s laptop)
Data is (almost) Always Dirty
DR Congo in data.UN.org: “Congo, Democratic Republic of
the”, “Congo Democratic”, “Democratic Republic of the
Congo”, “Congo (Democratic Republic of the)”, “Congo, Dem.
Rep.”, “Congo Dem. Rep.”, “Congo, Democratic Republic of”,
“Dem. Rep. of Congo”, “Dem. Rep. of the Congo”
DR Congo in common standards: “Democratic Republic of the
Congo” (UN Stats), “Congo, The Democratic Republic of
the” (ISO3166), “Congo, Democratic Republic of the” (FIPS10,
Stanag), “180” (UN Stats), “COD” (ISO3166, Stanag),
“CG” (FIPS10)
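One possible way to clean this up, sketched in Python: reduce each spelling to a key of its significant words, then look the key up in a small hand-built table that maps to the ISO3166 code "COD" quoted above. The `name_key`, `to_iso3` and `KNOWN` names are hypothetical, and the table covers only this one country.

```python
import re

# Spellings quoted above for DR Congo in data.UN.org.
VARIANTS = [
    "Congo, Democratic Republic of the",
    "Congo Democratic",
    "Democratic Republic of the Congo",
    "Congo (Democratic Republic of the)",
    "Congo, Dem. Rep.",
    "Congo Dem. Rep.",
    "Congo, Democratic Republic of",
    "Dem. Rep. of Congo",
    "Dem. Rep. of the Congo",
]


def name_key(name):
    """Reduce a country name to a sorted tuple of its significant words."""
    text = name.lower()
    # Expand the abbreviations that appear in the variants.
    text = re.sub(r"\bdem\b\.?", "democratic", text)
    text = re.sub(r"\brep\b\.?", "republic", text)
    words = re.findall(r"[a-z]+", text)
    # Drop filler words and ignore word order.
    return tuple(sorted(w for w in words if w not in {"of", "the"}))


# Hand-built lookup (an assumption of this sketch): key -> ISO3166 alpha-3.
KNOWN = {
    ("congo", "democratic", "republic"): "COD",
    ("congo", "democratic"): "COD",  # "Congo Democratic" omits "Republic"
}


def to_iso3(name):
    """Return the ISO3166 alpha-3 code for a known spelling, else None."""
    return KNOWN.get(name_key(name))


for variant in VARIANTS:
    print(variant, "->", to_iso3(variant))
```

Unknown names return `None` rather than a guess, so "Republic of the Congo" (a different country) is not silently merged with DR Congo.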
Use multiple datasets
Process your data
Everything is a dataset (if you look hard enough)
Process your data
(The relationships between things are interesting
- these are my Facebook ‘friends’, on Gephi)
Process your data
(machine learning can be useful too,
e.g. if you’re working in a language with no stopword lists)
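As a rough illustration of the stopword case, the sketch below proposes stopword candidates from the text itself: words that appear in most documents carry little topical signal. This is a simple document-frequency heuristic, not a trained model and not necessarily what was used in the talk; the toy corpus and the `min_share` threshold are assumptions.

```python
import re
from collections import Counter


def candidate_stopwords(documents, min_share=0.8):
    """Return words that occur in at least `min_share` of the documents."""
    doc_counts = Counter()
    for doc in documents:
        # set() so that each word is counted once per document
        doc_counts.update(set(re.findall(r"\w+", doc.lower())))
    total = len(documents)
    return sorted(w for w, n in doc_counts.items() if n / total >= min_share)


# Hypothetical toy corpus, invented for illustration only.
docs = [
    "the well in the village is dry",
    "the clinic is open and the road is clear",
    "rain closed the road to the market",
    "the market is busy",
    "water is scarce in the north",
]

print(candidate_stopwords(docs))
```

On a real corpus the candidates still need a human check, since a frequent word can also be the topic of the collection.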
Model your data
• You’re persuading people with ‘truths’: do your best
  to make sure they’re truthful
• Always cross-check
• Statistics is your friend
Explain your results
• You’re trying to persuade people to change:
  • Their opinions
  • Their actions
• Visuals are (often) more persuasive:
  • “I already knew that increased incarceration didn’t
    lower crime, but I wasn’t sure of the statistics. To
    see it on the graphs is really eye opening.” *
*: Pandey et al, The Persuasive Power of Data Visualisation
Tools
selection.datavisualization.ch
Tools: Excel
http://peltiertech.com/clustered-stacked-column-bar-charts/
Tools: QGIS
Tools: Google Fusion Tables
(link to google spreadsheets)
Tools: Tableau
http://www.tableausoftware.com/public/gallery/major-league-baseball
Tools: Tableau
(this is a choropleth)
Tools: Python, R
(AFINN sentiment analysis: sometimes you have to code)
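AFINN is a word list that gives each word an integer valence; a text is scored by summing the values of the words it contains. A minimal sketch of that idea follows. The tiny lexicon is a made-up stand-in in the AFINN style, not entries copied from the real list, and `sentiment` is a hypothetical helper name.

```python
import re

# Hypothetical toy lexicon in the AFINN style (word -> integer valence).
# These entries are illustrative; load the real AFINN file for actual use.
TOY_LEXICON = {"good": 3, "great": 3, "bad": -3, "terrible": -3, "delay": -1}


def sentiment(text, lexicon=TOY_LEXICON):
    """Sum the valence of every known word; unknown words score 0."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(w, 0) for w in words)


for message in ["Great clinic, good staff", "Terrible delay", "The road is open"]:
    print(message, "->", sentiment(message))
```

Word-list scoring ignores negation and context ("not good" still scores positive), which is one reason to look at the raw messages as well as the totals.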
Tools: D3, Javascript
D3 gallery: https://github.com/mbostock/d3/wiki/Gallery
D3: Interactive Play
auremoser.github.io/VitalSigns-water
What's that visualisation?
http://www.visual-literacy.org/periodic_table/periodic_table.html
What’s a dashboard?
(India’s government employee attendance system)
Psychology checklist
• What’s your important message (and what are you trying to
  hide?)
• Medium matters (laptop, phone, sms?)
• Colours matter
• As do angles and relative lengths.
• And think about your audience, e.g. what local effects do you
  need to be aware of, how do you compensate for
  colorblindness etc etc.
• New visualization type? Check the Gestalt principles
http://www.creativebloq.com/how-design-better-data-visualisations-8134175.
Please don’t do this

Tools checklist
• Who’s your audience?
• What’s the medium: paper, static webpage, tablet, phone?
• Which languages do you need to display?
• And are they right-to-left?
• Is this a one-off visualisation or will you need to update it as new
  data comes in?
• Are your audience viewing this online or offline?
• What resources do you have for updating the visualisation?
Where to go from here
• Websites, e.g. Information is Beautiful, Data Science
  Central, FlowingData, ILoveCharts, Chart Porn, Junk
  Charts, visual.ly blog, fivethirtyeight.com
• Meetups and events, e.g. DataKind, NYC Data
  Skeptics
• Books, e.g. Nathan Yao “Visualize This”, anything by
  Tufte
• Spring course on data science
Ask good questions;
Tell good stories
