Jeffrey Heer @jeffrey_heer
Univ. of Washington + Trifacta
Visualization for
DISCOVERY
Effectiveness of Penicillin, Neomycin & Streptomycin vs. Bacteria Species
Which antibiotic should one use?
Show data variation,
not design variation.
- Edward Tufte
[Figure: scatter plot of antibiotic effectiveness by bacteria species (Wainer & Lysen, Am. Sci. 2009).
x-axis: Log10(1 / Neomycin), 0.001 to 1,000. y-axis: Log10(1 / Penicillin), 0.001 to 1,000.
Color encodes genus: Streptococcus, Staphylococcus, Salmonella, Other.
Labeled species: Aerobacter aerogenes, Brucella abortus, Bacillus anthracis, Diplococcus pneumoniae, Escherichia coli, Klebsiella pneumoniae, Mycobacterium tuberculosis, Proteus vulgaris, Pseudomonas aeruginosa, Salmonella typhosa, Salmonella schottmuelleri, Staphylococcus albus, Staphylococcus aureus, Streptococcus fecalis, Streptococcus hemolyticus, Streptococcus viridans.]
What does antibiotic
response reveal about the
biology of bacteria?
[Figure: the same scatter plot of Log10(1 / Neomycin) vs. Log10(1 / Penicillin), colored by genus (Wainer & Lysen, Am. Sci. 2009), with two points marked "?" and then annotated:]
Not a streptococcus!
Actually a streptococcus!
How might our tools
spur new questions and
prompt skepticism?
[Figure sequence: successive redesigns of the same scatter plot.]
1. Linear axes: Neomycin (0 to 40) vs. Penicillin (0 to 800).
2. Log axes: Log10(Neomycin) (0.001 to 100) vs. Log10(Penicillin) (0.001 to 1,000).
3. Inverted log axes: Log10(1 / Neomycin) (0.001 to 100) vs. Log10(1 / Penicillin) (0.001 to 1,000).
4. Matching ranges: both axes span 0.001 to 1,000.
5. Color by genus: Aerobacter, Brucella, Bacillus, Diplococcus, Escherichia, Klebsiella, Mycobacterium, Proteus, Pseudomonas, Salmonella, Staphylococcus, Streptococcus.
6. Color by grouped genus: Streptococcus, Staphylococcus, Salmonella, Other.
7. Species labels added to each point.
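The axis labels in this sequence move from raw values to Log10 and then to Log10(1 / value). A minimal Python sketch of that last transformation follows. It assumes the raw numbers are concentrations where a lower value means a more effective drug; the function name and the sample inputs are hypothetical and are not data from the talk.

```python
import math


def inverse_log_scale(concentration: float) -> float:
    """Return log10(1 / concentration).

    Larger results mean the drug works at a lower dose, so "more
    effective" plots further up or to the right.
    """
    if concentration <= 0:
        raise ValueError("concentration must be positive")
    # log10(1 / c) equals -log10(c); the second form avoids a division.
    return -math.log10(concentration)


# Hypothetical concentrations for illustration only (not from the slides).
example_concentrations = [0.001, 0.1, 10.0, 1000.0]
transformed = [inverse_log_scale(c) for c in example_concentrations]
```

Under this assumption the inversion flips the ordering: the smallest concentration gets the largest transformed value, which is why the redesigned axes read as "higher is better".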
A Combinatorial Design Space
1. Variable Selection
2. Data Transformation
3. Visual Encoding Design
-> Thousands of possible charts!
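A rough Python sketch of why the three steps multiply into thousands of charts. The counting model and every number passed to it are assumptions made for illustration; the talk does not give a formula or these counts.

```python
from math import comb, perm


def count_charts(n_variables: int, n_transforms: int,
                 n_channels: int, n_marks: int, max_vars: int) -> int:
    """Count candidate charts under a simple, hypothetical model.

    For each subset size k up to max_vars:
      1. variable selection:  choose k of the variables
      2. data transformation: one transform per chosen variable
      3. visual encoding:     assign k distinct channels, then pick a mark
    """
    total = 0
    for k in range(1, max_vars + 1):
        variable_sets = comb(n_variables, k)
        transform_choices = n_transforms ** k
        encoding_choices = perm(n_channels, k) * n_marks
        total += variable_sets * transform_choices * encoding_choices
    return total


# Hypothetical sizes: 5 variables, 4 transforms, 5 channels, 4 mark types,
# and at most 2 variables shown per chart.
total_charts = count_charts(5, 4, 5, 4, max_vars=2)
```

With these made-up sizes the count is already 13,200, and it grows quickly as more variables are allowed per chart.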
How might we augment
manual chart construction
with interactive browsing of
recommended visualizations?
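library(ggplot2)  # provides ggplot() and the diamonds data set
# The trailing + keeps R from ending the expression after the first line.
ggplot(diamonds, aes(x=price, fill=cut)) +
  geom_bar(position="dodge")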
1. Trifacta Visual Profiler
2. Data Voyager (UW + Tableau)
Trifacta Visual Profiler
Overview
Details
1. Trifacta Visual Profiler
2. Data Voyager (UW + Tableau)
[Diagram: Voyager system architecture, built up over several slides.]
Data Set -> Voyager (Visualization Browser)
User -> Voyager: User Selection
Voyager -> Compass (Recommendation Engine): User Selection, Data Schema & Statistics
Compass (Recommendation Engine):
1. Select data variables
2. Apply transformations
3. Pick visual encodings
Constrain & rank choices by data type, statistics & perceptual principles.
Compass -> Voyager: Ranked and Clustered Vega-lite Specifications
Voyager -> Vega-lite Compiler: Vega-lite Specifications
Vega-lite Compiler -> Vega Renderer: Vega Specifications
Vega Renderer -> Voyager: Interactive Visualizations
Improves data coverage!
+3x variable sets shown
+1.5x more interacted with
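The "constrain & rank" idea can be sketched in Python as follows. This is not Compass's actual algorithm: the schema, the channel scores, and the `recommend` function are all hypothetical, chosen only to show constraints by data type and a ranking that emits Vega-lite-style specifications.

```python
from itertools import permutations

# Hypothetical schema: field name -> Vega-lite data type.
SCHEMA = {
    "penicillin": "quantitative",
    "neomycin": "quantitative",
    "genus": "nominal",
}

# Made-up scores (higher is better) for how well a channel suits a type.
# A missing channel means the pairing is ruled out as a constraint.
CHANNEL_SCORE = {
    "quantitative": {"x": 3, "y": 3, "size": 2, "color": 1},
    "nominal": {"x": 2, "y": 2, "color": 3, "shape": 2},
}

CHANNELS = ["x", "y", "color", "size", "shape"]


def recommend(fields, schema=SCHEMA, top_k=3):
    """Return the top_k Vega-lite-style specs for the selected fields."""
    candidates = []
    for assignment in permutations(CHANNELS, len(fields)):
        score = 0
        encoding = {}
        for field, channel in zip(fields, assignment):
            field_type = schema[field]
            channel_score = CHANNEL_SCORE[field_type].get(channel)
            if channel_score is None:
                break  # constraint violated: skip this assignment
            score += channel_score
            encoding[channel] = {"field": field, "type": field_type}
        else:
            candidates.append((score, {"mark": "point", "encoding": encoding}))
    # Stable sort: ties keep their enumeration order.
    candidates.sort(key=lambda pair: pair[0], reverse=True)
    return [spec for _, spec in candidates[:top_k]]


recommendations = recommend(["penicillin", "neomycin", "genus"])
```

With these scores, the top recommendation places the two quantitative fields on x and y and the nominal field on color, which mirrors the genus-colored scatter plot earlier in the talk.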
Ongoing Challenges
Refining visualization recommendation
What to optimize? How to evaluate?
Scaling interactive visualizations
Large D harder than large N…
Help avoid statistical pitfalls
Recognize mix effects, convey uncertainty
How might our tools
spur new questions and
prompt skepticism?
vega.github.io
