Visual Analytics in omics - why, what, how?
Prof Jan Aerts

STADIUS - ESAT, Faculty of Engineering, University of Leuven, Belgium

Data Visualization Lab

jan.aerts@esat.kuleuven.be

jan@datavislab.org
creativecommons.org/licenses/by-nc/3.0/
• What problem are we trying to solve?


• What is Visual Analytics and how can it help?


• How do we actually do this?


• Some examples


• Challenges

A. What’s the problem?

hypothesis-driven -> data-driven
Scientific Research Paradigms (Jim Gray, Microsoft)

• 1st (1,000s of years ago): empirical

• 2nd (100s of years ago): theoretical

• 3rd (last few decades): computational

• 4th (today): data exploration


I have a hypothesis -> need to generate data to (dis)prove it.

I have data -> need to find hypotheses that I can test.

What does this mean?
• immense re-use of existing datasets

• biologically interesting signals may be too poorly understood to be analyzed
in an automated fashion

• much of initial analysis is exploratory in nature => what’s my hypothesis?

=> searching for unknown unknowns

• automated algorithms often act as black boxes => biologists must have blind
faith in the bioinformatician (and the bioinformatician in their own skills)

For domain expert: what’s my hypothesis?

Martin Krzywinski
For developer and domain expert:

opening the black box
input
filter 1
filter 2
filter 3
output A

output B

output C
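
One way to read the "opening the black box" diagram in code: a minimal Python sketch, assuming each filter is a simple predicate, that keeps what survives every stage so developer and domain expert can inspect intermediate results rather than only the final output. The record fields, filter names and thresholds are hypothetical and not taken from the talk.

```python
from typing import Callable, Iterable

Record = dict  # hypothetical record type, e.g. one variant call


def run_pipeline(records: Iterable[Record],
                 filters: list[tuple[str, Callable[[Record], bool]]]):
    """Apply filters in order and keep every intermediate result."""
    stages = [("input", list(records))]
    for name, keep in filters:
        previous = stages[-1][1]
        stages.append((name, [r for r in previous if keep(r)]))
    return stages


# Hypothetical data and thresholds, for illustration only.
records = [
    {"id": "v1", "quality": 50, "depth": 30},
    {"id": "v2", "quality": 10, "depth": 40},
    {"id": "v3", "quality": 60, "depth": 5},
    {"id": "v4", "quality": 70, "depth": 25},
]
filters = [
    ("filter 1: quality >= 20", lambda r: r["quality"] >= 20),
    ("filter 2: depth >= 10", lambda r: r["depth"] >= 10),
]

stages = run_pipeline(records, filters)
for name, kept in stages:
    print(f"{name}: {len(kept)} records -> {[r['id'] for r in kept]}")
```

Because every stage is retained, the per-stage counts and contents can be plotted or compared directly, which is the kind of view that lets a user see where records are lost.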
B. What is Visual Analytics and how can it help?

Our research interest:

visual design + interaction design + backend

What is visualization?

visualization of simulations

in situ visualization

of real-world structures

What is visualization?

T. Munzner

What is visualization?

cognition <=> perception
cognitive task => perceptual task

T. Munzner

Why do we visualize data?
• record information

• blueprints, photographs, seismographs, ...

• analyze data to support reasoning

• develop & assess hypotheses

• discover errors in data

• expand memory

• find patterns (see Snow’s cholera map)

• communicate information

• share & persuade

• collaborate & revise
Sedlmair et al. IEEE Transactions on Visualization and Computer Graphics. 2012
The strength of visualization
pictorial superiority effect
recall of “information” after 72 hr:
• presented with a picture: 65% (“informa”)
• presented without a picture: 10% (“i”)
Stevens’ psychophysical law
= proposed relationship between the magnitude of a physical stimulus and its
perceived intensity or strength
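
Stevens’ law is usually written as a power function, perceived = k * stimulus ** a, where the exponent a depends on the type of stimulus. The sketch below uses commonly cited approximate exponents (assumptions here, not values from the slides) to show why a doubling in length is read more accurately than a doubling in area or brightness.

```python
def perceived(intensity: float, exponent: float, k: float = 1.0) -> float:
    """Stevens' power law: perceived magnitude = k * intensity ** exponent."""
    return k * intensity ** exponent


# Approximate, commonly cited exponents; treat them as illustrative assumptions.
EXPONENTS = {"length": 1.0, "area": 0.7, "brightness (point source)": 0.5}

ratios = {
    channel: perceived(2.0, a) / perceived(1.0, a)
    for channel, a in EXPONENTS.items()
}
for channel, ratio in ratios.items():
    print(f"doubling the stimulus ({channel}) is perceived as x{ratio:.2f}")
```

An exponent below 1 means differences are perceptually compressed, so channels such as area underestimate large values relative to length.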

Accuracy of quantitative perceptual tasks
how much (quantitative)

what/where (qualitative)

“power of the plane”

Mackinlay
Pre-attentive vision
= ability of low-level human visual system to rapidly identify certain basic visual
properties

• some features “pop out”

• used for:

• target detection

• boundary detection

• counting/estimation

• ...

• visual system takes over => all cognitive power available for interpreting the
figure, rather than needing part of it for processing the figure
Limitations of preattentive vision
1. Combining pre-attentive features does not always work => would need to
resort to “serial search” (most channel pairs; all channel triplets)

e.g. “is there a red square in this picture?”

2. Speed depends on which channel is used (choose one that works well for
categorical data)

Gestalt laws - interplay between parts and the
whole

• simplicity


• familiarity


• proximity


• symmetry

• similarity

• connectedness

• good continuation

• common fate


Bret Victor - Ladder of abstraction

For domain expert: what’s my hypothesis?

Martin Krzywinski
For developer and domain expert:

opening the black box
input
filter 1
filter 2
filter 3
output A

output B

output C
C. How do we actually do this?

Talking to domain experts

Data visualization framework

Card sorting

Tools of the trade

Processing - http://processing.org
• Java

D3 - http://d3js.org/
• JavaScript

Vega - https://github.com/trifacta/vega/wiki
• HTML + JSON

D. Examples

Data exploration
Data filtering
User-guided analysis

Data exploration

HiTSee
Bertini E et al. IEEE Symposium on Biological Data Visualization (2011)
Aracari
Bartlett C et al. BMC Bioinformatics (2012)

Ryo Sakai
Reveal
Jäger, G et al. Bioinformatics (2012)
Meander
Pavlopoulos et al. Nucl Acids Res (2013)

Georgios Pavlopoulos

ParCoord

Endeavour gene prioritization

Boogaerts T et al. IEEE International Conference on
Bioinformatics & Bioengineering (2012)

Thomas Boogaerts
Sequence logo
Seagull
subgroup

similarity

difference
Data filtering (visual parameter setting)

TrioVis
Sakai R et al. Bioinformatics (2013)

Ryo Sakai

User-guided analysis
clustering

regions of interest

Spark
Nielsen et al. Genome Research (2012)

data samples
chromatin modification

DNA methylation
RNA-Seq

BaobabView
decision trees

van den Elzen S & van Wijk J. IEEE Conference on
Visual Analytics Science and Technology (2011)
E. Challenges

Many challenges remain
• scalability (data processing + perception), uncertainty, “interestingness”,
interaction, evaluation

• infrastructure & architecture

• fast imprecise answers with progressive refinement

• incremental re-computation

• steering computation towards data regions of interest
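
“Fast imprecise answers with progressive refinement” can be sketched as a generator that yields a running estimate after every chunk of data, so a view can be drawn early and updated as more data arrives. This is a minimal illustration with a mean as the statistic; the function name and chunk size are assumptions, not part of the talk.

```python
from typing import Iterable, Iterator


def progressive_mean(values: Iterable[float],
                     chunk_size: int = 1000) -> Iterator[tuple[int, float]]:
    """Yield (n_seen, running_mean) after each chunk, updating incrementally."""
    count, total = 0, 0.0
    for value in values:
        count += 1
        total += value  # incremental update: earlier values are never revisited
        if count % chunk_size == 0:
            yield count, total / count
    if count and count % chunk_size != 0:
        yield count, total / count  # final estimate over all values


estimates = list(progressive_mean(range(1, 11), chunk_size=4))
for n_seen, estimate in estimates:
    print(f"after {n_seen} values: mean ~ {estimate:.2f}")
```

The running totals also illustrate incremental re-computation: each new chunk updates the previous answer instead of starting over.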

Computational scalability
• speed
• preprocessing big data: MapReduce = batch

• interactivity: max 0.3 sec lag!

• size
• multiple data resolutions => data size increase

• not all resolutions necessary for all data regions: steer computation to
regions of interest
• Options:


• distribute visualization calculations over cluster


• Scala/Spark or another “real-time” MapReduce paradigm


• functional programming paradigm?


• lazy evaluation and smart preprocessing: only calculate what’s needed


=> generic framework
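
The “lazy evaluation” option can be illustrated with a cached, on-demand summary: a sketch, assuming a one-dimensional signal that is binned only for the region and resolution the user is actually looking at. The signal, function name and bin sizes are hypothetical.

```python
from functools import lru_cache

# Hypothetical signal, e.g. read depth along a chromosome.
SIGNAL = tuple((i * 7) % 13 for i in range(10_000))
computed = []  # records which summaries were actually calculated


@lru_cache(maxsize=None)
def binned_mean(start: int, end: int, bin_size: int) -> tuple[float, ...]:
    """Summarise SIGNAL[start:end] at one resolution; computed once, then cached."""
    computed.append((start, end, bin_size))
    region = SIGNAL[start:end]
    return tuple(
        sum(region[i:i + bin_size]) / len(region[i:i + bin_size])
        for i in range(0, len(region), bin_size)
    )


overview = binned_mean(0, 10_000, 1_000)  # coarse overview of everything
detail = binned_mean(2_000, 3_000, 10)    # fine detail only where the user zooms
again = binned_mean(2_000, 3_000, 10)     # served from the cache, not recomputed
print(len(overview), len(detail), len(computed))
```

Only two summaries are ever calculated here, although three were requested; fine-grained bins for the rest of the signal are never computed unless the user navigates there.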
Perceptual scalability
• “overview first, then zoom and filter, details on demand”: breaks down with
very big datasets

• “analyze first, show results, then zoom and filter, details on demand” => need
to identify regions of interest and “interestingness features”

• identify higher-level structure in data (e.g. clustering, dimensionality
reduction) -> use these to guide user
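
A minimal sketch of “analyze first, show results”: score fixed-size windows of a signal with an “interestingness” feature and hand only the top-ranked regions to the visualization. Variance is used here purely as a stand-in feature; the window size, data and function name are assumptions.

```python
from statistics import pvariance


def rank_regions(signal: list[float], window: int,
                 top_n: int) -> list[tuple[int, int, float]]:
    """Return the top_n (start, end, score) windows, highest variance first."""
    scored = []
    for start in range(0, len(signal) - window + 1, window):
        chunk = signal[start:start + window]
        scored.append((start, start + window, pvariance(chunk)))
    return sorted(scored, key=lambda region: region[2], reverse=True)[:top_n]


# Hypothetical signal: flat everywhere except one noisy stretch.
signal = [1.0] * 20 + [1.0, 9.0, 1.0, 9.0, 1.0] + [1.0] * 15
regions = rank_regions(signal, window=5, top_n=2)
for start, end, score in regions:
    print(f"region {start}-{end}: variance {score:.2f}")
```

In practice the score could come from clustering or dimensionality reduction instead; the point is that the ranking, not the raw data, decides where the user is guided first.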
Thank you
• Georgios Pavlopoulos

• Ryo Sakai

• Thomas Boogaerts

• Toni Verbeiren

• Data Visualization Lab (datavislab.org)

• Erik Duval

• Andrew Vande Moere