20151223application of deep learning in basic bio

Applications of Deep Learning in Basic Biology Research
Charlene Hsuan-Lin Her
12/28/2015

Outline
• Motivation
• What is Deep Learning: A brief review of the history of machine
learning and AI
• Example: The human splicing code reveals new insights into the
genetic determinants of disease
• Conclusion

The goal of AI: build a machine to understand
the world around us.

Attempts
Attempt 1:
Attempt 2:
getting features: wheels, handle

Inputs → feature representation → learning algorithm
• Domain knowledge
• Very task specific
• Difficulty

Our brain UNDERSTANDS the world better than ANY algorithm
• The neural-rewiring experiment: the one-algorithm hypothesis
- Difficult to train (computationally expensive, needs MASSIVE
labelled data)

Deep Learning
• Semi-supervised learning
• What we already know about the brain
• Sparse distributed representation
• Unsupervised feature learning

Reference
• Andrew Ng: Deep Learning, Self-Taught Learning and Unsupervised
Feature Learning (UCLA graduate summer school)
• Geoffrey Hinton: The Next Generation of Neural Networks (Google
Tech Talks)
• Yoshua Bengio: Deep Learning (Machine Learning summer school
2014)
• Deep Learning: The Theoretician's Nightmare or Paradise? (LeCun,
NYU, August 2012)

Example: The human splicing code reveals new insights into
the genetic determinants of disease
(Xiong et al., 2015)

Question
Genetic variants → disease
Intronic, exonic
Synonymous mutation
Directly alter protein sequence
? Splicing

Previous approach (Barash et al., 2010):
regulatory model

Study design
• Train the model: mined 10,689 exons that displayed evidence of
alternative splicing and extracted 1,393 sequence features from each
exon and its neighboring introns and exons
• Model must not contradict current molecular biology knowledge
- RBP binding ability
- RBP expression
- Context-dependent effect of splicing codes
- RBP knockdown data
- 4 individual blood samples
• Linking it to disease (SNVs): autism, spinal muscular atrophy,
nonpolyposis colorectal cancer

Model
Linear model: R^2 = 0.66
High vs. low (33%): AUC = 95.5%
High vs. low (10%): AUC = 99.1%
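The slide reports the metrics without saying how they were computed. As a rough sketch only, the standard definitions of R^2 (coefficient of determination) and AUC (chance that a "high" exon gets a larger predicted score than a "low" exon) look like this. The function names and the toy numbers are made up for illustration and are not from the paper.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def auc(scores_high, scores_low):
    """Fraction of (high, low) pairs ranked correctly; ties count as half."""
    wins = 0.0
    for h in scores_high:
        for l in scores_low:
            if h > l:
                wins += 1.0
            elif h == l:
                wins += 0.5
    return wins / (len(scores_high) * len(scores_low))


# Toy values (hypothetical, for illustration only).
toy_r2 = r_squared([0.1, 0.5, 0.9], [0.2, 0.5, 0.8])
toy_auc = auc([0.9, 0.8, 0.4], [0.1, 0.5, 0.2])
```

An AUC near 1 means the predicted scores separate high-inclusion from low-inclusion exons almost perfectly; 0.5 would be chance level.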

RNA-binding protein (RBP) vs. residual splicing activity
• Residual splicing activity = observed Ψ - predicted Ψ
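The residual defined above can be sketched in a few lines. This is a minimal illustration, assuming Ψ values are given per exon as fractions between 0 and 1; the function name and the sample numbers are hypothetical.

```python
def residual_splicing_activity(observed_psi, predicted_psi):
    """Return observed Ψ minus predicted Ψ for each exon."""
    if len(observed_psi) != len(predicted_psi):
        raise ValueError("observed and predicted must have the same length")
    return [obs - pred for obs, pred in zip(observed_psi, predicted_psi)]


# Hypothetical Ψ values for three exons.
observed = [0.80, 0.35, 0.10]
predicted = [0.70, 0.40, 0.10]
residuals = residual_splicing_activity(observed, predicted)
```

A positive residual means the exon is included more often than the model predicts, a negative one less often.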

Trans-acting factors (RBPs)
MBNL RBP knockdown
• Altered splicing (ΔΨ >> 0)
• Exons that are not affected (ΔΨ ~ 0)
MBNL feature model: predicted ΔΨ

SNV vs. ΔΨ
• Studied the effects of SNVs using the largest ΔΨ value across all
tissues
SNV
• SNP: common
• MAF: rare and linked to disease
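Scoring an SNV by its largest value across tissues might look like the sketch below. It assumes "largest" means largest in magnitude with the sign kept, which the slide does not state; the function name, tissue names, and numbers are hypothetical.

```python
def largest_delta_psi(delta_psi_by_tissue):
    """Return the ΔΨ with the largest absolute value across tissues."""
    if not delta_psi_by_tissue:
        raise ValueError("need at least one tissue")
    return max(delta_psi_by_tissue.values(), key=abs)


# Hypothetical predicted ΔΨ for one SNV in three tissues.
snv_delta_psi = {"brain": -0.02, "heart": 0.15, "liver": -0.30}
snv_score = largest_delta_psi(snv_delta_psi)  # -0.30
```

Keeping the sign preserves whether the variant is predicted to increase or decrease exon inclusion.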

SNV vs. ΔΨ: do disease SNVs disrupt splicing
more frequently than common SNVs?
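One simple way to frame this question is to compare the fraction of SNVs in each group whose predicted |ΔΨ| exceeds a cutoff. The sketch below is only illustrative: the 0.05 cutoff is an assumed value, and the ΔΨ lists are toy numbers, not data from the study.

```python
def fraction_disrupting(delta_psi_values, threshold=0.05):
    """Fraction of SNVs with |ΔΨ| at or above the (assumed) threshold."""
    if not delta_psi_values:
        raise ValueError("need at least one SNV")
    hits = sum(1 for d in delta_psi_values if abs(d) >= threshold)
    return hits / len(delta_psi_values)


# Toy predicted ΔΨ values (hypothetical).
disease_snvs = [0.20, -0.12, 0.01, 0.08]
common_snvs = [0.01, -0.02, 0.06, 0.00]

disease_frac = fraction_disrupting(disease_snvs)  # 0.75
common_frac = fraction_disrupting(common_snvs)    # 0.25
enrichment = disease_frac / common_frac           # 3.0
```

An enrichment above 1 would suggest disease SNVs are predicted to disrupt splicing more often than common SNVs; a real analysis would also need a significance test.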

SNV vs. ΔΨ

Spinal muscular atrophy (autosomal
recessive)

Spinal muscular atrophy (autosomal
recessive)

Nonpolyposis colorectal cancer (oligogenic)

Autism spectrum disease (multigenic)

Autism spectrum disease (multigenic)

Conclusion
• We built a model to address how genetic variation affects splicing
• The disease variants have regulatory scores significantly different
from those of the rare and common variants, but the distribution of
regulatory scores is indistinguishable for rare and common variants
• Potential sources of prediction error include unaccounted-for RNA
features, inaccuracies in computed features, imperfect modeling of
splicing levels, and limitations due to a focus on cassette splicing.
• It will be important to seek regulatory models that encompass other
major steps in gene regulation

Reference
• 1. Barash, Y. et al. Deciphering the splicing code. Nature 465, 53-59
(2010).
• 2. Xiong, H.Y. et al. RNA splicing. The human splicing code reveals
new insights into the genetic determinants of disease. Science 347,
1254806 (2015).

• Neural nets are supposed to do what humans are good at...
• HOW will these models help biologists understand the world better?
• Challenges
• Validation
• Insufficient information
