Clinical Risk Prediction with Temporal Probabilistic Asymmetric Multi-Task Learning

1School of Computing, 2Graduate School of AI,
Korea Advanced Institute of Science and Technology,
3Aitrics, 4Department of Computer Science, University of Oxford
Tuan Nguyen* 1,4, Hyewon Jeong* 1, Eunho Yang 1,2,3, and Sung Ju Hwang 1,2,3

Clinical Risk Prediction with Multi-Task Learning
Hae Beom Lee, Eunho Yang, and Sung Ju Hwang. Deep asymmetric multi-task feature learning. ICML 2018.
Introduction
MTL: clinical setting (MIMIC III-Infection)
Features: Heart Rate (HR), Respiratory Rate (RR), Oxygen saturation (SpO2), Body Temperature (BT), White Blood Cell Count (WBC)
Task 1 : Fever. Vital sign: body temperature elevation (>37.7 C, 99.9 F). Symptoms and signs as a result of infection.
Task 2 : Infection. Diagnostic test: positive for bacteria / fungus / virus. Evidence and proof of infection.
Task 3 : Mortality. One probable result of infection.
[Figure: features feeding Task 1 (Fever), Task 2 (Infection), and Task 3 (Mortality); label: Negative Transfer]

Negative Transfer Problem in Multi-Task Learning
Introduction
[Figure: features (HR, RR, SpO2, BT, WBC) feeding Task 1 (Fever), Task 2 (Infection), and Task 3 (Mortality); labels: Negative Transfer, Unreliable Predictor]

Asymmetric Knowledge Transfer Across Timesteps
Introduction
[Figure: per-timestep features for Fever ($f_1, f_2, f_3, \ldots$), Infection ($i_1, i_2, i_3, \ldots$), and Mortality ($m_1, m_2, m_3, \ldots$) over Step 1, Step 2, ..., Step T; legend: low uncertainty, high uncertainty; label: Deep AMTFL]

Probabilistic Asymmetric Multi-Task Learning (P-AMTL)
Introduction
Uncertainty-Aware Asymmetric Multi-Task Learning

Probabilistic Asymmetric Multi-Task Learning (P-AMTL)
Approach
Failure of Loss-based Asymmetric Multi-Task Learning
[Figure: three charts for Task 0 and Task 1 (2000 and 200 instances). "KT in Loss-based AMTL": knowledge transfer (KT) plotted with loss. "KT in P-AMTL": knowledge transfer (KT) plotted with uncertainty (UC). Accuracy improvement over STL for Loss-based AMTL and TP-AMTL.]
[Figure: Loss-Based AMTL (Lee et al., 2018) and UC-aware AMTL over Step 1, Step 2, ..., Step T, with per-timestep features $f_d^{(1)}, f_d^{(2)}, f_d^{(3)}$ for Task D and $f_j^{(1)}, f_j^{(2)}, f_j^{(3)}$ for Task J; legend: Low UC, High UC]
Multiple Features Across Timesteps
$Z_d$, $Z_j$: high-level latent features
$f_d$, $f_j$: multiple features across timesteps
$Z_d = (f_d^{(1)}, f_d^{(2)}, \ldots, f_d^{(T)})$
$Z_j = (f_j^{(1)}, f_j^{(2)}, \ldots, f_j^{(T)})$

Failure of Loss-based AMTL
Approach
Table 1. Task Performance of MNIST-variation Experiment
(AUROC over 5 runs. MTL model accuracies lower than those of their STL counterparts are colored in red)

TP-AMTL: Uncertainty-Aware Knowledge Transfer
Approach
Knowledge transfer happens from more reliable to less reliable features. It happens between tasks (in order to capture task relatedness) and across timesteps.
Uncertainty-Aware Knowledge Transfer: example case
[Figure: multiple features over $T$ timesteps for Task j ($z_j$) and Task d ($z_d$); features $f_d^{(1)}$, $f_d^{(3)}$, and $f_j^{(1)}$ pass through the transforms $G_d$ and $G_j$ and are summed with the transfer weights $\alpha_{d,j}$ and $\alpha_{j,d}$]
Transform from more reliable to less reliable latent features.
Knowledge transfer from the certain (low UC) task to the uncertain (high UC) task:
$\alpha_{j,d} = F_{j,d}(Z_{j,d}, Z_d, \sigma_{j,d}^2, \sigma_d^2)$
$\alpha_{d,j} = F_{d,j}(Z_{d,j}, Z_j, \sigma_{d,j}^2, \sigma_j^2)$
$C_d^{(t)} = f_d^{(t)} + G_d\Big(\sum_{j=1}^{D} \sum_{i=1}^{t} \alpha_{j,d}^{(i,t)} * G_j\big(f_j^{(i)}\big)\Big) \quad \forall t \in \{1, 2, \ldots, T\}$
* The same also happens for intra-task, inter-timestep knowledge transfer.
$z_d \sim p_\theta(z_d \mid x, \omega)$
$p_\theta(z_d \mid x, \omega) \sim \mathcal{N}(z_d; \mu_d, \mathrm{diag}(\sigma_d^2))$
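The combined-feature equation on this slide can be traced with a small numerical sketch. Several things here are assumptions rather than anything the slides state: the transforms $G_j$ and $G_d$ are replaced by a hypothetical `identity` function, the transfer weights are given by hand instead of being produced by $F_{j,d}$ from the uncertainties, the `*` is read as scalar multiplication, and timesteps are 0-based.

```python
from typing import Callable, Dict, List, Tuple

Vector = List[float]
# Key: (source task j, target task d, source step i, target step t), 0-based steps.
AlphaTable = Dict[Tuple[str, str, int, int], float]


def combine_features(
    f: Dict[str, List[Vector]],
    alpha: AlphaTable,
    G: Dict[str, Callable[[Vector], Vector]],
    d: str,
    t: int,
) -> Vector:
    """Return C_d^(t) = f_d^(t) + G_d(sum_j sum_{i<=t} alpha * G_j(f_j^(i)))."""
    dim = len(f[d][t])
    transferred = [0.0] * dim
    for j in f:                     # all source tasks, including j == d
        for i in range(t + 1):      # source steps up to and including t
            weight = alpha.get((j, d, i, t), 0.0)  # missing entry = no transfer
            g = G[j](f[j][i])
            for k in range(dim):
                transferred[k] += weight * g[k]
    return [own + extra for own, extra in zip(f[d][t], G[d](transferred))]


def identity(v: Vector) -> Vector:
    """Hypothetical stand-in for the learned transforms G_j and G_d."""
    return list(v)


# Two tasks, two timesteps, two-dimensional features (made-up numbers).
f = {"j": [[1.0, 2.0], [3.0, 4.0]], "d": [[0.0, 0.0], [10.0, 10.0]]}
alpha = {("j", "d", 0, 1): 0.5, ("j", "d", 1, 1): 0.25}
c_d = combine_features(f, alpha, {"j": identity, "d": identity}, "d", 1)
print(c_d)  # [11.25, 12.0]
```

Because the loop also visits `j == d`, the same routine covers the intra-task, inter-timestep case once the corresponding weights are supplied.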

Complexity Analysis
Approach
Supplementary Table 1. Time Complexity of the Baseline Models

Tasks and Datasets
Experiments
MIMIC-III Infection
2,000 data points
Tasks: Fever → Infection → Mortality
Features: 12 infection-related features, including heart rate, arterial blood pressure, and Glasgow Coma Scale (GCS)
PhysioNet2012
4,000 distinct hospital (ICU) records
Tasks: Stay < 3 / Cardiac / Recovery → Mortality
Task 1 : Stay < 3 (length of ICU stay)
Task 2 : Cardiac (recovering from cardiac surgery)
Task 3 : Recovery (recovering from general surgery)
Task 4 : Mortality
Features: 31 physiological signs, including heart rate, respiratory rate, and temperature
Information on MIMIC-III Respiratory Failure and Heart Failure can be found in the supplementary file.

Quantitative Results
STL: single-task learning
MTL: multi-task learning
Our model, TP-AMTL, obtains significant improvements over all STL and MTL baselines on both datasets.
Experiments
Table 2. Task Performance of the MIMIC-III Infection and PhysioNet Dataset.
(Average AUROC over 5 runs. MTL model accuracies lower than those of their STL counterparts are colored in red)
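The table reports AUROC averaged over runs. As a reminder of what that number measures, here is a minimal sketch that computes AUROC as the probability that a positive case is scored above a negative one (ties count one half) and then averages it over runs. The function names and the toy labels and scores are illustrative assumptions and do not come from the experiments.

```python
from typing import List, Sequence, Tuple


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise AUROC: fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5   # a tie counts as half a correct ranking
    return wins / (len(pos) * len(neg))


def mean_auroc(runs: List[Tuple[Sequence[int], Sequence[float]]]) -> float:
    """Average AUROC over several (labels, scores) runs."""
    values = [auroc(labels, scores) for labels, scores in runs]
    return sum(values) / len(values)


# Toy data for two runs (made-up numbers).
run_a = ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
run_b = ([0, 1, 0, 1], [0.2, 0.9, 0.3, 0.7])
print(auroc(*run_a))               # 0.75
print(mean_auroc([run_a, run_b]))  # 0.875
```

The pairwise loop is quadratic in the number of cases, which is fine for illustration; a rank-based formulation would be the usual choice for large test sets.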

Qualitative Results: Knowledge Transfer Graph
Experiments
Source features with low uncertainties transfer more knowledge, while at the target, features with high uncertainties receive more knowledge transfer.
Outgoing Transfer from different Sources: normalized amount of knowledge transfer from multiple sources (task $j$ at time $t$) to task $d$ (normalized over the number of targets):
$\dfrac{\alpha_{j,d}^{(t,t)} + \alpha_{j,d}^{(t,t+1)} + \cdots + \alpha_{j,d}^{(t,T)}}{T - t + 1}$ (1)
Incoming Transfer to different Targets: normalized amount of knowledge transfer to multiple targets (task $d$ at time $t$) from task $j$ (normalized over the number of sources):
$\dfrac{\alpha_{j,d}^{(1,t)} + \alpha_{j,d}^{(2,t)} + \cdots + \alpha_{j,d}^{(t,t)}}{t}$ (2)
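The two normalized quantities on this slide, expressions (1) and (2), can be read off a table of transfer weights. The sketch below assumes a single task pair (source task j, target task d) whose weights are stored as a nested list `alpha[i][t]` (weight from source step `i` to target step `t`), with 0-based steps and entries used only where `i <= t`. The function names and the numbers are illustrative assumptions.

```python
from typing import List


def outgoing_transfer(alpha: List[List[float]], t: int) -> float:
    """Expression (1): mean weight sent from source step t to target steps t..T-1."""
    targets = range(t, len(alpha))
    return sum(alpha[t][s] for s in targets) / len(targets)


def incoming_transfer(alpha: List[List[float]], t: int) -> float:
    """Expression (2): mean weight received at target step t from source steps 0..t."""
    sources = range(t + 1)
    return sum(alpha[i][t] for i in sources) / len(sources)


# Made-up weights for T = 3 steps; only the upper triangle (i <= t) is used.
alpha = [
    [0.5, 0.25, 0.75],
    [0.0, 0.5, 1.0],
    [0.0, 0.0, 0.25],
]
out_per_source = [outgoing_transfer(alpha, t) for t in range(3)]
in_per_target = [incoming_transfer(alpha, t) for t in range(3)]
print(out_per_source)  # [0.5, 0.75, 0.25]
print(in_per_target)
```

With 0-based steps, `len(targets)` equals T - t + 1 and `len(sources)` equals t in the slide's 1-based notation.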

Qualitative Results: Medical Interpretation
Interpretation of the Learned Knowledge Graph
By analyzing selected clinical case studies, we identified timesteps at which knowledge was transferred as designed and at which meaningful medical events occurred; these timesteps correlate with interactions between the selected tasks.
MechVent - Mechanical Ventilation, FiO2 - Fractional inspired Oxygen, SBP - Systolic arterial blood pressure,
DBP - Diastolic arterial blood pressure, HR - Heart Rate, Temp - Body Temperature, Urine - Urine output,
GCS - Glasgow Coma Score, WBC - White Blood Cell Count, Culture - Culture Results.
Experiments

Ablation Study
Experiments
Effectiveness of Inter-Task and Inter-Timestep Knowledge Transfer
    AMTL-Intratask
    AMTL-Samestep
TD-AMTL: deterministic variant of TP-AMTL
Effectiveness of Future-to-Past Transfer
    TP-AMTL (constrained): knowledge transfer only happens from the later timestep to earlier ones
Effectiveness of Uncertainty Types
    TP-AMTL (epistemic)
    TP-AMTL (aleatoric)
    $p_\theta(z_d \mid x, \omega) \sim \mathcal{N}(z_d; \mu_d, 0)$
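The zero-variance Gaussian in this ablation can be contrasted with the diagonal Gaussian $\mathcal{N}(z_d; \mu_d, \mathrm{diag}(\sigma_d^2))$ used for the latent features. The sketch below draws a sample as $z = \mu + \sigma \epsilon$ with $\epsilon \sim \mathcal{N}(0, 1)$; this reparameterized form is an assumption, since the slides do not say how sampling is implemented, and `sample_latent` is a hypothetical helper name.

```python
import random
from typing import List


def sample_latent(mu: List[float], sigma: List[float], rng: random.Random) -> List[float]:
    """Draw z ~ N(mu, diag(sigma^2)) as mu + sigma * eps, eps ~ N(0, 1)."""
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]


rng = random.Random(0)
mu = [1.0, -2.0]

# With non-zero sigma, every draw differs and scatters around mu.
noisy = [sample_latent(mu, [0.5, 0.5], rng) for _ in range(2000)]
mean_first = sum(z[0] for z in noisy) / len(noisy)

# With zero variance, as in N(z; mu, 0), the sample collapses to mu itself.
collapsed = sample_latent(mu, [0.0, 0.0], rng)
print(collapsed)  # [1.0, -2.0]
```

Setting the variance to zero removes the feature-level noise entirely, so whatever stochasticity remains in such a variant has to come from elsewhere in the model.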

Conclusions
• We propose a novel probabilistic asymmetric multi-task learning framework that allows asymmetric knowledge transfer between tasks at different timesteps, based on uncertainty.
• We use a probabilistic Bayesian formulation for asymmetric knowledge transfer, where the amount of knowledge transfer depends on the uncertainty at the feature level.
• We validate our model on clinical risk prediction tasks, on which it achieves significant improvements over baselines and provides meaningful interpretations, including temporal relationships between tasks.
