TEMPORAL GRAPH PATTERN MINING
PRESENTED BY
EUGENE YANG EY120@GEORGETOWN.EDU
MOTIVATION
• Find the relationships between clinical activities
• Encode this information as features
DEPENDENCY PATTERNS IN CLINICAL PATHWAYS
Lin, Fu-ren, et al. "Mining time dependency patterns in clinical
pathways." International Journal of Medical Informatics 62.1 (2001): 11-25.
BASIC SETUP
• An activity record (process log data) of the process $P_i$ can be written as a 4-tuple
$$(P_i,\ A_{ij},\ T_{s,ij},\ T_{e,ij})$$
where $A_{ij}$ is an activity of $P_i$ and $T_{s,ij}$, $T_{e,ij}$ are its start and end times.
• If activity $A_{ij}$ starts right after $A_{ik}$ with no other activity in between, we say that $A_{ij}$ depends on $A_{ik}$.
• We can then construct a directed acyclic dependency graph
$$G_p = (V_p, E_p)$$
where $V_p$ is the set of activities plus pseudo start/end nodes and $E_p$ is the Boolean dependency relation between every pair of activities.
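The construction above can be sketched in Python. This is only an illustration under stated assumptions: each record is a tuple `(process_id, activity, t_start, t_end)`, "starts right after" is read as adjacency in start-time order within one process, and the function name `dependency_edges` and the `START`/`END` labels are hypothetical, not taken from Lin et al.

```python
from collections import defaultdict

def dependency_edges(records):
    """Return {process_id: set of (predecessor, successor) edges}.

    Assumption: within one process, an activity depends on the activity
    that immediately precedes it in start-time order.
    """
    by_process = defaultdict(list)
    for pid, activity, t_start, t_end in records:
        by_process[pid].append((t_start, t_end, activity))

    graphs = {}
    for pid, acts in by_process.items():
        acts.sort()  # order by start time, then end time
        # Pseudo start/end nodes bracket the observed activities.
        names = ["START"] + [name for _, _, name in acts] + ["END"]
        graphs[pid] = set(zip(names, names[1:]))
    return graphs

records = [
    ("P1", "admit", 0, 1),
    ("P1", "xray", 1, 2),
    ("P1", "surgery", 2, 5),
    ("P2", "admit", 0, 1),
    ("P2", "surgery", 1, 4),
]
graphs = dependency_edges(records)
```

Storing each graph as a set of edges keeps the later support counting and similarity computations simple set operations.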
AGGREGATING THE INFORMATION
• A large graph $LG_k$ is a set of aggregated graphs with at most $k$ nodes
• Algorithm
1. Create the activity nodes by keeping every activity whose support exceeds a given threshold (i.e., create $LG_1$).
2. Examine the dependency between every pair of activities and create $LG_2$ by adding the pairs that reach a given minimum support.
3. Construct $LG_k$ by adding edges from $LG_2$ that are not already included in $LG_{k-1}$.
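Steps 1 and 2 can be sketched as a support count over per-process edge sets. This is a hedged illustration only: support is assumed to be the fraction of processes containing a node or edge, the function name `frequent_nodes_and_edges` is hypothetical, and the growth step to $LG_k$ is left out because the slide does not spell out its details.

```python
from collections import Counter

def frequent_nodes_and_edges(graphs, min_support):
    """graphs: {process_id: set of (u, v) edges}; min_support: fraction in [0, 1].

    Returns (lg1, lg2): frequent activities and frequent dependency pairs.
    """
    n = len(graphs)
    node_count, edge_count = Counter(), Counter()
    for edges in graphs.values():
        nodes = {u for u, _ in edges} | {v for _, v in edges}
        node_count.update(nodes)  # each process counts a node at most once
        edge_count.update(edges)  # each process counts an edge at most once
    lg1 = {a for a, c in node_count.items() if c / n >= min_support}
    lg2 = {
        (u, v)
        for (u, v), c in edge_count.items()
        if c / n >= min_support and u in lg1 and v in lg1
    }
    return lg1, lg2

graphs = {
    "P1": {("admit", "xray"), ("xray", "surgery")},
    "P2": {("admit", "surgery")},
    "P3": {("admit", "xray"), ("xray", "surgery")},
}
lg1, lg2 = frequent_nodes_and_edges(graphs, min_support=0.6)
```

In this toy input the pair `("admit", "surgery")` appears in only one of three processes, so it falls below the threshold and is dropped.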
EXAMPLE ON REAL DATA
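• Adding some runtime activities
• E.g. blood pressure
• Using a training/testing split to assess the validity and stability of the method
$$\mathrm{Sim}(P_i, P_j) = \frac{2 \times |E_i \cap E_j|}{|E_i| + |E_j|}$$
$$\text{Prediction Accuracy} = \frac{1}{n} \sum_{i=1}^{n} \max_{P_j \in \text{Training}} \mathrm{Sim}(P_i, P_j)$$
where $E_i$ is the edge set of the dependency graph of $P_i$; $\mathrm{Sim}$ is the Dice coefficient of the two edge sets and lies in $[0, 1]$.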
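The two measures can be sketched directly on edge sets. The sketch assumes the similarity is the Dice coefficient (shared edges in the numerator), so that it lies in [0, 1]; the names `sim` and `prediction_accuracy` and the toy graphs are illustrative.

```python
def sim(e_i, e_j):
    """Dice-style similarity between two edge sets (1.0 if both are empty)."""
    if not e_i and not e_j:
        return 1.0
    return 2 * len(e_i & e_j) / (len(e_i) + len(e_j))

def prediction_accuracy(test_graphs, train_graphs):
    """Mean over test processes of the best similarity to any training process."""
    best = [max(sim(e, t) for t in train_graphs) for e in test_graphs]
    return sum(best) / len(best)

train = [{("a", "b"), ("b", "c")}, {("a", "c")}]
test = [{("a", "b"), ("b", "c")}, {("a", "b")}]
acc = prediction_accuracy(test, train)
```

Here the first test graph matches a training graph exactly (similarity 1) and the second shares one edge with a two-edge training graph (similarity 2/3), so the accuracy is their mean.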
TEMPORAL PHENOTYPING WITH GRAPH BASED FRAMEWORK
Liu, Chuanren, et al. "Temporal phenotyping from longitudinal electronic health
records: A graph based framework." Proceedings of the 21th ACM SIGKDD
International Conference on Knowledge Discovery and Data Mining. ACM, 2015.
BASIC SETUP
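• Motivation: embed the relationships between events as features
• Construct a weighted directed temporal graph from each event sequence
• E.g. diagnosis, medication, lab test, etc.
• Weight function
$$W^{n}_{ij} = \frac{1}{L_n} \sum_{1 \le p \le q \le L_n} \left[ x_{np} = i \wedge x_{nq} = j \right] \kappa(t_{nq} - t_{np})$$
where $L_n$ is the length of sequence $n$, $x_{np}$ is the event at position $p$ with timestamp $t_{np}$, $[\cdot]$ equals 1 when the condition holds and 0 otherwise, and $\kappa(\cdot)$ is a non-increasing function (here chosen as an exponential distribution function if the input is greater than a constant $\Delta$).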
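A direct, unoptimized sketch of the weight function follows. The kernel is passed in as a parameter; the plain exponential decay `exp_kernel` and its `rate` parameter are assumptions made for illustration and not the exact kernel from Liu et al. Events are assumed to be integer-coded and sorted by time.

```python
import math
import numpy as np

def exp_kernel(delta, rate=10.0):
    """Hypothetical non-increasing kernel: plain exponential decay."""
    return math.exp(-delta / rate)

def temporal_graph(events, times, num_event_types, kernel=exp_kernel):
    """W[i, j] = (1/L) * sum over p <= q of [x_p == i and x_q == j] * kernel(t_q - t_p)."""
    L = len(events)
    W = np.zeros((num_event_types, num_event_types))
    for p in range(L):
        for q in range(p, L):  # p <= q, as in the formula on the slide
            W[events[p], events[q]] += kernel(times[q] - times[p])
    return W / L

events = [0, 1, 0, 2]           # integer event codes, in time order
times = [0.0, 5.0, 10.0, 30.0]  # matching timestamps
W = temporal_graph(events, times, num_event_types=3)
```

The double loop is quadratic in the sequence length; with a kernel that vanishes beyond some time gap, the inner loop could stop early.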
BASIS DECOMPOSITION OF GRAPH
• Decompose each graph's weight matrix into a weighted combination of basis graphs
• Use the coefficients of the bases as features
BASIS DECOMPOSITION OF GRAPH
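• For the weight matrix $W^{n}$,
$$W^{n} \approx \sum_{k=1}^{K} A_{nk} B^{k}$$
where $A \in \mathbb{R}^{N \times K}$ and $B^{k}$ is the $k$-th basis graph
• We minimize the estimation error
$$\mathcal{J}(A, B) = \frac{1}{2} \sum_{n=1}^{N} \left\lVert W^{n} - \sum_{k=1}^{K} A_{nk} B^{k} \right\rVert_F^2$$
where $\lVert \cdot \rVert_F$ is the matrix Frobenius norm.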
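The estimation error can be evaluated in a few lines of NumPy. This sketch assumes the graphs are stacked into an array `W` of shape `(N, M, M)` and the bases into `B` of shape `(K, M, M)`; the shapes, the name `reconstruction_loss`, and the random toy data are illustrative.

```python
import numpy as np

def reconstruction_loss(W, A, B):
    """0.5 * sum_n || W[n] - sum_k A[n, k] * B[k] ||_F^2.

    W: (N, M, M) graphs, A: (N, K) coefficients, B: (K, M, M) bases.
    """
    W_hat = np.einsum("nk,kij->nij", A, B)  # reconstruct every graph
    return 0.5 * np.sum((W - W_hat) ** 2)

rng = np.random.default_rng(0)
N, K, M = 5, 2, 3
B = rng.random((K, M, M))
A = rng.dirichlet(np.ones(K), size=N)       # rows are non-negative and sum to 1
W = np.einsum("nk,kij->nij", A, B)          # graphs that the bases fit exactly

loss_exact = reconstruction_loss(W, A, B)
loss_perturbed = reconstruction_loss(W + 0.1, A, B)
```

Shifting every entry by 0.1 adds 0.5 × 45 × 0.01 = 0.225 to the loss, since there are N·M·M = 45 entries.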
REGULARIZATION
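• As in most ML minimization problems, we can add a regularization term to the objective function
$$\mathcal{J}(A, B) = \frac{1}{2} \sum_{n=1}^{N} \left\lVert W^{n} - \sum_{k=1}^{K} A_{nk} B^{k} \right\rVert_F^2 + \lambda\, \Omega(A)$$
where the regularization function satisfies $\Omega(A) \ge 0$ and $\lambda \ge 0$ is a constant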
REGULARIZATION
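• Similarity based regularization
$$\Omega(A) = \frac{1}{2} \sum_{n_1, n_2} \frac{1}{2} \left\lVert A_{n_1} - A_{n_2} \right\rVert^2 S_{n_1 n_2} = \frac{1}{2} \operatorname{tr}(A' L A)$$
where $S$ is the symmetric similarity matrix, $D$ is the diagonal degree matrix with $D_{nn} = \sum_{m} S_{nm}$, and $L = D - S$
• Model based regularization
$$\Omega(A) = -\frac{1}{|\mathcal{L}|} \sum_{n \in \mathcal{L}} \log \Pr(A_n, Y_n \mid \mathcal{H})$$
where $\mathcal{L}$ is the training set and $Y_n$ is the corresponding label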
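The equality between the pairwise form and the trace form of the similarity regularizer can be checked numerically. The sketch below uses random toy data and assumes $D$ is the diagonal degree matrix of $S$ (row sums on the diagonal); all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 6, 3
A = rng.random((N, K))

S = rng.random((N, N))
S = (S + S.T) / 2            # make the similarity matrix symmetric
D = np.diag(S.sum(axis=1))   # degree matrix: row sums on the diagonal
L = D - S                    # graph Laplacian

# Pairwise form: (1/2) * sum_{n1, n2} (1/2) * ||A[n1] - A[n2]||^2 * S[n1, n2]
diff = A[:, None, :] - A[None, :, :]   # shape (N, N, K)
sq_dist = np.sum(diff ** 2, axis=2)    # squared Euclidean distances
pairwise = 0.5 * np.sum(0.5 * sq_dist * S)

# Trace form: (1/2) * tr(A' L A)
trace_form = 0.5 * np.trace(A.T @ L @ A)
```

The trace form avoids the N × N × K intermediate array and is the one whose gradient, $LA$, is convenient for gradient-based updates.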
REGULARIZATION
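• We can easily embed a discriminative model under this setup, e.g. logistic regression
$$\Pr(A_n, Y_n \mid \mathcal{H}) = \frac{1}{1 + \exp(-Y_n f(A_n))}$$
$$\mathcal{H}: A_n \mapsto f(A_n) = A_n \Theta + \theta$$
• Or hinge loss (SVM)
$$\Omega(A) = \frac{1}{|\mathcal{L}|} \sum_{n \in \mathcal{L}} \max(0,\ 1 - Y_n f(A_n))$$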
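Both model based regularizers can be sketched as small NumPy functions. The sketch assumes labels $Y_n \in \{-1, +1\}$ and that every row of `A` belongs to the training set; the function names and the toy values are illustrative.

```python
import numpy as np

def logistic_regularizer(A, Y, Theta, theta):
    """Mean negative log-likelihood: mean of log(1 + exp(-Y * f(A)))."""
    f = A @ Theta + theta
    return np.mean(np.log1p(np.exp(-Y * f)))

def hinge_regularizer(A, Y, Theta, theta):
    """Mean hinge loss: mean of max(0, 1 - Y * f(A))."""
    f = A @ Theta + theta
    return np.mean(np.maximum(0.0, 1.0 - Y * f))

A = np.array([[1.0, 0.0], [0.0, 1.0]])  # one coefficient row per sample
Y = np.array([1.0, -1.0])               # labels in {-1, +1}
Theta = np.array([2.0, -2.0])
theta = 0.0

logistic_value = logistic_regularizer(A, Y, Theta, theta)
hinge_value = hinge_regularizer(A, Y, Theta, theta)
```

Both samples here have margin $Y_n f(A_n) = 2$, so the hinge loss is zero while the logistic loss is still positive.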
OPTIMIZATION
• Similarity based regularization
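• Update $A$ with a projected gradient step
$$A \longleftarrow \mathrm{proj}_{\mathrm{splx}}\left( A - \alpha \frac{\partial \mathcal{J}}{\partial A} \right)$$
where $\mathrm{proj}_{\mathrm{splx}}$ denotes projection onto the simplex and $\alpha$ is the step size
• Update $B$ by solving, for each entry $(i, j)$, a least-squares problem over $B^{*}_{ij} = (B^{1}_{ij}, \dots, B^{K}_{ij})$
$$\min_{B^{*}_{ij}} \frac{1}{2} \sum_{n=1}^{N} \left( W^{n}_{ij} - \sum_{k=1}^{K} A_{nk} B^{k}_{ij} \right)^2$$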
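The two updates can be combined into an alternating loop. This is a minimal sketch that leaves out the regularizer ($\lambda = 0$) and makes several assumptions not stated on the slide: each row of $A$ is projected onto the probability simplex with the standard sort-based algorithm, the step size is set to the inverse Lipschitz constant of the gradient, and the names `project_simplex` and `fit` are hypothetical.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of a vector onto {x : x >= 0, sum(x) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    ks = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / ks > 0)[0][-1]
    tau = css[rho] / (rho + 1)
    return np.maximum(v - tau, 0.0)

def fit(W, K, steps=50, seed=0):
    """Alternate a projected gradient step on A with a least-squares solve for B."""
    rng = np.random.default_rng(seed)
    N, M, _ = W.shape
    W2 = W.reshape(N, M * M)                        # one row per graph
    A = rng.dirichlet(np.ones(K), size=N)           # start on the simplex
    B2 = np.linalg.lstsq(A, W2, rcond=None)[0]      # initial bases, shape (K, M*M)
    losses = []
    for _ in range(steps):
        # Assumed step size: 1 / (largest singular value of B2)^2.
        alpha = 1.0 / max(np.linalg.norm(B2, 2) ** 2, 1e-12)
        grad = (A @ B2 - W2) @ B2.T                 # dJ/dA without regularizer
        A = np.apply_along_axis(project_simplex, 1, A - alpha * grad)
        B2 = np.linalg.lstsq(A, W2, rcond=None)[0]  # entrywise least squares for B
        losses.append(0.5 * np.sum((W2 - A @ B2) ** 2))
    return A, B2.reshape(K, M, M), losses

rng = np.random.default_rng(1)
W = rng.random((8, 3, 3))
A, B, losses = fit(W, K=2)
```

Because the per-entry least-squares problems for $B$ share the same design matrix $A$, they can be solved in one `lstsq` call on the flattened graphs.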
OPTIMIZATION
• Model based regularization
• The loss function can be convex but not differentiable (e.g. the SVM hinge loss)
• Apply proximal gradient optimization
REAL EXAMPLE
• Congestive Heart Failure (CHF)
• One-year hospitalization prediction after CHF confirmation
• Prediction of CHF
QUESTIONS?
