INVESTIGATION OF
IMAGE PROCESSING
ALGORITHMS FOR
MEDICAL APPLICATION
Zhafir AglnaTijani
U1120208F
A final year project presentation in partial fulfilment of the
requirement for the degree of Bachelor of Engineering
1
Background
and Theory
Implementation
Result and
Discussion
Conclusion
Outline
2
Background
and Theory
•Problems and Objectives
•Gene Regulatory Network
•Granger Causality
•3 Methods of Granger Causality
•Project Focus
Implementation
Result and
Discussion
Conclusion
Outline
3
“It is more pragmatic to cure the
cause of disease at its sources
than to handle the actual
diseases”
Gene
4
• The interaction between genes is
called a Gene Regulatory Network (GRN)
• Discovering this network remains
challenging because of the
network's complexity
• Efficient computational tools are
required
To find an effective and efficient
means of discovering unknown Gene
Regulatory Networks
Objective
5
Modelling of GRN
• Nodes and Edges
• Depicting the
relation between
genes
• Obtained from DNA
Microarray
• Prominent Method :
Granger Causality
http://img.medicalxpress.com/newman/gfx/news/hires/2013/1-novelnoninva.jpg
6
Granger Causality
• Method for time series analysis
• Uses a Vector Autoregression (VAR) model to calculate
causality from time series data.
Granger (1969)
[Diagram: time series A and time series B]
$$U_t = \sum_{k=1}^{p} A_k U_{t-k} + \varepsilon_t$$
$$F_{Y \to X} \equiv \ln \frac{|\Sigma'_{xx}|}{|\Sigma_{xx}|}$$
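The log-ratio statistic on this slide can be illustrated with a minimal two-variable sketch. It assumes zero-mean series and ordinary least squares without an intercept, so the determinants reduce to scalar residual variances: the numerator comes from regressing X on its own past, the denominator from regressing X on the past of both X and Y. The helper names (`lagged`, `residual_variance`, `granger_f`) and the toy data are illustrative assumptions, not code from the project or from any of the toolboxes.

```python
import numpy as np

def lagged(series, p):
    """Stack lags 1..p of a 1-D series as columns, aligned with series[p:]."""
    n = len(series)
    return np.column_stack([series[p - k : n - k] for k in range(1, p + 1)])

def residual_variance(target, regressors):
    """Mean squared residual of an ordinary least-squares fit (no intercept)."""
    coef, *_ = np.linalg.lstsq(regressors, target, rcond=None)
    residuals = target - regressors @ coef
    return np.mean(residuals ** 2)

def granger_f(x, y, p=1):
    """F_{Y->X}: log ratio of reduced-model to full-model residual variance."""
    target = x[p:]
    reduced = lagged(x, p)                      # past of X only
    full = np.hstack([reduced, lagged(y, p)])   # past of X and Y
    return np.log(residual_variance(target, reduced) /
                  residual_variance(target, full))

# Toy data (hypothetical): Y drives X with a one-step delay.
rng = np.random.default_rng(0)
n = 2000
y = rng.standard_normal(n)
x = np.zeros(n)
for t in range(1, n):
    x[t] = 0.8 * y[t - 1] + 0.1 * rng.standard_normal()

f_y_to_x = granger_f(x, y)   # expected to be clearly positive
f_x_to_y = granger_f(y, x)   # expected to be close to zero
print(f_y_to_x, f_x_to_y)
```

Because the reduced model is nested in the full model, the statistic is non-negative; a value near zero means the past of the other series adds little predictive power.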
7
Granger Causality
“If past values of A and B can predict the future value
of B better than past values of B alone,
then time series A Granger-causes time series B”
Granger (1969)
A B
Time Series Time Series
8
MVGC: Barnett et al. (2013)
Lasso: Arnold et al. (2007)
Copula: Liu and Bahadori (2012)
3 Methods of Implementing
Granger Causality
“These 3 methods have been implemented independently,
but have never been compared under the same conditions.”
9
Main Focus of the Project
• Comparative Study of
Algorithms
• Focus on the Performance
of 3 Algorithms
• Finding Strengths and
Weaknesses
• Utilizing Control Variables
and Performance Metrics
10
Background
and Theory
Implementation
Result and
Discussion
Conclusion
•Control Variables
•Causality Graph and Matrix
•Edge Analysis
•Performance Metrics
•Data for Analysis
Outline
11
Implementation
Time Series Input → GC Algorithm → Causality Matrix and Graph → Edge Analysis → Data for Discussion
• Implementation using MATLAB 2010b
• Based on Existing Toolboxes :
• MVGC Toolbox ( Barnett, 2013 )
• Lasso Granger
• Copula Granger ( Liu and Bahadori, 2012 )
• GLMnet
Program Flow
12
Implementation
Control Variables
• Based on a set of equations
• Linear time series dataset
• Generated by specifying the number of time points
• Advantages:
• Provides a ground truth network: the actual causality of the time series
• The ground truth can be compared with the algorithm output to measure the
performance of the algorithms
• 2 types of dataset: 3-VAR and 5-VAR time series
• 8 different numbers of time points: 200, 400, 800, 1200, 1600, 2400, 3200, 4000
Synthetic Time Series Dataset
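A synthetic dataset of this kind could be generated roughly as follows. This is a sketch under stated assumptions: the slides do not give the actual equations, so the first-order coefficient matrix `coef`, the noise level, and the convention that `coef[i, j] != 0` means "variable j drives variable i" are all hypothetical. The ground truth network is read directly from the coefficients.

```python
import numpy as np

def simulate_var1(coef, n_points, noise_std=1.0, seed=0):
    """Simulate U_t = coef @ U_{t-1} + eps_t for n_points time steps."""
    rng = np.random.default_rng(seed)
    n_vars = coef.shape[0]
    series = np.zeros((n_points, n_vars))
    for t in range(1, n_points):
        noise = noise_std * rng.standard_normal(n_vars)
        series[t] = coef @ series[t - 1] + noise
    return series

# Hypothetical 3-variable coefficients (not taken from the slides).
# Assumed convention: coef[i, j] != 0 means variable j drives variable i.
coef = np.array([
    [0.5, 0.0, 0.0],
    [0.4, 0.5, 0.0],
    [0.0, 0.4, 0.5],
])

# Ground truth network: nonzero off-diagonal coefficients are the true links.
ground_truth = ((coef != 0) & ~np.eye(3, dtype=bool)).astype(int)

# One dataset per number of time points listed on the slide.
time_points = (200, 400, 800, 1200, 1600, 2400, 3200, 4000)
datasets = {n: simulate_var1(coef, n) for n in time_points}
print(ground_truth)
```

The triangular coefficient matrix keeps every eigenvalue at 0.5, so the simulated process stays stable for all eight lengths.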
13
3 Granger Causality Algorithms
14
Causality Matrix
• 1 represents: a link exists between variables
• 0 represents: no link exists
0 0 0 0 0
1 0 0 0 0
1 0 0 0 0
1 0 0 0 1
1 0 0 1 0
• The output of a GC algorithm is the causality matrix
• It depicts Granger causality between the time series
15
Edge Analysis
• The algorithm output is converted to a binary matrix by masking with
a threshold of 0.0001
• Edge Analysis measures the performance of
an algorithm by comparing its output with the benchmark
• Benchmark = Ground Truth
Ground Truth:
0 0 0 0 0
1 0 0 0 0
1 0 0 0 0
1 0 0 0 1
1 0 0 1 0
Lasso Method:
0 0 1 0 1
1 0 1 1 1
1 1 1 0 1
0 1 0 1 1
1 1 1 0 1
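The binary masking step might look like the sketch below. Only the threshold value 0.0001 comes from the slide; the raw score matrix is made up for illustration, and comparing the absolute value with a strict "greater than" is an assumption about how the mask was applied.

```python
import numpy as np

THRESHOLD = 1e-4  # threshold value stated on the slide

def binarize(scores, threshold=THRESHOLD):
    """Return 1 where the absolute score exceeds the threshold, else 0."""
    return (np.abs(scores) > threshold).astype(int)

# Hypothetical raw causality scores from an algorithm (not from the slides).
scores = np.array([
    [0.0,     0.00005, 0.31],
    [0.42,    0.0,     0.0002],
    [0.00001, 0.27,    0.0],
])

mask = binarize(scores)
print(mask)
```

Scores such as 0.00005 fall below the threshold and are treated as "no link", while 0.0002 is kept as a link.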
16
Edge Analysis
• Using parameters from the confusion matrix:
• True Positives (TP), True Negatives (TN), False Positives (FP), and False
Negatives (FN)
Ground Truth:
0 0 0 0 0
1 0 0 0 0
1 0 0 0 0
1 0 0 0 1
1 0 0 1 0
Lasso Method:
0 0 1 0 1
1 0 1 1 1
1 1 1 0 1
0 1 0 1 1
1 1 1 0 1
For the example above:
• TP : 4
• TN : 6
• FP : 13
• FN : 2
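The four counts for this example can be reproduced with a short sketch. The two matrices are copied from the slide; the function name `confusion_counts` is an illustrative choice, and the sketch assumes that all 25 entries, including the diagonal, are counted, which is what the slide's totals imply (4 + 6 + 13 + 2 = 25).

```python
import numpy as np

# Matrices copied from the slide.
ground_truth = np.array([
    [0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 1, 0],
])
lasso = np.array([
    [0, 0, 1, 0, 1],
    [1, 0, 1, 1, 1],
    [1, 1, 1, 0, 1],
    [0, 1, 0, 1, 1],
    [1, 1, 1, 0, 1],
])

def confusion_counts(truth, predicted):
    """Count TP, TN, FP, FN over every entry of two binary matrices."""
    truth = truth.astype(bool)
    predicted = predicted.astype(bool)
    tp = int(np.sum(truth & predicted))
    tn = int(np.sum(~truth & ~predicted))
    fp = int(np.sum(~truth & predicted))
    fn = int(np.sum(truth & ~predicted))
    return tp, tn, fp, fn

counts = confusion_counts(ground_truth, lasso)
print(counts)  # (4, 6, 13, 2)
```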
17
7 Performance Metrics
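$$\text{Sensitivity} = \frac{TP}{TP + FN}$$
$$\text{Specificity} = \frac{TN}{TN + FP}$$
$$\text{Precision} = \frac{TP}{TP + FP}$$
$$\text{False Positive Rate} = \frac{FP}{TN + FP}$$
$$\text{False Discovery Rate} = \frac{FP}{TP + FP}$$
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
$$\text{F1 Score} = \frac{2TP}{2TP + FP + FN}$$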
• Calculated from the values of TP, TN, FP, and FN
• Used in past research on similar topics
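As a worked example, the seven metrics can be evaluated for the counts given in the Edge Analysis example (TP = 4, TN = 6, FP = 13, FN = 2). The function name and dictionary keys are illustrative, and the sketch assumes no denominator is zero; a real implementation would need to guard against that.

```python
def performance_metrics(tp, tn, fp, fn):
    """Compute the seven metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "false_positive_rate": fp / (tn + fp),
        "false_discovery_rate": fp / (tp + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "f1_score": 2 * tp / (2 * tp + fp + fn),
    }

# Counts from the Edge Analysis example on the slides.
metrics = performance_metrics(tp=4, tn=6, fp=13, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

For these counts the accuracy is 10/25 = 0.4 and the F1 score is 8/23, roughly 0.348. Specificity and false positive rate always sum to 1, as do precision and false discovery rate.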
18
Data for Analysis
• The result of Granger causality depends
on the generated time series
• A few samples were not sufficient, since the
generated time series differed each time
• The experiment was repeated 2000 times
• The mean value of each performance metric
is the basis for the comparative study
Lasso : 1st Iteration
0 0 1 0 1
1 0 1 1 1
1 1 1 0 1
0 1 0 1 1
1 1 1 0 1
Lasso : 2nd Iteration
0 0 1 0 0
0 0 1 1 0
1 1 1 0 1
0 1 0 1 1
0 1 0 1 1
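The averaging step can be sketched with just the two Lasso iterations shown on this slide. It assumes both are scored against the 5-variable ground truth matrix from the Edge Analysis slides; the project averaged over 2000 iterations, which would only lengthen the list.

```python
import numpy as np

# Ground truth from the Edge Analysis slides (assumed to apply here).
ground_truth = np.array([
    [0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 1, 0],
])

# The two Lasso outputs shown on the slide.
iterations = [
    np.array([
        [0, 0, 1, 0, 1],
        [1, 0, 1, 1, 1],
        [1, 1, 1, 0, 1],
        [0, 1, 0, 1, 1],
        [1, 1, 1, 0, 1],
    ]),
    np.array([
        [0, 0, 1, 0, 0],
        [0, 0, 1, 1, 0],
        [1, 1, 1, 0, 1],
        [0, 1, 0, 1, 1],
        [0, 1, 0, 1, 1],
    ]),
]

def accuracy(truth, predicted):
    """Fraction of matrix entries where the prediction matches the truth."""
    return float(np.mean(truth == predicted))

scores = [accuracy(ground_truth, output) for output in iterations]
mean_accuracy = float(np.mean(scores))
print(scores, mean_accuracy)
```

The two runs give accuracies of 0.40 and 0.48, with a mean of 0.44, which shows how much a single run can differ from the average.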
19
Background
and Theory
Implementation
Result and
Discussion
Conclusion
Outline
•Performance Metrics Scores
•Specific Result
•5-VAR Accuracy
•3-VAR and 5-VAR F1 Score
•Overall Score Result
20
Scores of Metrics
• Bar charts represent the score
of each performance metric for the
3 methods
• X axis : Number of Time Points
• Y axis : Score of Metrics
• 7 Performance Metrics
• 2 Scenarios : 3-VAR and 5-VAR
[Bar chart: VAR5 F1 Score. X axis: Number of Time Points (200, 400, 800, 1200, 1600, 2400, 3200, 4000); Y axis: Score; series: MVGC, LASSO, COPULA]
21
[Six bar charts, each with X axis: Number of Time Points (200, 400, 800, 1200, 1600, 2400, 3200, 4000); Y axis: Score; series: MVGC, LASSO, COPULA]
VAR5 Specificity
VAR5 Sensitivity
VAR5 Precision
VAR5 False Positive Rate
VAR5 False Discovery Rate
VAR5 Accuracy
22
[Bar chart: VAR5 Accuracy. X axis: Number of Time Points (200, 400, 800, 1200, 1600, 2400, 3200, 4000); Y axis: Score; series: MVGC, LASSO, COPULA]
5-VAR Accuracy
• Accuracy
• Proportion of correct results (true positives and true negatives)
among all possible links
• MVGC
• Increases as the number of time points
increases
• Score range was small (around 0.1)
• Lasso
• Increases as the number of time points
increases
• Two extreme scores, wide score range
• Copula
• Performs best at around 400
time points
• Poor performance at higher numbers of
time points
23
3-VAR and 5-VAR F1 Score
• F1 Score
• Harmonic mean of precision and
recall (sensitivity)
• MVGC
• Consistent pattern, increases as the number of
time points increases
• Lasso
• Contrasting pattern between the 3-VAR and 5-VAR scenarios
• Heavily affected by the number of variables
• Copula
• Unique pattern
• Has a certain point / range where
performance is best
[Two bar charts, each with X axis: Number of Time Points (200, 400, 800, 1200, 1600, 2400, 3200, 4000); Y axis: Score; series: MVGC, LASSO, COPULA]
VAR3 F1 Score
VAR5 F1 Score
24
Overall Performance
3-Variable Time Series
Metric                 Best Performance   Average Performance   Worst Performance
Sensitivity            Lasso              Copula                MVGC
Specificity            MVGC               Lasso                 Copula
Precision              MVGC               Lasso                 Copula
False Positive Rate    MVGC               Lasso                 Copula
False Discovery Rate   MVGC               Lasso                 Copula
Accuracy               MVGC               Lasso                 Copula
F1 Score               MVGC               Lasso                 Copula
• Overall performance is based on the average score across all time points
• MVGC outperforms the other two methods in the 3-VAR scenario
• Lasso scored well at high numbers of time points
• Copula scored high within a certain range (around 200 to 800 time points),
but outside that range its scores were lower than those of the other methods
25
5-Variable Time Series
Metric                 Best Performance   Average Performance   Worst Performance
Sensitivity            MVGC               Copula                Lasso
Specificity            Lasso              Copula                MVGC
Precision              MVGC               Copula                Lasso
False Positive Rate    Lasso              Copula                MVGC
False Discovery Rate   MVGC               Copula                Lasso
Accuracy               Copula             MVGC                  Lasso
F1 Score               MVGC               Copula                Lasso
• MVGC shows consistency in both 5-VAR and 3-VAR
• Copula provides the best accuracy compared to the other methods, especially at 200 to
800 time points
• Lasso scores highest at high numbers of time points, but its scores at low
numbers of time points were low.
26
Background
and Theory
Implementation
Result and
Discussion
Conclusion
Outline
•Conclusion
•Future Works
27
Conclusion
• The 3 methods of GC (MVGC, Lasso, and
Copula) can be compared using 7
performance metrics
• MVGC provides consistency in most
conditions
• Lasso has advantages in handling
high numbers of time points
• Copula has a certain range in which its
performance is best
• Even though the overall scores favour
MVGC over the other methods,
the results are still conditional
28
Suggestions for Future Work
• Granger Causality algorithms for non-linear data
• Non-linear data provides a better representation of a Gene Regulatory Network
• Application to real datasets
• Granger Causality analysis may be applied to a real dataset
• Other algorithms for GRN (Dynamic Bayesian Network)
• DBN is another prominent method in this topic
29
Thank You
30
Q & A
31

Comparative Study of Granger Causality Algorithm for Gene Regulatory Network
