© 2022 Neo4j, Inc. All rights reserved.
Enabling Patient-Driven Medicine Using
Graph Database
A systems biology paradigm for medicine
Kasthuri Kannan, PhD
Associate Professor, Director (Computational Pathology Program)
Departments of Translational Molecular Pathology & Neurosurgery
Internship
Knowledge Based Systems
Jan. 2007-Dec. 2007
Research Specialist/Associate
Stowers Institute for
Medical Research/
Penn State
Jan. 2008-Mar. 2011
Research Fellow/Associate
Memorial Sloan-Kettering Cancer Center
Apr. 2011-Oct. 2013
Assistant Professor
New York University
Nov. 2013-Aug. 2019
Bachelor of Science
Mathematics (BSc)
University of Madras
Apr. 1998
Master of Science
Mathematics (MSc)
IIT, Madras
Apr. 2000
Master of Science
Mathematics (MS)
Texas A&M University
Apr. 2002
education
acquired
credentials
Kasthuri
Kannan
Doctor of Philosophy
Computer Science (PhD)
Texas A&M University
Aug. 2008
creativity
skills
Several successful
collaborations
Active learner and listener
Emotional intelligence
Communication
Proven track record of
applying data science to
biology and omics data
Image analysis/processing
Designing/directing data
science & machine learning
courses
Software development
Bioinformatics pipeline
development
NGS analysis/processing
Systems Biology
Directed insights using
data science
Mathematical & statistical
modeling + Visualization
Data science (14 years)
Graph Databases (Neo4j)
Machine learning
Several bioinformatics tools
Implementation skills
R/Python/C/C++
Strong Unix skills
Image analysis
Statistics, Mathematics
Top publications
work
experience
directed
efforts
Associate Professor & Director
UT MD Anderson Cancer Center
Feb. 2020-Present
self-motivation
Data Scientist
(Cancer Research)
The state of modern biomedicine
Basics (Genetics)
Gene Expression. Image Credit: udaix/Shutterstock.com
Gene expression
Same DNA but different mRNA (and hence different proteins)
Image Credits: Hubrecht Institute, https://www.pnas.org/doi/10.1073/pnas.0509380102
How are different mRNAs produced by the same DNA?
A
UGA
Protein X
Protein Y
Image Credits: courses.lumenlearning.com, https://www.clancymedicalgroup.com/amino-acids/, themedicalbiochemistrypage.org
Yet another mechanism…
Genes Expressed
Image Credit: https://www.zymoresearch.com/pages/what-is-epigenetics
Nature vs Nurture
Image Credit: https://www.genomicseducation.hee.nhs.uk/
Absence of consensus
Algorithm
• Read all 3.2 billion letters of the genome as a sequence
• or extract only the coding letters of the genome (~20K genes)
• Look for gene sequence changes
• Look for methylated regions (promoters)
• Correlate with the mRNA numbers (transcripts)
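The last three steps of the algorithm above can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the pipeline used in the talk: the function names `point_changes` and `pearson` are hypothetical, sequences are assumed to be pre-aligned and of equal length, and the sample values are made up for illustration only.

```python
from math import sqrt


def point_changes(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (position, reference_base, sample_base) for every mismatch.

    Assumption: both sequences are already aligned and of equal length.
    """
    if len(reference) != len(sample):
        raise ValueError("this sketch only compares equal-length sequences")
    return [(i, r, s) for i, (r, s) in enumerate(zip(reference, sample)) if r != s]


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation, e.g. promoter methylation vs. transcript count."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long series with at least 2 values")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Toy, made-up inputs (not patient data).
changes = point_changes("ACGTAC", "ACCTAG")
methylation = [0.1, 0.5, 0.9]       # hypothetical promoter methylation fractions
transcripts = [30.0, 20.0, 10.0]    # hypothetical mRNA counts for the same samples
r = pearson(methylation, transcripts)
```

A strongly negative `r` in such a comparison would be consistent with promoter methylation silencing a gene, but a real analysis would need many samples and proper multiple-testing control.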
Next-generation sequencing
Made identifying the sequences of
nucleotides possible
Can identify DNA, RNA and
methylated nucleotides/sequences
Affordable with a shorter turnaround
time (< $1000 & 8-11 days)
Reference genome available
through Human Genome Project
Image Credit: https://bitesizebio.com/
Data deluge in biology
• Explosion of biological datasets
Image Credits: https://www.nature.com/articles/d41586-019-03536-x,
https://th.wikipedia.org/wiki/%E0%B9%84%E0%B8%9F%E0%B8%A5%E0%B9%8C:Karyotype_color_chromosomes_white_background.png,
https://www.activemotif.com/catalog/2/gene-regulation,
https://www.activemotif.com/catalog/42/dna-methylation-enrichment
Yet no consensus
Image Credit: https://www.flyaeroguard.com/learning-center/parts-of-an-airplane/
Systems thinking vs reductionism in biology
• Modern biology is more or less “reductionistic”
• Divide and conquer may not work when studying complex systems
• Biomedicine, and especially cancer, is complex and adaptive
• Reductionism is like relational databases (breaking into tables)
• Need an integrative paradigm – SYSTEMS APPROACH
Integrative technology defines systems biology
Image Credit: Institute of Systems Biology
Systems biology initiative for brain tumors
Goal of the project: creating a unifying computational
framework to integrate all available patient-centric brain
tumor recurrence data to generate hypotheses in an
unbiased manner for actionable target validation and
predicting disease outcomes for initiating clinical trials.
Glioblastoma – most lethal of several brain tumors
Image Credit: braintumor.org
Graph architecture
• Integration and unbiased
analysis become natural
• Integration is handled by
defining the
relationships and
unbiased analysis is
handled by writing
appropriate queries
Is this idea new?
1. Irina Balaur, Mansoor Saqi, Ana Barat, Artem Lysenko, Alexander Mazein, Christopher J. Rawlings, Heather J. Ruskin, and Charles
Auffray. EpiGeNet: A Graph Database of Interdependencies Between Genetic and Epigenetic Events in Colorectal Cancer. Journal of
Computational Biology. Oct 2017, 969-980.
Integration is either subjective,
restricted to datatypes or
phenotypes, i.e., relationships
are not general enough
Necessity to create additional
relationship types for new data
Results in restricted survival
analysis
What is new?
Our database (Basic Representation)
Molecular Event --(patients > 1)-- Molecular Event
Graph ENgine fOr systeMs mEdicine (GENOME)
Such an architecture allows any type of patient-data integration and comprehensive survival analysis
Copy number
events
Expression
events
Mutation
events
Methylation
events
Any two
events are
connected if
they are
shared by
more than
one patient
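The connection rule above (two events are linked when more than one patient shares both) can be sketched in a few lines of Python. This is an illustrative in-memory version, not the Neo4j implementation from the talk: the function name `build_event_graph`, the `min_shared` parameter, and the patient and event labels are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_event_graph(
    patient_events: dict[str, set[str]], min_shared: int = 2
) -> dict[tuple[str, str], int]:
    """Map each event pair to the number of patients carrying both events.

    Only pairs shared by at least `min_shared` patients are kept, which
    with the default of 2 matches "shared by more than one patient".
    """
    shared: dict[tuple[str, str], int] = defaultdict(int)
    for events in patient_events.values():
        # Sorting gives every unordered pair a single canonical key.
        for a, b in combinations(sorted(events), 2):
            shared[(a, b)] += 1
    return {pair: n for pair, n in shared.items() if n >= min_shared}


# Hypothetical cohort; event labels mix data types on purpose.
cohort = {
    "P1": {"mutation:A", "methylation:B", "expression:C"},
    "P2": {"mutation:A", "methylation:B"},
    "P3": {"mutation:A", "expression:C"},
}
edges = build_event_graph(cohort)
```

Because an edge depends only on patient overlap, a new data type needs no new relationship type: its events are simply added to each patient's set.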
GENOME overview
Filters
fastq files -> align -> .bam files -> re-align (GATK) -> variant calling -> filtering -> driver analysis -> driver mutations
Molecular Event --(patients > 1)-- Molecular Event
Survival correlation
A
B
C
13
w
x
y
z
(wx), (xy), (yz), (zw): 1-length paths (4)
(wxy), (xyz), (yzw), (zwx): 2-length paths (4)
(wxyz), (xyzw), (yzwx), (zwxy): 3-length paths (4)
(wxyzw): 4-length paths (1)
11K nodes and 16M relationships
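The path counts listed above can be reproduced with a small sketch, assuming the example graph is the four-node cycle w-x-y-z-w and that a path and its reverse count as the same path. The function name `simple_paths` is hypothetical. The sketch covers the open paths of length 1 to 3; the single 4-length entry (wxyzw) is the closed cycle itself and is not a simple path, so it is left out.

```python
def simple_paths(adjacency: dict[str, list[str]], length: int) -> set[tuple[str, ...]]:
    """Return all simple paths with `length` edges, one orientation per path."""
    found: set[tuple[str, ...]] = set()

    def extend(path: tuple[str, ...]) -> None:
        if len(path) == length + 1:
            # Keep the lexicographically smaller orientation so that
            # (w, x, y) and (y, x, w) are counted once.
            found.add(min(path, path[::-1]))
            return
        for nxt in adjacency[path[-1]]:
            if nxt not in path:
                extend(path + (nxt,))

    for start in adjacency:
        extend((start,))
    return found


# Assumed example graph: the cycle w-x-y-z-w.
cycle = {"w": ["x", "z"], "x": ["w", "y"], "y": ["x", "z"], "z": ["y", "w"]}
counts = {k: len(simple_paths(cycle, k)) for k in (1, 2, 3)}
```

Exhaustive enumeration like this grows very quickly with graph size, which is presumably why a graph with 11K nodes and 16M relationships is explored by sampling rather than by listing every path.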
Random walk simulations I
Path no. 51
p < 0.03
ITGA8-ABCA4-C22orf42-Intergenic CpG-NGF-CCL1-HBBP1-14q32.13-10q11.22-C17orf54
Path no. 54
Path no. 51
CAMTA1-17q25.3-10q26.13-WWOX-Intergenic CpG-CHST1-17q25.3-Intergenic CpG-TRRAP-CPS1
Path no. 54
p < 0.03
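How a candidate path might be sampled from the event graph can be sketched as a plain random walk. The slides do not state the sampling scheme or how the p-values were computed, so everything here is an assumption: the function name `random_walk`, the uniform choice among neighbours, allowing revisits, and the toy graph. The survival test applied to each sampled path is not shown.

```python
import random


def random_walk(
    adjacency: dict[str, list[str]], start: str, steps: int, rng: random.Random
) -> list[str]:
    """Take up to `steps` uniformly random hops from `start`.

    Revisits are allowed; the walk stops early only at a node
    that has no neighbours.
    """
    path = [start]
    for _ in range(steps):
        neighbours = adjacency[path[-1]]
        if not neighbours:
            break
        path.append(rng.choice(neighbours))
    return path


# Toy graph with placeholder event names (not real genes or regions).
toy_graph = {
    "event_w": ["event_x", "event_z"],
    "event_x": ["event_w", "event_y"],
    "event_y": ["event_x", "event_z"],
    "event_z": ["event_y", "event_w"],
}
rng = random.Random(0)  # fixed seed so that a run can be repeated
walks = [random_walk(toy_graph, "event_w", 9, rng) for _ in range(5)]
```

Each walk of 9 steps yields a 10-event path, the same length as the two paths shown on this slide. In a real analysis, each path would then be scored by comparing survival between patients who carry its events and those who do not.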
Random walk simulations II
p < 0.005
Random walk simulations III
NMUR2-GABRG2-FKBP8- KCNK9-HOXD1-6p22.3
p < 0.001
Path no. 16
Power of this approach
• Identified several functionally important genes and
regions for conducting biological experiments
Novelty of this approach
• Provides a strong framework for a systems biology paradigm in
medicine
• Patient-driven rather than patient-property driven
• Truly unbiased and integrative
◦ Not time dependent (based on scientific consensus) or
subjective
◦ Correlation with survival combining multiple biomarkers has
not been attempted
Acknowledgements and support
• Patients and their families – entrusting us with their data
• Neo4j and Neo4j team (Rob Martin, Phani Dathar and Greg Shirley)
• Yang Liu, PhD student – actively developing the database
• Drs. Jason Huse, Vinay Puduvalli, Frederick Lang - unwavering support
• Dr. David Jaffray, Sr. Vice President and CTO (MDACC) - insights, database availability
• Moon Shot Program® - coordinators and administrative support (neurosurgery, neuro-onc.)
– funding and logistics to make the project possible
• Brain Tumor Center, MD Anderson Cancer Center; The GLASS consortium
• Rui Jiang, Suren – database support at MD Anderson Cancer Center
• IT support staff at MD Anderson
Dr. David Jaffray
Dr. Jason Huse
Yang Liu
Thank you!
Questions?!
Contact information:
kskannan@mdanderson.org, kasthuri@gmail.com
https://kannan-kasthuri.github.io/
