Neo4j Demo: Using Knowledge Graphs to Classify Diabetes Patients (GlaxoSmithKline)

© 2022 Neo4j, Inc. All rights reserved.
1
Neo4j Demo: Classify Diabetes Patients and Connect to Knowledge Graphs
Dr Alexander Jarasch
Field Engineering Specialist Pharma / Healthcare / Biotech

2
Use Cases for the Entire Drug Life Cycle
Target Discovery -> Hit Generation -> Lead Identification -> Lead Optimization -> Animal Models -> Clinical Trials -> FDA/EMA Review & Approval -> Post Approval
Also shown: Manufacturing, Cyber security (customer labels: F100)

3
Neo4j Graph Data Platform
Native Graph Database
The foundation of the Neo4j platform; delivers enterprise scale and performance, security, and data integrity for transactional and analytical workloads.
Data Science and Analytics
Exploratory tools, a rich algorithm library, and an integrated supervised machine learning framework.
Development Tools & Frameworks
Tooling, APIs, query builder, and multi-language support for development, admin, modelling, and rapid prototyping needs.
Discovery & Visualization
Code-free querying, data modeling, and exploration tools for data scientists, developers, and analysts.
Graph Query Language Support
Cypher & openCypher; ongoing leadership and standards work (GQL) to establish a lingua franca for graphs.
Ecosystem & Integrations
Rich ecosystem of tech and integration partners. Ingestion tools (JDBC, Kafka, Spark, BI Tools, etc.) for bulk and streaming needs.
Runs Anywhere
Deploy as a service (AuraDB), self-hosted in your cloud of choice (AWS, GCP, Azure) via its marketplace, or on-premises.

4
65+ Graph Algorithms - Out of the Box
Pathfinding & Search
❏ Delta-Stepping Single-Source
❏ Dijkstra's Single-Source
❏ Dijkstra's Source-Target
❏ All Pairs Shortest Path
❏ A* Shortest Path
❏ Yen's K Shortest Path
❏ Minimum Weight Spanning Tree
❏ Random Walk
❏ Breadth & Depth First Search
Centrality
❏ Degree Centrality
❏ Closeness Centrality
❏ Harmonic Centrality
❏ Betweenness Centrality & Approx.
❏ PageRank
❏ Personalized PageRank
❏ ArticleRank
❏ Eigenvector Centrality
❏ Hyperlink Induced Topic Search (HITS)
❏ Influence Maximization (Greedy, CELF)
Community Detection
❏ Weakly Connected Components
❏ Strongly Connected Components
❏ Label Propagation
❏ Leiden
❏ Louvain
❏ K-Means Clustering
❏ K-1 Coloring
❏ Modularity Optimization
❏ Speaker Listener Label Propagation
❏ Approximate Max K-Cut
❏ Triangle Count
❏ Local Clustering Coefficient
❏ Conductance Metric
Heuristic Link Prediction (LP)
❏ Adamic Adar
❏ Common Neighbors
❏ Preferential Attachment
❏ Resource Allocation
❏ Same Community
❏ Total Neighbors
Similarity
❏ K-Nearest Neighbors (KNN)
❏ Filtered K-Nearest Neighbors (KNN)
❏ Node Similarity
❏ Filtered Node Similarity
❏ Similarity Functions
Graph Embeddings
❏ Fast Random Projection (FastRP)
❏ Node2Vec
❏ GraphSAGE
❏ Node Classification Pipeline
❏ Link Prediction Pipeline
❏ Node Regression Pipeline

5
What We Do in This Demo
• GDS Node Classification Pipeline on patient data
  • Loading data
  • Data preparation -> Fast Random Projection (FastRP)
  • Model training
  • Prediction of T2D (type 2 diabetes) / non-T2D based on transcriptomics
• GDS Community Detection
  • For subphenotyping
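The train-then-predict part of the workflow above can be illustrated with a small, dependency-free sketch. It is not the GDS Node Classification Pipeline: a nearest-centroid classifier is swapped in purely to keep the example short, and the patient ids, the 2-D vectors, and the function names `centroid`, `train`, and `predict` are all hypothetical. In the demo the input vectors would be the FastRP embeddings.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equally long vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def train(embeddings, labels):
    """Compute one centroid per class from the labelled patients only."""
    grouped = {}
    for patient_id, label in labels.items():
        grouped.setdefault(label, []).append(embeddings[patient_id])
    return {label: centroid(vectors) for label, vectors in grouped.items()}


def predict(model, vector):
    """Return the class whose centroid is closest (Euclidean distance)."""
    return min(model, key=lambda label: math.dist(vector, model[label]))


# Hypothetical toy embeddings; real embeddings would have e.g. 50 dimensions.
embeddings = {
    "p1": [0.9, 0.1], "p2": [0.8, 0.2],    # labelled T2D
    "p3": [0.1, 0.9], "p4": [0.2, 0.8],    # labelled non-T2D
    "p5": [0.85, 0.15], "p6": [0.15, 0.7], # unclassified
}
labels = {"p1": "T2D", "p2": "T2D", "p3": "non-T2D", "p4": "non-T2D"}

model = train(embeddings, labels)
predictions = {
    patient_id: predict(model, vector)
    for patient_id, vector in embeddings.items()
    if patient_id not in labels
}
print(predictions)
```

The split mirrors the dataset used in the demo: the model is fitted on patients with a known diabetes status and then applied to the unclassified ones.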

6
From Chaos to Structure: Neo4j Graph Data Science is Changing How Machine Learning Gets Done
Graph Embeddings summarize the enhanced explicit knowledge of a graph

7
Dataset

8
Dataset
• 63 patients (total)
  • 51 classified as T2D / non-T2D
  • 12 unclassified
• 18k transcripts measured per patient

9
Data Model
Nodes: Patient, Transcript
Relationships: :MEASURED (property: value), :SIMILAR (property: similarityScore)
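As a rough illustration of this data model, the sketch below holds the two relationship types in memory and derives similarity scores. Several things are assumptions not stated on the slide: that :MEASURED links a patient to a transcript, that :SIMILAR links two patients, that the score is the cosine similarity of their measured values, and that a missing measurement counts as 0.0. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations
from math import sqrt


@dataclass(frozen=True)
class Measured:
    """Assumed shape: a Patient and a Transcript joined by :MEASURED {value}."""
    patient: str
    transcript: str
    value: float


@dataclass(frozen=True)
class Similar:
    """Assumed shape: two Patients joined by :SIMILAR {similarityScore}."""
    patient_a: str
    patient_b: str
    similarity_score: float


def cosine(u, v):
    """Cosine similarity of two equally long vectors (0.0 if either is all zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similar_pairs(measurements, threshold):
    """Build Similar records for patient pairs scoring at or above the threshold."""
    patients = sorted({m.patient for m in measurements})
    transcripts = sorted({m.transcript for m in measurements})
    lookup = {(m.patient, m.transcript): m.value for m in measurements}
    # Assumption: a transcript with no measurement is treated as 0.0.
    vectors = {
        p: [lookup.get((p, t), 0.0) for t in transcripts] for p in patients
    }
    pairs = []
    for a, b in combinations(patients, 2):
        score = cosine(vectors[a], vectors[b])
        if score >= threshold:
            pairs.append(Similar(a, b, score))
    return pairs
```

In the demo itself the similarity relationships would more plausibly come from a GDS similarity algorithm run on the embeddings; this sketch only shows how the two relationship types relate to each other.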

10
Demo Time

11
Coming Back to Knowledge Graphs

12
Clinical Data from Patients
Gender Age BMI diab.status Sample coll #samples #used
surgery disease Histological diagnosis

13
Transform Data with GDS - Fast Random Projections
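// Compute FastRP embeddings on the projected graph 'patients'
// and write them back to each node as a property.
CALL gds.fastRP.write(
  'patients',
  {
    embeddingDimension: 50,
    writeProperty: 'fastRP-embedding'
  }
)
YIELD nodePropertiesWritten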
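To give an intuition for what the call above computes, here is a heavily simplified sketch of the Fast Random Projection idea in plain Python: each node starts with a sparse random vector, vectors are repeatedly averaged over neighbours, and the normalised results of each round are combined with per-iteration weights. It is not the GDS implementation (degree-based normalisation, relationship weights, and node properties are omitted), and the function name `fast_rp`, its parameters, and the default values are chosen for illustration only.

```python
import math
import random


def fast_rp(adjacency, dimension=50, iteration_weights=(0.0, 1.0, 1.0),
            sparsity=3, seed=42):
    """Simplified FastRP-style embedding for an adjacency dict {node: [neighbours]}.

    Every neighbour must also appear as a key of `adjacency`.
    """
    rng = random.Random(seed)
    scale = math.sqrt(sparsity)

    def random_vector():
        # Sparse random projection: +scale, -scale, or 0 in each dimension.
        vector = []
        for _ in range(dimension):
            draw = rng.random()
            if draw < 1 / (2 * sparsity):
                vector.append(scale)
            elif draw < 1 / sparsity:
                vector.append(-scale)
            else:
                vector.append(0.0)
        return vector

    def l2_normalize(vector):
        norm = math.sqrt(sum(x * x for x in vector))
        return [x / norm for x in vector] if norm > 0 else vector

    current = {node: random_vector() for node in adjacency}
    embedding = {node: [0.0] * dimension for node in adjacency}

    for weight in iteration_weights:
        propagated = {}
        for node, neighbours in adjacency.items():
            # Average the neighbours' vectors from the previous round.
            total = [0.0] * dimension
            for neighbour in neighbours:
                for i, x in enumerate(current[neighbour]):
                    total[i] += x
            if neighbours:
                total = [x / len(neighbours) for x in total]
            propagated[node] = l2_normalize(total)
        current = propagated
        # Add this round's result to the final embedding with its weight.
        for node in adjacency:
            for i, x in enumerate(current[node]):
                embedding[node][i] += weight * x
    return embedding


# Hypothetical toy graph: patients P1..P3 linked to transcripts T1..T3.
graph = {
    "P1": ["T1", "T2"], "P2": ["T1", "T2"], "P3": ["T3"],
    "T1": ["P1", "P2"], "T2": ["P1", "P2"], "T3": ["P3"],
}
vectors = fast_rp(graph, dimension=50)
print(len(vectors["P1"]))
```

Because P1 and P2 share exactly the same neighbours in the toy graph, they end up with identical vectors, which is the sense in which the embedding keeps the topology.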

15
Summary
• The Graph Data Science library provides machine learning out of the box
• Graph embeddings (FastRP) represent heterogeneous data as vectors while preserving the graph topology
• New GDS pipeline for Node Classification
• GDS Community Detection

16
Thank you!
Contact me
alexander.jarasch@neo4j.com
