The power of the
Cognitive Probability Graph
(aka Cognitive Computing)
June 2016
Jans Aasman
ja@franz.com
10 years ago
Structured Data
7 years ago
Structured Data, Unstructured Data
4 to 5 years ago
Structured Data
Unstructured Data
Knowledge: Domain knowledge, Linked Open Data, Vocabularies, Taxonomies/Ontologies
New #1: Learning. Feed the output of data science back into the data infrastructure
Structured Data
Unstructured Data
Knowledge: Domain knowledge, Linked Open Data, Vocabularies, Taxonomies/Ontologies
Probabilistic Inferences
New #2: everything in one (distributed) semantic graph
Structured Data
Unstructured Data
Knowledge: Domain knowledge, Linked Open Data, Vocabularies, Taxonomies/Ontologies
Probabilistic Inferences
AKA: Cognitive Computing
Structured Data
Unstructured Data
Knowledge: Domain knowledge, Linked Open Data, Vocabularies, Taxonomies/Ontologies
Probabilistic Inferences
Examples
• Healthcare: if I have this class of diagnostics and I get this procedure, what new symptoms
might I get in the next two years?
• eCommerce and brand protection: find all my products based on product similarity.
• Logistics: what can I statistically predict about part P breaking down, and what other parts do I
usually buy after that part breaks down?
• Police intelligence: find the most plausible story as a temporally ordered shortest path between
two criminals through observed (hard) facts and inferred (soft) facts.
• Fraud detection: find links between your local chamber of commerce and the Panama Papers
through similar names and addresses.
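The fraud detection example depends on matching entities whose names are similar but not identical. A minimal sketch of that idea using Python's standard `difflib` module; the normalization rule, the function names, and the company names are assumptions for illustration, not the method used in the deck:

```python
import difflib
import re


def normalize(name: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", name.lower())
    return " ".join(cleaned.split())


def name_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two normalized names."""
    return difflib.SequenceMatcher(None, normalize(a), normalize(b)).ratio()


# Hypothetical names: the first pair differs only in case and punctuation.
same = name_similarity("ACME Holdings, Ltd.", "acme holdings ltd")
different = name_similarity("ACME Holdings Ltd", "Zebra Fruit Co")
```

Pairs scoring above a chosen threshold could then be stored as candidate links (soft facts) for an investigator to confirm.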
Example healthcare
• Franz and Montefiore are partners in the Semantic Data Lake project:
one cognitive computing platform for all healthcare analytics.
• We created a single data-centric platform that can serve any type of
analytics without building a new data mart for every new question.
• Currently 2.7 million patients with 10 years of data.
• All data is captured in a Unified Clinical Event model with 350 classes of
events.
Healthcare: structured and unstructured data
AllegroGraph - Cognitive Probability Graph webcast
Structured patient data combined with complex
integrated terminology
Provenance for every value
Healthcare: the knowledge bases
• More than 180 vocabularies and terminology systems integrated in one
unified terminology system (MeSH, SNOMED, UMLS, RxNorm, LOINC, etc.)
• External databases
• Linked Open Data
Everything linked through SKOS
[Diagram: the concept "Abdominal Pain" across several vocabularies (OMOP, LOINC, SNOMEDCT, MeSH, MedDRA, UMLS - MTH, UMLS - Semantic Net with Entity and Event), linked with skos:broader, skos:narrower, skos:semanticRelation, skos:exactMatch, skos:notation, skos-xl:prefLabel, skos-xl:label, rdfs:subClassOf and rdf:type. Notations shown for "Abdominal Pain" include "9209005", "M0024135" and "C0000737". Legend: SKOS/SKOS-XL ConceptScheme, Concept, Label; UMLS fields SAB, AUI, SUI, STR.]
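To illustrate how SKOS links let one term be found under the identifiers of several vocabularies, here is a minimal sketch over plain (subject, predicate, object) tuples. The three notations come from the slide; the `ex:` URIs, the function name, and the choice of which concepts are joined by `skos:exactMatch` are assumptions for illustration. In SKOS, `skos:exactMatch` is symmetric and transitive, so the sketch follows it in both directions.

```python
from collections import deque

# Toy triples. The ex: URIs and the exactMatch pairings are hypothetical.
TRIPLES = [
    ("ex:concept/9209005", "skos:notation", "9209005"),
    ("ex:concept/M0024135", "skos:notation", "M0024135"),
    ("ex:concept/C0000737", "skos:notation", "C0000737"),
    ("ex:concept/9209005", "skos:exactMatch", "ex:concept/M0024135"),
    ("ex:concept/M0024135", "skos:exactMatch", "ex:concept/C0000737"),
]


def equivalent_notations(concept: str) -> set:
    """Notations of every concept reachable from `concept` via skos:exactMatch."""
    seen = {concept}
    queue = deque([concept])
    while queue:
        current = queue.popleft()
        for s, p, o in TRIPLES:
            if p != "skos:exactMatch":
                continue
            # Follow the link in both directions (exactMatch is symmetric).
            neighbour = o if s == current else s if o == current else None
            if neighbour is not None and neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return {o for s, p, o in TRIPLES if p == "skos:notation" and s in seen}


notations = equivalent_notations("ex:concept/9209005")
```

In a real triple store the same traversal would normally be expressed as a query rather than a Python loop.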
SDL Paradigm: Analytic Tapestry (closed loop analytics)
[Diagram labels: Population, Community, Time, Pt. (patients), Diagnosis Codes, Disease Classification, OMIM, GONG, Genetic Profile, Procedure Code, HCPC, Manufacturers, PharmKGB, Drug Classification, Drug Codes, DrugBank, ClinicalTrials, CER, PubMed.]
Healthcare: probabilistic inferences
Why is this so important?
• Usually the output of data science results in reports and publications, but:
• No formal trace of where the data came from
• No formal link to the actual methods you used, who did it, or when you did it
• Cannot be compared to earlier results
• Cannot be used as building blocks for further research
• In general: the output is not queryable
• This is bad for delivery of care, reproducibility of research findings,
security, and compliance. It also results in loss of value-added information
and of enterprise intellectual property and assets, and in unnecessary
duplication of effort.
Odds ratio
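As a reminder of what this statistic measures: the odds ratio compares the odds of an outcome among patients with a condition to the odds among patients without it, using a 2x2 contingency table. A minimal sketch; the function name and the counts are made up for illustration and are not figures from the Semantic Data Lake:

```python
def odds_ratio(both: int, only_a: int, only_b: int, neither: int) -> float:
    """Odds ratio for a 2x2 table.

    both    : patients with condition A and outcome B
    only_a  : patients with A but not B
    only_b  : patients with B but not A
    neither : patients with neither
    """
    if only_a == 0 or only_b == 0:
        raise ValueError("odds ratio is undefined when an off-diagonal cell is zero")
    return (both * neither) / (only_a * only_b)


# Hypothetical counts: (30 * 180) / (70 * 20), about 3.86
example = odds_ratio(both=30, only_a=70, only_b=20, neither=180)
```

A value above 1 suggests the two occur together more often than expected; a value near 1 suggests no association.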
Association rules
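An association rule "A implies B" is usually scored by support, confidence, and lift. The sketch below computes all three over a list of per-patient item sets; the function name and the tiny data set are hypothetical and only illustrate the arithmetic:

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    a, c = frozenset(antecedent), frozenset(consequent)
    n = len(transactions)
    n_a = sum(1 for t in transactions if a <= t)         # contain the antecedent
    n_c = sum(1 for t in transactions if c <= t)         # contain the consequent
    n_ac = sum(1 for t in transactions if (a | c) <= t)  # contain both
    support = n_ac / n if n else 0.0
    confidence = n_ac / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift


# Hypothetical patients, each described by a set of observed conditions.
patients = [
    {"diabetes", "retinopathy"},
    {"diabetes", "retinopathy", "hypertension"},
    {"diabetes"},
    {"hypertension"},
    {"retinopathy"},
]
support, confidence, lift = rule_metrics(patients, {"diabetes"}, {"retinopathy"})
```

Here support is 2/5, confidence is 2/3, and lift is 10/9; a lift above 1 means the consequent is more likely when the antecedent is present.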
K-means clustering
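K-means assigns each item to the nearest of k centroids and then moves each centroid to the mean of its members, repeating until nothing changes. A minimal pure-Python sketch of that loop; the deterministic "first k points" initialization and the two-feature sample points are assumptions chosen to keep the example reproducible:

```python
import math


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; returns (centroids, assignment)."""
    centroids = [points[i] for i in range(k)]  # assumption: seed with first k points
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for every point.
        assignment = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                updated.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                updated.append(centroids[j])  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    return centroids, assignment


# Hypothetical items described by two numeric features.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.0), (1.0, 0.5), (9.0, 8.5)]
centroids, assignment = kmeans(points, k=2)
```

The resulting cluster labels could be written back into the graph as properties of each item, which is what makes them queryable alongside the other data.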
And then a query you could never do before
• Using the Knowledge Base, the Structured Data and the Probabilistic
inferences all at the same time.
• To find the statistical links between Diabetes and Vision problems in our
Semantic Data Lake:
• Find the set of ICD9s that are connected via one or more steps to
concepts in the KB that mention Diabetes
• Find the set of ICD9s that are connected via one or more steps to
vision* or eye* or retinal*
• And show how those two sets are related in the space of odds ratios
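The three steps of this query can be sketched over a toy in-memory graph. Everything in the data below is hypothetical: the `icd9:` and `kb:` identifiers, the labels, the links, and the odds ratio values are placeholders, and a real deployment would express this as a graph query rather than Python.

```python
import re

# Hypothetical knowledge base: node -> broader concepts, and concept labels.
LINKS = {
    "icd9:D1": ["kb:t2dm"],
    "kb:t2dm": ["kb:diabetes"],
    "icd9:V1": ["kb:retina"],
    "kb:retina": ["kb:eye"],
    "icd9:H1": ["kb:heart"],
}
LABELS = {
    "kb:t2dm": "Type 2 diabetes",
    "kb:diabetes": "Diabetes Mellitus",
    "kb:retina": "Retinal disease",
    "kb:eye": "Eye disorders",
    "kb:heart": "Heart disease",
}
# Hypothetical stored probabilistic inferences: (code, code) -> odds ratio.
ODDS = {("icd9:D1", "icd9:V1"): 3.2, ("icd9:D1", "icd9:H1"): 1.8}


def reachable(node):
    """All concepts connected to `node` via one or more steps."""
    seen, stack = set(), list(LINKS.get(node, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(LINKS.get(current, []))
    return seen


def codes_matching(pattern):
    """ICD9 codes with at least one reachable concept whose label matches."""
    rx = re.compile(pattern, re.IGNORECASE)
    return {
        n for n in LINKS
        if n.startswith("icd9:")
        and any(rx.search(LABELS.get(c, "")) for c in reachable(n))
    }


diabetes_codes = codes_matching(r"diabetes")
vision_codes = codes_matching(r"\b(vision|eye|retinal)")
related = {
    pair: value for pair, value in ODDS.items()
    if pair[0] in diabetes_codes and pair[1] in vision_codes
}
```

The point of the sketch is that the knowledge base (labels and links), the structured codes, and the stored odds ratios are all consulted in one pass.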
And then just a few other examples
In the ecommerce world: find similar objects based on more than 10
criteria, including description, product codes, pictures, etc.
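One simple way to combine several criteria into a single similarity score is a weighted average of per-criterion scores. The sketch below uses only three criteria; the field names, weights, and products are assumptions for illustration, and picture similarity is left out because it needs an image model:

```python
def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity between two descriptions."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def product_similarity(p: dict, q: dict, weights: dict) -> float:
    """Weighted average of per-criterion similarity scores in [0, 1]."""
    scores = {
        "description": jaccard(p["description"], q["description"]),
        "code": 1.0 if p["code"] == q["code"] else 0.0,
        "brand": 1.0 if p["brand"].lower() == q["brand"].lower() else 0.0,
    }
    total = sum(weights.values())
    return sum(weights[name] * scores[name] for name in weights) / total


# Hypothetical listings of what may be the same product.
listing_a = {"description": "red leather handbag", "code": "HB-100", "brand": "Acme"}
listing_b = {"description": "red leather bag", "code": "HB-100", "brand": "acme"}
weights = {"description": 2.0, "code": 1.0, "brand": 1.0}
score = product_similarity(listing_a, listing_b, weights)
```

With these inputs the description overlap is 0.5 and the other two criteria match, giving a combined score of 0.75.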
Returns a graph in a table (ughh)
But powerful when visualized
Or like this
And linking with the panama papers
And now the researchers can start investigating
Summary: this is the new paradigm of computing
Structured Data
Unstructured Data
Knowledge: Domain knowledge, Linked Open Data, Vocabularies, Taxonomies/Ontologies
Probabilistic Inferences
