1. CS520: KNOWLEDGE GRAPHS
Data Models, Knowledge Acquisition, Inference, Applications
Learn about the basic concepts,
latest research & applications
Lectures and Invited Guests
Spring 2021, Tu/Thu 4:30-5:50, cs520.Stanford.edu
3. Motivation for the Seminar
• Knowledge Graphs are being used in
• Web search
• Answering questions
• Data integration
• Knowledge Graphs are also the target output of
• NLP and computer vision algorithms
• ML algorithms more generally
• Knowledge Graphs are a topic of a major program from NSF
• https://www.nsf.gov/od/oia/convergence-accelerator/Award%20Listings/track-a.jsp
4. Seminar Outline
Knowledge Graph
• What is it?
• How do we create it?
• How do we reason with it?
• How do we use it with modern AI algorithms?
• Where is the research?
5. Course Design
• Two 80-minute sessions each week (Tue/Thu)
• Tuesday sessions are based on a synthesis of key points from the 2020 series
• The synthesis points are also available as written notes on the course website
• Some Tuesday sessions will also have invited guests
• Thursday sessions will feature invited guests
• (Generally) two 30-minute presentations
• Followed by Q & A
• Recordings will be available on the course website
6. For Stanford Students
• Complete a quiz for all 10 of the Tuesday sessions
• Submit a written summary for any 8 of the 10 Thursday sessions
8. Outline
• Knowledge Graph
• Resurgence of interest in Knowledge Graphs
• Search engines
• Data integration
• Artificial Intelligence
• What is new and different?
9. What is a Knowledge Graph?
A --B--> C
Directed Labeled Graph
Nodes and edges have well-defined meanings
10. What is a Knowledge Graph?
Subject --Predicate--> Object
Directed Labeled Graph
Nodes and edges have well-defined meanings
11. What is a Knowledge Graph?
Entity --Relation--> Entity
Directed Labeled Graph
Nodes and edges have well-defined meanings
12. What is a Knowledge Graph?
Class --subclass of--> Class
Directed Labeled Graph
Nodes and edges have well-defined meanings
13. What is a Knowledge Graph?
art --friends--> bob
Directed Labeled Graph
Nodes and edges have well-defined meanings
14. What is a Knowledge Graph?
Person --subclass of--> Human
Directed Labeled Graph
Nodes and edges have well-defined meanings
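The directed labeled graph on these slides can be sketched as a list of (subject, predicate, object) triples. This is a minimal illustration, not the course's implementation; the `Triple` type and the `outgoing` helper are hypothetical names, and the direction of each edge is an assumption read off the slide layout.

```python
from collections import namedtuple

# One edge of the graph: subject --predicate--> object.
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

# Edges from the slides; the edge directions are assumed.
graph = [
    Triple("art", "friends", "bob"),
    Triple("Person", "subclass of", "Human"),
]

def outgoing(graph, node):
    """Return the (predicate, object) pairs for edges leaving `node`."""
    return [(t.predicate, t.object) for t in graph if t.subject == node]

print(outgoing(graph, "art"))  # [('friends', 'bob')]
```

Storing edges as triples keeps the node and edge labels explicit, which is where their well-defined meanings attach.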
15. Different ways to define meaning
• Based on a user’s actions
• friend relationship
• Explanation in a human understandable language
• E.g., linguistic resource Wordnet
• Logical Specification
• Using a set of axioms
• Associating examples
• Defining a cat using a set of images
• Embeddings
• Statistics on a corpus of text
16. Rich History of work on Knowledge Graphs
• Knowledge Representation
• Semantic networks
• Description logics
• Conceptual graphs
• Database systems
• Network databases
• Triple stores
17. Outline
• Graphs in Computer Science
• Resurgence of interest in Knowledge Graphs
• Search engines
• Data integration
• Artificial Intelligence
• What is new and different?
18. Knowledge Graphs in Search
• The Winterthur example
• This example was introduced by Denny Vrandečić
• For more details
• Visit his Spring 2020 presentation
• A story linked to the course website
26. Problem
• Twin Towns and Sister Cities are identical concepts
• The reference to Winterthur on the Ontario page appears in a text description
• There is no easy way to resolve the differences
32. Graph Underlying Wikidata
[Figure: graph with nodes Winterthur, Ontario, Zurich Metropolitan Area, Switzerland, United States, and North America, connected by edges labeled "twinned administrative body", "part of", and "country"]
33. Graph Underlying Wikidata
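[Figure: the same graph with nodes Winterthur, Ontario, Zurich Metropolitan Area, Switzerland, United States, and North America, and edges labeled "twinned administrative body", "part of", and "country", plus a "same as" edge between Winterthur and a Winterthur node from the Library of Congress]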
35. We can also query the data
Display on a map the birth cities of people who died in Winterthur.
• Requires querying multiple data sources on the web
• Requires understanding their schemas
• Schemas published using Schema.Org vocabulary
• Structured results can then be included in the search results on the web pages
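The example query can be sketched over a small in-memory set of triples. Everything here is hypothetical: the people, cities, coordinates, predicate names, and helper functions are placeholders rather than real Wikidata facts or a real query API. The point is only that the query joins three kinds of edges (place of death, place of birth, coordinates).

```python
# Hypothetical toy triples; names and coordinates are placeholders.
triples = [
    ("person_1", "place of death", "Winterthur"),
    ("person_1", "place of birth", "city_a"),
    ("person_2", "place of death", "Winterthur"),
    ("person_2", "place of birth", "city_b"),
    ("person_3", "place of death", "city_a"),
    ("person_3", "place of birth", "city_c"),
    ("city_a", "coordinates", (1.0, 2.0)),
    ("city_b", "coordinates", (3.0, 4.0)),
    ("city_c", "coordinates", (5.0, 6.0)),
]

def objects(triples, subject, predicate):
    """All objects o such that (subject, predicate, o) is in the graph."""
    return [o for s, p, o in triples if s == subject and p == predicate]

def birth_cities_of_people_who_died_in(triples, place):
    """Map each birth city to its coordinates, for people who died in `place`."""
    died_there = [s for s, p, o in triples
                  if p == "place of death" and o == place]
    result = {}
    for person in died_there:
        for city in objects(triples, person, "place of birth"):
            for coords in objects(triples, city, "coordinates"):
                result[city] = coords
    return result

points = birth_cities_of_people_who_died_in(triples, "Winterthur")
print(points)  # {'city_a': (1.0, 2.0), 'city_b': (3.0, 4.0)}
```

The resulting city-to-coordinates mapping is what a map display would consume.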
39. Wikidata Knowledge Graph
• A graph of unprecedented scale
• Collaboratively created
• Data may be curated manually or automatically
• Semantic definitions in Schema.Org
• Compelling use case: Web Search
40. Outline
• Graphs in Computer Science
• Resurgence of interest in Knowledge Graphs
• Search engines
• Data integration
• Artificial Intelligence
• What is new and different?
41. Example Use Case
• 360 Degree View of a Customer
Internal company data
Who is funding whom? Who supplies whom? Who are my customers?
Risk Analysis for Lending Decisions
Business Intelligence for Marketing
42. Data Integration
• Data reside in multiple sources
• Company directory, product catalog, government database, weather report, …
• Answering queries requires combining data from multiple sources
• We need to provide translations of data between multiple sources
• Direct mappings
• Shared schema
43. Data Integration
• Schema-free approach to data integration
• Convert the relational data from multiple sources into triples
• Stored in a graph database
• Referred to as a knowledge graph
• Deal with schema mappings/translations on “pay as you go” basis
• Visualization
• Optimized for graph traversals
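The conversion of relational data into triples can be sketched as follows. This is one simple assumed convention (each row becomes a subject, each non-key column becomes a predicate); the `rows_to_triples` function, the table names, and the sample rows are all hypothetical.

```python
def rows_to_triples(table_name, rows, key):
    """Turn each row into triples: (table/key value, column, cell value)."""
    triples = []
    for row in rows:
        subject = f"{table_name}/{row[key]}"
        for column, value in row.items():
            # Skip the key column itself and empty cells.
            if column != key and value is not None:
                triples.append((subject, column, value))
    return triples

# Hypothetical rows from two separate sources.
customers = [{"id": 1, "name": "Acme", "city": "Zurich"}]
suppliers = [{"id": 7, "name": "Bolt Co", "supplies_to": "customer/1"}]

graph = (rows_to_triples("customer", customers, "id")
         + rows_to_triples("supplier", suppliers, "id"))
for triple in graph:
    print(triple)
```

Because both sources end up in the same triple format, they can be loaded into one graph first, and schema mappings between them can be added later as needed.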
44. Outline
• Graphs in Computer Science
• Resurgence of interest in Knowledge Graphs
• Search engines
• Data integration
• Artificial Intelligence
• What is new and different?
45. Artificial Intelligence
• Output representation for
• Natural Language Processing
• Computer Vision
• Input representation for machine learning
• Language Models
• Graph Models
47. Natural Language Processing
• Entity Extraction
Albert Einstein was a German-born theoretical physicist who developed the theory of relativity.
48. Natural Language Processing
• Entity Extraction
Albert Einstein was a German-born theoretical physicist who developed the theory of relativity.
• Relation Extraction
49. Natural Language Processing
• Entity Extraction
Albert Einstein was a German-born theoretical physicist who developed the theory of relativity.
• Relation Extraction
• Question Answering
• Common Sense Reasoning
53. Computer Vision
• Object Detection
• Edge Detection
[Figure: scene graph with nodes Man, glasses, bucket, and horse, and edges labeled "wearing", "feeding", "holding", and "eating from"]
Visual Question Answering
54. Input to Machine Learning
• Machine learning requires numerical input
• Symbolic inputs must be converted to numerical input
• A process known as embedding
• Word Embeddings
• Graph Embeddings
55. Word Embedding
• Primary use case is to calculate similarity between words
• “like” is similar to “enjoy”
• But, generally useful for a variety of language understanding tasks
• Key idea: capture the meaning of a word by counting how often it occurs next to other words
58. Word Embedding
Meaning of a word is captured by the vector corresponding to each row of co-occurrence counts
Word similarity can be calculated using the distance between the vectors
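The co-occurrence idea can be sketched in a few lines. This is an illustration under assumptions: the three-sentence corpus is made up, the window size of one word is an arbitrary choice, and cosine similarity is used as one common way to compare vectors (the slides speak only of the distance between vectors).

```python
import math
from collections import Counter, defaultdict

def cooccurrence_vectors(sentences, window=1):
    """For each word, count the words appearing within `window` positions."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for i, word in enumerate(words):
            lo, hi = max(0, i - window), min(len(words), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[word][words[j]] += 1
    return counts

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical toy corpus.
corpus = ["I like knowledge graphs", "I enjoy knowledge graphs", "graphs have nodes"]
vectors = cooccurrence_vectors(corpus)
print(cosine(vectors["like"], vectors["enjoy"]))  # close to 1: same neighbors
print(cosine(vectors["like"], vectors["nodes"]))  # 0.0: no shared neighbors
```

In this toy corpus "like" and "enjoy" have identical neighbors, so their rows of counts point in the same direction.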
59. Word Embedding
• A large-scale text corpus can have a billion or more words
• The storage requirement for the vectors blows up
• Dimensionality reduction (typically to around 200 dimensions)
• Linear algebra techniques (e.g., Singular Value Decomposition)
• Automatic learning of the necessary parameters
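One common way to carry out this reduction is a truncated Singular Value Decomposition of the co-occurrence matrix. The notation below is an assumption for illustration (X for the matrix of counts, n for the vocabulary size, k for the reduced dimension); the slides do not fix a notation.

```latex
\begin{align*}
X &= U \Sigma V^{\top}, & X &\in \mathbb{R}^{n \times n} \\
X &\approx U_k \Sigma_k V_k^{\top}, & k &\ll n \quad (\text{e.g., } k \approx 200) \\
W &= U_k \Sigma_k \in \mathbb{R}^{n \times k}, & \text{row } i \text{ of } W &= \text{vector for word } i
\end{align*}
```

Keeping only the k largest singular values replaces each n-dimensional row of counts with a k-dimensional vector while preserving as much of the matrix as a rank-k approximation can.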
61. Graph Embedding
• Application areas
• Recommendation engines
• Generalize what we did for word embeddings
• Goal is still to reduce the nodes to vectors so that we can calculate node similarity as the distance between the vectors
62. Word Embedding to Graph Embedding
• Word embeddings view the text as a linear graph
• Word prediction is an instance of the more general problem of link prediction
I --> like --> knowledge --> graphs --> .
63. Graph Embedding
• Example encoding function
• Randomly walk the graph
• Compute the co-occurrence counts between the nodes
• Once nodes have been converted into vectors
• calculate node similarity
• Optimize the encoding function
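The first two steps of the example encoding function (random walks, then co-occurrence counts) can be sketched as below. This is a simplified illustration, not a specific published algorithm: the function names, the walk parameters, and the toy graph are assumptions, and the final step of optimizing the encoding function is not shown.

```python
import random
from collections import Counter, defaultdict

def random_walk(adjacency, start, length, rng):
    """Walk up to `length` nodes, choosing a random neighbor at each step."""
    walk = [start]
    for _ in range(length - 1):
        neighbors = adjacency.get(walk[-1], [])
        if not neighbors:
            break
        walk.append(rng.choice(neighbors))
    return walk

def walk_cooccurrence(adjacency, walks_per_node=20, length=5, window=2, seed=0):
    """Count how often two nodes appear within `window` steps on a walk."""
    rng = random.Random(seed)
    counts = defaultdict(Counter)
    for node in adjacency:
        for _ in range(walks_per_node):
            walk = random_walk(adjacency, node, length, rng)
            for i, u in enumerate(walk):
                for v in walk[max(0, i - window): i + window + 1]:
                    if v != u:
                        counts[u][v] += 1
    return counts

# Hypothetical graph with two disconnected clusters.
adjacency = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"],
    "x": ["y"], "y": ["x"],
}
counts = walk_cooccurrence(adjacency)
print(counts["a"]["b"] > 0)   # nodes in the same cluster co-occur
print(counts["a"]["x"] == 0)  # nodes in different clusters never do
```

Each node's row of counts plays the same role as a word's row of co-occurrence counts, so the same similarity calculation applies.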
64. Knowledge Graphs and AI
• Output representation for
• Natural language processing
• Computer vision
• Input representation for machine learning
• Language models
• Graph models
65. Summary
• Graphs are a fundamental construct in discrete mathematics
• Defining meaning is the crux of the problem for knowledge graphs
• Rich history in knowledge representation and databases
• Recent surge of interest driven by
• Use of structured data in web search results
• Progress in NLP and vision
• Progress in ML to perform predictive tasks
• What’s new?
• Scale
• Bottom-up development
• Multiple modes of construction
66. What are Knowledge Graphs
and why do we need them?
Prof. Chaitanya Baru
National Science Foundation
Thursday, April 1, 2021