Yunyao Li
PAN-DL@EMNLP’23 | Adobe | December, 2023
The Role of Patterns in the Era of
Large Language Models
Initial Learnings from Constructing, Growing and Serving Large
Knowledge Graphs*
* Work done at IBM Research and Apple
yunyaol@adobe.com
@yunyao_li
Knowledge Bases
Image Source: https://www.csee.umbc.edu/courses/graduate/691/fall22/kg/
Example: Financial Content Knowledge Base
Financial
Reports
Ontology
[VLDB’2017] Creation and Interaction with Large-scale Domain-Specific Knowledge Bases.
XML
Knowledge
Extraction
Overall Architecture: A Simplified View
Linking
Fusion
KG Construction
Transforming
>31,000 companies
439 industries
~170,000 insiders
~100 million financial metrics
~22,000 industry KPIs
Financial Content KB
KG Services
QA
APIs
Example: Saga
Structured
Knowledge
Sources
Real-time
Sources
Ontology
Unstructured
Knowledge
Sources
Linking
Fusion
KG Construction KG
Knowledge
Extraction
KG Services
QA
Semantic Annotation
… …
Embedding Services
[SIGMOD’23] Growing and Serving Large Open-domain Knowledge Graphs.
[SIGMOD’22] Saga: A Platform for Continuous Construction and Serving of Knowledge at Scale
Transforming
Overall Architecture: A Simplified View
Key Components
of KG Construction, Growth, and Services
KG
QA Linking
Embedding … ….
Extraction
Integration
Inference Introspection
Services
Construction & Growth
Source A
PERSON: name “Connor McDavid”, dob “97/01/13”, place of birth → CITY
CITY: name “Richmond Hill”
Source B
HOCKEY_PLAYER: name “Connor McDavid”, bday “Jan 13”, goals “43”
After Linking and Fusion
ID1 (type PERSON, HOCKEY_PLAYER): name “Connor McDavid”, dob “January 13, 1997”, goals “43”, place of birth → CITY
CITY: name “Richmond Hill”
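A minimal sketch of the linking and fusion step on the two example records, assuming a toy linking rule (same name, same month and day of birth). The field names and the rule are illustrative assumptions, not the method used in Saga.

```python
from datetime import datetime

# Records mirroring Source A and Source B on the slide (field names are illustrative).
source_a = {
    "name": "Connor McDavid",
    "type": "PERSON",
    "dob": "97/01/13",
    "place_of_birth": {"name": "Richmond Hill", "type": "CITY"},
}
source_b = {"name": "Connor McDavid", "type": "HOCKEY_PLAYER", "bday": "Jan 13", "goals": "43"}


def parse_dob(text):
    """Parse a yy/mm/dd string such as '97/01/13'."""
    return datetime.strptime(text, "%y/%m/%d")


def is_same_entity(a, b):
    """Toy linking rule (an assumption): same name and same month/day of birth."""
    dob = parse_dob(a["dob"])
    return a["name"] == b["name"] and f"{dob:%b} {dob.day}" == b["bday"]


def fuse(a, b, entity_id):
    """Merge two linked records into one entity with a normalized date of birth."""
    dob = parse_dob(a["dob"])
    return {
        "id": entity_id,
        "name": a["name"],
        "types": sorted({a["type"], b["type"]}),
        "dob": f"{dob:%B} {dob.day}, {dob.year}",
        "place_of_birth": a["place_of_birth"],
        "goals": b["goals"],
    }


fused = fuse(source_a, source_b, "ID1") if is_same_entity(source_a, source_b) else None
```

The fused record keeps both types and the more complete date, as in the merged ID1 node.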
Entity Normalization & Variant Generation
Learning: Structured Representations
Capture Entity Semantic Structure
[COLING’2018] Exploiting Structure in Representation of Named Entities using Active Learning.
[ICDE’2018] LUSTRE: An Interactive System for Entity Structured Representation and Variant
Generation.
Generated normalizers for Watson Discovery
[AAAI’2020] PARTNER: Human-in-the-Loop Entity Name Understanding with Deep
Learning.
[EMNLP’2020] Learning Structured Representations of Entity Names using Active
Learning and Weak Supervision.
“Bank of America N.A.” → “Bank of America National Association”
Pattern-Based: Synthesizing
Normalization and Variant
Generation Functions
“97/01/13” → “January 13, 1997”
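A rough illustration of what a normalization function and a variant generator for entity names could look like. The cited systems learn or synthesize such functions; the hand-written `SUFFIX_EXPANSIONS` table here is only an assumption for the sketch.

```python
# Hypothetical abbreviation table; a real system would learn or synthesize it.
SUFFIX_EXPANSIONS = {
    "N.A.": "National Association",
    "Corp.": "Corporation",
    "Inc.": "Incorporated",
}


def normalize_name(name):
    """Expand known abbreviations token by token."""
    return " ".join(SUFFIX_EXPANSIONS.get(token, token) for token in name.split())


def generate_variants(name):
    """Return the normalized form plus abbreviated variants of its suffix."""
    canonical = normalize_name(name)
    variants = {canonical}
    for short, long in SUFFIX_EXPANSIONS.items():
        if canonical.endswith(long):
            variants.add(canonical[: -len(long)] + short)
    return variants
```

For example, `normalize_name("Bank of America N.A.")` yields the long form, and `generate_variants` returns both spellings.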
Graph Completion via Ontology Inference
KG
Ontology Inference Rules Updated KG
A has_mother B → B has_child A
A has_father B → B has_child A
A has_spouse B → B has_spouse A
A contains B → B is_part_of A
A has_child B ∧ A has_child C → B has_sibling C ∧ C has_sibling B
… …
Example Inference
Who’s Kylian Mbappé’s mother?
Source: https://en.wikipedia.org/wiki/Kylian_Mbappé
No information
about his mother
Example Inference
Who’s Kylian Mbappé’s mother?
Source: https://www.wikidata.org/wiki/Q45094361
A has_child B ∧ A is_a female → B has_mother A
Fayza Lamri has_child Kylian Mbappé
Fayza Lamri is_a female
Kylian Mbappé has_mother Fayza Lamri
Infer high-quality facts
at scale
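The inference rules in this section can be sketched as naive forward chaining over a set of triples. This is an illustrative toy, not the production inference engine; the requirement that siblings be distinct entities is an added assumption the rule as written does not state.

```python
INVERSE = {"has_mother": "has_child", "has_father": "has_child", "contains": "is_part_of"}
SYMMETRIC = {"has_spouse"}


def apply_rules(facts):
    """Apply every rule once and return all derivable triples."""
    derived = set()
    for s, p, o in facts:
        if p in INVERSE:
            derived.add((o, INVERSE[p], s))
        if p in SYMMETRIC:
            derived.add((o, p, s))
        if p == "has_child":
            # A has_child B and A is_a female -> B has_mother A
            if (s, "is_a", "female") in facts:
                derived.add((o, "has_mother", s))
            # A has_child B and A has_child C -> B has_sibling C (assuming B != C)
            for s2, p2, o2 in facts:
                if s2 == s and p2 == "has_child" and o2 != o:
                    derived.add((o, "has_sibling", o2))
    return derived


def infer(triples):
    """Repeat rule application until no new facts appear (a fixpoint)."""
    facts = set(triples)
    while True:
        new = apply_rules(facts) - facts
        if not new:
            return facts
        facts |= new


kg = {
    ("Fayza Lamri", "has_child", "Kylian Mbappé"),
    ("Fayza Lamri", "is_a", "female"),
}
updated_kg = infer(kg)
```

On the example above, the updated KG gains the triple (Kylian Mbappé, has_mother, Fayza Lamri).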
Fact Editing for LLM
Ontology-Guided Evaluation
Source: Evaluating the Ripple Effects of Knowledge Editing in Language Models https://arxiv.org/pdf/2307.12976.pdf
Scale Fact Collection
Missing / stale facts
Missing
Facts
Query
Synthesizer
QA System
candidate facts
Baseline
New
Facts
Query-by-Committee
Missing
Facts
Query
Synthesizer
QA System
candidate facts
New
Facts
QA System
Q1
QA System
… …
… …
…
Qn
QbC
Selector
AnswerSet1
AnswerSetn
[EMNLP-DaSH’2022] Improving Human Annotation Effectiveness for Fact Collection by Identifying the Most Relevant Answers
Success Rate
fact collection
25%
Open Domain Knowledge Extraction
[SIGMOD’23] Growing and Serving Large Open-domain Knowledge Graphs.
Throughput vs.
manual fact collection
>100x
Missing
Facts
Query
Synthesizer
Web Search
candidate facts w/ lower-confidence
New
Facts
Knowledge
Extractor
Fact
Corroboration
Extraction: Pattern vs. LLM
* All details simplified for presentation
If entity.type = “Person” and tuple.key = “Height”:
    Return height = extract(tuple.value, “\d?\.\d+\s*m”)
You are an accurate information extraction system responsible to find answers to a set of questions solely from a given passage.
For example
Now please work on the following task:
Questions: height
Passage:
Title: José Varela
Infobox properties:
{"Full name": "José Carlos Moreira Varela",
"Date of birth": "15 September 1997 (age 26)",
"Place of birth": "Praia, Cape Verde",
"Height": "1.68 m (5 ft 6 in)",
… …}
Key Value
Full name José Carlos Moreira Varela
Date of birth 15 September 1997 (age 26)
Place of birth Praia, Cape Verde
Height 1.68 m (5 ft 6 in)
… …
Key-Value Pair Extractor Height Extractor
Height = 1.68 m
Prompt
Pattern-based Extractors
Height = 1.68 m
LLM
LLM-based Extractor
Demonstrate Example
InfoBox
Content
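The pattern-based path on this slide (key-value pair extractor followed by a height extractor) can be sketched as follows. The regular expression restores the backslashes that the slide text appears to have lost; the function names are hypothetical.

```python
import re

# Assumed reconstruction of the slide's pattern: optional digit, dot, digits, "m".
HEIGHT_RE = re.compile(r"\d?\.\d+\s*m")


def key_value_pairs(infobox):
    """Key-value pair extractor: here the infobox is already a dict."""
    return list(infobox.items())


def extract_height(entity_type, pairs):
    """Height extractor: fires only for Person entities with a 'Height' key."""
    if entity_type != "Person":
        return None
    for key, value in pairs:
        if key == "Height":
            match = HEIGHT_RE.search(value)
            return match.group(0) if match else None
    return None


varela = {
    "Full name": "José Carlos Moreira Varela",
    "Date of birth": "15 September 1997 (age 26)",
    "Place of birth": "Praia, Cape Verde",
    "Height": "1.68 m (5 ft 6 in)",
}
height = extract_height("Person", key_value_pairs(varela))
```

When the infobox has no "Height" key, the extractor returns `None` rather than guessing, which is the behavior the pattern-based side of the comparison relies on.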
Extraction: Pattern vs. LLM
* All details simplified for presentation purpose
If entity.type = “Person” and tuple.key = “Height”:
    Return height = extract(tuple.value, “\d?\.\d+\s*m”)
Key Value
Born 5 September 1808, Calcutta
…
Died 30 May 1869 (aged 60) ..
Political Party Liberal Party.
Spouse Annie Henrietta Templer …
… …
Key-Value Pair Extractor Height Extractor
Height = null
Pattern-based Extractors
Height = 1.80 m
LLM
LLM-based Extractors
hallucination
You are an accurate information extraction system responsible to find answers to a set of questions solely from a given passage.
For example
Now please work on the following task:
Questions: height
Passage:
Title: Sir Arthur William Buller
Infobox properties:
{“Born": “5 September 1808”
“Calcutta, British India”
… …}
Demonstrate Example
InfoBox
Content
Prompt
Extraction: Pattern vs. LLM
* All details simplified for presentation purpose
If entity.type = “Person” and tuple.key = “Spouses”:
    Return spouse = extract(tuple.value, PersonNameRegex),
           start time = extract(tuple.value, StartTimeRegex),
           end time = extract(tuple.value, EndTimeRegex)
Key Value
Born Jacques Haussmann, …
Died October 31, 1958 (aged 86) …
Citizenship American
Education Clifton College
… …
Key-Value Pair Extractor Spouse Extractor
Pattern-based Extractors
Spouse = Zita Johann
Start time = 1929
End time = 1933
Spouse = Joan Courtney
Start time = 1952
End time = 1988
LLM-based Extractors
You are an accurate information extraction system responsible to find answers to a set of questions solely from a given passage.
For example
Now please work on the following task:
Questions: spouse
Passage:
Title: John Houseman
Infobox properties:
{“Born”: “Jacques Haussmann”
“September 22, 1902”
… …}
Prompt
LLM
Demonstrate Example
InfoBox
Content
Spouse = Zita Johann
Start time = 1929
End time = 1933
Incomplete
Extraction: Pattern vs. LLM
A Side-by-Side Comparison
                                      Pattern-based   LLM-based
Throughput                            High            Low
Quality of Results (Simple Cases)     High            High
Quality of Results (Complex Cases)    High            Medium
Development Effort (Simple Cases)     Medium          Medium
Development Effort (Complex Cases)    High            Low
Extraction: Pattern vs. LLM
Opportunity to Get the Best of Both Worlds
Source: Language Models Enable Simple Systems for Generating Structured Views of Heterogeneous Data Lakes https://arxiv.org/pdf/2304.09433.pdf
A recent example
Additional reading: Large Language Model Is Not a Good Few-shot Information Extractor, but a Good Reranker for Hard Samples! https://arxiv.org/abs/2303.08559
Multilingual Coverage of KG
[Figure: stacked bar chart, 0% to 100%, one bar per language. Values as extracted, two segments per bar: AR 36/64, DE 40/60, ES 63/37, FR 36/64, IT 34/66, JA 21/79, KO 24/76, RU 27/73, ZH 55/45]
Coverage of entity names (Wikidata)
Major gap exists
[EMNLP’23] Increasing Coverage and Precision of Textual Information in Multilingual Knowledge Graphs
Multilingual Knowledge Graph Enrichment
[Figure: the same KG shown twice, with each node labeled by the languages (EN, ES, IT, DE) available for it]
Before: Existing KG
After: Multilingually-enriched KG
M-NTA
Increasing multilingual coverage of locale-specific facts.
M-NTA | Multi-source Naturalization, Translation, and Alignment
Leverages complementary knowledge across locales and tools
Naturalization
triple-to-text
KG
Machine Translation
Web Search
LLMs
Alignment
text-to-triple
Ensemblement
Triple Selection
KG triples:
⟨Apple, is_a, fruit of the apple tree⟩
⟨Apple, is_a, American multinational technology company⟩
…
Naturalization (triple-to-text):
Apple is a fruit of the apple tree
Apple is an American multinational technology company
…
Outputs from Machine Translation, Web Search, and LLMs:
リンゴはリンゴの木の実です
りんごはりんごの木の実です
…
Alignment (text-to-triple) for ⟨Apple, is_a, fruit of the apple tree⟩, candidate counts:
リンゴ 6, りんご 4, 果実 1
Triple Selection keeps:
リンゴ 6, りんご 4
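One way to read the ensembling step is as vote counting over the names aligned from each source. The sketch below assumes a simple minimum-vote threshold (`min_votes`); the talk does not state the actual selection rule, so treat the threshold as a placeholder.

```python
from collections import Counter


def select_by_votes(aligned_names, min_votes=2):
    """Count how often each candidate name was aligned and keep the frequent ones.

    min_votes is an assumed threshold, not a value given in the talk.
    """
    counts = Counter(aligned_names)
    return {name: n for name, n in counts.items() if n >= min_votes}


# Candidate Japanese names for "Apple", with the counts shown on the slide.
candidates = ["リンゴ"] * 6 + ["りんご"] * 4 + ["果実"]
selected = select_by_votes(candidates)
```

With the slide's counts, the two common spellings survive and the single-vote candidate is dropped.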
[EMNLP’23] Increasing Coverage and Precision of Textual Information in Multilingual Knowledge Graphs
Improve Question Answering
Reduce the number of unanswerable queries
DE ES FR ZH JA
+12.1%
+14.4%
+13.4%
+26.9%
+18.1%
[EMNLP’23] Increasing Coverage and Precision of Textual Information in Multilingual Knowledge Graphs
MKQA Dataset 2
Dec. 9. Poster Session 4
Daniel Lee
Simone Conia
Introspection
Constraint Violation Detection
KG
Ontology Constraints Updated KG
Soft (violations are potential errors):
|date of birth| ≤ 1
|date of death| ≤ 1
…
Hard (violations are errors):
date of birth ≤ date of death
…
Introspection
Constraint Violation Detection
Source: https://www.wikidata.org/wiki/Q455611
Data issues: format + missing qualifier
Introspection
Constraint Violation Detection
Source: https://en.wikipedia.org/wiki/Plato
Date of birth:
• 428/427
• 424/423 BC
Date of Death:
348 BC
Extracted facts
- Two dates of birth: potential error
- Extracted date of birth is later than date of death (428/427 vs 348 BC): actual error
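A small sketch of soft and hard constraint checks on the Plato example. Years are encoded as signed integers with BC as negative, which is an assumption of this sketch; the first date of birth is stored as 428 because the extracted value lost its "BC" qualifier.

```python
def check_constraints(facts):
    """Return (potential_errors, errors) for one entity.

    facts maps a property name to a list of years (negative = BC, an assumed encoding).
    """
    potential_errors, errors = [], []
    # Soft constraints: at most one value per property.
    for prop in ("date of birth", "date of death"):
        if len(facts.get(prop, [])) > 1:
            potential_errors.append(f"|{prop}| > 1")
    # Hard constraint: date of birth must not be later than date of death.
    births = facts.get("date of birth", [])
    deaths = facts.get("date of death", [])
    if any(b > d for b in births for d in deaths):
        errors.append("date of birth > date of death")
    return potential_errors, errors


plato = {"date of birth": [428, -424], "date of death": [-348]}
potential_errors, errors = check_constraints(plato)
```

The soft check flags the two dates of birth for review, while the hard check reports the birth-after-death case as an error.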
[EMNLP’23] FLEEK: Factual Error Detection and Correction with Evidence Retrieved from External Knowledge
FLEEK
Factual Error Detection and Correction with Evidence Retrieved from External Knowledge
FLEEK | Demo
Dec. 8. Poster Session 1
Farima
Fatahi Bayat
Kun Qian
FLEEK
Factual Error Detection and Correction with Evidence Retrieved from External Knowledge
Input Text
Fact Extraction
text-to-triple
Question Generation
triple-to-question
Verification
Revision
Final Correction
Evidence Retrieval
[EMNLP’23] FLEEK: Factual Error Detection and Correction with Evidence Retrieved from External Knowledge
Conversational KG QA with LLM-generated Dialogs
Configurable Attributes
User Experience Level
Voice Interaction
Search Interaction
Metadata Level
Popularity Scores - Long Tail Entities
Timestamps
Conversation Level
Topic Exploration
Extend to Related Entities & Neighbors
Voice Assistant Questions
More well-formed questions,
with a small mix of queries
Disfluencies - yes
Deixis - yes
Web Search Queries
Often short queries, mimic
search engine interactions but
with follow-ups
Disfluencies - no
Deixis - yes
Typos - yes
Voice Assistant Questions
Disfluencies & Deixis
Question: Hmm, which languages does Karl Wolff use
Answer: German
Question: Could you please, um, inform me about his military branch
Answer: Waffen-SS
Question: Do you know which wars he was a part of
Answer: ['Italian campaign', 'World War II', 'World War I']
Question: Do you know his military ranks
Answer: ['Obergruppenführer', 'general']
Question: Do you know his date of birth
Answer: +1900-05-13
Question: Where was he born
Answer: Darmstadt
Question: Can you, uh, tell me when this military person died
Answer: +1984-07-15
Voice Assistant Questions with Related Entities
Question: Do you know any languages that Karl Wolff speaks
Answer: German
Question: Which military branch is he a part of
Answer: Waffen-SS
Question: Could you please, um, inform me about the wars he was involved in
Answer: ['Italian campaign', 'World War II', 'World War I']
Question: What about Sepp Dietrich
Answer: World War I
Question: Can you tell me, um, Karl's military rank
Answer: ['Obergruppenführer', 'general']
Question: How about Sep
Answer: SS-Oberst-Gruppenführer
Question: Can you, uh, tell me the birthplace of Karl
Answer: Darmstadt
Question: Ermm, what about Sepp
Answer: Hawangen
Primary Entity Related Entity
Web Search Queries — Short & Keyword-esque
Question: Karl Wolff country of citizenship
Answer: Germany
Question: wars involving him
Answer: ['Italian campaign', 'World War II', 'World War I']
Question: Also for Sepp Dietrich
Answer: World War I
Question: Karl place of birth
Answer: Darmstadt
Question: Answer for Sepp
Answer: Hawangen
Question: Karl died in
Answer: Rosenheim
Question: For Sepp
Answer: Ludwigsburg
Question: Karl military rank
Answer: ['Obergruppenführer', 'general']
Web Search Queries + Typos
Question: Kerl Wilff contry of citizenship
Answer: Germany
Question: wars involvng him
Answer: ['Italian campaign', 'World War II', 'World War I']
Question: Also fr Sepp Dietrich
Answer: World War I
Question: KJarl place of birth
Answer: Darmstadt
Question: Answer for Sep
Answer: Hawangen
Question: Karl died in
Answer: Rosenheim
Question: For Sepp
Answer: Ludwigsburg
Question: Karl mlitary rsnk
Answer: ['Obergruppenführer', 'general']
Dataset Statistics - Exhaustive & Evergrowing
Dataset                # Entities (# Conversations)   # Facts   # Questions Per Fact         # Unique Types   # Unique Predicates
General Set            29M                            196M      12 (Web) + 12 (Voice)        274              1252
Related Entities Set   210K                           6.1M      24 [+ 30 (RE Follow-Up)]     95               265
Evaluation - Effectiveness of LLMs on these conversations
General Subset
Model     Question Type Experience   Accuracy
GPT-3.5   Voice Assistant            25.9
GPT-4     Voice Assistant            32.4
GPT-3.5   Web Search                 28.6
GPT-4     Web Search                 35.7
Related Entity Subset
Model     Question Type Experience   Accuracy
GPT-3.5   Voice Assistant            37.7
GPT-4     Voice Assistant            44.4
GPT-3.5   Web Search                 38.7
GPT-4     Web Search                 46.7
Direct Triple Retrieval
Triple Retrieval
Direct Retrieval without Entity Linking
Triple Index
Query
• [S1, R1, O1]
• [S2, R2, O2]
• ……….
• [S100, R100, O100]
LLM
(Prompting for
Answer
Generation)
Answer
You are a question answering agent.
You will always provide short concise answers.
Based on the following evidence:
Fact 1: …..
Fact 2: …..
…
Fact N: …..
Answer the question using only the
evidence above:
Query
Query → BERT → $h_q$
Triple → BERT → $h_t$
$\mathrm{sim}(q, t) = h_q^{\top} h_t$
Can only work well for simple questions!
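A toy version of direct triple retrieval followed by prompt assembly. The three-dimensional vectors stand in for BERT encoder outputs and are made up for illustration; only the inner-product scoring and the prompt wording come from the slide.

```python
def dot(u, v):
    """Inner product, i.e. sim(q, t) = h_q^T h_t."""
    return sum(a * b for a, b in zip(u, v))


def retrieve(query_vec, index, k):
    """Return the k triples whose vectors score highest against the query."""
    ranked = sorted(index, key=lambda item: dot(query_vec, item[1]), reverse=True)
    return [triple for triple, _ in ranked[:k]]


def build_prompt(question, triples):
    """Fill the answer-generation prompt with the retrieved triples as evidence."""
    lines = [
        "You are a question answering agent.",
        "You will always provide short concise answers.",
        "Based on the following evidence:",
    ]
    lines += [f"Fact {i}: {s} | {p} | {o}" for i, (s, p, o) in enumerate(triples, 1)]
    lines += ["Answer the question using only the evidence above:", question]
    return "\n".join(lines)


# Hypothetical embeddings; a real system would encode triples with a trained model.
triple_index = [
    (("Karl Wolff", "place of birth", "Darmstadt"), [0.9, 0.1, 0.0]),
    (("Karl Wolff", "military branch", "Waffen-SS"), [0.1, 0.8, 0.2]),
    (("Sepp Dietrich", "place of birth", "Hawangen"), [0.6, 0.0, 0.5]),
]
query_vec = [1.0, 0.0, 0.1]
top = retrieve(query_vec, triple_index, k=2)
prompt = build_prompt("Where was Karl Wolff born", top)
```

Because no entity linking happens before retrieval, a triple about a different entity with a similar predicate can rank highly, which is one reason this setup suits only simple questions.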
Subgraph + Triple Retrieval (Ours)
• We consider two types of subgraphs:
• Cliques: Subgraph containing predicates of an entity.
• 2-hop subgraphs: Subgraph containing predicates of one
and two-hop entities together
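The two subgraph types can be sketched over a subject-keyed triple index. Reading a "clique" as all triples whose subject is the entity is an interpretation of the bullet above, not a definition given in the talk.

```python
from collections import defaultdict


def build_index(triples):
    """Group triples by subject entity."""
    index = defaultdict(list)
    for s, p, o in triples:
        index[s].append((s, p, o))
    return index


def clique(index, entity):
    """All triples of one entity (assumed reading of 'clique')."""
    return list(index.get(entity, []))


def two_hop_subgraph(index, entity):
    """Triples of the entity plus triples of the entities it points to."""
    subgraph = clique(index, entity)
    for _, _, neighbor in clique(index, entity):
        subgraph.extend(index.get(neighbor, []))
    return subgraph


triples = [
    ("Nirvana", "drummer", "Dave Grohl"),
    ("Dave Grohl", "place of birth", "Warren, Ohio"),
]
index = build_index(triples)
two_hop = two_hop_subgraph(index, "Nirvana")
```

The 2-hop subgraph for Nirvana already contains the birthplace triple that a multi-hop question about the drummer needs.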
Original Graph
Cliques 2-hop Subgraphs
Triple Selection
Direct Retrieval without Entity Linking
Subgraph
Index
Query
• [S1, R1, O1]
• [S2, R2, O2]
• …
• [S100, R100,
LLM
(Prompting for
Answer
Generation)
Answer
Subgraph Retrieval
Evaluation
System                               Accuracy
System A                             15.4
Direct Triple Retrieval              53.9
Subgraph + Triple Retrieval (Ours)   56.3
System     Dataset 1   Dataset 2
System A   17.2        21.1
Ours       24.1        26.3
Public Benchmark Internal Benchmark
Wins
Overcoming Intent Detection Errors
Query: how old was Ronald Reagan when he was
inaugurated president
System A Answer
Ronald Reagan died June 5,
2004 at age 93 in X.
Our Answer
Ronald Reagan was 69 years old
when he was inaugurated president.
Query: what movies were Bill Cosby and Sidney Poitier in?
System A Answer
<empty>
Our Answer
Bill Cosby and Sidney Poitier have been
in several movies together, including:
Uptown Saturday Night , Let's Do It
Again
Query: who is the female lead in the movie music man?
System A Answer
<empty>
Our Answer
The female lead in the movie Music
Man is Shirley Jones.
Handling Multi-Hop Queries
Query: The drummer for Nirvana was born in what city?
System A Answer
<empty>
Our Answer
The drummer for Nirvana, Dave Grohl,
was born in Warren, Ohio.
Entity Linking
Virginia Heffernan was born in Hanover
Theo Lingen was born in Hanover
Hanover, New
Hampshire
Hanover,
Germany
Online Entity Linking
Enabling better question answering
Offline Entity Linking
Enable Better Ranking and Search
“They bought tickets for Beyoncé; she’s performing at T-Mobile Park”
KG:123
KG Info
KG:345
KG Info
0.914
0.312
Entity Importance
Entity Importance
Entity Embedding Index
related entities (approximate nearest neighbor search)
"entity_name": "Beyonce Knowles",
"entity_types": [
"artist",
"human",
"writer"],
…
"entity_name": "T-Mobile Park",
"entity_types": [
"stadium",
"POI",
"location"],
…
Custom Configuration
Specify what types should be included and what should not be present
Improve linking quality
Include:
City, Natural Place, Landmark, National Park, …
Exclude:
Company, Hospitals, Person, …
Example:
For “weather” use cases
Custom Tag Configuration
Example Use Case: Weather
Weather in Obama
Source: duckduckgo.com
Won’t be considered with the configuration
Obama [Person]
Obama [City]
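A sketch of how an include/exclude type configuration could filter entity-linking candidates for the weather use case. The candidate structure and the function name are assumptions; the type lists are the ones shown on the configuration slide.

```python
WEATHER_INCLUDE = {"City", "Natural Place", "Landmark", "National Park"}
WEATHER_EXCLUDE = {"Company", "Hospitals", "Person"}


def filter_candidates(candidates, include, exclude):
    """Keep candidates that have an included type and no excluded type."""
    kept = []
    for candidate in candidates:
        types = set(candidate["types"])
        if types & exclude:
            continue
        if include and not types & include:
            continue
        kept.append(candidate)
    return kept


# Two hypothetical candidates for the mention "Obama" in "Weather in Obama".
candidates = [
    {"name": "Obama", "types": ["Person"]},
    {"name": "Obama", "types": ["City"]},
]
linked = filter_candidates(candidates, WEATHER_INCLUDE, WEATHER_EXCLUDE)
```

With this configuration only the city candidate remains, so the person reading is never considered.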
Fact Ranking & Related Entities
Embed entities / relations / queries in embedding space
Query processing = nearest neighbor search
Lady Gaga, occupation, ?
Related Entities
LLM
Entity Descriptions
Embeddings
Search
Query Logs
Entity Co-occurrence
Reranking Related Entities KV Store
KG
Example Use Case
Fact Ranking and Related Entities
Lady Gaga
Song: Shallow, Dance, Telephone, …
Album: The Fame, Artpop, Chromatica, …
Related Entities: Ariana Grande, Beyoncé, Bradley Cooper, …
Movie: House of Gucci, A Star Is Born, Sin City: A Dame to Kill For, …
Fact Ranking: Lady Gaga is first a musician then an actress
Relatedness:
Based on KG +
query log
LLMs vs. KGs
Source: Shirui Pan, et al. Unifying Large Language
Models and Knowledge Graphs: A Roadmap
https://arxiv.org/abs/2306.08302 Source: Link
Thanks!
IBM (including interns):
Shivakumar Vaithyanathan
Sriram Raghavan
Rajasekar Krishnamurthy
Lucian Popa
Ron Fagin
Fred Reiss
Laura Chiticariu
Mauricio Hernandez
Eser Kandogan
Huaiyu Zhu
Kun Qian
Dakuo Wang
Maeda Hanafi
Many amazing collaborators and interns …
Apple (including interns):
Ihab Ilyas
Theodoros Rekatsinas
Umar Farooq Minhas
Ali Mousavi
Jeffrey Pound
Anil Pacaci
Hongyu Ren
Kun Qian
Fei Wu
Simone Conia
Sha (Zoey) Li
Azadeh Nikfarjam
Yisi Sang
Saloni Potdar
Farima Fatahi Bayat … …
Universities:
Azza Abouzied (NYU-Abu Dhabi)
H. V. Jagadish (U. of Michigan)
Fei Xia (U. of Washington)
Kevin Chen-Chuan Chang (UIUC)
ChengXiang Zhai (UIUC)
Domenico Lembo (Sapienza University of Rome)
Dragomir R. Radev (Yale)
Jonathan K. Kummerfeld (U. of Michigan)
Toby Li (U. of Notre Dame)
Rishabh Iyer (UT Dallas)
Eduard C. Dragut (Temple Univ.) … ….
Douglas Burdick
Alan Akbik
Nancy Wang
Prithiviraj Sen
Marina Danilevsky
Poornima Chozhiyath Raman
Sudarshan Rangarajan
Ramiya Venkatachalam
Kiran Kate
Chenguang Wang
Ishan Jindal
Yiwei Yang
Nikita Bhutani … ….
© 2023 Adobe. All Rights Reserved. Adobe Confidential.
Unleashing
Creativity
Adobe
Creative Cloud
Accelerating Document
Productivity
Adobe
Document Cloud
Powering Digital
Businesses
Adobe
Experience Cloud
Adobe
Experience
Cloud
Adobe Experience Cloud: breadth of integrated applications
Marketing Planning & Workflow
Marketing system of record to
connect, collaborate and execute
the workflows required for
personalization at scale and
content supply chain
Adobe Experience Platform
Open, cloud-native platform transforming behavioral and transactional data into unified customer
profiles that update in real time and use AI-driven insights to help deliver the right experiences
across every channel
Customer Journeys
Real-time, omni-channel customer
and account-based journey
orchestration & campaign
execution
Content & Commerce
Content management and
commerce solutions for
personalized, multi-channel
experiences
Data Insights & Audiences
Omni-channel experience insights &
intelligence, including first-party data
management & activation for known
& unknown audiences

The Role of Patterns in the Era of Large Language Models

  • 1. Yunyao Li PAN-DL@EMNLP’23 | Adobe | December, 2023 The Role of Patterns in the Era of Large Language Models Initial Learnings from Constructing, Growing and Serving Large Knowledge Graphs* * Work done at IBM Research and Apple yunyaol@adobe.com @yunyao_li
  • 2. Knowledge Bases Image Source: https://www.csee.umbc.edu/courses/graduate/691/fall22/kg/
  • 3. Example: Financial Content Knowledge Base Financial Reports Ontology [VLDB’2017] Creation and Interaction with Large-scale Domain-Speci fi c Knowledge Bases. XML Knowledge Extraction Overall Architecture: A Simpli fi ed View Linking Fusion KG Construction Transforming >31,000 companies 439 industries ~170,000 insiders ~100 millions fi nancial metrics ~22,000 industry KPIs Financial Content KB KG Services QA APIs
  • 4. Example: Saga Structured Knowledge Sources Real-time Sources Ontology Unstructured Knowledge Sources Linking Fusion KG Construction KG Knowledge Extraction KG Services QA Semantic Annotation … … Embedding Services [SIGMOD’23] Growing and Serving Large Open-domain Knowledge Graphs. [SIGMOD’22] Saga: A Platform for Continuous Construction and Serving of Knowledge at Scale Transforming Overall Architecture: A Simpli fi ed View
  • 5. Key Components of KG Construction, Growth, and Services KG QA Linking Embedding … …. Extraction Integration Inference Introspection Services Construction & Growth
  • 6. Key Components of KG Construction, Growth, and Services KG QA Linking Embedding … …. Extraction Integration Inference Introspection Services Construction & Growth
  • 7. “Connor McDavid” name “Richmond Hill” name “97/01/13” dob place of birth CITY type PERSON type “Connor McDavid” name “Jan 13” bday goals HOCKEY_PLAYER type “43” Source A Source B
  • 8. “Connor McDavid” name “Richmond Hill” name “97/01/13” dob place of birth CITY type PERSON type “Connor McDavid” name “Jan 13” bday goals HOCKEY_PLAYER type “43” Source A Source B “Connor McDavid” name ID1 “Richmond Hill” name “January 13, 1997” dob place of birth CITY type PERSON type goals HOCKEY_PLAYER type “43” Linking Fusion
  • 9. Entity Normalization & Variant Generation Learning: Structured Representations Capture Entity Semantic Structure [COLING’2018] Exploiting Structure in Representation of Named Entities using Active Learning. [ICDE’2018] LUSTRE: An Interactive System for Entity Structured Representation and Variant Generation. Generated normalizers for Watson Discovery [AAAI’2020] PARTNER: Human-in-the-Loop Entity Name Understanding with Deep Learning. [EMNLP’2020] Learning Structured Representations of Entity Names using Active Learning and Weak Supervision. “Bank of America N.A.” “Bank of America National Association” Pattern-Based: Synthesizing Normalization and Variant Generation Functions “97/01/13” “January 13, 1997”
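The two pattern-based examples on this slide can be sketched in a few lines. This is only an illustration, not the LUSTRE or PARTNER implementation: the `YY_MM_DD` pattern, the two-digit-year `pivot`, and the `SUFFIX_VARIANTS` table are all assumptions introduced here.

```python
import re
from datetime import date

# Hypothetical pattern for two-digit year / month / day, as in "97/01/13".
YY_MM_DD = re.compile(r"^(\d{2})/(\d{2})/(\d{2})$")

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def normalize_date(text, pivot=30):
    """Return 'Month D, YYYY', or None when the pattern does not apply."""
    m = YY_MM_DD.match(text.strip())
    if m is None:
        return None
    yy, mm, dd = (int(g) for g in m.groups())
    # Assumption: two-digit years below `pivot` belong to the 2000s.
    year = 2000 + yy if yy < pivot else 1900 + yy
    try:
        d = date(year, mm, dd)
    except ValueError:  # e.g. month 13 or day 32
        return None
    return f"{MONTHS[d.month - 1]} {d.day}, {d.year}"


# Hypothetical suffix table used to generate a long-form name variant.
SUFFIX_VARIANTS = {"N.A.": "National Association"}


def expand_suffix(name):
    """Replace a known trailing abbreviation with its long form."""
    for short, long_form in SUFFIX_VARIANTS.items():
        if name.endswith(" " + short):
            return name[: -len(short)] + long_form
    return name


print(normalize_date("97/01/13"))
print(expand_suffix("Bank of America N.A."))
```

A value such as "Jan 13" does not match the pattern and yields `None`, which leaves the decision to a different normalizer or to a human reviewer.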
  • 10. Key Components of KG Construction, Growth, and Services KG QA Linking Embedding … …. Extraction Integration Inference Introspection Services Construction & Growth
  • 11. Graph Completion via Ontology Inference KG Ontology Inference Rules Updated KG. Rules: A has_mother B → B has_child A; A has_father B → B has_child A; A has_spouse B → B has_spouse A; A contains B → B is_part_of A; A has_child B ∧ A has_child C → B has_sibling C ∧ C has_sibling B; … …
  • 12. Example Inference Who’s Kylian Mbappé’s mother? Source: https://en.wikipedia.org/wiki/Kylian_Mbappé No information about his mother
  • 13. Example Inference Who’s Kylian Mbappé’s mother? Source: https://www.wikidata.org/wiki/Q45094361 Rule: A has_child B ∧ A is_a female → B has_mother A. Facts: Fayza Lamri has_child Kylian Mbappé; Fayza Lamri is_a female. Inferred: Kylian Mbappé has_mother Fayza Lamri. Infer high-quality facts at scale
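The rule on this slide can be applied mechanically to a set of triples. The sketch below is a minimal illustration over an in-memory set; the function name `infer_mothers` and the tuple layout `(subject, predicate, object)` are assumptions, and a production system would run such rules over the full graph rather than a Python set.

```python
def infer_mothers(triples):
    """Apply: A has_child B AND A is_a female -> B has_mother A.

    `triples` is a set of (subject, predicate, object) tuples.
    Only facts that are not already present are returned.
    """
    females = {s for (s, p, o) in triples if p == "is_a" and o == "female"}
    inferred = {
        (child, "has_mother", parent)
        for (parent, p, child) in triples
        if p == "has_child" and parent in females
    }
    return inferred - set(triples)


kg = {
    ("Fayza Lamri", "has_child", "Kylian Mbappé"),
    ("Fayza Lamri", "is_a", "female"),
}
new_facts = infer_mothers(kg)
print(new_facts)
```

Running the function a second time on `kg | new_facts` returns an empty set, so the rule can be re-applied safely after every update.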
  • 14. Fact Editing for LLM Ontology-Guided Evaluation Source: Evaluating the Ripple Effects of Knowledge Editing in Language Models https://arxiv.org/pdf/2307.12976.pdf
  • 15. Fact Editing for LLM Ontology-Guided Evaluation Source: Evaluating the Ripple Effects of Knowledge Editing in Language Models https://arxiv.org/pdf/2307.12976.pdf
  • 16. Key Components of KG Construction, Growth, and Services KG QA Linking Embedding … …. Extraction Integration Inference Introspection Services Construction & Growth
  • 17. Scale Fact Collection Missing / stale facts Missing Facts Query Synthesizer QA System candidate facts Baseline New Facts
  • 18. Scale Fact Collection Missing / stale facts Missing Facts Query Synthesizer QA System candidate facts Baseline New Facts Query-by-Committee Missing Facts Query Synthesizer QA System candidate facts New Facts QA System Q1 QA System … … … … … Qn QbC Selector AnswerSet1 AnswerSetn [EMNLP-DaSH’2022] Improving Human Annotation Effectiveness for Fact Collection by Identifying the Most Relevant Answers Success Rate fact collection 25%
  • 19. Scale Fact Collection Missing / stale facts Missing Facts Query Synthesizer QA System candidate facts Baseline New Facts Open Domain Knowledge Extraction [SIGMOD’23] Growing and Serving Large Open-domain Knowledge Graphs. Throughput vs. manual fact collection >100x Missing Facts Query Synthesizer Web Search candidate facts w/ lower-confidence New Facts Knowledge Extractor Fact Corroboration
  • 20. Extraction: Pattern vs. LLM * All details simplified for presentation If entity.type = “Person” And If tuple.key = “Height” Return height = extract(tuple.value, “\d?.\d+ \s*m”) You are an accurate information extraction system responsible to find answers to a set of questions solely from a given passage. For example Now please work on the following task: Questions: height Passage: Title: José Varela Infobox properties: {“Full name”: “José Carlos Moreira Varela” “Date of birth”: “15 September 1997 (age 26)” “Place of birth”: “Praia, Cape Verde” “Height”: “1.68 m (5 ft 6 in)” … …} Key Value Full name José Carlos Moreira Varela Date of birth 15 September 1997 (age 26) Place of birth Praia, Cape Verde Height 1.68 m (5 ft 6 in) … … Key-Value Pair Extractor Height Extractor Height = 1.68 m Prompt Pattern-based Extractors Height = 1.68 m LLM LLM-based Extractor Demonstrate Example InfoBox Content
  • 21. Extraction: Pattern vs. LLM * All details simplified for presentation purpose If entity.type = “Person” And If tuple.key = “Height” Return height = extract(tuple.value, “\d?.\d+ \s*m”) Key Value Born 5 September 1808, Calcutta … Died 30 May 1869 (aged 60) .. Political Party Liberal Party. Spouse Annie Henrietta Templer … … … Key-Value Pair Extractor Height Extractor Height = null Pattern-based Extractors Height = 1.80 m LLM LLM-based Extractors hallucination You are an accurate information extraction system responsible to find answers to a set of questions solely from a given passage. For example Now please work on the following task: Questions: height Passage: Title: Sir Arthur William Buller Infobox properties: {“Born”: “5 September 1808” “Calcutta, British India” … …} Demonstrate Example InfoBox Content Prompt
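The pattern-based height extractor from the last two slides can be written out as runnable code. This is a sketch under assumptions: the regular expression is one plausible reading of the slide's simplified pattern (with the dot escaped), and the function name `extract_height` and the dictionary form of the infobox are introduced here for illustration.

```python
import re

# Assumed reading of the slide's pattern: optional digit, a literal dot,
# one or more digits, optional whitespace, then "m".
HEIGHT_RE = re.compile(r"\d?\.\d+\s*m")


def extract_height(entity_type, infobox):
    """Return the height string, or None when no Height key or no match."""
    if entity_type != "Person":
        return None
    value = infobox.get("Height")
    if value is None:
        return None
    match = HEIGHT_RE.search(value)
    return match.group(0) if match else None


varela = {
    "Full name": "José Carlos Moreira Varela",
    "Date of birth": "15 September 1997 (age 26)",
    "Place of birth": "Praia, Cape Verde",
    "Height": "1.68 m (5 ft 6 in)",
}
buller = {
    "Born": "5 September 1808, Calcutta",
    "Political Party": "Liberal Party",
}

print(extract_height("Person", varela))
print(extract_height("Person", buller))
```

For the second infobox the extractor returns `None` because there is no Height key to read, which is the behavior the slide contrasts with the hallucinated LLM answer.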
  • 22. Extraction: Pattern vs. LLM * All details simplified for presentation purpose If entity.type = “Person” And If tuple.key = “Spouses” Return spouse = extract(tuple.value, PersonNameRegex), start time = extract(tuple.value, StartTimeRegex), end time = extract(tuple.value, EndTimeRegex) Key Value Born Jacques Haussmann, … Died October 31, 1958 (aged 86) … Citizenship American Education Clifton College … … Key-Value Pair Extractor Spouse Extractor Pattern-based Extractors Spouse = Zita Johann Start time = 1929 End time = 1933 Spouse = Joan Courtney Start time = 1952 End time = 1988 LLM-based Extractors You are an accurate information extraction system responsible to find answers to a set of questions solely from a given passage. For example Now please work on the following task: Questions: spouse Passage: Title: John Houseman Infobox properties: {“Born”: “Jacques Haussmann” “September 22, 1902” … …} Prompt LLM Demonstrate Example InfoBox Content Spouse = Zita Johann Start time = 1929 End time = 1933 Incomplete
  • 23. Extraction: Pattern vs. LLM A Side-by-Side Comparison Pattern-based LLM-based Throughput Quality of Results Simple Cases Complex Cases Development Effort Simple Cases Complex Cases High Low High High High Medium Medium Medium High Low
  • 24. Extraction: Pattern vs. LLM Opportunity to Get the Best of Both Worlds Source: Language Models Enable Simple Systems for Generating Structured Views of Heterogeneous Data Lakes https://arxiv.org/pdf/2304.09433.pdf A recent example Additional reading: Large Language Model Is Not a Good Few-shot Information Extractor, but a Good Reranker for Hard Samples! https://arxiv.org/abs/2303.08559
  • 25. Multilingual Coverage of KG [Figure: stacked bar chart, 0% to 100%, titled "Coverage of entity names (Wikidata)"; the two series per language are AR 36 / 64, DE 40 / 60, ES 63 / 37, FR 36 / 64, IT 34 / 66, JA 21 / 79, KO 24 / 76, RU 27 / 73, ZH 55 / 45] Major gap exists [EMNLP’23] Increasing Coverage and Precision of Textual Information in Multilingual Knowledge Graphs
  • 26. [EMNLP’23] Increasing Coverage and Precision of Textual Information in Multilingual Knowledge Graphs Multilingual Knowledge Graph Enrichment [Figure: "Before: Existing KG" and "After: Multilingually-enriched KG", connected by M-NTA; graph nodes are labeled with language codes EN, ES, IT, DE] Increasing multilingual coverage of locale-specific facts.
  • 27. M-NTA | Multi-source Naturalization, Translation, and Alignment Leverages complementary knowledge across locales and tools. Stages shown: Naturalization (triple-to-text); KG; Machine Translation, Web Search, LLMs; Alignment (text-to-triple); Ensemblement (Triple Selection). Input triples: ⟨Apple, is_a, fruit of the apple tree⟩, ⟨Apple, is_a, American multinational technology company⟩, … Naturalized text: "Apple is a fruit of the apple tree", "Apple is an American multinational technology company", … Translations: リンゴはリンゴの木の実です, りんごはりんごの木の実です, … Aligned: ⟨Apple, is_a, fruit of the apple tree⟩ with リンゴはリンゴの木の実です, りんごはりんごの木の実です, … Candidate counts: リンゴ 6, りんご 4, 果実 1. Selected: リンゴ 6, りんご 4. [EMNLP’23] Increasing Coverage and Precision of Textual Information in Multilingual Knowledge Graphs
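The final selection step on this slide keeps the candidates that several sources agree on (リンゴ with 6 and りんご with 4) and drops the one seen only once (果実). A simple vote count reproduces that outcome. This is a hedged sketch, not the selection procedure of the M-NTA paper: the function name `select_by_votes` and the `min_votes` threshold are assumptions chosen to match the numbers on the slide.

```python
from collections import Counter


def select_by_votes(candidates, min_votes=2):
    """Keep candidates proposed at least `min_votes` times, most frequent first."""
    counts = Counter(candidates)
    return [(text, n) for text, n in counts.most_common() if n >= min_votes]


# One entry per (source, tool) proposal; the counts mirror the slide.
candidates = ["リンゴ"] * 6 + ["りんご"] * 4 + ["果実"]
selected = select_by_votes(candidates)
print(selected)
```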
  • 28. Improve Question Answering Reduce the number of unanswerable queries DE ES FR ZH JA +12.1% +14.4% +13.4% +26.9% +18.1% [EMNLP’23] Increasing Coverage and Precision of Textual Information in Multilingual Knowledge Graphs MKQA Dataset 2 Dec. 9. Poster Session 4 Daniel Lee Simone Conia
  • 29. Key Components of KG Construction, Growth, and Services KG QA Linking Embedding … …. Extraction Integration Inference Introspection Services Construction & Growth
  • 30. Introspection Constraint Violation Detection KG Ontology Constraints Updated KG. Soft constraints: |date of birth| ≤ 1, |date of death| ≤ 1, … (violations are potential errors). Hard constraints: date of birth ≤ date of death, … (violations are errors).
  • 31. Introspection Constraint Violation Detection Source: https://www.wikidata.org/wiki/Q455611 Data issues: format + missing qualifier
  • 32. Introspection Constraint Violation Detection Source: https://en.wikipedia.org/wiki/Plato Date of birth: 428/427 or 424/423 BC. Date of death: 348 BC. Potential error: the extracted facts contain two dates of birth. Actual error: the extracted date of birth is later than the date of death (428/427 vs 348 BC).
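The soft and hard constraints from the introspection slides can be checked with a few lines of code. The sketch below assumes dates are reduced to signed years (negative for BC) and that the first birth year lost its BC qualifier during extraction, which is one way to read the "428/427 vs 348 BC" example; the function name `check_entity` is hypothetical.

```python
def check_entity(facts):
    """Check one entity against the two kinds of constraints.

    `facts` maps a predicate to a list of signed years (negative = BC).
    Returns (potential_errors, errors).
    """
    potential, errors = [], []
    # Soft constraints: at most one value per predicate.
    for pred in ("date of birth", "date of death"):
        if len(facts.get(pred, [])) > 1:
            potential.append(f"|{pred}| > 1")
    # Hard constraint: date of birth <= date of death.
    for born in facts.get("date of birth", []):
        for died in facts.get("date of death", []):
            if born > died:
                errors.append(f"date of birth {born} > date of death {died}")
    return potential, errors


# Assumption: 428 was extracted without its BC qualifier; -424 kept it.
plato = {"date of birth": [428, -424], "date of death": [-348]}
potential, errors = check_entity(plato)
print(potential)
print(errors)
```

Soft violations are only flagged for review, since some entities legitimately have disputed dates, while a hard violation indicates that at least one extracted value is wrong.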
  • 33. Key Components of KG Construction, Growth, and Services KG QA Linking Embedding … …. Extraction Integration Inference Introspection Services Construction & Growth
  • 34. [EMNLP’23] FLEEK: Factual Error Detection and Correction with Evidence Retrieved from External Knowledge FLEEK Factual Error Detection and Correction with Evidence Retrieved from External Knowledge
  • 35. FLEEK | Demo Dec. 8. Poster Session 1 Farima Fatahi Bayat Kun Qian
  • 36. FLEEK Factual Error Detection and Correction with Evidence Retrieved from External Knowledge Input Text Fact Extraction text-to-triple Question Generation triple-to-question Verification Revision Final Correction Evidence Retrieval [EMNLP’23] FLEEK: Factual Error Detection and Correction with Evidence Retrieved from External Knowledge
  • 37. Conversational KG QA with LLM-generated Dialogs
  • 38. Configurable Attributes User Experience Level Voice Interaction Search Interaction Metadata Level Popularity Scores - Long Tail Entities Timestamps Conversation Level Topic Exploration Extend to Related Entities & Neighbors
  • 39. Voice Assistant Questions More well-formed questions, with a small mix of queries Disfluencies - yes Deixis - yes Web Search Queries Often short queries, mimic search engine interactions but with follow-ups Disfluencies - no Deixis - yes Typos - yes
  • 40. Voice Assistant Questions Disfluencies & Deixis Question: Hmm, which languages does Karl Wolff use Answer: German Question: Could you please, um, inform me about his military branch Answer: Waffen-SS Question: Do you know which wars he was a part of Answer: ['Italian campaign', 'World War II', 'World War I'] Question: Do you know his military ranks Answer: ['Obergruppenführer', 'general'] Question: Do you know his date of birth Answer: +1900-05-13 Question: Where was he born Answer: Darmstadt Question: Can you, uh, tell me when this military person died Answer: +1984-07-15
  • 41. Voice Assistant Questions with Related Entities Question: Do you know any languages that Karl Wolff speaks Answer: German Question: Which military branch is he a part of Answer: Waffen-SS Question: Could you please, um, inform me about the wars he was involved in Answer: ['Italian campaign', 'World War II', 'World War I'] Question: What about Sepp Dietrich Answer: World War I Question: Can you tell me, um, Karl's military rank Answer: ['Obergruppenführer', 'general'] Question: How about Sep Answer: SS-Oberst-Gruppenführer Question: Can you, uh, tell me the birthplace of Karl Answer: Darmstadt Question: Ermm, what about Sepp Answer: Hawangen Primary Entity Related Entity
  • 42. Web Search Queries — Short & Keyword-esque Question: Karl Wolff country of citizenship Answer: Germany Question: wars involving him Answer: ['Italian campaign', 'World War II', 'World War I'] Question: Also for Sepp Dietrich Answer: World War I Question: Karl place of birth Answer: Darmstadt Question: Answer for Sepp Answer: Hawangen Question: Karl died in Answer: Rosenheim Question: For Sepp Answer: Ludwigsburg Question: Karl military rank Answer: ['Obergruppenführer', 'general']
  • 43. Web Search Queries + Typos Question: Kerl Wilff contry of citizenship Answer: Germany Question: wars involvng him Answer: ['Italian campaign', 'World War II', 'World War I'] Question: Also fr Sepp Dietrich Answer: World War I Question: KJarl place of birth Answer: Darmstadt Question: Answer for Sep Answer: Hawangen Question: Karl died in Answer: Rosenheim Question: For Sepp Answer: Ludwigsburg Question: Karl mlitary rsnk Answer: ['Obergruppenführer', 'general']
  • 44. Dataset Statistics - Exhaustive & Evergrowing. Columns: Dataset | # Entities (# Conversations) | # Facts | # Questions Per Fact | # Unique Types | # Unique Predicates. General Set | 29M | 196M | 12 (Web) + 12 (Voice) | 274 | 1252. Related Entities Set | 210K | 6.1M | 24 [+ 30 (RE Follow-Up)] | 95 | 265.
  • 45. Evaluation - Effectiveness of LLMs on these conversations. Columns: Model | Question Type Experience | Accuracy. General Subset: GPT-3.5 | Voice Assistant | 25.9; GPT-4 | Voice Assistant | 32.4; GPT-3.5 | Web Search | 28.6; GPT-4 | Web Search | 35.7. Related Entity Subset: GPT-3.5 | Voice Assistant | 37.7; GPT-4 | Voice Assistant | 44.4; GPT-3.5 | Web Search | 38.7; GPT-4 | Web Search | 46.7.
  • 46. Direct Triple Retrieval. Direct Retrieval without Entity Linking: Query, Triple Retrieval over a Triple Index, retrieved triples [S1, R1, O1], [S2, R2, O2], ………., [S100, R100, O100], LLM (Prompting for Answer Generation), Answer. Prompt: "You are a question answering agent. You will always provide short concise answers. Based on the following evidence: Fact 1: ….. Fact 2: ….. … Fact N: ….. Answer the question using only the evidence above: Query". Scoring: the query and each triple are encoded with BERT into h_q and h_t, and sim(q, t) = h_q^T h_t. Can only work well for simple questions!
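The retrieve-then-prompt flow on this slide can be sketched without any model dependencies. In the sketch below the vectors are hand-written stand-ins for BERT encodings, so only the dot-product scoring and the prompt assembly follow the slide; the names `top_k` and `build_prompt`, the toy index, and the three-dimensional vectors are assumptions.

```python
def dot(u, v):
    """sim(q, t) = h_q^T h_t for two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def top_k(query_vec, triple_index, k=2):
    """Return the k triples whose vectors score highest against the query."""
    scored = [(dot(query_vec, vec), triple) for triple, vec in triple_index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [triple for _, triple in scored[:k]]


def build_prompt(query, triples):
    """Assemble the answer-generation prompt in the shape shown on the slide."""
    lines = ["You are a question answering agent. "
             "You will always provide short concise answers.",
             "Based on the following evidence:"]
    for i, (s, r, o) in enumerate(triples, start=1):
        lines.append(f"Fact {i}: {s} {r} {o}")
    lines.append(f"Answer the question using only the evidence above: {query}")
    return "\n".join(lines)


# Toy index: (triple, vector). Real vectors would come from an encoder.
index = [
    (("Karl Wolff", "place of birth", "Darmstadt"), [0.9, 0.1, 0.0]),
    (("Karl Wolff", "military branch", "Waffen-SS"), [0.1, 0.9, 0.0]),
    (("Sepp Dietrich", "place of birth", "Hawangen"), [0.6, 0.0, 0.8]),
]
query_vec = [1.0, 0.0, 0.1]  # stand-in encoding of the query below
hits = top_k(query_vec, index, k=2)
prompt = build_prompt("Where was Karl Wolff born", hits)
print(prompt)
```

At scale the exhaustive scan in `top_k` would be replaced by an approximate nearest neighbor index, but the scoring function stays the same.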
  • 47. Subgraph + Triple Retrieval (Ours) • We consider two types of subgraphs: • Cliques: Subgraph containing predicates of an entity. • 2-hop subgraphs: Subgraph containing predicates of one and two-hop entities together Original Graph Cliques 2-hop Subgraphs Triple Selection Direct Retrieval without Entity Linking Subgraph Index Query • [S1, R1, O1] • [S2, R2, O2] • … • [S100, R100, LLM (Prompting for Answer Generation) Answer Subgraph Retrieval
  • 48. Evaluation. Public Benchmark (System | Accuracy): System A | 15.4; Direct Triple Retrieval | 53.9; Subgraph + Triple Retrieval (Ours) | 56.3. Internal Benchmark (System | Dataset 1 | Dataset 2): System A | 17.2 | 21.1; Ours | 24.1 | 26.3.
  • 49. Wins. Overcoming Intent Detection Errors: Query: how old was Ronald Reagan when he was inaugurated president. System A Answer: Ronald Reagan died June 5, 2004 at age 93 in X. Our Answer: Ronald Reagan was 69 years old when he was inaugurated president. Query: what movies were Bill Cosby and Sidney Poitier in? System A Answer: <empty>. Our Answer: Bill Cosby and Sidney Poitier have been in several movies together, including: Uptown Saturday Night, Let's Do It Again. Query: who is the female lead in the movie music man? System A Answer: <empty>. Our Answer: The female lead in the movie Music Man is Shirley Jones. Handling Multi-Hop Queries: Query: The drummer for Nirvana was born in what city? System A Answer: <empty>. Our Answer: The drummer for Nirvana, Dave Grohl, was born in Warren, Ohio.
  • 50. Key Components of KG Construction, Growth, and Services KG QA Linking Embedding … …. Extraction Integration Inference Introspection Services Construction & Growth
  • 51. Entity Linking Virginia Heffernan was born in Hanover Theo Lingen was born in Hanover Hanover, New Hampshire Hanover, Germany
  • 52. Online Entity Linking Enabling better question answering
  • 53. Offline Entity Linking Enable Better Ranking and Search “They bought tickets for Beyoncé; she’s performing at T-Mobile Park” KG:123 KG Info KG:345 KG Info 0.914 0.312 Entity Importance Entity Importance Entity Embedding Index related entities (approximate nearest neighbor search) "entity_name": "Beyonce Knowles", "entity_types": ["artist", "human", "writer"], … "entity_name": "T-Mobile Park", "entity_types": ["stadium", "POI", "location"], …
  • 54. Custom Configuration Specify what types should be included and what should not be present Improve linking quality Include: City, Natural Place, Landmark, National Park, … Exclude: Company, Hospitals, Person, … Example: For “weather” use cases
  • 55. Custom Tag Configuration Example Use Case: Weather Weather in Obama Source: duckduckgo.com Won’t be considered with the configuration Obama [Person] Obama [City]
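The include/exclude configuration from these two slides amounts to a type filter over linking candidates. The sketch below is illustrative only: the `TypeConfig` class, its `allows` method, and the candidate tuples are assumptions, while the type names come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypeConfig:
    """Entity types a use case accepts and rejects."""
    include: frozenset
    exclude: frozenset

    def allows(self, entity_types):
        """True if some type is included and no type is excluded."""
        types = set(entity_types)
        return bool(types & self.include) and not (types & self.exclude)


# Configuration for the "weather" use case, using the types listed on the slide.
WEATHER = TypeConfig(
    include=frozenset({"City", "Natural Place", "Landmark", "National Park"}),
    exclude=frozenset({"Company", "Hospitals", "Person"}),
)

# Candidate entities for the mention "Obama" in "Weather in Obama".
candidates = [("Obama", ["Person"]), ("Obama", ["City"])]
kept = [c for c in candidates if WEATHER.allows(c[1])]
print(kept)
```

Filtering before ranking means the person entity never competes with the city, regardless of how popular it is.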
  • 56. Key Components of KG Construction, Growth, and Services KG QA Linking Embedding … …. Extraction Integration Inference Introspection Services Construction & Growth
  • 57. Fact Ranking & Related Entities Embed entities / relations / queries in embedding space Query processing = nearest neighbor search Lady Gaga, occupation, ?
  • 58. Related Entities LLM Entity Descriptions Embeddings Search Query Logs Entity Co-occurrence Reranking Related Entities KV Store KG
  • 59. Example Use Case Fact Ranking and Related Entities Lady Gaga Song Album Related Entities Movie Shadow … Dance Telephone The Frame Artpop Chromatica … Ariana Grande Beyoncé Bradley Cooper … House of Gucci A Star Is Born Sin City: A Dame to Kill For … Fact Ranking: Lady Gaga is first a musician then an actress
  • 60. Example Use Case Fact Ranking and Related Entities Lady Gaga Song Album Related Entities Movie Shadow … Dance Telephone The Frame Artpop Chromatica … Ariana Grande Beyoncé Bradley Cooper … House of Gucci A Star Is Born Sin City: A Dame to Kill For … Relatedness: Based on KG + query log
  • 61. Key Components of KG Construction, Growth, and Services KG QA Linking Embedding … …. Extraction Integration Inference Introspection Construction & Growth Services
  • 62. LLMs vs. KGs Source: Shirui Pan, et al. Unifying Large Language Models and Knowledge Graphs: A Roadmap https://arxiv.org/abs/2306.08302
  • 63. Thanks! IBM (including interns): Shivakumar Vaithyanathan, Sriram Raghavan, Rajasekar Krishnamurthy, Lucian Popa, Ron Fagin, Fred Reiss, Laura Chiticariu, Mauricio Hernández, Eser Kandogan, Huaiyu Zhu, Kun Qian, Dakuo Wang, Maeda Hanafi. Many amazing collaborators and interns … Apple (including interns): Ihab Ilyas, Theodoros Rekatsinas, Umar Farooq Minhas, Ali Mousavi, Jeffrey Pound, Anil Pacaci, Hongyu Ren, Kun Qian, Fei Wu, Simone Conia, Sha (Zoey) Li, Azadeh Nikfarjam, Yisi Sang, Saloni Potdar, Farima Fatahi Bayat, … … Universities: Azza Abouzied (NYU Abu Dhabi), H. V. Jagadish (U. of Michigan), Fei Xia (U. of Washington), Kevin Chen-Chuan Chang (UIUC), ChengXiang Zhai (UIUC), Domenico Lembo (Sapienza University of Rome), Dragomir R. Radev (Yale), Jonathan K. Kummerfeld (U. of Michigan), Toby Li (U. of Notre Dame), Rishabh Iyer (UT Dallas), Eduard C. Dragut (Temple Univ.), … …. Douglas Burdick, Alan Akbik, Nancy Wang, Prithviraj Sen, Marina Danilevsky, Poornima Chozhiyath Raman, Sudarshan Rangarajan, Ramiya Venkatachalam, Kiran Kate, Chenguang Wang, Ishan Jindal, Yiwei Yang, Nikita Bhutani, … ….
  • 64. © 2023 Adobe. All Rights Reserved. Adobe Confidential. Unleashing Creativity: Adobe Creative Cloud. Accelerating Document Productivity: Adobe Document Cloud. Powering Digital Businesses: Adobe Experience Cloud.
  • 65. Adobe Experience Cloud: breadth of integrated applications. Marketing Planning & Workflow: Marketing system of record to connect, collaborate and execute the workflows required for personalization at scale and content supply chain. Adobe Experience Platform: Open, cloud-native platform transforming behavioral and transactional data into unified customer profiles that update in real time and use AI-driven insights to help deliver the right experiences across every channel. Customer Journeys: Real-time, omni-channel customer and account-based journey orchestration & campaign execution. Content & Commerce: Content management and commerce solutions for personalized, multi-channel experiences. Data Insights & Audiences: Omni-channel experience insights & intelligence, including first-party data management & activation for known & unknown audiences.