GOVERNMENT USERS Conference: “Navigating the Human Terrain”
College Park, MD, May 20-21, 2008
Linguistic Considerations of Identity Resolution
David Murgatroyd
Software Architect
Basis Technology
Outline
 Introduction
 Linguistic Challenges
 Variation (Intentional & Unintentional)
 Composition
 Frequency
 Under-specification
 Multilinguality
 Integration Challenges
 Inputs & Outputs
 Properties
 Evaluation Challenges
 Corpora: Find or Build?
 Metrics: Adopt or Create?
 Conclusion
Introduction: An Exercise
Jim Killeen
Kileen, J. D.
Jaime Kilin
‫كلين‬ ‫جمس‬
 Is there a >50% chance these refer to the same person? If…
 US citizens
 On a ferry to Spain
 In a documentary
What is Identity Resolution?
 Identity Resolution (aka Entity Resolution):
 determining if two or more given references refer to
the same entity.
 Different from name matching: it is about the
identity of entities, not the similarity of names
 See also:
 Murgatroyd, D. Some Linguistic Considerations of
Entity Resolution and Retrieval. In Proceedings of
LREC 2008 Workshop on Resources and Evaluation for
Identity Matching, Entity Resolution and Entity
Management.
What sorts of references?
 Non-linguistic reference examples:
 Numerical identifiers
— SSN
— Some portions of address (Street Number, Zip Code)
 Visual identifiers (e.g., pictures, symbols)
 Biometrics (e.g., DNA, iris, signature, voice)
 Linguistic reference examples:
 Nouns or pronouns in documents (e.g., “the CEO of Basis”)
 Names of associated/related entities
— Locations (e.g., Street or City Name)
— Organizations
— Individuals
 Name of entity <- we’re going to focus on this one
Let’s focus on names of people
 Common and familiar
 Often a fairly identifying piece of personal
information
 Demonstrate typical challenges of resolution
with linguistic data
Variation (Intentional)
 Variation may be intentional
 References may draw on a large set of names, varying by:
— Formality (e.g., nicknames)
— Transparency (e.g., aliases)
— Location (e.g., toponym)
— Life status
 Vocation (e.g., titles)
 Marital status (e.g., marriage/divorce/widowhood)
 Parenthood (e.g., patronymic)
 Faith (e.g., christening, pilgrimage)
 Death (e.g., posthumous names)
— Dialect (e.g., adolescent girls preferring “Jenni” over “Jenny”)
— Style of text (e.g., “Sollermun” for “Solomon” in Huck Finn)
Jim Killeen
Variation (Unintentional)
 Variation may be unintentional, arising from:
 Typos
— E.g., “Killeen” vs. “Kileen”
 Guessing spelling based on pronunciation
— E.g., “Caliin”
 Ambiguities inherent in the encoding (e.g., Unicode):
— Characters with the same glyph
 E.g., Latin and Cyrillic small “i”
— Characters with similar glyphs
 E.g., Latin “K” and Greenlandic “ĸ”
— Characters with composed/combined forms
 E.g., ņ (n with cedilla) vs. ņ (n + combining cedilla)
Kileen, J. D.
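The encoding ambiguities above can be probed with Python's standard `unicodedata` module. This is a minimal sketch, not a method from the talk: the helper names `canonical` and `scripts_used` are hypothetical, and taking the first word of a character's Unicode name is only a rough stand-in for a real script property lookup.

```python
import unicodedata

def canonical(name: str) -> str:
    """Normalize to NFC and case-fold so composed and combined forms compare equal."""
    return unicodedata.normalize("NFC", name).casefold()

def scripts_used(name: str) -> set[str]:
    """Rough script guess: first word of each letter's Unicode name (LATIN, CYRILLIC, ...)."""
    return {unicodedata.name(ch, "UNKNOWN").split()[0] for ch in name if ch.isalpha()}

composed = "\u0146"      # n with cedilla as one precomposed character
decomposed = "n\u0327"   # n followed by a combining cedilla

print(composed == decomposed)                        # raw strings differ
print(canonical(composed) == canonical(decomposed))  # equal after normalization
print(scripts_used("K\u0456leen"))                   # Cyrillic "i" hidden in a Latin name
```

Normalization handles the composed/combined case only. Same-glyph and similar-glyph characters (Cyrillic "і", Greenlandic "ĸ") survive normalization, so a mixed-script check or a confusables table is still needed for those.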
Composition
 Names have differing orders:
 Given v. Surname: “Killeen, Jim” v. “Jim Killeen”
 Varies by culture
 Name references may be partial:
 “Jim” v. “Jim Killeen”
Under-specification
 Name components may be abbreviated
 Initials (e.g., “J. D.”)
 Abbreviations (e.g., “Jas.”)
 Name references may have incomplete…
 orthography (e.g., Semitic languages)
 segmentation (e.g., Asian languages)
 phonology (e.g., Ideographic languages)
Kileen, J. D.
‫كلين‬ ‫جمس‬
Frequency
 Any person can make up a name (an open class)
 A few are common, most are very uncommon
 Zipfian distribution
 Lesson:
 Valuable to know common names
 Valuable to have a strategy for unknown names
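One way to act on both lessons is to weight a name by its rarity while reserving some probability mass for names never seen before. The sketch below is illustrative only: the counts are made up, `make_rarity_weight` is a hypothetical helper, and add-k smoothing is just one simple strategy for unknown names.

```python
import math
from collections import Counter

def make_rarity_weight(counts: Counter, smoothing: float = 1.0):
    """Return a function mapping a name to -log(smoothed relative frequency)."""
    total = sum(counts.values())
    vocab = len(counts)

    def weight(name: str) -> float:
        # Add-k smoothing: one extra slot in the denominator stands for all unseen names.
        p = (counts.get(name, 0) + smoothing) / (total + smoothing * (vocab + 1))
        return -math.log(p)

    return weight

# Toy counts (assumed, not real census data): a few common names, a long tail.
toy_counts = Counter({"smith": 500, "jones": 300, "killeen": 3})
weight = make_rarity_weight(toy_counts)

for name in ("smith", "killeen", "murgatroyd"):
    print(name, round(weight(name), 2))
```

Under these assumptions a match on a rare or unseen name carries more weight than a match on a common one, which is the practical consequence of the Zipfian shape.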
Multilinguality
 Names may appear in many languages-of-use
 This leads to variation at many linguistic levels.
 Orthographic:
 transliteration confronts skew in:
—orthographic-to-phonetic mappings of source and
target languages-of-use
—sound systems between the languages
‫كلين‬ ‫جمس‬ <-> James Klein
Multilinguality (cont’d)
 Syntactic:
 different languages-of-use may imply different name
word order
 Semantic:
 name words which communicate meaning (e.g.,
titles) may vary (e.g., “Jr.” for “الصغر”, which
means “the younger”)
 Pragmatic:
 different languages-of-use may use different names
based on the audience (e.g., “Mr. Laden” vs. “‫المير‬”
which means “the prince”)
Inputs & Outputs
 Input options include:
 Pair-wise: simple integration, but no shared effort
 Set-based: harder integration, but able to optimize
 Output options include:
 Feature-based: with weights/tuning
 Probability-based:
—more principled combination
—NOTE: similarity is not probability
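To illustrate why probabilities combine in a more principled way than raw similarity scores, here is a minimal sketch that folds independent pieces of evidence into a prior using log-odds. It is loosely in the spirit of record-linkage weighting, not the talk's own method: `posterior_match_probability` is a hypothetical name, the likelihood ratios are invented, and independence of the evidence is assumed.

```python
import math

def posterior_match_probability(prior: float, likelihood_ratios: list[float]) -> float:
    """Combine a prior match probability with independent likelihood ratios.

    Each ratio is P(evidence | same entity) / P(evidence | different entities).
    """
    log_odds = math.log(prior / (1.0 - prior))
    for lr in likelihood_ratios:
        log_odds += math.log(lr)  # independent evidence adds in log-odds space
    return 1.0 / (1.0 + math.exp(-log_odds))

# Same name evidence, different contexts (all numbers are illustrative assumptions).
name_evidence = [20.0, 5.0]
print(posterior_match_probability(0.0001, name_evidence))  # e.g., a very large population
print(posterior_match_probability(0.01, name_evidence))    # e.g., a much smaller one
```

The same name similarity yields very different answers to "is there a >50% chance?" depending on the prior, which is one reading of the note that similarity is not probability.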
Integration Properties
 Certain properties help make efficient
implementations:
 Reflexivity:
—Resolve(a,a) is always true
—NOTE: does not imply Resolve(a,a’) where a~a’
 Commutativity:
—Resolve(a,b) <=> Resolve(b,a)
 Transitivity:
—Resolve(a,b) & Resolve(b,c) => Resolve(a,c)
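Pair-wise decisions are not guaranteed to be transitive, so an integrator may impose transitivity by taking the transitive closure of the links. A minimal sketch using union-find follows; the function name and the sample references are assumptions made for illustration.

```python
def transitive_closure(references, links):
    """Group references into entities by closing pair-wise links under transitivity."""
    parent = {r: r for r in references}

    def find(x):
        # Walk to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in links:
        parent[find(a)] = find(b)  # merge the two groups

    clusters = {}
    for r in references:
        clusters.setdefault(find(r), set()).add(r)
    return list(clusters.values())

refs = ["Jim Killeen", "Kileen, J. D.", "Jaime Kilin", "Jim"]
links = [("Jim Killeen", "Kileen, J. D."), ("Kileen, J. D.", "Jaime Kilin")]
print(transitive_closure(refs, links))
```

Closure makes Resolve(a,c) hold whenever a chain of links connects a and c, which is efficient but can over-merge when a single wrong link bridges two entities.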
Corpora: Find or Build?
 Requirements:
 Annotated for ground truth
 Represent linguistic challenges
 Scalable/practical
 Options
 Adapt public “database” corpora:
— Wikipedia:
 Annotated: yes
 Representative: somewhat
 Scalable: yes
— Citation DBs:
 Annotated: no
 Representative: somewhat
 Scalable: yes
Corpora: Find or Build? (cont’d)
 Adapt public “document” corpora:
— Co-reference documents:
 Annotated: yes
 Representative: less so, as often single document/language-of-use
 Scalable: yes
 Create corpora by hand:
— From scratch: “parrot sessions” (auditory or visual)
 Annotated: yes
 Representative: largely
 Scalable: no
— From un-annotated databases:
 Annotated: no
 Representative: yes
 Scalable/practical: no; databases may be private
— Synthesize from generative model
 Annotated: yes
 Representative: no, tied to generating model
 Scalable: yes
Metrics
 Back to our initial example
[Figure: the name references from the initial example, together with further variants (e.g., “Jim”), drawn as nodes with resolution links in three panels: Reference, System A, System B]
Metrics: Adopt or Create?
 How to quantify the quality of the system’s resolutions
vs. the reference?
 Goals:
 Discriminative: separates good v. bad systems for users’ needs
 Interpretable: number aligns with intuition
 Considerations:
 Assume transitive closure (TC) of output?
 Apply weights to try to be more discriminative?
 Common concepts:
 Precision: % of stuff in answer that’s right
 Recall: % of right stuff that’s in the answer
 F-Score: Harmonic mean of these = 2*P*R/(P+R)
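Applied to links between references, these concepts can be sketched as follows. Entities are modeled as sets of references; the function names are hypothetical, and scoring an empty answer as 1.0 is a convention chosen here, not something the slides specify.

```python
from itertools import combinations

def pairs(clusters):
    """All unordered same-entity pairs implied by a clustering."""
    return {frozenset(p) for c in clusters for p in combinations(sorted(c), 2)}

def pairwise_prf(reference, system):
    """Precision, recall and F-score over same-entity pairs."""
    ref, out = pairs(reference), pairs(system)
    correct = len(ref & out)
    p = correct / len(out) if out else 1.0   # share of the answer that is right
    r = correct / len(ref) if ref else 1.0   # share of the right pairs that were found
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

reference = [{"a", "b", "c"}, {"d"}]
system = [{"a", "b"}, {"c", "d"}]
print(pairwise_prf(reference, system))
```

Because the pairs are derived from clusters, this version implicitly scores the transitive closure of the system's output.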
Candidate Metrics
 Pair-wise % correct: over all N*(N-1)/2 node pairs
 Pair-wise P&R: based on links drawn
 Edit-distance: # of links to add/subtract to correct
 Metrics used in document co-reference resolution:
 MUC-6: entity-based P&R on missing links from graph
 B-CUBED: average per-reference P&R of links
 CEAF (Constrained Entity-Alignment F): entities aligned
using some similarity measure; P&R are % of possible
similarity level achieved
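Two of these candidates are short enough to sketch directly: pair-wise % correct and B-CUBED. The code assumes the reference and system clusterings cover the same references and use unweighted averages; the function names are hypothetical, and the B-CUBED computation follows the usual per-reference definition rather than any particular scorer.

```python
from itertools import combinations

def pairwise_percent_correct(reference, system):
    """Fraction of all N*(N-1)/2 pairs whose same/different decision matches the reference."""
    ref_id = {m: i for i, c in enumerate(reference) for m in c}
    sys_id = {m: i for i, c in enumerate(system) for m in c}
    all_pairs = list(combinations(sorted(ref_id), 2))
    agree = sum((ref_id[a] == ref_id[b]) == (sys_id[a] == sys_id[b]) for a, b in all_pairs)
    return agree / len(all_pairs)

def b_cubed(reference, system):
    """Average per-reference precision and recall of the cluster containing each reference."""
    ref_of = {m: c for c in reference for m in c}
    sys_of = {m: c for c in system for m in c}
    n = len(ref_of)
    p = sum(len(ref_of[m] & sys_of[m]) / len(sys_of[m]) for m in ref_of) / n
    r = sum(len(ref_of[m] & sys_of[m]) / len(ref_of[m]) for m in ref_of) / n
    return p, r

reference = [{"a", "b", "c"}, {"d"}]
system = [{"a", "b"}, {"c", "d"}]
print(pairwise_percent_correct(reference, system))
print(b_cubed(reference, system))
```

On this toy case the two metrics disagree about how bad the system is, which is the kind of difference the comparison of metrics is meant to surface.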
Comparing Metrics
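[Figure: the same name references, with the links given by the Reference and drawn by System A and System B]
Scores for each system, with and without transitive closure (TC) of its output:
System  % Correct       Pairwise F      MUC-6  B-CUBED  CEAF
        No TC   TC      No TC   TC      (TC)   (TC)     (TC)
A       82      79      62      61      80     78       90
B       79      82      73      71      89     85       81
Edit-dist (No TC, TC): 3, 6, 1, 4
[The slide also marks one metric as “My preference”.]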
Conclusion
 Identity resolution systems face linguistic
challenges
 They need to be carefully integrated to meet
these challenges
 Evaluation corpora should reflect these
challenges
 Evaluation metrics should align with qualitative
judgements
Bibliography
Bagga, A., Baldwin, B. (1998). Algorithms for scoring coreference chains. In
Proceedings of the First International Conference on Language Resources
and Evaluation Workshop on Linguistic Coreference.
Fellegi, I. P., Sunter, A. B. (1969). A theory for record linkage. Journal of the
American Statistical Association, Vol. 64, No. 328, pp. 1183--1210.
Luo, X. (2005). On coreference resolution performance metrics. In Proceedings of
HLT-EMNLP, pp. 25--32.
Menestrina, D., Benjelloun, O., Garcia-Molina, H. (2006). Generic entity
resolution with data confidences. In First International VLDB Workshop on
Clean Databases. Seoul, Korea.
Murgatroyd, D. (2008). Some Linguistic Considerations of Entity Resolution and
Retrieval. In Proceedings of LREC 2008 Workshop on Resources and
Evaluation for Identity Matching, Entity Resolution and Entity
Management.
Spock Team (2008). The Spock Challenge. http://challenge.spock.com/
(Retrieved February 5.)
Vilain, M., Burger, J., Aberdeen, J., Connolly, D., Hirschman, L. (1995). A
model-theoretic coreference scoring scheme. In Proceedings of the 6th
Message Understanding Conference (MUC6). Morgan Kaufmann, pp. 45--52.
Questions?
More information:
http://www.basistech.com
