SOURCES OF CHANGE IN MODERN KNOWLEDGE ORGANIZATION SYSTEMS
Paul Groth (@pgroth)
Disruptive Technology Director
Elsevier Labs (@elsevierlabs)
February 2, 2016
Contributions: Brad Allen, Michael Lauruhn
KNOWLEDGE ORGANIZATION IS IMPORTANT
https://www.elsevier.com/authors/author-schemas/elsevier-xml-dtds-and-transport-schemas
• 548-page document
• defines the content structure of a document
• “Developing a DTD alone is insufficient to allow an XML-based process; high-quality documentation helps in clarifying the interpretation of the tags and specifying the ways in which they are used”
Education
• Elsevier Enterprise Content Model ontology
• 40+ properties
• 20 datatypes
• 10 Content types
• 20 Asset types
• Adaptive Learning ontology
• Recommendation
• Teaching
• Assessing
• Remediation
• SKOS ontology
• 3 third-party vocabularies: QSEN, Bloom etc.
• QTI 2.1 compliant schema
• XHTML5 schema
• 50+ data-type attribute definitions
• Student Learning Objective ontology
• SKOS ontology extended with 2 properties
• Multimedia assets incl. Timed Text Markup Language
BIG KOS
ANSWERS ARE ABOUT THINGS, NOT JUST WORKS
Why shouldn’t a search on an author return
information about the author, including the
author’s works? Where was the author born,
when did she live, what is she known for? … All of
this is possible, but only if we can make some
fundamental changes in our approach to
bibliographic description. ... The challenge for us
lies in transforming what we can of our data into
interrelated “things” without overindulging that
metaphor.
Coyle, K. (2016). FRBR, before and after: a look at our
bibliographical models. Chicago: ALA Editions.
KNOWLEDGE GRAPHS AND MACHINE READING TURN
CONTENT INTO ANSWERS
• Knowledge graphs are “graph structured knowledge bases (KBs)
which store factual information in form of relationships between
entities” (Nickel, M., Murphy, K., Tresp, V. and Gabrilovich, E.
(2015). A review of relational machine learning for knowledge
graphs. arXiv:1503.00759v3)
• Knowledge graphs are metadata evolved beyond the focus on
the work, linking people, concepts, things and events
• Knowledge graphs organize data extracted from content through
machine reading so that queries can provide answers
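The idea that a knowledge graph lets a query return answers about things, not just works, can be sketched in a few lines. This is a minimal illustration only: the identifiers (`author:1`, `place:1`), the predicate names (`bornIn`, `createdBy`) and all values are hypothetical and do not come from any Elsevier system.

```python
from collections import defaultdict

# Hypothetical facts stored as (subject, predicate, object) triples.
TRIPLES = [
    ("author:1", "name", "Jane Example"),
    ("author:1", "bornIn", "place:1"),
    ("place:1", "name", "Exampleville"),
    ("work:1", "title", "A First Book"),
    ("work:1", "createdBy", "author:1"),
    ("work:2", "title", "A Second Book"),
    ("work:2", "createdBy", "author:1"),
]

def build_indexes(triples):
    """Index triples for lookups in both directions."""
    by_subject = defaultdict(list)  # (subject, predicate) -> objects
    by_object = defaultdict(list)   # (predicate, object) -> subjects
    for s, p, o in triples:
        by_subject[(s, p)].append(o)
        by_object[(p, o)].append(s)
    return by_subject, by_object

def describe_author(author, by_subject, by_object):
    """Answer a search on an author with facts about the author and the works."""
    birthplaces = by_subject[(author, "bornIn")]
    works = by_object[("createdBy", author)]
    return {
        "name": by_subject[(author, "name")],
        "born_in": [n for b in birthplaces for n in by_subject[(b, "name")]],
        "works": [t for w in works for t in by_subject[(w, "title")]],
    }

by_subject, by_object = build_indexes(TRIPLES)
answer = describe_author("author:1", by_subject, by_object)
print(answer)
```

Because the author, the place and the works are separate linked entities, the same index answers both "what did this author write?" and "where was this author born?" without a work-centred record in between.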
ELSEVIER: KNOWLEDGE GRAPHS FOR RESEARCH
ELSEVIER: KNOWLEDGE GRAPHS FOR LIFE SCIENCES
Biological Pathways extracted via
semantic text mining
A upregulates B
B upregulates C
C increases disease D
Normalizing vocabularies required: proteins, diseases, drugs, chemicals
A → B → C → D
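The chaining of extracted relations into a pathway can be sketched as a simple graph search. This is an assumed illustration, not the text-mining pipeline itself: the relation list mirrors the A/B/C/D example, the function name `find_paths` is hypothetical, and entity names are taken to be already normalized against a shared vocabulary.

```python
# Hypothetical extracted relations, mirroring the A/B/C/D example.
EXTRACTED = [
    ("A", "upregulates", "B"),
    ("B", "upregulates", "C"),
    ("C", "increases", "disease D"),
]

def find_paths(relations, start, goal):
    """Return every acyclic chain of relations leading from start to goal."""
    outgoing = {}
    for subj, pred, obj in relations:
        outgoing.setdefault(subj, []).append((pred, obj))

    paths = []
    stack = [(start, [])]  # (current node, relations followed so far)
    while stack:
        node, path = stack.pop()
        if node == goal:
            paths.append(path)
            continue
        visited = {start} | {obj for _, _, obj in path}
        for pred, obj in outgoing.get(node, []):
            if obj not in visited:  # skip nodes already on this path
                stack.append((obj, path + [(node, pred, obj)]))
    return paths

pathways = find_paths(EXTRACTED, "A", "disease D")
for path in pathways:
    print(" ; ".join(f"{s} {p} {o}" for s, p, o in path))
```

The search only connects relations whose entity names match exactly, which is why normalizing vocabularies for proteins, diseases, drugs and chemicals matters: two mentions of the same protein under different names would break the chain.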
Bioactivities through text analysis
IC50 6.3 nM, kinase binding assay
10 mM concentration
Chemical Structures and Properties
InChI, Name
NCBI, UniProt
EMTREE
ReaxysTree, Structures
ELSEVIER’S KNOWLEDGE PLATFORM
Products: Reaxys, CK, Sherpath, Scopus, SD, ROS
Platforms & Shared Services: Content, Life Sciences, Search, Identity, Research
Knowledge Graphs: Research, Healthcare, Life Sciences
Entity Hubs: Funder Hub, Article Hub, Profile Hub, Journal Hub, Institution Hub
Data & Content Sources: Usage logs, Pathways, EHRs, Articles, Authors, Institutions, Syllabi, Citations, Chemicals, Books, Drugs, Funders
THE BATTLE FOR THE KNOWLEDGE GRAPH
I really believe that the key battleground in any
industry is that of its knowledge graph. Google
has it for media/advertising, Netflix has it for
filmed entertainment, Uber has it for inner city
transportation, Facebook has it across social
media as well as messaging and the multiples
speak for themselves.
Tony Askew, Founder/Partner at REV (personal communication,
September 29, 2016)
CHANGE
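[Figure: a KOS of linked concepts (Concept1, Concept2, Concept3) surrounded by its sources of change: Society & Politics, Literature, Professional Curators, Non-professional contributors, Software, Data. The labels (1, 2), (3), (4, 5, 6) and (7, 8, 9) refer to the numbered sources of change on the following slides.]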
SOURCES OF CHANGE FOR KOS – CURRENT VIEW
1. dealing with changing cultural and societal norms, specifically to address or
correct bias;
2. political influence
3. new concepts and terminology arising from discoveries or change in
perspective within a technical/scientific community
4. GARDENING
Wikipedia Categories
25% increase in the number of categories over the 2012 - 2014 period vs
a 12% increase in the number of articles. Likewise, the number of
disambiguation pages has increased by 13%. (Bairi et al. 2015)
http://blog.schema.org/2015/11/schemaorg-whats-new.html
5. INCREMENTAL CONTRIBUTORSHIP
Over 17,000 active users on Wikidata as of Feb 2017
6. PROGRESSIVE FORMALIZATION
7. SOFTWARE AGENTS
[Figure: universal schema pipeline for turning content into a knowledge graph]
Pipeline components: Content, Open Information Extraction, Entity Resolution, Triple Extraction, Taxonomy, Universal schema (surface form relations and structured relations), Matrix Construction, Matrix Factorization (factorization model), Matrix Completion, Predicted relations, Curation, Knowledge graph
Inputs and volumes: 14M articles from ScienceDirect; 920K concepts from EMMeT; 3.3M facts; 475M facts; 49M facts
Matrix construction: entity pairs (p = 83) by surface form and structured relations (r = 176), giving an 83 x 176 sparse binary-valued matrix with 366 entries
Example surface form relations:
glaucoma developed many years after chronic inflammation of uveal tract
glaucoma develop following chronic inflammation of uveal tract
glaucoma can appear soon in family history of glaucoma
glaucoma can appear soon in age over 40
glaucoma the risk of functional visual field loss
glaucoma contributing causes of functional visual field loss
glaucoma contributed to functional visual field loss
glaucoma is considered the second leading cause of functional visual field loss
glaucoma remains the second leading cause of functional visual field loss
Matrix factorization and completion: latent factor matrix × latent factor matrix = 83 x 176 real-valued matrix with 14,608 entries
Example predicted relations:
diseases 2791370 glaucoma have been documented to cause contact dermatitis 3815093 diseases
diseases 2791370 glaucoma is assessed through evaluation 5415395 qualifier
diseases 2791370 glaucoma progresses more rapidly than primary open-angle glaucoma 8247149 diseases
diseases 2791370 glaucoma recommend treatment 5216597 procedures
diseases 2791370 glaucoma supports the assumption that oxidative stress 8184588 diseases
diseases 2791370 glaucoma is the death of retinal ganglion cells 8002088 anatomy
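The step from a sparse binary matrix with 366 entries to a real-valued matrix with 14,608 entries (83 x 176) can be illustrated with a small factorization sketch. Everything here is an assumption made for illustration: the observed cells are random synthetic data rather than extracted facts, and the technique shown (logistic matrix factorization trained by stochastic gradient descent with sampled negatives) is one common way to do matrix completion, not necessarily the model used in the pipeline above. The function names and hyperparameters are hypothetical.

```python
import math
import random

N_PAIRS, N_RELATIONS, N_OBSERVED = 83, 176, 366  # sizes quoted on the slide

def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sgd_step(p, r, label, lr, reg):
    # One logistic-loss gradient step on a single (entity pair, relation) cell.
    err = label - sigmoid(dot(p, r))
    for f in range(len(p)):
        pf, rf = p[f], r[f]
        p[f] += lr * (err * rf - reg * pf)
        r[f] += lr * (err * pf - reg * rf)

def factorize(observed, n_rows, n_cols, k=8, epochs=100, lr=0.05, reg=0.01, seed=0):
    """Learn one k-dimensional latent vector per row and per column."""
    rng = random.Random(seed)
    P = [[rng.gauss(0.0, 0.1) for _ in range(k)] for _ in range(n_rows)]
    R = [[rng.gauss(0.0, 0.1) for _ in range(k)] for _ in range(n_cols)]
    cells = sorted(observed)
    for _ in range(epochs):
        rng.shuffle(cells)
        for i, j in cells:
            sgd_step(P[i], R[j], 1.0, lr, reg)   # observed cell: push score up
            jn = rng.randrange(n_cols)           # sampled cell in the same row
            if (i, jn) not in observed:
                sgd_step(P[i], R[jn], 0.0, lr, reg)  # treat as negative
    return P, R

# Synthetic stand-in for the sparse binary matrix (random cells, not real data).
rng = random.Random(42)
all_cells = [(i, j) for i in range(N_PAIRS) for j in range(N_RELATIONS)]
observed = set(rng.sample(all_cells, N_OBSERVED))

P, R = factorize(observed, N_PAIRS, N_RELATIONS)

# Matrix completion: score every cell, whether it was observed or not.
scores = [[sigmoid(dot(P[i], R[j])) for j in range(N_RELATIONS)]
          for i in range(N_PAIRS)]
print(len(observed), "observed cells ->", sum(len(row) for row in scores), "scored cells")
```

The product of the two latent factor matrices assigns a score to every entity pair and relation combination, so high-scoring cells that were not observed become candidate predicted relations. Those candidates are what a curation step would then need to review.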
8. INTEGRATION OF LARGE NUMBERS OF DATA SOURCES
Groth, Paul, "The Knowledge-Remixing Bottleneck," IEEE Intelligent Systems, vol. 28, no. 5, pp. 44-48, Sept.-Oct. 2013, doi: 10.1109/MIS.2013.138
• 10 different extractors
• E.g. mapping-based infobox extractor
• Infobox extractor uses a hand-built ontology based on the 350 most commonly used English language infoboxes
• Integrates with YAGO
• YAGO relies on Wikipedia + WordNet
• Upper ontology from WordNet and then a mapping to Wikipedia categories based on frequencies
• WordNet is built by psycholinguists
9. TRAINING DATA
CONCLUSION AND A QUESTION
• KOSs are important and are expanding in size
• A focus on organizing information about entities not just “content”
• The construction and maintenance of massive KOSs → new sources of change
• Two new actors: software and non-professionals
• How do we deal with these sources?
• New biases, opaque systems
• The role of a KOS observatory?
• Empirical evidence for what to do


Editor's Notes

  • #9: Use of open standards
  • #23: 1700 active contributors
  • #24: We don’t start with a full formal definition but formalize over time from usage