Linkator: enriching web pages by automatically adding dereferenceable semantic annotations
Samur Araujo, Geert-Jan Houben, Daniel Schwabe
Web Information Systems, Delft University of Technology, the Netherlands
Summary – dereferencing semantic annotations
What is dereferencing semantic annotations about? Automatically linking web pages.
Outline:
  • Overview of the problem and motivation.
  • Our approach to solving the problem.
  • One example of use.
Motivation
  • Links between HTML pages are the main mechanism for navigating the web.
  • However, many pages are unlinked or poorly linked.
  • Terms on pages have meaning and are intrinsically associated with concepts or entities that the user is interested in.
  • These terms can be interpreted by machines and automatically linked to relevant resources on the web.
Problem Statement
The problem of automatic linking can be divided into three sub-problems:
  • How to identify candidate terms (anchors) for adding links? A candidate term denotes a concept in which the user is interested.
  • Which concept does a candidate term represent? That is, how to disambiguate a candidate term.
  • How to identify a web resource to be the link target? That is, how to select a data source for finding the destination of the link.
State-of-the-Art in Automatic Linking
  • Candidate terms: existing work focuses on term disambiguation using an auxiliary knowledge base or dictionaries (e.g. Wikipedia and WordNet).
  • Link target: it is selected from a specific knowledge base [1] or from a collection of target documents [2].
  • Limitation: these approaches do not serve users who are interested in a broader range of domains well.
[1] Mihalcea, R. and Csomai, A. Wikify!: Linking documents to encyclopedic knowledge. In Proceedings of the 16th ACM Conference on Information and Knowledge Management (CIKM '07), Lisbon, Portugal, pp. 233-242, 2007.
[2] Gardner, J. J., Krowne, A. and Xiong, L. NNexus: An automatic linker for collaborative web-based corpora. IEEE Trans. Knowl. Data Eng. 21(6), pp. 829-839, 2009.
Linkator Approach
Linkator performs three tasks:
  • Extract terms from web pages.
  • Associate terms with concepts.
  • Find resources that represent these concepts.
[Diagram components: Core Linkator, Information Extraction Engine, Semantic Annotator]
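[Flow diagram with two entry points]
  • Page accessed: Page is accessed → Terms are extracted → Page is semantically annotated → Semantic links created
  • Link clicked: Annotated page → Annotation is extracted → Endpoint is chosen → Query is formulated → Search for a resource (the diagram also carries an "if not found" branch)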
Linkator Approach
[Architecture diagram]
  • Web Browser: Linkator Client (Firefox plugin), with Annotator, RDFa Annotator and Information Extraction Engine
  • Linkator Server (reached over HTTP): Endpoint Resolution and Query Formulation
  • Linked Data (queried with SPARQL)
Semantic Link – Definition
  • A semantic link is an HTML A (anchor) element that is semantically annotated with RDFa.
  • It has RDF triples associated with it.
  • A semantic link triggers a query over Linked Data.
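To make the definition concrete, here is a minimal Python sketch of what generating such an annotated anchor might look like. The slides do not show the exact markup Linkator emits, so the choice of RDFa attributes (`prefix`, `about`, `typeof`, `property`), the `semantic_link` helper, and the `http://example.org/...` subject URI are all assumptions made for illustration.

```python
from html import escape


def semantic_link(text, subject, rdf_type, prop, prefixes):
    """Render an <a> element carrying RDFa attributes (a hypothetical 'semantic link').

    The element encodes two triples:
      <subject> rdf:type <rdf_type>
      <subject> <prop>   "text"
    """
    # RDFa 1.1 style prefix declaration: "name: uri name: uri ..."
    prefix_attr = " ".join(f"{name}: {uri}" for name, uri in sorted(prefixes.items()))
    return (
        f'<a prefix="{escape(prefix_attr, quote=True)}" '
        f'about="{escape(subject, quote=True)}" '
        f'typeof="{escape(rdf_type, quote=True)}" '
        f'property="{escape(prop, quote=True)}">{escape(text)}</a>'
    )


html = semantic_link(
    text="Geert-Jan Houben",
    subject="http://example.org/people/houben",  # hypothetical subject URI
    rdf_type="swrc:Author",
    prop="foaf:name",
    prefixes={
        "foaf": "http://xmlns.com/foaf/0.1/",
        "swrc": "http://ontoware.org/swrc/swrc_v0.3.owl#",
    },
)
print(html)
```

The anchor deliberately has no `href` in this sketch: the destination is meant to be resolved from the triples when the link is clicked, rather than fixed at annotation time.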
[Figure: semantic links and the RDF triples associated with them]
Dereferencing Semantic Links
Linkator uses the Linked Data cloud to discover a destination for the semantic link, as opposed to querying search engines or a fixed knowledge base. Two algorithms are involved:
  • Algorithm for Endpoint Resolution
  • Algorithm for Query Formulation
Endpoint Resolution
Task: find endpoints that contain a specific concept.
Linkator selects available endpoints based on the vocabularies used in the semantic links, relying on voiD (Vocabulary of Interlinked Datasets) descriptions of the endpoints.
Endpoint Resolution
  • Select the vocabularies of all RDF types associated with the annotation.
  • If no endpoint is found that way, select the vocabularies of all predicates associated with the annotation.
Endpoint Resolution
  • The SelectEndpoint function finds the resource: http://ontoware.org/swrc/swrc_v0.3.owl#Author
  • It extracts the vocabulary associated with this resource: http://ontoware.org/swrc/swrc_v0.3.owl
  • It queries the voiD descriptors of the available SPARQL endpoints, looking for that vocabulary.
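The two steps on this slide can be sketched in Python as follows. This is an illustration only: the slides do not say where the voiD descriptions are stored or how they are queried, so the SPARQL text and the helper names `extract_vocabulary` and `void_endpoint_query` are assumptions. The `void:vocabulary` and `void:sparqlEndpoint` properties themselves come from the voiD vocabulary.

```python
from urllib.parse import urldefrag


def extract_vocabulary(resource_uri):
    """Return the vocabulary of a hash-style resource URI by dropping its fragment."""
    vocabulary, _fragment = urldefrag(resource_uri)
    return vocabulary


def void_endpoint_query(vocabulary):
    """Build a SPARQL query for endpoints whose voiD description declares the vocabulary."""
    return (
        "PREFIX void: <http://rdfs.org/ns/void#>\n"
        "SELECT DISTINCT ?endpoint WHERE {\n"
        "  ?dataset a void:Dataset ;\n"
        f"           void:vocabulary <{vocabulary}> ;\n"
        "           void:sparqlEndpoint ?endpoint .\n"
        "}"
    )


vocab = extract_vocabulary("http://ontoware.org/swrc/swrc_v0.3.owl#Author")
print(vocab)
print(void_endpoint_query(vocab))
```

Dropping the fragment only works for hash-style vocabularies such as SWRC; slash-style namespaces (FOAF, for example) would need the last path segment removed instead.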
Query Formulation
  • The query is based on the object of the triple.
  • It tries to find a human-readable representation of the resource, i.e., it tries to match predicates such as foaf:homepage, akt:has-web-address and rdfs:seeAlso.
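A hedged sketch of how such a query might be assembled in Python is given below. The slides only state that the query is based on the triple's object and that it looks for predicates such as foaf:homepage, akt:has-web-address and rdfs:seeAlso; the overall query shape, the `formulate_query` helper, and the namespace bound to the `akt` prefix are assumptions.

```python
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    # Assumed namespace for the akt prefix; the slides do not spell it out.
    "akt": "http://www.aktors.org/ontology/portal#",
}

# Predicates named on the slide as pointing to a human-readable page.
PAGE_PREDICATES = ["foaf:homepage", "akt:has-web-address", "rdfs:seeAlso"]


def escape_literal(value):
    """Escape backslashes and double quotes for use inside a SPARQL string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def formulate_query(triple_object, limit=1):
    """Build a query that finds a resource by a literal object, then a web page for it."""
    header = "\n".join(f"PREFIX {name}: <{uri}>" for name, uri in PREFIXES.items())
    # One UNION branch per candidate predicate.
    branches = " UNION ".join(f"{{ ?resource {pred} ?page }}" for pred in PAGE_PREDICATES)
    literal = escape_literal(triple_object)
    return (
        f"{header}\n"
        "SELECT DISTINCT ?page WHERE {\n"
        f'  ?resource ?anyPredicate "{literal}" .\n'
        f"  {branches}\n"
        "}\n"
        f"LIMIT {limit}"
    )


print(formulate_query("Geert-Jan Houben"))
```

UNION branches are used rather than a property path so that the query stays within SPARQL 1.0, which is what endpoints commonly supported when the talk was given.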
Proof of Concept
  • Semantic links for pages that contain bibliographic citations.
  • Uses an extended version of the FreeCite parsing engine.
Example of a bibliographic citation:
Kees van der Sluijs, Geert-Jan Houben, Erwin Leonardi, Jan Hidders. Hera: Engineering Web Applications Using Semantic Web-Based Models. Book chapter: Semantic Web Information Management: A Model-Based Perspective, De Virgilio, Roberto; Giunchiglia, Fausto; Tanca, Letizia (Eds.), Chapter 22, 2010, Springer.
[Diagram: the Linkator tasks (extract terms from web pages, associate terms with concepts, find resources that represent these concepts) mapped onto the proof of concept]
  • Information Extraction Engine (FreeCite Extraction Engine): HTML page → markup removed → plain text → entity extraction → text semantically annotated
  • Semantic Annotator: insert annotations on the page → HTML page semantically annotated
  • Core Linkator (when a semantic link is clicked): SPARQL endpoint discovery and selection → endpoint querying → URL generation
Example – HTML Page without Links
Example – Page annotated with RDFa
Example – Page with Semantic Links
Conclusion and Future Work
  • For the specific scenario of linking bibliographic citations, Linkator provides a reasonable solution.
  • Composing Semantic Web technologies can provide a reasonable solution to the problem of automatic linking.
  • Linkator is a concrete application that uses Semantic Web technologies.
Future work:
  • Use Linkator in a broader scenario.
  • Enhance the Linkator algorithms.
  • Evaluate the precision and recall of the linking.
Questions?
Thank you for your attention!
Samur Araujo
s.f.cardosodearaujo@tudelft.nl
You can download Linkator at: http://www.wis.ewi.tudelft.nl/
Annotations on the page are used to find the link destination
[Diagram: HTML page → (page is annotated) → annotated HTML page → (link is clicked) → RDF]
State-of-the-Art in Automatic Linking
Examples:
  • Wikify! [1] focuses on linking keywords on web pages to Wikipedia articles.
  • NNexus [2] focuses on linking keywords obtained from an index extracted from target documents.
Endpoint Resolution
FUNCTION SelectEndpoint
    E := Array
    R := select all rdf:type objects associated to the semantic link
    T := ExtractVocabulary(R)
    FOR EACH vocabulary in T DO
    {
        E.add (select endpoints that contain this vocabulary)
    }
    IF E = Empty
    {
        R := select all predicates associated to the semantic link
        T := ExtractVocabulary(R)
        FOR EACH vocabulary in T DO
        {
            E.add (select endpoints that contain this vocabulary)
        }
    }
    RETURN E

FUNCTION ExtractVocabulary(R)
    V := Array
    FOR EACH resource in R DO
    {
        V.add (extract the vocabulary from the resource)
    }
    RETURN V
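The pseudocode above could be translated into runnable Python roughly as follows. This is a sketch under stated assumptions, not Linkator's actual code: the voiD descriptors are replaced by a hypothetical in-memory dictionary `void_index` mapping each vocabulary to the endpoints that declare it, triples are plain `(subject, predicate, object)` tuples, and the endpoint URLs are made up.

```python
from urllib.parse import urldefrag

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def vocabulary_of(uri):
    """Strip the local name: drop the fragment, or the last path segment if there is none."""
    base, fragment = urldefrag(uri)
    if fragment:
        return base
    return uri.rsplit("/", 1)[0] + "/"


def extract_vocabulary(resources):
    """ExtractVocabulary: vocabularies of the given resources, in order, without duplicates."""
    vocabularies = []
    for resource in resources:
        vocabulary = vocabulary_of(resource)
        if vocabulary not in vocabularies:
            vocabularies.append(vocabulary)
    return vocabularies


def endpoints_for(vocabularies, void_index):
    """Collect the endpoints that declare any of the vocabularies (stand-in for a voiD lookup)."""
    found = []
    for vocabulary in vocabularies:
        for endpoint in void_index.get(vocabulary, []):
            if endpoint not in found:
                found.append(endpoint)
    return found


def select_endpoint(triples, void_index):
    """SelectEndpoint: try the vocabularies of rdf:type objects first, then of all predicates."""
    types = [o for (_s, p, o) in triples if p == RDF_TYPE]
    endpoints = endpoints_for(extract_vocabulary(types), void_index)
    if not endpoints:
        predicates = [p for (_s, p, _o) in triples]
        endpoints = endpoints_for(extract_vocabulary(predicates), void_index)
    return endpoints


# Hypothetical data: triples of one semantic link and a tiny endpoint index.
triples = [
    ("http://example.org/p1", RDF_TYPE, "http://ontoware.org/swrc/swrc_v0.3.owl#Author"),
    ("http://example.org/p1", "http://xmlns.com/foaf/0.1/name", "Geert-Jan Houben"),
]
void_index = {"http://ontoware.org/swrc/swrc_v0.3.owl": ["http://example.org/sparql-a"]}
print(select_endpoint(triples, void_index))
```

Unlike the pseudocode, this version removes duplicate endpoints, so an endpoint that declares several matching vocabularies is returned only once.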
Semantic Link – Example
[Figure: triples associated with the semantic link]



Editor's Notes

  • #2: I am in the starting phase of my PhD research. In this presentation, I will outline my initial vision of the research problem, which is building trust in web content, and our approach to solving it. I will also give a brief plan of my PhD research.
  • #6: We focus on content trust and formulate our main research questions. The first key issue is to investigate what kinds of factors can influence trust in content. Following from the first, we also need to know how to capture and represent information about these factors. The third key issue is how to assess or compute content trust based on the information obtained in the second step. Ideally, we want a trust value assigned to every piece of content. Unlike the propagation of trust through a network of people, we now have more information about, and semantics for, the content, so we want to build metrics that assess trustworthiness based on the content itself and on the connections between different pieces of content, especially semantic similarity and relations.