SEMANTIC WEB
WITH JAHIA
February 2014

www.sigma.fr
SUMMARY

• WHY ?
• Background
• Web 2.0 is not enough
• WHAT ?
• Definitions
• It’s real
• HOW ?
• JAHIA fits
• Integration

WHY ?

• Background
• Web 2.0 is not enough

Background: who are we?
Thomas Delerm (SIGMA) and Adrien Di Mascio (Logilab) explain why semantic web
technologies matter in modern web applications and how to make the best use of your data.
They share the recipes that make Jahia an appropriate CMS for the semantic
and linked data web, a.k.a. "Web 3.0".



Adrien DI MASCIO - Semantic Web Director
Company : Logilab



Thomas DELERM - Web Architect
Company : SIGMA
Previously worked at mobile and IPTV content startups

How the web evolved


Web « 1 » was about
documents and links



Web « 2.0 » is about social
and users

https://web.archive.org/web/19991116151216/http://www4.yahoo.com/

WHY ?

• Background

• Web 2.0 is not enough

Failures of Web 2.0


All the databases and APIs live in “silos” → searches are limited



Results are documents, not objects



Are my results up to date and reliable ?

Example: Renault. There are too many possible combinations when you want to buy a car: more than 10^20 [1]

[1] http://www.semweb.pro/talk/2474
Failures of Web 2.0


Web 2.0 is far from perfect :



User tags
– Different spellings for the same thing
– Different meanings for the same spelling (e.g. “Hollande”)
– No relationships between tags



You cannot (in one request)
answer complex queries like “List
on my website 10 products
whose producer is Samsung and
price under $50”
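
The query on this slide becomes a single expression once products are published as structured objects rather than as pages. A minimal Python sketch, assuming a hypothetical in-memory catalogue (the `Product` fields and the sample data are invented for illustration, not taken from any real site):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    name: str
    producer: str
    price_usd: float


# Hypothetical sample catalogue; names and prices are made up.
CATALOGUE = [
    Product("Phone case", "Samsung", 12.0),
    Product("Tablet", "Samsung", 299.0),
    Product("USB charger", "Samsung", 19.5),
    Product("Earbuds", "OtherBrand", 35.0),
]


def cheap_products(catalogue, producer, max_price, limit=10):
    """Return up to `limit` products by `producer` priced strictly under `max_price`."""
    matches = [
        p for p in catalogue
        if p.producer == producer and p.price_usd < max_price
    ]
    return matches[:limit]


result = cheap_products(CATALOGUE, "Samsung", 50)
print([p.name for p in result])
```

The point is not the filtering itself but that the producer and the price are addressable properties; with tagged documents only, the same question needs several requests and some guessing.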

We have a solution


Each step came with a technical evolution:
– From PC to Web: WWW and links

– From Web to Web 2.0: AJAX (dynamic web sites)

– From Web 2.0 to Web 3.0: semantic properties and linked data

So let’s learn what the semantic web is!
WHAT ?

• Definitions
• It’s real

Semantic Web – (Anti)definitions

Today, the Semantic Web is not:
– Magic
– Natural language processing
– Automatic image processing
– A new protocol
It is a worldwide network of data built upon a set of interoperable standards that
use URLs to identify data and link them together.
No Natural Language Processing
A human reads:

<h1>Semantic Web</h1>
<p>Semantic Web is a worldwide network of data invented by <a
href="http://w3.org/People/Berners-Lee">Tim Berners Lee</a> in
1994.</p>

A machine reads:

<h1> ????????????</h1>
<p> ??????????????????????????????????????????????????
?????<a href="http://w3.org/People/Berners-Lee"> ???????????????</a> ????????</p>
If only ...
… The machine could read:



SemanticWeb is_a network



SemanticWeb was_created_by TimBernersLee



SemanticWeb was_created_in 1994
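
The three statements above are subject / predicate / object triples. A minimal Python sketch of how a program could store and query them, assuming plain tuples and a hypothetical `match` helper (no RDF library involved):

```python
# Each fact is a (subject, predicate, object) triple, as on the slide.
TRIPLES = [
    ("SemanticWeb", "is_a", "network"),
    ("SemanticWeb", "was_created_by", "TimBernersLee"),
    ("SemanticWeb", "was_created_in", "1994"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples matching the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


# "Who created the Semantic Web?" becomes a pattern with one unknown.
print(match(TRIPLES, subject="SemanticWeb", predicate="was_created_by"))
```

Query languages such as SPARQL generalize this idea: a query is a set of triple patterns with variables in place of the wildcards.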

Annotate your document
Use RDFa or schema.org microdata

<p itemscope itemtype="Concept">
<span itemprop="name">Semantic Web</span> is
<span itemprop="description">a worldwide network of data</span>
invented by
<a itemprop="creator" href="http://w3.org/People/Berners-Lee">
Tim Berners Lee</a>
in <span itemprop="creation_date">1994</span>.</p>
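
To illustrate what a machine gains from such annotations, here is a minimal Python sketch that extracts the `itemprop` values with the standard library's `html.parser`. It assumes a well-formed variant of the markup above (with `itemscope` and an `itemprop` attribute on the date); the `ItemPropParser` class is a hypothetical helper, not a full microdata implementation:

```python
from html.parser import HTMLParser

HTML = (
    '<p itemscope itemtype="Concept">'
    '<span itemprop="name">Semantic Web</span> is '
    '<span itemprop="description">a worldwide network of data</span> invented by '
    '<a itemprop="creator" href="http://w3.org/People/Berners-Lee">Tim Berners Lee</a> '
    'in <span itemprop="creation_date">1994</span>.</p>'
)


class ItemPropParser(HTMLParser):
    """Collect itemprop values: the href for links, the text content otherwise."""

    def __init__(self):
        super().__init__()
        self.props = {}
        self._current = None  # itemprop whose text is currently being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("itemprop")
        if prop is None:
            return
        if tag == "a" and "href" in attrs:
            # For links, the property value is the target URL.
            self.props[prop] = attrs["href"]
        else:
            self._current = prop
            self.props[prop] = ""

    def handle_data(self, data):
        if self._current is not None:
            self.props[self._current] += data

    def handle_endtag(self, tag):
        self._current = None


parser = ItemPropParser()
parser.feed(HTML)
parser.close()
print(parser.props)
```

The creator comes out as a URL rather than a name, which is exactly what lets another dataset link to the same person.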

Publish another representation
Publish RDF and use HTTP content-negotiation
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
# dc is assumed here to be the Dublin Core elements namespace
@prefix dc: <http://purl.org/dc/elements/1.1/> .

<http://mysite.com/SemanticWeb>
    a skos:Concept ;
    skos:closeMatch <http://data.bnf.fr/ark:/12148/cb119328992> ;
    dc:creator <http://w3.org/People/Berners-Lee/> ;
    dc:date "1994" .
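
HTTP content negotiation means the same URL returns HTML to a browser and RDF to a client that asks for it through the `Accept` header. A simplified Python sketch of the server-side choice, assuming a hypothetical `AVAILABLE` list of representations; it handles `q` values and `*/*` only, not partial wildcards such as `text/*`:

```python
# Representations the (hypothetical) server can produce, in server preference order.
AVAILABLE = ["text/html", "text/turtle", "application/ld+json"]


def parse_accept(header):
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    entries = []
    for part in header.split(","):
        pieces = [p.strip() for p in part.split(";")]
        media_type, q = pieces[0], 1.0
        for param in pieces[1:]:
            if param.startswith("q="):
                q = float(param[2:])
        if media_type:
            entries.append((media_type, q))
    # The sort is stable, so equal q values keep the client's order.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def negotiate(header, available=AVAILABLE):
    """Pick the representation to serve; fall back to the first available one."""
    for media_type, q in parse_accept(header):
        if q <= 0:
            continue
        if media_type == "*/*":
            return available[0]
        if media_type in available:
            return media_type
    return available[0]


print(negotiate("text/turtle, text/html;q=0.8"))
print(negotiate("application/xml;q=0.9, */*;q=0.1"))
```

Falling back to HTML when nothing matches is a design choice of this sketch; a stricter server could answer 406 Not Acceptable instead.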

More familiar with JSON? Take a look at JSON-LD.
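
As an illustration, the same description can be written as a JSON-LD document. A sketch in Python that builds and prints it; the `dc` prefix is assumed to be the Dublin Core elements namespace, which the slide does not state:

```python
import json

doc = {
    "@context": {
        "skos": "http://www.w3.org/2004/02/skos/core#",
        # Assumption: dc refers to the Dublin Core elements namespace.
        "dc": "http://purl.org/dc/elements/1.1/",
    },
    "@id": "http://mysite.com/SemanticWeb",
    "@type": "skos:Concept",
    # {"@id": ...} marks a value as a link to another resource, not a string.
    "skos:closeMatch": {"@id": "http://data.bnf.fr/ark:/12148/cb119328992"},
    "dc:creator": {"@id": "http://w3.org/People/Berners-Lee/"},
    "dc:date": "1994",
}

print(json.dumps(doc, indent=2))
```

Any JSON consumer can read this document as ordinary JSON; the `@context` is what lets a linked data client interpret the keys as full URLs.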

Vocabularies, ontologies



An ontology is a structured set of terms and concepts.



Each term and concept is also identified by a URL

 There are quite a few standard ontologies for various domains
(social interactions, libraries, music, events, etc.)
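
Because every term is a URL, short names such as `foaf:name` are only abbreviations for a namespace plus a local name. A small Python sketch of that expansion, using the published namespaces of four common vocabularies (the `expand` helper itself is hypothetical):

```python
PREFIXES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "schema": "http://schema.org/",
}


def expand(term, prefixes=PREFIXES):
    """Expand a prefixed name such as 'foaf:name' into its full URL."""
    prefix, sep, local = term.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"unknown or missing prefix in {term!r}")
    return prefixes[prefix] + local


print(expand("foaf:name"))
print(expand("skos:Concept"))
```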

Make it happen now !



RDF is nice



Some database engines store RDF graphs
- You can query them with the SPARQL language



Standardized by W3C



You don't necessarily need to change your technology stack



If your data is structured, publishing RDF is easy
- Choosing an ontology or a vocabulary can be hard
- Making your relational database answer a SPARQL query is hard
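
To show why publishing RDF from structured data is the easy part, here is a Python sketch that turns rows into N-Triples lines. The rows, the `example.org` URL scheme, and the column-to-property mapping are all assumptions made for the example:

```python
# Hypothetical rows, as they might come out of a relational "books" table.
ROWS = [
    {"id": 1, "title": "Example Book A", "author": "Author A"},
    {"id": 2, "title": "Example Book B", "author": "Author B"},
]

BASE = "http://example.org/book/"        # assumed URL scheme for the records
DC = "http://purl.org/dc/elements/1.1/"  # Dublin Core elements namespace
COLUMN_TO_PROPERTY = {"title": DC + "title", "author": DC + "creator"}


def escape(value):
    """Escape backslashes, quotes and newlines for an N-Triples string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def row_to_ntriples(row):
    """Yield one N-Triples line per mapped column of the row."""
    subject = f"<{BASE}{row['id']}>"
    for column, prop in COLUMN_TO_PROPERTY.items():
        yield f'{subject} <{prop}> "{escape(str(row[column]))}" .'


lines = [line for row in ROWS for line in row_to_ntriples(row)]
print("\n".join(lines))
```

The hard parts named on the slide remain: picking the vocabulary that goes into `COLUMN_TO_PROPERTY`, and answering arbitrary SPARQL queries instead of dumping a fixed export.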

WHAT ?

• Definitions

• It’s real

It's all about data
Publishing structured data:

• Helps search engines
– Better indexing
– Better page ranking
• Eases external data integration
– Importing a CSV file requires prior agreement on its structure
– Maintaining data is expensive, so reuse published data (DBpedia, Freebase,
GeoNames)
Examples
GoodRelations annotations

Schema.org annotations

HOW ?

• Jahia fits
• Integration

Client case : Bpi


One goal: use state-of-the-art Semantic Web technologies, since Bpi is a library
(Bibliothèque publique d’information)



3 main needs:
– Enter data easily, both on contents and within contents
– Store data in a safe, RDF-friendly manner
– Output data
• On every page for SEO (RDFa)
• In searches
• In exports (RDF)



Good news : Jahia fits !

The choice of Jahia


Input:
- Jahia lets you define clear content definitions (CND files) with
inheritance.
- Jahia is content-centric



Enrich within contents: CKEditor



On contents: contribute or edit (GWT) modes
The choice of Jahia : storage and output
Storage: you need a framework that can abstract different sources of data:
enter JCR (Java Content Repository)
– Single repository for all content
– External data sources are abstracted: LDAP, files, other databases…
Output:
– Graph structure + XML format → a good fit for metadata
– JSP views can easily be tailored for special export formats
HOW ?

• Jahia fits

• Integration

Input : CKEditor and categories


Make sure text data is stored as plain HTML
- Properties file to map schema.org → HTML code
- In-content schema.org properties → we created a CKEditor plugin
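
The mapping from a schema.org property to HTML code could look like the following Python sketch. The properties text, the template syntax, and the `annotate` helper are all hypothetical; the actual file format used in the project is not shown in the slides:

```python
from html import escape

# Hypothetical mapping in the spirit of a Java .properties file:
# schema.org property = HTML template inserted by the editor plugin.
PROPERTIES_TEXT = """
# key = schema.org property, value = HTML template
name=<span property="name">{text}</span>
startDate=<time property="startDate" datetime="{text}">{text}</time>
"""


def load_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        mapping[key.strip()] = value.strip()
    return mapping


def annotate(prop, text, mapping):
    """Wrap `text` in the HTML template configured for `prop`."""
    return mapping[prop].format(text=escape(text))


MAPPING = load_properties(PROPERTIES_TEXT)
print(annotate("name", "Semantic Web & Jahia", MAPPING))
print(annotate("startDate", "2014-02-01", MAPPING))
```

Keeping the templates in a properties file means the stored text stays plain HTML and the markup can change without touching the plugin code.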



Three-way categorization of contents
– Categories (closed list)
– Tags (open list)
– Authorities (closed list, linked with BnF)



Next steps
– Do we need a triple store?
– Categorization through automatic crawling?
Content structure


Directories per category



The semantic mapping is transparent :
no additional field to fill in



Properties files map each field to its
semantic exports (Dublin Core, FOAF, etc.)
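
A sketch of how such a mapping keeps the semantics transparent for contributors: they fill in ordinary fields, and the export derives the vocabulary terms. The field names, the mapping, and the `example.org` URL are invented for illustration and do not come from the Bpi project:

```python
# Hypothetical mapping: one content field -> one or more vocabulary terms.
FIELD_MAPPING = {
    "title": ["dc:title", "og:title"],
    "author": ["dc:creator"],
    "published": ["dc:date"],
}


def semantic_export(node_url, fields, mapping=FIELD_MAPPING):
    """Return (subject, term, value) statements for every mapped, non-empty field."""
    statements = []
    for field, value in fields.items():
        if not value:
            continue
        for term in mapping.get(field, []):
            statements.append((node_url, term, value))
    return statements


content = {
    "title": "Semantic Web with Jahia",
    "author": "Thomas Delerm",
    "internal_note": "draft",  # unmapped fields are simply not exported
}
for statement in semantic_export("http://example.org/content/42", content):
    print(statement)
```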

Challenges we met
– Where to store the metadata of a file → extend jnt:file
– How to create a sub-content while creating its parent → edit the Spring GWT
XML
Vocabularies used

Page                                  Schema.org                  OpenGraph                 Dublin Core               FOAF
------------------------------------  --------------------------  ------------------------  ------------------------  -------------------
Lists                                 No                          No                        No                        No
Details on short and long contents    Yes                         Yes                       Yes                       Partial
Details: events, IT resource [file]   Yes                         No                        Yes                       No
Authors, places                       No                          No                        Yes                       Yes
In HTML                               Everywhere                  Header                    Header                    Everywhere
Format in HTML                        RDFa                        Meta                      Meta                      RDFa
In RDF                                Yes                         Yes, one line per meta    Yes, one line per meta    Yes, native
Contributed by                        Automatic + manual (Bpi)    Automatic (mapping)       Automatic (mapping)       Automatic (mapping)
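
For the two vocabularies output as header meta tags, a minimal Python sketch of the generated lines. Open Graph uses the `property` attribute and the usual Dublin Core convention uses `name="DC.xxx"`; the `meta_lines` helper and the sample values are hypothetical:

```python
from html import escape


def meta_lines(title, description):
    """Build header <meta> lines: Open Graph uses `property`, Dublin Core uses `name`."""
    pairs = [
        ("property", "og:title", title),
        ("property", "og:description", description),
        ("name", "DC.title", title),
        ("name", "DC.description", description),
    ]
    return [
        f'<meta {attr}="{key}" content="{escape(value)}">'
        for attr, key, value in pairs
    ]


for line in meta_lines("Semantic Web with Jahia", 'A "linked data" case study'):
    print(line)
```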
Output


We chose RDFa because it is currently more widely used than microdata



Open debate: should enrichment be done manually, automatically, or through a
mix of both?



The field → dc:xxx mapping will be used to improve search results



“ARK” URIs are used to exchange objects between repositories (internal,
Jahia, external like BnF)
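
An ARK (Archival Resource Key) has the shape `ark:/NAAN/Name`, where the NAAN identifies the naming authority, so the same identifier can be recognized whatever host serves it. A simplified Python sketch, using the BnF URL from the RDF example; the internal `example.org` URL is invented, and qualifiers after the name are ignored:

```python
import re

# Simplification: numeric NAAN, and anything after the name is ignored.
ARK_PATTERN = re.compile(r"ark:/(?P<naan>\d+)/(?P<name>[^/?#]+)")


def parse_ark(text):
    """Extract (naan, name) from a URL or bare ARK string, or return None."""
    found = ARK_PATTERN.search(text)
    if found is None:
        return None
    return found.group("naan"), found.group("name")


def same_object(url_a, url_b):
    """Two URLs denote the same object if both carry the same ARK."""
    ark_a, ark_b = parse_ark(url_a), parse_ark(url_b)
    return ark_a is not None and ark_a == ark_b


bnf = "http://data.bnf.fr/ark:/12148/cb119328992"
local = "http://example.org/authorities/ark:/12148/cb119328992"  # hypothetical internal URL
print(parse_ark(bnf))
print(same_object(bnf, local))
```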

Future




Free your data!
Bring it together
Share it between applications and
externally



This forces you to organize your IT
differently
Future : Facebook


Facebook is gradually promoting
posts that contain Open Graph data [1]



“Facebook testing more uses for
Open Graph” [2]

[1] http://newsroom.fb.com/News/787/News-Feed-FYI-What-Happens-When-You-See-More-Updates-from-Friends (January 21, 2014)
[2] http://allfacebook.com/add-to-my-movies-link_b128387
Future : Web 3.0

Conclusion


“If you’re not paying for it, you are the product” [1]



The Semantic Web is going to be imposed by the internet giants because they need
it to know you better



Take the first step to enrich your data, don’t miss the train!



Jahia 7 is on board:
– External data provider
– High-quality, extensible editor

[1] http://blogs.law.harvard.edu/futureoftheinternet/2012/03/21/meme-patrol-when-something-online-is-free-youre-not-the-customer-youre-the-product/
Questions & Answers



Webography:
New W3C Blog on Semantic Web & linked data: http://www.w3.org/blog/data/
http://fr.slideshare.net/AntidotNet/time2-market-lyon-13nov2013-slideshare
http://fr.slideshare.net/terraces/technologies-du-web-smantique-pour-lentreprise-20
http://fr.slideshare.net/AntidotNet/web-smantique-web-de-donnes-web-30-linked-data-quelques-repres-pour-sy-retrouver


JahiaOne - Semantic Web with Jahia


Editor's Notes

  • #21: 19 July 2013, Google: Knowledge Graph expansion. More than a quarter of all searches started showing some kind of Knowledge Graph result after this date. 20 August 2013: Google Hummingbird focuses on conversational and semantic search to try to deliver correct answers to broad-meaning questions.
  • #29: We chose not to output semantics on lists pages on purpose