Towards a Multilingual Ontology for
Ontology-driven Content Mining in Social
Web Sites
Marcirio Silveira Chaves (1) - marcirioc@uatlantica.pt
Cássia Trojahn (2) - cassia.trojahn@inrialpes.fr
(1) Universidade Atlântica, Oeiras, Portugal
(2) INRIA & LIG, Grenoble, France
Workshop on Cross-Cultural and Cross-Lingual Aspects of the Semantic Web
Shanghai, China, November 7th, 2010
In conjunction with the 9th International Semantic Web Conference (ISWC2010)
Motivation
• The Social Semantic Web is highly dependent on the development of
multilingual ontologies.
• Only 2.5% of the ontologies in the OntoSelect library are multilingual.
• (Multilingual) Hotel domain ontologies are rare.
• Multilingual comments need to be processed.
• Ontology-driven mining of comments from Social Web sites.
November 7th 2C3LSW2010
Context
• Customer Knowledge Management (CKM)
– Customer Relationship Management (CRM) and
– Knowledge Management (KM).
• Multilingual comments to support CKM
Outline
• Multilingual Ontology Application
• Hontology
• Related Ontologies
• Extending Hontology
• Conclusion
• Ongoing Work
Multilingual Ontology Application
[Figure: architecture of the multilingual ontology application.
Components: Social web data; Extraction, Transformation, Loading; Comment
annotator; Multilingual ontology; Ontology augmenter; Knowledge base; User
interface; CKM.
Stages: Data pre-processing; Comments annotation; Ontology enrichment;
Searching.
Actors: Expert; Manager.]
Hontology
• Development Methodology
– Identify existing ontologies on related domains
– Select the main concepts and properties
– Organize concepts and properties hierarchically into categories
– Translate the ontology (manual)
– Expand concepts and properties based on comments
– Translate the new concepts and properties (manual)
– Generate the ontology in several formats
Hontology
• Category: contains all the types of categories into which a Hotel
can be classified, e.g., tourist, comfort, and luxury.
• Facility: includes the utility options offered by each hotel, e.g.,
beauty salon, kids club, and pool bar.
• Hospitality: contains the existing kinds of hotels, e.g., hostel,
pension, and motel.
Hontology
• Hotel: details the kinds of hotels, e.g., bunker, cave, and capsule.
• Leisure: lists the leisure options, e.g., gym, jacuzzi, and sauna.
• Points of interest: often mentioned in comments about the
hotels, e.g., stadium, museum, and monument.
• Room: splits into Hostel Room and Hotel Room, which have
different kinds and nomenclature for rooms.
Hontology
• Hontology supports three languages
– English, French and Portuguese
• 97 concepts
• 9 object properties
• 25 data properties
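A concept carrying labels in the three supported languages could be represented along the following lines. This is only a minimal sketch: the `Concept` class, the `build_index` helper, and the specific labels are illustrative assumptions and are not taken from Hontology itself.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """A language-neutral concept with labels per language (hypothetical model)."""
    name: str
    labels: dict[str, set[str]] = field(default_factory=dict)  # language code -> labels


def build_index(concepts):
    """Map (language, lower-cased label) to the concept name."""
    index = {}
    for concept in concepts:
        for lang, labels in concept.labels.items():
            for label in labels:
                index[(lang, label.lower())] = concept.name
    return index


# Illustrative labels only; the real Hontology labels may differ.
concepts = [
    Concept("Museum", {"en": {"museum"}, "fr": {"musée"}, "pt": {"museu"}}),
    Concept("Gym", {"en": {"gym"}, "fr": {"salle de sport"}, "pt": {"ginásio"}}),
]
index = build_index(concepts)
```

With such an index, a label found in a French or Portuguese comment resolves to the same concept as its English counterpart, which is the property that cross-language search relies on.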
Related Ontologies
                               Mondeca          HarmoNET                  Travel Itinerary  Hontology
# concepts                     1000             54                        8                 97
# properties                   n.a.             166                       24                34
# instances                    Zero             Zero                      Zero              Zero
Domain                         Tourism          Tourism                   Travel            Hotel
Multilingual                   No               No                        No                Yes
Use                            Mondeca Project  Accommodation and events  n.a.              Hotel Sector Support Decision
Publicly and freely available  No               Yes                       Yes               Yes
Extending Hontology
• Ontology augmenter
• Multilingual ontology matching
• Machine-learning methods
• (Semi-)automatic multilingual extension
• Hontology can be used as a multilingual resource for cross-language
information retrieval.
Extending Hontology
• Ontology augmenter
Term correlation: considers candidate terms mentioned in the comments that
co-occur with terms already present in Hontology.
• In "Rooms are comfortable, but pillows are very hard", the terms "room" (in
the ontology) and "pillow" (not in the ontology) should probably be related
through a property linking them in Hontology.
• Once the ontology is enriched with the term "pillow", a comment
containing, for instance, only the sentence "Pillows are very hard" can be
found under the concept "room".
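The term-correlation idea can be sketched as a simple co-occurrence count. This is an assumption-laden illustration, not the authors' implementation: the `correlate` function, the naive plural stripping, and the `ignore` word list are all hypothetical. Here `room` is treated as the known concept and `pillow` as the candidate term.

```python
import re
from collections import Counter


def normalize(token):
    # Naive singularization (an assumption; a real system would use a lemmatizer).
    return token[:-1] if token.endswith("s") and len(token) > 3 else token


def tokens(comment):
    return [normalize(t) for t in re.findall(r"[a-z]+", comment.lower())]


def correlate(comments, ontology_terms, ignore):
    """Count how often an unknown term co-occurs with a known ontology term."""
    pairs = Counter()
    for comment in comments:
        seen = set(tokens(comment))
        known = seen & ontology_terms
        unknown = seen - ontology_terms - ignore
        for k in known:
            for u in unknown:
                pairs[(k, u)] += 1
    return pairs


comments = ["Rooms are comfortable, but pillows are very hard"]
ontology_terms = {"room"}
ignore = {"are", "but", "very", "comfortable", "hard"}  # hypothetical stop list
candidates = correlate(comments, ontology_terms, ignore)
```

Pairs with a high count across many comments are candidates for a new property linking the unknown term to the known concept; an expert would still need to confirm each proposed link.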
Extending Hontology
• Ontology augmenter
Rules (or lexical patterns): comments usually contain a set of common adjectives,
e.g., good, cheap, and soft.
• Use lexical patterns to extract the relevant terms that precede or follow
the adjective, as in:
• "Air-conditioned is loud", "Small bathroom".
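The two lexical patterns above can be sketched with regular expressions. This is a minimal illustration under stated assumptions: the adjective list, the `extract_terms` function, and the exact patterns are hypothetical, and a real system would likely rely on part-of-speech tagging rather than a fixed word list.

```python
import re

# Hypothetical adjective list; the slides only name good, cheap, and soft.
ADJECTIVES = {"good", "cheap", "soft", "loud", "small", "hard"}


def extract_terms(comment, adjectives=ADJECTIVES):
    """Return terms that precede ('X is ADJ') or follow ('ADJ X') an adjective."""
    adj = "|".join(sorted(re.escape(a) for a in adjectives))
    text = comment.lower()
    found = []
    # Term preceding the adjective: "<term> is|are [very] <adjective>"
    preceding = rf"\b([a-z][a-z-]*)\s+(?:is|are)\s+(?:very\s+)?(?:{adj})\b"
    for match in re.finditer(preceding, text):
        found.append(match.group(1))
    # Term following the adjective: "<adjective> <term>"
    following = rf"\b(?:{adj})\s+([a-z][a-z-]*)\b"
    for match in re.finditer(following, text):
        found.append(match.group(1))
    return found
```

The second pattern is deliberately loose and will also pick up non-nouns (for example the word after "good" in "good but noisy"), so its output is best treated as a list of candidates for review.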
Extending Hontology
• Ontology augmenter
Synonyms
• Elements that must be considered in the improvement of Hontology.
• They have already been considered in the process of adding labels to the
concepts.
• This task can be extended with the help of dictionaries and lexical resources
within an automatic process.
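Extending concept labels from a lexical resource could look roughly like this. The `add_synonyms` function and the small synonym dictionary are hypothetical stand-ins for whichever dictionary or lexical resource is eventually used.

```python
def add_synonyms(labels, synonym_dict):
    """Return the label set extended with any synonyms found in the dictionary."""
    extended = set(labels)
    for label in labels:
        extended.update(synonym_dict.get(label, ()))
    return extended


# Hypothetical lexical resource: label -> synonyms.
synonym_dict = {"gym": {"fitness centre", "fitness room"}}
gym_labels = add_synonyms({"gym"}, synonym_dict)
```

Labels without an entry in the dictionary are returned unchanged, so the step is safe to run over every concept.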
Ongoing work
(1) enrich Hontology by using potential terms from comments
(2) exploit Hontology in Multilingual Ontology Matching (i.e., creating
correspondences between Hontology and other ontologies)
(3) include labels in other languages
(4) explore the issues related to ontology localization and
internationalization.
Final Remarks
• Main contribution
– to make available to the community a multilingual ontology
that can be used as a baseline for many usages and applications
in the context of the Multilingual Semantic Web.
