Schema Matching and Ontology Mapping: A Comparison
Pradeep Pillai and Michael Kandefer
Department of Computer Science and Engineering, University at Buffalo, Buffalo, NY 14260
{pbpillai,mwk3}@cse.buffalo.edu
Introduction
- Interoperability problem
  - The problem of combining heterogeneous and distributed data sources
- Two solutions:
  - Schema matching
  - Ontology mapping
- W3C is converging on standards for publishing web ontologies (e.g. OWL)
  - Distributed ontologies are still an issue
- Intuition: schema matching approaches are applicable to the ontology domain
Agenda
- Schema Matching
- Ontology Mapping
- Comparison
- Ontology mapping using schema matching
- Conclusion
Schema Matching Definition
- The distinction between matching and mapping isn't clear
- Schema matching: the process of "establishing [logical] correspondences between elements of the source and target schemas" [Cho08]
- Schema mapping: the process of generating the assertions from schema matching
  - Sometimes called "instance mapping"
Schema Matching Typology
- Two general categories [ShvEuz05, MadBerRah01]
  - Element-based: mappings created based on analysis of the schema elements
    - String-based
    - Language-based
    - Constraint-based
  - Structure-based: mappings created based on analysis of the elements and the schema structure
    - Tree-based
    - Graph-based
- Matching approaches aren't mutually exclusive
  - Hybrid systems employ multiple methodologies
- Other properties
  - Mappings need not be 1:1
  - Auxiliary information can be utilized
Element-based: String mappings
- Utilizes string comparisons between element names to establish mappings
  - Prefix/Suffix: look for shared prefixes/suffixes
  - Edit distance: how many character substitutions, insertions, or deletions it takes to convert one element name into the other
  - N-gram: compute the number of common substrings of length n
- Ex. COMA, S-Match
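The three string matchers listed above can be sketched in a few lines of Python. This is an illustrative sketch only, not the implementation used by COMA or S-Match; the function names are made up for this example, and edit distance is taken to be the standard Levenshtein distance.

```python
def common_prefix_len(a: str, b: str) -> int:
    """Length of the longest prefix shared by a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def common_suffix_len(a: str, b: str) -> int:
    """Length of the longest suffix shared by a and b."""
    return common_prefix_len(a[::-1], b[::-1])


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: substitutions, insertions, and deletions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def ngrams(s: str, n: int = 3) -> set:
    """Set of all substrings of length n."""
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def common_ngrams(a: str, b: str, n: int = 3) -> int:
    """Number of distinct n-grams shared by a and b (case-insensitive)."""
    return len(ngrams(a.lower(), n) & ngrams(b.lower(), n))


print(common_prefix_len("POShipTo", "POBillTo"))  # 2
print(edit_distance("Items", "Item"))             # 1
print(common_ngrams("ItemCount", "Count"))        # 3
```

A real matcher would normalize these raw counts into a similarity score (for example, by dividing by the longer name's length) before comparing candidates.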
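Element-based: String mappings (example)
[Figure: string-based mappings between the PurchaseOrder and PO schemas. Legend: Prefix (3), 3-Gram (2), Edit distance (5).]
- PurchaseOrder schema elements: PurchaseOrder, DeliverTo, InvoiceTo, Items, Address, Address, Item, Street, City, City, Street, ItemCount, ItemNumber, Quantity, UnitOfMeasure
- PO schema elements: PO, POShipTo, POBillTo, POLines, Item, Street, City, City, Street, Count, Line, Qty, UoM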
Element-based: Language mappings
- Utilizes properties of language to find elements with a common word sense
- Normalization
  - Tokenization: punctuation (or capitalization) is used to divide an element name into tokens
  - Expansion: expand acronyms and shorthand tokens
  - Elimination: remove undesirable tokens, such as prepositions, before comparison
  - Lemmatization: tokens are converted to their base form (e.g. plurals removed) and compared
- Auxiliary information: utilize external sources to aid matching
  - WordNet, thesauri, or dictionaries
- Ex. Cupid, S-Match
Element-based: Language mappings (example)
Step            POBillTo                 InvoiceTo
Tokenize:       <PO,Bill,To>             <Invoice,To>
Elimination:    <PO,Bill>                <Invoice>
Expansion:      <Purchase,Order,Bill>    <Invoice>
Related form:   <Purchase,Order,Bill>    <Bill>
[Figure: the same PurchaseOrder and PO schemas, with the language-based mapping between POBillTo and InvoiceTo highlighted.]
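The normalization steps in the POBillTo / InvoiceTo example can be sketched as a small pipeline. Everything below is an assumption made for illustration: the lookup tables (`EXPANSIONS`, `STOPWORDS`, `SYNONYMS`) are tiny hand-written stand-ins for the thesaurus or WordNet lookup a real system such as Cupid would use, tokens are split on capitalization, and the lemmatizer only strips a plural "s".

```python
import re

# Hypothetical auxiliary tables; a real matcher would consult a thesaurus.
EXPANSIONS = {"po": ["purchase", "order"], "qty": ["quantity"]}
STOPWORDS = {"to", "of"}
SYNONYMS = {"invoice": "bill", "deliver": "ship"}


def tokenize(name):
    """Split on capitalization and punctuation: 'POBillTo' -> po, bill, to."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [p.lower() for p in parts]


def expand(tokens):
    """Replace known acronyms and shorthand with their full forms."""
    out = []
    for t in tokens:
        out.extend(EXPANSIONS.get(t, [t]))
    return out


def eliminate(tokens):
    """Drop prepositions and other low-value tokens."""
    return [t for t in tokens if t not in STOPWORDS]


def lemmatize(tokens):
    """Naive lemmatizer: strip a plural 's' (keeps words like 'address')."""
    return [t[:-1] if len(t) > 3 and t.endswith("s") and not t.endswith("ss")
            else t for t in tokens]


def related_form(tokens):
    """Map each token to a canonical synonym where one is known."""
    return [SYNONYMS.get(t, t) for t in tokens]


def normalize(name):
    return related_form(lemmatize(eliminate(expand(tokenize(name)))))


def token_overlap(a, b):
    """Jaccard overlap of the normalized token sets."""
    ta, tb = set(normalize(a)), set(normalize(b))
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


print(normalize("POBillTo"))   # ['purchase', 'order', 'bill']
print(normalize("InvoiceTo"))  # ['bill']
```

The pipeline applies expansion before elimination, following the order of the bullet list; for this example the result is the same as in the worked table, where elimination comes first.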
Structure-based: Graph/tree mappings
- Represents schemas as graphs/trees
  - Nodes are elements and attributes
  - Arcs are relationships
- Assumes that matched elements in the two graphs should have related elements that can also be matched
- Ex. Similarity Flooding, Cupid
Ontology Mapping Problem
- Ontology definition:
  - "Specification of a conceptualization." [Gru93]
  - "Explicit formal specification of the terms in the domain and relations among them."
- Ontology mapping definition:
  - "Given two ontologies O1 and O2, mapping one ontology onto another means that for each entity (concept C, relation R, or instance I) in ontology O1, we try to find a corresponding entity, which has the same intended meaning, in ontology O2" [Ehrig and Staab]
Ontology Mapping Research
Research classification of ontology mapping [Noy04]:
- Mapping discovery: finding the similarities between two ontologies. How do we determine which concepts and properties represent similar notions?
- Declarative formal representation of mappings: identifying how mappings between two ontologies can be represented so that we can reason with them.
- Reasoning with mappings: once the mapping is defined, what types of reasoning can we perform with it, and how?
Survey: State of the Art
Snoggle
- An interactive, visual ontology mapping tool.
- Users define mappings between the two ontologies; the mappings are expressed in SWRL (Semantic Web Rule Language).
- The mappings are converted into Jena rules, which are applied by the Jena inference engine to produce instances that can be queried.
GLUE
- GLUE [Doa+3] uses machine learning techniques to find mappings.
- Given two ontologies, for each concept in one ontology it finds the most similar concept in the other ontology.
- The GLUE architecture consists of:
  - Distribution Estimator
  - Similarity Estimator
  - Relaxation Labeler
- GLUE outputs one-to-one correspondences between the taxonomies of the ontologies.
- Uses string similarity, structure, and machine learning strategies.
PROMPT
- PROMPT [Noy04]
  - Input: two ontologies in OWL/OKBC
  - Output: suggested mappings and a merged ontology based on the choices made by the user
- iPROMPT: interactive ontology merging tool.
- AnchorPROMPT: graph-based mappings that provide additional information for iPROMPT.
- PROMPTDiff: compares different ontology versions by combining matchers in a fixed-point manner.
- PROMPTFactor: tool for extracting a part of an ontology.
IR Approaches
Lucene Ontology Mapper
- The source ontology is indexed into Lucene Documents (fields) using the Lucene search engine.
- Each field in the target ontology is provided as a search argument, which is in turn compared with the fields in the source document, and hit scores are computed.
- Fields with the maximum hit scores are considered similar and hence mapped.
- PowerMap also uses Lucene as part of its ontology mapping framework.
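The index-then-query idea can be illustrated without Lucene itself. The sketch below is a pure-Python stand-in, not the Lucene API and not Lucene's actual scoring formula: it indexes source field names as bags of tokens, scores a target field name against each with a TF-IDF cosine, and maps the target to the highest-scoring source field. All names (`build_index`, `best_match`) are hypothetical.

```python
import math
import re
from collections import Counter


def tokens(label):
    """Split a field name on capitalization: 'ItemCount' -> item, count."""
    return [t.lower() for t in re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", label)]


def build_index(source_fields):
    """Index each source field as a bag of tokens plus corpus-wide IDF weights."""
    docs = {f: Counter(tokens(f)) for f in source_fields}
    doc_freq = Counter(t for bag in docs.values() for t in bag)
    idf = {t: math.log(1 + len(docs) / df) for t, df in doc_freq.items()}
    return docs, idf


def score(query, doc, idf):
    """TF-IDF cosine between a query field name and one indexed field."""
    q = Counter(tokens(query))
    dot = sum(q[t] * doc[t] * idf.get(t, 0.0) ** 2 for t in q)
    norm_q = math.sqrt(sum((q[t] * idf.get(t, 0.0)) ** 2 for t in q))
    norm_d = math.sqrt(sum((doc[t] * idf[t]) ** 2 for t in doc))
    return dot / (norm_q * norm_d) if norm_q and norm_d else 0.0


def best_match(target_field, index):
    """Return (source_field, score) for the top hit, or None if nothing scores."""
    docs, idf = index
    field = max(docs, key=lambda f: score(target_field, docs[f], idf))
    top = score(target_field, docs[field], idf)
    return (field, top) if top > 0 else None


index = build_index(["ItemCount", "ItemNumber", "Quantity", "UnitOfMeasure"])
print(best_match("Count", index))  # top hit is 'ItemCount'
print(best_match("Qty", index))    # None: no shared tokens without expansion
```

The second query shows the limit of a purely lexical index: "Qty" only finds "Quantity" if abbreviations are expanded before indexing or querying.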
QOM
- Uses string similarity, structure, and instances.
- Input: two OWL or RDFS ontologies with their elements (e.g. classes, properties, instances)
- Output: one-to-one or one-to-none correspondences.
- Heuristics are used to lower the number of candidate mappings.
  - It avoids the complete pairwise comparison of trees in favor of a top-down strategy.
- Sigmoid functions are applied, which emphasize high individual similarities and de-emphasize low individual similarities.
- A threshold is used to discard spurious evidence of similarity.
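A minimal sketch of the sigmoid-plus-threshold step, under stated assumptions: each individual similarity is shifted by 0.5 and passed through a sigmoid before a weighted average is taken, and the result is compared against a cut-off. The steepness, weights, and threshold values below are illustrative choices, not values taken from the QOM authors.

```python
import math


def sigmoid(x, steepness=10.0):
    """Logistic function; steepness is an assumed, tunable parameter."""
    return 1.0 / (1.0 + math.exp(-steepness * x))


def aggregate(similarities, weights, steepness=10.0):
    """Weighted average of sigmoid-adjusted individual similarities.

    Shifting by 0.5 centers the sigmoid, so scores above 0.5 are pushed
    toward 1 and scores below 0.5 are pushed toward 0.
    """
    total = sum(weights)
    return sum(w * sigmoid(s - 0.5, steepness)
               for s, w in zip(similarities, weights)) / total


def accept(score, threshold=0.7):
    """Discard candidate mappings whose aggregated score is below the threshold."""
    return score >= threshold


# Hypothetical per-feature similarities for one candidate pair:
# label similarity, structural similarity, instance similarity.
score = aggregate([0.9, 0.8, 0.3], weights=[2.0, 1.0, 1.0])
print(round(score, 3), accept(score))
```

With these settings a single high similarity of 0.9 is lifted above 0.98 while a low one of 0.1 drops below 0.02, which is the emphasize/de-emphasize behavior the slide describes.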
Schemas and Ontologies
- Schemas [Cho08, UscGru03]
  - Specify database structure
    - Relationships
    - Attributes
  - Typically relational or XML
- Ontologies [UscGru03, ShvEuz05]
  - Formal semantic specification of a shared conceptualization
    - Concepts
    - Relationships
  - Typically encoded with formal languages
    - Description logics
  - Most utilize taxonomic structure
Similarities
- Both are forms of metadata
- Both are utilized for domain description
- Both utilize constraints (but in different ways)
Differences
- Few differences
- The essential (and trivial) difference is what each specifies and how each is used
  - Databases for querying
  - Ontologies for search and derivation
  - Lines are blurring (e.g. SPARQL)
- Schemas don't have semantics
  - Relational schemas lack generality
  - Ontologies use constraints to establish meaning
  - Schemas use constraints to establish integrity
Consequences of Differences
- Element matching approaches [Wac+6]
  - Top-level ontologies
    - A shared ontology provides a common language and semantics for the ontologies it subsumes
    - Ontologies that inherit from the top-level ontology can be mapped more easily
  - Semantic correspondence
    - Utilizes top-level ontologies for automatic ontology mapping
  - Formal concept analysis: produces a common concept lattice between ontologies through object-attribute analysis
- Structure level [ShvEuz05]
  - Topology matching
    - Utilizes sub-/superclass semantics
    - Assumes the superclasses and subclasses of matched elements are more likely to be related
  - Model matching
    - Utilizes semantic interpretations of ontologies to construct logical representations of potential mappings
    - Utilizes background "knowledge" to provide axioms for the representation
    - Runs a SAT/validity checker to determine "correct" mappings
Schema -> Ontology
- Due to the similarities, and the few differences:
  - Applications can be made that translate DB schemas to ontologies [XuZhaDon06]
  - Methodologies developed with both in mind will benefit both
  - Algorithms for schema matching are applicable to ontology mapping
- Some approaches that rely on semantics prevent the opposite [Hess06]
  - Schema vocabularies and forced taxonomic structure could eliminate this
Schema Matching Algorithm
- Implementing an algorithm for OWL ontology mapping based on Cupid
- Cupid [MadBerRah01]
  - Hybrid approach
  - Uses linguistic and data-type constraint matching followed by tree structure mapping
  - "Derives" mappings as a result of coefficient computation
- Our approach
  - Parse two OWL ontologies
  - Use a simple string matcher for initial similarities
  - Apply the tree structure methodology to known OWL semantics
Tree Matcher
- Assumptions
  - Leaf nodes are structurally similar (ssim) if they have lexical and data-type similarity
    - lsim(s,t) in [0, 1]: lexical similarity; uses substring, normalization, hypernymy, and synonymy matching
    - data-type-similarity(s,t) in [0, 0.5]: lookup table of data types and their similarity
  - Non-leaf nodes are structurally similar if they are lexically similar and their leaf nodes have high weighted similarity (wsim); immediate children do not influence ssim
    - wsim(s,t) in [0, 1]: measure of the lexical and structural similarity; preference for one or the other is controlled by a modifying constant
- Constants
  - w_struct: modifies the influence of each matcher
  - th_accept: when to accept two leaf nodes as strongly linked
  - th_high / th_low: when to increase/decrease structural similarity
  - c_inc / c_dec: how much to increase/decrease structural similarity
- Algorithm: TreeMatch(S, T)
  1. Initialize ssim(s,t) = data-type-similarity(s,t) for every pair of leaf nodes s in S and t in T
  2. Using post-order traversal, for every node s in S and every node t in T:
     - wsim(s,t) = w_struct * ssim(s,t) + (1 - w_struct) * lsim(s,t)
     - if wsim(s,t) > th_high, increase ssim for all leaf nodes of s and t by c_inc
     - if wsim(s,t) < th_low, decrease ssim for all leaf nodes of s and t by c_dec
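The TreeMatch steps above can be sketched in Python as follows. Several details are assumptions rather than anything stated on the slide: ssim for a pair involving a non-leaf node is taken to be the fraction of leaves under the two nodes that have at least one strong link (wsim above th_accept) to a leaf on the other side, c_inc and c_dec are applied as multiplicative factors capped at 1.0, and the default constants are illustrative. The `Node` class and the toy `lsim` and data-type functions exist only for this example.

```python
from dataclasses import dataclass, field
from itertools import product


@dataclass(eq=False)  # eq=False keeps identity hashing, so nodes can be dict keys
class Node:
    name: str
    dtype: str = ""
    children: list = field(default_factory=list)


def post_order(node):
    """Yield every node of the tree, children before parents."""
    for child in node.children:
        yield from post_order(child)
    yield node


def leaves(node):
    return [n for n in post_order(node) if not n.children]


def tree_match(S, T, lsim, dtype_sim, w_struct=0.5, th_accept=0.5,
               th_high=0.6, th_low=0.35, c_inc=1.2, c_dec=0.9):
    # Step 1: leaf pairs start from their data-type similarity.
    ssim = {(s, t): dtype_sim(s.dtype, t.dtype)
            for s, t in product(leaves(S), leaves(T))}
    wsim = {}

    def weighted(s, t):
        return w_struct * ssim[s, t] + (1 - w_struct) * lsim(s.name, t.name)

    # Step 2: compare every pair of nodes, bottom-up.
    for s in post_order(S):
        for t in post_order(T):
            ls, lt = leaves(s), leaves(t)
            if s.children or t.children:
                # Assumed rule: share of leaves with a strong link across.
                linked_s = sum(any(weighted(x, y) > th_accept for y in lt) for x in ls)
                linked_t = sum(any(weighted(x, y) > th_accept for x in ls) for y in lt)
                ssim[s, t] = (linked_s + linked_t) / (len(ls) + len(lt))
            wsim[s, t] = weighted(s, t)
            if wsim[s, t] > th_high:
                factor = c_inc
            elif wsim[s, t] < th_low:
                factor = c_dec
            else:
                continue
            # Reinforce or weaken the leaf pairs under s and t.
            for x, y in product(ls, lt):
                ssim[x, y] = min(1.0, ssim[x, y] * factor)
    return wsim


def toy_lsim(a, b):            # stand-in for a real linguistic matcher
    return 1.0 if a.lower() == b.lower() else 0.0


def toy_dtype_sim(a, b):       # stand-in for the data-type lookup table
    return 0.5 if a == b else 0.0


s_street, s_city = Node("Street", "string"), Node("City", "string")
t_street, t_city = Node("Street", "string"), Node("City", "string")
ship = Node("POShipTo", children=[s_street, s_city])
deliver = Node("DeliverTo", children=[t_street, t_city])
S = Node("PO", children=[ship])
T = Node("PurchaseOrder", children=[deliver])

scores = tree_match(S, T, toy_lsim, toy_dtype_sim)
print(scores[s_street, t_street])  # 0.75
print(scores[s_street, t_city])    # 0.25
```

In this toy run POShipTo and DeliverTo share no name tokens, yet they still receive a moderate wsim because all of their leaves are strongly linked; that is the structural evidence the tree matcher is meant to contribute.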
Cupid Mappings
[Figure: Cupid mappings between the PurchaseOrder and PO schemas. Legend: High lsim (A1), High wsim (A2), Matches.]
Conclusions
- Schema matching and ontology mapping address similar problems
- Schema matching approaches are applicable to ontology mapping
  - They don't utilize semantic information
  - The opposite doesn't hold
- Hybrid approaches are the best methodologies for automatic, generic schema matching and ontology mapping
- Systems that employ schema matching might be capable of working with ontologies given minimal adjustment (e.g. Cupid)
- Additional experimentation is needed
References
[Cho08] J. Chomicki. Data Integration: Schema Mapping. February 2008. http://www.cse.buffalo.edu/~chomicki/636/handout-mapping.pdf
[Doa+3] A. Doan, J. Madhavan, P. Domingos, and A. Halevy. Learning to Map between Ontologies on the Semantic Web. Proceedings of the 11th International Conference on World Wide Web. 2002.
[Gru93] T. R. Gruber. A Translation Approach to Portable Ontology Specifications. Knowledge Acquisition 5(2). 1993.
[Hess06] A. Hess. An Iterative Algorithm for Ontology Mapping Capable of Using Training Data. Proceedings of ESWC '06. 2006.
[MadBerRah01] J. Madhavan, P. Bernstein, and E. Rahm. Generic Schema Matching with Cupid. Proceedings of the 27th VLDB Conference. 2001.
[Noy04] N. Noy. Semantic Integration: A Survey of Ontology-based Approaches. SIGMOD Record, Special Issue on Semantic Integration. 2004.
[ShvEuz05] P. Shvaiko and J. Euzenat. A Survey of Schema-based Matching Approaches. Journal on Data Semantics. 2005.
[UscGru03] M. Uschold and M. Gruninger. Ontologies and Semantics for Seamless Connectivity. SIGMOD Record 33(4). 2004.
[Wac+6] H. Wache, T. Vogele, U. Visser, H. Stuckenschmidt, G. Schuster, H. Neumann, and S. Hubner. Ontology-based Integration of Information: A Survey of Existing Approaches. IJCAI-01 Workshop: Ontologies and Information Sharing. 2001.
[XuZhaDon06] Z. Xu, S. Zhang, and Y. Dong. Mapping between Relational Database Schema and OWL Ontology for Deep Annotation. Proceedings of the 2006 IEEE/WIC/ACM International Conference on Web Intelligence. 2006.
