Ontological Knowledge Engineering
for Cultural Heritage of Andean Textiles

                   Immanuel Normann

                        July 20, 2012


  Department of Computer Science and Information Systems
Project Context
●   Pre-Columbian Latin America had no writing system
●   Alternative encoding systems were developed to pass down
    cultural knowledge
●   Hypothesis: weaving patterns as “writing systems” in this sense
●   General research endeavour: deciphering these “writing
    systems”
●   Our objective: systematization of knowledge about Andean
    weaving through an ontological approach
    ●   implementation of an ontological knowledge system
    ●   instantiation of the system with facts
Project Team
●   La Paz
    Instituto de Lengua y Cultura Aymara (Denise Y Arnold)
    ●   Domain experts: knowledge acquisition and creation, building
        physical and virtual models, creating multimedia data.
    ●   Software developer: web front end
●   London
    ●   AHRC (Luciana Martins):
        principal investigator & domain experts (iconographic analysis)
    ●   Birkbeck DCS (Sven Helmer):
        Knowledge engineering + knowledge system implementation
My Role in this Project
●   Knowledge engineering
●   Content processing
●   Software engineering
Overview
●   1 Software engineering
●   2 Knowledge engineering
●   3 Content processing
Software Matters
Project status at the beginning of my work
●   Project proposal intends an ontological approach
●   La Paz team already acquainted with ontology-related know-how:
    ●   Methontology
    ●   Protege, CMap tools
    ●   CIDOC-CRM
●   Great amount of knowledge/data in spreadsheets
●   Relational database schemas developed.
●   other
    ●   handwritten museum register documents
    ●   images, videos, other multimedia documents,
    ●   woven samples
Initial Steps
●   identification of central research subdomains and their
    documents:
    textiles, instruments, processes, historical/cultural
    backgrounds, iconography, ...
●   identification of central documents: concept maps, spreadsheets
●   identification of the requirements for the KMS:
    ●   identification of stakeholders
    ●   development of use case scenarios
    ●   competency questions
●   setting up a communication platform & versioning system
Example Concept Map
[Figure: concept map, in Spanish, centred on “Objeto textil” (textile object).
 Recoverable node labels: tiempo, periodo (P. Colonial, P. Contemporáneo, etc.),
 estilo (e. universal, e. local/tecnológico), materia prima, fibra, tinte,
 mordiente, instrumento, telar (T. horizontal, T. cintura), rueca, prenda,
 imagen (foto, video), vida social, aprendizaje, lugar, sitio (S. producción,
 S. recojo, S. custodia), ruta, actor, persona, tejedora, grupo, actividades,
 evento, movimiento, proceso (esquila, hilado, teñido, urdido, tejido, acabado),
 estructura, técnica.
 Recoverable link labels: es, tiene, se elabora con, se hizo en, elaborado por,
 pertenece a, se obtiene mediante.]
Example Datasheet
Example Competency Questions
●   ¿En qué sitios se halla evidencia de la práctica de la técnica x?
●   At which sites is there evidence of the practice of technique X?

●   ¿En qué culturas se halla evidencia de la práctica de tal técnica?
●   In which cultures is there evidence of the practice of a given technique?

●   ¿Cuál es el registro más antiguo de la técnica T?
●   What is the oldest record of technique T?

●   ¿En qué tipo de prenda se empleó por primera vez la técnica X?
●   In what type of garment was technique X first used?

●   ¿Qué tipos de textiles se ha tejido usando la técnica T en un período P y región R?
●   What types of textiles were woven using technique T in period P and region R?
Early Results from Requirement Analysis
●   How much ontological reasoning is needed?
●   Which system could provide it?
●   Early tendency: RDBMS.
    ●   RDB schema already defined
    ●   content already partially inserted in the RDBMS
    ●   most content in spreadsheets
    ●   ideas for simple reasoning developed
        (transitivity, ontological queries translated to SQL)
●   Does this approach satisfy the requirements?
Against the RDBMS approach
●   Knowledge in concept maps
    ●   graph-like knowledge representation, closer to ontological
        knowledge representation.
    ●   graph-like queries involving some reasoning.
●   Dynamic model evolution
    ●   RDB schema change vs. ontology change.
Relational Database vs. Ontology

Relational database systems
●   are perfect for modelling relationships with a static knowledge model
    (i.e. a static relationship schema)
●   make schema change problematic, and
●   have no notion of hierarchies.


Ontology knowledge systems
●   can store the same datatypes as relational database systems
●   allow for modelling relationships
     –   in a different way, closer to concept maps than to relation tables
●   have a built-in notion of hierarchies!
●   and allow even more reasoning.
Queries on Graph Structures




select all Accesorios es elabora con Técnica para faz de trama
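
A graph-style query of this kind can be read as two triple patterns joined on a shared variable. The following is only a minimal sketch in plain Python, not the project's query engine: the individuals (faja_01, chuspa_02, poncho_03) and the predicate spellings are hypothetical, chosen to mirror the link labels of the concept map.

```python
# Hypothetical sample graph: identifiers are illustrative, not project data.
TRIPLES = {
    ("faja_01", "es", "Accesorios"),
    ("faja_01", "se_elabora_con", "tecnica_faz_de_trama"),
    ("chuspa_02", "es", "Accesorios"),
    ("chuspa_02", "se_elabora_con", "tecnica_faz_de_urdimbre"),
    ("poncho_03", "es", "Prenda"),
    ("poncho_03", "se_elabora_con", "tecnica_faz_de_trama"),
}


def subjects(triples, predicate, obj):
    """Return every subject s such that (s, predicate, obj) is in the graph."""
    return {s for (s, p, o) in triples if p == predicate and o == obj}


def accessories_made_with(triples, technique):
    """Join two triple patterns on the shared subject variable."""
    return subjects(triples, "es", "Accesorios") & subjects(
        triples, "se_elabora_con", technique
    )


print(sorted(accessories_made_with(TRIPLES, "tecnica_faz_de_trama")))
```

In a triple store the same join would typically be written as a SPARQL pattern; the set intersection here only illustrates the idea.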
Requirements for Museum KMS

A museum knowledge management system should
●   facilitate relations between entities
●   have built-in support for basic reasoning
●   be flexible w.r.t. the evolution of the knowledge model
●   facilitate storage of basic datatypes (numbers, boolean, ...), free
    text, and multimedia.
Conclusion
●   The RDB approach is insufficient w.r.t. model evolution and
    reasoning.
●   An ontological storage engine is required.
●   Which is the best for our purpose?
Review of Triplestores

●   State-of-the-art surveys:
    ●   http://www.w3.org/wiki/LargeTripleStores
    ●   Europeana RDF Store Report (2011)
●   An incomplete list of triple stores:
    ●   Native stores: AllegroGraph, OWLIM, Stardog
    ●   RDBMS based: Oracle, Jena SDB
    ●   hybrid: Virtuoso, Sesame, BigData
Our Decision: Virtuoso
●   Why Virtuoso:
    ●   multi-paradigm storage: RDBMS (SQL), XML (XQuery), OWL
        (SPARQL), reasoning.
    ●   scalable, massive data processing, stable, open-source edition,
        active community.
    ●   some know-how from former projects
●   Possible drawbacks:
    ●   too many ways to implement a knowledge base.
    ●   the manual has 4000 pages.
    ●   reasoning capabilities lag behind reasoners like Pellet.
Knowledge Engineering
Formal ontologies in a nutshell
Conceptual issues
Ontology in a nutshell
●   unary constructs:
    ●   individual (e.g. the textile object whose ID is ILCA_BML074)
    ●   class (e.g. the set of all garments classified as Poncho)
●   binary constructs:
    ●   object property = relation between individuals (e.g. in custody of:
        textile object ILCA_BML074 is in custody of the British Museum)
    ●   data property = attribute of an individual (e.g. has width: textile
        object ILCA_BML074 has width 52 cm)
    ●   instance of (type) = a relation between an individual and a class
        (e.g. textile object ILCA_BML074 is an instance of the class Facha
        Ancha)
    ●   subclass relation = relation between classes (e.g. Facha Ancha is
        a subclass of Accesorios)
    ●   and more, such as: union, intersection, complement, quantification, number restriction, ...
Ontology Schema and Facts

    Ontology schema (TBox)
●   subclass relations (e.g. Poncho is a subclass of Producto Textil)
●   domain and range restrictions of
    ●   object properties (e.g. in custody of has domain Producto Textil
        and range Museum)
    ●   data properties (e.g. has width has domain Producto Textil and
        range cm)
    Ontology facts (ABox)
●   all relations involving individuals (instance of, object properties,
    data properties)
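
The split between schema and facts can be sketched with two small sets of statements. This is an illustration only: the individuals p1 and british_museum and the spelling in_custody_of are assumptions, and the check below treats domain and range as integrity constraints, which is a simplification (in OWL they license inferences rather than validate data).

```python
# TBox: schema-level statements, following the examples on the slide.
TBOX_SUBCLASS = {("Poncho", "Producto Textil")}
TBOX_DOMAIN = {"in_custody_of": "Producto Textil"}
TBOX_RANGE = {"in_custody_of": "Museum"}

# ABox: facts about individuals (p1 and british_museum are hypothetical names).
ABOX_TYPES = {("p1", "Poncho"), ("british_museum", "Museum")}
ABOX_RELATIONS = {("p1", "in_custody_of", "british_museum")}


def superclasses(cls):
    """All classes reachable from cls via subclass links, including cls."""
    found = {cls}
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for sub, sup in TBOX_SUBCLASS:
            if sub == current and sup not in found:
                found.add(sup)
                frontier.append(sup)
    return found


def classes_of(individual):
    """Asserted classes of an individual plus their superclasses."""
    result = set()
    for ind, cls in ABOX_TYPES:
        if ind == individual:
            result |= superclasses(cls)
    return result


def violations():
    """List ABox relations whose subject or object misses the domain/range."""
    problems = []
    for s, p, o in sorted(ABOX_RELATIONS):
        if TBOX_DOMAIN[p] not in classes_of(s):
            problems.append((s, p, o, "domain"))
        if TBOX_RANGE[p] not in classes_of(o):
            problems.append((s, p, o, "range"))
    return problems


print(classes_of("p1"))
print(violations())
```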
[Figure: TBox and ABox]
Knowledge Engineering
Formal ontologies in a nutshell
Conceptual issues
Abstract Entities
●   Abstract entities do not exist in space or in time.
●   Concrete entities exist at least in time. For example:
    ●   physical objects (like garments, books, etc.)
    ●   events (like the production of a certain garment)
●   Entities like colour, material, and shape are rather time independent.
●   What is the appropriate way to model abstract entities?
    In OWL we have only two options: as classes or as instances.
●   For concrete entities it is easy:
    ●   the jacket I am wearing is an instance of the class of all jackets, which is
        a subclass of physical objects.
    ●   the discovery of Machu Picchu by Hiram Bingham is an instance of the
        class of all discoveries, which is a subclass of events.
Abstract Entities
●   What about abstract entities: can they have subclasses or
    instances? For example colours:
        –   Is the red we see here one instance and the red we see there
            another instance?
        –   If so, isn't it inconsistent to say that they are both the same red?
            (We introduced the concept of colour occurrence.)
        –   Is red a unique colour or a class of colours whose instances are e.g.
            dark-red and orange-red?
        –   Aren't dark-red and orange-red rather themselves classes of reds?
        –   Are there any colours at all that cannot be subdivided into more granular
            colour values? (We chose to stop at RGB. For physicists,
            wavelength would make more sense.)
Semi Abstract Entities
●   structure, technique, motif:
    ●   not localized in space: possibly at two different places at the same
        time.
    ●   not localized in time: may exist even if currently not applied or
        observed.
    ●   but: techniques / motifs are invented and can be forgotten
●   epoch and style
    ●   seem to be clearly bound to a certain time period, but
    ●   at least styles may be revived at any time.
    ●   epoch is a highly debated concept anyway.
Anonymous Entities
●   How should we formalize “Poncho p1 is made of Alpaca”?
    The naive way:
    p1 made_of a1.     p1 type Poncho. a1 type Alpaca.
    p1 is a concrete object we can point to. What about a1?
●   Consider: “Poncho p2 is also made of Alpaca”.
    p2 made_of a2.     p2 type Poncho. a2 type Alpaca.
    Is a1=a2 or not?
    We don't know and we don't care!
Anonymous Entities
●   Proper formalization of “Poncho p1 is made of Alpaca”:
    p1 type (made_of some Alpaca)
●   meaning:
    ●   p1 is an instance of the class (made_of some Alpaca)
    ●   (made_of some Alpaca) is the class of all x such that there exists
        an a with x made_of a, where a is an instance of Alpaca.
    short: “p1 is made of some instance of Alpaca”
Limited Reasoning in Virtuoso
●   (made_of some Alpaca) is a quantified class expression
    (some is its quantifier)
●   Problem with Virtuoso: it accepts quantified expressions, but
    does not support reasoning on them.
●   Example:
    p1 type (made_of some Alpaca)
    Alpaca subClassOf Camelido
    => p1 type (made_of some Camelido)
●   Virtuoso cannot infer this conclusion.
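
The missing inference can be stated as a single rule: if x type (R some C) and C subClassOf D, then x type (R some D). The sketch below applies that rule in plain Python; it is an illustration of the rule only, not of how Virtuoso or any OWL reasoner is implemented, and the data structures are assumptions.

```python
# Direct subclass links, as on the slide.
SUBCLASS_OF = {"Alpaca": {"Camelido"}}

# (individual, property, filler) stands for: individual type (property some filler)
SOME = {("p1", "made_of", "Alpaca")}


def ancestors(cls):
    """cls together with all of its direct and indirect superclasses."""
    result = {cls}
    for parent in SUBCLASS_OF.get(cls, set()):
        result |= ancestors(parent)
    return result


def close_under_subclass(assertions):
    """Apply the rule: x type (R some C), C subClassOf D => x type (R some D)."""
    return {
        (ind, prop, sup)
        for ind, prop, filler in assertions
        for sup in ancestors(filler)
    }


for fact in sorted(close_under_subclass(SOME)):
    print("%s type (%s some %s)" % fact)
```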
Prototypes as Workaround

Workaround for the Quantification Problem
●   introduce a class Prototype
●   create for every class (if needed) a dedicated instance of
    Prototype.
●   Example:
    alpaca type Prototype.    alpaca type Alpaca.
    alpaca prototype_for Alpaca.
Prototypes as Workaround

    Reasoning via prototypes
●   Replace p1 type (made_of some Alpaca)
    by p1 made_of alpaca.
●   Now Virtuoso can deduce:
    p1 made_of alpaca.        Alpaca subClassOf Camelido.
    => p1 made_of ?x.         ?x type Camelido.
●   Note:
    ●   prototypes, in contrast to regular physical individuals, are not
        located in space and time ( => modelling conflict )
    ●   alpaca prototype_for Alpaca does not conform to OWL.
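
With a prototype individual in place, only subclass inheritance of types is needed to answer the query pattern. The following plain-Python sketch mimics that deduction under the slide's example; the data layout and function names are assumptions, and it does not call Virtuoso.

```python
SUBCLASS_OF = {("Alpaca", "Camelido")}

# "alpaca" is the prototype individual standing in for the class Alpaca.
TYPE = {("alpaca", "Prototype"), ("alpaca", "Alpaca"), ("p1", "Poncho")}
MADE_OF = {("p1", "alpaca")}


def inferred_types(individual):
    """Asserted types plus everything reachable through subClassOf."""
    types = {c for i, c in TYPE if i == individual}
    changed = True
    while changed:
        changed = False
        for sub, sup in SUBCLASS_OF:
            if sub in types and sup not in types:
                types.add(sup)
                changed = True
    return types


def made_of_instance_of(cls):
    """Answer the pattern: ?p made_of ?x . ?x type cls ."""
    return sorted((p, x) for p, x in MADE_OF if cls in inferred_types(x))


print(made_of_instance_of("Camelido"))
```

The price of the workaround is visible in the data: alpaca is typed both as Prototype and as Alpaca, although it is not a physical animal or fibre sample.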
Ontological Mistakes

Confusing subclass and instance with part of:
●   Lake Titicaca is a spatial part of the Andes, but not a subclass of it.
●   weaving is a temporal part of garment production (dyeing is another
    one), but neither an instance nor a subclass of it.
●   part of is a super property of spatial part of and temporal part of.
Confusing subclass with instance:
●   Poncho (as an indefinite term) is not an instance of garment but a
    subclass: the class of all concrete ponchos.
Ontological Mistakes

Confusing determined with undetermined objects:
●   in “this poncho (p1) is made of Alpaca”
    Alpaca should not be modelled as a certain instance of the class
    Alpaca!
Confusing equivalence with synonymy and/or translations:
●   if cloak same as manto and manto same as coat,
    then cloak same as coat (an unwanted conclusion).
●   if chair same as Sessel and Sessel same as armchair,
    then chair same as armchair (an unwanted conclusion).
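
Why the translation links misbehave can be seen by computing the equivalence classes that a transitive, symmetric same as relation induces. The sketch below uses a small union-find structure in plain Python; it is an illustration under the slide's example pairs, not part of the project's tooling.

```python
def equivalence_classes(pairs):
    """Group terms that become identical when 'same as' is an equivalence."""
    parent = {}

    def find(x):
        # Follow parent links to the representative, halving paths on the way.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairs:
        parent[find(a)] = find(b)

    groups = {}
    for term in list(parent):
        groups.setdefault(find(term), set()).add(term)
    return sorted(sorted(group) for group in groups.values())


# Translation links asserted as 'same as' collapse three distinct concepts.
print(equivalence_classes([("cloak", "manto"), ("manto", "coat")]))
```

Each pair is a plausible translation on its own; it is only the closure that merges cloak and coat into one concept.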
Related Work

Controlled vocabularies:
●   Getty Thesaurus of Geographic Names (TGN),
●   Cataloging Cultural Objects (CCO),
●   Categories for the Description of Works of Art (CDWA)
Foundational Ontologies:
●   The CIDOC Conceptual Reference Model (CRM):
    concepts and relationships used in cultural heritage documentation.
●   DOLCE (Descriptive Ontology for Linguistic and Cognitive Engineering)
Linking open data (LOD):
●   DBpedia, Freebase, GeoNames, ... (http://linkeddata.org/)
●   Linked Data and SPARQL service of the British Museum
Content Processing
From Structured Content to RDF
[Figure: TBox and ABox]
Migration of Knowledge Representations

    Separation of knowledge modelling:
●   TBox knowledge created with graph drawing tools     (http://www.yworks.com)

●   ABox facts created in spreadsheets


    Technical challenges:
●   migration to target format for TBox and ABox: RDF triples
    (source node - link - target node)

●   TBox migration: easy
●   ABox migration: difficult - due to irregular spreadsheets
●   TBox & ABox vocabulary alignment: tedious
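
The ABox migration and the vocabulary alignment can be pictured as one pass over the spreadsheet rows. The sketch below is a minimal illustration in plain Python: the sheet contents, column names, property names, and the alignment table are all invented for the example and do not come from the project's data.

```python
import csv
import io

# Hypothetical spreadsheet export with irregular spellings.
SHEET = """id,tipo,tecnica
p1,Poncho,faz de urdimbre
p2,poncho ,Faz de  Urdimbre
p3,Ponchito,faz de trama
"""

# Hypothetical alignment from normalised spreadsheet terms to TBox names.
ALIGNMENT = {
    "poncho": "Poncho",
    "faz de urdimbre": "FazDeUrdimbre",
    "faz de trama": "FazDeTrama",
}

# Hypothetical mapping from spreadsheet columns to RDF-style properties.
COLUMN_TO_PROPERTY = {"tipo": "type", "tecnica": "made_with_technique"}


def normalise(value):
    """Collapse whitespace and case so spelling variants compare equal."""
    return " ".join(value.split()).lower()


def rows_to_triples(text):
    """Turn rows into (subject, property, object) triples; collect unmapped cells."""
    triples, unknown = [], []
    for row in csv.DictReader(io.StringIO(text)):
        subject = row["id"].strip()
        for column, prop in COLUMN_TO_PROPERTY.items():
            key = normalise(row[column])
            if key in ALIGNMENT:
                triples.append((subject, prop, ALIGNMENT[key]))
            else:
                unknown.append((subject, column, row[column]))
    return triples, unknown


triples, unknown = rows_to_triples(SHEET)
for triple in triples:
    print(triple)
print("needs manual alignment:", unknown)
```

Cells that do not match the alignment table are reported rather than guessed, which reflects why the alignment step is tedious: each unmatched term needs a decision by a domain expert.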
Concept Hierarchies as TBox
ABox: facts in a spreadsheet
Workflows and Tools
●   Sources: spreadsheets and concept maps
●   Target: RDF
●   Problem: inconsistent vocabulary!
Thank you!
