ONTOLOGY-BASED CLASSIFICATION OF MOLECULES:
A LOGIC PROGRAMMING APPROACH


                Despoina Magka

  Department of Computer Science, University of Oxford


               November 30, 2012
BIOINFORMATICS AND SEMANTIC TECHNOLOGIES
      Life sciences data deluge
      Hierarchical organisation of biochemical knowledge

      Fast, automatic and repeatable classification driven by
      semantic technologies
      Web Ontology Language (OWL), a W3C standard family
      of logic-based formalisms
      OWL bio- and chemo-ontologies widely adopted

THE CHEBI ONTOLOGY

    OWL ontology Chemical Entities of Biological Interest (ChEBI)
    Dictionary of molecules with taxonomical information, e.g.:
        caffeine is a cyclic molecule
        serotonin is an organic molecule
        ascorbic acid is a carboxylic ester
    Pharmaceutical design and study of biological pathways

    ChEBI is extended manually
    Currently ~30,000 chemical entities, expands at 3,500/yr
    Existing chemical databases describe millions of molecules
    Speed up growth by automating chemical classification

EXPRESSIVITY LIMITATIONS OF OWL
      1   At least one tree-shaped model for each consistent OWL
          ontology → problematic representation of cycles
      2   No minimality condition on the models → hard to axiomatise
          classes based on the absence of attributes

     EXAMPLE
     \[ \mathit{Cyclobutane} \sqsubseteq \exists(=4)\,\mathit{hasAtom}.(\mathit{Carbon} \sqcap \exists(=2)\,\mathit{hasBond}.\mathit{Carbon}) \]

     [Figure: cyclobutane drawn as four carbon atoms (C) at the corners of a square; the builds that introduce limitation 2 also show the label "Oxygen"]

          Reasoning support required, but not provided by OWL
            1   Is cyclobutane a cyclic molecule?
            2   Is cyclobutane a hydrocarbon?

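The two questions above can be made concrete on the cyclobutane graph itself. The following is only an illustrative Python sketch, not the formalism presented in the talk: it assumes the molecule is available as a finite, fully known graph (a closed-world reading), and the names `CYCLOBUTANE_ATOMS`, `CYCLOBUTANE_BONDS`, `is_cyclic` and `is_hydrocarbon` are hypothetical. Hydrogen atoms are left out, as in the slide's drawing.

```python
# Cyclobutane as drawn on the slide: four carbon atoms bonded in a ring.
CYCLOBUTANE_ATOMS = {1: "C", 2: "C", 3: "C", 4: "C"}
CYCLOBUTANE_BONDS = [(1, 2), (2, 3), (3, 4), (4, 1)]


def is_cyclic(atoms, bonds):
    """Return True if the undirected bond graph contains a cycle.

    Uses union-find: a bond whose two ends are already connected
    closes a ring. Assumes each bond is listed once.
    """
    parent = {a: a for a in atoms}

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    for u, v in bonds:
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            return True
        parent[root_u] = root_v
    return False


def is_hydrocarbon(atoms):
    """Closed-world check: every listed atom is carbon or hydrogen."""
    return all(element in {"C", "H"} for element in atoms.values())
```

Adding one oxygen atom to the atom table makes `is_hydrocarbon` return False. The check is only meaningful because the table is taken to list all atoms, which is the kind of minimality that limitation 2 says OWL models do not guarantee.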
RESULTS OVERVIEW
    1   Expressive and decidable formalism for modelling complex
        objects: Description Graphs Logic Programs
    2   Modelling that spans a wide range of structure-dependent
        classes of molecules
    3   Implementation that draws upon DLV and performs
        structure-based classification with a significant speedup
    4   Evaluation over part of the manually curated ChEBI
        ontology revealed modelling errors

          Language for representing biochemical structures with a
              favourable performance/expressivity trade-off

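CLASSIFYING STRUCTURED OBJECTS

    [Figure: description graph of ascorbicAcid. Node 0 is the molecule; hasAtom edges lead from it to atom nodes 1 to 13, labelled o (six nodes), c (six nodes) and h (one node), which are connected by single and double bond edges]

    \[
    \begin{aligned}
    \mathit{ascorbicAcid}(x) \to{} & \mathit{hasAtom}(x, f_1(x)) \wedge \ldots \wedge \mathit{hasAtom}(x, f_{13}(x)) \wedge {} \\
    & o(f_1(x)) \wedge \ldots \wedge c(f_7(x)) \wedge \ldots \wedge {} \\
    & \mathit{single}(f_1(x), f_7(x)) \wedge \mathit{double}(f_7(x), f_2(x)) \wedge \ldots
    \end{aligned}
    \]
    \[ \mathit{hasAtom}(x, y_1) \wedge \mathit{hasAtom}(x, y_2) \wedge y_1 \neq y_2 \to \mathit{polyatomicEntity}(x) \]
    \[
    \begin{aligned}
    & \bigwedge_{i=1}^{5} \mathit{hasAtom}(x, y_i) \wedge c(y_1) \wedge o(y_2) \wedge o(y_3) \wedge c(y_4) \wedge \mathit{horc}(y_5) \wedge \mathit{double}(y_1, y_2) \wedge {} \\
    & \mathit{single}(y_1, y_3) \wedge \mathit{single}(y_3, y_4) \wedge \mathit{single}(y_1, y_5) \to \mathit{carboxylicEster}(x)
    \end{aligned}
    \]
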
    Input fact: $\mathit{ascorbicAcid}(a)$
    Stable model, with $a^f_i$ denoting $f_i(a)$:
    \[
    \begin{aligned}
    & \mathit{ascorbicAcid}(a),\ \mathit{hasAtom}(a, a^f_i) \text{ for } 1 \le i \le 13, \\
    & o(a^f_i) \text{ for } 1 \le i \le 6,\ c(a^f_i) \text{ for } 7 \le i \le 12,\ h(a^f_{13}),\ \mathit{single}(a^f_8, a^f_3), \\
    & \mathit{single}(a^f_9, a^f_4),\ \mathit{single}(a^f_{12}, a^f_i) \text{ for } i \in \{5, 11\},\ \mathit{single}(a^f_{11}, a^f_6), \\
    & \mathit{single}(a^f_{10}, a^f_i) \text{ for } i \in \{1, 9, 11, 13\},\ \mathit{single}(a^f_7, a^f_i) \text{ for } i \in \{1, 8\}, \\
    & \mathit{double}(a^f_2, a^f_7),\ \mathit{double}(a^f_8, a^f_9),\ \mathit{horc}(a^f_i) \text{ for } 7 \le i \le 13, \\
    & \mathit{polyatomicEntity}(a),\ \mathit{carboxylicEster}(a),\ \mathit{cyclic}(a)
    \end{aligned}
    \]
    Ascorbic acid is a cyclic polyatomic entity and a carboxylic ester

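To see how the carboxylicEster rule fires on these facts, the rule body can be matched by brute force against the atoms and bonds listed in the stable model. This is a minimal Python sketch, not the DLV-based implementation from the talk. It rests on two assumptions that the slides do not state: bonds are treated as undirected, and the five rule variables are bound to pairwise distinct atoms. All function and variable names are hypothetical; atom `i` stands for $f_i(a)$.

```python
from itertools import permutations

# Atom labels and bonds copied from the stable model (atom i stands for f_i(a)).
ELEMENT = {i: "o" for i in range(1, 7)}
ELEMENT.update({i: "c" for i in range(7, 13)})
ELEMENT[13] = "h"

SINGLE = [(8, 3), (9, 4), (12, 5), (12, 11), (11, 6),
          (10, 1), (10, 9), (10, 11), (10, 13), (7, 1), (7, 8)]
DOUBLE = [(2, 7), (8, 9)]


def symmetric(pairs):
    """Assumption: bonds are undirected, so keep both orientations."""
    return {(u, v) for u, v in pairs} | {(v, u) for u, v in pairs}


single = symmetric(SINGLE)
double = symmetric(DOUBLE)


def horc(atom):
    """Hydrogen-or-carbon predicate used in the rule body."""
    return ELEMENT[atom] in {"h", "c"}


def is_polyatomic_entity():
    """Body of the polyatomicEntity rule: two distinct atoms exist."""
    return len(ELEMENT) >= 2


def carboxylic_ester_matches():
    """Return all bindings (y1, ..., y5) satisfying the carboxylicEster body.

    Assumption: the five variables are bound to pairwise distinct atoms.
    """
    matches = []
    for y1, y2, y3, y4, y5 in permutations(ELEMENT, 5):
        if (ELEMENT[y1] == "c" and ELEMENT[y2] == "o" and ELEMENT[y3] == "o"
                and ELEMENT[y4] == "c" and horc(y5)
                and (y1, y2) in double and (y1, y3) in single
                and (y3, y4) in single and (y1, y5) in single):
            matches.append((y1, y2, y3, y4, y5))
    return matches
```

Under these assumptions there is a single binding, with y1 to y5 bound to atoms 7, 2, 1, 10 and 8. A logic programming engine would derive the same conclusion by evaluating the rule itself rather than enumerating tuples.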
CHEMICAL CLASSES WE COVERED
    1   Existence of subcomponents
            Carbon molecules
            Carboxylic acids and carboxylic esters
            Ketones and aldehydes
    2   Exact cardinality of parts
            Exactly two carbons
            Dicarboxylic acid
    3   Exclusive composition
            Inorganic molecules
            Hydrocarbons
            Saturated molecules
    4   Cyclicity-related classes
            Benzenes
            Cyclic molecules
            Alkanes

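The first three kinds of class definition differ in what they quantify over, which a few lines of Python can show. This is a hedged sketch rather than the rules used in the talk: molecules are reduced to lists of element symbols, the helper names are hypothetical, and "inorganic" is assumed here to mean "contains no carbon atom". Cyclicity-related classes need the bond graph and are not covered by this sketch.

```python
from collections import Counter

# Hypothetical toy inputs: each molecule is just a list of element symbols.
ETHANOL = ["C", "C", "O"] + ["H"] * 6   # C2H6O
WATER = ["H", "H", "O"]                 # H2O


def has_part(atoms, element):
    """Existence of subcomponents: at least one atom of the element."""
    return element in atoms


def has_exactly(atoms, element, n):
    """Exact cardinality of parts: exactly n atoms of the element."""
    return Counter(atoms)[element] == n


def all_parts(atoms, predicate):
    """Exclusive composition: every atom satisfies the predicate."""
    return all(predicate(atom) for atom in atoms)


def is_inorganic(atoms):
    # Assumed definition for illustration only: no carbon atoms at all.
    return all_parts(atoms, lambda atom: atom != "C")
```

Exact cardinality and exclusive composition both depend on the atom list being complete, which is why they call for a formalism with a closed-world flavour.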
EMPIRICAL EVALUATION
    Draws upon DLV, a deductive database engine
    Evaluation with data extracted from ChEBI
    500 molecules classified under 51 chemical classes in 40 secs
    Quicker than other approaches:
        [Hastings et al., 2010] 140 molecules in 4 hours
        [Magka et al., 2012] 70 molecules in 450 secs
    Subsumptions exposed by our prototype:
        ascorbic acid is a polyatomic entity, a carboxylic ester and a
        cyclic molecule
        missing from the ChEBI OWL ontology
    Contradictory subclass relation from ChEBI:
        Ascorbic acid is asserted to be a carboxylic acid (release 95)
        Not listed among the subsumptions derived by our prototype

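The timings quoted above are easier to compare once normalised per molecule. The short Python sketch below does only that arithmetic, using the figures exactly as they appear on the slide. It is not a controlled benchmark: the three runs may differ in the number of classes, the data and the hardware, so the ratios are indicative at best. The names `RUNS` and `seconds_per_molecule` are hypothetical.

```python
# (label, molecules classified, wall-clock seconds), as quoted on the slide.
RUNS = [
    ("this prototype (DLV)", 500, 40),
    ("Hastings et al., 2010", 140, 4 * 60 * 60),
    ("Magka et al., 2012", 70, 450),
]


def seconds_per_molecule(molecules, seconds):
    """Average wall-clock time spent per classified molecule."""
    return seconds / molecules


baseline = seconds_per_molecule(RUNS[0][1], RUNS[0][2])
for label, molecules, seconds in RUNS:
    per_molecule = seconds_per_molecule(molecules, seconds)
    print(f"{label}: {per_molecule:.2f} s/molecule, "
          f"{per_molecule / baseline:.0f}x the prototype's time")
```

This gives roughly 0.08 s, 103 s and 6.4 s per molecule respectively.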
C ONCLUSION AND F URTHER R ESEARCH
    Results
     1   Expressive and decidable formalism for complex objects




8
C ONCLUSION AND F URTHER R ESEARCH
    Results
     1   Expressive and decidable formalism for complex objects
     2   Wide range of structure-based classes




8
C ONCLUSION AND F URTHER R ESEARCH
    Results
     1   Expressive and decidable formalism for complex objects
     2   Wide range of structure-based classes
     3   DLV-based implementation exhibits a significant speedup




8
C ONCLUSION AND F URTHER R ESEARCH
    Results
     1   Expressive and decidable formalism for complex objects
     2   Wide range of structure-based classes
     3   DLV-based implementation exhibits a significant speedup
     4   Evaluation over ChEBI ontology revealed modelling errors




8
C ONCLUSION AND F URTHER R ESEARCH
    Results
     1   Expressive and decidable formalism for complex objects
     2   Wide range of structure-based classes
     3   DLV-based implementation exhibits a significant speedup
     4   Evaluation over ChEBI ontology revealed modelling errors
     Language for representing biochemical structures with a
         favourable performance/expressivity trade-off




8
C ONCLUSION AND F URTHER R ESEARCH
    Results
      1   Expressive and decidable formalism for complex objects
      2   Wide range of structure-based classes
      3   DLV-based implementation exhibits a significant speedup
      4   Evaluation over ChEBI ontology revealed modelling errors
      Language for representing biochemical structures with a
          favourable performance/expressivity trade-off

    Future directions
          SMILES-based surface syntax




8
C ONCLUSION AND F URTHER R ESEARCH
    Results
      1 Expressive and decidable formalism for complex objects
      2 Wide range of structure-based classes
      3 DLV-based implementation exhibits a significant speedup
      4 Evaluation over ChEBI ontology revealed modelling errors

      Language for representing biochemical structures with a
          favourable performance/expressivity trade-off

    Future directions
          SMILES-based surface syntax

           ∧5 hasAtom(x, yi ) ∧ c(y1 ) ∧ o(y2 ) ∧ o(y3 ) ∧ c(y4 )∧
            i=1
           double(y1 , y2 ) ∧ single(y1 , y3 ) ∧ single(y3 , y4 ) ∧ single(y1 , y5 )
           → carboxylicEster(x)




8
C ONCLUSION AND F URTHER R ESEARCH
    Results
      1 Expressive and decidable formalism for complex objects
      2 Wide range of structure-based classes
      3 DLV-based implementation exhibits a significant speedup
      4 Evaluation over ChEBI ontology revealed modelling errors

      Language for representing biochemical structures with a
          favourable performance/expressivity trade-off

    Future directions
          SMILES-based surface syntax

           define carboxylicEster
           some hasAtom SMILES(COC(= O)[∗])
           end.




8
C ONCLUSION AND F URTHER R ESEARCH
    Results
      1   Expressive and decidable formalism for complex objects
      2   Wide range of structure-based classes
      3   DLV-based implementation exhibits a significant speedup
      4   Evaluation over ChEBI ontology revealed modelling errors
      Language for representing biochemical structures with a
          favourable performance/expressivity trade-off

    Future directions
          SMILES-based surface syntax
          Detect subsumptions between classes




8
C ONCLUSION AND F URTHER R ESEARCH
    Results
      1   Expressive and decidable formalism for complex objects
      2   Wide range of structure-based classes
      3   DLV-based implementation exhibits a significant speedup
      4   Evaluation over ChEBI ontology revealed modelling errors
      Language for representing biochemical structures with a
          favourable performance/expressivity trade-off

    Future directions
          SMILES-based surface syntax
          Detect subsumptions between classes
          E.g., Carboxylic ester is an organic molecular entity




8
C ONCLUSION AND F URTHER R ESEARCH
    Results
      1   Expressive and decidable formalism for complex objects
      2   Wide range of structure-based classes
      3   DLV-based implementation exhibits a significant speedup
      4   Evaluation over ChEBI ontology revealed modelling errors
      Language for representing biochemical structures with a
          favourable performance/expressivity trade-off

    Future directions
          SMILES-based surface syntax
          Detect subsumptions between classes
          Extensions with numerical datatypes




8
CONCLUSION AND FURTHER RESEARCH
    Results
      1   Expressive and decidable formalism for complex objects
      2   Wide range of structure-based classes
      3   DLV-based implementation exhibits a significant speedup
      4   Evaluation over ChEBI ontology revealed modelling errors
      Language for representing biochemical structures with a
          favourable performance/expressivity trade-off

    Future directions
          SMILES-based surface syntax
              E.g., define carboxylicEster some hasAtom SMILES(COC(=O)[*]) end.
          Detect subsumptions between classes
              E.g., carboxylic ester is an organic molecular entity
          Extensions with numerical datatypes
              E.g., small molecules are those that weigh less than 800 daltons
          Classification of complex biological objects
          Integration with Protégé, Bioclipse, JChemPaint, ...
          Mapping from our logic to RDF
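
    The carboxylic ester rule from the talk (a carbon double-bonded to one
    oxygen, single-bonded to a second oxygen that carries a carbon, and
    single-bonded to a hydrogen or carbon) can be illustrated with a small
    Python sketch. This is not the DLV-based prototype: the dictionary
    encoding, the names `bonded` and `is_carboxylic_ester`, the treatment of
    bonds as unordered pairs, and the requirement that the five matched atoms
    be distinct are all assumptions made for illustration.

```python
from itertools import permutations

# Assumed encoding: atoms map an id to an element label ("c", "o", "h");
# bonds map an unordered pair of atom ids to "single" or "double".


def bonded(bonds, a, b, kind):
    """True if atoms a and b are joined by a bond of the given kind."""
    return bonds.get(frozenset((a, b))) == kind


def is_carboxylic_ester(atoms, bonds):
    """Brute-force search for five distinct atoms y1..y5 satisfying
    c(y1), o(y2), o(y3), c(y4), horc(y5), double(y1, y2),
    single(y1, y3), single(y3, y4), single(y1, y5)."""
    for y1, y2, y3, y4, y5 in permutations(atoms, 5):
        if (atoms[y1] == "c" and atoms[y2] == "o" and atoms[y3] == "o"
                and atoms[y4] == "c" and atoms[y5] in ("h", "c")
                and bonded(bonds, y1, y2, "double")
                and bonded(bonds, y1, y3, "single")
                and bonded(bonds, y3, y4, "single")
                and bonded(bonds, y1, y5, "single")):
            return True
    return False


# Skeletons with only the atoms relevant to the pattern (an illustrative
# simplification; most hydrogens are omitted).
SKELETON_BONDS = {
    frozenset((1, 2)): "double",
    frozenset((1, 3)): "single",
    frozenset((3, 4)): "single",
    frozenset((1, 5)): "single",
}
METHYL_ACETATE = {1: "c", 2: "o", 3: "o", 4: "c", 5: "c"}
ACETIC_ACID = {1: "c", 2: "o", 3: "o", 4: "h", 5: "c"}

if __name__ == "__main__":
    print(is_carboxylic_ester(METHYL_ACETATE, SKELETON_BONDS))
    print(is_carboxylic_ester(ACETIC_ACID, SKELETON_BONDS))
```

    In this sketch the acid fails the check because the atom attached to the
    second oxygen is a hydrogen rather than a carbon. That is the same
    distinction behind the observation that ascorbic acid is derived to be a
    carboxylic ester but not a carboxylic acid.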
    Thank you! Questions?!?

