Classifying Chemicals Using Description Graphs and Logic Programming

    Despoina Magka, Boris Motik and Ian Horrocks

       Department of Computer Science, University of Oxford


                        May 28, 2012
Outline

    1   Motivation

    2   DGLPs, Implementation and Overview

Modelling Chemicals with OWL
    OWL used for the representation of molecular structures

    Classification of chemical compounds
    [Villanueva-Rosales & Dumontier, OWLED 2007]
    Chemical information integration
    [Konyk et al., DILS 2008]
    Subsumptions between molecules and chemical classes
    [Hastings et al., OWLED 2010]

The ChEBI Ontology

    OWL ontology: Chemical Entities of Biological Interest (ChEBI)

        Freely accessible dictionary of 'small' molecular entities

        High-quality annotation and taxonomy of chemicals

        Interoperability between researchers

        Drug discovery and elucidation of metabolic pathways

Automate Chemical Classification
    ChEBI is extended manually

    Currently contains approx. 28,000 fully annotated entities

    Grows at a rate of ~1,500 entities per curator per year

    Biologically interesting entities possibly > 1,000,000

    Each new molecule is subsumed by several chemical
    classes
        Is dinitrogen inorganic?                             Yes
        Does cyclobutane contain a four-membered ring?       Yes
        Is acetylene a hydrocarbon?                          Yes
        Does benzaldehyde contain a benzene ring?            Yes

    Speed up curation tasks with automated reasoning tools

(Mis)Representing Rings with OWL
    Chemical compounds with rings are very common
    Fundamental inability of OWL to represent cycles
    At least one tree-shaped model for each consistent OWL
    knowledge base
    Example
    Cyclobutane ⊑ ∃(= 4)hasAtom.(Carbon ⊓ ∃(= 2)hasBond.Carbon)
    [Figure: the four carbon atoms of cyclobutane]

    OWL-based reasoning support
        Does cyclobutane contain a four-membered ring?
        Does benzaldehyde contain a benzene ring?

OWL Extensions
    Limitation of OWL to represent cycles (partially) remedied
    by extension of OWL with Description Graphs and rules
    [Motik et al., 2009]
    A Description Graph represents structures by means of a
    directed labeled graph

    Example
    [Figure: description graph with node 1 labelled Cyclobutadiene and nodes 2, 3, 4 and 5 labelled Carbon]

    Does cyclobutadiene have a conjugated four-membered
    ring?

OWL Extensions
    Example
    [Figure: the Cyclobutadiene description graph (node 1 labelled Cyclobutadiene, nodes 2 to 5 labelled Carbon), with an Oxygen label shown beside it]
    ∀hasAtom.(Carbon ⊔ Hydrogen) ⊑ Hydrocarbon

    Is cyclobutadiene a hydrocarbon?

Results Overview
    Key idea:
          Switch from first-order logic to logic programming semantics
          Replace classical negation with negation-as-failure
    Double benefit:
      1   Encode chemical classes based on the absence of
          information (e.g. hydrocarbons, inorganic molecules,
          saturated compounds, ...)
      2   Represent rings with adequate precision (e.g. cycloalkanes,
          benzene rings, ...)

    Expressive, decidable logic-based formalism for modelling
    structured entities: Description Graph Logic Programs
    (DGLPs)
    Prototype implementation of logic-based chemical
    classification software

What Is a DGLP Ontology?
    The syntactic objects of a DGLP ontology:
        Description graphs

    Example
    [Figure: two description graphs. Cyclobutane: node 1 labelled Cyclobutane and nodes 2, 3, 4 and 5 labelled Carbon. Dioxygen: node 1 labelled Dioxygen and nodes 2 and 3 labelled Oxygen.]

What Is a DGLP Ontology?
    The syntactic objects of a DGLP ontology:
        Description graphs
        Function-free FOL Horn rules
    Example
    Bond(x, y)          →    Bond(y, x)
    SingleBond(x, y)    →    Bond(x, y)

        Rules with negation-as-failure
    Example
    HasAtom(x, y) ∧ Carbon(y)          →    HasCarbon(x)
    Molecule(x) ∧ not HasCarbon(x)     →    Inorganic(x)

        Facts
    Example
    Cyclobutane(c_1), Dinitrogen(c_2), ...

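The four example rules above can be evaluated bottom-up in two strata: first the positive Horn rules are applied until nothing new is derived, then the negation-as-failure rule is applied to the result. The following is a minimal Python sketch of that idea, not the talk's implementation (which uses the XSB engine). The tuple encoding of facts, the function name `evaluate`, and the sample atoms `n1`, `c1`, `h1` are assumptions made for illustration.

```python
def evaluate(facts):
    """Evaluate the four example rules over a set of ground facts.

    Facts are tuples such as ("Molecule", "m") or ("HasAtom", "m", "a").
    This encoding is an assumption of the sketch.
    """
    derived = set(facts)
    # Stratum 1: positive Horn rules, applied until a fixpoint is reached.
    while True:
        new = set()
        for fact in derived:
            if fact[0] == "SingleBond":          # SingleBond(x, y) -> Bond(x, y)
                new.add(("Bond", fact[1], fact[2]))
            elif fact[0] == "Bond":              # Bond(x, y) -> Bond(y, x)
                new.add(("Bond", fact[2], fact[1]))
            elif fact[0] == "HasAtom" and ("Carbon", fact[2]) in derived:
                new.add(("HasCarbon", fact[1]))  # HasAtom(x, y) ^ Carbon(y) -> HasCarbon(x)
        if new <= derived:
            break
        derived |= new
    # Stratum 2: Molecule(x) ^ not HasCarbon(x) -> Inorganic(x).
    # "not" succeeds when HasCarbon(x) was NOT derived in stratum 1.
    inorganic = {
        ("Inorganic", fact[1])
        for fact in derived
        if fact[0] == "Molecule" and ("HasCarbon", fact[1]) not in derived
    }
    return derived | inorganic


# Hypothetical sample data: dinitrogen and (part of) acetylene.
FACTS = {
    ("Molecule", "dinitrogen"),
    ("HasAtom", "dinitrogen", "n1"), ("HasAtom", "dinitrogen", "n2"),
    ("Nitrogen", "n1"), ("Nitrogen", "n2"), ("Bond", "n1", "n2"),
    ("Molecule", "acetylene"),
    ("HasAtom", "acetylene", "c1"), ("HasAtom", "acetylene", "h1"),
    ("Carbon", "c1"), ("Hydrogen", "h1"), ("SingleBond", "c1", "h1"),
}

MODEL = evaluate(FACTS)
```

Evaluating the negated rule only after the positive rules have reached their fixpoint is what makes the absence of `HasCarbon(x)` meaningful: nothing derived later can contradict it.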
Encoding Description Graphs

    Translate description graphs (DGs) into logic programs with function symbols
    Example

    Cyclobutane(x)                 →  G_cb(x, f_1(x), f_2(x), f_3(x), f_4(x))
    G_cb(x, y_1, y_2, y_3, y_4)    →  Cyclobutane(x) ∧
                                      Carbon(y_1) ∧ Carbon(y_2) ∧
                                      Carbon(y_3) ∧ Carbon(y_4) ∧
                                      HasAtom(x, y_1) ∧ Bond(y_1, y_2) ∧
                                      HasAtom(x, y_2) ∧ Bond(y_2, y_3) ∧
                                      HasAtom(x, y_3) ∧ Bond(y_3, y_4) ∧
                                      HasAtom(x, y_4) ∧ Bond(y_4, y_1)

    Function symbols allow for schema-level reasoning

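To make the effect of the two cyclobutane rules concrete, the sketch below applies them to a single fact Cyclobutane(x) and returns every fact they generate. It is an illustrative Python sketch under assumptions: function terms such as f_1(x) are modelled as tuples `("f1", x)`, and the function name `expand_cyclobutane` is hypothetical.

```python
def expand_cyclobutane(x):
    """Return the facts that the two example rules derive from Cyclobutane(x)."""
    # Rule 1: Cyclobutane(x) -> G_cb(x, f_1(x), f_2(x), f_3(x), f_4(x)).
    # Each function term is represented as a (symbol, argument) tuple.
    ys = [(f"f{i}", x) for i in range(1, 5)]
    facts = {("G_cb", x, *ys)}
    # Rule 2: unfold G_cb into the labelled graph it stands for.
    facts.add(("Cyclobutane", x))
    for i, y in enumerate(ys):
        facts.add(("Carbon", y))
        facts.add(("HasAtom", x, y))
        # Bond to the next atom; the fourth atom bonds back to the first,
        # which closes the ring.
        facts.add(("Bond", y, ys[(i + 1) % 4]))
    return facts
```

Calling `expand_cyclobutane("c1")` yields four distinct carbon terms built from `c1`, so a different molecule `c2` gets its own four atoms without any of them being named in advance.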
Classifying Chemicals
    Example

    Molecule(x) ∧ HasAtom(x, y) ∧ not Carbon(y) ∧ not Hydrogen(y)
        → NotHydroCarbon(x)
    Molecule(x) ∧ not NotHydroCarbon(x) → HydroCarbon(x)

    [Figure: cyclobutane, a ring of four carbon atoms]
    Is cyclobutane a hydrocarbon?

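The two hydrocarbon rules express "every atom is a carbon or a hydrogen" through a double use of negation-as-failure: first collect the molecules that have an offending atom, then take the complement. A minimal Python sketch of that reading follows. The tuple encoding of facts and the sample molecules are assumptions made for illustration.

```python
def hydrocarbons(facts):
    """Return the molecules classified as HydroCarbon by the two rules."""
    molecules = {f[1] for f in facts if f[0] == "Molecule"}
    carbon = {f[1] for f in facts if f[0] == "Carbon"}
    hydrogen = {f[1] for f in facts if f[0] == "Hydrogen"}
    # Rule 1: a molecule with an atom not known to be carbon or hydrogen
    # is NotHydroCarbon.
    not_hydrocarbon = {
        f[1]
        for f in facts
        if f[0] == "HasAtom"
        and f[1] in molecules
        and f[2] not in carbon
        and f[2] not in hydrogen
    }
    # Rule 2: every remaining molecule is a HydroCarbon.
    return molecules - not_hydrocarbon


# Hypothetical sample data: the carbon skeleton of cyclobutane, and dioxygen.
SAMPLE = {("Molecule", "cyclobutane"), ("Molecule", "dioxygen")}
SAMPLE |= {("HasAtom", "cyclobutane", f"c{i}") for i in range(1, 5)}
SAMPLE |= {("Carbon", f"c{i}") for i in range(1, 5)}
SAMPLE |= {("HasAtom", "dioxygen", "o1"), ("HasAtom", "dioxygen", "o2")}
SAMPLE |= {("Oxygen", "o1"), ("Oxygen", "o2")}
```

Under this closed-world reading, an atom with no Carbon or Hydrogen label counts as an offending atom, which is why the classification depends on the molecule's description being complete.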
Classifying Chemicals
    Example
    Molecule(x) ∧ ⋀_{1≤i≤4} HasAtom(x, y_i) ∧ ⋀_{1≤i≤3} Bond(y_i, y_{i+1}) ∧
    Bond(y_4, y_1) ∧ ⋀_{1≤i<j≤4} not y_i = y_j
        → MoleculeWith4MemberedRing(x)

    [Figure: cyclobutane, a ring of four carbon atoms]
    Does cyclobutane contain a four-membered ring?

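The ring rule asks for four pairwise distinct atoms of one molecule that are bonded in a closed chain. The Python sketch below checks that condition by brute force. It is an illustration only: the function name, the representation of bonds as pairs, and the sample atoms are assumptions.

```python
from itertools import permutations


def has_four_membered_ring(atoms, bonds):
    """Check the body of the ring rule for the atoms of one molecule.

    atoms: iterable of atom identifiers (the y with HasAtom(x, y)).
    bonds: set of (a, b) pairs; symmetry is added here, mirroring
           the rule Bond(x, y) -> Bond(y, x).
    """
    bonded = set(bonds) | {(b, a) for a, b in bonds}
    # permutations() only yields tuples of distinct positions, which plays
    # the role of the "not y_i = y_j" conjuncts.
    for y1, y2, y3, y4 in permutations(set(atoms), 4):
        if (
            (y1, y2) in bonded
            and (y2, y3) in bonded
            and (y3, y4) in bonded
            and (y4, y1) in bonded
        ):
            return True
    return False


# Hypothetical sample data.
RING = {("c1", "c2"), ("c2", "c3"), ("c3", "c4"), ("c4", "c1")}
CHAIN = {("c1", "c2"), ("c2", "c3"), ("c3", "c4")}
```

The distinctness conjuncts matter: without them, two bonded atoms a and b would satisfy the bond pattern with y_1 = y_3 = a and y_2 = y_4 = b, and every molecule with a single bond would be classified as containing a ring.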
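Undecidability
    Logic programs with function symbols can axiomatise
    infinitely large structures
    Reasoning with DGLP ontologies is trivially undecidable
    We are only interested in finite structures
    Example
    [Figure: acetic acid drawn as a methyl group (CH3) attached to a carboxyl group, which consists of a carbonyl group and a hydroxyl group (OH). Two description graphs: AceticAcid, with node 1 and nodes 2 (Methyl) and 3 (Carboxyl); Carboxyl, with node 1 and nodes 2 (Carbonyl) and 3 (Hydroxyl).]
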
Acyclicity Conditions
    Formalisms extensively studied, e.g. Datalog±
    Various syntax-based acyclicity conditions
        weak acyclicity [Fagin et al., ICDT, 2002]
        super-weak acyclicity [Marnette, PODS, 2009]
        joint acyclicity [Krötzsch and Rudolph, IJCAI, 2011]
        these conditions rule out naturally arising nested structures
    Novel semantic acyclicity condition
        checks for generation of unbounded structures on the fly
        allows modelling of molecules that contain functional groups
    [Figure: acetic acid drawn as a methyl group (CH3) attached to a carboxyl group, which consists of a carbonyl group and a hydroxyl group (OH)]

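The slides do not spell out the semantic acyclicity condition, so the following Python sketch only illustrates the general idea of an on-the-fly check, using a deliberately simplified stand-in: a derivation is flagged as soon as it produces a function term in which some function symbol is nested inside itself. This is an assumption made for illustration and is not the condition defined by the authors. The term encoding and the symbol names `f_carboxyl` and `g_hydroxyl` are hypothetical.

```python
def contains_symbol(term, symbol):
    """True if the function symbol occurs anywhere inside the term.

    Constants are strings; function terms are tuples (symbol, arg1, ...).
    """
    if isinstance(term, str):
        return False
    return term[0] == symbol or any(contains_symbol(a, symbol) for a in term[1:])


def is_cyclic_term(term):
    """True if some function symbol is nested within itself in the term."""
    if isinstance(term, str):
        return False
    symbol, *args = term
    if any(contains_symbol(a, symbol) for a in args):
        return True
    return any(is_cyclic_term(a) for a in args)


# Nested but bounded: the hydroxyl of the carboxyl of acetic acid "a".
NESTED = ("g_hydroxyl", ("f_carboxyl", "a"))
# A term that repeats f_carboxyl inside itself; repeating the same
# derivation step could build ever deeper terms of this shape.
REPEATING = ("f_carboxyl", ("g_hydroxyl", ("f_carboxyl", "a")))
```

A reasoner using such a monitor would run it on each newly created term and stop with an error when it fires. Nested functional groups such as the carboxyl inside acetic acid pass the check, because the nesting involves different function symbols and has bounded depth.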
Empirical Evaluation
    Data extracted from ChEBI in Molfile format
    XSB logic programming engine
    Chemical classes:
        Hydrocarbons
        Inorganic molecules
        Molecules with exactly two carbons
        Molecules with a four-membered ring
        Molecules with a benzene ring
    Preliminary evaluation on ontologies ranging from 10 to 70 molecules
    Results:
        All DGLP ontologies were found to be acyclic
        Molecules classified as expected
        Suite of subsumption tests for the largest ontology performed in
        a few minutes

OVERVIEW AND F UTURE D IRECTIONS
     1   Expressive and decidable formalism for representation of
         structured objects
     2   Novel acyclicity condition for logic programs with restricted
         use of function symbols
     3   Prototype for the automated classification of chemicals
             Is dinitrogen inorganic? 
             Does cyclobutane contain a four-membered ring? 
             Is acetylene a hydrocarbon? 
             Does benzaldehyde contain a benzene ring? 

         Future directions:
             Datalog rules with existentials in the head
             User-friendly surface syntax




16
OVERVIEW AND F UTURE D IRECTIONS
     1   Expressive and decidable formalism for representation of
         structured objects
     2   Novel acyclicity condition for logic programs with restricted
         use of function symbols
     3   Prototype for the automated classification of chemicals
             Is dinitrogen inorganic? 
             Does cyclobutane contain a four-membered ring? 
             Is acetylene a hydrocarbon? 
             Does benzaldehyde contain a benzene ring? 

         Future directions:
             Datalog rules with existentials in the head
             User-friendly surface syntax
             Fully-fledged chemical classification system




16
OVERVIEW AND FUTURE DIRECTIONS
     1   Expressive and decidable formalism for representation of
         structured objects
     2   Novel acyclicity condition for logic programs with restricted
         use of function symbols
     3   Prototype for the automated classification of chemicals
             Is dinitrogen inorganic? 
             Does cyclobutane contain a four-membered ring? 
             Is acetylene a hydrocarbon? 
             Does benzaldehyde contain a benzene ring? 
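
The kind of check behind the first three questions can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the prototype itself: the talk's prototype runs on the XSB logic programming engine, whereas here the dictionary MOLECULES, its toy atom and bond encoding, and all function names are hypothetical. Python's "not" over a fully listed structure stands in for negation-as-failure, and cyclobutane is given as its carbon skeleton only, as in the slides. The benzene-ring question is not covered.

```python
from itertools import permutations

# Hypothetical toy encoding (not the Molfile input used in the talk):
# each molecule maps atom ids to element symbols, plus a set of bonds.
MOLECULES = {
    "dinitrogen": {"atoms": {1: "N", 2: "N"}, "bonds": {(1, 2)}},
    "acetylene": {
        "atoms": {1: "C", 2: "C", 3: "H", 4: "H"},
        "bonds": {(1, 2), (1, 3), (2, 4)},
    },
    # Carbon skeleton only, mirroring the cyclobutane description graph.
    "cyclobutane": {
        "atoms": {1: "C", 2: "C", 3: "C", 4: "C"},
        "bonds": {(1, 2), (2, 3), (3, 4), (4, 1)},
    },
}


def bonded(mol, a, b):
    # Mirrors Bond(x, y) -> Bond(y, x): bonds are treated as symmetric.
    return (a, b) in mol["bonds"] or (b, a) in mol["bonds"]


def has_carbon(mol):
    # Mirrors HasAtom(x, y) and Carbon(y) -> HasCarbon(x).
    return any(element == "C" for element in mol["atoms"].values())


def is_inorganic(mol):
    # Mirrors Molecule(x) and not HasCarbon(x) -> Inorganic(x).
    return not has_carbon(mol)


def is_hydrocarbon(mol):
    # NotHydroCarbon holds if some atom is neither carbon nor hydrogen;
    # HydroCarbon is the absence of that derived fact.
    not_hydrocarbon = any(
        element not in ("C", "H") for element in mol["atoms"].values()
    )
    return not not_hydrocarbon


def has_four_membered_ring(mol):
    # permutations() yields tuples of pairwise distinct atoms, which plays
    # the role of the inequality conditions in the ring rule.
    for y1, y2, y3, y4 in permutations(mol["atoms"], 4):
        if (bonded(mol, y1, y2) and bonded(mol, y2, y3)
                and bonded(mol, y3, y4) and bonded(mol, y4, y1)):
            return True
    return False


if __name__ == "__main__":
    for name, mol in MOLECULES.items():
        print(name, is_inorganic(mol), is_hydrocarbon(mol),
              has_four_membered_ring(mol))
```

The brute-force ring search is only practical for very small structures; a logic programming engine evaluates the corresponding rule by joining over the Bond facts instead.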

         Future directions:
             Datalog rules with existentials in the head
             User-friendly surface syntax
             Fully-fledged chemical classification system
         Thank you for listening. Questions?


Classifying Chemicals with Description Graphs and Logic Programming

  • 1. C LASSIFYING C HEMICALS U SING D ESCRIPTION G RAPHS AND L OGIC P ROGRAMMING Despoina Magka, Boris Motik and Ian Horrocks Department of Computer Science, University of Oxford May 28, 2012
  • 2. O UTLINE 1 M OTIVATION 2 DGLP S , I MPLEMENTATION AND OVERVIEW 1
  • 3. M ODELLING C HEMICALS WITH OWL OWL used for the representation of molecular structures 2
  • 4. M ODELLING C HEMICALS WITH OWL OWL used for the representation of molecular structures Classification of chemical compounds [Villanueva-Rosales & Dumontier, OWLED 2007] 2
  • 5. M ODELLING C HEMICALS WITH OWL OWL used for the representation of molecular structures Classification of chemical compounds [Villanueva-Rosales & Dumontier, OWLED 2007] Chemical information integration [Konyk et al., DILS, 2008 ] 2
  • 6. M ODELLING C HEMICALS WITH OWL OWL used for the representation of molecular structures Classification of chemical compounds [Villanueva-Rosales & Dumontier, OWLED 2007] Chemical information integration [Konyk et al., DILS, 2008 ] Subsumptions between molecules and chemical classes [Hastings et al., OWLED, 2010] 2
  • 7. T HE C H EBI O NTOLOGY OWL ontology Chemical Entities of Biological Interest 3
  • 8. T HE C H EBI O NTOLOGY OWL ontology Chemical Entities of Biological Interest Freely accessible dictionary of ‘small’ molecular entities 3
  • 9. T HE C H EBI O NTOLOGY OWL ontology Chemical Entities of Biological Interest Freely accessible dictionary of ‘small’ molecular entities High quality annotation and taxonomy of chemicals 3
  • 10. T HE C H EBI O NTOLOGY OWL ontology Chemical Entities of Biological Interest Freely accessible dictionary of ‘small’ molecular entities High quality annotation and taxonomy of chemicals Interoperability between researchers 3
  • 11. T HE C H EBI O NTOLOGY OWL ontology Chemical Entities of Biological Interest Freely accessible dictionary of ‘small’ molecular entities High quality annotation and taxonomy of chemicals Interoperability between researchers Drug discovery and elucidation of metabolic pathways 3
  • 12. AUTOMATE C HEMICAL C LASSIFICATION ChEBI is manually incremented 4
  • 13. AUTOMATE C HEMICAL C LASSIFICATION ChEBI is manually incremented Currently contains approx. 28,000 fully annotated entities 4
  • 14. AUTOMATE C HEMICAL C LASSIFICATION ChEBI is manually incremented Currently contains approx. 28,000 fully annotated entities Grows at a rate of ~1,500 entities per curator per year 4
  • 15. AUTOMATE C HEMICAL C LASSIFICATION ChEBI is manually incremented Currently contains approx. 28,000 fully annotated entities Grows at a rate of ~1,500 entities per curator per year Biologically interesting entities possibly > 1,000,000 4
  • 16. AUTOMATE C HEMICAL C LASSIFICATION ChEBI is manually incremented Currently contains approx. 28,000 fully annotated entities Grows at a rate of ~1,500 entities per curator per year Biologically interesting entities possibly > 1,000,000 Each new molecule is subsumed by several chemical classes 4
  • 17. AUTOMATE C HEMICAL C LASSIFICATION ChEBI is manually incremented Currently contains approx. 28,000 fully annotated entities Grows at a rate of ~1,500 entities per curator per year Biologically interesting entities possibly > 1,000,000 Each new molecule is subsumed by several chemical classes Is dinitrogen inorganic? 4
  • 18. AUTOMATE C HEMICAL C LASSIFICATION ChEBI is manually incremented Currently contains approx. 28,000 fully annotated entities Grows at a rate of ~1,500 entities per curator per year Biologically interesting entities possibly > 1,000,000 Each new molecule is subsumed by several chemical classes Is dinitrogen inorganic? Does cyclobutane contain a four-membered ring? 4
  • 19. AUTOMATE C HEMICAL C LASSIFICATION ChEBI is manually incremented Currently contains approx. 28,000 fully annotated entities Grows at a rate of ~1,500 entities per curator per year Biologically interesting entities possibly > 1,000,000 Each new molecule is subsumed by several chemical classes Is dinitrogen inorganic? Does cyclobutane contain a four-membered ring? Is acetylene a hydrocarbon? 4
  • 20. AUTOMATE C HEMICAL C LASSIFICATION ChEBI is manually incremented Currently contains approx. 28,000 fully annotated entities Grows at a rate of ~1,500 entities per curator per year Biologically interesting entities possibly > 1,000,000 Each new molecule is subsumed by several chemical classes Is dinitrogen inorganic? Does cyclobutane contain a four-membered ring? Is acetylene a hydrocarbon? Does benzaldehyde contain a benzene ring? 4
  • 21. AUTOMATE C HEMICAL C LASSIFICATION ChEBI is manually incremented Currently contains approx. 28,000 fully annotated entities Grows at a rate of ~1,500 entities per curator per year Biologically interesting entities possibly > 1,000,000 Each new molecule is subsumed by several chemical classes Is dinitrogen inorganic? Does cyclobutane contain a four-membered ring? Is acetylene a hydrocarbon? Does benzaldehyde contain a benzene ring? Speed up curating tasks with automated reasoning tools 4
  • 22. AUTOMATE C HEMICAL C LASSIFICATION ChEBI is manually incremented Currently contains approx. 28,000 fully annotated entities Grows at a rate of ~1,500 entities per curator per year Biologically interesting entities possibly > 1,000,000 Each new molecule is subsumed by several chemical classes Is dinitrogen inorganic? Yes Does cyclobutane contain a four-membered ring? Yes Is acetylene a hydrocarbon? Yes Does benzaldehyde contain a benzene ring? Yes Speed up curating tasks with automated reasoning tools 4
  • 23. (M IS )R EPRESENTING R INGS WITH OWL Chemical compounds with rings are highly frequent 5
  • 24. (M IS )R EPRESENTING R INGS WITH OWL Chemical compounds with rings are highly frequent Fundamental inability of OWL to represent cycles 5
  • 25. (M IS )R EPRESENTING R INGS WITH OWL Chemical compounds with rings are highly frequent Fundamental inability of OWL to represent cycles At least one tree-shaped model for each consistent OWL knowledge base 5
  • 26. (M IS )R EPRESENTING R INGS WITH OWL Chemical compounds with rings are highly frequent Fundamental inability of OWL to represent cycles At least one tree-shaped model for each consistent OWL knowledge base E XAMPLE Cyclobutane ∃(= 4)hasAtom.(Carbon ∃(= 2)hasBond.Carbon) C C C C 5
  • 27. (M IS )R EPRESENTING R INGS WITH OWL Chemical compounds with rings are highly frequent Fundamental inability of OWL to represent cycles At least one tree-shaped model for each consistent OWL knowledge base E XAMPLE Cyclobutane ∃(= 4)hasAtom.(Carbon ∃(= 2)hasBond.Carbon) C C C C 5
  • 28. (M IS )R EPRESENTING R INGS WITH OWL Chemical compounds with rings are highly frequent Fundamental inability of OWL to represent cycles At least one tree-shaped model for each consistent OWL knowledge base E XAMPLE Cyclobutane ∃(= 4)hasAtom.(Carbon ∃(= 2)hasBond.Carbon) C C C C 5
  • 29. (M IS )R EPRESENTING R INGS WITH OWL Chemical compounds with rings are highly frequent Fundamental inability of OWL to represent cycles At least one tree-shaped model for each consistent OWL knowledge base E XAMPLE Cyclobutane ∃(= 4)hasAtom.(Carbon ∃(= 2)hasBond.Carbon) C C C C OWL-based reasoning support 5
  • 30. (M IS )R EPRESENTING R INGS WITH OWL Chemical compounds with rings are highly frequent Fundamental inability of OWL to represent cycles At least one tree-shaped model for each consistent OWL knowledge base E XAMPLE Cyclobutane ∃(= 4)hasAtom.(Carbon ∃(= 2)hasBond.Carbon) C C C C OWL-based reasoning support Does cyclobutane contain a four-membered ring? Does benzaldehyde contain a benzene ring? 5
  • 31. OWL E XTENSIONS Limitation of OWL to represent cycles (partially) remedied by extension of OWL with Description Graphs and rules [Motik et al., 2009] 6
  • 32. OWL E XTENSIONS Limitation of OWL to represent cycles (partially) remedied by extension of OWL with Description Graphs and rules [Motik et al., 2009] A Description Graph represents structures by means of a directed labeled graph 6
  • 33. OWL E XTENSIONS Limitation of OWL to represent cycles (partially) remedied by extension of OWL with Description Graphs and rules [Motik et al., 2009] A Description Graph represents structures by means of a directed labeled graph E XAMPLE Cyclobutadiene 1 C C Carbon 2 3 Carbon C C Carbon 5 4 Carbon 6
  • 34. OWL E XTENSIONS Limitation of OWL to represent cycles (partially) remedied by extension of OWL with Description Graphs and rules [Motik et al., 2009] A Description Graph represents structures by means of a directed labeled graph E XAMPLE Cyclobutadiene 1 C C Carbon 2 3 Carbon C C Carbon 5 4 Carbon 6
  • 35. OWL E XTENSIONS Limitation of OWL to represent cycles (partially) remedied by extension of OWL with Description Graphs and rules [Motik et al., 2009] A Description Graph represents structures by means of a directed labeled graph E XAMPLE Cyclobutadiene 1 C C Carbon 2 3 Carbon C C Carbon 5 4 Carbon Does cyclobutadiene have a conjugated four-membered ring? 6
  • 36. OWL E XTENSIONS Limitation of OWL to represent cycles (partially) remedied by extension of OWL with Description Graphs and rules [Motik et al., 2009] A Description Graph represents structures by means of a directed labeled graph E XAMPLE Cyclobutadiene 1 C C Carbon 2 3 Carbon C C Carbon 5 4 Carbon Does cyclobutadiene have a conjugated four-membered ring? 6
  • 37. OWL E XTENSIONS Limitation of OWL to represent cycles (partially) remedied by extension of OWL with Description Graphs and rules [Motik et al., 2009] A Description Graph represents structures by means of a directed labeled graph E XAMPLE Cyclobutadiene 1 Oxygen C C Carbon 2 3 Carbon C C Carbon 5 4 Carbon 6
  • 38. OWL E XTENSIONS Limitation of OWL to represent cycles (partially) remedied by extension of OWL with Description Graphs and rules [Motik et al., 2009] A Description Graph represents structures by means of a directed labeled graph E XAMPLE Cyclobutadiene 1 Oxygen C C Carbon 2 3 Carbon C C Carbon 5 4 Carbon ∀hasAtom.(Carbon Hydrogen) Hydrocarbon 6
  • 39. OWL E XTENSIONS Limitation of OWL to represent cycles (partially) remedied by extension of OWL with Description Graphs and rules [Motik et al., 2009] A Description Graph represents structures by means of a directed labeled graph E XAMPLE Cyclobutadiene 1 Oxygen C C Carbon 2 3 Carbon C C Carbon 5 4 Carbon ∀hasAtom.(Carbon Hydrogen) Hydrocarbon Is cyclobutadiene a hydrocarbon? 6
  • 40. OWL E XTENSIONS Limitation of OWL to represent cycles (partially) remedied by extension of OWL with Description Graphs and rules [Motik et al., 2009] A Description Graph represents structures by means of a directed labeled graph E XAMPLE Cyclobutadiene 1 Oxygen C C Carbon 2 3 Carbon C C Carbon 5 4 Carbon ∀hasAtom.(Carbon Hydrogen) Hydrocarbon Is cyclobutadiene a hydrocarbon? 6
  • 41. R ESULTS OVERVIEW Key idea: 7
  • 42. R ESULTS OVERVIEW Key idea: Switch from first-order logic to logic programming semantics 7
  • 43. R ESULTS OVERVIEW Key idea: Switch from first-order logic to logic programming semantics Replace classical negation with negation-as-failure 7
  • 44. R ESULTS OVERVIEW Key idea: Switch from first-order logic to logic programming semantics Replace classical negation with negation-as-failure Double benefit: 7
  • 45. R ESULTS OVERVIEW Key idea: Switch from first-order logic to logic programming semantics Replace classical negation with negation-as-failure Double benefit: 1 Encode chemical classes based on the absence of information 7
  • 46. R ESULTS OVERVIEW Key idea: Switch from first-order logic to logic programming semantics Replace classical negation with negation-as-failure Double benefit: 1 Encode chemical classes based on the absence of information (e.g. hydrocarbons, inorganic molecules, saturated compounds,. . . ) 7
  • 47. R ESULTS OVERVIEW Key idea: Switch from first-order logic to logic programming semantics Replace classical negation with negation-as-failure Double benefit: 1 Encode chemical classes based on the absence of information (e.g. hydrocarbons, inorganic molecules, saturated compounds,. . . ) 2 Represent rings with adequate precision 7
  • 48. R ESULTS OVERVIEW Key idea: Switch from first-order logic to logic programming semantics Replace classical negation with negation-as-failure Double benefit: 1 Encode chemical classes based on the absence of information (e.g. hydrocarbons, inorganic molecules, saturated compounds,. . . ) 2 Represent rings with adequate precision (e.g. cycloalkanes, benzene rings,. . . ) 7
  • 49. R ESULTS OVERVIEW Key idea: Switch from first-order logic to logic programming semantics Replace classical negation with negation-as-failure Double benefit: 1 Encode chemical classes based on the absence of information (e.g. hydrocarbons, inorganic molecules, saturated compounds,. . . ) 2 Represent rings with adequate precision (e.g. cycloalkanes, benzene rings,. . . ) Expressive decidable logic-based formalism for modelling structured entities: Description Graph Logic Programs (DGLPs) 7
  • 50. R ESULTS OVERVIEW Key idea: Switch from first-order logic to logic programming semantics Replace classical negation with negation-as-failure Double benefit: 1 Encode chemical classes based on the absence of information (e.g. hydrocarbons, inorganic molecules, saturated compounds,. . . ) 2 Represent rings with adequate precision (e.g. cycloalkanes, benzene rings,. . . ) Expressive decidable logic-based formalism for modelling structured entities: Description Graph Logic Programs (DGLPs) Prototypical implementation of a logic-based chemical classification software 7
  • 51. O UTLINE 1 M OTIVATION 2 DGLP S , I MPLEMENTATION AND OVERVIEW 8
  • 52. W HAT IS A DGLP O NTOLOGY ? The syntactic objects of a DGLP ontology: 9
  • 53. W HAT IS A DGLP O NTOLOGY ? The syntactic objects of a DGLP ontology: Description graphs 9
  • 54. W HAT IS A DGLP O NTOLOGY ? The syntactic objects of a DGLP ontology: Description graphs E XAMPLE Cyclobutane 1 O O Dioxygen 1 C C Carbon 2 3 Carbon C C Carbon 5 4 Carbon Oxygen 2 3 Oxygen 9
  • 55. W HAT IS A DGLP O NTOLOGY ? The syntactic objects of a DGLP ontology: Description graphs Function-free FOL Horn rules 9
  • 56. W HAT IS A DGLP O NTOLOGY ? The syntactic objects of a DGLP ontology: Description graphs Function-free FOL Horn rules E XAMPLE Bond(x, y) → Bond(y, x) SingleBond(x, y) → Bond(x, y) 9
  • 57. W HAT IS A DGLP O NTOLOGY ? The syntactic objects of a DGLP ontology: Description graphs Function-free FOL Horn rules E XAMPLE Bond(x, y) → Bond(y, x) SingleBond(x, y) → Bond(x, y) Rules with negation-as-failure 9
  • 58. W HAT IS A DGLP O NTOLOGY ? The syntactic objects of a DGLP ontology: Description graphs Function-free FOL Horn rules E XAMPLE Bond(x, y) → Bond(y, x) SingleBond(x, y) → Bond(x, y) Rules with negation-as-failure E XAMPLE HasAtom(x, y) ∧ Carbon(y) → HasCarbon(x) Molecule(x)∧ not HasCarbon(x) → Inorganic(x) 9
  • 59. W HAT IS A DGLP O NTOLOGY ? The syntactic objects of a DGLP ontology: Description graphs Function-free FOL Horn rules E XAMPLE Bond(x, y) → Bond(y, x) SingleBond(x, y) → Bond(x, y) Rules with negation-as-failure E XAMPLE HasAtom(x, y) ∧ Carbon(y) → HasCarbon(x) Molecule(x)∧ not HasCarbon(x) → Inorganic(x) Facts 9
  • 60. W HAT IS A DGLP O NTOLOGY ? The syntactic objects of a DGLP ontology: Description graphs Function-free FOL Horn rules E XAMPLE Bond(x, y) → Bond(y, x) SingleBond(x, y) → Bond(x, y) Rules with negation-as-failure E XAMPLE HasAtom(x, y) ∧ Carbon(y) → HasCarbon(x) Molecule(x)∧ not HasCarbon(x) → Inorganic(x) Facts E XAMPLE Cyclobutane(c1 ), Dinitrogen(c2 ), . . . 9
  • 61. E NCODING D ESCRIPTION G RAPHS Translate DGs into logic programs with function symbols 10
  • 62. E NCODING D ESCRIPTION G RAPHS Translate DGs into logic programs with function symbols E XAMPLE 10
  • 63. E NCODING D ESCRIPTION G RAPHS Translate DGs into logic programs with function symbols E XAMPLE Cyclobutane(x) →Gcb (x, f1 (x), f2 (x), f3 (x), f4 (x)) Gcb (x, y1 , y2 , y3 , y4 ) →Cyclobutane(x) ∧ Carbon(y1 ) ∧ Carbon(y2 ) ∧ Carbon(y3 ) ∧ Carbon(y4 ) ∧ HasAtom(x, y1 ) ∧ Bond(y1 , y2 ) ∧ HasAtom(x, y2 ) ∧ Bond(y2 , y3 ) ∧ HasAtom(x, y3 ) ∧ Bond(y3 , y4 ) ∧ HasAtom(x, y4 ) ∧ Bond(y4 , y1 ) 10
  • 64. E NCODING D ESCRIPTION G RAPHS Translate DGs into logic programs with function symbols E XAMPLE Cyclobutane(x) →Gcb (x, f1 (x), f2 (x), f3 (x), f4 (x)) Gcb (x, y1 , y2 , y3 , y4 ) →Cyclobutane(x) ∧ Carbon(y1 ) ∧ Carbon(y2 ) ∧ Carbon(y3 ) ∧ Carbon(y4 ) ∧ HasAtom(x, y1 ) ∧ Bond(y1 , y2 ) ∧ HasAtom(x, y2 ) ∧ Bond(y2 , y3 ) ∧ HasAtom(x, y3 ) ∧ Bond(y3 , y4 ) ∧ HasAtom(x, y4 ) ∧ Bond(y4 , y1 ) Function symbols allow for schema-level reasoning 10
  • 65. C LASSIFYING C HEMICALS E XAMPLE Molecule(x) ∧ HasAtom(x, y) ∧ not Carbon(y) ∧ not Hydrogen(y) → NotHydroCarbon(x) Molecule(x) ∧ not NotHydroCarbon(x) → HydroCarbon(x) 11
  • 66. C LASSIFYING C HEMICALS E XAMPLE Molecule(x) ∧ HasAtom(x, y) ∧ not Carbon(y) ∧ not Hydrogen(y) → NotHydroCarbon(x) Molecule(x) ∧ not NotHydroCarbon(x) → HydroCarbon(x) C C C C 11
  • 67. C LASSIFYING C HEMICALS E XAMPLE Molecule(x) ∧ HasAtom(x, y) ∧ not Carbon(y) ∧ not Hydrogen(y) → NotHydroCarbon(x) Molecule(x) ∧ not NotHydroCarbon(x) → HydroCarbon(x) Is cyclobutane a C C hydrocarbon? C C 11
  • 68. C LASSIFYING C HEMICALS E XAMPLE Molecule(x) ∧ HasAtom(x, yi ) ∧ Bond(yi , yi+1 ) ∧ 1≤i≤4 1≤i≤3 Bond(y4 , y1 ) not yi = yj 1≤ij≤4 → MoleculeWith4MemberedRing(x) 12
  • 69. C LASSIFYING C HEMICALS E XAMPLE Molecule(x) ∧ HasAtom(x, yi ) ∧ Bond(yi , yi+1 ) ∧ 1≤i≤4 1≤i≤3 Bond(y4 , y1 ) not yi = yj 1≤ij≤4 → MoleculeWith4MemberedRing(x) C C C C 12
  • 70. C LASSIFYING C HEMICALS E XAMPLE Molecule(x) ∧ HasAtom(x, yi ) ∧ Bond(yi , yi+1 ) ∧ 1≤i≤4 1≤i≤3 Bond(y4 , y1 ) not yi = yj 1≤ij≤4 → MoleculeWith4MemberedRing(x) Does cyclobutane contain a C C four-membered ring? C C 12
  • 71. U NDECIDABILITY Logic programs with function symbols can axiomatise infinitely large structures 13
  • 72. U NDECIDABILITY Logic programs with function symbols can axiomatise infinitely large structures Reasoning with DGLP ontologies is trivially undecidable 13
  • 73. U NDECIDABILITY Logic programs with function symbols can axiomatise infinitely large structures Reasoning with DGLP ontologies is trivially undecidable We are only interested in finite structures 13
  • 74. U NDECIDABILITY Logic programs with function symbols can axiomatise infinitely large structures Reasoning with DGLP ontologies is trivially undecidable We are only interested in finite structures E XAMPLE Carboxyl O AceticAcid Carboxyl Carbonyl 1 1 C Methyl Hydroxyl 2 3 2 3 CH3 OH Methyl Carboxyl Carbonyl Hydroxyl 13
  • 75. ACYCLICITY C ONDITIONS Formalisms extensively studied, e.g. Datalog± 14
  • 76. ACYCLICITY C ONDITIONS Formalisms extensively studied, e.g. Datalog± Various syntax-based acyclicity conditions 14
  • 77. ACYCLICITY C ONDITIONS Formalisms extensively studied, e.g. Datalog± Various syntax-based acyclicity conditions weak acyclicity [Fagin et al., ICDT, 2002] super-weak acyclicity [Marnette, PODS, 2009] joint acyclicity [Krötzsch and Rudolph, IJCAI, 2011] 14
  • 78. ACYCLICITY C ONDITIONS Formalisms extensively studied, e.g. Datalog± Various syntax-based acyclicity conditions weak acyclicity [Fagin et al., ICDT, 2002] super-weak acyclicity [Marnette, PODS, 2009] joint acyclicity [Krötzsch and Rudolph, IJCAI, 2011] rule out naturally-arising nested structures 14
  • 79. ACYCLICITY C ONDITIONS Formalisms extensively studied, e.g. Datalog± Various syntax-based acyclicity conditions weak acyclicity [Fagin et al., ICDT, 2002] super-weak acyclicity [Marnette, PODS, 2009] joint acyclicity [Krötzsch and Rudolph, IJCAI, 2011] rule out naturally-arising nested structures E XAMPLE Carboxyl O AceticAcid Carboxyl Carbonyl 1 1 C Methyl Hydroxyl 2 3 2 3 CH3 OH Methyl Carboxyl Carbonyl Hydroxyl 14
  • 80. ACYCLICITY C ONDITIONS Formalisms extensively studied, e.g. Datalog± Various syntax-based acyclicity conditions weak acyclicity [Fagin et al., ICDT, 2002] super-weak acyclicity [Marnette, PODS, 2009] joint acyclicity [Krötzsch and Rudolph, IJCAI, 2011] rule out naturally-arising nested structures Novel semantic acyclicity condition 14
  • 81. ACYCLICITY C ONDITIONS Formalisms extensively studied, e.g. Datalog± Various syntax-based acyclicity conditions weak acyclicity [Fagin et al., ICDT, 2002] super-weak acyclicity [Marnette, PODS, 2009] joint acyclicity [Krötzsch and Rudolph, IJCAI, 2011] rule out naturally-arising nested structures Novel semantic acyclicity condition checks for generation of unbounded structures on the fly 14
  • 82. ACYCLICITY C ONDITIONS Formalisms extensively studied, e.g. Datalog± Various syntax-based acyclicity conditions weak acyclicity [Fagin et al., ICDT, 2002] super-weak acyclicity [Marnette, PODS, 2009] joint acyclicity [Krötzsch and Rudolph, IJCAI, 2011] rule out naturally-arising nested structures Novel semantic acyclicity condition checks for generation of unbounded structures on the fly modelling of molecules that contain functional groups 14
  • 83. ACYCLICITY C ONDITIONS Formalisms extensively studied, e.g. Datalog± Various syntax-based acyclicity conditions weak acyclicity [Fagin et al., ICDT, 2002] super-weak acyclicity [Marnette, PODS, 2009] joint acyclicity [Krötzsch and Rudolph, IJCAI, 2011] rule out naturally-arising nested structures Novel semantic acyclicity condition checks for generation of unbounded structures on the fly modelling of molecules that contain functional groups O Carbonyl C Methyl Hydroxyl CH3 OH Carboxyl 14
  • 92. Empirical Evaluation
        Data extracted from ChEBI in Molfile format
        XSB logic programming engine
        Chemical classes:
            Hydrocarbons
            Inorganic molecules
            Molecules with exactly two carbons
            Molecules with a four-membered ring
            Molecules with a benzene ring
        Preliminary evaluation on ontologies of 10 to 70 molecules
        Results:
            All DGLP ontologies were found acyclic
            Molecules classified as expected
            Suite of subsumption tests for the largest ontology performed in a few minutes
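The chemical classes used in the evaluation can be illustrated with a small Python sketch. It is not the DGLP encoding or the XSB programs from the talk. It only spells out, over a hypothetical `Molecule` graph structure, what four of the classes test for. Treating "inorganic" as "contains no carbon" is a simplifying assumption, and the benzene class is omitted because it would also need bond-order information.

```python
class Molecule:
    """Hypothetical molecular graph: element symbols plus undirected bonds."""

    def __init__(self, atoms, bonds):
        self.atoms = list(atoms)
        self.adj = {i: set() for i in range(len(self.atoms))}
        for a, b in bonds:
            self.adj[a].add(b)
            self.adj[b].add(a)


def is_hydrocarbon(mol):
    # Contains carbon, and nothing other than carbon and hydrogen.
    elements = set(mol.atoms)
    return "C" in elements and elements <= {"C", "H"}


def is_inorganic(mol):
    # Simplifying assumption: inorganic means "no carbon atoms".
    return "C" not in mol.atoms


def has_exactly_two_carbons(mol):
    return mol.atoms.count("C") == 2


def has_ring(mol, size):
    """True if some simple cycle visits exactly `size` atoms."""
    if size < 3:
        return False

    def extend(path):
        if len(path) == size:
            return path[0] in mol.adj[path[-1]]  # does the path close up?
        return any(
            extend(path + [n]) for n in mol.adj[path[-1]] if n not in path
        )

    return any(extend([start]) for start in range(len(mol.atoms)))


dinitrogen = Molecule(["N", "N"], [(0, 1)])
acetylene = Molecule(["C", "C", "H", "H"], [(0, 1), (0, 2), (1, 3)])
# Cyclobutane: four ring carbons, each carrying two hydrogens.
cyclobutane = Molecule(
    ["C"] * 4 + ["H"] * 8,
    [(0, 1), (1, 2), (2, 3), (3, 0)]
    + [(c, 4 + 2 * c + k) for c in range(4) for k in range(2)],
)

print(is_inorganic(dinitrogen))    # True
print(is_hydrocarbon(acetylene))   # True
print(has_ring(cyclobutane, 4))    # True
```

In the approach described in the talk these class definitions are stated declaratively and checked by the logic programming engine. The procedural functions here only make the intended meaning of each class concrete.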
  • 102. Overview and Future Directions
        1. Expressive and decidable formalism for representation of structured objects
        2. Novel acyclicity condition for logic programs with restricted use of function symbols
        3. Prototype for the automated classification of chemicals
            Is dinitrogen inorganic?
            Does cyclobutane contain a four-membered ring?
            Is acetylene a hydrocarbon?
            Does benzaldehyde contain a benzene ring?
        Future directions:
            Datalog rules with existentials in the head
            User-friendly surface syntax
            Fully-fledged chemical classification system
        Thank you for listening. Questions?