AUTOMATED STRUCTURE-BASED
CLASSIFICATION USING ChEBI
ONTOLOGY
Venkatesh Muthukrishnan
Software Engineer - ChEBI
CLASSIFICATION IN ChEBI
CHALLENGES WITH
MANUAL CLASSIFICATION
INCOMPLETE
& INCONSISTENT …
BLOCKS BULK LOADING …
http://sourceforge.net/p/chebi/news/2012/11/chebi-release-97-is-now-available/
STRUCTURE BASED
AUTO-CLASSIFICATION
PREVIOUS APPROACHES
SOCO
SELF-ORGANISING
CHEMICAL ONTOLOGIES
SMARTS & OWL
Chepelev et al. BMC Bioinformatics 2012 13:3 doi:10.1186/1471-2105-13-3
PROPOSED APPROACH
SCHEMA
DEFINITIONS
e.g. ketone: organic_molecular_entity and has_part some acetone
RANKED
DEFINITIONS
dicarboxylic acid dianion: organic_molecular_entity and has_part exactly 2 acetate and has_charge value "-2"^int
flavonoid: organic_molecular_entity and has_skeleton some flavones
benzoquinones: organic_molecular_entity and ( has_part some 1,2-benzoquinone or has_part some 1,4-benzoquinone )
gamma-lactam: organic_molecular_entity and has_part some pyrrolidin-2-one and not ( has_part some succinimide )
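The definitions on this slide combine "some", "exactly N", "or", "not" and a charge constraint. The following is a minimal sketch of how such definitions could be evaluated, not the ChEBI implementation. It assumes a hypothetical input: a dictionary of fragment match counts that a substructure search would supply, plus the net charge. The helper names (`some`, `exactly`, `all_of`, `any_of`, `negate`, `classify`) are illustrative, and the `organic_molecular_entity` conjunct and the `has_skeleton` relation are left out for brevity.

```python
from typing import Callable, Dict, List

# Hypothetical input format: fragment name -> number of matches in one structure.
# A real pipeline would obtain these counts from a substructure search.
FragmentCounts = Dict[str, int]
Rule = Callable[[FragmentCounts, int], bool]


def some(fragment: str) -> Rule:
    """has_part some X: at least one occurrence of the fragment."""
    return lambda counts, charge: counts.get(fragment, 0) >= 1


def exactly(n: int, fragment: str) -> Rule:
    """has_part exactly N X: the fragment occurs exactly n times."""
    return lambda counts, charge: counts.get(fragment, 0) == n


def has_charge(value: int) -> Rule:
    """has_charge value V: the net charge equals the given value."""
    return lambda counts, charge: charge == value


def all_of(*rules: Rule) -> Rule:
    """Conjunction ("and") of rules."""
    return lambda counts, charge: all(r(counts, charge) for r in rules)


def any_of(*rules: Rule) -> Rule:
    """Disjunction ("or") of rules."""
    return lambda counts, charge: any(r(counts, charge) for r in rules)


def negate(rule: Rule) -> Rule:
    """Negation ("not") of a rule."""
    return lambda counts, charge: not rule(counts, charge)


# The slide's definitions, transcribed with the helpers above.
DEFINITIONS: Dict[str, Rule] = {
    "ketone": some("acetone"),
    "dicarboxylic acid dianion": all_of(exactly(2, "acetate"), has_charge(-2)),
    "benzoquinones": any_of(some("1,2-benzoquinone"), some("1,4-benzoquinone")),
    "gamma-lactam": all_of(some("pyrrolidin-2-one"), negate(some("succinimide"))),
}


def classify(counts: FragmentCounts, charge: int = 0) -> List[str]:
    """Return the names of all classes whose definition the structure satisfies."""
    return sorted(name for name, rule in DEFINITIONS.items() if rule(counts, charge))
```

For instance, a structure matching pyrrolidin-2-one once is reported as a gamma-lactam, but not if succinimide also matches, which is the role of the negated clause.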
WHY MANUAL
DEFINITION
GENERATION
NO MCS
https://github.com/downloads/asad/SMSD/SMSD20120718.zip
http://dalkescientific.com/writings/diary/archive/2012/05/13/mcs_chebi.html
CHALLENGING CLASSES
PLANNED INTEGRATION
• INTERNAL DATA LOADING
• SUBMISSION TOOL & CURATOR TOOL
ACKNOWLEDGEMENTS
COLLABORATORS
Colin Batchelor, RSC
Lian Duan, ETH
Leonid Chepelev, Ottawa
Michel Dumontier, Stanford
Despoina Magka, Oxford
FUNDING
BBSRC “Continued development of
ChEBI towards better usability for the
systems biology and metabolic
modelling communities” BB/K019783/1
THANK YOU

Editor's Notes

  • #2: The ChEBI ontology has three sub-ontologies: role, subatomic particle and the chemical ontology. In this talk, I will focus only on the chemical ontology, which captures structural features hierarchically.
  • #3: This is an example entry for a structural classification. Reading the graph from top to bottom, we can describe a few structural features of caffeine. As a polycyclic compound, caffeine certainly has at least two cycles. Narrowing down, as a heterobicyclic compound it contains only two cycles, with heteroatoms. It is an imidazopyrimidine: a six-membered ring containing two nitrogens fused to a five-membered ring containing two nitrogens. It is also a methylxanthine: an imidazopyrimidine with two ketones in the six-membered ring.
  • #4: What challenges are expected with the manual classification of structures?
  • #7: As a result, we need an auto-classification tool that helps us identify and correct these inconsistencies in the ontology and also allows us to bulk load structures. That would speed up the curation process and make the ontology consistent.
  • #8: SMiles ARbitrary Target Specification (SMARTS) and Web Ontology Language (OWL). This is a fragmentation-based approach that captures the structural features hierarchically in SMARTS and uses OWL to classify. It has no support for negation. Only "min" counting is supported, not "max" or "exactly"; thus, a dicarboxylic acid is also a monocarboxylic acid. SMARTS is powerful, but it is not a very human-readable notation. Can we do better at making definitions accessible?
  • #9: So the newly proposed approach is to make these definitions human-friendly, so that any chemically knowledgeable person can validate them without specialist computing knowledge.
  • #10: In this approach, the structural features are encoded in OWL definitions. In this example, we say that the basic functional group ketone contains the structure of acetone. These OWL definitions are parsed and converted into cheminformatics definitions, which are matched against the unclassified structures. As a result, each structure is classified under the most highly ranked structural features it matches.
  • #12: These definitions are generated manually so that they make chemical sense. As an initial exercise, maximum common substructure (MCS) search was used to extract structural features for generating definitions.
  • #13: In this example class, benzoquinones, there are two different substituents, and one is more dominant. This is the MCS result for benzoquinones from RDKit and SMSD. Automatic definition generation becomes tricky when a class needs multiple definitions because of substituents, ring size and so on.
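
Notes #8 and #10 touch on two points that a small example may make concrete: why "min"-only counting makes a dicarboxylic acid also match monocarboxylic acid, and how a rank could pick one class when several match. The sketch below is an illustration under stated assumptions, not the tool described in the talk. The `rank` values, the `ClassDefinition` type and the idea of passing a precomputed carboxy group count are all hypothetical.

```python
from typing import Callable, List, NamedTuple, Optional


class ClassDefinition(NamedTuple):
    name: str
    rank: int                       # hypothetical: higher means more specific
    matches: Callable[[int], bool]  # predicate over the carboxy group count


# "min"-only counting: every definition can only say "at least N".
MIN_ONLY = [
    ClassDefinition("monocarboxylic acid", 1, lambda n: n >= 1),
    ClassDefinition("dicarboxylic acid", 2, lambda n: n >= 2),
]

# Exact counting: each definition states the precise number of groups.
EXACT = [
    ClassDefinition("monocarboxylic acid", 1, lambda n: n == 1),
    ClassDefinition("dicarboxylic acid", 2, lambda n: n == 2),
]


def matching_classes(defs: List[ClassDefinition], carboxy_count: int) -> List[str]:
    """Return every class whose definition the structure satisfies."""
    return [d.name for d in defs if d.matches(carboxy_count)]


def best_class(defs: List[ClassDefinition], carboxy_count: int) -> Optional[str]:
    """Return the highest-ranked matching class, or None if nothing matches."""
    hits = [d for d in defs if d.matches(carboxy_count)]
    return max(hits, key=lambda d: d.rank).name if hits else None


# A structure with two carboxy groups:
print(matching_classes(MIN_ONLY, 2))  # both classes match under "min"-only counting
print(matching_classes(EXACT, 2))     # only the dicarboxylic acid class matches
print(best_class(MIN_ONLY, 2))        # the rank selects the more specific class
```

With exact counting the unwanted match never arises; with "min"-only counting a ranking step is needed to suppress it.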