Open Babel
Access and interconvert chemical information

Noel M. O’Boyle
Open Babel development team and NextMove Software, Cambridge, UK

Nov 2012
Secret UK Location
Image credit: AJ Cann (AJC1 on Flickr)
Image credit: Jon Osborne (jonno101101 on Flickr)
• Volunteer effort, an open source success story
   – Originally a fork from OpenEye’s OELib in 2001
   – Lead is Geoff Hutchison (Uni of Pittsburgh)
   – 4 or 5 active developers – I got involved in late 2005

• http://openbabel.org
• Associated paper (Open Access):
   – Open Babel: An open chemical toolbox, J. Cheminf., 2011, 3, 33.

     Does anyone else use Open Babel?




• 40K downloads (from SF) in last 12 months
   – 1.4K downloads of Windows Python bindings
• Paper #1 most accessed in last year
   – Cited 60 times in 1 year
• In short, very widely-used
Features
• Multiple chemical file formats (+ options) and utility
  formats
• 2D coordinate generation and depiction (PNG and
  SVG)
• 3D coordinate generation, forcefield minimisation,
  conformer generation
• Binary fingerprints (path-based, substructure-based)
  and associated “fast search” database
• Bond perception, aromaticity detection and
  atom-typing
• Canonical labelling, automorphisms, alignment
• Plugin architecture
• Several command-line applications, but also a
  software library
• Written in C++ but bindings in several languages
obabel and file conversion
• Basic usage:
  obabel infile.extn -O outfile.extn


• Can also read from stdin, write to stdout, read
  from a SMILES string, specify the input and
  output file formats, specify conversion
  options, and format-specific options
  – Or ask for help (obabel -H)…online docs better!


• Note: obabel has replaced the older babel
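
The basic usage above can also be driven from a script. A minimal Python sketch, assuming the obabel executable is on the PATH; the helper names (build_convert_command, convert) and the file names are hypothetical and not part of Open Babel:

```python
import shutil
import subprocess


def build_convert_command(infile, outfile, options=()):
    """Assemble the argument list for: obabel infile -O outfile [options]."""
    return ["obabel", infile, "-O", outfile, *options]


def convert(infile, outfile, options=()):
    """Run obabel if it is installed; return the CompletedProcess, else None."""
    if shutil.which("obabel") is None:
        return None  # obabel was not found on the PATH
    return subprocess.run(build_convert_command(infile, outfile, options))


if __name__ == "__main__":
    # Hypothetical file names: show the command for an SDF to MOL2 conversion.
    print(build_convert_command("infile.sdf", "outfile.mol2"))
```

Passing the arguments as a list avoids shell quoting issues, and plain ASCII hyphens are used for every option.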
Conversion options
• Handle multimolecule files
    join/m, sort, C
• Handle multicomponent molecules
    r, separate
• Filter
    filter, smallest/largest, s/v, f/l, unique
• Manipulate structure or atom order
    addpolarh, align, b, c, canonical, d, h, gen2d/3d
• Forcefield
    minimize, conformer, energy
• Conformers
    readconformer, writeconformers
• Manipulate SDF properties and title
    add, addfilename, addindex, addoutindex, addtotitle, append, delete,
    property, title


See http://openbabel.org/docs
File-format options
• Particular file formats may have their own
  specific input or output options
   – To provide or handle different flavours of the file
     format
   – To specify additional information to include
   – To provide additional functionality

• Options are listed in the help text for a format
  (see next slides)

• To use:
   – specify read options with -a (e.g. -ar)
   – specify write options with -x (e.g. -xi)
[Screenshots: the same format options are available at the command line, in the online docs, in the PDF and the book, and in the GUI.]
SMILES output options

    > obabel -:CC(=O)Cl -osmi
    CC(=O)Cl
      Note that atom order is preserved

    > obabel -:CC(=O)Cl -osmi -xh -h
    [CH3]C(=O)Cl
      1. Add explicit Hs (-h)  2. Show them in the output (-xh)

    > obabel -:CC(=O)Cl -osmi -xf 3
    O=C(C)Cl
      Make atom 3 the first atom…

    > obabel -:CC(=O)Cl -osmi -xf 3 -xl 1
    O=C(Cl)C
      …and atom 1 the last

    > obabel -:CC(=O)Cl -:CC(=O)Cl -osmi -xC
    ClC(=O)C
    O=C(Cl)C
      Random order

    > obabel -:CC(=O)Cl -osmi -xF "2 4"
    CCl
      Fragment SMILES for the fragment composed of atoms 2 and 4

    Take home message: Look through the list of options for file formats
    which you frequently use (and request new options!)
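
The commands on this slide follow one pattern: a SMILES string given with -:, the output format -osmi, then write options prefixed with -x. A small Python sketch of that pattern; the helper name smiles_command is hypothetical, and the code only assembles the argument list (pass it to subprocess.run if obabel is installed):

```python
def smiles_command(smiles, write_options=None, general_options=()):
    """Build an obabel command that writes SMILES with the given options.

    write_options maps a SMILES write option letter to its value,
    or to None for options that take no value.
    """
    cmd = ["obabel", "-:" + smiles, "-osmi"]
    for name, value in (write_options or {}).items():
        cmd.append("-x" + name)      # write options are prefixed with -x
        if value is not None:
            cmd.append(str(value))   # e.g. the atom number for -xf
    cmd.extend(general_options)      # e.g. -h to add explicit hydrogens
    return cmd


if __name__ == "__main__":
    # Atom 3 first and atom 1 last, as on the slide.
    print(smiles_command("CC(=O)Cl", {"f": 3, "l": 1}))
    # Explicit hydrogens, added with -h and shown with -xh.
    print(smiles_command("CC(=O)Cl", {"h": None}, ["-h"]))
```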
Pro tip #1 “obabel -L” is your friend




Information on plugins and plugin options.
What can be done with descriptors and SDF properties?

• Filter based on value or True/False
   --filter "MW<130 & My_Property < 12"
• Sort and reverse sort --sort ~logP
• Take the N largest or smallest (or everything but)
   --largest 5 MW
• Add SDF properties --add MW
• Add to title (useful for depictions) --addtotitle MW
• Remove duplicates --unique cansmi

• Create more descriptors!
   – Group contribution, SMARTS descriptors or compound
     descriptors are easily added via text files*


   * http://open-babel.readthedocs.org/en/latest/WritePlugins/AddNewDescriptor.html
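
For readers who think in code, the options above behave like ordinary list operations. The following plain-Python analogue is only an illustration of what each option does, not how Open Babel implements it; the records, their numbers and their SMILES strings are made up:

```python
import heapq

# Made-up records standing in for molecules with descriptor values.
records = [
    {"title": "a", "MW": 46.07, "logP": -0.1, "cansmi": "CCO"},
    {"title": "b", "MW": 78.11, "logP": 1.7, "cansmi": "c1ccccc1"},
    {"title": "c", "MW": 46.07, "logP": -0.1, "cansmi": "CCO"},
    {"title": "d", "MW": 180.16, "logP": 1.3, "cansmi": "CC(=O)Oc1ccccc1C(=O)O"},
]

# Like --filter "MW<130": keep records whose MW is below 130.
small = [r for r in records if r["MW"] < 130]

# Like --sort ~logP: reverse sort, largest logP first.
by_logp_desc = sorted(records, key=lambda r: r["logP"], reverse=True)

# Like --largest 2 MW: the two records with the largest MW.
heaviest = heapq.nlargest(2, records, key=lambda r: r["MW"])

# Like --unique cansmi: keep the first record for each canonical SMILES.
seen = set()
unique = []
for r in records:
    if r["cansmi"] not in seen:
        seen.add(r["cansmi"])
        unique.append(r)

if __name__ == "__main__":
    print([r["title"] for r in unique])
```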
Pro Tip #2 Faster filtering




Also -aP if filtering based on SDF properties
Pro tip #3 (Ab)use the title output format

• obabel myfile.sdf -otxt
   – List the titles of all of the molecules

• obabel myfile.sdf -otxt --title "" --append MW
   – List the molecular weights of all of the molecules

• obabel myfile.sdf -otxt --title "" --append My_Property
   – List the property value for all of the molecules
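
The same trick is handy from a script, where the empty title can be passed as a real empty argument with no shell quoting to worry about. A minimal Python sketch, assuming obabel is on the PATH and that it writes one value per line to stdout; the helper names and the file name are hypothetical:

```python
import shutil
import subprocess


def append_command(infile, descriptor):
    """Command that blanks each title and then appends a descriptor to it."""
    return ["obabel", infile, "-otxt", "--title", "", "--append", descriptor]


def parse_values(stdout_text):
    """Convert the title-format output (one value per line) to floats."""
    return [float(line) for line in stdout_text.splitlines() if line.strip()]


def molecular_weights(infile):
    """Return the MW of every molecule in infile, or None without obabel."""
    if shutil.which("obabel") is None:
        return None
    result = subprocess.run(append_command(infile, "MW"),
                            capture_output=True, text=True)
    return parse_values(result.stdout)


if __name__ == "__main__":
    print(append_command("myfile.sdf", "MW"))
```

parse_values only suits numeric descriptors such as MW; a text-valued SDF property would need to be kept as strings.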
PNG Depiction
[Example depictions illustrating the options -xC, -xa, -xt, -xu and --highlight "cCO blue"]
ASCII Depiction
Pro Tip #4 SVG + Firefox = User interface
• SVG has same options as PNG…
• …but drag-and-drop onto Firefox and you have
  a zoomable user interface
   – particularly useful for visualising multimolecule
     files
   – Demo showing a 1000 molecule file (only 3MB):
      http://baoilleach.blogspot.co.uk/2011/06/molecular-zooming-with-open-babel-svg.html

• You could create a navigation interface for an
  entire database (sponsorship opportunity!)
   – E.g. make each of 1000 molecules link to another
     SVG with 1000 molecules
• Multimolecule depictions can be aligned based
  on substructure (also PNG)
   – Demo: http://baoilleach.blogspot.co.uk/2012/02/portrait-of-molecule-as-green.html
[Image: depiction of unspecified stereo]
Pro Tip #5 Automatic conversion
On Windows, create a file sdf.bat on your
Desktop with the following text:
     @obabel.exe %1 -O "%~ndp1.%~n0"


If you drag-and-drop a chemical file onto
this, the file will be converted to an SDF
file.

(Rename to mol2.bat for mol2 files, etc.)
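
The batch file works because the output name is built from two pieces: the drive, path and name of the dropped file (%~ndp1) and the name of the batch file itself (%~n0), which doubles as the target extension. A Python sketch of the same naming rule, for illustration only; the function name and the example paths are hypothetical:

```python
from pathlib import PureWindowsPath


def output_name(input_path, script_path):
    """Swap the input file's extension for the script's base name."""
    target_ext = PureWindowsPath(script_path).stem  # like %~n0, e.g. "sdf"
    # Like %~ndp1 followed by "." and the target extension.
    return str(PureWindowsPath(input_path).with_suffix("." + target_ext))


if __name__ == "__main__":
    print(output_name(r"C:\data\aspirin.mol", r"C:\Users\me\Desktop\sdf.bat"))
```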
Alignment
• Open Babel does not have any code to
  determine the maximum common substructure
  (MCS)
   – Sponsorship opportunity ahoy!

• 2D and 3D alignment is supported (--align)
   – Based on Kabsch alignment (minimised RMSD)
   – You either have to align the whole molecule
     (atoms should be in same order) or else a
     specified substructure (SMARTS)

• When aligning 3D structures I find it useful to
  --join the results into a single structure and view
  in a 3D viewer (e.g. Avogadro)
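
To make "minimised RMSD" concrete, here is a pure-Python sketch of the 2D special case: centre both point sets, find the rotation angle that minimises the RMSD in closed form, then rotate and translate. It assumes the atoms are in the same order in both sets, and it is an illustration of the idea rather than Open Babel's own code (which also handles 3D):

```python
import math


def centroid(points):
    """Mean x and y of a list of (x, y) points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def align_2d(mobile, reference):
    """Superimpose mobile onto reference; return (aligned_points, rmsd)."""
    mcx, mcy = centroid(mobile)
    rcx, rcy = centroid(reference)
    p = [(x - mcx, y - mcy) for x, y in mobile]      # centred mobile set
    q = [(x - rcx, y - rcy) for x, y in reference]   # centred reference set
    # The optimal angle maximises sum(q_i . R p_i) over proper rotations R.
    dot = sum(px * qx + py * qy for (px, py), (qx, qy) in zip(p, q))
    cross = sum(px * qy - py * qx for (px, py), (qx, qy) in zip(p, q))
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    aligned = [(c * px - s * py + rcx, s * px + c * py + rcy) for px, py in p]
    squared = sum((ax - rx) ** 2 + (ay - ry) ** 2
                  for (ax, ay), (rx, ry) in zip(aligned, reference))
    return aligned, math.sqrt(squared / len(reference))


if __name__ == "__main__":
    ref = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
    # The same shape rotated by 90 degrees and shifted by (5, -2).
    mob = [(5.0, -2.0), (5.0, -1.0), (4.0, -1.0)]
    print(align_2d(mob, ref)[1])
```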
Spectrophores
•   Donated by Silicos-it, http://silicos-it.com/
•   Usage: obspectrophore -i myfile.extn
•   Requires 3D structure
     – Note: it does not complain if you give it a 2D structure
     – 3D conformation dependent, but orientation independent

•   48-value descriptor based on electrostatic, lipophilic and electrophilic
    property values at points on a grid (or cage) and the atomic shape
    deviation



•   Custom code required to use spectrophores for similarity
•   Silicos-it have previously trained Self-Organising Maps (SOMs) using
    spectrophores for known classes of compounds and used them to
    predict novel compounds for a particular class
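
As a hint of what such custom code could look like, the sketch below compares two descriptor vectors by Euclidean distance. The choice of distance measure is an assumption made for illustration, not something stated on the slide, and the function name is hypothetical:

```python
import math


def euclidean_distance(a, b):
    """Distance between two equal-length descriptor vectors."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


if __name__ == "__main__":
    # Two made-up 48-value descriptors; a smaller distance means more similar.
    first = [0.0] * 48
    second = [0.5] * 48
    print(euclidean_distance(first, second))
```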
Programming with Open Babel
• Sometimes the GUI or command-line interface does not do
  exactly what you want
    – You can write your own applications or scripts

• Choice of C++, Python, Java, .NET, Perl
    – But C++ and Python best supported

• Python is well-established in chemistry
    – Relatively easy to learn
    – Small number of commands
    – Can do a lot in a few lines

• Since the full Open Babel library is quite large, to make it easy
  to get started we provide a Python module Pybel
    – Makes it easy to do the most common operations
    – Very small number of classes and functions
    – The full library is still available under-the-hood

• Google “Open Babel Python”
Using the Python Bindings
import pybel  # Open Babel 2.x; with Open Babel 3.x use: from openbabel import pybel

# Read a molecule
inputfile = pybel.readfile("mol", "tmp.mol")
mol = next(inputfile)

print(mol.molwt) # Show molecular weight
Using the Python Bindings
import pybel

# Loop over multiple molecules
inputfile = pybel.readfile("sdf", "tmp.sdf")
for mol in inputfile:
    # Show molecular weight
    print(mol.molwt)
Using the Python Bindings
import pybel

# Loop over multiple molecules
inputfile = pybel.readfile("sdf", "tmp.sdf")
for mol in inputfile:
    if (mol.title.endswith("_active") and
        mol.molwt > 100 and "S" in mol.formula):
        # Show molecular weight
        print(mol.molwt)
Using the Python Bindings
import pybel

# Loop over multiple molecules
inputfile = pybel.readfile("sdf", "tmp.sdf")
outputfile = pybel.Outputfile("smi", "tmp.smi")
for mol in inputfile:
    if (mol.title.endswith("_active") and
        mol.molwt > 100 and "S" in mol.formula):
        # Add the molecule to the output file
        outputfile.write(mol)
outputfile.close()  # Flush and close the output file
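
One caveat with the test "S" in mol.formula: it is a plain substring check, so a formula containing Si, Se or Sn also matches. The sketch below separates the selection rule into functions that can be tested without Open Babel installed. The function names are hypothetical, and the import fallback assumes Open Babel 3.x (from openbabel import pybel) or 2.x (import pybel):

```python
import re


def contains_sulfur(formula):
    """True if the formula contains the element S (not Si, Se, Sn, ...)."""
    return re.search(r"S(?![a-z])", formula) is not None


def is_wanted(title, molwt, formula):
    """The selection rule from the slide, with a stricter sulfur test."""
    return title.endswith("_active") and molwt > 100 and contains_sulfur(formula)


def filter_file(infile="tmp.sdf", outfile="tmp.smi"):
    """Copy the wanted molecules from an SD file to a SMILES file."""
    try:
        from openbabel import pybel  # Open Babel 3.x
    except ImportError:
        import pybel  # Open Babel 2.x
    output = pybel.Outputfile("smi", outfile, overwrite=True)
    try:
        for mol in pybel.readfile("sdf", infile):
            if is_wanted(mol.title, mol.molwt, mol.formula):
                output.write(mol)
    finally:
        output.close()  # always close, even if reading fails


if __name__ == "__main__":
    print(contains_sulfur("C2H6OS"), contains_sulfur("C3H10Si"))
```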
Learn by playing at the command-line
A cry for help
Like mailing lists?
   openbabel-discuss@lists.sf.net
Like forums?
   http://forums.openbabel.org
Like to email a developer
directly?
   We will ask you to email the
   list :-)


Don’t forget to read the docs first and Google it
   http://openbabel.org/docs

Image: Tintin44 (Flickr)


Intro to Open Babel


Editor's Notes

  • #3: OB is like a Swiss army knife, not a…
  • #4: …spork!
  • #9: Features of obabel, for full info see the docs.
  • #11: The same options are available at the command line…
  • #12: …in the online docs…
  • #13: …in the PDF and the book….
  • #14: …and in the GUI.
  • #27: “The 70s are calling. They want their depiction back.”
  • #28: Follow the links, or else this won’t make sense.
  • #29: Depiction of unspecified stereo.
  • #36: (Tech note: the “next” on the previous page is a Python built-in function, and is called implicitly by the “for” loop)
  • #37: (Tech note: “endswith” and “in” above are features of Python string handling)