Making the most of a QM calculation


                      Noel O’Boyle




www.ccdc.cam.ac.uk
Tools

     • GaussSum

     • cclib

     • Pybel




Themes

   • Interoperability
   • Don't reinvent the wheel
   • Tools add value
   • Libraries spread the work, and increase the
     reach
   • Cross-platform
   • Python where possible
Python is the dominant scripting language in
    chemistry
    • Cheminformatics
        – OpenBabel, RDKit, OEChem, Daylight, Cambios Molecular
          Toolkit, Frowns, PyBabel
    • Computational chemistry
        – OpenBabel, PyQuante, NWChem, Maestro/Jaguar, MMTK
    • Visualisation
        – CCP1GUI, PyMOL, Zeobuilder
    • Scientific programming
        – numpy (interface to ATLAS, LAPACK), can interface to C/C++,
          FORTRAN, matplotlib, VTK


GaussSum
    • GUI written in Python
    • Enables comparisons of calculated properties
      with experimental results
        – orbitals and molecular structure
             • HOMO is 40% Ligand 1, 20% Ligand 2, etc.
        – vibrational frequencies and IR spectrum
             • scale frequencies individually or generally
        – electronic transitions and UV-vis, CD spectra
        – electronic transitions and molecular structure
             • lowest energy transition involves change in ‘charge density’
               on Ligand 1 from 0% to 80%
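
A statement such as "HOMO is 40% Ligand 1" can be obtained from a Mulliken-style population analysis. The following is a minimal sketch of that idea, not GaussSum's actual code: the function name, the coefficients, the overlap matrix and the grouping of basis functions are all hypothetical values chosen for illustration.

```python
def group_contributions(coeffs, overlap, groups):
    """Mulliken-style share of one molecular orbital for each group of basis functions."""
    n = len(coeffs)
    # Gross population of basis function a: c_a * sum_b S_ab * c_b
    gross = [coeffs[a] * sum(overlap[a][b] * coeffs[b] for b in range(n))
             for a in range(n)]
    total = sum(gross)  # equals c^T S c, which is 1 for a normalised orbital
    return {name: sum(gross[a] for a in members) / total
            for name, members in groups.items()}

# Hypothetical three-function example; coefficients and overlaps are made up.
coeffs = [0.6, 0.5, 0.3]
overlap = [[1.0, 0.2, 0.0],
           [0.2, 1.0, 0.1],
           [0.0, 0.1, 1.0]]
groups = {"Ligand 1": [0], "Ligand 2": [1, 2]}

shares = group_contributions(coeffs, overlap, groups)
for name, share in shares.items():
    print("%s: %.0f%%" % (name, 100 * share))
```

Dividing by the total means the shares add up to 100% even if the input orbital is not exactly normalised.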

NM O’Boyle, AL Tenderholt, KM Langner. J. Comp. Chem.
2008, 29, 839. http://gausssum.sf.net
GaussSum
    • Simple features that make life easier for
      modellers
        – ‘grep’ for lines containing particular expressions
             • can store up to four expressions
        – plot convergence of geometry or SCF
             • early warning of problems (unlike plotting of energy)
        – spectra and extracted data are written to files suitable
          for Excel

    • GaussSum is popular...
        – 3300 downloads in the last 12 months; referenced 23 times
          in 2007
    • …but is a simple program
        – Mulliken analysis and convolution of spectra
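
Scaling frequencies and convoluting a stick spectrum are both simple operations. The sketch below illustrates them under stated assumptions: Gaussian line shapes, a single general scaling factor of 0.96, a full width at half maximum of 20 cm^-1, and two made-up peaks. None of these values come from GaussSum itself.

```python
import math

def convolute(peaks, xs, fwhm):
    """Sum one Gaussian per peak, scaled by its intensity, at every x value."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    spectrum = []
    for x in xs:
        total = 0.0
        for position, intensity in peaks:
            total += intensity * math.exp(-((x - position) ** 2) / (2.0 * sigma ** 2))
        spectrum.append(total)
    return spectrum

# Hypothetical stick spectrum: (frequency in cm^-1, intensity) pairs.
peaks = [(1600.0, 50.0), (3650.0, 10.0)]
scale = 0.96  # assumed general scaling factor applied to every frequency
scaled = [(freq * scale, inten) for freq, inten in peaks]

xs = [float(x) for x in range(1000, 4001, 4)]
ys = convolute(scaled, xs, fwhm=20.0)

# Two tab-separated columns, a layout that spreadsheet programs open directly.
lines = ["%.1f\t%.4f" % (x, y) for x, y in zip(xs, ys)]
```

Scaling frequencies individually would replace the single `scale` with one factor per peak.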
Some questions
    • Why is it so easy to add value to QM
      calculations?
        – developers not familiar with needs of users?
    • Why don’t QM software developers list
      compatible tools on their website?
        – Good for the QM software, good for the tool
    • Why don’t QM software developers make it
      easier for tool developers?
        – API, documentation describing output, XML,
          interoperability
    • Why not open source?
        – Could fix these problems myself.
cclib - a Python library for package-independent
    computational chemistry algorithms
•   In Jan 2005, Adam Tenderholt started writing PyMOlyze (now
    QMForge)
     – some overlap with GaussSum
     – we decided to collaborate on a common framework for extracting data from
       QM log files
•   Karol Langner joined in Jan 2007
•   cclib now extracts and standardises data from ADF, GAMESS,
    GAMESS-UK, Gaussian, PC GAMESS, Jaguar, Molpro, ORCA...
    (someone offered this week to help with ACES, Dalton, NWChem, and
    PSI too)

NM O’Boyle, AL Tenderholt, KM Langner. J. Comp. Chem.
2008, 29, 839. http://cclib.sf.net

Why is cclib needed?

    • Analysis methods are available only to users of
      certain packages
        – Morokuma energy decomposition (implemented in
          GAMESS)
        – Charge Decomposition Analysis (Frenking's code
          only reads Gaussian output files)
    • Keeps up to date with new versions of packages
    • Allows chemists to focus on algorithms
    • Makes implementation of algorithms
      independent of proprietary software
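
As an illustration of a package-independent algorithm, the sketch below computes a HOMO-LUMO gap using only the standardised attribute names `moenergies` and `homos`. It assumes a zero-based HOMO index and one list of orbital energies per spin channel; the stand-in object and its energies are hypothetical, so that the example runs without any QM output file.

```python
from types import SimpleNamespace

def homo_lumo_gaps(data):
    """Return the HOMO-LUMO gap (eV) for each spin channel of a parsed-data object."""
    gaps = []
    for energies, homo in zip(data.moenergies, data.homos):
        gaps.append(energies[homo + 1] - energies[homo])
    return gaps

# Hypothetical stand-in for parsed output; the orbital energies are made up.
data = SimpleNamespace(
    moenergies=[[-20.5, -1.3, -0.6, -0.4, 0.5, 0.7]],  # eV, one list per spin channel
    homos=[3],                                          # zero-based index of the HOMO
)

gaps = homo_lumo_gaps(data)
print(gaps)
```

Because the function only touches the standardised attributes, it does not need to know which QM package produced the log file.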
>>> from cclib.parser import ccopen
>>> myfile = ccopen("basicGAMESS-UK/water_mp3.out")
>>> data = myfile.parse()
>>> dir(data)
['__class__', '__delattr__', '__dict__', '__doc__',
'__getattribute__', '__hash__', '__init__', '__module__',
'__new__', '__reduce__', '__reduce_ex__', '__repr__',
'__setattr__', '__str__', '__weakref__', '_attrlist',
'_attrtypes', '_intarrays', '_listsofarrays', 'aonames',
'arrayify', 'atombasis', 'atomcoords', 'atomnos', 'charge',
'coreelectrons', 'gbasis', 'getattributes', 'homos',
'listify', 'mocoeffs', 'moenergies', 'mosyms',
'mpenergies', 'mult', 'natom', 'nbasis', 'nmo',
'scfenergies', 'scftargets', 'scfvalues', 'setattributes']
>>> print(data.nbasis)
7
>>> print(data.atomcoords)
[[[ 0.         0.        -0.2251786]
  [ 0.         1.4941103  0.9007143]
  [ 0.        -1.4941103  0.9007143]]]


    Attribute name  Description                                               Units       Datatype
    aonames         atomic orbital names                                                  list
    aooverlaps      atomic orbital overlap matrix                                         array of rank 2
    atomcoords      atom coordinates                                          Å           array of rank 3
    atomnos         atomic numbers                                                        array of rank 1
    coreelectrons   number of core electrons in an atom's pseudopotential                 array of rank 1
    etenergies      energies of electronic transitions                        cm^-1       array of rank 1
    etoscs          oscillator strengths of electronic transitions                        array of rank 1
    etrotats        rotatory strengths of electronic transitions                          array of rank 1
    etsecs          singly-excited configurations for electronic transitions              list of lists
    etsyms          symmetries of electronic transitions                                  list
    fonames         fragment molecular orbital names                                      list
    fooverlaps      fragment molecular orbital overlap matrix                             array of rank 2
    gbasis          coefficients and exponents of Gaussian basis functions                PyQuante format
    geotargets      criteria target values for geometry convergence                       array of rank 1
    geovalues       criteria values for geometry convergence                              array of rank 2
    homos           molecular orbital index of the HOMO(s)                                array of rank 1
    mocoeffs        molecular orbital coefficients                                        list of arrays of rank 2
    moenergies      molecular orbital energies                                eV          list of arrays of rank 1
    mosyms          molecular orbital symmetries                                          list of lists
    mpenergies      Møller-Plesset corrected electronic energies              eV          array of rank 2
    natom           number of atoms                                                       integer
    nbasis          number of basis functions                                             integer
    nmo             number of molecular orbitals                                          integer
    scfenergies     electronic energy of the molecule                         eV          array of rank 1
    scftargets      criteria target values for SCF convergence                            array of rank 2
    scfvalues       criteria values for SCF convergence                                   list of arrays of rank 2
    vibdisps        Cartesian displacement vectors                            ΔÅ          array of rank 3
    vibfreqs        vibrational frequencies                                   cm^-1       array of rank 1
    vibirs          IR intensities                                            km mol^-1   array of rank 1
    vibramans       Raman intensities                                         Å^4 amu^-1  array of rank 1
    vibsyms         symmetries of vibrations                                              list
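
The convergence attributes in the table lend themselves to a simple check. The sketch below assumes that `geotargets` holds one target per criterion and `geovalues` one row of criterion values per optimisation step; the numbers are hypothetical and the helper name is not part of cclib.

```python
from types import SimpleNamespace

def converged_steps(values, targets):
    """For each step, report whether every criterion is at or below its target."""
    return [all(abs(v) <= t for v, t in zip(step, targets)) for step in values]

# Hypothetical stand-in for parsed geometry-optimisation data.
data = SimpleNamespace(
    geotargets=[0.00045, 0.0003, 0.0018, 0.0012],
    geovalues=[
        [0.0120, 0.0051, 0.0400, 0.0210],
        [0.0009, 0.0004, 0.0030, 0.0016],
        [0.0002, 0.0001, 0.0011, 0.0006],
    ],
)

flags = converged_steps(data.geovalues, data.geotargets)
first = flags.index(True) if True in flags else None
print(flags, first)
```

The same function could be applied to each element of `scfvalues` against the matching row of `scftargets`, assuming the layouts given in the table.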



  ..\data\ADF\ADF2004.01\MoOCl4-sp.adfout.bz2... parsed
  ..\data\ADF\ADF2004.01\mo_sp.adfout.bz2... parsed
  ..\data\ADF\ADF2004.01\NH3.adfout.bz2... parsed
  ..\data\ADF\ADF2005.01\Os3(CO)12-D3h.zip... parsed
  ..\data\ADF\ADF2005.01\Os3.zip... parsed
  ..\data\ADF\ADF2006.01\Au2.out... parsed
  ..\data\ADF\ADF2006.01\Frags_NiCO4_orig.out... parsed
  ..\data\ADF\ADF2006.01\HgMeBr_zso_orig.out... parsed
  ..\data\ADF\ADF2006.01\dvb_gopt.adfout.bz2... parsed
Are the GAMESS UK files ccopened and parsed correctly?
  ..\data\GAMESS-UK\basicGAMESS-UK\dvb_gopt.out... parsed
  ..\data\GAMESS-UK\basicGAMESS-UK\dvb_gopt_b.out... parsed
  ..\data\GAMESS-UK\basicGAMESS-UK\dvb_gopt_c.out... parsed
  ..\data\GAMESS-UK\basicGAMESS-UK\dvb_gopt_d.out... parsed
  ..\data\GAMESS-UK\basicGAMESS-UK\dvb_ir.out... parsed
  ..\data\GAMESS-UK\basicGAMESS-UK\dvb_raman.out... parsed
  ..\data\GAMESS-UK\basicGAMESS-UK\dvb_sp.out... parsed
  ..\data\GAMESS-UK\basicGAMESS-UK\dvb_sp_b.out... parsed
  ..\data\GAMESS-UK\basicGAMESS-UK\dvb_un_sp.out... parsed
  ..\data\GAMESS-UK\basicGAMESS-UK\dvb_un_sp_b.out... parsed
  ..\data\GAMESS-UK\basicGAMESS-UK\MoOCl4-sp.out... parsed
  ..\data\GAMESS-UK\basicGAMESS-UK\water_mp2.out... parsed
  ..\data\GAMESS-UK\basicGAMESS-UK\water_mp3.out... parsed
  ..\data\GAMESS-UK\GAMESS-UK6.0\dscf_4.out.gz... parsed
  ..\data\GAMESS-UK\GAMESS-UK6.0\duhf_1.out.gz... parsed
  ..\data\GAMESS-UK\GAMESS-UK7.0\mg10.out.gz... parsed
  ..\data\GAMESS-UK\GAMESS-UK7.0\pyridine.out.gz... parsed
  ..\data\GAMESS-UK\GAMESS-UK7.0\pyridine2_21m10r.out.gz... parsed
Are the Jaguar files ccopened and parsed correctly?
  ..\data\Jaguar\Jaguar4.2\dvb_gopt.out.bz2... parsed
  ..\data\Jaguar\Jaguar4.2\dvb_gopt_b.out.bz2... parsed
  ..\data\Jaguar\Jaguar4.2\dvb_ir.out.bz2... parsed
  ..\data\Jaguar\Jaguar4.2\dvb_sp.out.bz2... parsed
Total: 147   Failed: 0   Errors: 2
**** testGeoOpt: GAMESS-UK geometry optimization unittest. ****
Are the indices in atombasis the right amount and unique? ... ok
Are atomcoords consistent with natom and Angstroms? ... ok
Are the atomnos correct? ... ok
Are the charge and multiplicity correct? ... ok
Are the coreelectrons all 0? ... ok
Are the dimensions of mocoeffs equal to 1 x (homo+5) x nbasis? ... ok
Do the geo targets have the right dimensions? ... ok
Are atomcoords consistent with geovalues? ... ok
Are scfvalues consistent with geovalues? ... ok
Is the index of the HOMO equal to 34? ... ok
Is the number of evalues equal to nmo? ... ok
Is the number of atoms equal to 20? ... ok
Is the number of basis set functions correct? ... ok
Did this subclass overwrite normalisesym? ... ok
Is the SCF energy within 40eV of target? ... ok
Do the scf targets have the right dimensions? ... ok
Are scfvalues and its elements the right type? ... ok
Are all the symmetry labels either Ag/u or Bg/u? ... ok
Is moenergies a list containing one numpy array? ... ok
----------------------------------------------------------------------
Ran 19 tests in 0.016s
********* SUMMARY PER PACKAGE ****************
              Total   Passed   Failed   Errors   Skipped
ADF2007.01       48       46        0        0         2
GAMESS-UK        58       58        0        0         0
GAMESS-US        75       71        2        0         2
Gaussian03       92       88        1        0         3
Jaguar7.0        54       47        0        0         7
Molpro2006       63       59        0        0         4
ORCA2.6          54       44        5        3         2
PCGAMESS         75       74        0        0         1

********* SUMMARY OF EVERYTHING **************
TOTAL: 519      PASSED: 487     FAILED: 8        ERRORS: 3      SKIPPED: 21

But it’s Python! I only code C, FORTRAN, etc.


   • Use cclib to convert the log file to JSON
   • JSON libraries are available for
       – C, C++, Java, Javascript, Perl, PHP, Python, Ruby
   • Could easily write a converter to some type of FORTRAN
     format
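
The JSON route can be sketched with the Python standard library alone. The helper names below are hypothetical, and a stand-in object replaces the result of `myfile.parse()` so that the example runs without cclib installed; with real parsed data, array attributes would be converted through their `tolist()` method.

```python
import json
from types import SimpleNamespace

def to_jsonable(value):
    """Convert arrays (anything with .tolist()) and nested containers to plain Python types."""
    if hasattr(value, "tolist"):
        return value.tolist()
    if isinstance(value, (list, tuple)):
        return [to_jsonable(v) for v in value]
    if isinstance(value, dict):
        return {str(k): to_jsonable(v) for k, v in value.items()}
    return value

def data_to_json(data, attributes):
    """Serialise the named attributes that are present on a parsed-data object."""
    payload = {name: to_jsonable(getattr(data, name))
               for name in attributes if hasattr(data, name)}
    return json.dumps(payload, sort_keys=True)

# Stand-in for parsed data; the values follow the water example shown earlier.
data = SimpleNamespace(
    natom=3,
    nbasis=7,
    atomcoords=[[[0.0, 0.0, -0.2251786],
                 [0.0, 1.4941103, 0.9007143],
                 [0.0, -1.4941103, 0.9007143]]],
)

text = data_to_json(data, ["natom", "nbasis", "atomcoords", "vibfreqs"])
print(text)
```

Attributes that a given calculation did not produce (here `vibfreqs`) are simply left out, so a reader in C or FORTRAN only has to handle the keys that are present.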




Some questions
    • Why don’t QM software developers list compatible tools
      on their website?
        – Good for the QM software, good for the tool
    • Why don’t QM software developers make it easier for
      tool developers?
        – API, documentation describing output, XML, interoperability
    • Why not open source?
        – Could fix these problems myself
    • Why can’t I mix and match calculation methods from
      different programs?
    • Why do academics restrict usage of their sophisticated
      routines to a single proprietary code?

OpenBabel - “Not just file conversion”
    •   A C++ library for…
    •   Cheminformatics
        – SMARTS searching, InChI, SMILES, molecular fingerprints, group-
          contribution based descriptors, determination of SSSR, bond order
          perception, hydrogen addition, Gasteiger charge calculation
    •   Computational chemistry
        – AMBER, DMol3, Gaussian, GAMESS, GROMOS96, HyperChem,
          Jaguar, MOPAC, Q-Chem, Turbomole, ZINDO
             • varying levels of support
        – forcefield minimisation (UFF, MMFF94, Ghemical)
        – symmetrisation of almost symmetric molecules (coming soon)


                         http://openbabel.org

Language bindings…and wrappers
    • OpenBabel is a C++ library
    • SWIG allows access to OpenBabel from
        – Java, Perl, Python, Ruby (and many more if we wish)

    • SWIG bindings are direct 1-to-1 translation of C++ API
      and objects to a Python API and objects
    • Pybel is a Pythonic wrapper around the SWIG bindings
        – Makes it easy to carry out common tasks
        – Allows idiomatic Python, e.g. using iterators, direct access to
          attribute values rather than Get/Set, reduces verbosity

NM O’Boyle, C Morley, GR Hutchison. Chem. Cent. J. 2008,
2, 5. http://openbabel.org/wiki/Python

Let’s read a MOL file and optimise the geometry with
 the UFF forcefield


 SWIG bindings
 import openbabel as ob
 obconv = ob.OBConversion()
 obconv.SetInFormat("mol")
 obmol = ob.OBMol()
 obconv.ReadFile(obmol, "caffeine.mol")
 obff = ob.OBForceField.FindForceField("UFF")
 obff.Setup(obmol)
 obff.ConjugateGradients(1000)  # at most 1000 steps
 obff.UpdateCoordinates(obmol)

 Pybel
 import pybel
 mol = next(pybel.readfile("mol", "caffeine.mol"))  # first molecule in the file
 mol.optimise("UFF") # Coming soon!

Some questions
    • Why do some visualisation packages use their
      own parsing routines instead of adding them to
      libraries like OpenBabel or cclib?
    • Why don’t QM packages donate code or
      contract developers to improve support in
      libraries like OpenBabel or cclib?
        – ADF is doing this
    • How can we coordinate interoperability?
             …

• I propose blueobelisk-qm@lists.sf.net
Make it work on Windows!
    • Most users use Windows, and even Linux users
      want the option of jumping between OSs
    • You restrict the reach of your software (and
      hasten its replacement)
    • Case study cclib-0.8 (Nov 07):
        – cclib-0.8.tar.gz 63
        – cclib-0.8.zip 58
        – cclib-0.8-py2.4.exe 26
        – cclib-0.8-py2.5.exe 45
    • For every Linux user, there are 2 Windows
      users
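
The two-to-one figure follows from the download counts in the case study. The short calculation below reproduces it under one assumption that the slide does not state explicitly: the .tar.gz is downloaded by Linux users and every other file by Windows users.

```python
# Download counts for cclib-0.8 (Nov 07), taken from the case study.
downloads = {
    "cclib-0.8.tar.gz": 63,
    "cclib-0.8.zip": 58,
    "cclib-0.8-py2.4.exe": 26,
    "cclib-0.8-py2.5.exe": 45,
}

# Assumption: .tar.gz means Linux, everything else means Windows.
linux = sum(n for name, n in downloads.items() if name.endswith(".tar.gz"))
windows = sum(n for name, n in downloads.items() if not name.endswith(".tar.gz"))
ratio = windows / float(linux)

print("Windows:Linux = %d:%d (about %.1f to 1)" % (windows, linux, ratio))
```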

Make it easy to install on Windows!
    • No dependencies
    • Case study: GaussSum 2.1.4 (Nov 2007)
        – GaussSum-2.1.4.tar.gz 143 (Linux)
        – GaussSum-2.1.4.zip 206 (Windows, requires Python, Numpy
          and Python Imaging Library)
        – GaussSumexe-2.1.4.zip 396 (Windows, no dependencies)

    • Lower the barrier to installation
        – A one-click installer > a .zip file >> a .tar.gz file
        – Make the installation instructions easy
    • Case study: OpenBabel
        – OB 2.0.1 Linux:Windows 5:4
        – OB 2.1.1 Linux:Windows 5:7.5
Thanks!
    • The OpenBabel development team and particularly
      Geoff Hutchison and Chris Morley
    • cclib: Adam Tenderholt and Karol Langner
    • SourceForge


    • Email: baoilleach@gmail.com, oboyle@ccdc.cam.ac.uk
    • Blog: http://baoilleach.blogspot.com
    • Website: http://www.redbrick.dcu.ie/~noel




www.ccdc.cam.ac.uk

More Related Content

PPTX
лекция 1 обзор методов вычислительной физики
PPTX
лекция 3 дефекты в полупроводниках ga n alsb
PDF
Mphys project - Towards a three-step laser excitation of rubidium Rydberg sta...
PPTX
лекция 2 атомные смещения в бинарных сплавах
PPTX
Towards a three-step laser excitation of rubidium Rydberg states for use in a...
PPTX
Quantum storage and manipulation of heralded single photons in atomic quantum...
PPTX
Solid State NMR
DOCX
Nuclear magnetic resonance
лекция 1 обзор методов вычислительной физики
лекция 3 дефекты в полупроводниках ga n alsb
Mphys project - Towards a three-step laser excitation of rubidium Rydberg sta...
лекция 2 атомные смещения в бинарных сплавах
Towards a three-step laser excitation of rubidium Rydberg states for use in a...
Quantum storage and manipulation of heralded single photons in atomic quantum...
Solid State NMR
Nuclear magnetic resonance

What's hot (11)

PPTX
Chapter 1 pt 2
PPTX
Advanced Molecular Dynamics 2016
PDF
Solid state nmr
PDF
Electronic structure of strongly correlated materials Part III V.Anisimov
PDF
Periodically Poled Lithium Niobate Waveguides for Quantum Frequency Conversio...
PDF
call for papers, research paper publishing, where to publish research paper, ...
PDF
Semiconductor qubits in practice
PDF
Two dimensional nmr spectroscopy (practical application and spectral analysis
PDF
Coupling Maxwell\'s Equations to Particle-Based Simulators
PDF
Mott insulators
PDF
Lecture 6: Junction Characterisation
Chapter 1 pt 2
Advanced Molecular Dynamics 2016
Solid state nmr
Electronic structure of strongly correlated materials Part III V.Anisimov
Periodically Poled Lithium Niobate Waveguides for Quantum Frequency Conversio...
call for papers, research paper publishing, where to publish research paper, ...
Semiconductor qubits in practice
Two dimensional nmr spectroscopy (practical application and spectral analysis
Coupling Maxwell\'s Equations to Particle-Based Simulators
Mott insulators
Lecture 6: Junction Characterisation
Ad

Viewers also liked (6)

PDF
Hyperchem Ma, badbarcode en_1109_nocomment-final
PPT
Quantum pharmacology. Basics
PDF
Data Analysis in QSAR
KEY
Electronegativity
PPT
Qsar lecture
Hyperchem Ma, badbarcode en_1109_nocomment-final
Quantum pharmacology. Basics
Data Analysis in QSAR
Electronegativity
Qsar lecture
Ad

Similar to Making the most of a QM calculation (20)

PDF
Electron Density Derived Descriptors in Drug Discovery and Protein Modeling
PDF
UV PES.pdf
PDF
Gate syllabus
PPT
Current Research on Quantum Algorithms.ppt
DOCX
Syllabus 4th sem
PPT
7926563mocskoff pack method k sampling.ppt
PDF
SCF methods, basis sets, and integrals part III
PDF
Quantum Computation for Predicting Electron and Phonon Properties of Solids
PDF
Uniform_Reaction_Rate_ROM_slides_2013.pdf
PPTX
Professional Technology Style by Slidesgo.pptx
PPTX
Recent developments for the quantum chemical investigation of molecular syste...
PDF
ESS-Bilbao Initiative Workshop. Low Energy Transport and space-charge compens...
PDF
Basic execution
PPTX
eQCM - Electrochemical Quartz Crystal Microbalance
DOCX
Ece syllabus
PDF
NMR Random Coil Index & Protein Dynamics
PPT
Nano-electronics
PDF
Topology of charge density from pseudopotential density functional theory cal...
PDF
Structural, electronic, elastic, optical and thermodynamical properties of zi...
PPT
Potential Energy Surface & Molecular Graphics
Electron Density Derived Descriptors in Drug Discovery and Protein Modeling
UV PES.pdf
Gate syllabus
Current Research on Quantum Algorithms.ppt
Syllabus 4th sem
7926563mocskoff pack method k sampling.ppt
SCF methods, basis sets, and integrals part III
Quantum Computation for Predicting Electron and Phonon Properties of Solids
Uniform_Reaction_Rate_ROM_slides_2013.pdf
Professional Technology Style by Slidesgo.pptx
Recent developments for the quantum chemical investigation of molecular syste...
ESS-Bilbao Initiative Workshop. Low Energy Transport and space-charge compens...
Basic execution
eQCM - Electrochemical Quartz Crystal Microbalance
Ece syllabus
NMR Random Coil Index & Protein Dynamics
Nano-electronics
Topology of charge density from pseudopotential density functional theory cal...
Structural, electronic, elastic, optical and thermodynamical properties of zi...
Potential Energy Surface & Molecular Graphics

More from baoilleach (20)

PPTX
We need to talk about Kekulization, Aromaticity and SMILES
PPTX
Open Babel project overview
PPTX
So I have an SD File... What do I do next?
PPTX
Chemistrify the Web
PPTX
Universal Smiles: Finally a canonical SMILES string
PPTX
What's New and Cooking in Open Babel 2.3.2
PPTX
Intro to Open Babel
PPT
Protein-ligand docking
PPTX
Cheminformatics
PPTX
Large-scale computational design and selection of polymers for solar cells
PDF
My Open Access papers
PPTX
Improving the quality of chemical databases with community-developed tools (a...
PPTX
De novo design of molecular wires with optimal properties for solar energy co...
PPTX
Cinfony - Bring cheminformatics toolkits into tune
PPT
Density functional theory calculations on Ruthenium polypyridyl complexes inc...
PDF
Application of Density Functional Theory to Scanning Tunneling Microscopy
PPT
Towards Practical Molecular Devices
PPT
Why multiple scoring functions can improve docking performance - Testing hypo...
PPT
Why multiple scoring functions can improve docking performance - Testing hypo...
PPT
Improving enrichment rates
We need to talk about Kekulization, Aromaticity and SMILES
Open Babel project overview
So I have an SD File... What do I do next?
Chemistrify the Web
Universal Smiles: Finally a canonical SMILES string
What's New and Cooking in Open Babel 2.3.2
Intro to Open Babel
Protein-ligand docking
Cheminformatics
Large-scale computational design and selection of polymers for solar cells
My Open Access papers
Improving the quality of chemical databases with community-developed tools (a...
De novo design of molecular wires with optimal properties for solar energy co...
Cinfony - Bring cheminformatics toolkits into tune
Density functional theory calculations on Ruthenium polypyridyl complexes inc...
Application of Density Functional Theory to Scanning Tunneling Microscopy
Towards Practical Molecular Devices
Why multiple scoring functions can improve docking performance - Testing hypo...
Why multiple scoring functions can improve docking performance - Testing hypo...
Improving enrichment rates

Recently uploaded (20)

PDF
Supply Chain Operations Speaking Notes -ICLT Program
PDF
O7-L3 Supply Chain Operations - ICLT Program
PDF
The Lost Whites of Pakistan by Jahanzaib Mughal.pdf
PPTX
Pharma ospi slides which help in ospi learning
PPTX
Final Presentation General Medicine 03-08-2024.pptx
PDF
TR - Agricultural Crops Production NC III.pdf
PPTX
Pharmacology of Heart Failure /Pharmacotherapy of CHF
PDF
Mark Klimek Lecture Notes_240423 revision books _173037.pdf
PDF
ANTIBIOTICS.pptx.pdf………………… xxxxxxxxxxxxx
PPTX
IMMUNITY IMMUNITY refers to protection against infection, and the immune syst...
PDF
Physiotherapy_for_Respiratory_and_Cardiac_Problems WEBBER.pdf
PDF
2.FourierTransform-ShortQuestionswithAnswers.pdf
PPTX
Cell Structure & Organelles in detailed.
PDF
Insiders guide to clinical Medicine.pdf
PPTX
PPT- ENG7_QUARTER1_LESSON1_WEEK1. IMAGERY -DESCRIPTIONS pptx.pptx
PPTX
Cell Types and Its function , kingdom of life
PDF
Basic Mud Logging Guide for educational purpose
PPTX
Introduction to Child Health Nursing – Unit I | Child Health Nursing I | B.Sc...
PDF
Complications of Minimal Access Surgery at WLH
PDF
3rd Neelam Sanjeevareddy Memorial Lecture.pdf
Supply Chain Operations Speaking Notes -ICLT Program
O7-L3 Supply Chain Operations - ICLT Program
The Lost Whites of Pakistan by Jahanzaib Mughal.pdf
Pharma ospi slides which help in ospi learning
Final Presentation General Medicine 03-08-2024.pptx
TR - Agricultural Crops Production NC III.pdf
Pharmacology of Heart Failure /Pharmacotherapy of CHF
Mark Klimek Lecture Notes_240423 revision books _173037.pdf
ANTIBIOTICS.pptx.pdf………………… xxxxxxxxxxxxx
IMMUNITY IMMUNITY refers to protection against infection, and the immune syst...
Physiotherapy_for_Respiratory_and_Cardiac_Problems WEBBER.pdf
2.FourierTransform-ShortQuestionswithAnswers.pdf
Cell Structure & Organelles in detailed.
Insiders guide to clinical Medicine.pdf
PPT- ENG7_QUARTER1_LESSON1_WEEK1. IMAGERY -DESCRIPTIONS pptx.pptx
Cell Types and Its function , kingdom of life
Basic Mud Logging Guide for educational purpose
Introduction to Child Health Nursing – Unit I | Child Health Nursing I | B.Sc...
Complications of Minimal Access Surgery at WLH
3rd Neelam Sanjeevareddy Memorial Lecture.pdf

Making the most of a QM calculation

  • 1. Making the most of a QM calculation Noel O’Boyle www.ccdc.cam.ac.uk
  • 2. Tools • GaussSum • cclib • Pybel www.ccdc.cam.ac.uk
  • 3. Themes • Interoperability • Reinvent the wheel • Tools add value • Libraries spread the work, and increase the reach • Cross-platform • Python where possible www.ccdc.cam.ac.uk
  • 4. Python is the dominant scripting language in chemistry • Cheminformatics – OpenBabel, RDKit, OEChem, Daylight, Cambios Molecular Toolkit, Frowns, PyBabel • Computational chemistry – OpenBabel, PyQuante, NWChem, Maestro/Jaguar, MMTK • Visualisation – CCP1GUI, PyMOL, Zeobuilder • Scientific programming – numpy (interface to ATLAS, LAPACK), can interface to C/C++, FORTRAN, matplotlib, VTK www.ccdc.cam.ac.uk
  • 5. GaussSum • GUI written in Python • Enables comparisons of calculated properties with experimental results – orbitals and molecular structure • HOMO is 40% Ligand 1, 20% Ligand 2, etc. – vibrational frequencies and IR spectrum • scale frequencies individually or generally – electronic transitions and UV-vis, CD spectra – electronic transitions and molecular structure • lowest energy transition involves change in ‘charge density’ on Ligand 1 from 0% to 80% NM O’Boyle, AL Tenderholt, KM Langner. J. Comp. Chem. 2008, 29, 839. http://gausssum.sf.net www.ccdc.cam.ac.uk
  • 7. GaussSum • Simple features that make life easier for modellers – ‘grep’ for lines containing particular expressions • can store up to four expressions – plot convergence of geometry or SCF • early warning of problems (unlike plotting of energy) – spectra and extracted data are written to files suitable for Excel • GaussSum is popular... – 3300 downloads last 12 months - referenced 23 times in 2007 • …but is a simple program – Mulliken analysis and convolution of spectra www.ccdc.cam.ac.uk
  • 8. Some questions • Why is it so easy to add value to QM calculations? – developers not familiar with needs of users? • Why don’t QM software developers list compatible tools on their website? – Good for the QM software, good for the tool • Why don’t QM software developers make it easier for tool developers? – API, documentation describing output, XML, interoperability • Why not open source? – Could fix these problems myself. www.ccdc.cam.ac.uk
  • 9. cclib - a Python library for package- independent computational chemistry algorithms • In Jan 2005, Adam Tenderholt started writing PyMOlyze (now QMForge) – some overlap with GaussSum – we decided to collaborate on a common framework for extracting data from QM log files • Karol Langner joined in Jan 2007 • cclib now extracts and standardises data from ADF, GAMESS, GAMESS-UK, Gaussian, PC GAMESS, Jaguar, Molpro, ORCA... (someone offered this week to help with ACES, Dalton, NWChem, and PSI too) NM O’Boyle, AL Tenderholt, KM Langner. J. Comp. Chem. 2008, 29, 839. http://cclib.sf.net www.ccdc.cam.ac.uk
  • 10. Why is cclib needed? • Analysis methods are available only to users of certain packages – Morokuma energy decomposition (implemented in GAMESS) – Charge Decomposition Analysis (Frenking's code only reads Gaussian output files) • Keeps up to date with new versions of packages • Allows chemists to focus on algorithms • Makes implementation of algorithms independent of proprietary software www.ccdc.cam.ac.uk
  • 11. >>> from cclib.parser import ccopen >>> myfile = ccopen("basicGAMESS-UK/water_mp3.out") >>> data = myfile.parse() >>> dir(data) ['__class__', '__delattr__', '__dict__', '__doc__', '__getattribute__', '__hash__', '__init__', '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__str__', '__weakref__', '_attrlist', '_attrtypes', '_intarrays', '_listsofarrays', 'aonames', 'arrayify', 'atombasis', 'atomcoords', 'atomnos', 'charge', 'coreelectrons', 'gbasis', 'getattributes', 'homos', 'listify', 'mocoeffs', 'moenergies', 'mosyms', 'mpenergies', 'mult', 'natom', 'nbasis', 'nmo', 'scfenergies', 'scftargets', 'scfvalues', 'setattributes'] >>> print data.nbasis 7 >>> print data.atomcoords [[[ 0. 0. -0.2251786] [ 0. 1.4941103 0.9007143] [ 0. -1.4941103 0.9007143]]] >>> www.ccdc.cam.ac.uk
  • 12. Attribute Description Units Datatype Name aonames atomic orbital names List aooverlaps atomic orbital overlap matrix array of rank 2 atomcoords atom coordinates Å array of rank 3 atomnos atomic numbers array of rank 1 coreelectrons number of core electrons in an atom's pseudopotential array of rank 1 -1 etenergies energies of electronic transitions cm array of rank 1 etoscs oscillator strengths of electronic transitions array of rank 1 etrotats rotatory strengths of electronic transitions array of rank 1 etsecs singly-excited configurations for electronic transitions list of lists etsyms symmetries of electronic transitions List fonames fragment molecular orbital names List fooverlaps fragment molecular orbital overlap matrix array of rank 2 gbasis coefficients and exponents of Gaussian basis functions PyQuante format geotargets criteria target values for geometry convergence array of rank 1 geovalues criteria values for geometry convergence array of rank 2 homos molecular orbital index of the HOMO(s) array of rank 1 mocoeffs molecular orbital coefficients list of arrays of rank 2 moenergies molecular orbital energies eV list of arrays of rank 1 mosyms molecular orbital symmetries list of lists mpenergies Möller-Plesset corrected electronic energies eV array of rank 2 natom number of atoms Integer nbasis number of basis functions Integer nmo number of molecular orbitals Integer scfenergies electronic energy of the molecule eV array of rank 1 scftargets criteria target values for SCF convergence array of rank 2 scfvalues criteria values for SCF convergence list of arrays of rank 2 vibdisps Cartesian displacement vectors ΔÅ array of rank 3 -1 vibfreqs vibrational frequencies cm array of rank 1 -1 vibirs IR intensities km mol array of rank 1 4 -1 vibramans Raman intensities A amu array of rank 1 vibsyms Symmetries of vibrations List www.ccdc.cam.ac.uk
  • 13. ..dataADFADF2004.01MoOCl4-sp.adfout.bz2... parsed ..dataADFADF2004.01mo_sp.adfout.bz2... parsed ..dataADFADF2004.01NH3.adfout.bz2... parsed ..dataADFADF2005.01Os3(CO)12-D3h.zip... parsed ..dataADFADF2005.01Os3.zip... parsed ..dataADFADF2006.01Au2.out... parsed ..dataADFADF2006.01Frags_NiCO4_orig.out... parsed ..dataADFADF2006.01HgMeBr_zso_orig.out... parsed ..dataADFADF2006.01dvb_gopt.adfout.bz2... parsed Are the GAMESS UK files ccopened and parsed correctly? ..dataGAMESS-UKbasicGAMESS-UKdvb_gopt.out... parsed ..dataGAMESS-UKbasicGAMESS-UKdvb_gopt_b.out... parsed ..dataGAMESS-UKbasicGAMESS-UKdvb_gopt_c.out... parsed ..dataGAMESS-UKbasicGAMESS-UKdvb_gopt_d.out... parsed ..dataGAMESS-UKbasicGAMESS-UKdvb_ir.out... parsed ..dataGAMESS-UKbasicGAMESS-UKdvb_raman.out... parsed ..dataGAMESS-UKbasicGAMESS-UKdvb_sp.out... parsed ..dataGAMESS-UKbasicGAMESS-UKdvb_sp_b.out... parsed ..dataGAMESS-UKbasicGAMESS-UKdvb_un_sp.out... parsed ..dataGAMESS-UKbasicGAMESS-UKdvb_un_sp_b.out... parsed ..dataGAMESS-UKbasicGAMESS-UKMoOCl4-sp.out... parsed ..dataGAMESS-UKbasicGAMESS-UKwater_mp2.out... parsed ..dataGAMESS-UKbasicGAMESS-UKwater_mp3.out... parsed ..dataGAMESS-UKGAMESS-UK6.0dscf_4.out.gz... parsed ..dataGAMESS-UKGAMESS-UK6.0duhf_1.out.gz... parsed ..dataGAMESS-UKGAMESS-UK7.0mg10.out.gz... parsed ..dataGAMESS-UKGAMESS-UK7.0pyridine.out.gz... parsed ..dataGAMESS-UKGAMESS-UK7.0pyridine2_21m10r.out.gz... parsed Are the Jaguar files ccopened and parsed correctly? ..dataJaguarJaguar4.2dvb_gopt.out.bz2... parsed ..dataJaguarJaguar4.2dvb_gopt_b.out.bz2... parsed ..dataJaguarJaguar4.2dvb_ir.out.bz2... parsed ..dataJaguarJaguar4.2dvb_sp.out.bz2... parsed Total: 147 Failed: 0 Errors: 2
  • 14. **** testGeoOpt: GAMESS-UK geometry optimization unittest. ****
      Are the indices in atombasis the right amount and unique? ... ok
      Are atomcoords consistent with natom and Angstroms? ... ok
      Are the atomnos correct? ... ok
      Are the charge and multiplicity correct? ... ok
      Are the coreelectrons all 0? ... ok
      Are the dimensions of mocoeffs equal to 1 x (homo+5) x nbasis? ... ok
      Do the geo targets have the right dimensions? ... ok
      Are atomcoords consistent with geovalues? ... ok
      Are scfvalues consistent with geovalues? ... ok
      Is the index of the HOMO equal to 34? ... ok
      Is the number of evalues equal to nmo? ... ok
      Is the number of atoms equal to 20? ... ok
      Is the number of basis set functions correct? ... ok
      Did this subclass overwrite normalisesym? ... ok
      Is the SCF energy within 40eV of target? ... ok
      Do the scf targets have the right dimensions? ... ok
      Are scfvalues and its elements the right type? ... ok
      Are all the symmetry labels either Ag/u or Bg/u? ... ok
      Is moenergies a list containing one numpy array? ... ok
      ----------------------------------------------------------------------
      Ran 19 tests in 0.016s

      ********* SUMMARY PER PACKAGE ****************
                    Total  Passed  Failed  Errors  Skipped
      ADF2007.01       48      46       0       0        2
      GAMESS-UK        58      58       0       0        0
      GAMESS-US        75      71       2       0        2
      Gaussian03       92      88       1       0        3
      Jaguar7.0        54      47       0       0        7
      Molpro2006       63      59       0       0        4
      ORCA2.6          54      44       5       3        2
      PCGAMESS         75      74       0       0        1

      ********* SUMMARY OF EVERYTHING **************
      TOTAL: 519 PASSED: 487 FAILED: 8 ERRORS: 3 SKIPPED: 21
  • 15. But it’s Python! I only code C, FORTRAN, etc.
      • Use cclib to convert the log file to JSON
      • JSON libraries are available for
          – C, C++, Java, JavaScript, Perl, PHP, Python, Ruby
      • Could easily write a converter to some type of FORTRAN format
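A minimal sketch of the JSON route, using only the Python standard library. The helper names (`to_jsonable`, `parsed_data_to_json`) and the stand-in object are hypothetical and are not part of cclib; the sketch only assumes that parsed data is exposed as attributes (such as `natom` or `scfenergies`) and that array values offer a `tolist()` method, as numpy arrays do.

```python
import json
from types import SimpleNamespace


def to_jsonable(value):
    """Recursively convert array-like values into plain Python lists."""
    if hasattr(value, "tolist"):  # e.g. numpy arrays and numpy scalars
        return value.tolist()
    if isinstance(value, (list, tuple)):
        return [to_jsonable(item) for item in value]
    return value


def parsed_data_to_json(data, attribute_names):
    """Serialise the named attributes that are actually present on `data`."""
    payload = {
        name: to_jsonable(getattr(data, name))
        for name in attribute_names
        if hasattr(data, name)  # skip attributes the log file did not provide
    }
    return json.dumps(payload, indent=2, sort_keys=True)


# Hypothetical stand-in for a parsed log file; the values are made up.
example = SimpleNamespace(natom=3, atomnos=[8, 1, 1], scfenergies=[-2079.6])
print(parsed_data_to_json(example, ["natom", "atomnos", "scfenergies", "vibfreqs"]))
```

The resulting text can then be read from C, C++, Java or any other language with a JSON library, without that program needing to know anything about the original log file format.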
  • 16. Some questions
      • Why don’t QM software developers list compatible tools on their website?
          – Good for the QM software, good for the tool
      • Why don’t QM software developers make it easier for tool developers?
          – API, documentation describing output, XML, interoperability
      • Why not open source?
          – Could fix these problems myself
      • Why can’t I mix and match calculation methods from different programs?
      • Why do academics restrict usage of their sophisticated routines to a single proprietary code?
  • 17. OpenBabel - “Not just file conversion”
      • A C++ library for…
      • Cheminformatics
          – SMARTS searching, InChI, SMILES, molecular fingerprints, group-contribution based descriptors, determination of SSSR, bond order perception, hydrogen addition, Gasteiger charge calculation
      • Computational chemistry
          – AMBER, DMol3, Gaussian, GAMESS, GROMOS96, HyperChem, Jaguar, MOPAC, Q-Chem, Turbomole, ZINDO
               • varying levels of support
          – forcefield minimisation (UFF, MMFF94, Ghemical)
          – symmetrisation of almost symmetric molecules (coming soon)
      http://openbabel.org
  • 18. Language bindings…and wrappers
      • OpenBabel is a C++ library
      • SWIG allows access to OpenBabel from
          – Java, Perl, Python, Ruby (and many more if we wish)
      • The SWIG bindings are a direct 1-to-1 translation of the C++ API and objects to a Python API and objects
      • Pybel is a Pythonic wrapper around the SWIG bindings
          – Makes it easy to carry out common tasks
          – Allows idiomatic Python, e.g. iterators and direct access to attribute values rather than Get/Set methods, which reduces verbosity
      NM O’Boyle, C Morley, GR Hutchison. Chem. Cent. J. 2008, 2, 5.
      http://openbabel.org/wiki/Python
  • 19. Let’s read a MOL file and optimise the geometry with the UFF forcefield
      SWIG bindings
          import openbabel as ob

          # Read the molecule from a MOL file
          obconv = ob.OBConversion()
          obconv.SetInFormat("mol")
          obmol = ob.OBMol()
          obconv.ReadFile(obmol, "caffeine.mol")

          # Optimise with UFF (at most 1000 conjugate gradient steps)
          obff = ob.OBForceField.FindForceField("UFF")
          obff.Setup(obmol)
          obff.ConjugateGradients(1000)
          obff.UpdateCoordinates(obmol)
      Pybel
          import pybel

          # readfile() returns an iterator; take the first molecule
          mol = next(pybel.readfile("mol", "caffeine.mol"))
          mol.optimise("UFF")  # Coming soon!
  • 20. Some questions
      • Why do some visualisation packages use their own parsing routines instead of adding them to libraries like OpenBabel or cclib?
      • Why don’t QM packages donate code or contract developers to improve support in libraries like OpenBabel or cclib?
          – ADF is doing this
      • How can we coordinate interoperability? …
  • 22. • I propose blueobelisk-qm@lists.sf.net
  • 23. Make it work on Windows!
      • Most users use Windows, and even Linux users want the option of jumping between OSs
      • You restrict the reach of your software (and hasten its replacement)
      • Case study cclib-0.8 (Nov 07):
          – cclib-0.8.tar.gz 63
          – cclib-0.8.zip 58
          – cclib-0.8-py2.4.exe 26
          – cclib-0.8-py2.5.exe 45
      • For every Linux user, there are 2 Windows users
  • 24. Make it easy to install on Windows!
      • No dependencies
      • Case study: GaussSum 2.1.4 (Nov 2007)
          – GaussSum-2.1.4.tar.gz 143 (Linux)
          – GaussSum-2.1.4.zip 206 (Windows, requires Python, Numpy and Python Imaging Library)
          – GaussSumexe-2.1.4.zip 396 (Windows, no dependencies)
      • Lower the barrier to installation
          – A one-click installer > a .zip file >> a .tar.gz file
          – Make the installation instructions easy
      • Case study: OpenBabel
          – OB 2.0.1 Linux:Windows 5:4
          – OB 2.1.1 Linux:Windows 5:7.5
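The "2 Windows users for every Linux user" figure can be checked with a few lines of arithmetic. This is a sketch under two assumptions that the slides do not state outright: that the numbers next to each file are download counts, and that for cclib the .tar.gz counts as Linux while the .zip and .exe files count as Windows (the labelling used for GaussSum). The function name `windows_per_linux` is made up for illustration.

```python
def windows_per_linux(linux_downloads, windows_downloads):
    """Return the number of Windows downloads per Linux download."""
    return sum(windows_downloads) / sum(linux_downloads)


# cclib 0.8: .tar.gz assumed Linux; .zip and both .exe installers assumed Windows
cclib_ratio = windows_per_linux([63], [58, 26, 45])

# GaussSum 2.1.4: platform labels as given on the slide
gausssum_ratio = windows_per_linux([143], [206, 396])

print(f"cclib 0.8:      {cclib_ratio:.2f} Windows downloads per Linux download")
print(f"GaussSum 2.1.4: {gausssum_ratio:.2f} Windows downloads per Linux download")
# → cclib 0.8:      2.05 Windows downloads per Linux download
# → GaussSum 2.1.4: 4.21 Windows downloads per Linux download
```

Under these assumptions the cclib figure matches the 2:1 claim, and the GaussSum figure, where a dependency-free Windows package is offered, is roughly twice as high again.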
  • 25. Thanks!
      • The OpenBabel development team and particularly Geoff Hutchison and Chris Morley
      • cclib: Adam Tenderholt and Karol Langner
      • SourceForge
      • Email: baoilleach@gmail.com, oboyle@ccdc.cam.ac.uk
      • Blog: http://baoilleach.blogspot.com
      • Website: http://www.redbrick.dcu.ie/~noel