A Quantum Chemist Meets Cheminformatics
@janhjensen
Jan H. Jensen, University of Copenhagen
Feel free to tweet, record, …
#WATOC
2018
2017
2016
How I met RDKit
RDKit
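[Figure: reaction scheme for bromination (+ Br+, -H+) of a nitrogen-containing molecule with alternative brominated products ("or ?"), and three candidate structures labeled with their heats of formation]
PM3/COSMO heat of formation (kcal/mol): 169.4, 182.2, 181.1
90% success rate for 520 compounds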
Workflow / Automation
Molecule
Protonated isomers
Conformational search
Find lowest energy isomer
Display result
Check for proton transfer
ChemDraw, RDKit
SMILES: c1cnc(cc1c1n(c(c(n1)c1ccc(cc1Cl)Cl)C(=O)OC)COCC[Si](C)(C)C)NC(=O)C
RegioSQM
[Figure: structure of the example molecule with sites numbered 1 to 6]
6 isomers x 20 confs
SMILES
github.com/jensengroup/RegioSQM
Web server
regiosqm.org
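The "find lowest energy isomer" step of the workflow can be sketched in a few lines of plain Python. This is only an illustration, not the RegioSQM code: the function name and the isomer labels are hypothetical, the lowest value for each isomer is one of the three heats of formation quoted on the earlier slide, and the remaining conformer energies are invented placeholders.

```python
def lowest_energy_isomer(conformer_energies):
    """Return (isomer, energy) for the isomer whose best conformer is lowest.

    conformer_energies maps an isomer label to a list of conformer energies
    (kcal/mol). Each isomer is represented by its lowest-energy conformer.
    """
    best_per_isomer = {
        isomer: min(energies) for isomer, energies in conformer_energies.items()
    }
    winner = min(best_per_isomer, key=best_per_isomer.get)
    return winner, best_per_isomer[winner]


# Hypothetical labels. The minimum of each list (169.4, 182.2, 181.1) is taken
# from the slide; all other values are made-up placeholder conformers.
energies = {
    "isomer_a": [171.0, 169.4, 170.2],
    "isomer_b": [182.2, 183.5],
    "isomer_c": [181.1, 184.0],
}

winner, energy = lowest_energy_isomer(energies)
print(winner, energy)  # prints: isomer_a 169.4
```

With 6 isomers and 20 conformers each, the same function would take a dictionary with six keys and twenty energies per key.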
xyz2mol
[Figure: the workflow (Molecule, Protonated isomers, Conformational search, Find lowest energy isomer, Display result, Check for proton transfer) with xyz2mol added alongside ChemDraw, SMILES, and RDKit; 6 isomers x 20 confs; the example structure is drawn twice]
xyz2mol converts an xyz file to an RDKit mol object
(needs the molecular charge)
github.com/jensengroup/xyz2mol
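Why the charge has to be supplied can be seen from the xyz format itself: it stores only element symbols and Cartesian coordinates, so the electron count is undetermined until a charge is given. The sketch below is not xyz2mol and does not assign bonds. It is a minimal plain-Python parser with hypothetical helper names (`parse_xyz`, `electron_count`), a deliberately small element table, and approximate ammonium coordinates chosen for illustration.

```python
# Small illustrative element table (assumption: extend as needed).
ATOMIC_NUMBERS = {"H": 1, "C": 6, "N": 7, "O": 8}

# Approximate tetrahedral ammonium geometry, for illustration only.
NH4_XYZ = """5
ammonium (approximate coordinates)
N  0.000  0.000  0.000
H  0.589  0.589  0.589
H -0.589 -0.589  0.589
H -0.589  0.589 -0.589
H  0.589 -0.589 -0.589
"""


def parse_xyz(text):
    """Parse xyz text into a list of (symbol, x, y, z) tuples."""
    lines = text.strip().splitlines()
    n_atoms = int(lines[0])  # line 1: atom count; line 2: free-form comment
    atoms = []
    for line in lines[2:2 + n_atoms]:
        symbol, x, y, z = line.split()[:4]
        atoms.append((symbol, float(x), float(y), float(z)))
    if len(atoms) != n_atoms:
        raise ValueError("atom count does not match the header")
    return atoms


def electron_count(atoms, charge):
    """Total electrons = sum of atomic numbers minus the molecular charge."""
    return sum(ATOMIC_NUMBERS[symbol] for symbol, *_ in atoms) - charge


atoms = parse_xyz(NH4_XYZ)
print(electron_count(atoms, charge=1))  # closed-shell cation
print(electron_count(atoms, charge=0))  # same geometry, one more electron
```

The same coordinates describe two different species depending on the charge, which is why a tool that assigns bond orders and formal charges from coordinates needs the charge as input.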
2019
2004
2013
github.com/jensengroup/GB-GA
An RDKit implementation of
and
Last example
crossover
Unpublished: finding molecules that absorb at 600 nm
Mating pool size: 20, mutation rate: 0.05, sTDA-xTB, 10 runs
Starting from a random molecule in the ZINC database
[Figure: example molecular structures]
Does it work?
[Figure: molecular structure]
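The search settings quoted on this slide (mating pool size 20, mutation rate 0.05, target 600 nm) can be put in context with a generic genetic-algorithm loop. This is a toy sketch under stated assumptions, not the GB-GA code: candidates are lists of digits rather than molecular graphs, `toy_wavelength` is a hypothetical stand-in for an sTDA-xTB calculation, and the population size, genome length, generation count, and Gaussian width are all invented for illustration.

```python
import math
import random

TARGET_NM = 600.0        # target absorption wavelength (from the slide)
MATING_POOL_SIZE = 20    # from the slide
MUTATION_RATE = 0.05     # from the slide
POPULATION_SIZE = 20     # assumption: not stated on the slide
GENOME_LENGTH = 8        # assumption: toy stand-in for a molecule


def toy_wavelength(genome):
    """Hypothetical property model mapping a genome to a wavelength in nm."""
    return 200.0 + 10.0 * sum(genome)


def score(genome, width=50.0):
    """Gaussian score that reaches 1.0 when the toy wavelength hits the target."""
    return math.exp(-((toy_wavelength(genome) - TARGET_NM) / width) ** 2)


def crossover(parent_a, parent_b, rng):
    """One-point crossover: prefix of one parent joined to suffix of the other."""
    cut = rng.randrange(1, GENOME_LENGTH)
    return parent_a[:cut] + parent_b[cut:]


def mutate(genome, rng):
    """With probability MUTATION_RATE, replace one randomly chosen gene."""
    if rng.random() < MUTATION_RATE:
        genome = list(genome)
        genome[rng.randrange(GENOME_LENGTH)] = rng.randint(0, 9)
    return genome


def run_ga(generations=50, seed=0):
    """Run the toy GA; return the final population and best score per generation."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 9) for _ in range(GENOME_LENGTH)]
                  for _ in range(POPULATION_SIZE)]
    history = []
    for _ in range(generations):
        # Score-weighted sampling of the mating pool; the tiny offset keeps
        # every weight strictly positive.
        weights = [score(g) + 1e-12 for g in population]
        pool = rng.choices(population, weights=weights, k=MATING_POOL_SIZE)
        children = [
            mutate(crossover(rng.choice(pool), rng.choice(pool), rng), rng)
            for _ in range(POPULATION_SIZE)
        ]
        # Elitist survival: keep the best of parents plus children.
        population = sorted(population + children, key=score,
                            reverse=True)[:POPULATION_SIZE]
        history.append(score(population[0]))
    return population, history


population, history = run_ga()
print(toy_wavelength(population[0]), history[-1])
```

Because survivors are drawn from parents and children together, the best score never decreases from one generation to the next. In a graph-based GA the crossover and mutation steps act on molecular graphs instead of digit lists.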
Summary/Outlook
RDKit changed my research life
Quantum chemical studies need to be automated
Manual work is replacing CPU power as the rate-limiting step
Mistakes become increasingly common
QM students need to learn Python/RDKit
“QM-needs” for RDKit
Conformational search for finding the global minimum
Generalized Born solvation model
