Navigating the Complex Web of Chemistry Using ChemSpider
Antony Williams vs Identifiers: Old Passport ID; Dad, Tony, others; SSN; Green Card; License; 5 email addresses; ChemSpiderman (blog, Twitter account, Facebook, Friendfeed); OpenID; …
Aspirin vs Chemical Identifiers
Aspirin names and synonyms: Text searches depend on correct association. 335 suggested identifiers for Aspirin just on PubChem! Disambiguation dictionaries are necessary.
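A disambiguation dictionary can be sketched as a lookup from normalized names to a single structure identifier. This is only an illustration: the three synonyms and the helper names `normalize` and `resolve` are assumptions chosen for the example, not part of ChemSpider or PubChem.

```python
from typing import Optional

# Standard InChIKey for aspirin, used here as the one structure identifier.
ASPIRIN_KEY = "BSYNRYUTXBXSQB-UHFFFAOYSA-N"

# Hypothetical, hand-made dictionary: every synonym points to the same structure.
SYNONYMS = {
    "aspirin": ASPIRIN_KEY,
    "acetylsalicylic acid": ASPIRIN_KEY,
    "2-acetoxybenzoic acid": ASPIRIN_KEY,
}


def normalize(name: str) -> str:
    """Lower-case and collapse whitespace so trivial variants match."""
    return " ".join(name.lower().split())


def resolve(name: str) -> Optional[str]:
    """Return the structure identifier for a name, or None if it is unknown."""
    return SYNONYMS.get(normalize(name))
```

A misspelled name such as "asprin" returns `None` here, which is the failure mode that makes name-only linking fragile.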
Linked Data Cloud
… the premium database producers are using some automatic tools to prepare a ‘first draft’ of a database record, to be refined by eye. Coupled with the public internet as a distribution method of choice, it is becoming possible for the first time to create and distribute new structure based databases at much lower costs, or even free of charge.
The Final Search Strategy
All Those Names, One Structure
Content is King and Quality Costs: Chemistry “content” is big business. Not everyone can afford it. Patent searching; structures and properties; drug databases; literature databases. Chemical Abstracts Service (CAS), the “Gold Standard” in chemistry-related information: 101 years of content; $260 million revenue (2006); >50 million substances; proprietary platform.
Searching Chemistry on the Internet: How complete a result set will we get if we search for “chemicals” by name? Is there a better way to link chemistry databases? Linking by “names” is dangerous. Chemists want structure and SUBstructure searching.
The InChI Identifier
Multiple Layers
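The layered form of an InChI string can be illustrated by splitting it on its `/` separators. The sketch below is a simplification rather than a full InChI parser: it assumes that layers starting with a lowercase letter are prefixed layers (for example `c` for connectivity and `h` for hydrogens) and treats any other layer as the formula. The function name `split_inchi` is made up for this example.

```python
from typing import List, Tuple

# Standard InChI for aspirin.
ASPIRIN = "InChI=1S/C9H8O4/c1-6(10)13-8-5-3-2-4-7(8)9(11)12/h2-5H,1H3,(H,11,12)"


def split_inchi(inchi: str) -> List[Tuple[str, str]]:
    """Split an InChI string into (layer name, layer content) pairs, in order."""
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError("not an InChI string")
    version, *rest = inchi[len(prefix):].split("/")
    layers = [("version", version)]
    for part in rest:
        if not part:
            continue  # skip empty segments
        if part[0].islower():
            # Prefixed layer: the first character names the layer.
            layers.append((part[0], part[1:]))
        else:
            # Unprefixed layer: assumed to be the chemical formula.
            layers.append(("formula", part))
    return layers


for name, content in split_inchi(ASPIRIN):
    print(name, content)
```

A list of pairs is used instead of a dictionary because some layer prefixes can occur more than once in a single InChI string.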
InChI Strings Hash to InChIKeys
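The shape of the resulting key can be shown by taking one apart. The sketch below does not reimplement the hashing; it only assumes the 27-character standard InChIKey layout: a 14-letter block hashed from the connectivity, an 8-letter block hashed from the remaining layers, a standard/non-standard flag, a version character, and a protonation character. The names `InChIKey` and `parse_inchikey` are made up for this example.

```python
import re
from typing import NamedTuple


class InChIKey(NamedTuple):
    skeleton: str      # 14 letters hashed from the connectivity (molecular skeleton)
    details: str       # 8 letters hashed from the remaining layers
    standard: bool     # True when the flag character is "S" (standard InChI)
    version: str       # InChI version character
    protonation: str   # "N" means neutral


_PATTERN = re.compile(r"^([A-Z]{14})-([A-Z]{8})([SN])([A-Z])-([A-Z])$")


def parse_inchikey(key: str) -> InChIKey:
    """Split a 27-character InChIKey into its blocks, or raise ValueError."""
    match = _PATTERN.match(key)
    if match is None:
        raise ValueError(f"not a 27-character InChIKey: {key!r}")
    skeleton, details, flag, version, protonation = match.groups()
    return InChIKey(skeleton, details, flag == "S", version, protonation)


# Standard InChIKey for aspirin.
print(parse_inchikey("BSYNRYUTXBXSQB-UHFFFAOYSA-N"))
```

Because the key has a fixed length and uses only letters and hyphens, it can be used as an ordinary search term, unlike the full InChI string.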
Oleoylethanolamine
InChIKey Searches Work
Search Engine Dependencies
InChIs have traction…
RDF Linking of Structures
PubChem
The Simplest Organic Molecule
Question Everything online: www.dhmo.org
The Structure-Based Data Cloud
Vancomycin
Vancomycin: Who will curate? How would you clean such a large dataset?
Vancomycin on ChemSpider
Vancomycin
Vancomycin: Search Molecular SKELETON; Search Full Molecule
Full Skeleton Search: 104 Hits
Full Molecule Search: 4 Hits
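The difference between the two hit counts can be sketched with InChIKeys, assuming (the slides do not say so explicitly) that a skeleton search matches only the first, connectivity block of the key while a full-molecule search matches the whole key. The function names are made up, and every key in `INDEXED` except the aspirin key is a placeholder, not real data.

```python
from typing import List


def skeleton_block(key: str) -> str:
    """Return the first block of an InChIKey (everything before the first hyphen)."""
    return key.split("-", 1)[0]


def skeleton_search(query: str, keys: List[str]) -> List[str]:
    """Match every key that shares the query's first (connectivity) block."""
    target = skeleton_block(query)
    return [k for k in keys if skeleton_block(k) == target]


def full_molecule_search(query: str, keys: List[str]) -> List[str]:
    """Match only keys identical to the query."""
    return [k for k in keys if k == query]


QUERY = "BSYNRYUTXBXSQB-UHFFFAOYSA-N"  # standard InChIKey for aspirin

INDEXED = [
    "BSYNRYUTXBXSQB-UHFFFAOYSA-N",  # same molecule as the query
    "BSYNRYUTXBXSQB-AAAAAAAASA-N",  # placeholder: same skeleton, different details
    "AAAAAAAAAAAAAA-UHFFFAOYSA-N",  # placeholder: unrelated skeleton
]

print(len(skeleton_search(QUERY, INDEXED)), "skeleton hits")
print(len(full_molecule_search(QUERY, INDEXED)), "full-molecule hits")
```

Under this assumption the skeleton search always returns a superset of the full-molecule search, since records that differ only outside the connectivity block (for example in stereochemistry) share the first block.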
What is ChemSpider? ChemSpider is: building a structure-centric community for chemists; 22.2 million compounds, >200 data sources; a deposition and curation platform; a publishing platform for the community; grows daily: more depositions, more links, more data sources.
For Chemical Compounds: Vendor sites: Aldrich, Alfa Aesar, TCI and 100s of others; Government databases: PubChem, DSSTox, FDA databases, ChemIDPlus,…; Biological databases: Protein Database, Stitch, KEGG, ChEBI,…; Analytical databases: NMRShiftDB,…
How Was ChemSpider Built? ChemSpider was a “hobby project”, housed in a basement and running off three servers (one bought, two built). May 2009.
3 servers (2 homebuilt); .NET architecture; SQL Server; homebuilt structure/substructure search; commercial components; open source components: OpenBabel, Jmol, JSpecView, NCBI Toolkit, InChI libraries.
Search Cholesterol
Linked across the internet
Kyoto Encyclopedia of Genes and Genomes
Links to Patents based on structure
Answering Questions for Chemists: Questions a chemist might ask… What is the melting point of n-butanol? What is the chemical structure of Xanax? Chemically, what is phenolphthalein? What are the stereocenters of cholesterol? Where can I find publications about xylene? What are the different trade names for Ketoconazole? What is the NMR spectrum of Aspirin? What are the safety handling issues for Thymol Blue?
Complex Data and Information
Remember: QUALITY ISSUES
The FDA’s DailyMed
Incorrect Structures
Does one stereocenter matter? Distaval, Talimol, Nibrol, Sedimide, Quietoplex, Contergan, Neurosedyn, and Softenon
Crowd-sourcing Chemistry Curation
We Need Recognition and Rewards
Master Curators, Curators, Depositors
Collaborating with Wikipedia: Long-term project to curate chemical compounds. Robotically linking ChemSpider to Wikipedia at present. Will layer on InChI Strings and InChIKeys shortly and make Wikipedia structure searchable.
Blogs need InChIs too!
Use Intelligent Structures: ChemSpider Embed Web Service
ChemSpider Web Services
Semantic Mark-up for Chemistry: Semantic mark-up for chemistry is here. RSC Project Prospect; Nature Publishing Group compound linking; ChemMantis.
Nature Chemistry Compound Pages
Project Prospect
ChemMantis
Deposit Structures
Species – linked to Wikipedia
Semantic Linking of Structures: What would you want to link off a structure? Chemical suppliers; other publications; analytical data; related reactions; Wikipedia; patents; “Everything”.
The InChI “Resolver”
InChI Resolver to DOIs Structure Search the Web
Conclusions: Internet resources provide a collaborative community for chemistry. Crowdsourcing to expand, curate and integrate to the benefit of chemists. Searching the web for chemistry is arriving. InChIs are enabling chemistry on the internet. Question Quality!
[email_address] Twitter: ChemSpiderman www.chemspider.com/blog
