Providing Support for JC Bradley’s
Vision of Open Science using RSC
Cheminformatics Platforms
Antony Williams
Jean-Claude Bradley Memorial Symposium
July 14th
2014
How Visions Aligned…
• We serve the community with data, services
and platforms to support science
• So much of what JC (and Andy!) needed
already existed on ChemSpider
• Many members of our team helped for the
sake of science…working outside work
hours…data curation
• Some of us bought into the vision of Open
Notebook Science…ahead of the curve
• So how did we help??
• ~30 million chemicals and growing
• Data sourced from >500 different sources
• Crowdsourced curation and annotation
• Ongoing deposition of data from our journals
and our collaborators
• A structure-centric hub for web-searching
• JC tapped into ChemSpider a lot for data
validation and integration into his ONS wikis
ChemSpider
APIs
ChemSpider Spectra
www.SpectralGame.com
http://www.jcheminf.com/content/1/1/9
Where can SpectralGame Go?
• We are interested in supporting extensions
and enhancements to SpectralGame
• More data required… our spectral data
repository can host it
• Hosting assigned spectral data and using it in
SpectralGame makes sense!
• And what about educating/testing students as
they do real-time assignments?
• A project for when there is time and interest…
JavaScript viewer: NMR, MS, IR
Collaborations in Openness
• JC believed in HIGH-QUALITY data
• He invested himself, and his students, in
validating, checking and re-measuring data
• He demanded openness of data, free of
restrictions and constraints
• Do his efforts make a difference???
Supporting Open Data
Data Validation/Standardization
is critical – about to be applied to melting point (MP) data
Thanks to Igor Tetko, OCHEM
Collaborations in Openness
• JC believed in HIGH-QUALITY data
• He invested himself, and his students, in
validating, checking and re-measuring data
• He demanded openness of data, free of
restrictions and constraints
• Do his efforts make a difference???
• How can the resulting models be used?
• Free prediction engines, warning/flagging
data in ELNs, at deposition into databases
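The flagging idea in the last bullet can be sketched as follows. This is only an illustration under stated assumptions: the `Deposit` record, the `flag_outliers` helper, the 40 °C tolerance and the placeholder values are all hypothetical, and the predicted value is assumed to come from some external melting point model that is not shown here.

```python
from dataclasses import dataclass


@dataclass
class Deposit:
    """A hypothetical deposited melting point record."""
    compound: str
    measured_mp_c: float   # value reported by the depositor, in degrees C
    predicted_mp_c: float  # value from an assumed prediction model, in degrees C


def flag_outliers(deposits, tolerance_c=40.0):
    """Return deposits whose measured value differs from the
    predicted value by more than tolerance_c (an arbitrary choice)."""
    return [
        d for d in deposits
        if abs(d.measured_mp_c - d.predicted_mp_c) > tolerance_c
    ]


# Placeholder records for illustration only, not real measurements.
deposits = [
    Deposit("compound-A", 120.0, 125.0),
    Deposit("compound-B", 250.0, 80.0),
    Deposit("compound-C", 15.0, 60.0),
]

flagged = flag_outliers(deposits)
for d in flagged:
    print(f"Check {d.compound}: measured {d.measured_mp_c}, "
          f"predicted {d.predicted_mp_c}")
```

A flag here is a prompt for a human to re-check the value, not a rejection: the measurement may be right and the model wrong.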
Text-mining Data – Daniel Lowe
Open Notebook Science Wikis
• The vast majority of scientists don’t want or
don’t have the skills to manage ONS systems
• If they had the right platform for ONS they
might just use it…
• But we hear: privacy before sharing, more
functionality required, not what I need etc.
• We provided data storage and access first (and
JC used it) and are now collaborating on ELNs
Building the RSC Data Repository
• Registration of chemical compounds
• Deposition of chemical syntheses
• Addition of analytical data
• Integration to electronic notebooks
• Rewards and recognition for data sharing
• Document processing
• Hosting of data as private, embargoed or
public
What we will deliver for all data
• Simple interfaces for uploading of data
• Embeddable widgets and programming
interfaces to utilize in in-house systems, ELNs
• Automated harvesting approaches
• Data validation approaches where possible
JC and Drug Discovery
• JC cared passionately about neglected
disease research
• Many of our conversations were around better
data-sharing for the various groups
• We are trying to help…
Open Source Drug Discovery
OSDD Collaboration
• We will provide access to, and support for, the
ChemSpider API for integration with their OSDD
cheminformatics platform
• We will extend our data model to support their
Open Data – compounds, pharmacology data
• Synthetic reactions will be published to
ChemSpider SyntheticPages and Reactions
• Analytical Data to be hosted in Data Repository
• 3-year Innovative Medicines Initiative project
• Integrating chemistry and biology data using
semantic web technologies
• Open source code, open data and open
standards
• Academics, Pharmas, Publishers…
• To put medicines in the pipeline…
Open Sourcing Data and Code
• All Open PHACTS data is licensed as Open
Data and available from Open PHACTS
website – ca. 2 Million chemicals
• The Chemical Registration Service, including the
Chemical Validation and Standardization
Platform, will be released as Open Source
code to the community (from the Open PHACTS
GitHub site)
Thank you
Email: williamsa@rsc.org
ORCID: 0000-0002-2668-4821
Twitter: @ChemConnector
Personal Blog: www.chemconnector.com
SLIDES: www.slideshare.net/AntonyWilliams
