The Chemtools LaBLog Recording research in the real world Cameron Neylon Contributions from Jeremy Frey, Andrew Milsted, Steve Wilson, Simon Coles, Mark Borkum, Jenny Hale, and others
Goals A complete and useable record for the researcher and research team Enable a human reader to fully reproduce all experiments and replicate all data analysis in detail New functionality (video, search, communication, links, visualisation) Enable machine reading for automated aggregation and analysis
A small challenge… Can anyone name a paper in which it is possible to completely and precisely replicate the data analysis, including availability of raw data, full details of the tools, versions, and parameters used for data analysis, and the version (or date) of any databases used in the analysis?
A blog as the lab book http://chemtools.chem.soton.ac.uk/projects/blog/ “Bio Blogs” http://blogs.openwetware.org/scienceintheopen Discussion
One item – one post (1I-1P) system
1I-1P gives every sample a URI
1I-1P relationships between posts An RDF dump of posts and the links between them, rendered using Welkin (simile.mit.edu/welkin)
1I-1P relationships between posts
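A minimal sketch of what an RDF dump of posts and their links could look like, serialised as N-Triples. The base URI, the `linksTo` predicate, and the example posts are all assumptions made for illustration; the slides do not show the vocabulary or URI scheme the LaBLog actually uses.

```python
# Hypothetical data: the base URI, the linksTo predicate and the posts
# below are illustrative and are not taken from the LaBLog itself.
BASE = "http://example.org/blog/"
LINKS_TO = "http://example.org/vocab#linksTo"
TITLE = "http://purl.org/dc/terms/title"

posts = {
    1: {"title": "Agarose gel of PCR products", "links": [2, 3]},
    2: {"title": "PCR product A", "links": []},
    3: {"title": "PCR product B", "links": []},
}


def post_uri(post_id):
    """One item, one post: every post (and so every sample) has its own URI."""
    return f"{BASE}{post_id}"


def to_ntriples(posts):
    """Emit one title triple per post and one link triple per outgoing link."""
    lines = []
    for post_id in sorted(posts):
        post = posts[post_id]
        subject = f"<{post_uri(post_id)}>"
        # Escape backslashes and quotes so the literal stays valid N-Triples.
        title = post["title"].replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'{subject} <{TITLE}> "{title}" .')
        for target in post["links"]:
            lines.append(f"{subject} <{LINKS_TO}> <{post_uri(target)}> .")
    return lines


triples = to_ntriples(posts)
for line in triples:
    print(line)
```

A dump of this shape carries only titles and links, which matches the "just links and post titles" level of description; a graph viewer can render it, but it says nothing about what kind of thing each post is.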
What about semantics? System is semantically unaware Arbitrary key-value pairs stored as XML Complete freedom to add or modify metadata Complete freedom to muck it up
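As a rough illustration of arbitrary key-value pairs stored as XML, the sketch below round-trips a metadata dictionary through an XML string. The element and attribute names (`metadata`, `item`, `key`) are assumptions; the slides do not show the LaBLog's actual XML schema.

```python
import xml.etree.ElementTree as ET


def metadata_to_xml(metadata):
    """Serialise arbitrary key-value pairs; no key is checked against a schema."""
    root = ET.Element("metadata")  # hypothetical element name
    for key, value in metadata.items():
        # Keys go in an attribute so that any string is allowed as a key.
        item = ET.SubElement(root, "item", {"key": key})
        item.text = value
    return ET.tostring(root, encoding="unicode")


def xml_to_metadata(xml_text):
    """Read the key-value pairs back out of the XML string."""
    root = ET.fromstring(xml_text)
    return {item.get("key"): item.text for item in root.findall("item")}


record = {"Section": "Procedure", "Procedure_Type": "electrophoresis_agarose"}
xml_text = metadata_to_xml(record)
print(xml_text)

# The same freedom lets a misspelled key through without complaint.
typo = xml_to_metadata(metadata_to_xml({"Proceedure_Type": "pcr"}))
print(typo)
```

Because nothing validates the keys, the store accepts `Proceedure_Type` as readily as `Procedure_Type`, which is the "freedom to muck it up" in practice.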
Templates provide ease of use and consistent metadata
[table]
[row] Lane[col]Sample[col]ul [/row]
…
[row] 4[col] [[Dna:%]] [col] [[box]] [/row]
…
[/table]
[[Section>Procedure]]
[[Procedure_Type>electrophoresis_agarose]]
[[Sandpit_group>DrexelDemo]]
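One way to read the `[[key>value]]` tags in the template above is as metadata key-value pairs attached to the post. The sketch below extracts them with a regular expression. This interpretation, and the parsing rule itself, are assumptions based only on the template text shown; the actual LaBLog parser may behave differently, and tags without a `>` (such as `[[box]]` or `[[Dna:%]]`) are simply ignored here.

```python
import re

# Matches [[key>value]]; tags without ">" are deliberately not matched.
PAIR = re.compile(r"\[\[([^\[\]>]+)>([^\[\]]+)\]\]")


def extract_metadata(text):
    """Collect every [[key>value]] tag in the text into a dictionary."""
    return {key.strip(): value.strip() for key, value in PAIR.findall(text)}


template = (
    "[row] 4[col] [[Dna:%]] [col] [[box]] [/row] "
    "[[Section>Procedure]] "
    "[[Procedure_Type>electrophoresis_agarose]] "
    "[[Sandpit_group>DrexelDemo]]"
)

metadata = extract_metadata(template)
print(metadata)
```

Because the tags live in the template rather than being typed by hand, every post created from it carries the same keys spelled the same way.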
System to date Our main laboratory notebook system Around 4000 posts, 800 Gb of data Used for biochemistry, synthetic chemistry, biophysics Also used as a collaboration and management tool in other projects Currently rolling out onto other sites
Goals A complete and useable record for the researcher and research team Enable a human reader to fully reproduce all experiments and replicate all data analysis in detail New functionality (video, search, communication, links, visualisation) Enable machine reading for automated aggregation and analysis
Versioning and provenance for analysis using workflows and API Workflow enacted online (MyExperiment) Pull down data from lab book and process Write results and record back to blog Provenance of workflow, versioning, and sharing via MyExp Record of enactment in LaBLog
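The pull, process, and write-back cycle described above can be sketched in memory. Everything here is hypothetical: the post dictionaries, the `enact` helper, and the metadata field names stand in for the real LaBLog and MyExperiment APIs, which the slides do not specify.

```python
import hashlib
import json
from datetime import datetime, timezone


def enact(workflow_name, workflow_version, func, input_post):
    """Run a workflow on a post's data and build the record to post back.

    The returned dictionary is a stand-in for a new lab book post; it notes
    which workflow version ran, on which input post, and when.
    """
    result = func(input_post["data"])
    # Fingerprint the input so later changes to the source data are detectable.
    digest = hashlib.sha256(
        json.dumps(input_post["data"], sort_keys=True).encode("utf-8")
    ).hexdigest()
    return {
        "title": f"Enactment of {workflow_name} v{workflow_version}",
        "data": result,
        "metadata": {
            "workflow": workflow_name,
            "workflow_version": workflow_version,
            "input_post": input_post["uri"],
            "input_sha256": digest,
            "enacted_at": datetime.now(timezone.utc).isoformat(),
        },
    }


# Hypothetical input post and a trivial "workflow" that averages readings.
post = {"uri": "http://example.org/blog/42", "data": [1.0, 2.0, 3.0]}
record = enact("mean_intensity", "1.2", lambda xs: sum(xs) / len(xs), post)
print(json.dumps(record, indent=2))
```

The design point is the split in responsibilities: the workflow repository holds the versioned workflow, while the lab book holds a record of each enactment that points back to both the workflow version and the input post.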
Automatic Blogging by Machines
Automatic Blogging by Sensors Continuous log of ‘environmental’ conditions in a laboratory Instant detection of erroneous events Correlate with inconsistencies in datasets
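A small sketch of how a continuous sensor log might be checked for erroneous events and correlated with datasets. The readings, thresholds, time window, and dataset names are invented for illustration; the slides do not describe the detection logic actually used.

```python
def find_excursions(readings, low, high):
    """Return the (time, value) readings that fall outside [low, high]."""
    return [(t, v) for t, v in readings if not (low <= v <= high)]


def flag_datasets(datasets, excursions, window):
    """Return names of datasets recorded within `window` seconds of an excursion."""
    flagged = []
    for name, recorded_at in sorted(datasets.items()):
        if any(abs(recorded_at - t) <= window for t, _ in excursions):
            flagged.append(name)
    return flagged


# Hypothetical temperature log: (seconds since start, degrees Celsius).
readings = [(0, 20.1), (60, 20.3), (120, 27.8), (180, 20.2)]
# Hypothetical datasets and the time at which each was recorded.
datasets = {"run_A": 30, "run_B": 130}

excursions = find_excursions(readings, low=18.0, high=25.0)
suspect = flag_datasets(datasets, excursions, window=30)
print(excursions, suspect)
```

With the log posted automatically, a check like this can run as each reading arrives, and a dataset that later looks inconsistent can be compared against the conditions at the time it was recorded.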
Goals A complete and useable record for the researcher and research team Enable a human reader to fully reproduce all experiments and verify all data analysis in detail New functionality (video, search, communication, links, visualisation) Enable machine reading for automated aggregation and analysis
Visualisations and communication
Pictorial commenting Annotation tools allow comments and foster collaboration and/or communication Need for more advanced blog tools and technology around data
Goals A complete and useable record for the researcher and research team Enable a human reader to fully reproduce all experiments and verify all data analysis in detail New functionality (video, search, communication, links, visualisation) Enable machine reading for automated aggregation and analysis ?
RDF to ‘real’ RDF? Currently just links and post titles Include metadata Infer a vocabulary (probably a human-driven process) Refactor to generate a rich RDF version
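Inferring a vocabulary from free-form metadata could start with a simple survey of the keys already in use. The sketch below counts keys across posts and groups spellings that differ only in case or in spaces versus underscores, leaving the decision about which form to keep to a person. The example posts and the normalisation rule are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def summarise_keys(posts):
    """Count metadata keys and group variant spellings of the same key."""
    counts = Counter()
    variants = defaultdict(set)
    for metadata in posts:
        for key in metadata:
            counts[key] += 1
            # Normalise case and spacing so near-duplicates land together.
            variants[key.lower().replace(" ", "_")].add(key)
    clashes = {
        norm: sorted(keys) for norm, keys in variants.items() if len(keys) > 1
    }
    return counts, clashes


# Hypothetical metadata from three posts.
posts = [
    {"Section": "Procedure", "Procedure_Type": "pcr"},
    {"section": "Material"},
    {"Section": "Data", "procedure type": "electrophoresis_agarose"},
]

counts, clashes = summarise_keys(posts)
print(counts.most_common())
print(clashes)
```

The output is a candidate term list rather than a vocabulary: a human still has to decide that `Section` and `section` mean the same thing before the posts are refactored into richer RDF.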
Linking it all up [Diagram: a scale running from “Unstructured / Unfiltered / Arbitrary vocabulary” to “Structured / Filtered / Controlled vocab”, with the following items placed along it: Primary lab book, Autoblogging instrument, Published paper, Database entry, Personal journal, Raw data, Data processing]
What could it look like? GO Ontology Browser Raw SANS Data - D22 run #29483 from D22 at the Institut Laue-Langevin
 
