Saturday, May 07, 2011

Evan Curtin is the May 2011 RSC ONS Challenge Winner

Evan Curtin, a freshman chemistry student working under the supervision of Jean-Claude Bradley at Drexel University, is the May 2011 Royal Society of Chemistry Open Notebook Science Challenge Award winner. He wins a cash prize from the RSC.

Evan's primary focus has been synthesizing aromatic imines and measuring their solubility in a number of organic solvents. This will allow us to generate Abraham descriptors for this class of compounds in order to predict their solubility in 70+ solvents. Coupled with our new model, which accounts for temperature-dependent solubility, this should greatly facilitate optimal solvent prediction for this and related reactions.

Imine formation is of particular interest to the UsefulChem group because it is the first step of the Ugi reaction, which we have used to synthesize compounds with anti-malarial activity. It is also a simple, convenient reaction in its own right for testing our Solvent Selector's ability to predict optimal conditions (solvent and temperature) for isolation of products by precipitation.

Evan's synthesis experiments are available here:
http://usefulchem.wikispaces.com/Exp263
http://usefulchem.wikispaces.com/Exp262
http://usefulchem.wikispaces.com/Exp261


and his solubility experiments are listed here:

http://onschallenge.wikispaces.com/Exp207
http://onschallenge.wikispaces.com/Exp206
http://onschallenge.wikispaces.com/Exp205
http://onschallenge.wikispaces.com/Exp204
http://onschallenge.wikispaces.com/Exp201
http://onschallenge.wikispaces.com/Exp198
http://onschallenge.wikispaces.com/Exp197

Three more RSC ONS Awards will be made during 2011. Submissions from students in the US and the UK are still welcome.
For more information see:
http://onschallenge.wikispaces.com
http://onschallenge.wikispaces.com/RSCAwards2010


Thursday, November 04, 2010

Sozit Kurtu is Nov10 RSC ONS Award Winner

Sozit Kurtu, a chemistry student working under the supervision of Jean-Claude Bradley at Drexel University, is the November 2010 Royal Society of Chemistry Open Notebook Science Challenge Award winner. She wins a cash prize from the RSC.

Sozit has performed important work in determining the accuracy of a density method of determining solubility and has explored the temperature dependence of the solubility of low melting point solutes in hexane. See her experiments here:
http://onschallenge.wikispaces.com/list+of+experiments

Four more RSC ONS Awards will be made during 2010-11. Submissions from students in the US and the UK are still welcome.
For more information see:
http://onschallenge.wikispaces.com
http://onschallenge.wikispaces.com/RSCAwards2010


Friday, October 15, 2010

Dynamic links to private tagged Mendeley collections

My close collaborators and I have been using Mendeley as a convenient way to share PDFs of journal articles. Not all of us have access to the same libraries so links are not enough - we need the full documents. We also use Dropbox as a redundancy but Mendeley allows tagging and recording notes, which is very handy for everyone in the group.

Now that Mendeley is providing an API, Andrew Lang has written code that significantly leverages the information in our private ONS collection. We can now create public links that return the most up-to-date results for specific tags, including multiple tags (which I don't think you can do on Mendeley). For example, the following link returns all articles in the ONS collection tagged with "science2.0" and "chemistry":
The results include available information from Mendeley, including the title, authors, journal citation, doi, url, tags and the abstract. Because this information is public the PDFs can't be provided but the hyperlinks make it as convenient as possible.


At the end of the report the full list of all available tags for the ONS collection is provided. A more refined or different search can be done immediately simply by checking boxes and hitting the submit button.

Because the tags are controlled by the users of the private collection, these links can be useful when discussing an ongoing project and referring to a very specific topic. For example, we have been collecting examples of articles where a Ugi reaction is carried out and the product precipitates. This link provides an updated report on that very narrow topic:
http://showme.physics.drexel.edu/onsc/mendeley/?tags=Ugi+precipitate
There are still 2 major limitations to this service:

1) The search is very slow (can take a minute or two) because there is no way currently to use the Mendeley API to selectively return results based on tags. Every search requires initially returning all results for the collection (currently a few hundred).

2) Notes are currently not returned. If the API is updated to include these the usefulness would increase dramatically. For example in the results for the above query I took notes of the conditions involved in the Ugi precipitate for each paper. With the current format, one has to read each paper to find the relevant information.
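
The multi-tag search amounts to downloading every record in the collection and then filtering locally. Here is a minimal sketch of that filtering step only, assuming each document has already been retrieved as a dictionary with a `tags` list. The record layout, the sample records and the function name `filter_by_tags` are hypothetical; this is not the Mendeley API or Andrew's actual code.

```python
def filter_by_tags(documents, required_tags):
    """Return the documents that carry every tag in required_tags (case-insensitive)."""
    wanted = {tag.lower() for tag in required_tags}
    matches = []
    for doc in documents:
        doc_tags = {tag.lower() for tag in doc.get("tags", [])}
        # Keep the document only if all requested tags are present.
        if wanted <= doc_tags:
            matches.append(doc)
    return matches


# Hypothetical records standing in for a downloaded collection.
collection = [
    {"title": "Paper A", "tags": ["Ugi", "precipitate"]},
    {"title": "Paper B", "tags": ["Ugi"]},
    {"title": "Paper C", "tags": ["science2.0", "chemistry"]},
]

hits = filter_by_tags(collection, ["Ugi", "precipitate"])
print([doc["title"] for doc in hits])
```

Because the filter runs only after the whole collection has been fetched, its cost grows with the size of the collection rather than with the number of matches, which is consistent with the slowness noted in the first limitation.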

Progress on our Mendeley related services will be posted on the ONSwebservices wiki.


Thursday, October 07, 2010

Drexel Chemistry Mini-Symposium on Bradley Lab

Every year the chemistry department at Drexel gives faculty the opportunity to present their research to incoming students in 10-minute slots. On September 30, 2010 I presented on "Open Notebook Science for Malaria Drug Discovery and Solubility Modeling". I think such a short format is good for keeping student attention. Recording it also provides a handy link to use for other purposes. Most people just don't have time for 30-60 minute presentations.



Tuesday, April 20, 2010

ONS Books Wiki

I recently reported on our use of Nature Precedings to archive different editions of the ONS Solubility Challenge book. One of the advantages is that Precedings automatically alerts visitors if more recent editions exist.

However, today I learned that there is a glitch in this system: it is not possible to link individual versions on Precedings to a corresponding book edition on LuLu. That means that if you find yourself on the Nature Precedings entry and want to order the book from LuLu, it isn't obvious at all how to do so.

To resolve this issue once and for all I just created a wiki page (ONSbooks.wikispaces.com) to track every edition of the book. This is actually better because I can also provide links to all the available data archives and blog posts corresponding to each edition.

This is also the page where we will keep track of every edition of other Open Notebook Science books. The next one to be published shortly is for the UsefulChem project.


Tuesday, April 06, 2010

ONS t-shirts from Zazzle

Inspired by Graham Steel, I just received my t-shirt with an Open Notebook Science Logo and a picture of our crystal on the cover of our ONS Solubility Challenge book.

I was going to set up an ONS store but Zazzle does not permit zero royalties (don't see the logic there). But making up t-shirts on Zazzle is super simple - just grab a logo of your choice from the ONSclaims wiki.

Any other pic is your choice - this is the crystal from UCEXP150C


You can also order all kinds of other personalized things, including coffee cups.


Saturday, March 27, 2010

Education 2.0: Leveraging Collaborative Tools for Teaching

On March 25, 2010 I presented at the Drexel E-Learning 2.0 Conference on "Education 2.0: Leveraging Collaborative Tools for Teaching". It was an opportunity to update my slides with what I did and learned from the Chemical Information Retrieval course I taught over the Fall 2009 term.

I described using a wiki to organize course content and to allow students to contribute useful resources. Their assignments were also designed to be useful to other students in the class as well as to the general library and chemistry community.

I covered using wikis and other collaborative tools to mentor students doing laboratory research with Open Notebook Science. At the end I provided a quick overview of using games and Second Life for educational purposes.


Saturday, March 20, 2010

Reaction Attempts on ChemSpider

Just as we have done with the Open Notebook Science Solubility Challenge, we are adding more structure to the UsefulChem project.

This is a little bit more difficult because the UC notebook represents mainly chemical reactions, while the ONSC data are simply solubility measurements. Since most of the UC reactions are Ugi reactions, we have been keeping summary data in the CombiUgi Google Spreadsheet, which is completely specialized for this reaction and variations in our reaction conditions. This lets us search or sort by reactant, concentration, solvent, etc. However, we cannot do substructure searching directly using the CombiUgi sheet and we cannot add other types of reactions.

In order to enable substructure searching and add other reactions, Antony Williams has created 2 new data sources in ChemSpider: Attempted Reactions - Reactants and Attempted Reactions - Products. The data represented in the CombiUgi sheet has been restructured into 2 new Google Spreadsheets: RXIDs Reaction Attempts and Reaction Attempts.

Both of these sheets use a common Reaction ID to tie together an unlimited number of reactants and products (Reaction Attempts) and other pertinent reaction conditions (RXIDs Reaction Attempts), such as the concentration of the limiting reagent, the solvent, yield, notes, etc.
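
To illustrate how a shared Reaction ID ties the two sheets together, here is a minimal sketch that joins one row per reaction (conditions) with many rows per reaction (compounds). The column names (`rxid`, `role`, `solvent`, `limiting_conc_M`), the sample rows and the function `join_by_reaction_id` are assumptions made for illustration; they are not the actual spreadsheet headers or data.

```python
from collections import defaultdict

# Hypothetical rows: one per reaction, holding the conditions ...
conditions_rows = [
    {"rxid": 1, "solvent": "methanol", "limiting_conc_M": 0.5},
    {"rxid": 2, "solvent": "ethanol", "limiting_conc_M": 1.0},
]
# ... and one per compound taking part in a reaction.
compound_rows = [
    {"rxid": 1, "role": "reactant", "name": "benzoic acid"},
    {"rxid": 1, "role": "reactant", "name": "benzylamine"},
    {"rxid": 1, "role": "product", "name": "Ugi product 1"},
    {"rxid": 2, "role": "reactant", "name": "benzoic acid"},
]


def join_by_reaction_id(conditions, compounds):
    """Combine conditions and compounds into one record per Reaction ID."""
    by_id = defaultdict(lambda: {"reactant": [], "product": []})
    for row in compounds:
        by_id[row["rxid"]][row["role"]].append(row["name"])
    joined = {}
    for row in conditions:
        rxid = row["rxid"]
        joined[rxid] = {
            **row,
            "reactants": by_id[rxid]["reactant"],
            "products": by_id[rxid]["product"],
        }
    return joined


reactions = join_by_reaction_id(conditions_rows, compound_rows)
# Example query: reaction attempts that used benzoic acid in methanol.
hits = [
    rxid
    for rxid, rxn in reactions.items()
    if "benzoic acid" in rxn["reactants"] and rxn["solvent"] == "methanol"
]
print(hits)
```

Keeping compounds in their own sheet is what allows an unlimited number of reactants and products per reaction, since each one is just another row carrying the same Reaction ID.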

Currently only the data in the Reaction Attempts sheet has been imported into ChemSpider. But this alone gives us new functionality: we can perform substructure searches for either reactants or products.

For example, let's say we want to search for all reaction attempts using aromatic carboxylic acids. First we simply do a substructure search on ChemSpider, drawing benzoic acid and selecting Attempted Reactions - Reactants as the Data Source.


This pulls up 8 compounds that were used as a reactant at least once.


Clicking on one of these hits brings us to the ChemSpider entry. Selecting the Syntheses tab in the Data Sources shows links to the lab notebook pages where this compound was used.


The system is configured to accept anything from reactions with fully characterized products to reactions where products were not isolated, or even reactions in progress. I'm not using the term "failed reaction" because the term has no meaning without the context of the objective of the reaction. In our Ugi reactions we are typically looking for the product to precipitate out. By our criteria, reactions where no precipitate was observed after a few days would be classified as "failed". However, it may well be that product was formed but did not precipitate. Even when product is obtained, some might consider 30% isolated yields to be failures, while others would not. Context is everything in qualifying success.

But even with a clear definition of success, many reactions are simply neither successes nor failures. Reactions in progress fall into that category. The student may have even completed the reaction but not yet analyzed the results. But that doesn't matter so much if the raw monitoring data has been provided.

The general structure of this database means that we can add not only our reactions but those of anybody. Even in cases where someone does not have an Open Notebook, just providing a link to contact information of the researcher could be very useful to start a conversation. In that case the system would function more as a social networking platform - connecting researchers who work on similar molecules.

I don't think people are willing to do extensive write-ups for what they consider to be "failed experiments". However, if all that is requested is the list of reactants and target products, that may not be such a burden if it potentially means connecting with another researcher who can help, or even starting a new collaboration.

Currently ChemSpider does not take into account the information in the RXIDs Reaction Attempts sheet but we hope to be able to make use of that at some point. That would let us do more sophisticated searches, such as finding any reaction attempt where an aromatic carboxylic acid was reacted with an aliphatic amine in methanol.

Andrew Lang has also provided the information of the 2 spreadsheets as XML:
http://showme.physics.drexel.edu/onsc/Services/OData.svc/Reactions/
http://showme.physics.drexel.edu/onsc/Services/OData.svc/ReactionCompounds/
[Note: if viewing in Firefox select View Source to see all the XML]

We will likely use these live feeds for performing more sophisticated queries and we welcome others to use them for any purpose.


Friday, March 19, 2010

RSC Sponsors Open Notebook Science Challenge

I am very pleased to report that the Royal Society of Chemistry is sponsoring 5 new $500 awards for the Open Notebook Science Solubility Challenge.

The previous round of 10 awards was sponsored by Submeta, Nature and Sigma-Aldrich. With the final award of that round having been made in December 2009, this is very good timing.

The criteria and rules for the contest have not changed. Students from the US and the UK are generally eligible to participate. See the Rules and Application Form for full details:
http://onschallenge.wikispaces.com/RSCAwards2010

All of the solubility measurements will continue to be compiled and distributed in several formats, including a book where biographies and pictures of all the award winners can be found. The most recent edition - with all 10 previous winners - is available here:
http://precedings.nature.com/documents/4243/version/3

I am very grateful to Antony Williams for being instrumental in making this happen.


Thursday, March 04, 2010

Nature Precedings as an Archiving Tool for ONS Solubility Book

The issue of archiving and citation is a topic that is usually raised whenever I give a talk about Open Notebook Science. We have recently tried to address this using several complementary strategies.

The publication of a book containing a snapshot of all the values obtained from the Open Notebook Science Solubility Challenge has turned out to be a convenient mechanism. By using LuLu, the book can be either downloaded for free as a PDF or ordered as a physical copy for just the printing and shipping charges.

However, Lulu does not have a convenient method of keeping track of different editions of the book and it is unclear how to best cite them.

Nature Precedings solves both of these problems quite nicely. I have uploaded the PDF of each book edition to NP and the versions are automatically linked to each other. In fact if you try to access an older edition, NP pops up a warning that a more recent version is available with the corresponding link (see image below).

Precedings also provides information about how to cite the document, including a DOI for each version. Unfortunately it appears that it can take some time for the DOIs to resolve. Links to different versions can also be formatted like this:
http://precedings.nature.com/documents/4243/version/1
http://precedings.nature.com/documents/4243/version/2
http://precedings.nature.com/documents/4243/version/3
Links to the Lulu version of each book are also provided, which is convenient for anyone who might want to order a physical copy.

At this time Precedings does not accept zip files containing the full archive of the source files for each book version - although a link to the archive is provided in the preface of the book. We have found that our library's DSpace repository is a convenient location for these.


Friday, February 12, 2010

ONS Solubility Book: Edition 3 with Notebook Archive

Edition 3 (2010-02-11) of the ONS Solubility Challenge book is now available.

We've been trying for some time to find a way to conveniently take a snapshot of our Open Notebooks and all associated raw data files. This could serve as a way to back up all of our work as well as provide a means of finding out the state of knowledge for a project at a given moment in time. There is also a tremendous benefit to confidently using the best of free hosted Web2.0 services out there (e.g. GoogleDocs and Wikispaces) without being concerned with changes in policies or access down the road.

Our recent use of the ONS Challenge Solubility book to periodically create releases of summarized data has opened up a convenient opportunity. And yesterday the last piece of the puzzle fell into place. Through a combination of fairly quick manual and automated tasks, Andrew Lang and I are able to push out a full snapshot of all relevant files and lab notebook pages and associate it with an edition of the book.

As described below, the archive is accessible interactively on a server, as a zip download or as a CD from LuLu. Perhaps we can also find a home on library servers in the future.

More details are provided in the preface for Edition 3 (2010-02-11):
This is the first edition to include a full archive of the ONS Challenge notebook. A space export from Wikispaces provides an initial version of all the HTML pages in the notebook with local hyperlinks to copies of all images and files uploaded onto the wiki. All of the Google Spreadsheets are automatically downloaded as Excel spreadsheets and placed in the same "files" folder as the images. NMR spectra, stored as JCAMP-DX files, are placed in the "spectra" folder. All of the HTML pages are reformatted to provide local references to both Excel spreadsheets and the JCAMP-DX files.

The notebook archive is meant to represent a snapshot of the state of all source documents at the time of the publication of an edition of this book. When used from a server with web services running, clicking on links to the spectra will allow interaction via a browser interface, including zooming in or out and integration of the NMR spectrum. When accessed in stand-alone mode after downloading or directly from a CD, everything will work the same, except that JCAMP-DX files must be opened from JSpecView running on the desktop. Excel files will retain any calculations in the cells of the original Google Spreadsheets but dynamic values generated from calling web services - such as the script that automatically integrates NMR spectra - will be frozen as simple values. However, the link to the web service used will be stored in the cell as a comment. Links to external websites are not crawled and embedded Google Spreadsheets or videos are not copied. These will work but will reflect live data on the web.

The February 11, 2010 version of the notebook archive is available on a hosted site, on a CD or by download.
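
The step of reformatting HTML pages to use local references can be pictured with a small sketch. It assumes a lookup table from original URLs to local copies has already been built during the download step; the function name `localize_links`, the example URLs and the file names are hypothetical, and this is not the actual archiving script.

```python
import re


def localize_links(html, local_copies):
    """Point href/src attributes at local copies when one exists; leave other links unchanged."""

    def swap(match):
        attribute, url = match.group(1), match.group(2)
        # Fall back to the original URL when no local copy was archived.
        return f'{attribute}="{local_copies.get(url, url)}"'

    return re.sub(r'\b(href|src)="([^"]*)"', swap, html)


# Hypothetical page fragment and lookup table.
page = (
    '<a href="http://example.org/sheet/abc">data</a> '
    '<a href="http://example.org/other">external</a>'
)
local_copies = {"http://example.org/sheet/abc": "files/abc.xls"}

print(localize_links(page, local_copies))
```

Links without a local copy are passed through untouched, matching the behavior described in the preface for external websites, which continue to point at live data on the web.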


Tuesday, December 29, 2009

ONS Solubility Book: Edition 2 - with Predicted Values

The Second Edition (2009-12-27) of the Open Notebook Science Solubility Challenge book is now available. The issues with some missing text have been resolved, in addition to providing clickable links for the references in the PDF version.

However, the main difference is the addition of a new section on solubility predictions. The book is now somewhat larger than the first edition, coming in at 129 pages but still very affordable at $8.16 (covers printing and shipping costs from LuLu).

This was added to the preface:
Predicted Solubilities

In this edition, a new section is added to provide predicted solubility values for selected solutes in a range of solvents. Specifically, solutes are included when measurements from at least 5 different solvents are available. A method using Abraham descriptors depends on the experimental solubility measurements from several solvents to make predictions, which is detailed in that section of the book. For this reason, this edition also includes some aqueous solubility measurements, which are generally available from the literature. The focus of this collection remains on non-aqueous solubility.

Consistent with how the experimental measurements are made available, the predicted solubility values are provided as a work in progress. The purpose in providing them is to suggest solvents of interest for various applications. The boiling point of each solvent is also listed in the table to allow a convenient selection. When available, experimental measurements are listed next to the predicted values. This information can be helpful to gauge the usefulness of the model to some extent but does not guarantee its reliability for the other solvents. As more measurements are collected the reliability of the predictions is likely to increase and this will be reflected in future editions of this book.
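
For readers unfamiliar with the approach, the widely used general form of the Abraham relationship can be sketched as follows. It estimates the ratio of solubility in an organic solvent to solubility in water from five solute descriptors (E, S, A, B, V) and six solvent coefficients (c, e, s, a, b, v), which is why an aqueous solubility value is needed as the reference point. The exact procedure used for the book is the one detailed in its own section and may differ; the function name and all numbers below are placeholders chosen for easy arithmetic, not real descriptors or coefficients.

```python
def predict_log_solubility(log_s_water, solute, solvent):
    """Predict log10 molar solubility in an organic solvent.

    Uses log10(S_solvent / S_water) = c + e*E + s*S + a*A + b*B + v*V,
    where E, S, A, B, V describe the solute and c, e, s, a, b, v the solvent.
    """
    log_ratio = (
        solvent["c"]
        + solvent["e"] * solute["E"]
        + solvent["s"] * solute["S"]
        + solvent["a"] * solute["A"]
        + solvent["b"] * solute["B"]
        + solvent["v"] * solute["V"]
    )
    return log_s_water + log_ratio


# Placeholder numbers chosen for easy arithmetic, NOT real descriptors or coefficients.
example_solute = {"E": 1.0, "S": 1.0, "A": 0.5, "B": 0.5, "V": 1.0}
example_solvent = {"c": 0.2, "e": 0.5, "s": -1.0, "a": -2.0, "b": -4.0, "v": 4.0}

log_s = predict_log_solubility(-2.0, example_solute, example_solvent)
print(f"predicted log10 S = {log_s:.2f}, S = {10 ** log_s:.3f} M")
```

Once the solute descriptors are known, the same function can be called with the coefficients of any solvent for which they have been published, which is what makes predictions across a range of solvents possible from a handful of measurements.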
Andrew Lang has been busily learning about building models using Abraham descriptors. As luck would have it, Michael Abraham just published an extensive collection of his descriptors for many solvents in a recent publication:
Abraham M.H.; Smith R.E.; Luchtefeld R.; Boorem A.J.; Luo R.; Acree Jr. W.E. Prediction of solubility of drugs and other compounds in organic solvents. J. Pharm. Sci. Early View Sept. 22 (2009) http://dx.doi.org/10.1002/jps.21922
This is an important step for the ONS Challenge project, taking us closer to the eventual goal of providing chemists with an open tool for anticipating the solubility behavior of their reactants and products in a particular solvent. Researchers might think of trying new solvents after perusing their measured or predicted solubilization potential for a given solute.

We don't know how good the predictions will turn out to be but we will certainly find out in the coming months and report as we go. Even though the Submeta awards have all been distributed we still welcome measurement contributions.




Saturday, December 12, 2009

First Edition of ONS Solubility Challenge Book

Andrew Lang and I have been working on a book version of the Open Notebook Science Solubility Challenge database. The timing is good since we just awarded the last ONS Challenge Submeta award this month. All of the students, judges and educational partner are included as co-authors. A biography and picture of everyone is included in the book.
Jean-Claude Bradley, Associate Professor of Chemistry at Drexel University
Cameron Neylon, Senior Scientist at the ISIS Pulsed Neutron Source, Rutherford Appleton Laboratory and Lecturer in Chemical Biology at the School of Chemistry at the University of Southampton
Rajarshi Guha, Research Scientist at the NIH Chemical Genomics Center
Antony Williams, Vice President of Strategic Development, ChemSpider at the Royal Society of Chemistry
Bill Hooker, Postdoctoral Researcher in Molecular Biology
Andrew Lang, Professor of Mathematics at Oral Roberts University
Brent Friesen, Associate Professor of Chemistry at Dominican University
and
Tim Bohinski, David Bulger, Matthew Federici, Jenny Hale, Jenna Mancinelli, Khalid Mirza, Marshall Moritz, Daniel Rein, Cedric Tchakounte, and Hai Truong
We selected LuLu as a convenient mechanism to distribute copies. This 6 x 9 inch black-and-white softcover edition is available for $5.96, which just covers the printing and shipping charges. Other formats are possible - such as a larger hardcover in color - but these are much more expensive. We thought it would be good to start with the most affordable version and look at other options later. The electronic version of the book is available for free on LuLu.

We were inspired by the style of the solubility book published by Atherton Seidell in 1919, freely available on Google Books. The compound entries are listed in alphabetical order, with tables of compound data and solubilities. We included data that we found to be useful for practical applications, including predicted density, room temperature phase and the solubility in molarity, mole fraction and g/100g solvent. References link to lab notebook pages or literature references.

Andy found a way to create the fully formatted book in an almost completely automated way, pulling the data directly from the Solubilities Summary and other Google spreadsheets and querying ChemSpider. The preface and biographies of the students, judges and educational partner are also automatically pulled in from Google Docs. With this system in place, it will be straightforward to publish frequent future editions with the most up-to-date information.

This was also a good opportunity to make use of the WebCite service. It enables us to link the book to a frozen version of the Solubilities Summary sheet archived as an Excel spreadsheet. This format retains all the formulas and hyperlinks in the original Google Spreadsheet.

The preface further explains the scope of the book and project:

The Open Notebook Science Solubility Challenge

Solubility is an important consideration for many chemistry applications. Synthetic chemists usually use a solvent to perform reactions and knowledge of the solubility of the starting materials or products can be very useful to pick an appropriate solvent. Analytical chemists can use solubility to design separation techniques and factor in dynamic range considerations. Physical chemists can create and evaluate their models of how molecules interact in the solubilization and precipitation processes.

Solubility data can be obtained from a variety of online and offline sources. As with all chemical data, it can be a challenge to evaluate reported measurements. Some databases offer no references while others provide citations to peer reviewed journal articles. Given the choice, more weight is generally given to the latter. This is reasonable in most cases because more information about the purity of compounds and the methods used are available in peer-reviewed articles.

However, the information for how a specific measurement was obtained within a journal article is not generally provided. General methods are provided but the raw data for a specific measurement are typically not published. Peer review is not intended to validate individual measurements - its function is to ensure that the authors made appropriate conclusions based on their processed datasets and the state of knowledge in the field.

The Open Notebook Science Challenge was initiated in the fall of 2008 as the result of a discussion on a train in the UK between Jean-Claude Bradley and Cameron Neylon.[1,2] The concept was very simple: create a crowdsourcing opportunity for the chemistry community to contribute solubility measurements under Open Notebook Science conditions. This method of publication entails providing immediate public access to the chemist's laboratory notebook, as well as all raw data used to compute the measurements.[3,4]

On Sept 3, 2008 the first ONSC measurements were recorded by Bradley and Neylon at the University of Southampton in Neylon's laboratory.[5] The project was soon sponsored by Submeta, offering ten $500 awards for students in the US or the UK who best recorded how they performed their experiments.[6] Furthermore, the first 3 winners also received one year subscriptions to Nature magazine, thanks to a sponsorship from the Nature Publishing Group.[7] Sigma-Aldrich supported the contest by donating chemicals upon request.[8]

Students were evaluated by a group of judges who convened once a month to deliberate the next award. Judges also provided feedback to the students by commenting on their lab notebook pages directly on the wiki. Their expertise ranged from chemistry to mathematics, spectroscopy and molecular biology.

Techniques

Participants in the ONS Challenge were not required to use a specific method to measure solubility - although they were required to properly document their experiments and analyses. Due to its simplicity, most measurements in the past year were made using the SAMS NMR technique, requiring no volume measurement or calibration curves.[9] Two assumptions are made with this method. The first is that the volume of solute and solvent are additive, with the error becoming negligible at low solubility values. The second is that NMR integration values are proportional to the amount of solvent and solute. Some deviations from this have been observed for default NMR parameters and in later experiments long relaxation times are introduced into the protocol (D1 = 50s).[10]
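
The two assumptions can be turned into a short calculation. This is a sketch of how a molarity follows from them, not the actual SAMS script: the function name, argument names and example numbers are hypothetical. Each integral is divided by the number of protons behind the peak to give relative moles, and the solution volume is taken as the sum of the solute and solvent volumes.

```python
def molarity_from_nmr(i_solute, h_solute, i_solvent, h_solvent,
                      mw_solute, rho_solute, mw_solvent, rho_solvent):
    """Estimate molarity (mol/L) of a saturated solution from NMR integrals.

    i_*   : integration value of a chosen peak
    h_*   : number of protons giving rise to that peak
    mw_*  : molecular weight in g/mol
    rho_* : density in g/mL
    """
    # Assumption 2: integrals are proportional to amount, so these are relative moles.
    n_solute = i_solute / h_solute
    n_solvent = i_solvent / h_solvent
    # Assumption 1: volumes are additive (result in mL per relative mole unit).
    volume_ml = n_solute * mw_solute / rho_solute + n_solvent * mw_solvent / rho_solvent
    # The arbitrary integration scale cancels in the ratio.
    return 1000.0 * n_solute / volume_ml


# Made-up values for illustration only.
print(round(molarity_from_nmr(1.0, 1, 10.0, 1, 100.0, 1.0, 50.0, 1.0), 3))
```

Because only the ratio of the two integrals matters, no volume measurement or calibration curve is needed, at the cost of relying on predicted or literature densities for the solute.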

Data Curation

Since an Open Notebook approach is used in this work, those interested in the validity of the measurements can assess the methods used - both for the preparation of saturated solutions and the raw data from the measurements. Over time, values in the database are likely to improve and possibly some errors may be uncovered and corrected. However, on the whole, we feel that the values provided in this work should be of use to chemists trying to gain an appreciation of solubility for most applications. This is especially the case for values that are not obtainable from any other source.

When clearly erroneous data points are discovered, they are flagged in the database as "DONOTUSE". This way interfaces with the dataset can ignore these values while allowing anyone to investigate why the data points were flagged. This might happen when early experiments did not allow for sufficient mixing or when NMR D1 relaxation times were not long enough to fully integrate peaks of interest. Out of 681 reported measurements, 51 are currently marked in this way. A shared Google Spreadsheet is used to collect and curate the dataset. This allows easy data entry while providing a simple way to interrogate the database for visualization applications via the Google API.[11]
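
An interface that ignores flagged values while leaving them in the dataset might look like the following sketch. The column names, the sample rows and the function `average_usable` are hypothetical and do not reflect the actual spreadsheet layout.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows; the real spreadsheet has more columns and different headers.
rows = [
    {"solute": "solute A", "solvent": "methanol", "molarity": 0.50, "flag": ""},
    {"solute": "solute A", "solvent": "methanol", "molarity": 0.70, "flag": ""},
    {"solute": "solute A", "solvent": "methanol", "molarity": 5.00, "flag": "DONOTUSE"},
    {"solute": "solute B", "solvent": "hexane", "molarity": 0.10, "flag": ""},
]


def average_usable(rows):
    """Average molarity per (solute, solvent) pair, skipping rows flagged DONOTUSE."""
    grouped = defaultdict(list)
    for row in rows:
        if row["flag"] == "DONOTUSE":
            continue  # the row stays in the dataset but is excluded from the average
        grouped[(row["solute"], row["solvent"])].append(row["molarity"])
    return {pair: mean(values) for pair, values in grouped.items()}


averages = average_usable(rows)
for pair, value in averages.items():
    print(pair, round(value, 3))
```

Flagging rather than deleting keeps the questionable measurement and its link to the notebook page available for anyone who wants to see why it was excluded.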

Literature data and format conversions

An additional 400 solubility measurements from the literature are included in the database. These generally correspond to compounds that are structurally identical or similar to the compounds measured by the ONS Challenge participants. These values are averaged in with the values from the participants, with appropriate references provided. In order to compare values, conversions from molar fraction or g solute/100g solvent to molarity were made by assuming that the volumes are additive and obtaining the density of the solutes in most cases from the predicted values in ChemSpider.[12]

For the convenience of chemists with diverse applications, all three formats are provided. For the cases where solutes are miscible with the solvent, the molarity reported is simply the solute's density. The practical interpretation of this is that solutions of any molarity below the solute's density can be prepared.
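
The conversions described here can be sketched under the same additive-volume assumption. The function names and example numbers are hypothetical, and densities are assumed to be in g/mL. For a miscible solute, the limiting value is the concentration of the pure solute, which is its density expressed in molar terms (1000 x density / molecular weight).

```python
def molarity_from_mole_fraction(x, mw_solute, rho_solute, mw_solvent, rho_solvent):
    """Convert solute mole fraction x to mol/L, assuming additive volumes."""
    # Volume (mL) of a mixture holding x mol of solute and (1 - x) mol of solvent.
    volume_ml = x * mw_solute / rho_solute + (1.0 - x) * mw_solvent / rho_solvent
    return 1000.0 * x / volume_ml


def molarity_from_g_per_100g(grams, mw_solute, rho_solute, rho_solvent):
    """Convert g solute per 100 g solvent to mol/L, assuming additive volumes."""
    moles = grams / mw_solute
    volume_ml = grams / rho_solute + 100.0 / rho_solvent
    return 1000.0 * moles / volume_ml


def pure_solute_molarity(mw_solute, rho_solute):
    """Upper limit for a miscible solute: the concentration of the neat liquid."""
    return 1000.0 * rho_solute / mw_solute


# Made-up values for illustration only.
print(round(molarity_from_mole_fraction(0.5, 100.0, 1.0, 50.0, 1.0), 3))
print(round(molarity_from_g_per_100g(100.0, 100.0, 1.0, 1.0), 3))
print(round(pure_solute_molarity(100.0, 1.0), 3))
```

Setting the mole fraction to 1 in the first function gives the same number as `pure_solute_molarity`, which is the sense in which any molarity below that limit can be prepared.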

In the process of converting units and averaging heterogeneous data sources, no attempt has been made to track significant figures. Those interested in any information about the precision of measurements should consult each individual data source. This may not be an easy task for measurements only carried out once and where factors such as the quality of spectral peaks and baselines are not optimal.

This collection will be most valuable for those who do not require highly precise measurements for their applications. For example, synthetic chemists can easily use rough estimates of solubility to select appropriate solvents for a reaction. In any case, one would be wise to consider all measurements as provisional, regardless of the source. As more data are collected, subsequent editions of this book will adjust values accordingly.

Searching the database

The values in this database can be accessed and filtered in various ways. More information is available at the ONS Challenge wiki[13] and Chapter 16 of the book "Beautiful Data".[14]

Database version

Archived as Excel Spreadsheet by WebCite on December 11, 2009.[15]

References

[1] Bradley, JC Open Notebook Science Challenge, UsefulChem blog (2008) http://usefulchem.blogspot.com/2008/09/open-notebook-science-challenge.html
[2] Open Notebook Science Challenge Wikipedia entry http://en.wikipedia.org/wiki/Open_Notebook_Science_Challenge
[3] Bradley, JC Open Notebook Science, Drexel CoAS E-Learning Blog (2006) http://drexel-coas-elearning.blogspot.com/2006/09/open-notebook-science.html
[4] Open Notebook Science Wikipedia entry http://en.wikipedia.org/wiki/Open_Notebook_Science
[5] Bradley, JC; Neylon, C UsefulChem Experiment 207 http://usefulchem.wikispaces.com/Exp207
[6] Bradley, JC Submeta Open Notebook Science Awards, UsefulChem Blog (2008) http://usefulchem.blogspot.com/2008/11/submeta-open-notebook-science-awards.html
[7] Bradley, JC Nature Sponsors Open Notebook Science, UsefulChem Blog (2008) http://usefulchem.blogspot.com/2008/11/nature-sponsors-open-notebook-science.html
[8] Bradley, JC Sigma-Aldrich First Official Sponsor of Open Notebook Science Challenge, UsefulChem Blog (2008) http://usefulchem.blogspot.com/2008/09/sigma-aldrich-first-official-sponsor-of.html
[9] Bradley, JC Semi-Automated Measurement of Solubility, UsefulChem Blog (2009) http://usefulchem.blogspot.com/2009/03/semi-automated-measurement-of.html
[10] Bradley, JC NMR Integration Progress for Solubility Measurements, UsefulChem Blog (2009) http://usefulchem.blogspot.com/2009/06/nmr-integration-progress-for-solubility.html
[11] Bradley, JC Interactive Visualization of ONS Solubility Data, UsefulChem Blog (2009) http://usefulchem.blogspot.com/2009/01/interactive-visualization-of-ons.html
[12] ChemSpider database http://www.chemspider.com
[13] ONS Challenge List of Experiments Page http://onschallenge.wikispaces.com/list+of+experiments
[14] Bradley, J.-C.; Guha, R.; Lang, A.S.I.D.; Lindenbaum, P; Neylon, C.; Williams, A.J. & Willighagen, E. Chapter 16: Beautifying Data in the Real World from Beautiful Data. O'Reilly Media, Eds: Segaran, T. & Hammerbacher, J. (2009)
[15] Bradley, Jean-Claude; Lang Andrew. Solubilities Summary Sheet. Open Notebook Science Challenge. 2009-12-11. URL:http://spreadsheets.google.com/pub?key=plwwufp30hfq0udnEmRD1aQ&output=xls. Accessed: 2009-12-11. (Archived by WebCite® at http://www.webcitation.org/5lx5ry3BV)



Wednesday, September 02, 2009

Jenna Mancinelli is Sept09 Submeta ONS Award Winner

Jenna Mancinelli, working under the supervision of Jean-Claude Bradley at Drexel University, is the September 2009 Submeta Open Notebook Science Challenge Award winner. She wins a cash prize from Submeta.

Jenna used both NMR and the sequential precipitation technique to obtain solubility data. See her experiments here:
http://onschallenge.wikispaces.com/list+of+experiments

One more Submeta ONS Award will be made during 2009. Submissions from students in the US and the UK are still welcome.
For more information see:
http://onschallenge.wikispaces.com
http://onschallenge.wikispaces.com/submetaawards08


Monday, August 17, 2009

My first talk at ACS09 fall meeting on Crowdsourcing Solubility and ONS

Yesterday (August 16, 2009) I gave my first talk at the ACS meeting in Washington. It was part of an outstanding session on Chemical Text Mining and Public Molecular Databases, organized by Antony Williams and Alex Tropsha.

9:00 AM 1 U.S. EPA computational toxicology programs: Central role of chemical-annotation efforts and molecular databases
    Ann M. Richard, Maritja A. Wolf, ClarLynda R. Williams-Devane, Richard Judson
9:25 AM 2 Linking public and commercial chemical data: ChemSpider and SureChem
    Nicko Goncharoff
9:50 AM 3 Building an integrated system for chemistry markup and online publishing integrated to online chemistry resources
    A J Williams
10:30 AM 4 Turning mining inside out
    Colin R Batchelor
10:55 AM 5 Chemreader: A tool for extracting chemical structure information from digital raster images
    Jungkap Park, Kazu Saitou, Kerby Shedden, Gus R. Rosania
11:20 AM 6 Exploiting a hidden treasure: Automated chemical entity recognition in Chemisches Zentralblatt
    Valentina Eigner-Pitto, Heinz Saller, Peter Loew
1:30 PM 12 Online chemical modeling environment: database
    Sergii Novotarskyi, Iurii Sushko, Robert Körner, Anil Kumar Pandey, Igor V. Tetko
1:55 PM 13 Public molecular databases: How can their value be increased by generation of additional data in silico?
    Vladimir V. Poroikov, Dmitry Filimonov, Marc C. Nicklaus
2:20 PM 14 Chemical space management of large libraries for new active small molecules selection for prostate cancer treatment
    Andrew V. Scorenko, Andrei A. Gakh, Andrey V. Sosnov, Mikhail Yu. Krasavin
2:45 PM 15 Crowdsourcing nonaqueous solubility and synthesis using Open Notebook Science
    Jean-Claude Bradley, Khalid Mirza, Rajarshi Guha, Andrew Lang, A. Williams
3:25 PM 16 ChemXSeer: A cyberinfrastructure for environmental chemical kinetics
    Karl T. Mueller, William J. Brouwer, C. Lee Giles, Prasenjit Mitra, Carl Lagoze
4:15 PM 18 Reliable reactions and stable structures
    Jonathan M Goodman

Many of the presentations highlighted the use of ChemSpider or full collaborations (such as the integration with SureChem patent data). The acquisition of ChemSpider by RSC was repeatedly discussed and this seems to have accelerated such collaborative projects. Colin Batchelor from the RSC provided a great talk on their approach of using ontologies to better leverage the power of chemistry publications. [The presentations were judged and Colin won first prize - I won second, which was pretty cool :) and won me a ticket to the CINF lunch on Tuesday]

I also got to meet Gus Rosania in person for the first time. We had met via the blogosphere a while back over our interests in malaria and Open Notebook Science. Gus was there to share his results from ChemReader, a software package he developed to automatically read chemical structures from images.

I started my presentation by detailing the recent events surrounding the report of the oxidation of secondary alcohols using NaH. The timing of this was perfect because it really showed how useful it can be to immediately share the full data of experiments. This is the type of thing that would have been extremely helpful during the initial reports of Cold Fusion but the tools for sharing in such a detailed way were just not available. Carmen Drahl just wrote an article about this for the August 17, 2009 issue of Chemical & Engineering News (subscriber access).




Wednesday, July 29, 2009

Iterating a 5D solubility space

About three weeks ago I described how we are mapping a 5D solubility space (mixtures of 4 solvents and temperature). Andrew Lang has been re-running his code to populate the DoSol request sheet with the most useful next measurements. After a few iterations of Marshall Moritz doing experiments and combining the results with any existing data from the literature, we now have 76 measurements for the solubility of 4-nitrobenzaldehyde in mixtures of chloroform, acetonitrile, toluene and THF within the temperature range of -25 to 40 C.


We are now working on ways of quantifying how well we have covered the space and how confident we are of specific predictions. At some point we would like to generalize the predictions based on molecular descriptors of the solvents.

The existing dataset can be sliced in some interesting ways. For example, using Mathematica, Andy has created a plot of the solvent combinations giving the highest possible solubilities of 4-nitrobenzaldehyde at a given temperature. At room temperature this corresponds to a mixture of 38% chloroform and 62% acetonitrile (molar ratio). Below 10C, toluene enters the mix to obtain maximum solubility. At no temperature does THF help.


Thursday, July 09, 2009

ChemADVISOR promotes ONS Challenge

I was quite pleased to discover this morning that ChemADVISOR has posted a notice about our Open Notebook Science Solubility Challenge on their newsletter.

I had a nice chat with Matt Kaus this afternoon about possible ways we can work together to further our common objectives. This seems to be a win-win situation for many stakeholders, including the students participating in our ONS Solubility Challenge who are looking for employment opportunities. Our solubility data is also apparently in demand from their subscribers.

Let's see how this plays out but I am certainly excited about the possible projects going forward.


Wednesday, July 01, 2009

Marshall Moritz is July09 Submeta ONS Award Winner

Marshall Moritz, a chemistry and math student at Syracuse University, working under the supervision of Jean-Claude Bradley at Drexel University over the summer, is the July 2009 Submeta Open Notebook Science Challenge Award winner. He wins a cash prize from Submeta.

Marshall started out using NMR to measure solubility and recently has made some important contributions to the Challenge by using the sequential precipitation technique to obtain solubilities in different solvent mixtures at various temperatures. See his experiments here:
http://onschallenge.wikispaces.com/list+of+experiments

Three more Submeta ONS Awards will be made during 2009. Submissions from students in the US and the UK are still welcome.
For more information see:
http://onschallenge.wikispaces.com
http://onschallenge.wikispaces.com/submetaawards08


Friday, June 26, 2009

Solubility surfaces in 3D

After a little bump in the road with attempting to measure the solubility of 4-nitrobenzaldehyde in methanol, Marshall Moritz extended the study (ONSC-EXP111) to chloroform and acetonitrile, where no hemiacetal can form. The sequential precipitation method seems to work quite well with a new thermostated bath that lets us go down to almost -30C.

The values in pure acetonitrile and chloroform were consistent with those reported by Maccarone, E.; Perrini, G. Gazzetta Chimica Italiana 1982, 112, p. 447 (private access). But Marshall went further and measured 9 points with mixed solvents and different concentrations. The result is this saddle-shaped 3D surface plot.


I think the potential for extending the ONS Challenge to cover the full multi-dimensional space of mixtures of about 10 common solvents and temperature is very exciting. It would be foolish to think that we can map in extreme detail such large surfaces. However, I'll bet we can come up with some useful estimates of what the surface looks like in many areas of the space.

The long-range plan for this approach would involve coming up with at least an empirical model based on molecular descriptors of the solvents and solutes. This would help predict the space for completely new combinations.

Being able to predict the solubility of all starting materials and products would enable organic chemists to rationally select solvent systems and temperatures for their reactions, maximizing product yield from simple precipitation. This is something that we are investigating for the Ugi reaction and I'll report on this as the data come in.


Tuesday, June 16, 2009

Temperature solubility curves using sequential precipitation

We were going to wait for a while to tackle temperature-related solubility because it wasn't clear that our NMR technique was applicable. The main problem is that separating out the supernatant at a given temperature can be tricky for both filtration and centrifugation with our available equipment.

However, while discussing the issue with Marshall last week it occurred to me that we might be able to get the data by sequential crystallization. The idea is to prepare several solutions at known concentrations at a high temperature, then let the bath cool down and note the temperature at which precipitate is first observed in each one. We can then interpolate to report the solubility at room temperature if necessary.
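As a rough sketch of that last step, the following takes a handful of (precipitation temperature, concentration) pairs and linearly interpolates the solubility at a target temperature. Linear interpolation between neighbouring points is an assumption made here for simplicity, and the function name and the numbers in the usage lines are hypothetical, not measurements from our experiments.

```python
def interpolate_solubility(points, target_temp_c):
    """Linearly interpolate solubility (M) at target_temp_c.

    points: iterable of (precipitation_temp_c, concentration_molar) pairs,
    one per solution prepared at a known concentration.
    """
    ordered = sorted(points)
    if len(ordered) < 2:
        raise ValueError("need at least two observations to interpolate")
    # Walk through neighbouring pairs until one brackets the target temperature.
    for (t_lo, c_lo), (t_hi, c_hi) in zip(ordered, ordered[1:]):
        if t_lo <= target_temp_c <= t_hi:
            if t_hi == t_lo:
                # Two solutions precipitated at the same temperature: average them.
                return (c_lo + c_hi) / 2.0
            fraction = (target_temp_c - t_lo) / (t_hi - t_lo)
            return c_lo + fraction * (c_hi - c_lo)
    raise ValueError("target temperature lies outside the measured range")


# Hypothetical observations: each solution's concentration and the bath
# temperature at which precipitate first appeared on cooling.
observations = [(10.0, 0.10), (30.0, 0.20)]
room_temp_solubility = interpolate_solubility(observations, 20.0)  # about 0.15 M
```

The function refuses to extrapolate, so a room temperature value is only returned when precipitation temperatures on both sides of it have been observed.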

Khalid and I made an attempt on Friday for pyrene in acetonitrile (EXP109) and obtained a value of 0.16 M at 22C, about twice the value found by Marshall in EXP108 using NMR. Part of the discrepancy is probably related to longer relaxation times of aromatic protons but we have to keep in mind that NMR is not particularly precise at such low concentrations. Hopefully we will soon figure out how to run our NMR experiments to allow full relaxation of all protons and address this issue more conclusively.


Creative Commons Attribution Share-Alike 2.5 License