data sharing:
a look at the issues

             kaitlin thaney
program manager, science commons
 trieste, italy - ICTP - 16 oct 2009


 This presentation is licensed under the Creative Commons Attribution 3.0 license.
xi.
before jumping into data ...
    (where we left off)
make sharing easy, legal and scalable

        integrated approach

building part of the infrastructure for
          knowledge sharing
scientific revolutions occur when a
 sufficient body of data accumulates to
   overthrow the dominant theories
        we use to frame reality

     a so-called paradigm shift

                    - from thomas kuhn
content needs to be legally and
    technically accessible
indexing, translation, redistribution: disallowed
“ By open access to the literature, we mean its free
availability on the public internet, permitting users
 to read, download, copy, distribute, print, search, or link
     to the full texts of the articles, crawl them for
indexing, pass them as data to software, or use them for
   any other lawful purpose, without financial, legal or
  technical barriers other than those inseparable from
           gaining access to the internet itself.”



           Image from the Public Library of Science, licensed to the public, under
                                        CC-BY-3.0
“The only constraint on reproduction and distribution,
 and the only role for copyright in this domain, should
 be to give authors control over the integrity of their
 work and the right to be properly acknowledged
                      and cited.”
Data sharing:  a look at the issues - Trieste
legal
implementation
don’t forget
  about the
physical tools
     UBMTA


      SLA


     SCMTA
knowledge?

    journal articles
          data
       ontologies
      annotations
plasmids and cell lines
as a means to achieve Open Access
      but what about data?
the data web
“the future is here ...
just unevenly distributed”
                      - william gibson
(e.g., linked data, W3C, neurocommons...)
1.
three layers of resistance:
 technical, semantic, legal

           save legal for last ...
“read 189,000
  papers” is not
the ideal answer.
DRD1, 1812      adenylate cyclase activation
ADRB2, 154      adenylate cyclase activation
ADRB2, 154      arrestin mediated desensitization of G-protein coupled receptor protein signaling pathway
DRD1IP, 50632   dopamine receptor signaling pathway
DRD1, 1812      dopamine receptor, adenylate cyclase activating pathway
DRD2, 1813      dopamine receptor, adenylate cyclase inhibiting pathway
GRM7, 2917      G-protein coupled receptor protein signaling pathway
GNG3, 2785      G-protein coupled receptor protein signaling pathway
GNG12, 55970    G-protein coupled receptor protein signaling pathway
DRD2, 1813      G-protein coupled receptor protein signaling pathway
ADRB2, 154      G-protein coupled receptor protein signaling pathway
CALM3, 808      G-protein coupled receptor protein signaling pathway
HTR2A, 3356     G-protein coupled receptor protein signaling pathway
DRD1, 1812      G-protein signaling, coupled to cyclic nucleotide second messenger
SSTR5, 6755     G-protein signaling, coupled to cyclic nucleotide second messenger
MTNR1A, 4543    G-protein signaling, coupled to cyclic nucleotide second messenger
CNR2, 1269      G-protein signaling, coupled to cyclic nucleotide second messenger
HTR6, 3362      G-protein signaling, coupled to cyclic nucleotide second messenger
GRIK2, 2898     glutamate signaling pathway
GRIN1, 2902     glutamate signaling pathway
GRIN2A, 2903    glutamate signaling pathway
GRIN2B, 2904    glutamate signaling pathway
ADAM10, 102     integrin-mediated signaling pathway
GRM7, 2917      negative regulation of adenylate cyclase activity
LRP1, 4035      negative regulation of Wnt receptor signaling pathway
ADAM10, 102     Notch receptor processing
ASCL1, 429      Notch signaling pathway
HTR2A, 3356     serotonin receptor signaling pathway
ADRB2, 154      transmembrane receptor protein tyrosine kinase activation (dimerization)
PTPRG, 5793     transmembrane receptor protein tyrosine kinase signaling pathway
EPHA4, 2043     transmembrane receptor protein tyrosine kinase signaling pathway
NRTN, 4902      transmembrane receptor protein tyrosine kinase signaling pathway
CTNND1, 1500    Wnt receptor signaling pathway
technical
traditional transfer of copyright agreement
(1) KEGG - Kyoto Encyclopedia of Genes and Genomes
“Non-academic users and Academic users intending to use KEGG for
commercial purposes are requested to obtain a license agreement
through KEGG's exclusive licensing agent, Pathway Solutions, for installation
of KEGG at their sites, for distribution or reselling of KEGG data, for
software development or any other commercial activities that make use of
KEGG, or as end users of any third-party application that requires
downloading of KEGG data or access to KEGG data via the KEGG API.”

(2) HapMap - human genetic variation data
“The click-wrap license was designed as a temporary tool to continue the
practice of providing rapid access to human genome data [...]. One
consequence of the license requirement was that the [...] license
prevented HapMap data from being integrated into major public
databases, which require that data deposited carry no conditions on
use ...” - Wellcome Trust, Sanger, Dec 2004
what companies think we’re doing with the web
2.
   people like stories ...

why Open Access is needed
semantic
agreement
  is hard.
espresso
  coffee
             cafe
                    kopi
                             cafezinho

latte               koffee

           mocha             americano
“choice” or interoperability.

         (pick one)
converge on common names

    “coffee”


    “cafe”              coffee

    “kopi”      http://ontology.foo.org/1234567
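
A minimal sketch of the idea on this slide, assuming a hand-written lookup table: several local names for the same thing are mapped to one shared identifier, so records from different sources can be joined. The URI is the foo.org placeholder from the slide; `LOCAL_NAMES`, `to_common_id` and the lab records are hypothetical names made up for illustration.

```python
from typing import Optional

# Placeholder identifier from the slide, not a real ontology term.
COFFEE_URI = "http://ontology.foo.org/1234567"

# Hypothetical lookup table: local label -> shared identifier.
LOCAL_NAMES = {
    "coffee": COFFEE_URI,
    "cafe": COFFEE_URI,
    "kopi": COFFEE_URI,
}


def to_common_id(label: str) -> Optional[str]:
    """Return the shared identifier for a local label, or None if unmapped."""
    return LOCAL_NAMES.get(label.strip().lower())


# Three sources using three different names for the same thing.
records = [("lab A", "Coffee"), ("lab B", "kopi"), ("lab C", "cafe")]
ids = {to_common_id(name) for _, name in records}
print(ids)  # one identifier shared by all three records
```

Returning None for unmapped labels keeps the gap visible rather than guessing; the hard part (agreeing on the identifier) stays a community problem, not a coding one.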
better answers through better formats:


sources linked by the query:
    MeSH: Pyramidal Neurons
    PubMed: Journal Articles
    Entrez Gene: Genes
    GO: Signal Transduction

select ?gene_name ?process_name
where
{ PropertyValue(?pubmed_record, ?p, mesh:D017966)
    PropertyValue(?article, sc:identified_by_pmid, ?pubmed_record)
    PropertyValue(?gene_record, sc:describes_gene_or_gene_product_mentioned_by, ?article)
    SubClassOf(?protein, some(ro:has_function, some(ro:realized_as, ?process)))
    SubClassOf(?process, or(go:GO_0007166, some(ro:part_of, go:GO_0007166)))
    SubClassOf(?protein, some(sc:is_protein_gene_product_of_dna_described_by, ?gene_record))
    Annotation(?gene_record, rdfs:label, ?gene_name)
    Annotation(?process, rdfs:label, ?process_name)
}
DRD1, 1812      adenylate cyclase activation
ADRB2, 154      adenylate cyclase activation
ADRB2, 154      arrestin mediated desensitization of G-protein coupled receptor protein signaling pathway
DRD1IP, 50632   dopamine receptor signaling pathway
DRD1, 1812      dopamine receptor, adenylate cyclase activating pathway
DRD2, 1813      dopamine receptor, adenylate cyclase inhibiting pathway
GRM7, 2917      G-protein coupled receptor protein signaling pathway
GNG3, 2785      G-protein coupled receptor protein signaling pathway
GNG12, 55970    G-protein coupled receptor protein signaling pathway
DRD2, 1813      G-protein coupled receptor protein signaling pathway
ADRB2, 154      G-protein coupled receptor protein signaling pathway
CALM3, 808      G-protein coupled receptor protein signaling pathway
HTR2A, 3356     G-protein coupled receptor protein signaling pathway
DRD1, 1812      G-protein signaling, coupled to cyclic nucleotide second messenger
SSTR5, 6755     G-protein signaling, coupled to cyclic nucleotide second messenger
MTNR1A, 4543    G-protein signaling, coupled to cyclic nucleotide second messenger
CNR2, 1269      G-protein signaling, coupled to cyclic nucleotide second messenger
HTR6, 3362      G-protein signaling, coupled to cyclic nucleotide second messenger
GRIK2, 2898     glutamate signaling pathway
GRIN1, 2902     glutamate signaling pathway
GRIN2A, 2903    glutamate signaling pathway
GRIN2B, 2904    glutamate signaling pathway
ADAM10, 102     integrin-mediated signaling pathway
GRM7, 2917      negative regulation of adenylate cyclase activity
LRP1, 4035      negative regulation of Wnt receptor signaling pathway
ADAM10, 102     Notch receptor processing
ASCL1, 429      Notch signaling pathway
HTR2A, 3356     serotonin receptor signaling pathway
ADRB2, 154      transmembrane receptor protein tyrosine kinase activation (dimerization)
PTPRG, 5793     transmembrane receptor protein tyrosine kinase signaling pathway
EPHA4, 2043     transmembrane receptor protein tyrosine kinase signaling pathway
NRTN, 4902      transmembrane receptor protein tyrosine kinase signaling pathway
CTNND1, 1500    Wnt receptor signaling pathway
turn ugly query code into a link
http://hcls1.csail.mit.edu:8890/sparql/?query=prefix%20go%3A%20%3Chttp%3A%2F%2Fpurl.org%2Fobo%2Fowl%2FGO%23%3E
%0Aprefix%20rdfs%3A%20%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23%3E%0Aprefix%20owl%3A
%20%3Chttp%3A%2F%2Fwww.w3.org%2F2002%2F07%2Fowl%23%3E%0Aprefix%20mesh%3A%20%3Chttp%3A%2F%2Fpurl.org
%2Fcommons%2Frecord%2Fmesh%2F%3E%0Aprefix%20sc%3A%20%3Chttp%3A%2F%2Fpurl.org%2Fscience%2Fowl
%2Fsciencecommons%2F%3E%0Aprefix%20ro%3A%20%3Chttp%3A%2F%2Fwww.obofoundry.org%2Fro%2Fro.owl%23%3E%0A
%0Aselect%20%3Fgenename%20%3Fprocessname%0Awhere%0A%7B%20%20graph%20%3Chttp%3A%2F%2Fpurl.org
%2Fcommons%2Fhcls%2Fpubmesh%3E%0A%20%20%20%20%20%7B%20%3Fpaper%20%3Fp%20mesh%3AD017966%20.%0A
%20%20%20%20%20%20%20%3Farticle%20sc%3Aidentified_by_pmid%20%3Fpaper.%0A%20%20%20%20%20%20%20%3Fgene
%20sc%3Adescribes_gene_or_gene_product_mentioned_by%20%3Farticle.%0A%20%20%20%20%20%7D%0A%20%20%20graph
%20%3Chttp%3A%2F%2Fpurl.org%2Fcommons%2Fhcls%2Fgoa%3E%0A%20%20%20%20%20%7B%20%3Fprotein%20rdfs
%3AsubClassOf%20%3Fres.%0A%20%20%20%20%20%20%20%3Fres%20owl%3AonProperty%20ro%3Ahas_function.%0A
%20%20%20%20%20%20%20%3Fres%20owl%3AsomeValuesFrom%20%3Fres2.%0A
%20%20%20%20%20%20%20%3Fres2%20owl%3AonProperty%20ro%3Arealized_as.%0A
%20%20%20%20%20%20%20%3Fres2%20owl%3AsomeValuesFrom%20%3Fprocess.%0A%20%20%20graph%20%3Chttp%3A%2F
%2Fpurl.org%2Fcommons%2Fhcls%2F20070416%2Fclassrelations%3E%0A%20%20%20%20%20%7B%7B%3Fprocess%20%3Chttp
%3A%2F%2Fpurl.org%2Fobo%2Fowl%2Fobo%23part_of%3E%20go%3AGO_0007166%7D%0A%20%20%20%20%20%20%20union
%0A%20%20%20%20%20%20%7B%3Fprocess%20rdfs%3AsubClassOf%20go%3AGO_0007166%20%7D%7D%0A
%20%20%20%20%20%20%20%3Fprotein%20rdfs%3AsubClassOf%20%3Fparent.%0A%20%20%20%20%20%20%20%3Fparent
%20owl%3AequivalentClass%20%3Fres3.%0A%20%20%20%20%20%20%20%3Fres3%20owl%3AhasValue%20%3Fgene.%0A
%20%20%20%20%20%20%7D%0A%20%20%20graph%20%3Chttp%3A%2F%2Fpurl.org%2Fcommons%2Fhcls%2Fgene%3E%0A
%20%20%20%20%20%7B%20%3Fgene%20rdfs%3Alabel%20%3Fgenename%20%7D%0A%20%20%20graph%20%3Chttp%3A
%2F%2Fpurl.org%2Fcommons%2Fhcls%2F20070416%3E%0A%20%20%20%20%20%7B%20%3Fprocess%20rdfs%3Alabel
%20%3Fprocessname%7D%0A%7D&format=&maxrows=50
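
A rough sketch of how a query becomes a link like the one above, assuming only Python's standard library. The parameter names (`query`, `format`, `maxrows`) and the hcls1.csail.mit.edu endpoint are read off the slide's URL; the endpoint may no longer resolve, the short query is a simplified stand-in for the full one, and `query_to_link` is a hypothetical helper name.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

# Endpoint as printed on the slide; treat it as historical, it may not be live.
ENDPOINT = "http://hcls1.csail.mit.edu:8890/sparql/"


def query_to_link(query: str, max_rows: int = 50) -> str:
    """Percent-encode a query string and attach it to the endpoint URL."""
    params = {"query": query, "format": "", "maxrows": str(max_rows)}
    # quote_via=quote encodes spaces as %20 (as on the slide) instead of '+'.
    return ENDPOINT + "?" + urlencode(params, quote_via=quote)


# Simplified stand-in for the full query on the slide.
query = "select ?genename where { ?gene rdfs:label ?genename }"
link = query_to_link(query)
print(link)

# Decoding the link recovers the original query text unchanged.
decoded = parse_qs(urlsplit(link).query, keep_blank_values=True)["query"][0]
print(decoded == query)
```

Because the whole question travels inside the URL, the link can be bookmarked, mailed or cited like any other web address.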
3.
the data “rights” conundrum...
Open Access (OA)




          Photo Credit: Peter Jeffs
©
“creative expression”
is it creative?
is it creative?
is it creative?
category errors
the problem of...
   Non-Commercial


   for data
Non-Commercial


what’s a commercial use
   of the data web?
the problem of...
  Share Alike


   for data
1854
the problem of...
   Attribution


   for data
the problem of...
  any license

   for data
database protections based on jurisdiction

              sui generis,
          “sweat of the brow”
            Crown copyright
              moral rights

          the list goes on ....
attribution = license
         citation = norms

which one applies? which is best fit?


 “credit where credit is due”
attribution:
             (legal entity)

   “triggered by making of a copy”
         does it apply to facts?
how to attribute? (papers, ontologies, data)

      “in a manner specified by ...”
           attribution stacking
citation:
(gentle(wo)man’s club)

    legal requirement?
     interoperability?
credit where credit is due
entrenched scientific norm
we shouldn’t use the law to make it
   hard to do the wrong thing ...
<mosquitos><transmit><malaria>


      is it true? can i trust it?
     to what does it connect?
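
One way to picture the statement above is as a subject, predicate, object triple in a small in-memory list. This is only a sketch: the `Triple` type and `connections` helper are hypothetical, and the second statement is invented so that there is something to connect to. It addresses only the last question on the slide; truth and trust need provenance that this toy structure does not carry.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


statements = [
    Triple("mosquitos", "transmit", "malaria"),  # from the slide
    Triple("malaria", "is_a", "disease"),        # illustrative only
]


def connections(node: str, triples: List[Triple]) -> List[Triple]:
    """Return every statement in which `node` is the subject or the object."""
    return [t for t in triples if node in (t.subject, t.obj)]


for t in connections("malaria", statements):
    print(t.subject, t.predicate, t.obj)
```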
need for a legally accurate and
              simple solution

reducing or eliminating the need to make the
       distinction of what’s protected

requires modular, standards-based approach
                  to licensing
calls for data providers to waive all rights
necessary for data extraction and re-use

  requires provider place no additional
    obligations (like share-alike) to limit
              downstream use

 request behavior (like attribution) through
        norms and terms of use
4.
         an example
(and a break from the slides)
5.
 at best, we’re partially right.
at worst, we’re really wrong.
infrastructure for a data web

 the digital commons

law + content + technology +
         community
data without structure and annotation is a
            lost opportunity.

data should flow in an open, public, and
        extensible infrastructure

support recombination and reconfiguration
into computer models, queryable by search
                engine

        treated as public good
resist the temptation to treat
              as property

embrace the potential to treat instead
      as a network resource
the right to fix our mistakes.
(remember Prodigy and AOL?)
thank you.

kaitlin@creativecommons.org
      sciencecommons.org
     creativecommons.org
   slideshare.net/kaythaney
