Susan Reilly
Executive Director, LIBER
Tensions between Intellectual
Property and Knowledge Discovery
in the Digital Age
LIBER: Information Infrastructure for
World Class Research?
• Collaborative
– Growth in collaboration from 13% (2003) to 17% (2011)
• International
– 40% of French & German research outputs a result of
international collaboration
– Rate of citation grows as geographic extent of collaboration
increases
• Interdisciplinary
– Foundation of frontiers research
• Data intensive
– Supports interdisciplinary exploration
• … and open
Higgs boson
2012 Physics Letters B paper:
6235 citations
Overview
• What is knowledge discovery?
• Barriers
• Principles
• Actions
What is knowledge
discovery?
Text & Data Mining is the future
“Text and data mining (TDM) is the process of deriving
information from machine-read material. It works by
copying large quantities of material, extracting the data,
and recombining it to identify patterns.” JISC
Content
Mining
Data
Analytics
Data
Mining
Knowledge Discovery
• Ultimate goal is to extract high level knowledge
from low level data
• Allows analysis across disciplines
• “Undiscovered public knowledge” (Swanson)
• Identifies patterns in the data to produce new
knowledge
• It’s not a new thing; digital information just
makes it a whole lot more powerful and relevant!
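
One way to picture Swanson's "undiscovered public knowledge" is as a search for two concepts that never appear together in any paper but share a bridging concept. The sketch below is illustrative only: the `PAPERS` records are made-up placeholders loosely modelled on the fish oil and Raynaud's disease example from the speaker notes, and the function names are hypothetical, not taken from any TDM tool.

```python
from collections import defaultdict

# Hypothetical toy "literature": each record is the set of concepts one paper links.
PAPERS = [
    {"fish oil", "blood viscosity"},
    {"blood viscosity", "Raynaud's disease"},
    {"fish oil", "triglycerides"},
]


def build_links(papers):
    """Map each concept to the set of concepts it co-occurs with in some paper."""
    links = defaultdict(set)
    for concepts in papers:
        for concept in concepts:
            links[concept].update(concepts - {concept})
    return links


def hidden_connections(papers, start):
    """Return {target: bridging concepts} for targets never mentioned with `start`."""
    links = build_links(papers)
    direct = set(links[start])
    found = defaultdict(set)
    for bridge in direct:
        for target in links[bridge]:
            # Keep only targets that no single paper links to `start` directly.
            if target != start and target not in direct:
                found[target].add(bridge)
    return dict(found)


if __name__ == "__main__":
    for target, bridges in hidden_connections(PAPERS, "fish oil").items():
        print(target, "via", sorted(bridges))
```

With real collections the concept sets would come from mined text rather than hand-written records, which is where legal access to machine-readable content matters.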
1. Deeper discovery: topic modelling
Human Computers (1901)
http://marlowe-shakespeare.blogspot.nl/2009/02/on-mendenhall-and-compelling-evidence.html
"This above all:
to thine own self
be true".
4 5 3 2 5 3 4 2 4
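
The counting that the human computers did by hand in 1901 can be sketched in a few lines. This is a minimal illustration, not a reconstruction of Mendenhall's procedure: the function names are hypothetical, and a word is assumed to be any run of ASCII letters, so contractions and hyphenated words would be split.

```python
import re
from collections import Counter


def word_lengths(text):
    """Return the length of each word in order, ignoring punctuation."""
    return [len(word) for word in re.findall(r"[A-Za-z]+", text)]


def length_profile(text):
    """Return {word length: share of words with that length}."""
    lengths = word_lengths(text)
    counts = Counter(lengths)
    return {n: counts[n] / len(lengths) for n in sorted(counts)}


quote = "This above all: to thine own self be true"
print(word_lengths(quote))    # [4, 5, 3, 2, 5, 3, 4, 2, 4]
print(length_profile(quote))  # share of 2-, 3-, 4- and 5-letter words
```

Run over a whole body of work, the profile gives the kind of word-length frequency curve that can be compared between authors.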
Use of words (2009)
Marsden J, Budden D, Craig H, Moscato P (2013) Language Individuation and Marker Words:
Shakespeare and His Maxwell's Demon. PLoS ONE 8(6): e66813.
doi:10.1371/journal.pone.0066813
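
The idea of a marker word that an author underuses can be illustrated with a simple frequency ratio. The sketch below is an assumption-laden stand-in, not the scoring method of Marsden et al.: the two texts are invented, the function names are hypothetical, and the marker list only borrows the words mentioned in the speaker notes.

```python
import re
from collections import Counter


def relative_freqs(text):
    """Return {word: share of all words in the text}."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    return {word: count / len(words) for word, count in counts.items()}


def usage_ratio(author_text, reference_text, markers):
    """Ratio below 1 means the author underuses the word relative to the reference."""
    author = relative_freqs(author_text)
    reference = relative_freqs(reference_text)
    return {
        word: author.get(word, 0.0) / reference[word]
        for word in markers
        if word in reference  # skip words the reference never uses
    }


# Invented toy texts, for illustration only.
AUTHOR = "now we go to the hall and now we rest"
REFERENCE = "all of ye go to the hall and all of ye rest now"

for word, ratio in usage_ratio(AUTHOR, REFERENCE, ["all", "to", "now", "ye"]).items():
    print(f"{word}: {ratio:.2f}")
```

A ratio of zero for "all" and "ye" flags words the toy author avoids entirely, which is the kind of signal an underuse-based marker relies on.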
Cancer diagnosis (2013)
• http://theconversation.com/shakespeare-and-cancer-diagnoses-how-bard-can-it-be-15381
Data sets
2. Saving lives
3. Increasing transparency…
• Sentiment analysis: attitudes towards
women
• Tone analysis and vocabulary associated
with women
• Women parliamentarians in debate
• Data from NL, UK, Canada
e.g. sentiment analysis
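
A very simple form of sentiment analysis scores each sentence against a word list and averages the scores for sentences that mention a keyword. The sketch below assumes a tiny hand-made lexicon and invented sentences; the names `LEXICON`, `sentiment_score` and `mean_sentiment_about` are hypothetical, and real studies of parliamentary debate would use far larger, validated resources.

```python
import re

# Hypothetical mini-lexicon: +1 for positive words, -1 for negative words.
LEXICON = {"excellent": 1, "strong": 1, "clear": 1, "weak": -1, "poor": -1, "confused": -1}

# Invented sentences standing in for debate transcripts.
SENTENCES = [
    "She made a strong and clear argument.",
    "She gave a confused answer.",
    "She gave an excellent reply.",
    "The budget is weak.",
]


def tokens(sentence):
    """Lower-case the sentence and split it into runs of letters."""
    return re.findall(r"[a-z]+", sentence.lower())


def sentiment_score(sentence, lexicon=LEXICON):
    """Average lexicon score of the scored words; 0.0 if none are in the lexicon."""
    scored = [lexicon[word] for word in tokens(sentence) if word in lexicon]
    return sum(scored) / len(scored) if scored else 0.0


def mean_sentiment_about(sentences, keyword):
    """Mean sentence score over sentences containing `keyword`, or None if no match."""
    hits = [s for s in sentences if keyword in tokens(s)]
    if not hits:
        return None
    return sum(sentiment_score(s) for s in hits) / len(hits)


print(mean_sentiment_about(SENTENCES, "she"))
```

Comparing such averages across speakers, countries or decades is one way the attitudes described above could be tracked, provided the underlying records can be mined.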
4. Cultural insight
5. Economics & Competitiveness
(Europe)
• TDM potentially worth 5.3 billion euro a year to European
research budget (2%)
• Knock-on effect would be a minimum of 32.5 billion euro
increase in GDP
• US responsible for over half of
the articles and patents on TDM
– 1,100 US patents compared to 39
EU patents by 2013
• Non-English-speaking countries
falling behind
Barriers
• Lack of legal clarity
• Licences
• TPMs
Does copyright apply?
• We are not seeking free access!
• We are not seeking to steal content.
• We are seeking to convert content that we have legal
access to into machine readable format so that our
computers can ‘read’ the content and extract facts, data
and ideas.
“To thine own
self be true”
Are licences the solution?
• Scalability
• Intellectual privacy
• Transparency
• Access
Case Study:
Large European research
university
• 10 additional library staff for
monitoring and compliance
• Cost of academic time spent
complying with publisher licence
requirements: €0.68 million
What about open access?
• Interoperability
• CC BY and CC0
• Formats
• Open infrastructure
Principles?
Elsevier TDM Policy
• Access through API only
• Text only: no images or tables
• Researchers must register details
• Click-through licence
• Terms can change any time
• Reproducibility of results
1. INTELLECTUAL PROPERTY WAS NOT
DESIGNED TO REGULATE THE FREE
FLOW OF FACTS, DATA AND IDEAS,
BUT HAS AS A KEY OBJECTIVE THE
PROMOTION OF RESEARCH ACTIVITY
2. PEOPLE SHOULD HAVE THE
FREEDOM TO ANALYSE AND PURSUE
INTELLECTUAL CURIOSITY WITHOUT
FEAR OF MONITORING OR
REPERCUSSIONS
3. LICENSES AND CONTRACT TERMS
SHOULD NOT RESTRICT INDIVIDUALS
FROM USING FACTS, DATA AND IDEAS
4. ETHICS AROUND THE USE OF
CONTENT MINING TECHNIQUES WILL
NEED TO CONTINUE TO EVOLVE IN
RESPONSE TO CHANGING
TECHNOLOGY
5. INNOVATION AND COMMERCIAL
RESEARCH BASED ON THE USE OF
FACTS, DATA, AND IDEAS SHOULD NOT
BE RESTRICTED BY INTELLECTUAL
PROPERTY LAW
Actions
What is LIBER doing? Advocacy
“A mandatory pan-European
exception for text and data
mining (and analogous
activities), which cannot be
overridden by contract, and
is not limited to non-
commercial activity.”
Photo: Howard Lake
https://www.flickr.com/photos/howardlake/5540462170
What is LIBER doing? OpenMinTeD
• Open public infrastructure for text and data mining
• Primary content is accessible through standardised
programmatic interfaces and access rules
• Well-documented and easily discoverable text mining
services and workflows which process, analyse and
annotate text to:
– identify patterns and extract new, meaningful, actionable
knowledge
– structure, index and search content
– act, in tandem, as a new knowledge resource for
drawing new relations between content items and starting a
new mining cycle
What can you do?
• Advocate for knowledge discovery friendly policy
• Increase accessibility of digital collections
• Use open standards
• Support collaboration and innovation: become a
laboratory!
• Provide infrastructure for storage of and access
to datasets and algorithms
• Plug into OpenMinTeD!
Thank You!
Any questions?
@skreilly
www.thehaguedeclaration.com
www.openminted.eu
www.libereurope.eu


Editor's Notes

  • #3: Top 2 cited papers in high energy physics in 2013
  • #6: Text and data mining (TDM) is the process of deriving information from machine-read material. It works by copying large quantities of material, extracting the data, and recombining it to identify patterns. TDM is essentially another method of reading, done by the computer rather than the human eye. It is a natural next step for the research process, as more and more content is electronic. For libraries this means that researchers are able to extract more value from our vast collections, both born-digital and digitised. I’d like to show you some examples of the added value of TDM.
  • #7: In 1986 the information scientist Don Swanson coined the term “undiscovered public knowledge”. By “knowledge” he meant the products of human intellectual activity as encoded in the public record (that is, externally manifested). The challenge was to overcome the “unruly problem of meaning” and to connect the outputs of different disciplines in order to draw inferences. The classic example of Swanson’s theory is the link he made between Raynaud’s disease and fish oil. He connected one area of research, on the health benefits of fish oil, which was found to thin the blood, with research on Raynaud’s, which identified high blood viscosity as one of the symptoms of the disease. From this link Swanson postulated that fish oil could help treat Raynaud’s. He was correct. Implicit
  • #9: Physicist T.C. Mendenhall hired two women to count the lengths of words in Shakespeare’s works. An author’s word-length frequency curve remains consistent, which offers a way to ascertain authenticity. Unlike most English authors, Shakespeare used more four-letter words than three-letter words. The curve showed no correlation with Bacon’s but, as was discovered years later, Shakespeare was as similar to Christopher Marlowe (another Elizabethan playwright and poet) as he was to himself.
  • #10: A marker of Shakespeare’s writing is his comparative underuse of certain words and his selection of words. The authors identified the 20 most used words across the body of literature of the period. Their new scoring of markers is based not just on the use of words but also on their underuse. These are excellent methods for identifying markers in large datasets: fluctuations of the observed frequencies of words such as “all”, “to” (infinitive), “now” and “ye”.
  • #11: A biomarker could be, for example, an elevated enzyme level. Biomarkers help to individualise treatment, and panels of biomarkers are more effective, increasing sensitivity and specificity. This is an application of the scoring method developed previously, dealing with mislabelled samples, significant outliers and so on in big data.
  • #12: 42 proteins were identified as interacting with muscular dystrophy; 7 were already known. There were a few false positives; others were verified. Protein-protein interactions (PPIs) can be targeted with drugs to inhibit interaction, allowing more targeted, less invasive treatment.
  • #13: Data can be enhanced using natural language processing and linked data. For example, in the Netherlands a project called Expose (exploratory political search) converts the parliamentary records (since the 1500s) into XML and publishes them as linked open data, linking them to related collections such as the KB’s newspaper collection.
  • #15: Developed as an antidote to the constant measuring of economic indicators to assess how a country is doing; it measures happiness instead. It crawls Twitter daily. Data is available via an API, and the database of words can be downloaded.
  • #19: Strikes at the heart of what libraries do.
  • #20: Will Greenacre
  • #21: Will Greenacre
  • #24: The free flow of information and ideas is an essential human right. It is a catalyst for the production of human knowledge, which underpins welfare and prosperity. Societies around the world have chosen to protect certain limited rights in intellectual property as incentives both to innovation and the dissemination of knowledge. Intellectual property law was never intended to cover facts, ideas and pure data. However the modern application of intellectual property law is increasingly becoming an obstacle to knowledge creation and dissemination that use even these most simple building blocks of knowledge. In some countries, copyright law in particular has been interpreted to restrict the ability to apply computer reading and analysis to otherwise legally-available content. Other legislative frameworks such as patent law and database law may have a similar impact. When intellectual property law allows content to be read and analysed manually by humans but not by their machines, it has failed its original purposes.
  • #25: Providers of content should respect the intellectual privacy of individual readers and should take measures to protect readers’ privacy from interference by any external body. Any exception, which for example would result in an encroachment of individual privacy, will need to be necessary and proportionate and provided for by law. The use of facts, data and ideas must not prejudice the legitimate rights of individuals to privacy and a private life.
  • #26: Generally, licences and contract terms that regulate and restrict how individuals may analyse and use facts, data and ideas are unacceptable and inhibit innovation and the creation of new knowledge and, therefore, should not be adopted. Similarly, it is unacceptable that technical measures in digital rights management systems should inhibit the lawful right to perform content mining.
  • #27: The observation of well-established ethical norms in research and business, as well as the continued development of such standards and laws, must be supported and encouraged in order to ensure that content mining technologies are deployed for the benefit of society.
  • #28: As facts, data, and ideas are not copyrightable it does not make sense to restrict ethical commercial use of those facts, data, and ideas extracted from content which has been obtained legally. It is recognised that while patent law is designed to protect innovations and inventions, this is not meant to encompass facts and data. Restrictions on the use of facts, data and ideas can have a serious impact on innovation and on economic development globally. It can also reduce the ability to use tools and processes which can benefit citizens in the areas of health, science, employment, research, the environment and culture.