OpenDataMonitor
Horizon 2020
Coordination and Support Action
GARRI-3-2014 Scientific Information in the Digital Age: Text and Data Mining (TDM)
Project number: 665940
Increasing Uptake of Text and Data Mining in the EU
FutureTDM
Reducing Barriers and Increasing Uptake of Text and Data Mining for Research Environments
using a Collaborative Knowledge and Open Information Approach
Brian Hole, Ubiquity Press, Open Repositories workshop, Brisbane, Australia
27 June 2017
Workshop overview
• Introduction
  • Defining Text and Data Mining (Brian Hole)
  • An overview of the FTDM project and its aims (Brian Hole)
  • Results of the FTDM project (Freyja van den Boom)
  • Related projects: OpenMinted and CORE (Petr Knoth)
• Discussion of workshop expectations
• Hands on TDM (Petr Knoth)
Tea
• Experimenting with TDM
• Q&A
• Wrap up (Freyja van den Boom)
What is TDM?
“the discovery by computer of new, previously
unknown information, by automatically extracting
and relating information from different (…) resources,
to reveal otherwise hidden meanings” (Hearst, 1999)
• 16 trillion gigabytes of data by 2020 (236% growth)
• Doubles every 2 years (Moore’s Law, 1965)
• Over 80% of EU citizens have internet access (Eurostat 2014)
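As a rough illustration of "extracting and relating information from different resources", the sketch below counts which terms appear together in the same document and then looks for pairs of terms that never co-occur but share a common neighbour. The documents, the term list and the function names (extract_terms, cooccurrences, indirect_links) are all invented for this example; it is a toy under those assumptions, not a method used by the FutureTDM project.

```python
import re
from collections import Counter
from itertools import combinations

# Toy documents invented for illustration; not real research data.
DOCUMENTS = [
    "Fish oil reduces blood viscosity in patients.",
    "High blood viscosity is associated with Raynaud syndrome.",
    "Fish oil supplements are widely available.",
]

# Hypothetical vocabulary of terms to look for.
TERMS = ["fish oil", "blood viscosity", "raynaud syndrome"]


def extract_terms(text, terms=TERMS):
    """Return the set of known terms that occur in the text (case-insensitive)."""
    lowered = text.lower()
    return {t for t in terms if re.search(r"\b" + re.escape(t) + r"\b", lowered)}


def cooccurrences(documents):
    """Count how often each pair of terms appears in the same document."""
    counts = Counter()
    for doc in documents:
        found = sorted(extract_terms(doc))
        counts.update(combinations(found, 2))
    return counts


def indirect_links(counts):
    """Find term pairs never seen together but linked through a shared third term."""
    neighbours = {}
    for a, b in counts:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    links = set()
    for bridge, linked in neighbours.items():
        for a, b in combinations(sorted(linked), 2):
            if (a, b) not in counts:
                links.add((a, b, bridge))
    return links


if __name__ == "__main__":
    pairs = cooccurrences(DOCUMENTS)
    for a, b, bridge in sorted(indirect_links(pairs)):
        print(f"{a} <-> {b} (via {bridge})")
```

Real TDM pipelines replace the fixed term list with entity recognition and work over far larger collections, but the step of relating items found in separate sources has the same shape.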
Potential of TDM
• Addressing grand challenges such as climate change and global
epidemics
• Improving population health, wealth and development
• Creating new jobs and employment
• Exponentially increasing the speed and progress of science through new
insights and greater efficiency of research
• Increasing transparency of governments and their actions
• Fostering innovation and collaboration and boosting the impact of open
science
• Creating tools for education and research
• Providing new and richer cultural insights
• Speeding economic and social development in all parts of the globe
(The Hague Declaration on Knowledge Discovery)
The challenge
TDM is not a homogeneous, self-contained
scientific domain, but rather a diverse and
complex set of methods and technologies
deployed in the framework of diverse disciplines
and business activities.
FutureTDM - the opportunity
The FutureTDM project seeks to improve uptake of text and data mining
(TDM) in the EU by actively engaging with stakeholders such as
researchers, developers, publishers and SMEs.
The use of content mining is significantly lower in Europe
than in some American and Asian countries.
The partners in the FutureTDM consortium share the ambition behind the
EC’s call to develop policy and legal frameworks that reduce the barriers to
TDM uptake and, with it, promote awareness of TDM opportunities
across Europe.
Main Objectives of FutureTDM
• ELABORATE a legal and policy framework for future TDM, define policy priorities, specify a research agenda to foster the spread of TDM in various research fields within the EU
• BUILD a Collaborative Knowledge Base and an Open Information Hub combined on a web-based platform including intuitive tools
• ANALYSE current application areas and trends in TDM including statistics and key figures, collect relevant research and industrial projects and best practices
• ASSESS existing studies, legal regulations and policies on TDM within the European Union
• INVOLVE all key stakeholders to identify practices, requirements, and specific challenges in the field of TDM
• INCREASE awareness of TDM to attract new target groups and science domains
Impact
• Remove existing legal, technological and skill barriers that prevent TDM technology from being adopted within the EU.
• Increase awareness about the social, economic and scientific benefits of TDM.
• Increase the Union’s competitiveness with other high-tech economies (such as Japan, South Korea and the US) by enhancing TDM adoption.
• Foster the adoption of TDM in science and the economy.
• Lead to Research & Innovation policy that is more relevant and responsive to society.
TDM from a publisher’s experience
We were involved in the “Licenses for Europe”
consultation by the European Commission in 2013.
Legacy publishers present were lobbying against free access to TDM.
As an unconditionally open publisher, Ubiquity Press is committed to making content
available in all forms, for consumption by anyone and for any purpose. Allowing TDM on
our platform is therefore standard practice, and we fully encourage it.
Our position was that copyright reform with a clear exception for TDM was the best way
to go forward, and that we would not support a licensing solution.
The EC backed down from imposing a licensing solution through legislation.
The EC committed to taking all positions into account when reviewing the EU Copyright
framework.
TDM from a publisher’s experience
Features under development
• Automatic download of all article XML, or a subset
• Automatic deposit of XML to repositories
• Journal- and press-wide TDM resources, e.g. tool profiles
• Integration with TDM tools, e.g. ContentMine QuickScrape
• Investigating features to track TDM usage and use cases
• Investigating hosting of managed, TDM-optimized Hydra/Samvera repositories
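To show why downloadable article XML matters for TDM, here is a minimal sketch of pulling the title, abstract and body paragraphs out of one article record using only the Python standard library. The sample record is invented, and a JATS-style element layout (article-title, abstract, body, p) is an assumption; the slides do not say which schema the publisher's XML follows, and extract_article is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Invented sample record; a JATS-style layout is assumed for illustration.
SAMPLE_XML = """<article>
  <front>
    <article-meta>
      <title-group>
        <article-title>An Example Article</article-title>
      </title-group>
      <abstract><p>Text mining finds patterns in large collections.</p></abstract>
    </article-meta>
  </front>
  <body>
    <sec><title>Introduction</title><p>First paragraph.</p><p>Second paragraph.</p></sec>
  </body>
</article>"""


def _clean(element):
    """Join all text inside an element and collapse runs of whitespace."""
    return " ".join("".join(element.itertext()).split())


def extract_article(xml_text):
    """Return the title, abstract and body paragraphs of one article record."""
    root = ET.fromstring(xml_text)
    title = root.findtext(".//article-title", default="").strip()
    abstract_el = root.find(".//abstract")
    abstract = _clean(abstract_el) if abstract_el is not None else ""
    paragraphs = [_clean(p) for p in root.findall("./body//p")]
    return {"title": title, "abstract": abstract, "paragraphs": paragraphs}


if __name__ == "__main__":
    record = extract_article(SAMPLE_XML)
    print(record["title"])
    print(len(record["paragraphs"]), "body paragraphs")
```

Structured XML lets a miner address the title, abstract and sections directly, which is much more reliable than scraping the same text from rendered HTML or PDF.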
Any Questions?
All slides will be on the FutureTDM SlideShare after the event
#FutureTDM

Editor's Notes

  • #4:
    ● The number of wireless sensors and actuators worldwide has exceeded 24 million, an increase of 553% between 2011 and 2016.
    ● By 2020 there will be more than 16 zettabytes of useful data (16 trillion GB).
    ● YouTube reports that 24 hours of video are uploaded every minute, making the site a hugely significant data aggregator.
    ● “Every second, on average, around 6,000 tweets are tweeted on Twitter, which corresponds to over 350,000 tweets sent per minute, 500 million tweets per day and around 200 billion tweets per year”.
    ● 74,200,000 pages exist on Facebook, with 7 million apps and websites integrated with Facebook as of 30/5/2016.
    ● Over 1 billion websites and 3.36 billion internet users as of 11 May 2016.
    ● On average a new scientific article is published every 30 seconds.
    ● 60,000 publications exist in the literature on a single gene, p53.
  • #5: Instead of reading through these maybe pick out one or two and give examples of TDM in practice?