DATA SCIENCE SOLUTION
FOR RESEARCH
Because science today is complex,
multidisciplinary and multi-stakeholder
Twitter - Facebook - Linkedin
-------------------------
Paris - San Francisco - Luxembourg
© MyScienceWork - 2018
Who are we?
Democratizing science
Innovative
monitoring solution
for research
assessment
Data-oriented solution
for digital asset
management and
analytics
Free discovery tool &
promotion platform for
scientists and research
professionals
568+ Sources
indexed from Open
Institutional Repositories
and Publishers
70 Million
scientific
publications
12 Million
Patents
Strong
Partnerships
MyScienceWork Database
Partners & Figures
BUILDING AN OPEN
REPOSITORY FOR
SCIENTIFIC
INSTITUTIONS TO
ANALYSE AND PROCESS
DATA
The Story
From a research institute's need to the project launch
Institutional
Repository
Interoperability
Data
science
Innovative
technologies
UX/UI
The challenge
Build a data-oriented solution that guarantees sustainable evolution
What issue do we address?
Even with the large amount of data available on the Internet and the rise of the Semantic Web (Web 3.0),
which has seen an incredible amount of work towards standardizing metadata exchange around the world,
institutions still face major challenges in managing their data.
Several standards have been developed concurrently, and it is difficult to design bridges between every
single one.
Moreover, there are often striking discrepancies between solutions that store data in
complex databases, solutions that harvest and ingest new documents, and solutions that visualize them in a
user-friendly way.
The solution:
Data-driven solution for digital asset management, with an analytics dashboard.
Key Success point:
Innovative Technologies
Solution built on innovative & open technologies
DATA SCIENCE
NLP, ML, AI
SEMANTIC SEARCH
ENGINE
Elasticsearch
FRAMEWORK
Node.js
REACTIVE &
RESPONSIVE
Vue.js
FILE MANAGEMENT
Minio
Key Success point: 4 Interfaces
Dedicated user interfaces
RESEARCHERS
❖ Metadata autocompletion (PDF extraction,
CrossRef…)
❖ Right balance between mandatory and optional info
❖ Thesaurus/controlled vocabulary list
❖ Useful tools (bibliographic management tool, extraction…)
CURATORS/LIBRARIANS
❖ Easy to use Publication Review Tool
❖ Automated embargo management
❖ Control of new vocabulary entries within a list…
RESEARCH DIRECTOR
❖ Analytical dashboard (collaboration, research
trends, financial insights…)
❖ Customized reports for evaluation and
communication
❖ Bibliographic management tool
ADMIN.
❖ Well-documented solution (GitHub)
❖ Low level of IT knowledge needed
❖ Easy setup of users and accounts
❖ …
Key Success point: Openness
Open data and interoperability
In a technical environment that evolves constantly and quickly, a solution needs to be built on a
framework that guarantees sustainable evolution. Polaris OS addresses the following
shortcomings of open repositories:
❖ Infrastructure/Integration: facilitate integration between systems and databases
❖ Data reuse: structure, clean and enrich the different data formats in order to
optimize their management and reusability.
❖ New open technologies and standards: use them to build a sustainable solution
Key Success point: Analytics
AI
ML
NLP
DATA
SCIENCE
YANN MAHE
yann.mahe@mysciencework.com
+33(0)6 03 43 64 96
https://www.linkedin.com/in/yann-mahé-8aab4061/
@yannmahe80
THANK YOU!

MyScienceWork @ ConTech London 2018


Editor's Notes

  • #3: MSW aims to make science more open and accessible to all. This is reflected in 3 pillars:
      ❖ Popularization and scientific communication: we work with a network of scientific journalists to popularize and enhance scholarly outputs, and we provide access to scientific publications through our open database of more than 70M publications and patents.
      ❖ Polaris OS: aims to provide research institutes with an expert data management solution.
      ❖ Sirius: Polaris OS goes hand in hand with the data processing expertise we have developed over the past 8 years.
  • #5: Let’s talk about how to build an open repository for scientific institutions to analyze and process data
  • #6: Polaris OS is an open source solution developed in partnership with the French Institute for Demographic Studies (INED). We started the project with them last year. At the time, they wanted an institutional repository solution that could address these specific challenges:
      ❖ Provide their researchers with an easy solution for publication deposit: UX/UI, auto-completion of metadata and useful tools (bibliographic management, pushing publications to other repositories…)
      ❖ Get a solution where they can control the quality of the data produced by their research activities and thus be able to:
          ❖ Increase the visibility of their research outputs in other databases (national or thematic repositories…) and on the web
          ❖ Produce analytics graphs and reports for evaluation on a solid basis
      ❖ Get a sustainable solution:
          ❖ Flexible metadata model...
          ❖ Based on open and innovative technologies (data oriented…)
          ❖ Interoperable with other frameworks and infrastructures
          ❖ Easy environment for new development
      ❖ Be as autonomous as possible in managing the solution (no IT expertise required).
  • #7: When we started to think about the project, we identified the main challenge as having a data-oriented solution that can guarantee sustainable evolution.
  • #8: In order to help institutions overcome technical and budgetary hurdles, we developed an open source repository designed to analyze and process data. This is a major technological breakthrough for improving data management, analyzing research impact, and improving the user experience. Polaris OS integrates into the ecosystem of research institutes by retrieving and exporting all types of data (scientific, technological, financial, and managerial) to and from their existing information systems in order to organize, refine, and enrich it. Polaris OS is a combination of three components:
      ❖ Enterprise Resource Planning:
          ❖ Integrate external streams: the ability to interface the solution with external sources coming from our clients' IT departments.
          ❖ Aggregate data under a common model: the ability to gather data scattered across various databases.
          ❖ Roles & users management: defining fine-grained access to the platform on both the back office and the front office.
      ❖ Extract Transform Load:
          ❖ Pipelines handle streams of data. A pipeline is a succession of functions used to format, complete, filter, transform and validate an input.
          ❖ The pipelines are extremely flexible and give access to a wide range of transformations and completions. Complex functions can be designed using transducers (combinations of simple functions that always output the same result given the same input).
          ❖ Data can be retrieved over a wide range of protocols, including ODBC (connection to relational databases), SOAP-XML (used primarily with Java and Hibernate), REST (the standard way to exchange data on the internet nowadays) and SFTP (secure file transfer protocol).
      ❖ Content Management System:
          ❖ Templates describe the way a page looks (does it have a header, a footer, a horizontal or vertical menu, …).
          ❖ Menus can be changed (items can be added, moved, removed, …).
          ❖ A widget is a basic element that has a unique purpose: search, browse, showing an image, a text, and so on.
          ❖ Widgets are placed on a 12-column grid, based on the standard CSS frameworks used by web designers and integrators.
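To make the pipeline idea in this note concrete, here is a minimal sketch of a succession of small functions that format, complete and filter records. It is written in Python for brevity and is not the Polaris OS implementation (which the slides describe as built on Node.js). The step names, the record fields and the "und" default are all hypothetical, and the composition uses plain generators rather than true transducers.

```python
from functools import reduce


def format_title(records):
    """Format step: trim whitespace around the title (hypothetical field)."""
    for r in records:
        yield {**r, "title": r.get("title", "").strip()}


def complete_language(records):
    """Completion step: add a default language when none is given."""
    for r in records:
        yield {"language": "und", **r}


def keep_titled(records):
    """Filter/validation step: drop records whose title is empty."""
    return (r for r in records if r["title"])


def pipeline(*steps):
    """Compose steps left to right into one function over a stream."""
    return lambda records: reduce(lambda acc, step: step(acc), steps, records)


ingest = pipeline(format_title, complete_language, keep_titled)

raw = [
    {"title": "  Open repositories  ", "language": "en"},
    {"title": "   "},
    {"title": "Données ouvertes"},
]
result = list(ingest(raw))
```

Each step gives the same output for the same input, so steps can be reordered, removed or tested in isolation, which is the flexibility the note attributes to pipelines.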
  • #9: I would like to emphasize a few points that make Polaris OS unique. First of all, we chose the technologies needed to build the solution very carefully, with 2 ideas in mind: they must be open and innovative (data oriented). As examples, we chose:
      ❖ Vue.js: one of the most promising technologies for developing reactive interfaces
      ❖ Elasticsearch: in our view, currently the best search engine for data
      ❖ Node.js: framework…
  • #10: Second point: interfaces have to be designed for all types of users. In our case, we identified 4 types of users and therefore set up 4 customized interfaces.
  • #11: Get a solution that gives the institution more visibility and that can be fully integrated into the customer’s systems.
      ❖ Incoming data flow
          ❖ Be able to integrate all kinds of sources
          ❖ Flexible metadata model (no predefined data model)
      ❖ Outgoing data flow
          ❖ Open/push it
          ❖ Be able to integrate all kinds of sources
          ❖ Flexible metadata model
          ❖ SEO compliant (Google, Google Scholar)
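One way to picture an incoming data flow with no predefined data model is a per-source field mapping that renames known fields and keeps everything else. The sketch below is only an illustration in Python: the source names, the field names and the "extra" bucket are assumptions, not the Polaris OS metadata model.

```python
# Hypothetical per-source mappings: source field name -> common field name.
MAPPINGS = {
    "harvester": {"DOI": "doi", "title": "title"},
    "local_db": {"doc_title": "title", "doc_doi": "doi"},
}


def to_common(source, record):
    """Rename mapped fields; keep unmapped ones under 'extra' so nothing is lost."""
    mapping = MAPPINGS[source]
    common = {"source": source, "extra": {}}
    for field, value in record.items():
        if field in mapping:
            common[mapping[field]] = value
        else:
            common["extra"][field] = value
    return common


a = to_common("harvester", {"DOI": "10.1000/xyz", "title": "A study", "publisher": "ACME"})
b = to_common("local_db", {"doc_title": "A study", "doc_doi": "10.1000/xyz"})
```

Both records end up with the same `doi` and `title` keys, while the unmapped `publisher` field survives under `extra`. Supporting a new source then only requires a new mapping entry.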
  • #12: Last but not least, once your data is cleaned and structured, you are ready to use a very powerful "Analytics Dashboard":
      ❖ Productivity metrics & impact metrics
      ❖ Collaboration & reference graphs
      ❖ Data visualization for strategic intelligence
    The system can also export reports with a customized or pre-defined cover. With your complete database, we will be able to create your own analytics reports, for example:
      ❖ Average number of funded studies by country, by type of funding, by type of collaboration
      ❖ Trends in research project topics (by country, type of funding…)
      ❖ Fields of the studies (based on your “research categories”)
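As a rough illustration of the kind of report mentioned in this note, the sketch below counts funded studies by country and funding type, then averages the number of studies per year for each country. The records are made-up sample data, and averaging over years is an assumption, since the note does not say what the average is taken over.

```python
from collections import Counter, defaultdict

# Made-up sample records; the field names are hypothetical.
studies = [
    {"country": "FR", "funding": "public", "year": 2016},
    {"country": "FR", "funding": "public", "year": 2017},
    {"country": "FR", "funding": "private", "year": 2017},
    {"country": "LU", "funding": "public", "year": 2017},
]

# Number of funded studies per (country, funding type).
counts = Counter((s["country"], s["funding"]) for s in studies)

# Average number of funded studies per year, by country.
per_country_year = defaultdict(Counter)
for s in studies:
    per_country_year[s["country"]][s["year"]] += 1

average_per_year = {
    country: sum(by_year.values()) / len(by_year)
    for country, by_year in per_country_year.items()
}
```

The same grouping pattern extends to type of collaboration or research category by changing the key used in the counters.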