The NIH Data Commons
Digital Ecosystems for using and sharing FAIR Data
Vivien Bonazzi, Ph.D.
Senior Advisor for Data Science
Office of Data Science (ADDS)
National Institutes of Health
The Data Commons
is a platform
that fosters the development of a digital ecosystem
That digital ecosystem allows
transactions to occur on FAIR data
at scale
Data Commons is a Platform that fosters development of a digital Ecosystem
Treats products of research (data, software, methods, papers, etc.) as digital assets (objects)
Digital objects need to conform to FAIR principles
- Findable, Accessible, Interoperable, Reusable
Digital objects exist in a shared virtual space (initial)
- Find, deposit, manage, share and reuse digital assets
Enables interactions between producers and consumers of digital assets
Gives currency to digital assets and the people who develop and support them
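As a rough illustration of the points above, a digital object can be modeled as a record whose fields map onto the FAIR principles (in their standard wording: Findable, Accessible, Interoperable, Reusable). This is a minimal sketch, not the Commons data model: the `DigitalObject` class, its field names, and the checks in `fair_gaps` are all hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DigitalObject:
    """Hypothetical record for a research product (data, software, method, paper)."""
    uid: str = ""          # persistent identifier (Findable)
    kind: str = "data"     # data, software, method, paper, ...
    access_url: str = ""   # where the object can be retrieved (Accessible)
    media_type: str = ""   # open, documented format (Interoperable)
    license: str = ""      # terms of reuse (Reusable)
    metadata: dict = field(default_factory=dict)  # descriptive metadata (Findable)


def fair_gaps(obj: DigitalObject) -> list[str]:
    """Return the FAIR principles for which this sketch finds no supporting field."""
    gaps = []
    if not obj.uid or not obj.metadata:
        gaps.append("Findable")
    if not obj.access_url:
        gaps.append("Accessible")
    if not obj.media_type:
        gaps.append("Interoperable")
    if not obj.license:
        gaps.append("Reusable")
    return gaps
```

A real assessment of FAIR conformance would look at far more than whether four fields are filled in; the point here is only that each principle can be tied to something checkable on the object.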
To understand the
Data Commons Platform
(and how it works for biomedical data)
we need to use a Platform stack
to help visualize the concept
NIH Data Commons - Platform Stack
https://datascience.nih.gov/commons
Digital Market Place, Bazaar, Community
Sangeet Paul Choudary – Platform Scale
Network/Community
Market Place
Technology
Data
NIH Data Commons Pilots
Current Data Commons Pilots
Reference Data Sets
Commons Stack Pilots
Cloud Credit Model
Resource Search & Index
• Explore feasibility of the Commons Platform (FW)
• Provide data objects to populate the Commons
• Facilitate collaboration and interoperability
• Provide access to cloud (IaaS) and PaaS/SaaS via credits
• Connecting credits to NIH grants
• Making large and/or high-value NIH-funded data sets and tools accessible in the cloud
• Developing data and software indexing methods
• Leveraging BD2K efforts (bioCADDIE et al.)
• Collaborating with external groups
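The Resource Search & Index pilot can be pictured as building a searchable index over metadata that describes data sets and software. The sketch below is a toy inverted index, not the bioCADDIE approach or any NIH implementation; the record fields (`uid`, `title`, `keywords`) and both function names are hypothetical.

```python
from collections import defaultdict


def build_index(records):
    """Map each lowercase word in a record's title and keywords to the set of record UIDs."""
    index = defaultdict(set)
    for rec in records:
        text = " ".join([rec["title"], *rec.get("keywords", [])])
        for word in text.lower().split():
            index[word].add(rec["uid"])
    return index


def search(index, query):
    """Return the sorted UIDs of records that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return []
    # A record matches only if it appears under all query words.
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)
```

Indexing data and software through the same function reflects the idea that both are digital objects described by metadata, so a single search can return either.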
NIH Data Commons - Note: Presentation has animations
Data Commons Pilot – connecting the pieces
Co-location of large and/or highly utilized NIH-funded data on the cloud, plus commonly used tools for analyzing and sharing digital objects, to create an interoperable resource for the research community.
Investigators will be able to collaborate and share digital objects within this environment and connect with others.
An NIH Wide Data Commons Pilot
[Diagram built up over several animated slides. Labels: Data Lake; Indexing; New large data projects; Messy data; Data Pond; Authorization/authentication layer]
Considerations
Metrics - understanding and accounting of data usage patterns
Cost - Cloud Storage, pay for use cloud compute (NIH credits)
Hybrid Clouds – Mix of research and commercial clouds
Connecting - Interoperability with other Commons, clouds
Consent - Reconsenting data, Dynamic consents
Standards – Metadata, UIDs, APIs
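The Metrics and Cost considerations can be tied together in one small ledger that deducts credits for pay-for-use compute and counts how often each digital object is used. This is a sketch under stated assumptions, not the NIH cloud credit model: the `CreditAccount` class, its method names, and the integer credit units are hypothetical.

```python
from collections import Counter


class CreditAccount:
    """Hypothetical ledger: an investigator's cloud credits and per-object usage counts."""

    def __init__(self, credits):
        self.credits = credits
        self.usage = Counter()  # object UID -> number of charged uses

    def charge(self, object_uid, cost):
        """Deduct `cost` credits for one use of `object_uid`; refuse if the balance is too low."""
        if cost < 0:
            raise ValueError("cost must be non-negative")
        if cost > self.credits:
            raise ValueError("insufficient credits")
        self.credits -= cost
        self.usage[object_uid] += 1
        return self.credits
```

Recording usage at the same point where credits are spent means usage patterns and cost can be reported from one source.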
A digital economy is
characterized by making
data a central currency
to gain a business advantage
Organizations that are not born
digital will be at a disadvantage
in the new economy
Thank you
• ADDS Office
- Phil Bourne, Michelle Dunn, Jennie Larkin, Mark Guyer, Sonynka Ngosso
• NCBI: George Komatsoulis
• NHGRI: Valentina di Francesco
• NIGMS: Susan Gregurik
• CIT: Andrea Norris, Debbie Sinmao
• NCI: Warren Kibbe, Tony Kerlavage, Tanja Davidsen, Ian Fore
• NIAID: JJ McGowan, Nick Weber, Darrell Hurt, Maria Giovanni, Alison Yao
• The NIH Common Fund: Betsy Wilder, Jim Anderson, Leslie Derr
• Trans NIH BD2K Executive Committee & Working groups
• Many biomedical researchers, cloud providers, IT professionals
Stay in Touch
QR Business Card
LinkedIn
@VivienBonazzi
Slideshare
Blog (Coming soon!)
Vivien Bonazzi
bonazziv@mail.nih.gov

Editor's Notes

  • #6: The framework (platform stack) helps visualize the concept of the platform