An Informal Discussion About Big Data
Better Stated as

A Vision for Biomedical
Research
Digitally enabling the length and
quality of life
Philip E. Bourne
pbourne@ucsd.edu
http://pebourne.wordpress.com/2013/12/21/taking-on-the-role-of-associate-director-for-data-science-at-the-nih-my-original-vision-statement/
The Context for This Discussion
• On March 3, 2014 I will begin as the first
Associate Director of the NIH devoted to data
science
• I am giving up tenure and the sun because I
believe this is the right time for change
• The change that I will try to instill at NIH and
beyond is that of a Digital Enterprise

http://www.nih.gov/news/health/dec2013/od-09.htm
What Do I Mean By the Digital
Enterprise?
An organization that succeeds by
maximizing the use of its digital assets
to achieve its goals
Why the Digital Enterprise Now?
• Biomedical research is increasingly digital –
the talk of “Big Data” is one manifestation
• Fulfillment of the NIH mission (among others)
will increasingly be tied to actions taken on
digital data across boundaries

• History already has lessons to teach us to
make the job easier
Actions on Data Imply:
• Ensuring data quality and hence trust
• Making data sustainable
• Making data open and accessible
• Making data findable
• Providing suitable metadata and annotation
• Making data queryable
• Making data analyzable
• Presenting data so as to maximize its value
• Rewarding good data practices
Boundaries on Data Imply:
• Working across biological scales
• Working across biomedical disciplines
• Working across basic and clinical research and
practice
• Working across institutional boundaries
• Working across public and private sectors
• Working across national and international
borders
• Working across funding agencies
Where to Start?

An external advisory group provided a
valuable blueprint for what should be
done
http://acd.od.nih.gov/Data%20and%20Informatics%20Working%20Group%20Report.pdf
Blueprint Recommendations
• Promote central and federated catalogs
– Establish minimal metadata framework
– Tools to facilitate data sharing
– Elaborate on existing data sharing policies

• Support methods and applications
– Fund all phases of software development
– Leverage lessons from National Centers

• Training
– More funding
– Enhance review of training applications
– Quantitative component to all awards

• On campus IT strategic plan
– Catalog of existing tools
– Informatics laboratory
– Ditto big data

• Sustainable funding commitment
What is Under Way?
• Now:
– Data centers (under review)
– Data science training grants (call Q1 14)
– Pilot data catalog consortium (call out)
– Genomic Research Data Alliance (being finalized)
– Piloting “NIH-drive”

• In Year One:
– Extended public-private programs specifically for data science activities
– Interagency activities
– International exchange programs
– Programs for better data descriptions
– Reward institutions/communities
– Policies to get clinical trial data into the public domain
Longer Term Strategy: Support for
The Research Lifecycle
[Diagram: the research lifecycle, IDEAS – HYPOTHESES – EXPERIMENTS – DATA – ANALYSIS – COMPREHENSION – DISSEMINATION, surrounded by supporting resources: Authoring Tools; Data Capture; Lab Notebooks; Software Repositories; Analysis Tools; Scholarly Communication; Visualization; Commercial & Public Tools; Discipline-Based Metadata Standards; Community Portals; Git-like Resources By Discipline; Training; Institutional Repositories; Commercial Repositories; Data Journals; New Reward Systems]
References
• http://bd2k.nih.gov/
• http://pebourne.wordpress.com/2013/12/21/taking-on-the-role-of-associate-director-for-data-science-at-the-nih-my-original-vision-statement/
• http://rd-alliance.org/
• http://www.genomeinformaticsalliance.org/
• http://www.force11.org/
pbourne@ucsd.edu

Discussion
Back Pocket Slides
The Role of Associate Director for Data
Science
1. provide broad trans-NIH programmatic leadership in the area of data science;
2. lead long-term NIH strategic planning in areas of data science;
3. provide oversight of the BD2K Initiative;
4. establish and nurture a trans-NIH intellectual and programmatic ‘hub’ for coordinating and enhancing data science activities;
5. coordinate with data science activities beyond NIH (e.g., other government agencies, other funding agencies, and the private sector);
6. play a major role in data sharing policy development and oversight at NIH; and
7. interact with the Chief Information Officer, NIH to generate synergy between BD2K and the Infrastructure Plus program.
Strategy
• Use the Blueprint as a starting point
• Work with ICs to determine science drivers
• Define developments needed for these drivers
• Look for commonalities across ICs – make those a priority
• Manage and enable emergent developments
– Data catalog – used to define the minimal data description and a home for domain definitions
– Centers of excellence – test beds and exemplars for best practices
Ways to Sell the NIH Data Science
Vision
• Developed in response to well recognized scientific needs
• Support for the complete research lifecycle – this is more
than just data
• Simple and well understood by all stakeholders (i.e.,
branded)
• A shared vision
• As ubiquitous as TCP/IP is to the Internet – a backbone for
the digital enterprise
• To data what PLOS is to knowledge – a movement that
people believe in and get behind
• An app store for the research enterprise
General Features of NIH Data Science
• Lightweight metadata standards
• Data & software registries
• Expanded policies on data sharing, open
source software
• Training programs & reward systems
• Institutional incentives
• Private sector incentives
• Data centers serving community needs

PSB2014 A Vision for Biomedical Research
