National COVID Cohort Collaborative
(N3C)
NCATS CTSA COVID-19 Data Science Enclave
Warren Kibbe
@data2health
@ncats_nih_gov
https://ncats.nih.gov/n3c
https://covid.cd2h.org/
@wakibbe
This pandemic highlights urgent needs
● ML algorithms (diagnosis, triage, predictive, etc.)
● Best practices for resource allocation
● Drug discovery
● Reduced disease severity
● Coordinate our efforts to maximize efficiency
All these things require the creation of a comprehensive clinical data set
DUA Access Principles
Access Principles: “Share widely and wisely”
The end-goal is broad access, including:
● Academic and Commercial
● Credentialed researchers* (limited data set, LDS) and individual “citizen scientists” (synthetic data)
● Domestic
● Directed to COVID-related research
● Activities in the N3C Enclave are recorded and can be audited
● Disclosure of research results to the N3C Enclave for the public good
● Contributor attribution
● No download of the Limited Data Set
● Access authorization must be renewed annually
*Credentialed researchers are researchers from academic or commercial institutions who have completed Human Subjects Protection training.
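The access principles above can be read as a small set of rules. The following is a minimal sketch of that reading, not the N3C implementation: the names `Researcher`, `data_tier`, and `access_is_current`, the tier strings, and the 365-day interpretation of "annually" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical tier labels; the slide names only the two data levels.
LIMITED_DATA_SET = "limited_data_set"
SYNTHETIC_DATA = "synthetic_data"

# Stated on the slide: the Limited Data Set may not be downloaded.
LDS_DOWNLOAD_ALLOWED = False


@dataclass
class Researcher:
    name: str
    credentialed: bool   # completed Human Subjects Protection training
    authorized_on: date  # date access was last authorized or renewed


def data_tier(researcher: Researcher) -> str:
    """Credentialed researchers use the LDS; citizen scientists use synthetic data."""
    return LIMITED_DATA_SET if researcher.credentialed else SYNTHETIC_DATA


def access_is_current(researcher: Researcher, today: date) -> bool:
    """Access must be renewed annually (modeled here as 365 days, an assumption)."""
    return today - researcher.authorized_on <= timedelta(days=365)
```

A check such as `access_is_current(r, date.today())` would run before each session, and `data_tier(r)` would decide which data level the session opens.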
Architecting Attribution in the N3C
The N3C collaborative analytics platform will support robust tracking of provenance and attribution; the DUA will require attribution of all scientific outcomes to everyone who contributed.
cd2h.org/attribution
Artifact: any research artifact or product, such as data, data quality tool, terminology, algorithm, or software
Contribution: the role of the person or organization in the creation of the artifact
Agent: the person, group, and/or organization
Diagram relations: qualified contribution; contribution made to; contribution made by
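The Artifact / Contribution / Agent model on this slide maps naturally onto three record types. The sketch below is one possible encoding, assuming hypothetical class and field names (`Agent`, `Contribution`, `Artifact`, `made_by`, `add_contribution`); the slide itself only gives the three concepts and the relations between them.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Agent:
    """The person, group, and/or organization."""
    name: str
    kind: str  # "person", "group", or "organization"


@dataclass(frozen=True)
class Contribution:
    """The role of an agent in the creation of an artifact."""
    role: str
    made_by: Agent  # "contribution made by"


@dataclass
class Artifact:
    """Any research artifact: data, data quality tool, terminology, algorithm, software."""
    title: str
    kind: str
    contributions: List[Contribution] = field(default_factory=list)

    def add_contribution(self, agent: Agent, role: str) -> Contribution:
        """Record a contribution made to this artifact ("contribution made to")."""
        contribution = Contribution(role=role, made_by=agent)
        self.contributions.append(contribution)
        return contribution

    def attribution(self) -> List[str]:
        """List every contributor with their role, in the order recorded."""
        return [f"{c.made_by.name} ({c.role})" for c in self.contributions]
```

For example, with made-up names: `tool = Artifact("Example DQ tool", "data quality tool")`, then `tool.add_contribution(Agent("Example Site", "organization"), "data provider")`, after which `tool.attribution()` lists that site and its role.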
FDA
Mitra Rocca
Scott Gideon
Wei Chen
NIDDK
Robert Star
NIGMS
Ming Lee
NCATS ITRB
Sam Michael
Mariam Deacy
Gary Berkson
Josephine Kennedy
Usman Sheikh
Mark Backus
Nam Ngo
Amit Virakatmath
Keats Kirsch
Sulochana Nunna
Rafael Fuentes
Reid Simon
Biju Mathew
Tim Mierzwa
Ke Wang
Kalle Virtaneva
Partners, Teams, Collaborators
NCATS
Chris Austin
Joni Rutter
Mike Kurilla
Clare Schmitt
Ken Gersing
Xinzhi Zhang
Erica Rosemond
Sam Bozzette
Lili Portilla
Chris Dillon
Penny Burgoon
Emily Marti
Meredith Temple-O’Connor
Sam Jonson
Christine Cutillo
Nicole Garbarini
NIH & HHS
Partners
NCI
Janelle Cortner
Stephen Hewitt
Denise Warzel
CD2H
OHSU/OSU
Melissa Haendel
Anita Walden
Julie McMurry
Moni Munoz-Torres
Andrea Volz
Connor Cook
Racquel Dietz
Andrew Neumann
Rich Lorimor
Sage Bionetworks
Justin Guinney
James Eddy
U of Iowa:
Dave Eichmann
Alexis Graves
Northwestern:
Kristi Holmes
Justin Starren
Lisa O’Keefe
Washington U.
Philip Payne
Albert Lai
Tom Dillon
CD2H
U. of Washington
Adam Wilcox
Liz Zampino
Johns Hopkins U
Chris Chute
Tricia Francis
Jax Labs
Peter Robinson
Scripps
Chunlei Wu
Teams
Phenotype & Acquisition
Emily Pfaff, UNC
ACT
Michele Morris, Pitt
Shyam Visweswaran, Pitt
Shawn Murphy HRD
OMOP
Kristin Kostka, IQVIA
Karthik Natarajan, Columbia
Clare Blacketer JNJ
PCORI
Kellie Walters, UNC
Robert Bradford, UNC
Marshall Clark, UNC
Adam Lee, UNC
Evan Colmenares, UNC
TriNetX
Matvey Palchuk
Lora Lingrey
Teams
Governance
Sage Bionetworks
John Wilbanks
Christine Suver
Data Harmonization
JHU
Davera Gabriel
Stephanie Hong
Harold Lehmann
Tanner Zhang
Richard Zhu
SAMVIT
Smita Hastak
Charles Yaghmour
NCATS
Raju Hemadri
Nancy Nurthen
Sai Manjula
Adeptia
Sandeep Naredla
Teams
Analytics
Warren Kibbe, Duke
Heidi Sprait, UTMB
Tell Bennett, U of CO
Andrew Williams, Tufts
Joel Saltz, SBU
Janos Hajagos, SBU
Richard Moffitt, SBU
Tahsin Kurc, SBU
Palantir
Nabeel Qureshi
Andrew Girvin
Amin Manna
Synthetic Data
Regenstrief
Peter Embi
MDClone
Daniel Blumenthal
Hovav Dror
Luz Erez
Josh Rubel
Microsoft
Allison T Rodriguez
Kenji Takeda
N3C Overview
● Data partnership & governance
● Phenotype & data acquisition
● Data ingest & harmonization
● Collaborative analytics & FAIR sharing/credit
Diagram labels: Ingest; Harmonize; Collaborate (Analytics Platform); OMOP; Limited Data Sets; Limited/Safe Harbor Data Sets
Common Data Model Harmonization
First Stage Ingestion
● Unpack zipped CSV files and check data manifests
● Reconstitute into native CDM formats
● Hybrid Data Quality checks adapting OHDSI Data Quality Dashboard
Workflow
Data Quality Dashboard (shared with site)
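The first ingestion step (unpack the zipped CSV files and check them against the manifest) can be sketched as follows. This is an illustration under stated assumptions, not the N3C pipeline: the manifest file name `MANIFEST.csv`, its `file` and `rows` columns, and the function name `check_submission` are all hypothetical.

```python
import csv
import io
import zipfile


def check_submission(zip_bytes: bytes, manifest_name: str = "MANIFEST.csv") -> dict:
    """Unpack a zipped CSV submission and compare row counts to its manifest.

    Assumed (hypothetical) manifest layout: a CSV with columns `file` and `rows`.
    Returns a dict mapping each listed file name to True (present and row count
    matches) or False (missing or row count differs).
    """
    results = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        names = set(archive.namelist())
        with archive.open(manifest_name) as handle:
            text = io.TextIOWrapper(handle, encoding="utf-8", newline="")
            manifest = list(csv.DictReader(text))
        for entry in manifest:
            name = entry["file"]
            if name not in names:
                results[name] = False  # listed in the manifest but not delivered
                continue
            with archive.open(name) as handle:
                text = io.TextIOWrapper(handle, encoding="utf-8", newline="")
                reader = csv.reader(text)
                next(reader, None)  # skip the header row
                row_count = sum(1 for _ in reader)
            results[name] = row_count == int(entry["rows"])
    return results
```

A failed check would be reported back to the submitting site before any reconstitution into the native CDM format is attempted.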
Collaborative Analytics - N3C Secure Data Enclave
Diagram labels: Discover; Dashboards; Reports; Studies; Researchers; Analyze; Build; Two-factor Auth; DAC; NCATS Cloud; Palantir; NCATS Translator
N3C Analytics Platform
Predictive Modeling: Risk of Ventilation and AKI
A random forest model was trained on 200 COVID-19 patients, 100 of whom required ventilation and 100 of whom did not. It performs well, with an AUC of 0.85. Shown are the top features in the model predicting ventilator usage as an outcome.
Using these features, we are able to see separation in a PCA plot between the ventilator population (orange) and the non-ventilator population (blue).
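To make the AUC figure concrete: with 100 ventilated and 100 non-ventilated patients there are 10,000 ventilated/non-ventilated pairs, and an AUC of 0.85 means the model scores the ventilated patient higher in about 85% of them. The sketch below computes AUC by that pairwise definition; the function name `auc` and the example scores are made up for illustration and are not taken from the N3C model.

```python
def auc(positive_scores, negative_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Made-up model scores for four ventilated and four non-ventilated patients.
ventilated = [0.9, 0.8, 0.6, 0.4]
not_ventilated = [0.7, 0.5, 0.3, 0.2]
print(auc(ventilated, not_ventilated))  # 13 of 16 pairs ranked correctly -> 0.8125
```

The pairwise loop is quadratic in cohort size, which is fine for 200 patients; larger cohorts would normally use a rank-based formula instead.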
N3C Community Workstreams
NCATS N3C website: ncats.nih.gov/n3c
CD2H N3C website: covid.cd2h.org
Hub Partnership packet: https://covid.cd2h.org/partnership_welcome_packet
Onboarding to N3C: bit.ly/cd2h-onboarding-form
Join the conversation
Onboarding to N3C: bit.ly/cd2h-onboarding-form
Joining Workstreams:
N3C Data Ingestion & Harmonization Workstream
Slack Channel Harmonization
Google Group Harmonization
N3C Phenotype & Data Acquisition Workstream
Slack Channel Phenotype
Google Group Phenotype
N3C Collaborative Analytics Workstream
Slack Channel Analytics
Google Group Analytics
N3C Data Partnership & Governance Workstream
Slack Channel Governance
Google Group Governance
N3C Synthetic Data Workstream
Slack Channel Synthetic Data
Google Group Synthetic Data
Additional Information:
Onboarding N3C, Slack, Google | Finding and Joining a Google Group
Thank you!
