CAIRR: A pipeline to submit AIRR data to the
NCBI through the CEDAR Workbench
Syed Ahmad Chan Bukhari, Martin J. O'Connor, Marcos Martínez-Romero, Attila L. Egyedi, Debra
Willrett, John Graybeal , Mark A. Musen, Florian Rubelt, Steven H. Kleinstein , Kei-Hoi Cheung
The AIRR Community defined MiAIRR and implemented it with NCBI
NCBI is an important resource to archive biomedical data
● NCBI hosts a collection of biomedical databases and provides long-term support.
○ BioProject, BioSample, SRA, GenBank, GEO, etc.
● Minimal use of standard terminologies to define the necessary metadata
○ Ontologies are recommended for some data elements (not implemented)
● NCBI metadata are often described using inconsistent terminologies
○ This limits our ability to find, access, interoperate, and reuse the data sets
What are the issues with the current NCBI
submission process?
● Rapid growth
● Lack of metadata standardization
● Error-prone data entry
● Lack of community-specific metadata (e.g., AIRR)
CEDAR for AIRR
Submit your AIRR
metadata to NCBI,
faster and better.
Example of Ontological Mapping
Organism: NCBITAXON
Disease/Diagnosis: DOID
Tissue: BTO
Cell Subset: CL
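
The mapping above can be sketched as a small lookup that checks whether a term identifier comes from the ontology expected for a field. This is a minimal illustration, not CEDAR's implementation: the function name `check_term` and the `PREFIX:ID` identifier form are assumptions, and the ontology acronyms are written as they appear on the slide.

```python
# Field-to-ontology mapping copied from the slide (acronyms as shown there).
FIELD_ONTOLOGY = {
    "Organism": "NCBITAXON",
    "Disease/Diagnosis": "DOID",
    "Tissue": "BTO",
    "Cell Subset": "CL",
}


def check_term(field, term_id):
    """Return True if term_id looks like PREFIX:ID and PREFIX matches
    the ontology expected for the field (hypothetical helper)."""
    expected = FIELD_ONTOLOGY.get(field)
    if expected is None:
        raise KeyError(f"no ontology mapping for field {field!r}")
    prefix, sep, local_id = term_id.partition(":")
    # Compare case-insensitively: identifiers are often written "NCBITaxon:...".
    return bool(sep) and bool(local_id) and prefix.upper() == expected


if __name__ == "__main__":
    print(check_term("Organism", "NCBITaxon:9606"))  # Homo sapiens
    print(check_term("Tissue", "CL:0000236"))        # wrong ontology for Tissue
    print(check_term("Cell Subset", "B cell"))       # free text, not an identifier
```

A check like this only confirms that a value points at the right ontology; confirming that the identifier actually exists would require a lookup against the ontology itself.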
CAIRR Workflow
Three simple steps to deposit AIRR data to the NCBI
1. Find template by typing MiAIRR
2. Add metadata
3. Upload and Submit
Why use CAIRR instead of direct submission?
1. Just one simple form (with tool tips!) to fill out, instead of multiple NCBI
templates.
2. Make your metadata right the first time with auto-completion, suggestions,
and validation.
3. Exact answers: your metadata attributes and values come from unique
ontology concepts, so they are unambiguous and fully described.
4. Better feedback during the submission process.
5. CEDAR is faster and easier!
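
Point 2 in the list above (auto-completion and suggestions) can be illustrated with a toy term lookup. This is only a sketch under stated assumptions: the slides do not describe how the Workbench performs its lookup, so the small in-memory vocabulary and the `suggest` function here are hypothetical stand-ins, and the Cell Ontology identifiers should be verified against the ontology before any real use.

```python
# Tiny illustrative vocabulary for the "Cell Subset" field (label -> CL identifier).
# Assumption: a real tool would query an ontology service instead of a dict.
CELL_SUBSET_TERMS = {
    "B cell": "CL:0000236",
    "T cell": "CL:0000084",
    "plasma cell": "CL:0000786",
    "memory B cell": "CL:0000787",
    "naive B cell": "CL:0000788",
}


def suggest(text, terms=CELL_SUBSET_TERMS, limit=5):
    """Return up to `limit` (label, identifier) pairs whose label contains
    the typed text, ignoring case (hypothetical helper)."""
    needle = text.strip().lower()
    if not needle:
        return []
    matches = [(label, term_id) for label, term_id in terms.items()
               if needle in label.lower()]
    # Sort by lower-cased label so the ordering is stable and case-insensitive.
    matches.sort(key=lambda pair: pair[0].lower())
    return matches[:limit]


if __name__ == "__main__":
    for label, term_id in suggest("b cell"):
        print(f"{label}\t{term_id}")
```

Because the user picks from suggested concepts rather than typing free text, the stored value is an identifier, which is what makes the metadata unambiguous as described in point 3.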
