http://cancerimagingarchive.net
Justin Kirby – justin.kirby@nih.gov
Frederick National Laboratory for Cancer Research
Leidos Biomedical Research, Inc.
Support to: Cancer Imaging Program/DCTD/NCI
A Practical Guide to Using TCIA for QIN
Challenges & Collaborative Projects
QIN F2F 2018
2
Intro to TCIA
3
The Cancer Imaging Archive: Brief intro
• 84 data sets consisting of over 41,000 subjects available for download
• Covers most modalities (CT/MR/PET/RT)
• Wide variety of cancers + phantoms
• Patient populations vary from a handful to >26,000 (NLST)
• Many have associated metadata
 ▪ Demographics/outcomes/therapy
 ▪ Pathology imaging
 ▪ Radiologist expert and automated computational analyses (segmentations, features)
• ‘Omics ties to GDC/TCGA, CPTAC, and GEO
http://www.cancerimagingarchive.net
4
Organization of TCIA ecosystem
The Cancer Imaging Archive
Data Collection Center
• Tools and staffing to support data collection, curation, and de-identification
Data Access
• Browse (home page)
• Filter/Search (Data Portal)
• REST API
• Analysis Data
Data Analysis Centers
• Third-party websites or tools that connect to TCIA’s API or mirror its data
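The REST API listed under Data Access can be scripted. The sketch below only builds query URLs and never contacts a server. The base URL, the `getCollectionValues` and `getSeries` endpoint names, and the `Collection` and `format` parameters are assumptions about the query interface, so check them against the current TCIA API documentation before relying on them.

```python
from urllib.parse import urlencode

# Assumed base URL; confirm against the current TCIA REST API documentation.
BASE_URL = "https://services.cancerimagingarchive.net/services/v4/TCIA/query"


def build_query_url(endpoint, **params):
    """Return a query URL for the given endpoint, skipping unset parameters."""
    query = {key: value for key, value in params.items() if value is not None}
    url = f"{BASE_URL}/{endpoint}"
    return f"{url}?{urlencode(query)}" if query else url


# List all collections, then list the series of one collection named in this deck.
collections_url = build_query_url("getCollectionValues", format="json")
series_url = build_query_url("getSeries", Collection="QIN-HeadNeck", format="json")
print(collections_url)
print(series_url)
```

Keeping URL construction separate from the HTTP call makes it easy to review a query before any data is downloaded.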
5
TCIA services (not just software)
Relieves the PI of the majority of the data sharing burden/risks
• Data hosting with >99% uptime
• De-identification using RSNA’s pre-configured Clinical Trial Processor (CTP) and DICOM PS 3.15 Annex E standards
• Multi-tiered QC process inspects both DICOM headers and pixels for PHI and integrity of data set
Phone/email support available for end users and submitters
Extensive documentation throughout the site
Exposure to a large community of researchers
• Increase visibility of your work, get more citations!
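To give a feel for what a header-level PHI check involves, here is a toy sketch. It is not CTP's configuration and not TCIA's QC process: the keyword list is a small illustrative subset of the attributes covered by DICOM PS 3.15 Annex E, the header is a hypothetical plain dictionary, and pixel inspection is not covered at all.

```python
# Illustrative subset of identifying DICOM attribute keywords; the real
# de-identification profile in DICOM PS 3.15 Annex E is far longer.
IDENTIFYING_KEYWORDS = {
    "PatientName",
    "PatientBirthDate",
    "PatientAddress",
    "ReferringPhysicianName",
    "InstitutionName",
    "AccessionNumber",
}


def find_possible_phi(header):
    """Return the sorted keywords in `header` that are identifying and non-empty."""
    return sorted(
        keyword
        for keyword, value in header.items()
        if keyword in IDENTIFYING_KEYWORDS and str(value).strip()
    )


# Hypothetical header that has been only partly de-identified.
header = {
    "PatientName": "",
    "PatientBirthDate": "19500101",
    "Modality": "MR",
    "InstitutionName": "Example Hospital",
}
flagged = find_possible_phi(header)
print(flagged)
```

A real check would read the attributes from the DICOM files themselves and would also need to look for burned-in text in the pixel data.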
6
Data Collection Center: Publish Your Data
Primary Data (radiology, pathology, clinical, etc.)
Analysis Results (derived from primary data)
Image credit: Hugo Aerts
7
Data Collection Center:
Publishing data in addition to manuscripts
Data citations for both primary and analysis data to enable reproducible research
Analysis Dataset Citation (derived image features)
Gutman DA, Cooper LA, Hwang SN, Holder CA, Gao J, Aurora TD, Dunn WD Jr, Scarpace L, Mikkelsen T, Jain R, Wintermark M, Jilwan M, Raghavan P, Huang E, Clifford RJ, Mongkolwat P, Kleper V, Freymann J, Kirby J, Zinn PO, Moreno CS, Jaffe C, Colen R, Rubin DL, Saltz J, Flanders A, Brat DJ. (2014). MR Imaging Predictors of Molecular Profile and Survival: Multi-institutional Study of the TCGA Glioblastoma Data Set. The Cancer Imaging Archive. http://doi.org/10.7937/K9/TCIA.2014.4HTXYRCN
Publication Citation (cites specific data used)
MR imaging predictors of molecular profile and survival: multi-institutional study of the TCGA glioblastoma data set. Radiology. 2013 May;267(2):560-9. doi: 10.1148/radiol.13120118. Epub 2013 Feb 7. PubMed PMID: 23392431; PubMed Central PMCID: PMC3632807.
Primary Data Citation (TCIA images used for study)
Scarpace, L., Mikkelsen, T., Cha, S., Rao, S., Tekchandani, S., Gutman, D., … Pierce, L. J. (2016). Radiology Data from The Cancer Genome Atlas Glioblastoma Multiforme [TCGA-GBM] collection. The Cancer Imaging Archive. http://doi.org/10.7937/K9/TCIA.2016.RNYFUYE9
8
Data Descriptor Journals
Journal | Recommended Repositories
Nature Scientific Data | https://www.nature.com/sdata/policies/repositories#imaging
Medical Physics | http://aapm.onlinelibrary.wiley.com/hub/journal/10.1002/(ISSN)2473-4209/about/author-guidelines.html (see section 13-Medical Physics Dataset Articles)
Elsevier Data in Brief | http://www.elsevier.com/authors/author-services/research-data/data-base-linking/supported-data-repositories#Health
PLOS ONE | http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories
Research Data Support | https://www.springernature.com/gp/authors/research-data-policy/repositories-bio/12327160
Publish detailed descriptions about how to use your TCIA data to gain academic credit (publication/citations) in addition to the novel scientific findings you might publish in traditional journals.
9
A significantly growing community!
38 incoming data sets in varying stages of curation
Over 10,000 active users per month
• Up from ~3,000/month in 2015
Downloads of 40-50TB per month
• Up from ~2TB/month in 2015
613 publications based on TCIA data
• 134 new publications in 2017
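The growth figures above translate into simple ratios. The short calculation below uses only the numbers quoted on this slide; "over 10,000" and "~3,000" are treated as exact values for the sake of the arithmetic, which is an approximation.

```python
# Figures quoted on the slide (approximate values treated as exact).
users_2015, users_now = 3_000, 10_000          # active users per month
tb_2015, tb_now_low, tb_now_high = 2, 40, 50   # TB downloaded per month
pubs_total, pubs_2017 = 613, 134               # publications based on TCIA data

user_growth = users_now / users_2015
download_growth = (tb_now_low / tb_2015, tb_now_high / tb_2015)
share_2017 = pubs_2017 / pubs_total

print(f"Active users: about {user_growth:.1f}x the 2015 level")
print(f"Downloads: {download_growth[0]:.0f}x to {download_growth[1]:.0f}x the 2015 level")
print(f"Share of all publications that appeared in 2017: {share_2017:.0%}")
```

Download volume has grown far faster than the user count, and roughly one in five of all TCIA-based publications appeared in 2017 alone.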
10
Researchers want to share data – 38 data set queue
Community Proposed Data Sets
GBM-DSC-MRI-DRO
ASCC TNM Consensus
Colorectal Liver Metastases
QIN-BREAST-02
MyelomaTT3a
Low Dose CT Liver Metastases
Lung Fused-CT-Pathology
HNSCC Oropharyngeal Radiomics
OPC-Radiomics
Oropharynx Phantoms
HNSCC 3D CT RT
MSK Pancreatic Cancer Repository
Program Data Sets | Notes
TCGA | 2 collections/sites
CPTAC | 9 cancer types, 14 sites
Exceptional Responders | 24 of 58 subjects remaining
Immunotherapy | 2 cancer types
PDX mouse | Not started
NCTN integration | RTOG 0617 pilot in process
QIN ECOG-ACRIN | 10 trials
11
QIN Status Update
12
QIN Use Cases
Collaborative research projects
Challenge competitions
Data sharing requirements (NIH guidelines, publication guidelines)
13
Summary of QIN Use:
15 QIN Data Sets from 11 out of 19 active sites
Collection | Cancer Type | Modalities | Subjects | Access | Updated | QIN Use Case
Brain-Tumor-Progression | Brain Cancer | MR | 20 | Limited | 2018/01/31 | Collaborative project
QIN LUNG CT | Non-small Cell Lung Cancer | CT | 47 | Public | 2017/07/31 | Lung Seg Challenge
ACRIN FLT Breast | Breast Cancer | PET, CT, OT | 83 | Limited | 2017/12/11 | TBD
Breast-MRI-NACT-Pilot | Breast Cancer | MR, SEG | 64 | Public | 2016/01/26 | BMMR Challenge
ISPY1 | Breast Cancer | MR, SEG | 222 | Public | 2016/08/31 | BMMR Challenge
Lung Phantom | Lung Phantom | CT | 1 | Public | 2014/06/19 | Lung Seg Challenge
QIN-BRAIN-DSC-MRI | Low & High Grade Glioma | MR | 49 | Limited | 2015/08/28 | DSC MRI Challenge
QIN-Breast | Breast Cancer | MR, PT, CT | 67 | Limited | 2015/09/04 | General data sharing
QIN Breast DCE-MRI | Breast Cancer | MR, KO | 10 | Public | 2014/07/31 | DCE Challenge
QIN GBM DCE-MRI | Glioblastoma Multiforme | MR | 10 | Limited | 2015/08/14 | TBD
QIN GBM Treatment Response | Glioblastoma Multiforme | MR | 54 | Limited | 2015/08/12 | TBD
QIN-HeadNeck | Head and Neck Carcinomas | PT, CT, SR, SEG, RWV | 156 | Public | 2014/08/26 | Publications, ITCR use, FDG PET Seg Challenge
QIN PET Phantom | PET Phantom | PT | 2 | Public | 2014/09/04 | FDG PET Seg Challenge
QIN Prostate | Prostate Cancer | MR | 22 | Limited | 2014/07/02 | Collaborative project, AIF challenge
QIN-SARCOMA | Sarcomas | MR | 15 | Limited | 2014/09/05 | AIF Challenge
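The table above can be summarized by access level. The sketch below copies the collection names, subject counts, and access levels from the table and tallies them; nothing is fetched from TCIA, so the numbers are only as current as the slide.

```python
from collections import Counter

# (collection, subjects, access) copied from the table above.
QIN_COLLECTIONS = [
    ("Brain-Tumor-Progression", 20, "Limited"),
    ("QIN LUNG CT", 47, "Public"),
    ("ACRIN FLT Breast", 83, "Limited"),
    ("Breast-MRI-NACT-Pilot", 64, "Public"),
    ("ISPY1", 222, "Public"),
    ("Lung Phantom", 1, "Public"),
    ("QIN-BRAIN-DSC-MRI", 49, "Limited"),
    ("QIN-Breast", 67, "Limited"),
    ("QIN Breast DCE-MRI", 10, "Public"),
    ("QIN GBM DCE-MRI", 10, "Limited"),
    ("QIN GBM Treatment Response", 54, "Limited"),
    ("QIN-HeadNeck", 156, "Public"),
    ("QIN PET Phantom", 2, "Public"),
    ("QIN Prostate", 22, "Limited"),
    ("QIN-SARCOMA", 15, "Limited"),
]

# Count collections and sum subjects for each access level.
collections_by_access = Counter(access for _, _, access in QIN_COLLECTIONS)
subjects_by_access = Counter()
for _, subjects, access in QIN_COLLECTIONS:
    subjects_by_access[access] += subjects

print(dict(collections_by_access))
print(dict(subjects_by_access))
```

By this tally, 7 of the 15 collections (502 subjects) are public and 8 (320 subjects) are limited access.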
14
Suggestions for the coming year
Re-use of existing data has been limited
Capabilities to archive results data underutilized thus far (limited as they may be)
More lead time required due to increased demand
Continued efforts to share diverse/rich data sets are critical
While painful compared to “quick and dirty” solutions, standardization is worthwhile when planning CCPs and data storage
15
Coming soon…
Challenge competitions
Software Updates
New QIN CCPs
More data
Advancing TCGA-GBM and TCGA-LGG with expert labels & radiomic features
Rich datasets without segmentation labels
• Essential for quantitative studies.
• Enabling radiogenomic analyses.
www.braintumorsegmentation.org
Highly competitive challenge utilizing TCIA data
Ranked 1st at BraTS’15
Bakas et al., “GLISTRboost”, Springer LNCS 2016
Automated method combining biophysical tumor growth modeling with machine learning for glioma segmentation
Input: TCIA data
Segmentations: using GLISTRboost
Manual segmentations approved by expert neuroradiologist
Publicly available contribution towards repeatable and reproducible studies, by:
- enabling direct utilization of the TCGA/TCIA glioma collections
- allowing their potential to be fully exploited in clinical and computational studies
Panel of >500 imaging features, extracted from the manual segmentations
17
New Data Portal interface
18
New submission software
19
Clinical trials data: ECOG-ACRIN / QIN
▪ 2 completed trials
• 6657/ISPY1 (public)
• 6688/FLT-Breast (restricted to QIN use until 12/18/18)
▪ 2 trials in progress
• 6684/brain FMISO expected to be completed next month
• 6668/NSCLC FDG-PET has been transferred from ACRIN, now starting curation at TCIA
▪ How can QIN best leverage this data in clinically meaningful CCPs?
• Schedule presentations about these data to Executive Committee or Working Groups as they come online?
• Identify logical follow-up questions to ask based on trial publications?
• Sequester portions of trial data for validation sets?
• Alter the priority list as new needs arise?
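If portions of trial data are sequestered for validation, the assignment should be reproducible and independent of the order in which subjects arrive. One possible approach, sketched below, hashes each subject identifier. The salt string, the 20% fraction, and the `SUBJ-` identifiers are hypothetical choices for illustration, not anything QIN or TCIA has specified.

```python
import hashlib


def is_sequestered(subject_id, fraction=0.2, salt="qin-validation-v1"):
    """Deterministically assign a subject to the sequestered validation set."""
    digest = hashlib.sha256(f"{salt}:{subject_id}".encode("utf-8")).digest()
    # Read the first 8 bytes of the hash as an integer in [0, 2**64) and
    # compare it with the threshold that corresponds to `fraction`.
    return int.from_bytes(digest[:8], "big") < fraction * 2**64


# Hypothetical subject identifiers, for illustration only.
subjects = [f"SUBJ-{i:04d}" for i in range(1000)]
sequestered = [s for s in subjects if is_sequestered(s)]
print(len(sequestered), "of", len(subjects), "subjects sequestered")
```

Because the result depends only on the identifier and the salt, a subject stays in the same partition when more data are added later. Changing the salt produces a fresh, unrelated split.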
20
Clinical trials data: NCTN Data Archive Integration
▪ Pilot project underway to connect RTOG 0617 data in TCIA to NCTN Data Archive clinical data
21
Clinical Proteomic Tumor Analysis Consortium (CPTAC)
Collection | Cancer Type | Modalities | Subjects | Location | Updated
CPTAC-CCRCC | Clear Cell Carcinoma | CT | 18 | Kidney | 2018/04/26
CPTAC-GBM | Glioblastoma Multiforme | CT, MR | 24 | Brain | 2018/04/26
CPTAC-LSCC | Squamous Cell Carcinoma | CT, CR, DX | 3 | Lung | 2018/04/26
CPTAC-LUAD | Adenocarcinoma | CT, MR, PT | 10 | Lung | 2018/04/26
CPTAC-PDA | Ductal Adenocarcinoma | CT, MR, DX, CR | 43 | Pancreas | 2018/04/26
CPTAC-UCEC | Corpus Endometrial Carcinoma | CT, MR, PT | 31 | Uterus | 2018/04/26
CPTAC-HNSCC | Head and Neck Cancer | CT | 5 | Head-Neck | 2018/04/26
CPTAC-CM | Cutaneous Melanoma | MR | 1 | Brain | 2018/02/13
▪ Precision medicine data
• Proteomics
• Radiology
• Pathology
• Clinical
• Genomics
▪ Similar to TCGA, but prospective
• Newer scanners
• Same variability in acquisitions
22
Applied Proteogenomics OrganizationaL
Learning and Outcomes (APOLLO) network
[Figure: APOLLO network workflow diagram; only the step labels “6. Sequencing” and “7. Proteomics” survive]
23
Pilot accomplishments with VA and DOD
Boston VA MAVERIC program
• Now have VA-approved SOPs for image transfer
• Proof of principle: 36 imaging studies on 7 patients hosted
• Genomic data submitted to GDC
Walter Reed GYN-COE Imaging for APOLLO 2
• SOPs developed for images from three sites
• 250 studies on 90 patients transferred, not yet public
• Facilitating image feature extraction analysis by experts
APOLLO-5 Year 1 Projected Cases
Multiple Cancer Types and Imaging Modalities
Cancer Type | Low Estimate | High Estimate
GYN | 300 | 400
Breast | 100 | 200
Prostate | 50 | 100
Colon/GI | 50 | 100
ENT/Thyroid | 50 | 100
Kidney | 25 | 50
Lung | 25 | 50
Brain | 10 | 20
Sarcomas | 10 | 20
Lymphoid | 10 | 20
TOTAL | 630 | 1060
• Imaging will be from multiple imaging modalities
• Most cases will have multiple image sets & time points
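The TOTAL row can be checked directly from the per-type estimates. The sketch below re-adds the table's figures and also reports the share of projected cases that are GYN; all numbers are copied from the table above.

```python
# (cancer type, low estimate, high estimate) copied from the table above.
PROJECTED_CASES = [
    ("GYN", 300, 400),
    ("Breast", 100, 200),
    ("Prostate", 50, 100),
    ("Colon/GI", 50, 100),
    ("ENT/Thyroid", 50, 100),
    ("Kidney", 25, 50),
    ("Lung", 25, 50),
    ("Brain", 10, 20),
    ("Sarcomas", 10, 20),
    ("Lymphoid", 10, 20),
]

low_total = sum(low for _, low, _ in PROJECTED_CASES)
high_total = sum(high for _, _, high in PROJECTED_CASES)

# GYN is the first row; compute its share of each total.
_, gyn_low, gyn_high = PROJECTED_CASES[0]
print(f"Totals: {low_total} (low), {high_total} (high)")
print(f"GYN share: {gyn_low / low_total:.0%} (low), {gyn_high / high_total:.0%} (high)")
```

The sums match the slide's totals of 630 and 1060, and GYN cases make up between roughly a third and a half of the projection.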
25
Crowds Cure Cancer – BIDS WG pipeline project
Preliminary pipeline stages
• Ingest tumor location seed points from TCGA subjects
• Generate 3D segmentations
• Compute radiomic features
• Predict outcomes
Access the data
• Visit our Analysis Results page
• Direct link: https://doi.org/10.7937/K9/TCIA.2018.OW73VLO2
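The first three pipeline stages can be illustrated on a toy volume. This is not the BIDS WG pipeline: simple intensity-based region growing stands in for whatever segmentation method the project uses, the two "features" are deliberately minimal, the outcome-prediction stage is omitted, and the volume and seed point are made up.

```python
from collections import deque

# Offsets of the six face-adjacent neighbours of a voxel.
NEIGHBOUR_OFFSETS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))


def grow_region(volume, seed, tolerance):
    """Return voxels connected to `seed` with intensity within `tolerance` of the seed's."""
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    reference = volume[seed[0]][seed[1]][seed[2]]
    segmented = {seed}
    queue = deque([seed])
    while queue:
        z, y, x = queue.popleft()
        for dz, dy, dx in NEIGHBOUR_OFFSETS:
            nz, ny, nx = z + dz, y + dy, x + dx
            if not (0 <= nz < depth and 0 <= ny < height and 0 <= nx < width):
                continue  # outside the volume
            if (nz, ny, nx) in segmented:
                continue  # already accepted
            if abs(volume[nz][ny][nx] - reference) <= tolerance:
                segmented.add((nz, ny, nx))
                queue.append((nz, ny, nx))
    return segmented


def simple_features(volume, segmented):
    """Compute two minimal features of the segmented region."""
    values = [volume[z][y][x] for z, y, x in segmented]
    return {"voxel_count": len(values), "mean_intensity": sum(values) / len(values)}


# Toy 3x3x3 volume: a bright 2x2x2 "lesion" (100) in a dark background (10).
volume = [[[10] * 3 for _ in range(3)] for _ in range(3)]
for z in (0, 1):
    for y in (0, 1):
        for x in (0, 1):
            volume[z][y][x] = 100

mask = grow_region(volume, seed=(0, 0, 0), tolerance=20)  # stage 1 and 2
features = simple_features(volume, mask)                  # stage 3
print(features)
```

A real pipeline would read DICOM series, use a validated segmentation method, and compute a standardized radiomic feature set before any outcome modeling.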
26
PET-CT Image Feature Standardization
▪ Extending/improving IBSI guidelines
▪ Utilizing TCIA data such as CC-Radiomics-Phantom
27
Stay tuned!
28
Acknowledgements
Funding: NCI Cancer Imaging Program
Frederick National Laboratory for Cancer Research
• John Freymann, Justin Kirby, Brenda Fevrier-Sullivan, Luis Cordeiro, Craig Hill
Consultant - Carl Jaffe
University of Arkansas Medical School
• Fred Prior, Kirk Smith, Lawrence Tarbox, Bill Bennett, Tracy Nolan, Dwayne Dobbins
Emory University
• Ashish Sharma

Editor's Notes

  • #8: Provide DOIs to collections and meta-collections (article’s analysis). A publication can refer to the specific data sets used via the DOIs in the data citations. Currently working with NLM, and collaborating with Nature Scientific Data and other publications.
  • #24: The VA team, through its MAVERIC (Massachusetts Veterans Epidemiology Research and Information Center) and RePOP projects, has worked extensively to develop an internal SOP to collect and centralize the imaging data of patients participating in APOLLO, pulling imaging data on request from its 21 regional service network facilities, or VISNs. The Gynecologic Cancer Center of Excellence (GYN-COE) developed its imaging SOP in the context of supporting the APOLLO 2 project; as such, the emphasis was on collecting the data from multiple sites, configuring and testing the data de-identification and submission systems, and collaborating with TCIA on the quality control steps, staging the images for feature extraction, and returning quantitative imaging measures back to the APOLLO data systems. Feature extraction is ongoing; the imaging data are all in TCIA.
  • #25: Estimated contributions by source:
    - Murtha Cancer Center Biobank (MCCB) sites (WRNMMC, Ft. Bragg, Portsmouth, Keesler, San Diego, Madigan, Fort Belvoir, San Antonio, William Beaumont El Paso, Anne Arundel Medical Center): 200-300 cancers of all types / year
    - VA Palo Alto: initially 50 lung and GI cancer cases / year
    - Clinical Breast Care Project (CBCP): 100-200 breast cancers / year
    - Center for Prostate Disease Research (CPDR): 50-100 prostate cancers / year
    - University of Virginia (UVA): 25-50 lung cancer cases / year
    - Gynecologic Cancer Center of Excellence (GYN-COE) Tissue and Data Acquisition Network (TDAN) sites (Inova, Duke, Roswell Park, with negotiations underway for the Universities of Hawaii and Oklahoma): 300-400 GYN cancers / year
    Priorities: 1. Active Duty 2. Minorities 3. High priority aggressive cancers 4. High priority events (e.g. metastasis, recurrence, resistance)