Biobanking: a user’s perspective
and an overview
Jonathan Pevsner, Ph.D.
Professor, Dept. of Neurology
Kennedy Krieger Institute and Johns Hopkins Medicine
Chief Scientific Officer, Sturge-Weber Foundation
pevsner@kennedykrieger.org
Data Science Forum: NIH Data Science SIG
Global Perspective on Biobanking and Access to Samples
June 23, 2017
Conflicts of interest
I have no conflicts of interest.
Outline
From genotype to phenotype: a framework
Three biobanking examples
Postmortem brains from the NIH NeuroBioBank
Establishing a rare disease biobank
From a large genomics dataset to biobank samples
Issues, lessons and principles
1. Usefulness
2. Existing biobanks
3. GUIDs: the importance of labels
4. Data science is integral to biobanking
5. Standards
6. Informed consent
7. Needs and opportunities
The relationship between genotype and phenotype
represents one of the most fundamental and challenging
problems in biomedical science.
Fundamental framework: genotype to phenotype
Genotype → Phenotype
Levels: DNA, protein, cell, organ, individual, population (further levels: RNA, pathways, circuits)
Genes: gene1, gene2, … gene20,000
We can provide a framework for this problem.
Fundamental framework: genotype to phenotype
[Diagram: the genotype-to-phenotype framework, with ABCD1 highlighted among gene1 … gene20,000]
ABCD1:
• severe childhood disease (ALD)
• mild adult onset disease (AMN)
• apparently normal
One gene mutation can have different phenotypic
consequences: the same ABCD1 mutation may result in severe
childhood-onset adrenoleukodystrophy (ALD), milder
adult-onset adrenomyeloneuropathy (AMN), or no symptoms.
Fundamental framework: genotype to phenotype
[Diagram: the genotype-to-phenotype framework, with GNAQ highlighted among gene1 … gene20,000]
GNAQ:
• melanocytes: uveal melanoma
• endothelial cells: Sturge-Weber
• blood: apparently normal
One gene mutation can have different consequences:
when and where mutations occur is crucial.
Fundamental framework: genotype to phenotype
[Diagram: the genotype-to-phenotype framework for gene1 … gene20,000]
One disease phenotype, multiple genetic contributors
For almost all diseases (including common diseases such
as autism or bipolar disorder) we search for multiple
genetic variants that confer risk for a phenotype.
A user’s perspective on biobanking: three examples.
(1) Sturge-Weber syndrome and a brain bank
A port-wine birthmark affects about 1:300 people.
It varies in size and location.
Sturge-Weber syndrome affects < 1:20,000 people.
It affects a subset of individuals with a facial port-wine (PW) birthmark.
A user’s perspective on biobanking: three examples.
(1) Sturge-Weber syndrome and a brain bank
• DNA from blood (presumed unaffected): sequence the genome
• DNA from port-wine birthmark (presumed affected): sequence the genome
• Compare the two genome sequences
We identified a mosaic mutation in GNAQ as causing
Sturge-Weber syndrome and port-wine birthmarks (NEJM, PMID
23656586). We analyzed samples from 3 individual patients.
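The blood-versus-birthmark comparison can be sketched in code. This is a minimal illustration only, not the analysis pipeline of the cited study: the `SiteCounts` type, the `candidate_mosaic_sites` function, the thresholds, and the read counts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteCounts:
    """Reference and alternate read counts at one genomic site (hypothetical type)."""
    ref: int
    alt: int

    @property
    def depth(self) -> int:
        return self.ref + self.alt

    @property
    def alt_fraction(self) -> float:
        # Fraction of reads supporting the alternate allele; 0.0 if no coverage.
        return self.alt / self.depth if self.depth else 0.0


def candidate_mosaic_sites(affected, unaffected, min_depth=50,
                           min_affected_fraction=0.01,
                           max_unaffected_fraction=0.002):
    """Return sites with alternate reads in affected tissue but essentially none in blood.

    All thresholds are illustrative assumptions, not values from the study.
    """
    candidates = []
    for site, tissue in affected.items():
        blood_counts = unaffected.get(site)
        if blood_counts is None:
            continue  # site not covered in the comparison sample
        if tissue.depth < min_depth or blood_counts.depth < min_depth:
            continue  # too few reads to compare reliably
        if (tissue.alt_fraction >= min_affected_fraction
                and blood_counts.alt_fraction <= max_unaffected_fraction):
            candidates.append(site)
    return sorted(candidates)


# Made-up read counts for illustration only (not real data).
birthmark = {
    "site_A": SiteCounts(ref=940, alt=60),   # low-level variant in affected tissue
    "site_B": SiteCounts(ref=500, alt=500),  # present in both samples (inherited)
    "site_C": SiteCounts(ref=20, alt=5),     # too shallow to call
}
blood = {
    "site_A": SiteCounts(ref=1000, alt=0),
    "site_B": SiteCounts(ref=510, alt=490),
    "site_C": SiteCounts(ref=30, alt=0),
}

print(candidate_mosaic_sites(birthmark, blood))
```

Only a site that is well covered in both samples, carries the variant in the affected tissue, and lacks it in blood is reported. A real analysis would also need to handle sequencing error and validation in independent samples.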
A user’s perspective on biobanking: three examples.
(1) Sturge-Weber syndrome and a brain bank
[Diagram: the genotype-to-phenotype framework, with GNAQ highlighted among gene1 … gene20,000]
After finding the GNAQ mutation we turned to the NIH
NeuroBioBank at the University of Maryland. We obtained
97 samples to validate our findings. The availability of these
samples from a biobank was crucial!
A user’s perspective on biobanking: three examples.
(2) Establishing a Sturge-Weber syndrome biobank
I am Chief Scientific Officer of the Sturge-Weber
Foundation. We need to create (or join) a biobank.
• Patients and families tell me “I want to donate my brain
and body to science. Can you help?” What is the plan,
and are there informed consent issues?
• Scientists have discovered that the GNAQ mutation
occurs primarily in endothelial cells, and cell lines
have been established from brain biopsies. How can
researchers share and access these cell lines?
• Are there standards that we should follow in describing
the genotype and phenotype of Sturge-Weber
syndrome samples and patients?
• Have these problems been addressed by those studying
related diseases?
A user’s perspective on biobanking: three examples.
(2) Establishing a Sturge-Weber syndrome biobank
[Diagram: the genotype-to-phenotype framework, with GNAQ highlighted among gene1 … gene20,000]
It’s important to link clinical data (e.g. from a patient
registry) with data generated from biospecimens!
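One way to picture this linkage is a join on a shared participant identifier. The sketch below is an assumption-laden illustration: the field names, the `link_records` helper, and every record in it are invented, and no particular registry or biobank schema is implied.

```python
from collections import defaultdict

# Hypothetical, made-up records for illustration only.
registry = [
    {"participant_id": "P001", "diagnosis": "Sturge-Weber syndrome"},
    {"participant_id": "P002", "diagnosis": "port-wine birthmark only"},
]
specimen_results = [
    {"participant_id": "P001", "specimen": "skin", "assay": "DNA sequencing"},
    {"participant_id": "P001", "specimen": "brain", "assay": "RNA-seq"},
    {"participant_id": "P003", "specimen": "blood", "assay": "DNA sequencing"},
]


def link_records(registry_rows, specimen_rows, key="participant_id"):
    """Attach specimen-derived results to each registry entry sharing the same key.

    Returns (linked, unmatched): linked maps each registry participant to the
    clinical record plus a list of specimen results; unmatched lists specimen
    results whose participant is missing from the registry.
    """
    by_participant = defaultdict(list)
    for row in specimen_rows:
        by_participant[row[key]].append(row)

    linked = {
        row[key]: {"clinical": row, "specimens": by_participant.get(row[key], [])}
        for row in registry_rows
    }
    unmatched = [row for row in specimen_rows if row[key] not in linked]
    return linked, unmatched


linked, unmatched = link_records(registry, specimen_results)
for pid, record in linked.items():
    print(pid, record["clinical"]["diagnosis"], len(record["specimens"]))
print("unmatched:", [row["participant_id"] for row in unmatched])
```

Reporting the unmatched specimen rows matters as much as the join itself: a specimen with no clinical record is exactly the gap this slide warns about.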
A user’s perspective on biobanking: three examples.
(2) Establishing a Sturge-Weber syndrome biobank
[Diagram: the genotype-to-phenotype framework, with GNAQ highlighted among gene1 … gene20,000]
• What information do we need to capture about cell
lines, brain, and skin samples?
• How do we relate genomic DNA sequence findings, RNA-seq,
and proteomics to the samples?
• What information do we need to capture about the
phenotypes as we collect samples at diverse sites?
A user’s perspective on biobanking: three examples.
(3) Discovering mosaic mutations in 9000 autism samples
We asked whether mosaic mutations occur in autism. By
applying to NIH, we obtained previously generated whole
exome sequence data on 9000 individuals from the Simons
Simplex Collection (SSC). We discovered that mosaic variation is
enriched in children with autism spectrum disorder.
To validate our findings, we applied to the Simons
Foundation and received approval to obtain DNA from a
Rutgers repository (http://www.rucdr.org/). We purchased
300 DNA samples and successfully validated our findings.
See PMID 27632392.
A user’s perspective on biobanking: three examples.
(3) Discovering mosaic mutations in 9000 autism samples
[Diagram: the genotype-to-phenotype framework for gene1 … gene20,000]
A user starts with genomics data…
…then purchases cell lines or DNA or brain chunks
for further studies…
Obtaining clinical phenotypes from the biobank is essential.
(1) Usefulness
• Diseases are considered rare when affecting 200,000
or fewer people (U.S. definition) or fewer than
1:2,000 people (European definition).
• There are ~6,800 rare diseases.
• Biobanks offer crucial resources to help solve the
causes of rare diseases—and to study diagnosis,
prevention, and treatment.
• Biobanks offer a range of cell and solid tissue types (e.g.
brain, heart, fibroblasts, lymphoblastoid cell
lines) and bodily fluids.
• Biobanks offer biospecimens from individuals,
pedigrees, and/or populations.
• Samples from biobanks are complemented by
phenotypic and genotypic data.
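The two definitions of "rare" use different units: an absolute count in the U.S. and a prevalence in Europe. The short sketch below puts them on the same scale. The population figure is an assumption introduced here for illustration and is not from the talk; the function names are likewise hypothetical.

```python
US_MAX_AFFECTED = 200_000        # "200,000 or fewer people" (U.S. definition on the slide)
EU_PREVALENCE_LIMIT = 1 / 2_000  # "fewer than 1:2,000 people" (European definition on the slide)


def is_rare_us(affected_in_us: int) -> bool:
    return affected_in_us <= US_MAX_AFFECTED


def is_rare_eu(prevalence: float) -> bool:
    return prevalence < EU_PREVALENCE_LIMIT


def one_in(prevalence: float) -> int:
    """Express a prevalence as the N in '1 in N'."""
    return round(1 / prevalence)


# Assumption: a U.S. population of roughly 325 million, used only for illustration.
ASSUMED_US_POPULATION = 325_000_000

us_limit_as_prevalence = US_MAX_AFFECTED / ASSUMED_US_POPULATION
print(f"U.S. limit is about 1 in {one_in(us_limit_as_prevalence):,}")

# Prevalences quoted in the Sturge-Weber example of this talk.
port_wine_birthmark = 1 / 300
sturge_weber_upper_bound = 1 / 20_000
print(is_rare_eu(port_wine_birthmark), is_rare_eu(sturge_weber_upper_bound))
```

Under that population assumption the two thresholds are of similar magnitude. A port-wine birthmark (about 1:300) falls outside the European definition, while Sturge-Weber syndrome (< 1:20,000) falls well within it.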
List of panelists
• Jonathan Pevsner, Professor, Dept. of Neurology, Kennedy Krieger
Institute, and Dept. of Psychiatry and Behavioral Sciences, Johns Hopkins Medicine.
Presentation title: Biobanking: a user’s perspective and an overview
• David van Enckevort, Project Manager BBMRI & RD-Connect, Department of
Genetics, University Medical Center Groningen (UMCG). Presentation title: “FAIR
(Findable, Accessible, Interoperable and Reusable) data and sample access”
• Manuel Posada de la Paz, Director, Research Institute for Rare Diseases
(Instituto de Investigación en Enfermedades Raras), a member of the EuroBioBank.
Presentation title: Rare diseases biological samples: small collections and research.
• Kerry Wiles, Program Director, VUMC Tissue Repository, CHTN (Cooperative
Human Tissue Network) Western Coordinator. Presentation title: An academic
prospective procurement repository: From Donor to Bench
• Jim Vaught, Editor-in-Chief, Biopreservation Journal, past President of the
International Society for Biological and Environmental Repositories (ISBER), on the
board of directors for ISBER and NDRI (National Disease Research Interchange).
Presentation title: “NIH and ISBER perspectives on specimen locators”
• Daniel Catchpoole, Director of Kids Research Institute, The Children's Hospital at
Westmead (Australia). Presentation title: The Australian experience, issues and
solution
(2) Examples of existing biobanks and biobank initiatives
Coriell Biorepository
The NIGMS collection has >11,000 cell lines and
~6,000 DNA samples.
https://catalog.coriell.org/
NIH NeuroBioBank
6 sites. The University of Maryland Brain & Tissue Bank
has distributed 35,000 tissue samples to >900
researchers.
https://neurobiobank.nih.gov/
Cooperative Human Tissue Network (CHTN)
Supported by the National Cancer Institute
https://www.chtn.org/
EuroBioBank
130,000 samples available; 13,000 collected and >7,000
samples distributed per year.
http://www.eurobiobank.org/
RD-Connect
“An integrated platform connecting databases,
registries, biobanks and clinical bioinformatics for rare
disease research.”
http://rd-connect.eu/
Research Institute for Rare Diseases
(Instituto de Investigación en Enfermedades Raras), a
member of the EuroBioBank.
http://www.eurobiobank.org/en/partners/description/isciii.htm
BBMRI-ERIC
Biobanking and Biomolecular Resources Research
Infrastructure, European Research Infrastructure
Consortium.
http://www.bbmri-eric.eu/BBMRI-ERIC/common-service-it/
Kids Research Institute, The Children's Hospital at
Westmead (Australia)
http://www.kidsresearch.org.au/our-facilities/bio-banks
National Disease Research Interchange (NDRI)
“The mission of NDRI is to provide human biospecimens to
advance biomedical/bioscience research and development
worldwide.”
http://ndriresource.org/
All of Us
“The All of Us Research Program seeks to extend
precision medicine to all diseases by building a national
research cohort of one million or more U.S. participants.”
It includes a biobank.
https://allofus.nih.gov/about/program-components
NIMH Repository and Genomics Resource (NIMH-RGR)
“…plays a key role in facilitating psychiatric genetic
research by providing a collection of over 150,000 well
characterized, high quality patient and control samples
from a wide-range of mental disorders.”
https://www.nimhgenetics.org/
(3) GUIDs: the importance of labels
“Accession numbers” are alphanumeric identifiers that
provide links to various kinds of data or records. For
example, NP_620258.1 is an accession number
corresponding to a protein sequence.
A GUID is a Global Unique Identifier that corresponds to
a study participant. The GUID facilitates tracking a participant’s
data across studies and locations and over time in a
deidentified manner.
Example 1: a participant was recruited twice (years apart)
to a single study.
Example 2: a trio was recruited into two separate autism
genome sequencing studies (one study excluded severe
phenotypes, one excluded mild phenotypes). The proband’s
phenotype had become severe over time.
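To make the idea concrete, here is a minimal sketch of how a stable, deidentified label could be derived. It is not the algorithm of any NIH GUID tool: the choice of input fields, the keyed-hash construction, the `make_guid` and `normalize` helpers, the key, and the participants are all assumptions made for illustration.

```python
import hashlib
import hmac
import unicodedata


def normalize(value: str) -> str:
    """Reduce formatting differences (case, accents, surrounding spaces) before hashing."""
    decomposed = unicodedata.normalize("NFKD", value.strip().lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def make_guid(first_name, last_name, birth_date, birth_city, secret_key: bytes) -> str:
    """Derive a stable pseudonymous identifier from identifying fields.

    The choice of fields and the keyed-hash construction are assumptions
    made for this sketch.
    """
    message = "|".join(normalize(v) for v in (first_name, last_name, birth_date, birth_city))
    digest = hmac.new(secret_key, message.encode("utf-8"), hashlib.sha256).hexdigest()
    return "GUID-" + digest[:16].upper()


# Hypothetical key and made-up participants, for illustration only.
KEY = b"example-key-not-for-real-use"

# Example 1: the same participant enrolled twice, years apart, with formatting differences.
first_visit = make_guid("Maria", "Lopez", "2001-04-09", "Baltimore", KEY)
second_visit = make_guid(" maria ", "LOPEZ", "2001-04-09", "baltimore", KEY)
other_person = make_guid("Mario", "Lopez", "2001-04-09", "Baltimore", KEY)

print(first_visit == second_visit)  # compares the two enrollments of one person
print(first_visit == other_person)  # compares two different people
```

Because the label depends only on the normalized fields and the key, a repeat enrollment (Example 1) or enrollment in a second study (Example 2) would map to the same identifier, and the duplicate can be detected without exchanging names.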
(4) Data science is integral to biobanking
Biobanking requires a series of tasks such as obtaining
biospecimens and associated metadata (e.g. phenotypic
data, cause of death, postmortem interval, cell culture
conditions, imaging data, genomics data).
Goals include effective communication, standardization
(e.g. of protocols), and an electronic portal to a
repository.
All this requires data science.
Biobanks must integrate diverse data types
[Diagram: the genotype-to-phenotype framework for gene1 … gene20,000, annotated with data types]
• Sequence data: genomic DNA (dbGaP), RNAseq
• Proteomics data, metabolomics
• Cell culture, biochemical, iPSC data
• Imaging data
• Phenotypic test data (e.g. neuropsychology)
• Epidemiology
(5) Standards
Biobanks implement Data Dictionaries to manage data
elements (and data structures) in a uniform manner. The
use of Common Data Elements is crucial.
NIH Common Data Elements (CDE) Repository
“designed to provide access to structured human and
machine-readable definitions of data elements.”
https://cde.nlm.nih.gov/home
Standards: Common Data Elements (CDE) Repository
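A data dictionary can be thought of as a machine-readable set of rules that every submitted record is checked against. The sketch below illustrates that idea under stated assumptions: the element names, types, and permitted values are invented here and are not taken from the NIH CDE Repository, and `validate` is a hypothetical helper.

```python
# A hypothetical data dictionary: element names, types, and permitted values
# are invented for this sketch and are not taken from the NIH CDE Repository.
DATA_DICTIONARY = {
    "specimen_type": {"type": str, "allowed": {"brain", "skin", "blood", "cell line"}},
    "postmortem_interval_hours": {"type": float, "min": 0.0},
    "age_at_collection_years": {"type": float, "min": 0.0},
}


def validate(record: dict) -> list:
    """Return a list of human-readable problems; an empty list means the record conforms."""
    problems = []
    for name, rule in DATA_DICTIONARY.items():
        if name not in record:
            problems.append(f"missing element: {name}")
            continue
        value = record[name]
        if not isinstance(value, rule["type"]):
            problems.append(f"{name}: expected {rule['type'].__name__}")
            continue
        if "allowed" in rule and value not in rule["allowed"]:
            problems.append(f"{name}: value {value!r} not permitted")
        if "min" in rule and value < rule["min"]:
            problems.append(f"{name}: below minimum {rule['min']}")
    # Elements that the dictionary does not define are reported as well.
    for name in record:
        if name not in DATA_DICTIONARY:
            problems.append(f"unknown element: {name}")
    return problems


good = {"specimen_type": "brain", "postmortem_interval_hours": 12.5,
        "age_at_collection_years": 34.0}
bad = {"specimen_type": "Brain tissue", "postmortem_interval_hours": -1.0, "pmi": 12}

print(validate(good))
for problem in validate(bad):
    print(problem)
```

Sites that collect samples independently tend to drift toward local names such as `pmi`; checking every record against one shared dictionary is what keeps data from diverse sites comparable.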
(6) Informed consent
Research studies (in contrast to clinical tests) are under
Institutional Review Board (IRB) jurisdiction. A
researcher must have a research protocol and one or more
consent forms approved.
Biobanks provide biospecimens that are sometimes in
the realm of human subjects research. Appropriate
consent forms must be administered for biospecimens
to be deposited in a biobank.
An emerging issue is obtaining appropriate consent for
DNA to be sequenced from biospecimens. Because of
the nature of contemporary sequencing, samples can no
longer be considered inherently deidentifiable.
(7) Needs and opportunities
We need resources and efforts such as the following:
• coordination of biobanking efforts across diverse
initiatives.
• awareness and adoption of community standards for
biobanking.
• flexibility to adapt to changing technologies (e.g.
sequencing technologies).
• integrated platforms and bioinformatics tools.

NIH Data Science Special Interest Group

  • 1. Biobanking: a user’s perspective and an overview Jonathan Pevsner, Ph.D. Professor, Dept. of Neurology Kennedy Krieger Institute and Johns Hopkins Medicine Chief Scientific Officer, Sturge-Weber Foundation pevsner@kennedykrieger.org Data Science Forum: NIH Data Science SIG Global Perspective on Biobanking and Access to Samples June 23, 2017
  • 2. Conflicts of interest I have no conflicts of interest.
  • 3. Outline From genotype to phenotype: a framework Three biobanking examples Postmortem brains from the NIH NeuroBioBank Establishing a rare disease biobank From a large genomics dataset to biobank samples Issues, lessons and principles 1. Usefulness 2. Existing biobanks 3. GUIDs: the importance of labels 4. Data science is integral to biobanking 5. Standards 6. Informed consent 7. Needs and opportunities
  • 4. The relationship between genotype and phenotype represents one of the most fundamental and challenging problems in biomedical science. Fundamental framework: genotype to phenotype Genotype Phenotype
  • 5. Fundamental framework: genotype to phenotype Genotype Phenotype DNA individual populationorgancellprotein We can provide a framework for this problem.
  • 6. Fundamental framework: genotype to phenotype Genotype Phenotype DNA individual populationorgancellprotein We can provide a framework for this problem. RNA pathways circuits
  • 7. Fundamental framework: genotype to phenotype Genotype Phenotype DNA individual populationorgancellprotein gene1 gene20,000 gene2 …
  • 8. Fundamental framework: genotype to phenotype Genotype Phenotype DNA individual populationorgancellprotein gene1 gene20,000 gene2 ABCD1 severe childhood disease (ALD) mild adult onset disease (AMN) apparently normal One gene mutation can have different phenotypic consequences: the same ABCD1 mutation may result in severe childhood-onset adrenoleukodystrophy (ALD), milder adult- onset adrenomyeloneuropathy (AMN), or no symptoms.
  • 9. Fundamental framework: genotype to phenotype Genotype Phenotype DNA individual populationorgancellprotein gene1 gene20,000 gene2 GNAQ melanocytes: uveal melanoma endothelial cells: Sturge-Weber blood: apparently normal One gene mutation can have different consequences: when and where mutations occur is crucial.
  • 10. Fundamental framework: genotype to phenotype Genotype Phenotype DNA individual populationorgancellprotein gene1 gene20,000 gene2 … one disease phenotype, multiple genetic contributors For almost all diseases (including common diseases such as autism or bipolar disorder) we search for multiple genetic variants that confer risk for a phenotype
  • 11. Outline From genotype to phenotype: a framework Three biobanking examples Postmortem brains from the NIH NeuroBioBank Establishing a rare disease biobank From a large genomics dataset to biobank samples Issues, lessons and principles 1. Usefulness 2. Existing biobanks 3. GUIDs: the importance of labels 4. Data science is integral to biobanking 5. Standards 6. Informed consent 7. Needs and opportunities
  • 12. A port-wine birthmark affects about 1:300 people. It varies in size and location. Sturge-Weber syndrome affects < 1:20,000 people. It affects a subset of individuals with a facial PW birthmark. A user’s perspective on biobanking: three examples. (1) Sturge-Weber syndrome and a brain bank
  • 13. A user’s perspective on biobanking: three examples. (1) Sturge-Weber syndrome and a brain bank DNA from blood (presumed unaffected) DNA from port- wine birthmark (presumed affected)
  • 14. A user’s perspective on biobanking: three examples. (1) Sturge-Weber syndrome and a brain bank DNA from blood (presumed unaffected) DNA from port- wine birthmark (presumed affected) sequence the genome sequence the genome compare We identified a mosaic mutation in GNAQ as causing Sturge- Weber syndrome and port-wine birthmarks (NEJM, PMID 23656586).We analyzed samples from 3 individual patients.
  • 15. A user’s perspective on biobanking: three examples. (1) Sturge-Weber syndrome and a brain bank Genotype Phenotype DNA individual populationorgancellprotein gene1 gene20,000 gene2 GNAQ After finding the GNAQ mutation we turned to the NIH NeuroBioBank at the University of Maryland.We obtained 97 samples to validate our findings.The availability of these samples from a biobank was crucial!
  • 16. A user’s perspective on biobanking: three examples. (2) Establishing a Sturge-Weber syndrome biobank I am Chief Scientific Officer of the Sturge-Weber Foundation.We need to create (or join) a biobank. • Patients and families tell me “I want to donate my brain and body to science. Can you help?”What’s the plan; and are there informed consent issues? • Scientists have discovered that the GNAQ mutation occurs primarily in endothelial cells, and cell lines have been established from brain biopsies. How can researchers share and access these cell lines? • Are there standards that we should follow in describing the genotype and phenotype of Sturge-Weber syndrome samples and patients? • Have these problems been addressed by those studying related diseases?
  • 17. A user’s perspective on biobanking: three examples. (2) Establishing a Sturge-Weber syndrome biobank Genotype Phenotype DNA individual populationorgancellprotein gene1 gene20,000 gene2 GNAQ It’s important to link clinical data (e.g. from a patient registry) with data generated from biospecimens!
  • 18. A user’s perspective on biobanking: three examples. (2) Establishing a Sturge-Weber syndrome biobank Genotype Phenotype DNA individual populationorgancellprotein gene1 gene20,000 gene2 GNAQ What information do we need to capture about cell lines, brain, and skin samples?
  • 19. A user’s perspective on biobanking: three examples. (2) Establishing a Sturge-Weber syndrome biobank Genotype Phenotype DNA individual populationorgancellprotein gene1 gene20,000 gene2 GNAQ What information do we need to capture about cell lines, brain, and skin samples? How do we relate genomic DNA sequence findings, RNA-seq, proteomics to the samples?
  • 20. A user’s perspective on biobanking: three examples. (2) Establishing a Sturge-Weber syndrome biobank Genotype Phenotype DNA individual populationorgancellprotein gene1 gene20,000 gene2 GNAQ What information do we need to capture about cell lines, brain, and skin samples? What information do we need to capture about the phenotypes as we collect samples at diverse sites? How do we relate genomic DNA sequence findings, RNA-seq, proteomics to the samples?
  • 21. A user’s perspective on biobanking: three examples. (3) Discovering mosaic mutations in 9000 autism samples We asked whether mosaic mutations occur in autism. By applying to NIH we obtained previously generated whole exome sequence data on 9000 individuals via the Simons Simplex Collection (SSC).We discovered mosaic variation is enriched in children with autism spectrum disorder. To validate our findings, we applied to the Simons Foundation and received approval to obtain DNA from a Rutgers repository (http://www.rucdr.org/).We purchased 300 DNA samples and successfully validated our findings. See PMID 27632392:
  • 22. A user’s perspective on biobanking: three examples. (3) Discovering mosaic mutations in 9000 autism samples Genotype Phenotype DNA individual populationorgancellprotein gene1 gene20,000 gene2 … A user starts with genomics data…
  • 23. A user’s perspective on biobanking: three examples. (3) Discovering mosaic mutations in 9000 autism samples Genotype Phenotype DNA individual populationorgancellprotein gene1 gene20,000 gene2 … …then purchases cell lines or DNA or brain chunks for further studies… A user starts with genomics data…
  • 24. A user’s perspective on biobanking: three examples. (3) Discovering mosaic mutations in 9000 autism samples Genotype Phenotype DNA individual populationorgancellprotein gene1 gene20,000 gene2 … …then purchases cell lines or DNA or brain chunks for further studies… Obtaining clinical phenotypes from the biobank is essential. A user starts with genomics data…
  • 25. Outline From genotype to phenotype: a framework Three biobanking examples Postmortem brains from the NIH NeuroBioBank Establishing a rare disease biobank From a large genomics dataset to biobank samples Issues, lessons and principles 1. Usefulness 2. Existing biobanks 3. GUIDs: the importance of labels 4. Data science is integral to biobanking 5. Standards 6. Informed consent 7. Needs and opportunities
  • 26. (1) Usefulness • Diseases are considered rare when affecting 200,000 or fewer people (U.S. definition) or fewer than 1:2,000 people (European definition). • There are ~6,800 rare diseases. • Biobanks offer crucial resources to help solve the causes of rare diseases—and to study diagnosis, prevention, and treatment. • Biobanks offer a range of cell, solid tissue types (e.g. brain, heart, fibroblasts, lymphoblastoid cell lines) and bodily fluids. • Biobanks offer biospecimens from individuals, pedigrees, and/or populations. • Samples from biobanks are complemented by phenotypic and genotypic data.
  • 27. List of panelists  Jonathan Pevsner, Professor, at the Dept. of Neurology, Kennedy Krieger Institute. Presentation title: Biobanking user’s perspective and an overview Dept. of Psychiatry and Behavioral Sciences, Johns Hopkins Medicine  David van Enckevort, Project Manager BBMRI & RD-Connect,Department of Genetics, University Medical Center Groningen (UMCG). Presentation title: “FAIR (Findable, Accessible, Interoperable and Reusable) data and sample access “  Manuel Posada de la Paz. Director, Research Institute for Rare Diseases (Instituto de Investigación en Enfermedades Raras), a member of the EuroBioBank. Presentation title: Rare diseases biological samples: small collections and research.  Kerry Wiles, Program Director- VUMC Tissue Repository, CHTN (Cooperative Human Tissue Network) Western Coordinator. Presentation title: An academic prospective procurement repository: From Donor to Bench  Jim Vaught Editor-in-Chief, Biopreservation Journal, past President of the International Society for Biological and Environmental Repositories (ISBER), on the board of directors for ISBER and NDRI (National Disease Research Interchange), Presentation title: "NIH and ISBER perspectives on specimen locators"  Daniel Catchpoole Director of Kids Research Institute, The Children's Hospital at Westmead (Australia). Presentation title: The Australian experience, issues and solution
  • 28. (2) Examples of existing biobanks and biobank initiatives • Coriell Biorepository: The NIGMS collection has >11,000 cell lines and ~6,000 DNA samples. https://catalog.coriell.org/ • NIH NeuroBioBank: 6 sites. The University of Maryland Brain & Tissue Bank has distributed 35,000 tissue samples to >900 researchers. https://neurobiobank.nih.gov/ • Cooperative Human Tissue Network (CHTN): Supported by the National Cancer Institute. https://www.chtn.org/
  • 29. (2) Examples of existing biobanks and biobank initiatives • EuroBioBank: 130,000 samples available; 13,000 collected and >7,000 samples distributed per year. http://www.eurobiobank.org/ • RD-Connect: “An integrated platform connecting databases, registries, biobanks and clinical bioinformatics for rare disease research.” http://rd-connect.eu/ • Research Institute for Rare Diseases (Instituto de Investigación en Enfermedades Raras), a member of the EuroBioBank. http://www.eurobiobank.org/en/partners/description/isciii.htm
  • 30. (2) Examples of existing biobanks and biobank initiatives • BBMRI-ERIC: Biobanking and Biomolecular Resources Research Infrastructure - European Research Infrastructure Consortium. http://www.bbmri-eric.eu/BBMRI-ERIC/common-service-it/ • Kids Research Institute, The Children's Hospital at Westmead (Australia). http://www.kidsresearch.org.au/our-facilities/bio-banks • National Disease Research Interchange (NDRI): “The mission of NDRI is to provide human biospecimens to advance biomedical/bioscience research and development worldwide.” http://ndriresource.org/
  • 31. (2) Examples of existing biobanks and biobank initiatives • All of Us: “The All of Us Research Program seeks to extend precision medicine to all diseases by building a national research cohort of one million or more U.S. participants.” It includes a biobank. https://allofus.nih.gov/about/program-components • NIMH Repository and Genomics Resource (NIMH-RGR): “…plays a key role in facilitating psychiatric genetic research by providing a collection of over 150,000 well characterized, high quality patient and control samples from a wide-range of mental disorders.” https://www.nimhgenetics.org/
  • 32. (3) GUIDs: the importance of labels “Accession numbers” are alphanumeric strings that provide links to various kinds of data or records. For example, NP_620258.1 is an accession number corresponding to a protein sequence. A GUID is a Global Unique Identifier that corresponds to a study participant. The GUID facilitates tracking a participant’s data across studies and locations and over time in a deidentified manner. Example 1: a participant was recruited twice (years apart) to a single study. Example 2: a trio was recruited into two separate autism genome sequencing studies (one study excluded severe phenotypes, one excluded mild phenotypes). The proband’s phenotype had become severe over time.
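The two examples on this slide both depend on the same participant receiving the same label every time they are enrolled. One way to picture that property is a keyed one-way hash of normalized identifying fields. The sketch below is only an illustration under stated assumptions: the function names, the choice of fields, the `GUID-` prefix, and the secret key are all hypothetical, and this is not the algorithm used by any NIH GUID tool.

```python
import hashlib
import hmac
import unicodedata

# Hypothetical secret held by a central GUID service (an assumption for this
# sketch); individual studies would never see it.
SECRET_KEY = b"example-key-for-illustration-only"


def normalize(value: str) -> str:
    """Lowercase, trim, and strip accents so trivial spelling
    differences do not produce a different identifier."""
    decomposed = unicodedata.normalize("NFKD", value.strip().lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def make_guid(first_name: str, last_name: str,
              birth_date: str, birth_city: str) -> str:
    """Derive a deidentified, repeatable label from identifying fields.

    The same person yields the same label in every study, but the label
    cannot be reversed to recover the fields without the secret key.
    """
    fields = [normalize(f) for f in (first_name, last_name, birth_date, birth_city)]
    message = "|".join(fields).encode("utf-8")
    digest = hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()
    return "GUID-" + digest[:12].upper()


# Example 1 from the slide: one participant enrolled twice, years apart,
# with slightly different data entry. Both enrollments get the same label.
first_visit = make_guid("José", "García", "2004-03-15", "Baltimore")
second_visit = make_guid(" jose ", "GARCIA", "2004-03-15", "baltimore")
same_participant = first_visit == second_visit
```

A keyed hash is used here rather than a plain hash so that someone who knows a person's name and birth date cannot recompute the label on their own; a production system would also need to handle typos, name changes, and collisions, which this sketch ignores.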
  • 33. (4) Data science is integral to biobanking Biobanking requires a series of tasks such as obtaining biospecimens and associated metadata (e.g. phenotypic data, cause of death, postmortem interval, cell culture conditions, imaging data, genomics data). Goals include effective communication, standardization (e.g. of protocols), and an electronic portal to a repository. All this requires data science.
  • 34. Biobanks must integrate diverse data types [Diagram: the genotype-to-phenotype framework, spanning DNA, protein, cell, organ, individual, and population, for gene1 through gene20,000] • Sequence data: genomic DNA (dbGaP), RNAseq • Proteomics data, metabolomics • Imaging data • Phenotypic test data (e.g. neuropsychology) • Cell culture, biochemical, iPSC data • Epidemiology
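As a rough illustration of what integrating these data types can mean in practice, the sketch below groups records from several sources into one profile per participant, using the participant's GUID as the shared key. The source names, field names, and sample values are all hypothetical, and a real biobank would use a database rather than in-memory dictionaries.

```python
from collections import defaultdict


def integrate(sources):
    """Merge per-source records into one profile per GUID.

    `sources` maps a source name (e.g. "sequencing") to a list of records,
    each of which is a dict containing a "guid" key.
    """
    profiles = defaultdict(dict)
    for source_name, records in sources.items():
        for record in records:
            guid = record["guid"]
            # Keep every non-key field, filed under the source it came from.
            payload = {k: v for k, v in record.items() if k != "guid"}
            profiles[guid].setdefault(source_name, []).append(payload)
    return dict(profiles)


# Hypothetical records for two participants across three data types.
example_sources = {
    "sequencing": [
        {"guid": "GUID-A", "assay": "RNAseq"},
        {"guid": "GUID-B", "assay": "genomic DNA"},
    ],
    "imaging": [
        {"guid": "GUID-A", "modality": "MRI"},
    ],
    "phenotype": [
        {"guid": "GUID-A", "test": "neuropsychology"},
        {"guid": "GUID-B", "test": "neuropsychology"},
    ],
}

example_profiles = integrate(example_sources)
```

Filing each record under its source keeps provenance visible, so a user can see at a glance which data types exist for a given participant and which are missing.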
  • 35. (5) Standards Biobanks implement Data Dictionaries to manage data elements (and data structures) in a uniform manner. The use of Common Data Elements is crucial. NIH Common Data Elements (CDE) Repository: “designed to provide access to structured human and machine-readable definitions of data elements.” https://cde.nlm.nih.gov/home
  • 36. Standards: Common Data Elements (CDE) Repository
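To make the idea of a data dictionary concrete, here is a minimal sketch of checking a specimen record against one. The element names, allowed values, and rules are invented for illustration and are not taken from the NIH CDE Repository.

```python
# Hypothetical data dictionary: each element has a type, a required flag,
# and optional constraints. None of these definitions are real CDEs.
DATA_DICTIONARY = {
    "guid": {"type": str, "required": True},
    "tissue_type": {
        "type": str,
        "required": True,
        "allowed": {"brain", "heart", "fibroblast", "lymphoblastoid cell line"},
    },
    "postmortem_interval_hours": {"type": (int, float), "required": False, "min": 0},
}


def validate(record):
    """Return a list of problems found in `record`; empty means it conforms."""
    errors = []
    for name, rule in DATA_DICTIONARY.items():
        if name not in record:
            if rule["required"]:
                errors.append(f"{name}: missing required element")
            continue
        value = record[name]
        if not isinstance(value, rule["type"]):
            errors.append(f"{name}: wrong type {type(value).__name__}")
            continue
        if "allowed" in rule and value not in rule["allowed"]:
            errors.append(f"{name}: value {value!r} is not permitted")
        if "min" in rule and value < rule["min"]:
            errors.append(f"{name}: value {value!r} is below {rule['min']}")
    # Flag elements that the dictionary does not define at all.
    for name in record:
        if name not in DATA_DICTIONARY:
            errors.append(f"{name}: not defined in the data dictionary")
    return errors


good_record = {"guid": "GUID-A", "tissue_type": "brain", "postmortem_interval_hours": 12.5}
bad_record = {"tissue_type": "Brain tissue", "pmi": 12}

good_errors = validate(good_record)
bad_errors = validate(bad_record)
```

Rejecting undefined elements such as `pmi` is a deliberate choice in this sketch: it forces contributors to use the shared element name instead of a local abbreviation, which is the point of common data elements.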
  • 37. (6) Informed consent Research studies (in contrast to clinical tests) are under Institutional Review Board (IRB) jurisdiction. A researcher must have an approved research protocol and one or more consent forms. Biobanks provide biospecimens that are sometimes in the realm of human subjects research. Appropriate consent forms must be administered for biospecimens to be deposited in a biobank. An emerging issue is obtaining appropriate consent for DNA to be sequenced from biospecimens. Because of the nature of contemporary sequencing, samples are no longer inherently deidentifiable.
  • 38. (7) Needs and opportunities We need resources and efforts such as the following: • coordination of biobanking efforts across diverse initiatives. • awareness and adoption of community standards for biobanking. • flexibility to adapt to changing technologies (e.g. sequencing technologies). • integrated platforms and bioinformatics tools.