“Quantifying The Dynamics of Your Superorganism Body 
Using Big Data Supercomputing” 
2014-15 Distinguished Lecturer Series 
Computer Science and Engineering Department 
University of Washington 
Seattle, WA 
October 9, 2014 
Dr. Larry Smarr 
Director, California Institute for Telecommunications and Information Technology 
Harry E. Gruber Professor, 
Dept. of Computer Science and Engineering 
Jacobs School of Engineering, UCSD 
http://lsmarr.calit2.net
Abstract 
As a member of Lee Hood's 100 Person Wellness Project, headquartered at Seattle's Institute 
for Systems Biology, I am engaged in experiments to read out the time-varying state of a 
complex dynamical system - my human body. However, the human body is host to 100 trillion 
microorganisms, ten times the number of cells in the human body, and these microbes contain 
100 times the number of DNA genes that our human DNA does. The microbial component of 
this "superorganism" is composed of hundreds of species spread over many taxonomic phyla. 
The human immune system is tightly coupled with this microbial ecology and in cases of 
autoimmune disease, both the immune system and the microbial ecology can have dynamic 
excursions far from normal. To provide a deeper context for the microbiome results from the 
100 Person Wellness Project, I have been exploring the variation in the microbiome ecology 
across healthy and chronically ill populations. Our research starts with trillions of DNA bases, 
produced by Illumina Next Generation sequencers, of the human gut microbial DNA taken from 
my own body over time, as well as from hundreds of people sequenced under the NIH Human 
Microbiome Project. To decode the details of the microbial ecology we feed this data into 
parallel supercomputers, running sophisticated bioinformatics software pipelines. We then use 
Calit2/SDSC designed Big Data PCs to manage the data and drive innovative scalable 
visualization systems to examine the complexities of the changing human gut microbial 
ecology in health and disease. I will show how advanced data analytics tools find patterns in 
the resulting microbial distribution data that suggest new hypotheses for clinical application.
Calit2 Has Had a Vision of 
“the Digital Transformation of Health” for a Decade 
• Next Step—Putting You On-Line! 
www.bodymedia.com 
– Wireless Internet Transmission 
– Key Metabolic and Physical Variables 
– Model -- Dozens of Processors and 60 Sensors / 
Actuators Inside of our Cars 
• Post-Genomic Individualized Medicine 
– Combine 
– Genetic Code 
– Body Data Flow 
– Use Powerful AI Data Mining Techniques 
The Content of This Slide from 2001 Larry Smarr 
Calit2 Talk on Digitally Enabled Genomic Medicine
My Decade Long Journey to Being a Quantified Self: 
By Measuring the State of My Body and “Tuning” It 
Using Nutrition and Exercise, I Became Healthier 
I Arrived in La Jolla in 2000 After 20 Years in the Midwest 
[Figure: photos labeled 1989 (Age 41), 1999 (Age 51), 2010 (Age 61); other labels: 1999, 2000] 
I Reversed My Body’s Decline By 
Quantifying and Altering 
Nutrition, Exercise, Sleep, and Stress 
http://lsmarr.calit2.net/repository/LS_reading_recommendations_FiRe_2011.pdf
From One to a Billion Data Points Defining Me: 
The Exponential Rise in Body Data in Just One Decade 
Billion: My Full DNA, 
MRI/CT Images 
Million: My DNA SNPs, 
Zeo, FitBit 
Hundred: My Blood Variables 
One: My Weight 
Weight 
Blood 
Variables 
SNPs 
Microbial Genome 
Improving Body 
Discovering Disease
Early Adopting MDs Are Creating Partnerships 
with Their Quantified Patients 
• “The 100 participants will be guided on this 9-month 
journey by a coach and when necessary, 
be referred to their own health care practitioners.” 
• The data sets that will be evaluated include: 
– Self-Tracking Devices 
– Medical History, Traits, Lifestyle 
– Blood, Urine, Saliva 
– Gut Microbiome 
– Whole Genome Sequencing 
Will Grow to 1000, then 10,000 
There are 8760 Hours in a Year 
One of These Hours You Are With a Doctor… 
The Other 8759 Hours Are Up to You! 
https://pioneer100.systemsbiology.net/
Visualizing Time Series of 
150 LS Blood and Stool Variables, Each Over 5-10 Years 
Calit2 64 megapixel VROOM
Only One of My Blood Measurements 
Was Far Out of Range--Indicating Chronic Inflammation 
Episodic Peaks in Inflammation 
Followed by Spontaneous Drops 
Normal Range 
<1 mg/L 
27x Upper Limit 
Normal 
C-Reactive Protein (CRP) is a Blood Biomarker 
for Detecting Presence of Inflammation
Adding Stool Tests Revealed 
Oscillatory Behavior in an Immune Variable 
Typical 
Lactoferrin 
Value for 
Active 
Inflammatory 
Bowel Disease 
(IBD) 
Normal Range 
<7.3 μg/mL 
124x Upper Limit 
Hypothesis: Lactoferrin Oscillations 
Coupled to Relative Abundance 
of Microbes that Require Iron 
Antibiotics 
Antibiotics 
Lactoferrin is a Protein Shed from Neutrophils - 
An Antibacterial that Sequesters Iron
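
The "x Upper Limit" figures quoted for lactoferrin here and for CRP on the previous slide are ratios of a measured value to the top of the normal range. A minimal Python sketch of that arithmetic follows; the function names `fold_over_limit` and `implied_value` are hypothetical, and the peak values are back-computed from the stated multiples rather than taken from lab reports.

```python
def fold_over_limit(value, upper_limit):
    """Return how many times a measurement exceeds the top of its normal range."""
    if upper_limit <= 0:
        raise ValueError("upper_limit must be positive")
    return value / upper_limit


def implied_value(fold, upper_limit):
    """Back out the measurement implied by a stated fold over the upper limit."""
    return fold * upper_limit


# Lactoferrin: normal range < 7.3 ug/mL, peak stated as 124x the upper limit.
lactoferrin_peak = implied_value(124, 7.3)   # roughly 905 ug/mL

# CRP: normal range < 1 mg/L, peak stated as 27x the upper limit.
crp_peak = implied_value(27, 1.0)            # 27 mg/L

print(round(lactoferrin_peak, 1), crp_peak)
```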
Confirming the IBD (Crohn’s) Hypothesis: 
Finding the “Smoking Gun” with MRI Imaging 
I Obtained the MRI Slices 
From UCSD Medical Services 
and Converted to Interactive 3D 
Descending Colon 
Sigmoid Colon 
Threading Iliac Arteries 
Major Kink 
Working With 
Calit2 Staff & DeskVOX Software 
Transverse Colon 
Liver 
Small Intestine 
Diseased Sigmoid Colon 
MRI Jan 2012 
Cross Section
Why Did I Have an Autoimmune Disease like IBD? 
Despite decades of research, 
the etiology of Crohn's disease 
remains unknown. 
Its pathogenesis may involve 
a complex interplay between 
host genetics, 
immune dysfunction, 
and microbial or environmental factors. 
--The Role of Microbes in Crohn's Disease 
So I Set Out to Quantify All Three! 
Paul B. Eckburg & David A. Relman 
Clin Infect Dis. 44:256-262 (2007)
The Cost of Sequencing a Human Genome 
Has Fallen Over 10,000x in the Last Ten Years 
This Has Enabled Sequencing of 
Both Human and Microbial Genomes
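
To put the 10,000x figure in perspective, it can be converted into an average yearly factor and a halving time. This sketch assumes a constant exponential decline over the ten years, which is a simplification (the real cost curve was not smooth); the function names are hypothetical.

```python
import math


def annual_fold_drop(total_fold, years):
    """Constant yearly factor that compounds to total_fold over the given years."""
    return total_fold ** (1.0 / years)


def halving_time_years(total_fold, years):
    """Time for the cost to halve, assuming a constant exponential decline."""
    return years / math.log2(total_fold)


per_year = annual_fold_drop(10_000, 10)     # about 2.5x cheaper each year
halving = halving_time_years(10_000, 10)    # about 0.75 years, roughly 9 months

print(f"{per_year:.2f}x per year, halving every {halving:.2f} years")
```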
Inclusion of the Microbiome 
Will Radically Change Medicine and Wellness 
Your Body Has 10 Times 
As Many Microbe Cells As Human Cells 
99% of Your 
DNA Genes 
Are in Microbe Cells 
Not Human Cells 
I Will Focus on the Human Gut Microbiome, 
Which Contains Hundreds of Microbial Species
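
The "99%" figure follows from the ratio given in the abstract: if microbes carry 100 genes for every human gene, their share of the total is 100/101. The same formula applied to the 10:1 cell ratio gives about 91%. A short sketch, with a hypothetical function name:

```python
def microbial_fraction(ratio_microbe_to_human):
    """Fraction of the total that is microbial, given a microbe:human ratio."""
    return ratio_microbe_to_human / (ratio_microbe_to_human + 1.0)


cells = microbial_fraction(10)    # 10 microbial cells per human cell
genes = microbial_fraction(100)   # 100 microbial genes per human gene (from the abstract)

print(f"cells: {cells:.1%}, genes: {genes:.1%}")
```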
When We Think About Biological Diversity 
We Typically Think of the Wide Range of Animals 
But All These Animals Are in One SubPhylum Vertebrata 
of the Chordata Phylum 
All images from Wikimedia Commons. 
Photos are public domain or by Trisha Shears & Richard Bartz
Think of These Phyla of Animals When 
You Consider the Biodiversity of Microbes Inside You 
Phylum Annelida 
Phylum Echinodermata 
Phylum Cnidaria 
Phylum Mollusca 
Phylum Arthropoda 
Phylum Chordata 
All images from Wikimedia Commons. 
Photos are public domain or by Dan Hershman, Michael Linnenbach, Manuae, B_cool
However, The Evolutionary Distance Between Your Gut Microbes 
Is Much Greater Than Between All Animals 
Green Circles Are 
Human Gut Microbes 
Source: Carl Woese, et al 
Last Slide 
Evolutionary Distance Derived from 
Comparative Sequencing of 16S or 18S Ribosomal RNA
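
One simple way to turn aligned ribosomal RNA sequences into an evolutionary distance is to count the fraction of differing positions and apply the Jukes-Cantor correction for repeated substitutions. The sketch below illustrates that idea only; it is not claimed to be the method behind the tree on this slide, and the two sequences are made up.

```python
import math


def p_distance(seq_a, seq_b):
    """Fraction of aligned, ungapped positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no ungapped positions to compare")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor(p):
    """Jukes-Cantor (JC69) corrected distance; defined only for p < 0.75."""
    if p >= 0.75:
        raise ValueError("sequences too divergent for the JC69 correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Made-up aligned fragments, for illustration only (not real rRNA data).
a = "ACGTACGTACGTACGTACGT"
b = "ACGTTCGTACGAACGTACGT"

p = p_distance(a, b)
print(p, round(jukes_cantor(p), 4))
```

The corrected distance is always at least as large as the raw fraction, because it accounts for sites that may have changed more than once.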
A Year of Sequencing a Healthy Gut Microbiome Daily - 
Remarkable Stability with Abrupt Changes 
Days 
Genome Biology (2014) 
David, et al.
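
"Stability with abrupt changes" can be made concrete by measuring how much the community composition changes from one day to the next. The sketch below uses Bray-Curtis dissimilarity between consecutive daily samples and flags days that exceed a threshold. The data, the threshold, and the function names are all hypothetical, and this is not presented as the analysis used in the cited paper.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors (0 = identical, 1 = disjoint)."""
    total = sum(u) + sum(v)
    if total == 0:
        raise ValueError("both samples are empty")
    return sum(abs(a - b) for a, b in zip(u, v)) / total


def abrupt_shifts(series, threshold):
    """Indices i where day i differs from day i-1 by more than the threshold."""
    return [i for i in range(1, len(series))
            if bray_curtis(series[i - 1], series[i]) > threshold]


# Made-up relative abundances for three taxa over five days (illustration only).
days = [
    [0.60, 0.30, 0.10],
    [0.58, 0.32, 0.10],
    [0.61, 0.29, 0.10],
    [0.20, 0.30, 0.50],   # abrupt change relative to the previous day
    [0.22, 0.29, 0.49],
]

print(abrupt_shifts(days, threshold=0.2))
```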
To Map Out the Dynamics of My Microbiome Ecology 
I Partnered with the J. Craig Venter Institute 
• JCVI Did Metagenomic 
Sequencing on Seven of My 
Stool Samples Over 1.5 Years 
• Sequencing on 
Illumina HiSeq 2000 
– Generates 100bp Reads 
– Run Takes ~14 Days 
– My 7 Samples Produced >200 Gbp of Data 
• JCVI Lab Manager, 
Genomic Medicine 
– Manolito Torralba 
• IRB PI Karen Nelson 
– President JCVI 
Illumina HiSeq 2000 at JCVI 
Manolito Torralba, JCVI Karen Nelson, JCVI
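
Read counts and base counts are related by the read length, so the ">200 Gbp" figure implies a minimum number of 100 bp reads. The same relation links the 27 billion reads and 2.7 trillion bases quoted on the next slide. A small sketch with hypothetical function names:

```python
def reads_from_bases(total_bases, read_length):
    """Number of fixed-length reads needed to account for total_bases."""
    return total_bases // read_length


def bases_from_reads(n_reads, read_length):
    """Total bases produced by n_reads reads of a fixed length."""
    return n_reads * read_length


# ">200 Gbp" is a lower bound, so this is a minimum read count.
print(f"{reads_from_bases(200 * 10**9, 100):,} reads at minimum")

# 27 billion reads of 100 bases each.
print(f"{bases_from_reads(27 * 10**9, 100):,} bases")
```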
We Expanded Our Healthy Cohort to All Gut Microbiomes 
from NIH HMP For Comparative Analysis 
Each Sample Has 100-200 Million Illumina Short Reads (100 bases) 
IBD Patients 
2 Ulcerative Colitis Patients, 6 Points in Time 
5 Ileal Crohn’s Patients, 3 Points in Time 
“Healthy” Individuals 
250 Subjects, 1 Point in Time 
Larry Smarr 
7 Points in Time 
Total of 27 Billion Reads 
Or 2.7 Trillion Bases 
Source: Jerry Sheehan, Calit2; Weizhong Li, Sitao Wu, CRBS, UCSD
We Created a Reference Database 
Of Known Gut Genomes 
• NCBI April 2013 
– 2471 Complete + 5543 Draft Bacteria & Archaea Genomes 
– 2399 Complete Virus Genomes 
– 26 Complete Fungi Genomes 
– 309 HMP Eukaryote Reference Genomes 
• Total 10,741 genomes, ~30 GB of sequences 
Now to Align Our 27 Billion Reads 
Against the Reference Database 
Source: Weizhong Li, Sitao Wu, CRBS, UCSD
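
The idea of matching short reads to a reference database can be illustrated with a toy k-mer index: every length-k substring of each reference is indexed, and a read is assigned to the reference that shares the most k-mers with it. This is a deliberately simplified stand-in, not the aligner used in the actual pipeline (the slides do not name it), and the "genomes" and reads below are made up.

```python
from collections import Counter, defaultdict


def build_kmer_index(references, k):
    """Map every k-mer to the set of reference names that contain it."""
    index = defaultdict(set)
    for name, seq in references.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(name)
    return index


def assign_read(read, index, k):
    """Return the reference sharing the most k-mers with the read, or None."""
    votes = Counter()
    for i in range(len(read) - k + 1):
        for name in index.get(read[i:i + k], ()):
            votes[name] += 1
    if not votes:
        return None
    return votes.most_common(1)[0][0]


# Tiny made-up "genomes" and reads, for illustration only.
references = {
    "genome_A": "ACGTACGTTAGCCGATAGGCTA",
    "genome_B": "TTGACCAGTGGCATCAAGTCCA",
}
index = build_kmer_index(references, k=8)
reads = ["CGTTAGCCGATA", "CAGTGGCATCAA", "GGGGGGGGGGGG"]

print([assign_read(r, index, k=8) for r in reads])
```

Counting how many reads land on each reference is what later yields a relative abundance per organism. Real aligners also tolerate mismatches and resolve reads that match several genomes, which this sketch does not attempt.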
Computational NextGen Sequencing Pipeline: 
From “Big Equations” to “Big Data” Computing 
PI: Weizhong Li, CRBS, UCSD 
NIH R01HG005978 (2010-2013, $1.1M)
We Used SDSC’s Gordon Data-Intensive Supercomputer 
to Analyze a Wide Range of Gut Microbiomes 
Enabled by 
a Grant of Time 
on Gordon from SDSC 
Director Mike Norman 
Source: Weizhong Li, Sitao Wu, CRBS, UCSD 
Our Team Used 25 CPU-Years 
To Compute 
the Comparative Gut Microbiome 
of My Time Samples 
and Our Healthy and IBD Controls 
Starting With 
the 5 Billion Illumina Reads 
Received from JCVI
We Used Dell’s HPC Cloud to Analyze 
All of Our Human Gut Microbiomes 
• Dell’s Sanger Cluster 
– 32 Nodes, 512 Cores 
– 48GB RAM per Node 
• We Processed the Taxonomic Relative Abundance 
– Used ~35,000 Core-Hours on Dell’s Sanger 
• Produced Relative Abundance of 
~10,000 Bacteria, Archaea, Viruses in ~300 People 
– ~3 Million Spreadsheet Cells 
• New System: R Bio-Gen System 
– 48 Nodes, 768 Cores 
– 128 GB RAM per Node 
Source: Weizhong Li, UCSD
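
Two of the numbers on this slide can be checked with simple arithmetic: the size of the abundance table (organisms times people) and the shortest possible elapsed time for 35,000 core-hours on 512 cores. The sketch assumes perfect parallel efficiency, which real jobs do not reach, so the elapsed time is a lower bound; the function names are hypothetical.

```python
def ideal_wall_clock_hours(core_hours, cores):
    """Lower bound on elapsed time, assuming every core is busy the whole time."""
    return core_hours / cores


def table_cells(n_taxa, n_people):
    """Number of cells in a taxa-by-people relative abundance table."""
    return n_taxa * n_people


print(f"{ideal_wall_clock_hours(35_000, 512):.1f} hours on 512 cores")
print(f"{table_cells(10_000, 300):,} cells")
```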
Using Scalable Visualization Allows Comparison of 
the Relative Abundance of 200 Microbe Species 
Comparing 3 LS Time Snapshots (Left) 
with Healthy, Crohn’s, UC (Right Top to Bottom) 
Calit2 VROOM-FuturePatient Expedition
Using Microbiome Profiles to Survey 155 Subjects 
for Unhealthy Candidates
Bacteroidetes and Firmicutes Phyla Dominate 
“Healthy” Subjects in the Pioneer 100 Gut Microbiomes 
A Few With High % 
Proteobacteria 
or Verrucomicrobia
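
Phylum-level profiles like these come from summing per-species read counts within each phylum and normalizing by the sample total. The sketch below shows that roll-up and a simple flag for an unusually high phylum share. The read counts and the 10% threshold are invented for illustration, and the function names are hypothetical.

```python
from collections import defaultdict


def phylum_relative_abundance(species_counts, species_to_phylum):
    """Sum read counts per phylum and normalize so the fractions add to 1."""
    totals = defaultdict(float)
    for species, count in species_counts.items():
        totals[species_to_phylum[species]] += count
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("sample has no assigned reads")
    return {phylum: n / grand_total for phylum, n in totals.items()}


def flag_high(abundance, phylum, threshold):
    """True if the phylum's share of the sample exceeds the threshold."""
    return abundance.get(phylum, 0.0) > threshold


species_to_phylum = {
    "Bacteroides vulgatus": "Bacteroidetes",
    "Faecalibacterium prausnitzii": "Firmicutes",
    "Escherichia coli": "Proteobacteria",
    "Akkermansia muciniphila": "Verrucomicrobia",
}

# Made-up read counts for one sample (illustration only).
sample = {
    "Bacteroides vulgatus": 500,
    "Faecalibacterium prausnitzii": 300,
    "Escherichia coli": 150,
    "Akkermansia muciniphila": 50,
}

abundance = phylum_relative_abundance(sample, species_to_phylum)
print(abundance)
print(flag_high(abundance, "Proteobacteria", threshold=0.10))
```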
Lessons from Ecological Dynamics: 
Gut Microbiome Has Multiple Relatively Stable Equilibria 
“The Application of Ecological Theory Toward an Understanding of the Human Microbiome,” 
Elizabeth Costello, Keaton Stagaman, Les Dethlefsen, Brendan Bohannan, David Relman 
Science 336, 1255-62 (2012)
We Found Major State Shifts in Microbial Ecology Phyla 
Between Healthy and Two Forms of IBD 
Most 
Common 
Microbial 
Phyla 
Average HE 
Average Ulcerative Colitis Average LS Average Crohn’s Disease 
Collapse of Bacteroidetes 
Explosion of Actinobacteria 
Explosion of 
Proteobacteria 
Hybrid of UC and CD 
High Level of Archaea
Is the Gut Microbial Ecology Different 
in Crohn’s Disease Subtypes? 
Ben Willing, GASTROENTEROLOGY 2010;139:1844 –1854 
Colonic 
Crohn’s 
Disease 
(CCD) 
Ileal Crohn’s Disease (ICD)
PCA Analysis 
on Species Abundance Across People 
PCA2 
Green-Healthy 
Red-CD 
Purple-UC 
Blue-LS 
PCA1 
Analysis by Mehrdad Yazdani, Calit2 
ICD 
CCD Healthy 
Subset?
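
Principal component analysis projects each person's species abundance vector onto the directions of greatest variation across people, so that people with similar profiles cluster together. The sketch below finds only the first component, using power iteration on the covariance matrix in plain Python. It is an illustration under made-up data, not the analysis shown on the slide; a real study would normally use a numerical library.

```python
import math


def center_columns(rows):
    """Subtract each column's mean so every feature has mean zero."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    return [[r[j] - means[j] for j in range(m)] for r in rows]


def first_principal_component(rows, iterations=200):
    """Leading eigenvector of the covariance matrix, found by power iteration."""
    x = center_columns(rows)
    n, m = len(x), len(x[0])
    cov = [[sum(x[i][a] * x[i][b] for i in range(n)) / (n - 1)
            for b in range(m)] for a in range(m)]
    # Non-uniform start vector: a uniform one is orthogonal to every
    # informative direction when each row sums to 1.
    v = [float(j + 1) for j in range(m)]
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(m)) for a in range(m)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            raise ValueError("data has no variance along the start direction")
        v = [c / norm for c in w]
    scores = [sum(x[i][j] * v[j] for j in range(m)) for i in range(n)]
    return v, scores


# Made-up abundances (rows = people, columns = three taxa); illustration only.
people = [
    [0.60, 0.30, 0.10],
    [0.55, 0.35, 0.10],
    [0.58, 0.31, 0.11],
    [0.15, 0.35, 0.50],
    [0.20, 0.30, 0.50],
]

component, scores = first_principal_component(people)
print([round(s, 2) for s in scores])
```

The sign of a principal component is arbitrary, so only the separation between the two groups of scores is meaningful, not which group is positive.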
KEGG: a Database Resource for Understanding High-Level 
Functions and Utilities of the Biological System 
http://www.genome.jp/kegg/
Using Ayasdi To Discover Patterns 
in KEGG Cellular Pathway Dataset 
topological data analysis 
Source: Pek Lum, Chief Data Scientist, Ayasdi 
Dataset from Larry Smarr Team 
With 60 Subjects (HE, CD, UC, LS) 
Each with 10,000 KEGGs - 
600,000 Cells
Disease Arises from Perturbed Cellular Networks: 
Dynamics of a Prion Perturbed Network in Mice 
Source: Lee Hood, ISB 
Our Next Goal is to Create 
Such Perturbed Networks in Humans
Next Step: 
Compute Genes and Function 
Full Processing to Function 
(COGs, KEGGs) 
Would Require 
~1-2 Million 
Core-Hours 
Plus Dedicated Network to Move Data 
From R Systems / Dell to Calit2@UC San Diego
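
To give a sense of scale, the 1-2 million core-hour estimate can be compared with the ~35,000 core-hours used for the taxonomic analysis and converted into elapsed days on a 768-core system. Using the 768-core system mentioned earlier for this job is an assumption made here for illustration, as is perfect parallel efficiency; under those assumptions the job takes roughly 54 to 109 days.

```python
def days_on_cluster(core_hours, cores):
    """Ideal elapsed days if all cores run continuously at full utilization."""
    return core_hours / cores / 24


for core_hours in (1_000_000, 2_000_000):
    scale = core_hours / 35_000          # relative to the taxonomic run
    days = days_on_cluster(core_hours, 768)
    print(f"{core_hours:,} core-hours: {scale:.0f}x the taxonomic run, "
          f"about {days:.0f} days on 768 cores")
```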
“A Whole-Cell Computational Model 
Predicts Phenotype from Genotype” 
A model of 
Mycoplasma genitalium, 
• 525 genes 
• Using 1,900 
experimental 
observations 
• From 900 studies, 
• They created the 
software model, 
• Which requires 128 
computers to run
Early Attempts at Modeling the Systems Biology of 
the Gut Microbiome and the Human Immune System
Next Step: Time Series of Metagenomic Gut Microbiomes 
and Immune Variables in an N=100 Clinical Trial 
Goal: Understand 
The Coupled Human Immune-Microbiome Dynamics 
In the Presence of Human Genetic Predispositions 
Drs. William J. Sandborn, John Chang, & Brigid Boland 
UCSD School of Medicine, Division of Gastroenterology
From Quantified Self to 
National-Scale Biomedical Research Projects 
My Anonymized Human Genome 
is Available for Download 
www.personalgenomes.org 
The Quantified Human Initiative 
is an effort to combine 
our natural curiosity about self 
with new research paradigms. 
Rich datasets of two individuals, 
Drs. Smarr and Snyder, 
serve as 21st century 
personal data prototypes. 
www.delsaglobal.org
Thanks to Our Great Team! 
UCSD Metagenomics Team 
Weizhong Li 
Sitao Wu 
Calit2@UCSD 
Future Patient Team 
Jerry Sheehan 
Tom DeFanti 
Kevin Patrick 
Jurgen Schulze 
Andrew Prudhomme 
Philip Weber 
Fred Raab 
Joe Keefe 
Ernesto Ramirez 
JCVI Team 
Karen Nelson 
Shibu Yooseph 
Manolito Torralba 
SDSC Team 
Michael Norman 
Mahidhar Tatineni 
Robert Sinkovits 
UCSD Health Sciences Team 
William J. Sandborn 
Elisabeth Evans 
John Chang 
Brigid Boland 
David Brenner

More Related Content

PPTX
Decoding the Software Inside of You
PPTX
Quantifying your Human Body & Its Trillions of Microbes
PPTX
Machine Learning Opportunities in the Explosion of Personalized Precision Med...
PPTX
Using Supercomputers to Discover the 100 Trillion Bacteria Living Within Each...
PPTX
Analyzing the Human Gut Microbiome Dynamics in Health and Disease Using Super...
PPTX
Using Supercomputers and Data Science to Reveal Your Inner Microbiome
PPTX
Quantifying Your Dynamic Human Body (Including Its Microbiome), Will Move Us ...
PPTX
Discovering the 100 Trillion Bacteria Living Within Each of Us
Decoding the Software Inside of You
Quantifying your Human Body & Its Trillions of Microbes
Machine Learning Opportunities in the Explosion of Personalized Precision Med...
Using Supercomputers to Discover the 100 Trillion Bacteria Living Within Each...
Analyzing the Human Gut Microbiome Dynamics in Health and Disease Using Super...
Using Supercomputers and Data Science to Reveal Your Inner Microbiome
Quantifying Your Dynamic Human Body (Including Its Microbiome), Will Move Us ...
Discovering the 100 Trillion Bacteria Living Within Each of Us

What's hot (19)

PPTX
Linking Phenotype Changes to Internal/External Longitudinal Time Series in a ...
PPTX
Analyzing the Human Gut Microbiome Dynamics in Health and Disease Using Super...
PPTX
Fifty Years of Supercomputing: From Colliding Black Holes to Dynamic Microbio...
PPTX
Using Supercomputers and Gene Sequencers to Discover Your Inner Microbiome
PPTX
Using Supercomputers and Data Analytics to Discover the Differences in Health...
PPTX
Discovering the Other 90% of our Human Superorganism
PPT
The Human Microbiome and the Revolution in Digital Health
PPTX
Stability in Health vs. Abrupt Changes in Disease in the Human Gut Microbiome...
PPTX
Supercomputing Your Inner Microbiome
PPT
Quantifying Your Superorganism Body Using Big Data Supercomputing
PPTX
Exploring the Dynamics of The Microbiome in Health and Disease
PPT
Exploring Our Inner Universe Using Supercomputers and Gene Sequencers
PPT
From N=1 to N=100: What I Have Learned from Quantifying My Superorganism Body
PPT
Quantifying Your Superorganism Body Using Big Data Supercomputing
PPTX
Finding the Patterns in the Big Data From Human Microbiome Ecology
PPTX
Assay Lab Within Your Body: Biometrics and Biomes
PPT
Individual, Consumer-Driven Care of the Future: Taking Wellness One Step Further
PPT
Large Memory High Performance Computing Enables Comparison Across Human Gut M...
PPTX
Quantfying Your Gut: A Personal Journey
Linking Phenotype Changes to Internal/External Longitudinal Time Series in a ...
Analyzing the Human Gut Microbiome Dynamics in Health and Disease Using Super...
Fifty Years of Supercomputing: From Colliding Black Holes to Dynamic Microbio...
Using Supercomputers and Gene Sequencers to Discover Your Inner Microbiome
Using Supercomputers and Data Analytics to Discover the Differences in Health...
Discovering the Other 90% of our Human Superorganism
The Human Microbiome and the Revolution in Digital Health
Stability in Health vs. Abrupt Changes in Disease in the Human Gut Microbiome...
Supercomputing Your Inner Microbiome
Quantifying Your Superorganism Body Using Big Data Supercomputing
Exploring the Dynamics of The Microbiome in Health and Disease
Exploring Our Inner Universe Using Supercomputers and Gene Sequencers
From N=1 to N=100: What I Have Learned from Quantifying My Superorganism Body
Quantifying Your Superorganism Body Using Big Data Supercomputing
Finding the Patterns in the Big Data From Human Microbiome Ecology
Assay Lab Within Your Body: Biometrics and Biomes
Individual, Consumer-Driven Care of the Future: Taking Wellness One Step Further
Large Memory High Performance Computing Enables Comparison Across Human Gut M...
Quantfying Your Gut: A Personal Journey
Ad

Similar to Quantifying The Dynamics of Your Superorganism Body Using Big Data Supercomputing (19)

PPTX
Assay Lab Within Your Body: Biometrics and Biomes
PPTX
Know Thyself: Quantifying Your Human Body and Its One Hundred Trillion Microbes
PPTX
Capturing the Interactive Dynamics of the Human Host/Microbiome System
PPT
Tracking Large Variations in My Immune Biomarkers and My Gut Microbiome: Infl...
PPT
Quantified Health and Disease
PDF
Using Dell’s HPC Cloud & Advanced Analytic Software to Discover Radical Chang...
PPTX
Mapping the Human Gut Microbiome in Health and Disease Using Sequencing, Supe...
PPTX
Supercomputing Your Inner Microbiome
PPT
Digitally Revealing the Dynamics of Your Superorganism Body
PPT
Living in a Microbial World
PPT
Big Data and Superorganism Genomics: Microbial Metagenomics Meets Human Genomics
PPTX
Discovering Human Gut Microbiome Dynamics
PPTX
The Systems Biology Dynamics of the Human Immune System and Gut Microbiome
PPTX
The Deeply Quantified Self: A Case Study
PPT
Using Genetic Sequencing to Unravel the Dynamics of Your Superorganism Body
PPTX
Inspired by Carl: Exploring the Microbial Dynamics Within
PPTX
Measuring the Human Brain-Gut Microbiome-Immune System Dynamics: a Big Data C...
PPT
Determining the Human Gut Microbiome Using Genome Sequencing and Dell's Cloud...
PPT
Observing the Dynamics of the Human Immune System Coupled to the Microbiome i...
Assay Lab Within Your Body: Biometrics and Biomes
Know Thyself: Quantifying Your Human Body and Its One Hundred Trillion Microbes
Capturing the Interactive Dynamics of the Human Host/Microbiome System
Tracking Large Variations in My Immune Biomarkers and My Gut Microbiome: Infl...
Quantified Health and Disease
Using Dell’s HPC Cloud & Advanced Analytic Software to Discover Radical Chang...
Mapping the Human Gut Microbiome in Health and Disease Using Sequencing, Supe...
Supercomputing Your Inner Microbiome
Digitally Revealing the Dynamics of Your Superorganism Body
Living in a Microbial World
Big Data and Superorganism Genomics: Microbial Metagenomics Meets Human Genomics
Discovering Human Gut Microbiome Dynamics
The Systems Biology Dynamics of the Human Immune System and Gut Microbiome
The Deeply Quantified Self: A Case Study
Using Genetic Sequencing to Unravel the Dynamics of Your Superorganism Body
Inspired by Carl: Exploring the Microbial Dynamics Within
Measuring the Human Brain-Gut Microbiome-Immune System Dynamics: a Big Data C...
Determining the Human Gut Microbiome Using Genome Sequencing and Dell's Cloud...
Observing the Dynamics of the Human Immune System Coupled to the Microbiome i...
Ad

More from Larry Smarr (20)

PPTX
Smart Patients, Big Data, NextGen Primary Care
PPTX
Internet2 and QUILT Initiatives with Regional Networks -6NRP Larry Smarr and ...
PPTX
Internet2 and QUILT Initiatives with Regional Networks -6NRP Larry Smarr and ...
PPTX
National Research Platform: Application Drivers
PPT
From Supercomputing to the Grid - Larry Smarr
PPTX
The CENIC-AI Resource - Los Angeles Community College District (LACCD)
PPT
Redefining Collaboration through Groupware - From Groupware to Societyware
PPT
The Coming of the Grid - September 8-10,1997
PPT
Supercomputers: Directions in Technology, Architecture, and Applications
PPT
High Performance Geographic Information Systems
PPT
Data Intensive Applications at UCSD: Driving a Campus Research Cyberinfrastru...
PPT
Enhanced Telepresence and Green IT — The Next Evolution in the Internet
PPTX
The CENIC AI Resource CENIC AIR - CENIC Retreat 2024
PPTX
The CENIC-AI Resource: The Right Connection
PPTX
The Pacific Research Platform: The First Six Years
PPTX
The NSF Grants Leading Up to CHASE-CI ENS
PPTX
Integrated Optical Fiber/Wireless Systems for Environmental Monitoring
PPTX
Toward a National Research Platform to Enable Data-Intensive Open-Source Sci...
PPTX
Toward a National Research Platform to Enable Data-Intensive Computing
PPTX
Digital Twins of Physical Reality - Future in Review
Smart Patients, Big Data, NextGen Primary Care
Internet2 and QUILT Initiatives with Regional Networks -6NRP Larry Smarr and ...
Internet2 and QUILT Initiatives with Regional Networks -6NRP Larry Smarr and ...
National Research Platform: Application Drivers
From Supercomputing to the Grid - Larry Smarr
The CENIC-AI Resource - Los Angeles Community College District (LACCD)
Redefining Collaboration through Groupware - From Groupware to Societyware
The Coming of the Grid - September 8-10,1997
Supercomputers: Directions in Technology, Architecture, and Applications
High Performance Geographic Information Systems
Data Intensive Applications at UCSD: Driving a Campus Research Cyberinfrastru...
Enhanced Telepresence and Green IT — The Next Evolution in the Internet
The CENIC AI Resource CENIC AIR - CENIC Retreat 2024
The CENIC-AI Resource: The Right Connection
The Pacific Research Platform: The First Six Years
The NSF Grants Leading Up to CHASE-CI ENS
Integrated Optical Fiber/Wireless Systems for Environmental Monitoring
Toward a National Research Platform to Enable Data-Intensive Open-Source Sci...
Toward a National Research Platform to Enable Data-Intensive Computing
Digital Twins of Physical Reality - Future in Review

Recently uploaded (20)

PDF
Dr. Jasvant Modi - Passionate About Philanthropy
PPT
Recent advances in Diagnosis of Autoimmune Disorders
PPTX
Pulmonary Circulation PPT final for easy
PPTX
General Pharmacology by Nandini Ratne, Nagpur College of Pharmacy, Hingna Roa...
PPTX
Rheumatic heart diseases with Type 2 Diabetes Mellitus
PPTX
ABG advance Arterial Blood Gases Analysis
PPTX
CBT FOR OCD TREATMENT WITHOUT MEDICATION
PPTX
BLS, BCLS Module-A life saving procedure
PDF
Dermatology diseases Index August 2025.pdf
PDF
CHAPTER 9 MEETING SAFETY NEEDS FOR OLDER ADULTS.pdf
PPTX
Infection prevention and control for medical students
PDF
Khaled Sary- Trailblazers of Transformation Middle East's 5 Most Inspiring Le...
PPTX
Importance of Immediate Response (1).pptx
PPTX
1. Drug Distribution System.pptt b pharmacy
PPT
Microscope is an instrument that makes an enlarged image of a small object, t...
PPTX
Basics of pharmacology (Pharmacology I).pptx
DOCX
Copies if quanti.docxsegdfhfkhjhlkjlj,klkj
PPTX
First Aid and Basic Life Support Training.pptx
PDF
Priorities Critical Care Nursing 7th Edition by Urden Stacy Lough Test Bank.pdf
PPT
Adrenergic drugs (sympathomimetics ).ppt
Dr. Jasvant Modi - Passionate About Philanthropy
Recent advances in Diagnosis of Autoimmune Disorders
Pulmonary Circulation PPT final for easy
General Pharmacology by Nandini Ratne, Nagpur College of Pharmacy, Hingna Roa...
Rheumatic heart diseases with Type 2 Diabetes Mellitus
ABG advance Arterial Blood Gases Analysis
CBT FOR OCD TREATMENT WITHOUT MEDICATION
BLS, BCLS Module-A life saving procedure
Dermatology diseases Index August 2025.pdf
CHAPTER 9 MEETING SAFETY NEEDS FOR OLDER ADULTS.pdf
Infection prevention and control for medical students
Khaled Sary- Trailblazers of Transformation Middle East's 5 Most Inspiring Le...
Importance of Immediate Response (1).pptx
1. Drug Distribution System.pptt b pharmacy
Microscope is an instrument that makes an enlarged image of a small object, t...
Basics of pharmacology (Pharmacology I).pptx
Copies if quanti.docxsegdfhfkhjhlkjlj,klkj
First Aid and Basic Life Support Training.pptx
Priorities Critical Care Nursing 7th Edition by Urden Stacy Lough Test Bank.pdf
Adrenergic drugs (sympathomimetics ).ppt

Quantifying The Dynamics of Your Superorganism Body Using Big Data Supercomputing

  • 1. “Quantifying The Dynamics of Your Superorganism Body Using Big Data Supercomputing” 2014-15 Distinguished Lecturer Series Computer Science and Engineering Department University of Washington Seattle, WA October 9, 2014 Dr. Larry Smarr Director, California Institute for Telecommunications and Information Technology Harry E. Gruber Professor, Dept. of Computer Science and Engineering Jacobs School of Engineering, UCSD http://lsmarr.calit2.net 1
  • 2. Abstract As a member of Lee Hood's 100 Person Wellness Project, headquartered in Seattle's Institute for System Biology, I am engaged in experiments to read out the time varying state of a complex dynamical system - my human body. However, the human body is host to 100 trillion microorganisms, ten times the number of cells in the human body, and these microbes contain 100 times the number of DNA genes that our human DNA does. The microbial component of this "superorganism" is comprised of hundreds of species spread over many taxonomic phyla. The human immune system is tightly coupled with this microbial ecology and in cases of autoimmune disease, both the immune system and the microbial ecology can have dynamic excursions far from normal. To provide a deeper context for the microbiome results from the 100 Person Wellness Project, I have been exploring the variation in the microbiome ecology across healthy and chronically ill populations. Our research starts with trillions of DNA bases, produced by Illumina Next Generation sequencers, of the human gut microbial DNA taken from my own body over time, as well as from hundreds of people sequenced under the NIH Human Microbiome Project. To decode the details of the microbial ecology we feed this data into parallel supercomputers, running sophisticated bioinformatics software pipelines. We then use Calit2/SDSC designed Big Data PCs to manage the data and drive innovative scalable visualization systems to examine the complexities of the changing human gut microbial ecology in health and disease. I will show how advanced data analytics tools find patterns in the resulting microbial distribution data that suggest new hypotheses for clinical application.
  • 3. Calit2 Has Had a Vision of “the Digital Transformation of Health” for a Decade • Next Step—Putting You On-Line! www.bodymedia.com – Wireless Internet Transmission – Key Metabolic and Physical Variables – Model -- Dozens of Processors and 60 Sensors / Actuators Inside of our Cars • Post-Genomic Individualized Medicine – Combine – Genetic Code –Body Data Flow – Use Powerful AI Data Mining Techniques The Content of This Slide from 2001 Larry Smarr Calit2 Talk on Digitally Enabled Genomic Medicine
  • 4. My Decade Long Journey to Being a Quantified Self: By Measuring the State of My Body and “Tuning” It Using Nutrition and Exercise, I Became Healthier I Arrived in La Jolla in 2000 After 20 Years in the Midwest 2000 Age 41 2010 Age 61 1999 1989 Age 51 1999 I Reversed My Body’s Decline By Quantifying and Altering Nutrition, Exercise, Sleep, and Stress http://lsmarr.calit2.net/repository/LS_reading_recommendations_FiRe_2011.pdf
  • 5. From One to a Billion Data Points Defining Me: The Exponential Rise in Body Data in Just One Decade Billion: My Full DNA, MRI/CT Images Million: My DNA SNPs, Zeo, FitBit One: Hundred: My Blood Variables WeigMhyt Weight Blood Variables SNPs Microbial Genome Improving Body Discovering Disease
  • 6. Early Adopting MDs Are Creating Partnerships with Their Quantified Patients • “The 100 participants will be guided on this 9-month journey by a coach and when necessary, be referred to their own health care practitioners.” • The data sets that will be evaluated include: – Self-Tracking Devices – Medical History, Traits, Lifestyle – Blood, Urine, Saliva – Gut Microbiome – Whole Genome Sequencing Will Grow to 1000, then 10,000 There are 8760 Hours in a Year One of These Hours You Are With a Doctor… The Other 8759 Hours Are Up to You! https://pioneer100.systemsbiology.net/
  • 7. Visualizing Time Series of 150 LS Blood and Stool Variables, Each Over 5-10 Years Calit2 64 megapixel VROOM
  • 8. Only One of My Blood Measurements Was Far Out of Range--Indicating Chronic Inflammation Episodic Peaks in Inflammation Followed by Spontaneous Drops Normal Range <1 mg/L 27x Upper Limit Normal Complex Reactive Protein (CRP) is a Blood Biomarker for Detecting Presence of Inflammation
  • 9. Adding Stool Tests Revealed Oscillatory Behavior in an Immune Variable Typical Lactoferrin Value for Active Inflammatory Bowel Disease (IBD) Normal Range <7.3 μg/mL 124x Upper Limit Hypothesis: Lactoferrin Oscillations Coupled to Relative Abundance of Microbes that Require Iron Antibiotics Antibiotics Lactoferrin is a Protein Shed from Neutrophils - An Antibacterial that Sequesters Iron
  • 10. Confirming the IBD (Crohn’s) Hypothesis: Finding the “Smoking Gun” with MRI Imaging I Obtained the MRI Slices From UCSD Medical Services and Converted to Interactive 3D Descending Colon Sigmoid Colon Threading Iliac Arteries Major Kink Working With Calit2 Staff & DeskVOX Software Transverse Colon Liver Small Intestine Diseased Sigmoid Colon MRI Jan 2012 Cross Section
  • 11. Why Did I Have an Autoimmune Disease like IBD? Despite decades of research, the etiology of Crohn's disease remains unknown. Its pathogenesis may involve a complex interplay between host genetics, immune dysfunction, and microbial or environmental factors. --The Role of Microbes in Crohn's Disease So I Set Out to Quantify All Three! Paul B. Eckburg & David A. Relman Clin Infect Dis. 44:256-262 (2007)
  • 12. The Cost of Sequencing a Human Genome Has Fallen Over 10,000x in the Last Ten Years This Has Enabled Sequencing of Both Human and Microbial Genomes
  • 13. Inclusion of the Microbiome Will Radically Change Medicine and Wellness Your Body Has 10 Times As Many Microbe Cells As Human Cells 99% of Your DNA Genes Are in Microbe Cells Not Human Cells I Will Focus on the Human Gut Microbiome, Which Contains Hundreds of Microbial Species
  • 14. When We Think About Biological Diversity We Typically Think of the Wide Range of Animals But All These Animals Are in One SubPhylum Vertebrata of the Chordata Phylum All images from Wikimedia Commons. Photos are public domain or by Trisha Shears & Richard Bartz
  • 15. Think of These Phyla of Animals When You Consider the Biodiversity of Microbes Inside You Phylum Annelida All images from WikiMedia Commons. Phylum Echinodermata Photos are public domain or by Dan Hershman, Michael Linnenbach, Manuae, B_cool Phylum Cnidaria Phylum Mollusca Phylum Arthropoda Phylum Chordata
  • 16. However, The Evolutionary Distance Between Your Gut Microbes Is Much Greater Than Between All Animals Green Circles Are Human Gut Microbes Source: Carl Woese, et al Last Slide Evolutionary Distance Derived from Comparative Sequencing of 16S or 18S Ribosomal RNA
  • 17. A Year of Sequencing a Healthy Gut Microbiome Daily - Remarkable Stability with Abrupt Changes Days Genome Biology (2014) David, et al.
  • 18. To Map Out the Dynamics of My Microbiome Ecology I Partnered with the J. Craig Venter Institute • JCVI Did Metagenomic Sequencing on Seven of My Stool Samples Over 1.5 Years • Sequencing on Illumina HiSeq 2000 – Generates 100bp Reads – Run Takes ~14 Days – My 7 Samples Produced – >200Gbp of Data • JCVI Lab Manager, Genomic Medicine – Manolito Torralba • IRB PI Karen Nelson – President JCVI Illumina HiSeq 2000 at JCVI Manolito Torralba, JCVI Karen Nelson, JCVI
  • 19. We Expanded Our Healthy Cohort to All Gut Microbiomes from NIH HMP For Comparative Analysis. Each Sample Has 100-200 Million Illumina Short Reads (100 bases). “Healthy” Individuals: 250 Subjects, 1 Point in Time. IBD Patients: 2 Ulcerative Colitis Patients, 6 Points in Time; 5 Ileal Crohn’s Patients, 3 Points in Time. Larry Smarr: 7 Points in Time. Total of 27 Billion Reads, Or 2.7 Trillion Bases. Source: Jerry Sheehan, Calit2; Weizhong Li, Sitao Wu, CRBS, UCSD
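The totals on this slide can be cross-checked with a few lines of arithmetic. The sketch below tallies samples from the cohort sizes as listed and multiplies reads by read length. Pairing "250 Subjects, 1 Point in Time" with the healthy HMP group is an assumption drawn from the slide layout.

```python
# Cohort as listed on the slide: (subjects, time points per subject).
# Assumption: the 250 single-time-point subjects are the "healthy" HMP group.
cohort = {
    "healthy (NIH HMP)": (250, 1),
    "ulcerative colitis": (2, 6),
    "ileal Crohn's": (5, 3),
    "Larry Smarr": (1, 7),
}
total_samples = sum(subjects * points for subjects, points in cohort.values())

total_reads = 27_000_000_000   # from the slide
read_length = 100              # bases per read, from the slide
total_bases = total_reads * read_length
mean_reads_per_sample = total_reads / total_samples

print(f"{total_samples} samples, {total_bases:.1e} bases")
print(f"~{mean_reads_per_sample / 1e6:.0f} million reads per sample on average")
```

This gives 284 samples and 2.7 trillion bases, with an average of about 95 million reads per sample, close to the lower end of the 100-200 million range quoted on the slide.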
  • 20. We Created a Reference Database Of Known Gut Genomes • NCBI April 2013 – 2471 Complete + 5543 Draft Bacteria & Archaea Genomes – 2399 Complete Virus Genomes – 26 Complete Fungi Genomes – 309 HMP Eukaryote Reference Genomes • Total 10,741 genomes, ~30 GB of sequences. Now to Align Our 27 Billion Reads Against the Reference Database. Source: Weizhong Li, Sitao Wu, CRBS, UCSD
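As a quick sanity check on the database composition, the sketch below sums the listed categories and estimates an average genome size from the ~30 GB figure. The category labels are paraphrased from the slide. The listed counts add up to 10,748, seven more than the 10,741 total quoted, and the slide does not say which figure is off.

```python
# Genome counts as listed on the slide (NCBI, April 2013).
reference_genomes = {
    "bacteria/archaea, complete": 2471,
    "bacteria/archaea, draft": 5543,
    "virus, complete": 2399,
    "fungi, complete": 26,
    "HMP eukaryote reference": 309,
}
stated_total = 10_741      # total quoted on the slide
database_size_gb = 30      # approximate size quoted on the slide

listed_total = sum(reference_genomes.values())
# Rough average size per genome, using decimal units (1 GB = 1000 MB).
mean_genome_mb = database_size_gb * 1000 / stated_total

print(f"listed categories sum to {listed_total:,} (slide total: {stated_total:,})")
print(f"average of roughly {mean_genome_mb:.1f} MB of sequence per genome")
```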
  • 21. Computational NextGen Sequencing Pipeline: From “Big Equations” to “Big Data” Computing. PI: Weizhong Li, CRBS, UCSD; NIH R01HG005978 (2010-2013, $1.1M)
  • 22. We Used SDSC’s Gordon Data-Intensive Supercomputer to Analyze a Wide Range of Gut Microbiomes. Our Team Used 25 CPU-Years To Compute the Comparative Gut Microbiome of My Time Samples and Our Healthy and IBD Controls, Starting With the 5 Billion Illumina Reads Received from JCVI. Enabled by a Grant of Time on Gordon from SDSC Director Mike Norman. Source: Weizhong Li, Sitao Wu, CRBS, UCSD
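For a sense of scale, 25 CPU-years can be converted to core-hours. The sketch below assumes a CPU-year means one core busy for 365 days. The 1,024-core allocation is purely hypothetical and is not a figure from the talk.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: one CPU-year = one core busy for 365 days

cpu_years = 25  # from the slide
core_hours = cpu_years * HOURS_PER_YEAR

# Hypothetical allocation size, chosen only to illustrate the conversion.
hypothetical_cores = 1024
wall_clock_days = core_hours / hypothetical_cores / 24

print(f"{cpu_years} CPU-years = {core_hours:,} core-hours")
print(f"on {hypothetical_cores} cores: about {wall_clock_days:.1f} days of wall-clock time")
```

That works out to 219,000 core-hours, or roughly nine days of elapsed time on the hypothetical 1,024-core allocation with perfect scaling.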
  • 23. We Used Dell’s HPC Cloud to Analyze All of Our Human Gut Microbiomes • Dell’s Sanger Cluster – 32 Nodes, 512 Cores – 48 GB RAM per Node • We Processed the Taxonomic Relative Abundance – Used ~35,000 Core-Hours on Dell’s Sanger • Produced Relative Abundance of ~10,000 Bacteria, Archaea, Viruses in ~300 People – ~3 Million Spreadsheet Cells • New System: R Bio-Gen System – 48 Nodes, 768 Cores – 128 GB RAM per Node. Source: Weizhong Li, UCSD
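The cluster figures on this slide imply a few derived quantities, worked out in the sketch below. The dictionary keys and helper name are illustrative. The wall-clock figure is only a lower bound, assuming every core stayed busy for the whole run.

```python
# Figures from the slide.
sanger = {"nodes": 32, "cores": 512, "ram_gb_per_node": 48}
r_bio_gen = {"nodes": 48, "cores": 768, "ram_gb_per_node": 128}
core_hours_used = 35_000
taxa, people = 10_000, 300

def summarize(cluster):
    """Return (cores per node, total RAM in GB) for a cluster description."""
    cores_per_node = cluster["cores"] // cluster["nodes"]
    total_ram_gb = cluster["nodes"] * cluster["ram_gb_per_node"]
    return cores_per_node, total_ram_gb

sanger_cores_per_node, sanger_ram_gb = summarize(sanger)
new_cores_per_node, new_ram_gb = summarize(r_bio_gen)

# Lower bound on elapsed time, assuming (hypothetically) all 512 cores
# were busy for the whole job with no queueing or I/O stalls.
min_wall_clock_hours = core_hours_used / sanger["cores"]

table_cells = taxa * people

print(f"Sanger: {sanger_cores_per_node} cores/node, {sanger_ram_gb} GB RAM in total")
print(f"R Bio-Gen: {new_cores_per_node} cores/node, {new_ram_gb} GB RAM in total")
print(f"at least {min_wall_clock_hours:.0f} hours of wall-clock time on Sanger")
print(f"{table_cells:,} cells in the abundance table")
```

Both systems have 16 cores per node. The newer one has four times the aggregate memory (6,144 GB versus 1,536 GB), and the 10,000 x 300 table matches the ~3 million cells quoted.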
  • 24. Using Scalable Visualization Allows Comparison of the Relative Abundance of 200 Microbe Species. Comparing 3 LS Time Snapshots (Left) with Healthy, Crohn’s, UC (Right, Top to Bottom). Calit2 VROOM-FuturePatient Expedition
  • 25. Using Microbiome Profiles to Survey 155 Subjects for Unhealthy Candidates
  • 26. Bacteroidetes and Firmicutes Phyla Dominate “Healthy” Subjects in the Pioneer 100 Gut Microbiomes. A Few With High % Proteobacteria or Verrucomicrobia
  • 27. Lessons from Ecological Dynamics: Gut Microbiome Has Multiple Relatively Stable Equilibria “The Application of Ecological Theory Toward an Understanding of the Human Microbiome,” Elizabeth Costello, Keaton Stagaman, Les Dethlefsen, Brendan Bohannan, David Relman Science 336, 1255-62 (2012)
  • 28. We Found Major State Shifts in Microbial Ecology Phyla Between Healthy and Two Forms of IBD. Most Common Microbial Phyla. Panels: Average HE, Average Ulcerative Colitis, Average LS, Average Crohn’s Disease. Annotations: Collapse of Bacteroidetes; Explosion of Actinobacteria; Explosion of Proteobacteria; Hybrid of UC and CD; High Level of Archaea
  • 29. Is the Gut Microbial Ecology Different in Crohn’s Disease Subtypes? Colonic Crohn’s Disease (CCD) and Ileal Crohn’s Disease (ICD). Source: Ben Willing, Gastroenterology 2010;139:1844-1854
  • 30. PCA Analysis on Species Abundance Across People. Axes: PCA1, PCA2. Color Key: Green = Healthy, Red = CD, Purple = UC, Blue = LS. Plot Annotations: ICD, CCD, Healthy Subset? Analysis by Mehrdad Yazdani, Calit2
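The slide does not say how the PCA was computed, so the following is only a minimal sketch of the general technique: mean-center a samples-by-species relative abundance matrix and project it onto its top two principal axes via SVD. The data are synthetic stand-ins (12 samples, 50 species, two hypothetical groups), not the talk's data, and `pca_scores` is an illustrative helper name.

```python
import numpy as np

def pca_scores(abundance, n_components=2):
    """Project samples (rows) onto the top principal components."""
    centered = abundance - abundance.mean(axis=0)
    # SVD of the centered matrix; rows of vt are the principal axes.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:n_components].T
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]

rng = np.random.default_rng(0)
# Synthetic stand-in data: 12 samples x 50 species; NOT the talk's data.
counts = rng.gamma(shape=1.0, scale=1.0, size=(12, 50))
counts[6:, :5] *= 8.0  # second hypothetical group enriched in 5 species
relative_abundance = counts / counts.sum(axis=1, keepdims=True)

scores, explained = pca_scores(relative_abundance)
print(scores.shape)  # one (PCA1, PCA2) coordinate pair per sample
print(explained)     # fraction of variance captured by each component
```

Plotting the two columns of `scores` against each other, colored by group, gives the kind of scatter plot the slide describes.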
  • 31. KEGG: a Database Resource for Understanding High-Level Functions and Utilities of the Biological System http://www.genome.jp/kegg/
  • 32. Using Ayasdi To Discover Patterns in KEGG Cellular Pathway Dataset. Method: Topological Data Analysis. Dataset from Larry Smarr Team With 60 Subjects (HE, CD, UC, LS), Each with 10,000 KEGGs: 600,000 Cells. Source: Pek Lum, Chief Data Scientist, Ayasdi
  • 33. Disease Arises from Perturbed Cellular Networks: Dynamics of a Prion Perturbed Network in Mice. Our Next Goal is to Create Such Perturbed Networks in Humans. Source: Lee Hood, ISB
  • 34. Next Step: Compute Genes and Function. Full Processing to Function (COGs, KEGGs) Would Require ~1-2 Million Core-Hours, Plus Dedicated Network to Move Data From R Systems / Dell to Calit2@UC San Diego
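To gauge what 1-2 million core-hours means in practice, the sketch below compares it with the figures quoted on slide 23 (a 768-core system and a ~35,000 core-hour taxonomic run). It assumes perfect scaling on a fully dedicated cluster, which is an idealization and not a claim from the talk.

```python
core_hours_low, core_hours_high = 1_000_000, 2_000_000  # from this slide
cluster_cores = 768            # R Bio-Gen system size quoted on slide 23
previous_core_hours = 35_000   # taxonomic run quoted on slide 23

def days_on_cluster(core_hours, cores):
    """Wall-clock days if every core is busy throughout (an idealization)."""
    return core_hours / cores / 24

low_days = days_on_cluster(core_hours_low, cluster_cores)
high_days = days_on_cluster(core_hours_high, cluster_cores)
scale_up_low = core_hours_low / previous_core_hours
scale_up_high = core_hours_high / previous_core_hours

print(f"{low_days:.0f} to {high_days:.0f} days on {cluster_cores} dedicated cores")
print(f"{scale_up_low:.0f}x to {scale_up_high:.0f}x the taxonomic run")
```

Under these assumptions the functional analysis would occupy the whole 768-core system for roughly two to three and a half months, about 29 to 57 times the compute of the taxonomic pass.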
  • 35. “A Whole-Cell Computational Model Predicts Phenotype from Genotype” • A Model of Mycoplasma genitalium (525 Genes) • Using 1,900 Experimental Observations From 900 Studies, They Created the Software Model • Which Requires 128 Computers to Run
  • 36. Early Attempts at Modeling the Systems Biology of the Gut Microbiome and the Human Immune System
  • 37. Next Step: Time Series of Metagenomic Gut Microbiomes and Immune Variables in an N=100 Clinical Trial. Goal: Understand The Coupled Human Immune-Microbiome Dynamics In the Presence of Human Genetic Predispositions. Drs. William J. Sandborn, John Chang, & Brigid Boland, UCSD School of Medicine, Division of Gastroenterology
  • 38. From Quantified Self to National-Scale Biomedical Research Projects. My Anonymized Human Genome is Available for Download (www.personalgenomes.org). The Quantified Human Initiative is an effort to combine our natural curiosity about self with new research paradigms. Rich datasets of two individuals, Drs. Smarr and Snyder, serve as 21st century personal data prototypes (www.delsaglobal.org).
  • 39. Thanks to Our Great Team! UCSD Metagenomics Team: Weizhong Li, Sitao Wu. Calit2@UCSD Future Patient Team: Jerry Sheehan, Tom DeFanti, Kevin Patrick, Jurgen Schulze, Andrew Prudhomme, Philip Weber, Fred Raab, Joe Keefe, Ernesto Ramirez. JCVI Team: Karen Nelson, Shibu Yooseph, Manolito Torralba. SDSC Team: Michael Norman, Mahidhar Tatineni, Robert Sinkovits. UCSD Health Sciences Team: William J. Sandborn, Elisabeth Evans, John Chang, Brigid Boland, David Brenner.