Gene Expression Data Analysis
(Microarray, NGS & qRT-PCR)
Theme: Transcriptional Program in the Response of Human
Fibroblasts to Serum.
Lab #2
Etienne Z. Gnimpieba
BRIN WS 2013
Mount Marty College – June 24th 2013
Etienne.gnimpieba@usd.edu
Resolution Process
Context
Specification & Aims
Statement of problem / Case study:
The temporal program of gene expression during a model physiological response of human cells, the response of fibroblasts to serum, was explored with a
complementary DNA microarray representing about 8600 different human genes. Genes could be clustered into groups on the basis of their temporal patterns of expression in
this program. Many features of the transcriptional program appeared to be related to the physiology of wound repair, suggesting that fibroblasts play a larger and richer role in
this complex multicellular response than had previously been appreciated.
Vishwanath R. Iyer et al., Science, 1999
Aim:
The purpose of this lab is to introduce the gene expression data analysis process.
We simulate the analysis on “Transcriptional Program in the Response of
Human Fibroblasts to Serum”. By the end of the lab, you should understand how a
researcher identifies a significantly expressed gene from a microarray dataset.
T1. Excel used in Genomics
Objective: Use of basic Excel functionalities to solve some gene
expression data analysis needs
Acquired skills
- Gene expression data overview
- Excel used for genomics
- Microarray data analysis using GEPAS
- ArrayTrack for workbench driven gene expression analysis
T1.1. Import your dataset in Excel
T1.2. Pre-treat your dataset to obtain a centered (mean = 0) and
scaled (stdev = 1) dataset
T3. Gene Expression and Analysis for ATP13A2
Objective: Use of Gene Expression and Analysis Tools to discover more
information about the gene ATP13A2.
T3.1. Gene Expression Profile Querying using Gene Atlas
T3.2. Experimental Condition Driven Dataset Extraction in ArrayExpress
T3.3. Gene Expression data quick analysis using GEO2R
T2. Workbench driven gene expression analysis (ArrayTrack)
Objective: Workbench driven gene expression analysis using ArrayTrack
T2.1. ArrayTrack overview
T2.2. Descriptive statistics: Data Exploring in ArrayTrack
T2.3. Accessing gene expression profiles using BarChart
T2.4. Using SAM through ArrayTrack
T1. Excel used in Genomics
T1.1. Import your Data in Excel
o Open Excel and go to the “Data” tab. Then click “From Text” and select the text file that contains the data you
are studying. In this example, select “fibroblasts_ori – Excel1”.
o Click “Next”, “Next”, “Finish”, “OK”.
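For readers who want to reproduce the import step outside Excel, the following is a minimal Python sketch. It assumes the text file is tab-delimited with a header row and one gene per row; the slides do not state the delimiter, and the sample text, column names, and the function `read_expression_table` are hypothetical stand-ins rather than the lab's actual file.

```python
import csv
import io

# Hypothetical stand-in for the lab's text file: a header row, then one row per gene.
# The tab delimiter is an assumption; the slides do not specify it.
SAMPLE_TEXT = "gene\t15min\t30min\t1hr\nG1\t1.0\t2.0\t\nG2\t0.5\tNA\t1.5\n"


def read_expression_table(handle):
    """Return (column_names, rows); each row is (gene_id, values), with None for missing cells."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    rows = []
    for record in reader:
        if not record:
            continue  # skip blank lines
        values = []
        for cell in record[1:]:
            cell = cell.strip()
            # Empty cells and NA/NULL markers are kept as None, never converted to 0.
            values.append(None if cell in ("", "NA", "NULL") else float(cell))
        rows.append((record[0], values))
    return header[1:], rows


columns, table = read_expression_table(io.StringIO(SAMPLE_TEXT))
```

To read a real file, the `io.StringIO(...)` argument could be replaced with an open file handle, for example `open(path, newline="")`.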
T1.2. Pre-treat your dataset to obtain a centered (mean = 0) and scaled (stdev = 1) dataset
Centering data:
o Go to cell A520 and type “Average”.
o Select cell B520, then go to the “Formulas” tab and click “More Functions” > “Statistical” > “AVERAGE”. Then enter the range of
cells you want to average (B2:B518).
o Hover over the lower right corner of cell B520 until a black plus sign appears. Then click and drag across the row so that
the averages of the remaining columns appear. (This is a shortcut; check that it worked.)
o Be careful with missing values (empty cells): replace the empty contents with the string NULL or NA, so that Excel does not
introduce a zero value for this cell in the calculation.
o Check whether the averages you get are zero. If an average is not zero:
o Select cell U2 and enter the formula =B2-B$520, that is, the first cell (B2) minus the average cell (B520). The $ keeps the
reference to row 520 fixed when the formula is applied to the other cells.
o Then apply the formula to the other cells in that column. (Check that it worked.)
o Then compute the average of each new column and verify that it equals zero (or is very close to zero, such as 4.29057E-16).
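The centering steps can also be sketched in Python. This is an illustrative sketch rather than part of the original lab: the function name `center_column` and the sample values are hypothetical, and missing values are represented as `None` so that they are skipped instead of being counted as zero.

```python
from statistics import fmean


def center_column(values):
    """Subtract the column mean from every present value; missing values stay None."""
    present = [v for v in values if v is not None]
    mean = fmean(present)  # average over present values only, like the AVERAGE row
    return [None if v is None else v - mean for v in values]


# Hypothetical column from one microarray experiment, with one missing value.
column = [2.0, 4.0, None, 6.0]
centered = center_column(column)

# The mean of the centered column should be zero, up to floating-point error.
centered_mean = fmean([v for v in centered if v is not None])
```

As in the Excel check, `centered_mean` may come out as a tiny nonzero number rather than exactly zero on real data.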
Scaling data:
o For each column (corresponding to one DNA microarray experiment), calculate the standard deviation.
o Divide each value of the experiment by the corresponding standard deviation.
o Once the calculation is done, verify that the standard deviation of the column is equal to one.
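The scaling steps can be sketched the same way. The slides do not say which Excel standard deviation function to use; this sketch assumes the sample standard deviation (n - 1 denominator), which is what Excel's STDEV computes. The function name `scale_column` and the values are hypothetical.

```python
from statistics import stdev


def scale_column(values):
    """Divide every present value by the column's sample standard deviation."""
    present = [v for v in values if v is not None]
    sd = stdev(present)  # sample standard deviation (n - 1); an assumption, see lead-in
    return [None if v is None else v / sd for v in values]


# Hypothetical column that has already been centered (mean = 0).
centered = [-2.0, 0.0, None, 2.0]
scaled = scale_column(centered)

# Verification step: the standard deviation of the scaled column should be one.
scaled_sd = stdev([v for v in scaled if v is not None])
```

Scaling after centering leaves the mean at zero, so the column ends up with mean 0 and standard deviation 1.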
Objective: Use of basic Excel functionalities to solve some gene expression data analysis needs
• Frouin, V. & Gidrol, X. (2005)
• CBB group (Berlin)
T2. Using ArrayTrack
Objective: Workbench driven gene expression analysis using ArrayTrack
T2.1. ArrayTrack overview
o Microarray Database
o Biological libraries (gene, pathway, protein, …)
o Microarray and Systems Biology Tools (quality, normalisation, analysis, visualization)
T2.2. Descriptive statistics: exploring gene expression data in ArrayTrack
o Principal Component Analysis (PCA)
o Hierarchical Cluster Analysis (HCA)
o Correlation Matrix
o Scatter plot
o K-means
T2.3. Accessing gene expression profiles using BarChart and Venn Diagram
o Select 4 gene lists
o Right-click to view the Venn diagram
o Pathway level browsing of Venn diagram sub-dataset
o Gene ontology level browsing of Venn diagram sub-dataset
o Open BarChart from gene List.
o Query and sort data table
o Grouping data bars
o Standard deviation view
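To make the Venn diagram step concrete, the sketch below computes which genes fall into each region for 4 gene lists. It uses plain Python sets and does not call ArrayTrack; the list names, gene identifiers, and the function `venn_regions` are hypothetical.

```python
# Four hypothetical gene lists, standing in for lists selected in ArrayTrack.
gene_lists = {
    "A": {"g1", "g2", "g3", "g4"},
    "B": {"g2", "g3", "g5"},
    "C": {"g3", "g4", "g6"},
    "D": {"g3", "g7"},
}


def venn_regions(named_sets):
    """Map each combination of list names to the genes found in exactly those lists."""
    regions = {}
    all_genes = set().union(*named_sets.values())
    for gene in all_genes:
        # The region is identified by the sorted names of every list containing the gene.
        members = tuple(sorted(name for name, genes in named_sets.items() if gene in genes))
        regions.setdefault(members, set()).add(gene)
    return regions


regions = venn_regions(gene_lists)
for members, genes in sorted(regions.items()):
    print(" & ".join(members), "->", sorted(genes))
```

Each region corresponds to one sub-dataset of the Venn diagram, which is the unit browsed at the pathway or gene ontology level.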
T2.4. Using SAM through ArrayTrack
o Two class paired
o Two class unpaired
o One class
o Multi class
o Survival
o One class timecourse
o Two class unpaired timecourse
o Two class paired timecourse
T3. Gene Expression and Analysis for ATP13A2
T3.1. Gene Expression Profile Querying using Gene Atlas
On the Gene Atlas website: http://www.ebi.ac.uk/gxa/
o Filter the search by selecting “homo sapiens” for “organism”, and “H2O2” and “starvation” for “condition”.
o Then click “Search Atlas”.
o Click the bottom right corner of the result that comes up for more details.
T3.2. Experimental Condition Driven Dataset Extraction in ArrayExpress
On the ArrayExpress website: http://www.ebi.ac.uk/arrayexpress/
o Search for “H2O2”.
o Filter by selecting “human” for “organism”, and “Array assay” instead of “all technologies”. Then click “Filter”.
o Find and click the experiment with the accession number E-GEOD-26143.
T3.3. Gene Expression data quick analysis using GEO2R
o Scroll down the page to the “Link” heading and click the link “GEO-GSE26143”. This page shows
the details of the experiment.
o Next, scroll down the page and click the link “Analyze with GEO2R”.
o Then group the samples of the experiment. In this example, go to the end of
the column titles and click “Treatment Group (Ch2)” so that the samples are sorted by that column.
o Then separate the samples into 3 groups: click and drag to select group 1 (the first 3 samples), click “define group” above
the table, type the name “Ble”, press Enter, and click “Ble”; do the same for the next 6 samples with the name “Gamma”, and for the
next 3 with the name “Control”.
o Click the “Visualize distribution” tab at the bottom of the page and click “View” to see the result of the
experiment as a box plot.
o Click the “Profile Graph” tab and search for the gene “5783”.
o Click the “GEO2R” tab and click “Top 250” to list the genes that are most differentially expressed.
o Here you can search for the gene “5783” again, and save the results.
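The idea behind defining groups and then listing the top differentially expressed genes can be sketched as follows. This is a deliberate simplification: GEO2R ranks genes with limma-based statistics, whereas this sketch ranks by the absolute difference of group means only. The probe IDs, values, sample indices, and the function `rank_by_mean_difference` are hypothetical; only the group names “Ble” and “Control” come from the steps above.

```python
from statistics import fmean

# Hypothetical log2 expression values: probe ID -> one value per sample.
expression = {
    "probe_1": [5.0, 5.2, 4.8, 7.0, 7.1, 6.9],
    "probe_2": [3.0, 3.1, 2.9, 3.0, 3.2, 2.8],
    "probe_3": [9.0, 9.1, 8.9, 6.0, 6.2, 5.8],
}

# Sample indices per group, analogous to defining groups in the sample table.
groups = {"Ble": [0, 1, 2], "Control": [3, 4, 5]}


def rank_by_mean_difference(data, group_a, group_b, top_n):
    """Rank probes by |mean(group_a) - mean(group_b)|; a simplification, not limma."""
    scored = []
    for probe, values in data.items():
        mean_a = fmean([values[i] for i in group_a])
        mean_b = fmean([values[i] for i in group_b])
        scored.append((probe, mean_a - mean_b))
    scored.sort(key=lambda item: abs(item[1]), reverse=True)
    return scored[:top_n]


top = rank_by_mean_difference(expression, groups["Ble"], groups["Control"], top_n=2)
for probe, diff in top:
    print(probe, round(diff, 2))
```

A mean difference ignores variability within each group, which is why tools such as GEO2R report test statistics and adjusted p-values instead.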
Objective: Use of Gene Expression and Analysis Tools to discover more information about the gene
ATP13A2.
