Applied Statistics
Part 1
By
M. H. Farjoo MD, PhD
Shahid Beheshti University of Medical Sciences
Instagram: @bio_animation
Applied Statistics
part 1
 Introduction
 Normal (Gaussian) Distribution
 Standard Deviation
 Standard Error of the Mean
 Confidence Interval of the Mean
Which one do you hate most?
Cockroaches or statistics?
Introduction
 There are three kinds of lies: Lies, Damn Lies, and Statistics!
(Mark Twain)
 I can prove anything by statistics - except the truth. (George
Canning)
 It is true that we can easily tell a lie with statistics, yet it is even easier to
lie without them!
 Statistics is for drawing conclusions about a population as accurately as
possible.
 … and no one claims it is error free.
Introduction (Cont'd)
 With statistics we try to extrapolate findings from a
“sample” to the “population”.
 Kinds of variables:
 Categorical (nominal, qualitative)
 Measurement (numeric, quantitative)
 Continuous
 Discrete
 Ordinal (values are ranked, but the differences between them are not necessarily equal)
 Interval (differences between values are equal)
 Ratio (like interval, but with a true zero)
Normal (Gaussian) Distribution
 The experimenters think it can be proved by
mathematics; and the mathematicians believe it has
been established by observation. (G. Lippmann)
 Named after Carl Friedrich Gauss, a 19th-century
German mathematician.
 Many statistical tests assume that the data follow it.
 The distribution emerges when many independent
random factors act in an additive manner to create
variability.
Normal (Gaussian) Distribution (Cont'd)
μ (mu) is the mean of the population.
σ (sigma) is the standard deviation of the population.
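For reference, the curve that μ and σ describe can be written out. The slide shows it only as a figure, so the formula below is the standard textbook form of the Gaussian density, not something taken from the deck:

```latex
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)
```

Under this curve, about 68% of values lie within μ ± 1σ and about 95% within μ ± 1.96σ.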
Normal (Gaussian) Distribution (Cont'd)
10 ml pipetting, 1 time, repeated 1,000 times
10 ml pipetting, 2 times, repeated 1,000 times
10 ml pipetting, 10 times, repeated 1,000 times
10 ml pipetting, 10 times, repeated 15,000 times
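The pipetting figures can be imitated with a short simulation. This is only a sketch under assumptions that are not in the slides: each 10 ml step is given a uniform error of ±0.5 ml, "2 times" and "10 times" are read as adding up that many steps, and the function names are made up for illustration. The share of totals within one SD of the mean is printed as a rough check of shape: it is about 0.58 for a flat distribution and about 0.68 for a Gaussian one.

```python
import random
import statistics

def simulate_pipetting(steps, repeats, rng):
    """Return `repeats` totals, each the sum of `steps` pipetting steps."""
    totals = []
    for _ in range(repeats):
        # Assumption: each nominal 10 ml step has a uniform error of +/- 0.5 ml.
        total = sum(10.0 + rng.uniform(-0.5, 0.5) for _ in range(steps))
        totals.append(total)
    return totals

def fraction_within_one_sd(values):
    """Share of values that lie within one SD of their mean."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(abs(v - mean) <= sd for v in values) / len(values)

rng = random.Random(1)  # fixed seed so the run is repeatable
for steps, repeats in [(1, 1000), (2, 1000), (10, 1000), (10, 15000)]:
    totals = simulate_pipetting(steps, repeats, rng)
    print(steps, repeats, round(fraction_within_one_sd(totals), 3))
```

As more independent errors are added together, the printed share should drift toward the Gaussian value, which is the additive mechanism described earlier in the deck.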
Standard Deviation (SD)
 Standard deviation (SD) quantifies the variability or scatter of
the numbers around their mean.
 It tells you how far, typically, the numbers are
from their mean.
 The higher the SD, the more observations are
needed to estimate the mean with a given precision.
 The SD of a sample (s) is computed with N - 1 in the denominator, so it is
slightly larger than the population formula (dividing by N) gives for the same data. Why?
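The "why?" has a computational side: the sample formula divides the sum of squared deviations by N - 1, while the population formula divides by N. A minimal sketch with hypothetical measurements (the data are invented for illustration):

```python
import math
import statistics

data = [4.0, 8.0, 6.0, 5.0, 3.0, 7.0]  # hypothetical measurements

mean = statistics.fmean(data)
# Sum of squared deviations from the sample mean.
ss = sum((x - mean) ** 2 for x in data)

sd_divide_by_n = math.sqrt(ss / len(data))                # population formula
sd_divide_by_n_minus_1 = math.sqrt(ss / (len(data) - 1))  # sample formula

print(mean, sd_divide_by_n, sd_divide_by_n_minus_1)
```

The two results correspond to `statistics.pstdev` and `statistics.stdev` in Python's standard library; for the same data the N - 1 version is always the larger one.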
Standard Deviation (SD)
 Data sets may have the same mean yet different
patterns.
 Data sets may even have the same mean and SD, yet different
patterns!
 The unit of the SD is the same as the unit of the variable in
question.
 This makes it easier to interpret than the
variance.
 What is variance, and is it useful? Variance is the SD
squared; its squared units make it less useful for describing data.
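A small sketch of these points, using two hypothetical data sets constructed for illustration so that they share a mean and an SD while looking quite different:

```python
import statistics

# Hypothetical data sets, built to share a mean of 5 and the same SD.
two_clusters = [2, 2, 2, 2, 8, 8, 8, 8]          # values split into two groups
peak_with_outliers = [-1, 5, 5, 5, 5, 5, 5, 11]  # most values sit at the mean

for name, data in [("two clusters", two_clusters),
                   ("peak with outliers", peak_with_outliers)]:
    mean = statistics.mean(data)
    sd = statistics.stdev(data)      # sample SD, in the units of the data
    var = statistics.variance(data)  # sample variance, in squared units
    print(name, mean, round(sd, 3), round(var, 3))
```

Both lines print the same three numbers, so the summary statistics alone cannot tell the two patterns apart; a plot of the raw values can.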
Standard Deviation (SD)
Hands-on practice
 To calculate Mean and SD in Excel:
 For SD of a sample: =STDEV.S(number1,[number2],...)
 For Mean of a sample: =AVERAGE(number1,[number2],...)
 To calculate Mean and SD in SPSS:
 Analyze => Descriptive Statistics => Frequencies => Statistics =>
Mean & Standard Deviation check boxes
 Analyze => Descriptive Statistics => Descriptives => Options =>
Mean & Standard Deviation check boxes
 Analyze => Descriptive Statistics => Explore => Statistics =>
Descriptive check box
 To calculate Mean and SD in Prism:
 Analyze => Column statistics => Mean, SD, SEM check box
Standard Error of The Mean (SEM)
Individual observations (X's) and means (red dots) for random
samples from a population with a parametric mean of 5 (blue line)
Standard Error of The Mean (SEM)
Means ±1 standard deviation of 100 random samples (N=3) from a
population with a parametric mean of 5 (blue line).
Note that there are 100 means, NOT 100 observations. Each “X” represents
the mean of 3 observations.
The calculated standard deviation of the 100 sample means is 0.63.
Standard Error of The Mean (SEM)
Means ±1 standard error of 100 random samples (N=3) from a population
with a parametric mean of 5 (blue line).
Note that there are 100 means, NOT 100 observations. Each “X” represents
the mean of 3 observations.
Standard Error of The Mean (SEM)
Means ±1 standard error of 100 random samples (N=20) from a population
with a parametric mean of 5 (blue line).
Note that there are 100 means, NOT 100 observations. Each “X” represents
the mean of 20 observations.
Standard Error of The Mean (SEM)
 The SEM quantifies how precisely you know the true
mean of the population.
 It is a measure of how far your sample mean is likely to
be from the true population mean.
 $\mathrm{SEM} = \mathrm{SD} / \sqrt{N}$
 The higher the SD, the less precise your estimate of
the population mean.
 For any sample with N > 1, the SEM is smaller than the SD. Why?
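The relation between the SEM and the sample size can be checked by simulation, in the spirit of the figures above. This is a sketch under assumptions: the population is taken to be Gaussian with a mean of 5 (as in the figures) and an SD of 1.0 (an assumed value; the slides do not state it), and the function name is hypothetical.

```python
import math
import random
import statistics

def sd_of_sample_means(n, num_samples, mu, sigma, rng):
    """Draw `num_samples` samples of size `n`; return the SD of their means."""
    means = []
    for _ in range(num_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return statistics.stdev(means)

rng = random.Random(0)  # fixed seed so the run is repeatable
mu, sigma = 5.0, 1.0    # sigma is an assumed value for illustration
for n in (3, 20):
    observed = sd_of_sample_means(n, 5000, mu, sigma, rng)
    predicted = sigma / math.sqrt(n)  # SD / sqrt(N)
    print(n, round(observed, 3), round(predicted, 3))
```

The observed scatter of the sample means should track SD / √N closely, and it shrinks as N grows from 3 to 20, which is why the error bars in the N=20 figure are shorter.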
Standard Error of The Mean (SEM)
Hands-on practice
 To calculate SE in Excel:
 For SE of a sample: =STDEV.S(sampling range)/SQRT(COUNT(sampling range))
 To calculate SE in SPSS:
 Analyze => Descriptive Statistics => Frequencies =>
Statistics => S.E. mean check box
 Analyze => Descriptive Statistics => Descriptives => Options
=> S.E. mean check box
 To calculate SE in Prism:
 Analyze => Column statistics => Mean, SD, SEM check box
Confidence Interval (CI)
 Statistically and mathematically the CI and the SEM are different,
but conceptually and practically they serve the same
purpose.
 How sure are you? The confidence interval (CI) is the way to
answer this.
 The CI of a mean tells you how precisely you have
determined the mean.
 The SEM estimates how far the mean of the sample is likely to be
from the mean of the population.
 The CI is computed from the SEM. If the CI of a difference between
two means includes ZERO, it means: NO significant difference!
Confidence Interval (CI)
Each X represents the mean of a sample, and the bars represent the SEM of
that sample. The red dots and red bars mark samples whose bars do not include the
mean of the population. The blue line is the mean of the population (which
we do NOT know).
Confidence Interval (CI)
 By convention, the 95% CI is the one usually calculated.
 A 95% CI is a range that you can be 95% certain
contains the true mean of the population.
 Don't misinterpret CI as the range that contains 95%
of the values!
 Is it possible that the CI of a mean does not include
the true mean?
Confidence Interval (CI)
The graph shows three samples (of different sizes), all drawn
from the same population.
Confidence Interval (CI)
Ten sets of data (N=5) from a Gaussian distribution
with a mean of 100 and a standard deviation of 35.
Confidence Interval (CI)
The 95% CI of the mean for each sample.
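Whether a 95% CI can miss the true mean is easy to check by simulation, using the same setting as the figure (samples of N=5 from a Gaussian population with mean 100 and SD 35). This is a sketch, not the method behind the slides: the function names are hypothetical, and the multiplier 2.776 is the two-sided 95% critical value of Student's t for 4 degrees of freedom, copied from a standard t table.

```python
import math
import random
import statistics

# Two-sided 95% critical value of Student's t for df = N - 1 = 4 (table value).
T_CRIT_DF4 = 2.776

def ci95_of_mean(sample, t_crit):
    """Return (low, high) limits of the CI of the mean: mean +/- t * SEM."""
    mean = statistics.fmean(sample)
    sem = statistics.stdev(sample) / math.sqrt(len(sample))
    return mean - t_crit * sem, mean + t_crit * sem

def coverage(num_samples, n, mu, sigma, t_crit, rng):
    """Fraction of simulated samples whose CI contains the true mean `mu`."""
    hits = 0
    for _ in range(num_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        low, high = ci95_of_mean(sample, t_crit)
        hits += low <= mu <= high
    return hits / num_samples

rng = random.Random(42)  # fixed seed so the run is repeatable
print(coverage(10000, 5, 100, 35, T_CRIT_DF4, rng))
```

The printed fraction should be close to 0.95, so roughly one interval in twenty is expected to miss the true mean.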
Confidence Interval (CI)
 A common rule-of-thumb is that the 95% CI is
computed from the mean ± 2 SEMs.
 So you may roughly double the size of the SEM error
bars, to represent them as CI error bars. Why?
 With large samples the rule is accurate; with small
ones the CI is much wider than this rule
suggests.
Confidence Interval (CI)
Because for a 95% CI computed from a large sample, the multiplier of the SE is 1.96 (almost 2)
Teacher!
I do NOT like
formulas
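For readers who do want to see where 1.96 comes from, and why small samples need a larger multiplier, here is a minimal sketch. The large-sample value is computed from the standard normal distribution; the small-sample values are two-sided 95% critical values of Student's t copied from a standard t table (the dictionary name is made up for illustration).

```python
from statistics import NormalDist

# Large-sample multiplier: the 97.5th percentile of the standard normal.
z = NormalDist().inv_cdf(0.975)
print("large samples:", round(z, 3))

# Two-sided 95% critical values of Student's t, keyed by degrees of
# freedom (df = N - 1). These are table values, not computed here.
T_CRIT_95 = {2: 4.303, 4: 2.776, 9: 2.262, 19: 2.093, 29: 2.045, 99: 1.984}

for df, t_crit in T_CRIT_95.items():
    # How much wider the CI is than the "mean +/- 2 SEM" rule of thumb.
    print("N =", df + 1, "multiplier", t_crit, "ratio to 2:", round(t_crit / 2, 2))
```

With N = 3 the multiplier is more than double the rule of thumb, while by N = 30 it is already within a few percent of 2.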
Confidence Interval (CI)
 We can express the precision of any computed value
as a 95% CI (e.g., the CI of a best-fit slope or the
CI of an SD).
 There is a myth that when two means have
overlapping CIs, the means are not significantly
different.
 Another version is: if each mean is outside the CI of
the other mean, the means are significantly different.
 Neither of these is true!
Confidence Interval (CI)
 It is easy for two sets of data to have overlapping CIs,
yet still be significantly different.
 Conversely, each mean can be outside the confidence
interval of the other, yet they're still not significantly
different.
 Do not compare two means by visually comparing
their confidence intervals; use the correct
statistical test instead.
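Both situations can be reproduced with summary numbers. The sketch below rests on assumptions that are not in the slides: the means and SEMs are hypothetical, the two groups are independent, and the samples are large enough that a z-test on the difference is a fair stand-in for "the correct statistical test". The function name is made up for illustration.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()
Z95 = STD_NORMAL.inv_cdf(0.975)  # about 1.96

def compare_means(mean_a, sem_a, mean_b, sem_b):
    """Return (CIs overlap?, each mean outside the other CI?, two-sided p)."""
    ci_a = (mean_a - Z95 * sem_a, mean_a + Z95 * sem_a)
    ci_b = (mean_b - Z95 * sem_b, mean_b + Z95 * sem_b)
    cis_overlap = ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]
    each_mean_outside = (not ci_a[0] <= mean_b <= ci_a[1]
                         and not ci_b[0] <= mean_a <= ci_b[1])
    # z-test on the difference; the SE of a difference combines both SEMs.
    z = (mean_b - mean_a) / math.sqrt(sem_a ** 2 + sem_b ** 2)
    p_value = 2 * (1 - STD_NORMAL.cdf(abs(z)))
    return cis_overlap, each_mean_outside, p_value

# Hypothetical case 1: the CIs overlap, yet the difference is significant.
print(compare_means(0.0, 0.3, 1.0, 0.3))
# Hypothetical case 2: each mean is outside the other CI, yet p > 0.05.
print(compare_means(0.0, 0.3, 0.7, 0.3))
```

The reason is that the test uses the standard error of the difference, which is smaller than the sum of the two CI half-widths but larger than either one alone.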
Confidence Interval (CI)
 The error bars may be asymmetrical.
 This is especially true for counts and proportions, e.g., the
number of cigarettes smoked or the number of color-blind
men.
 In these cases a zero or negative limit makes no sense.
 We know this because there are some occurrences of the
variable in the population.
 The calculation method of the CI is different in such
cases.
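The slides do not say which method is meant. One widely used choice for a proportion is the Wilson score interval, sketched below as an assumption, with a hypothetical example of 2 color-blind men among 50.

```python
import math
from statistics import NormalDist

def wilson_ci(successes, n, confidence=0.95):
    """Wilson score interval for a proportion; returns (low, high)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = successes / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half, center + half

# Hypothetical data: 2 color-blind men in a sample of 50.
p_hat = 2 / 50
low, high = wilson_ci(2, 50)
print(round(low, 4), p_hat, round(high, 4))
print("lower arm:", round(p_hat - low, 4), "upper arm:", round(high - p_hat, 4))
```

The lower limit stays above zero and the upper arm is longer than the lower one, which is the asymmetry described above.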
Confidence Interval (CI)
Hands-on practice
 To calculate CI in Excel:
 For normal distribution:
=CONFIDENCE.NORM(alpha,standard_dev,size) gives the margin to add to and subtract from the mean
 To calculate CI in SPSS:
 Analyze => Descriptive Statistics => Explore => Statistics =>
Descriptive check box
 Analyze => Compare means => One sample T Test =>
Options
 To calculate CI in Prism:
 Analyze => Column statistics => CI of the Mean check box
Thank you
Any questions?
1.6: Introduction to Plots
• A plot (graph) is a graphical technique for
representing a data set.
• Graphs are a visual representation of
variables and of the relationships between
them.
• Plots are very useful because people can
quickly derive from them an understanding that
would not come from lists of values.
Bar Chart
Clustered Bar Chart
Pie Chart
Histogram
Area Chart
Box Plot
Charts
6/23/2009 Arsia Jamali-Students' Scientific Research Center
Pie Chart: Distribution of Stage of the Pancreatic Cancer In Patients (stages I, II, III, IV)
Bar Chart: Distribution of Stage of the Pancreatic Cancer In Patients (stages I, II, III, IV; vertical axis from 0 to 120)
Charts
Histogram
Area
Charts
Box Plot
Error Bar
Charts
Clustered Bar: Distribution of Stage of the Pancreatic Cancer According To Age In Patients (x-axis: Stage, IV to I, labeled 105, 80, 20, 10; y-axis: Number of The Patients, 0 to 70; series: Male, Female)
Scatter Plot: Birth Weight (0 to 4500) against Gestational Week (0 to 50)
Significance of Cluster Bar
Distribution of Stage in Pancreatic Cancer Patients (Stage I to Stage IV; vertical axis from 0% to 30%)
Distribution of Gender in Pancreatic Cancer Patients (Male, Female; vertical axis from 0% to 60%)
Significance of Cluster Bar
Clustered bar chart by stage (Stage I to Stage IV; series: Male, Female; vertical axis from 0.0% to 90.0%)
