Standard Error Of Proportion,
Difference Of Mean And
Difference Of Proportion, t test,
chi square test, anova
Dr. Syed Razi Haider Zaidi
Associate Professor Community Medicine,
SIMS, Lahore.
Standard error of proportion
• The standard error of a proportion is a statistic indicating how much a particular sample proportion is likely to differ from the population proportion, p. Let p̂ represent the proportion observed in a sample.
• $SE(\hat{p}) = \sqrt{\hat{p}\hat{q}/n}$
• Here $\hat{q} = 1 - \hat{p}$, and n represents the sample size.
• For example, if 47 of the 300 residents in the sample supported the use of the covid vaccine, the sample proportion, p̂, would be calculated as 47 / 300 = 0.157.
• This means our best estimate for the proportion of residents in the population who support the vaccine would be 0.157.
• However, there’s no guarantee that this estimate will exactly match
the true population proportion so we typically calculate the standard
error of the proportion as well.
This is calculated as:
Standard Error of the Proportion Formula:
$\text{Standard Error} = \sqrt{\hat{p}(1-\hat{p})/n}$
For example, if $\hat{p} = 0.157$ and n = 300, then we would calculate the standard error of the proportion as:
$\text{Standard error of the proportion} = \sqrt{0.157(1-0.157)/300} = 0.021$
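The calculation above can be reproduced with a short Python sketch. The function name `standard_error_proportion` is a hypothetical helper introduced here for illustration; it simply applies the formula from this slide to the 47-out-of-300 example.

```python
import math

def standard_error_proportion(successes: int, n: int) -> float:
    """Standard error of a sample proportion: sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n          # sample proportion
    return math.sqrt(p_hat * (1 - p_hat) / n)

# Example from the slide: 47 of 300 residents support the vaccine.
se = standard_error_proportion(47, 300)
print(round(se, 3))
```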
We then typically use this standard error to calculate a confidence
interval for the true proportion of residents who
support the covid vaccine.
This is calculated as:
Confidence Interval for a Population Proportion Formula:
$\text{Confidence Interval} = \hat{p} \pm z\sqrt{\hat{p}(1-\hat{p})/n}$
Looking at this formula, it’s easy to see that the larger the standard error of the proportion, the wider the confidence interval.
Note that the z in the formula is the z-value that corresponds to the chosen confidence level. Popular choices are:
Confidence Level    z-value
0.90                1.645
0.95                1.96
0.99                2.58
• For example, here’s how to calculate a 95% confidence interval for the true proportion of residents in the city who support the new vaccines:
• $95\%\ \text{C.I.} = \hat{p} \pm z\sqrt{\hat{p}(1-\hat{p})/n}$
• $95\%\ \text{C.I.} = 0.157 \pm 1.96\sqrt{0.157(1-0.157)/300}$
• $95\%\ \text{C.I.} = 0.157 \pm 1.96(0.021)$
• $95\%\ \text{C.I.} = [0.11584,\ 0.19816]$
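As a rough check on the interval, the same arithmetic can be sketched in Python. The helper `proportion_ci` and the lookup table `Z_VALUES` are illustrative names, not part of the original slides; the z-values are the three listed in the table above.

```python
import math

# z-values for the common confidence levels listed in the slides
Z_VALUES = {0.90: 1.645, 0.95: 1.96, 0.99: 2.58}

def proportion_ci(p_hat: float, n: int, level: float = 0.95):
    """Return (lower, upper) bounds of p_hat +/- z * sqrt(p_hat(1 - p_hat)/n)."""
    z = Z_VALUES[level]
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

low, high = proportion_ci(0.157, 300)
print(round(low, 3), round(high, 3))
```

Because the code does not round the standard error to 0.021 before multiplying, its bounds may differ from the hand calculation in the fourth decimal place.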
• 2) The proportion of blood group A among Indians is 30%. In a batch of 100 individuals it is observed as 25%. What is your conclusion about the group?
• Ans: Given values: n = 100; p = proportion of blood group A in the sample = 25%; q = 100 − p = 75%; P = proportion of blood group A in the Indian population = 30%; Q = 100 − P = 70%
• H0: The sample is drawn from the Indian population with population proportion of blood group A, P = 30%
• H1: The sample is not drawn from the Indian population with population proportion of blood group A, P ≠ 30%
• Z-test for proportion: $Z = (p - P)/SE(P)$
• $SE(P) = \sqrt{PQ/n} = \sqrt{30 \times 70/100} = \sqrt{21} = 4.58$
• $Z = (25 - 30)/4.58 = -1.09$, so $|Z| = 1.09$
• 1.09 < 1.96
• Here the calculated |Z| < 1.96, hence we do not reject the null hypothesis: the sample is consistent with being drawn from the Indian population.
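The blood-group example can be sketched in Python as follows. The function name `one_sample_proportion_z` is hypothetical, and proportions are written as fractions (0.25, 0.30) rather than percentages, which leaves the Z value unchanged.

```python
import math

def one_sample_proportion_z(p_sample: float, p_pop: float, n: int) -> float:
    """Z statistic for a sample proportion against a hypothesised population
    proportion, using SE = sqrt(P * Q / n) under the null hypothesis."""
    se = math.sqrt(p_pop * (1 - p_pop) / n)
    return (p_sample - p_pop) / se

z = one_sample_proportion_z(0.25, 0.30, 100)
reject_h0 = abs(z) >= 1.96   # two-sided test at the 5% level
print(round(z, 2), reject_h0)
```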
Hypothesis
• In statistics, a hypothesis is a formal statement about the relationship between two or more variables in a specified population. It helps the researcher translate the research problem into a clear, testable prediction of the expected outcome of the study.
• Null Hypothesis
• The null hypothesis states that there is no significant difference between the populations specified in the experiment. The null hypothesis is denoted by H0.
• Alternative Hypothesis
• The alternative hypothesis states that there is a difference between the populations specified. It is denoted by Ha or H1.
Hypothesis testing
• Hypothesis testing is used to assess the plausibility of a hypothesis by
using sample data.
• The test provides evidence concerning the plausibility of the
hypothesis, given the data.
• Statistical analysts test a hypothesis by measuring and examining a
random sample of the population being analyzed.
• The four steps of hypothesis testing include stating the hypotheses,
formulating an analysis plan, analyzing the sample data, and analyzing
the result.
• Analysts use a random sample of the population to test two different hypotheses: the null hypothesis and the alternative hypothesis.
• The null hypothesis is usually a hypothesis of equality between population parameters; e.g., a null hypothesis may state that the population mean return is equal to zero. The alternative hypothesis is effectively the opposite of the null hypothesis. Thus, they are mutually exclusive, and only one can be true. However, one of the two hypotheses will always be true.
• The null hypothesis is a statement about a population parameter, such as the population mean, that is assumed to be true until the sample data provide evidence against it.
Type I and Type II Errors
• When a statistical hypothesis is tested, there are 4 possible results:
• (1) The hypothesis is true and our test accepts it.
• (2) The hypothesis is false and our test rejects it.
• (3) The hypothesis is true but our test rejects it.
• (4) The hypothesis is false but our test accepts it.
• Obviously, the last 2 possibilities lead to errors. Rejecting a null hypothesis when it is true is called a Type I error. Accepting a null hypothesis when it is false is called a Type II error.
Test of significance
• We need to run a test of significance to obtain the p-value. Common tests include:
• Z test
• Chi-square test
• t test
• ANOVA
• and others
What is Statistical Significance?
• In statistics, “significant” means “unlikely to be due to chance alone”. When a statistician declares that a result is “highly significant”, the claim is that such a result would be very unlikely to arise by chance if the null hypothesis were true. It does not mean that the result is large or important, only that it is unlikely to be a chance finding.
• Level of Significance Definition
• The level of significance is the fixed probability of wrongly rejecting the null hypothesis when it is in fact true. In other words, it is the probability of a Type I error (rejecting the null hypothesis when it is true), and it is set in advance by the researcher according to the consequences of such an error. The level of significance is the threshold against which statistical significance is judged: it determines whether the null hypothesis is rejected or not rejected.
• Level of Significance Symbol
• The level of significance is denoted by the Greek symbol α (alpha). Therefore, the level of significance is defined as follows:
• Significance Level = P(Type I error) = α
• Values or observations become less likely the farther they lie from the mean. The results are written as “significant at x%”.
• Example: “significant at 5%” means that the p-value is less than 0.05, or p < 0.05. Similarly, “significant at 1%” means that the p-value is less than 0.01.
• The level of significance is usually taken as 0.05, or 5%. A low p-value means that the observed values differ significantly from the population value that was hypothesised at the beginning. The lower the p-value, the stronger the evidence against the null hypothesis, so a result with a very small p-value is called highly significant. Most generally, p-values smaller than 0.05 are regarded as significant, since a p-value below 0.05 is unlikely to occur when the null hypothesis is true.
• How to Find the Level of Significance?
• To judge the statistical significance of a result, the investigator first needs to calculate the p-value. The p-value is the probability of observing an effect at least as extreme as the one found, given that the null hypothesis is true. When the p-value is less than the level of significance (α), the null hypothesis is rejected. If the observed p-value is not less than the significance level α, the null hypothesis is not rejected. The level of significance is generally kept at 0.05.
• If p > 0.01 and p ≤ 0.05, there is strong evidence against the null hypothesis.
• If p ≤ 0.01, there is very strong evidence against the null hypothesis.
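The decision rule above can be illustrated with a small Python sketch. It assumes a two-sided Z test, so the p-value is taken from the standard normal distribution; the function names `two_sided_p_value` and `decide` are hypothetical, and the Z value of 1.09 is used purely for illustration.

```python
import math

def two_sided_p_value(z: float) -> float:
    """Two-sided p-value for a Z statistic under the standard normal
    distribution: P(|Z| >= |z|) = erfc(|z| / sqrt(2))."""
    return math.erfc(abs(z) / math.sqrt(2))

def decide(p_value: float, alpha: float = 0.05) -> str:
    """Apply the rule: reject H0 when the p-value is below alpha."""
    return "reject H0" if p_value < alpha else "fail to reject H0"

p = two_sided_p_value(1.09)
print(round(p, 3), decide(p))
```

A Z value of 1.96 gives a two-sided p-value of about 0.05, which is why 1.96 serves as the cut-off for the 5% level in the worked examples.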
Hypothesis testing pearl of wisdom
• There are 5 main steps in hypothesis testing:
• State your research hypothesis as a null hypothesis (Ho) and an alternative hypothesis (Ha or H1).
• Collect data in a way designed to test the hypothesis.
• Perform an appropriate statistical test (e.g., Z test, t test, chi-square test, ANOVA).
• Decide whether to reject or fail to reject your null hypothesis.
• Present the findings in your results and discussion section.
• Based on the outcome of your statistical test, you will have to decide
whether to reject or fail to reject your null hypothesis.
• In most cases you will use the p-value generated by your statistical
test to guide your decision. And in most cases, your predetermined
level of significance for rejecting the null hypothesis will be 0.05 –
that is, when there is a less than 5% chance that you would see these
results if the null hypothesis were true.
Standard error of difference between two proportions (Z test)
• Ho: There is no significant difference between the two population proportions, P1 = P2
• H1: There is a significant difference between the two population proportions, P1 ≠ P2
• Z = observed difference between proportions / SE(p1 − p2)
• $Z = |p_1 - p_2| / SE(p_1 - p_2)$
• $SE(p_1 - p_2) = \sqrt{p_1 q_1/n_1 + p_2 q_2/n_2}$
• If Z < 1.96, do not reject Ho; otherwise reject Ho (at the 5% level of significance).
• A survey of 400 children in the age group 0-5 years showed the prevalence rate of protein calorie malnutrition to be 15%. Another study showed a prevalence of 5% in a sample of 300 children of the same age group. Can we say that the difference between the two prevalence rates is statistically significant?
• Ans: Given values n1 = 400, p1 = 15%, q1 = 100 − 15 = 85%
• n2 = 300, p2 = 5%, q2 = 100 − 5 = 95%
• Z-test for difference between two proportions
• Ho: There is no significant difference between the two population (prevalence) proportions, P1 = P2
• H1: There is a significant difference between the two population proportions, P1 ≠ P2
• $Z = |p_1 - p_2| / SE(p_1 - p_2)$
• $SE(p_1 - p_2) = \sqrt{p_1 q_1/n_1 + p_2 q_2/n_2} = \sqrt{15 \times 85/400 + 5 \times 95/300} = 2.18$
• $Z = |15 - 5| / 2.18 = 4.59 > 1.96$
• Here the calculated Z > 1.96, hence we reject Ho: there is a significant difference in the prevalence of protein calorie malnutrition.
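The malnutrition example can be checked with the sketch below. The function name `two_proportion_z` is hypothetical. It uses the unpooled standard error given in the slides; note that some textbooks instead pool the two proportions under Ho, which would give a slightly different Z.

```python
import math

def two_proportion_z(p1: float, n1: int, p2: float, n2: int):
    """Return (SE, Z) for the difference between two sample proportions,
    with SE = sqrt(p1*q1/n1 + p2*q2/n2) as in the slides (unpooled)."""
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = abs(p1 - p2) / se
    return se, z

# Prevalence of 15% among 400 children versus 5% among 300 children.
se, z = two_proportion_z(0.15, 400, 0.05, 300)
print(round(se, 4), round(z, 2), z > 1.96)
```

Working with fractions instead of percentages scales both the difference and the standard error by 100, so Z is unchanged. Because the code does not round the standard error before dividing, it gives Z ≈ 4.58 rather than the 4.59 obtained by hand from SE = 2.18.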
Standard error of difference of means
• Is the difference between two sample means significant or not? That is, does the difference reflect a real difference between the underlying populations (the samples represent two different universes), or could it have arisen by chance?
Drug trial to see effect on kidney weight
Group          Number (n)   Mean   SD
Control        12           318    10.2
Experimental   12           370    24.1
• $SE(d) \text{ between means} = \sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$
• $= \sqrt{10.2^2/12 + 24.1^2/12}$
• $= \sqrt{8.67 + 48.40}$
• $\approx 7.55$
• The standard error of the difference between the two means is about 7.55. The actual difference between the means is 370 − 318 = 52, which is more than twice the S.E.(d) between means and therefore significant. We conclude that the treatment affects kidney weight.
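The kidney-weight calculation can be sketched in Python as below. The helper name `se_difference_of_means` is an assumption made for illustration; the numbers are those of the drug trial table.

```python
import math

def se_difference_of_means(sd1: float, n1: int, sd2: float, n2: int) -> float:
    """Standard error of the difference between two means:
    sqrt(sd1^2 / n1 + sd2^2 / n2)."""
    return math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)

se_d = se_difference_of_means(10.2, 12, 24.1, 12)
difference = 370 - 318            # experimental mean minus control mean
ratio = difference / se_d         # how many standard errors the difference spans
print(round(se_d, 2), difference, round(ratio, 2))
```

The ratio is well above 2, which matches the rule of thumb used in the slide: a difference larger than twice its standard error is treated as significant.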
t test
• A t-test (also known as Student's t-test) is a tool for evaluating the
means of one or two populations using hypothesis testing. A t-test
may be used to evaluate whether a single group differs from a known
value (a one-sample t-test), whether two groups differ from each
other (an independent two-sample t-test), or whether there is a
significant difference in paired measurements (a paired, or dependent
samples t-test).
• The t-test was introduced in 1908 by William Sealy Gosset.
• Gosset published his mathematical work under the pseudonym “Student”.
Assumptions of t-Test
• Dependent variables are interval or ratio.
• The population from which samples are drawn is normally distributed.
• Samples are randomly selected.
• The groups have equal variance (homogeneity of variance).
Applications of t test
• To test whether a sample mean is different from a hypothesized value.
• To compare the means of two samples.
• To compare two sample means by group.
• To calculate a confidence interval for a sample mean.
Types of “t” test
• Single sample t test: we have only 1 group and want to test against a hypothetical mean.
• Independent samples t test: we have 2 means and 2 groups, with no relation between the groups. E.g., when we want to compare the mean of the treatment group with that of the placebo group.
• Paired t test: it consists of samples of matched pairs of similar units, or one group of units tested twice. E.g., difference of means pre and post drug intervention.
One Sample t-test
• It is used to measure whether a sample value significantly differs from a hypothesized value.
• For example, a research scholar might hypothesize that on average it takes 3 minutes for people to drink a standard cup of coffee.
• He conducts an experiment and measures how long it takes his subjects to drink a standard cup of coffee.
• The one sample t-test measures whether the mean amount of time it took the experimental group to complete the task varies significantly from the hypothesized 3-minute value.
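A minimal sketch of the one-sample t statistic for the coffee example follows. The drinking times are invented purely for illustration (the slides give no data), and `one_sample_t` is a hypothetical helper. The statistic is t = (sample mean − hypothesized mean) / (s / √n), with n − 1 degrees of freedom.

```python
import math
import statistics

def one_sample_t(sample, hypothesized_mean: float):
    """Return (t, degrees_of_freedom) for a one-sample t-test."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)          # sample SD (divides by n - 1)
    t = (mean - hypothesized_mean) / (s / math.sqrt(n))
    return t, n - 1

# Hypothetical drinking times in minutes (illustrative data only).
times = [3.2, 2.9, 3.5, 3.1, 3.4, 2.8, 3.6, 3.3]
t, df = one_sample_t(times, 3.0)
print(round(t, 2), df)
```

The resulting t would then be compared with a t-table at the printed degrees of freedom to obtain the p-value.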
• The independent sample t-test compares the mean values of a continuous-level variable (interval or ratio data) that is normally distributed.
• The independent sample t-test compares two means.
• The independent samples t-test is also called the unpaired t-test or two-sample t-test.
• It is the t-test to be used when two separate, independent and identically distributed variables are measured.
• E.g.: 1. Comparison of the improvement in quality of life for patients who took the drug Valproate as opposed to patients who took the drug Levetiracetam in myoclonic seizures.
• 2. Comparison of mean cholesterol levels in the treatment group with the placebo group after administration of the test drug.
Assumptions
• A random sample of each population is used.
• The random samples are each made up of independent observations.
• Each sample is independent of one another.
• The population distribution of each population must be nearly
normal, or the size of the sample is large.
• To test the null hypothesis that the two population means, μ1 and μ2, are equal:
• 1. Calculate the difference between the two sample means, $\bar{x}_1 - \bar{x}_2$.
• 2. Calculate the pooled standard deviation: $s_p = \sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$
• 3. Calculate the standard error of the difference between the means: $SE(\bar{x}_1 - \bar{x}_2) = s_p\sqrt{1/n_1 + 1/n_2}$
• 4. Calculate the T-statistic, which is given by $T = (\bar{x}_1 - \bar{x}_2)/SE(\bar{x}_1 - \bar{x}_2)$
• This statistic follows a t-distribution with n1 + n2 − 2 degrees of freedom.
• 5. Use tables of the t-distribution to compare your value for T to the $t_{n_1+n_2-2}$ distribution. This will give the p-value for the unpaired t-test.
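The five steps can be sketched from summary statistics as below. The function name `pooled_t_test` is hypothetical, and the inputs (means 370 and 318, SDs 24.1 and 10.2, n = 12 per group) are chosen for illustration. The pooled standard deviation formula is the standard equal-variance one, which is an assumption here since the slide names it without spelling it out.

```python
import math

def pooled_t_test(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, degrees_of_freedom) for the unpaired, equal-variance t-test."""
    diff = mean1 - mean2                                   # step 1
    df = n1 + n2 - 2
    sp = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)  # step 2
    se = sp * math.sqrt(1 / n1 + 1 / n2)                   # step 3
    return diff / se, df                                   # step 4

t, df = pooled_t_test(370, 24.1, 12, 318, 10.2, 12)
print(round(t, 2), df)   # step 5: compare t with a t-table at df degrees of freedom
```

When the two groups have equal sizes, the pooled standard error equals the unpooled one, sqrt(s1²/n1 + s2²/n2).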
• Two independent samples t-test and z-test are both statistical tests used to compare the means of two
independent samples. However, the choice between the two tests depends on the characteristics of the data
and the assumptions that we can make about the population.
• In general, a two independent samples z-test is appropriate when the population standard deviation is known and the sample sizes are large. With large samples, the sample means are approximately normally distributed, which is what the z-test relies on.
• On the other hand, a two-sample t-test is more appropriate when the population standard deviation is unknown and the sample sizes are small. In that case the standard deviation must be estimated from the samples, and the t-distribution accounts for the extra uncertainty this introduces.
• Here is the summary of which tests out of z-test or t-test to use in which scenarios:
• Two independent samples z-test:
• Large sample size (typically > 30)
• Known population standard deviation
• Normally distributed population
Paired t test
• A paired t-test is used to compare two population means where you have two samples in which observations in one sample can be paired with observations in the other sample.
• It is used to compare two different methods of measurement or two different treatments where the measurements/treatments are applied to the same subjects.
• E.g.: 1. Pre-test/post-test samples, in which a factor is measured before and after an intervention.
• 2. Cross-over trials, in which individuals are randomized to two treatments and then the same individuals are crossed over to the alternative treatment.
• 3. Matched samples, in which individuals are matched on personal characteristics such as age and sex.
• Suppose a sample of “n” subjects were given an antihypertensive drug and we want to check blood pressure before and after treatment. We want to find out the effectiveness of the treatment by comparing the mean pre- and post-treatment values.
• To test the null hypothesis that the true mean difference is zero, the procedure is as follows:
• 1. Calculate the difference ($d_i = y_i - x_i$) between the two observations on each pair.
• 2. Calculate the mean difference, $\bar{d}$.
• 3. Calculate the standard deviation of the differences, $s_d$, and use it to calculate the standard error of the mean difference, $SE(\bar{d}) = s_d/\sqrt{n}$.
• 4. Calculate the t-statistic, which is given by $T = \bar{d}/SE(\bar{d})$. Under the null hypothesis, this statistic follows a t-distribution with n − 1 degrees of freedom.
• 5. Use tables of the t-distribution to compare your value for T to the $t_{n-1}$ distribution. This will give the p-value for the paired t-test.
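The paired procedure can be sketched in Python as follows. The blood pressure readings are invented for illustration only, and `paired_t` is a hypothetical helper. The standard error is taken as the standard deviation of the differences divided by √n, which is the usual definition and is assumed here.

```python
import math
import statistics

def paired_t(before, after):
    """Return (t, degrees_of_freedom) for a paired t-test on matched readings."""
    diffs = [y - x for x, y in zip(before, after)]   # d_i = y_i - x_i
    n = len(diffs)
    mean_d = statistics.mean(diffs)                  # mean difference
    se = statistics.stdev(diffs) / math.sqrt(n)      # SE of the mean difference
    return mean_d / se, n - 1

# Hypothetical systolic blood pressure (mmHg) before and after treatment.
before = [150, 160, 155, 148, 162, 158]
after = [140, 152, 150, 145, 150, 149]
t, df = paired_t(before, after)
print(round(t, 2), df)
```

A negative t here reflects lower readings after treatment, since each difference is computed as after minus before.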
Chi square test
standar error of proporton hypthesis testing standard error of diff t test.pptx

More Related Content

PPTX
Session V-Hypothesis Te..aisar @ hussain
PPTX
7. hypothesis_tot (1)................pptx
PPTX
Basics of Hypothesis testing for Pharmacy
PPTX
inferential statistics Part - 1 i.e Parametric tests
PPTX
Hypothesis Testing.pptx
PPTX
Tests of significance
PPTX
Confidence intervals, hypothesis testing and statistical tests of significanc...
PPTX
How to do the maths
Session V-Hypothesis Te..aisar @ hussain
7. hypothesis_tot (1)................pptx
Basics of Hypothesis testing for Pharmacy
inferential statistics Part - 1 i.e Parametric tests
Hypothesis Testing.pptx
Tests of significance
Confidence intervals, hypothesis testing and statistical tests of significanc...
How to do the maths

Similar to standar error of proporton hypthesis testing standard error of diff t test.pptx (20)

PPTX
Probalities, Estimations and Hypothesis Testing.pptx
PPTX
Elements of inferential statistics
PPTX
Intro to tests of significance qualitative
PPSX
Research Sample size by Dr Allah Yar Malik
PPT
Review Z Test Ci 1
PPTX
Levelof significance t test biostatisctics
PPTX
Hypothesis
PDF
20200519073328de6dca404c.pdfkshhjejhehdhd
PPT
HypothesisT I think I will be in there for the
PPT
Test signal for the patient and the rest of the week after Christmas
PPTX
Hypothesis testing1
PPTX
Hyp test_Ps and errors.pptx ppt relatedhypo
PPTX
Test of-significance : Z test , Chi square test
PDF
Biostatistics and epidemiology 01stats20
PPTX
Class 5 Hypothesis & Normal Disdribution.pptx
PPT
Statistics basics for oncologist kiran
PPTX
Chapter 18 Hypothesis testing (1).pptx
PDF
Test of hypotheses part i
PPTX
Significance test
PPTX
Hypothesis
Probalities, Estimations and Hypothesis Testing.pptx
Elements of inferential statistics
Intro to tests of significance qualitative
Research Sample size by Dr Allah Yar Malik
Review Z Test Ci 1
Levelof significance t test biostatisctics
Hypothesis
20200519073328de6dca404c.pdfkshhjejhehdhd
HypothesisT I think I will be in there for the
Test signal for the patient and the rest of the week after Christmas
Hypothesis testing1
Hyp test_Ps and errors.pptx ppt relatedhypo
Test of-significance : Z test , Chi square test
Biostatistics and epidemiology 01stats20
Class 5 Hypothesis & Normal Disdribution.pptx
Statistics basics for oncologist kiran
Chapter 18 Hypothesis testing (1).pptx
Test of hypotheses part i
Significance test
Hypothesis
Ad

Recently uploaded (20)

PPTX
school management -TNTEU- B.Ed., Semester II Unit 1.pptx
PDF
Insiders guide to clinical Medicine.pdf
PDF
The Lost Whites of Pakistan by Jahanzaib Mughal.pdf
PDF
2.FourierTransform-ShortQuestionswithAnswers.pdf
PPTX
Cell Types and Its function , kingdom of life
PDF
Supply Chain Operations Speaking Notes -ICLT Program
PDF
Pre independence Education in Inndia.pdf
PPTX
Renaissance Architecture: A Journey from Faith to Humanism
PDF
Saundersa Comprehensive Review for the NCLEX-RN Examination.pdf
PDF
102 student loan defaulters named and shamed – Is someone you know on the list?
PPTX
BOWEL ELIMINATION FACTORS AFFECTING AND TYPES
PDF
VCE English Exam - Section C Student Revision Booklet
PDF
RMMM.pdf make it easy to upload and study
PDF
O5-L3 Freight Transport Ops (International) V1.pdf
PPTX
PPH.pptx obstetrics and gynecology in nursing
PPTX
Pharma ospi slides which help in ospi learning
PDF
Mark Klimek Lecture Notes_240423 revision books _173037.pdf
PPTX
Institutional Correction lecture only . . .
PPTX
Microbial diseases, their pathogenesis and prophylaxis
PPTX
human mycosis Human fungal infections are called human mycosis..pptx
school management -TNTEU- B.Ed., Semester II Unit 1.pptx
Insiders guide to clinical Medicine.pdf
The Lost Whites of Pakistan by Jahanzaib Mughal.pdf
2.FourierTransform-ShortQuestionswithAnswers.pdf
Cell Types and Its function , kingdom of life
Supply Chain Operations Speaking Notes -ICLT Program
Pre independence Education in Inndia.pdf
Renaissance Architecture: A Journey from Faith to Humanism
Saundersa Comprehensive Review for the NCLEX-RN Examination.pdf
102 student loan defaulters named and shamed – Is someone you know on the list?
BOWEL ELIMINATION FACTORS AFFECTING AND TYPES
VCE English Exam - Section C Student Revision Booklet
RMMM.pdf make it easy to upload and study
O5-L3 Freight Transport Ops (International) V1.pdf
PPH.pptx obstetrics and gynecology in nursing
Pharma ospi slides which help in ospi learning
Mark Klimek Lecture Notes_240423 revision books _173037.pdf
Institutional Correction lecture only . . .
Microbial diseases, their pathogenesis and prophylaxis
human mycosis Human fungal infections are called human mycosis..pptx
Ad

standar error of proporton hypthesis testing standard error of diff t test.pptx

  • 1. Standard Error Of Proportion, Difference Of Mean And Difference Of Proportion, t test, chi square test, anova Dr. Syed Razi Haider Zaidi Associate Professor Community Medicine, SIMS, Lahore.
  • 2. Standard error of proportion • The standard error of a proportion is a statistic indicating how greatly a particular sample proportion is likely to differ from the proportion in the population proportion, p. Let p^ represent a proportion observed in a sample. • sep = sqrt (p^q^/n) • q = 1 - p, and n represents the sample size.
  • 3. • For example, if 47 of the 300 residents in the sample supported the use of covid vaccine, the sample proportion, p would be calculated as 47 / 300 = 0.157. • This means our best estimate for the proportion of residents in the population who supported the law would be 0.157. • However, there’s no guarantee that this estimate will exactly match the true population proportion so we typically calculate the standard error of the proportion as well.
  • 4. This is calculated as: Standard Error of the Proportion Formula: Standard Error = √ (1- ) / n p ̂ p ̂ For example, if = 0.157 and n = 300, then we would calculate p ̂ the standard error of the proportion as: Standard error of the proportion = √.157(1-.157) / 300 = 0.021 We then typically use this standard error to calculate a confidence interval for the true proportion of residents who support the covid vaccine.
  • 5. This is calculated as: Confidence Interval for a Population Proportion Formula: Confidence Interval = p ̂ +/- z*√ (1- ) / n p ̂ p ̂ Looking at this formula, it’s easy to see that the larger the standard error of the proportion, the wider the confidence interval.
  • 6. Note that the z in the formula is the z-value that corresponds to popular confidence level choices: For example, here’s how to calculate a 95% confidence interval for the true proportion of residents in the city who support the new law
  • 7. Confidence Level z-value 0.90 1.645 0.95 1.96 0.99 2.58 :
  • 8. • For example, here’s how to calculate a 95% confidence interval for the true proportion of residents in the city who support the new vaccines: • 95% C.I. = +/- z*√ (1- ) / n p̂ p̂ p̂ • 95% C.I. = .157 +/- 1.96*√.157(1-.157) / 300 • 95% C.I. = .157 +/- 1.96*(.021) • 95% C.I. = [ .10884 , .19816]
  • 9. • 2) The proportion of blood group A among Indians is 30%. In a batch of 100 individuals if it is observed as 25%, what is your conclusion about the group? • Ans- Given values n= 100, p= proportion of blood group A in sample =25%, q=100-p= 75% P= proportion of blood group A in Indian population =30% • H0: The sample is drawn from Indian population with population proportion of blood group A, P = 30% • H1: The sample is not drawn from Indian population with population proportion of blood group A, P ≠ 30% • Z-test for proportion:- Z= − / ( ) , 𝑝 𝑃 𝑆𝐸 𝑃 • SE(P)=sqrt / = 30 70 /100 = sqrt21 =4.58 𝑃∗𝑄 𝑛 ∗ • Z= 25−30 / 4.58 = 1.09 • 1.09 < 1.96 • Here Cal Z < 1.96 hence accept the null hypothesis, The sample is drawn from Indian population. 6
  • 10. Hypothesis • In Statistics, a hypothesis is defined as a formal statement, which gives the explanation about the relationship between the two or more variables of the specified population. It helps the researcher to translate the given problem to a clear explanation for the outcome of the study. It clearly explains and predicts the expected outcome. • Null Hypothesis • In the null hypothesis, there is no significant difference between the populations specified in the experiments. The null hypothesis is denoted by H0. • Alternative Hypothesis • In an alternative hypothesis, there is a difference between populations specified. It is denoted by the Ha or H1.
  • 11. Hypothesis testing • Hypothesis testing is used to assess the plausibility of a hypothesis by using sample data. • The test provides evidence concerning the plausibility of the hypothesis, given the data. • Statistical analysts test a hypothesis by measuring and examining a random sample of the population being analyzed. • The four steps of hypothesis testing include stating the hypotheses, formulating an analysis plan, analyzing the sample data, and analyzing the result.
  • 12. • All analysts use a random population sample to test two different hypotheses: the null hypothesis and the alternative hypothesis. • The null hypothesis is usually a hypothesis of equality between population parameters; e.g., a null hypothesis may state that the population mean return is equal to zero. The alternative hypothesis is effectively the opposite of a null hypothesis. Thus, they are mutually exclusive, and only one can be true. However, one of the two hypotheses will always be true. • The null hypothesis is a statement about a population parameter, such as the population mean, that is assumed to be true
  • 13. Type I and Type II Errors • When a statistical hypothesis is tested, there are 4 possible results: (1)The hypothesis is true but our test accepts it. • (2)The hypothesis is false but our test rejects it. • (3)The hypothesis is true but our test rejects it. • (4)The hypothesis is false but our test accepts it. • Obviously, the last 2 possibilities lead to errors. Rejecting a null hypothesis when it is true is called a Type I error. Accepting a null hypothesis when it is false is called Type II error.
  • 14. Test of significance • We need to run a test of significance to reach value of p • Z test • Chi square • t test • Anova • Etc etc
  • 15. What is Statistical Significance? • In Statistics, “significance” means “not by chance” or “probably true”. We can say that if a statistician declares that some result is “highly significant”, then he indicates by stating that it might be very probably true. It does not mean that the result is highly significant, but it suggests that it is highly probable. • Level of Significance Definition • The level of significance is defined as the fixed probability of wrong elimination of null hypothesis when in fact, it is true. The level of significance is stated to be the probability of type I error (rejecting null when it is true)and is preset by the researcher with the outcomes of error. The level of significance is the measurement of the statistical significance. It defines whether the null hypothesis is assumed to be accepted or rejected. It is expected to identify if the result is statistically significant for the null hypothesis to be false or rejected.
  • 16. • Level of Significance Symbol • The level of significance is denoted by the Greek symbol α (alpha). Therefore, the level of significance is defined as follows: • Significance Level = p (type I error) = α • The values or the observations are less likely when they are farther than the mean. The results are written as “significant at x%”. • Example: The value significant at 5% refers to p-value is less than 0.05 or p < 0.05. Similarly, significant at the 1% means that the p-value is less than 0.01. • The level of significance is taken at 0.05 or 5%. When the p-value is low, it means that the recognised values are significantly different from the population value that was hypothesised in the beginning. The p-value is said to be more significant if it is as low as possible. Also, the result would be highly significant if the p-value is very less. But, most generally, p-values smaller than 0.05 are known as significant, since getting a p-value less than 0.05 is quite a less practice.
  • 17. • How to Find the Level of Significance? • To measure the level of statistical significance of the result, the investigator first needs to calculate the p-value. It defines the probability of identifying an effect which provides that the null hypothesis is true. When the p-value is less than the level of significance (α), the null hypothesis is rejected. If the p-value so observed is not less than the significance level α, then theoretically null hypothesis is accepted. Level of significance is kept generally at 0.05. • If p > 0.01 and p ≤ 0.05, then there must be a strong assumption about the null hypothesis. • If p ≤ 0.01, then a very strong assumption about the null hypothesis is indicated.
  • 19. Hypothesis testing pearl of wisdom • There are 5 main steps in hypothesis testing: • State your research hypothesis as a null hypothesis and alternate hypothesis (Ho) and (Ha or H1). • Collect data in a way designed to test the hypothesis. • Perform an appropriate statistical test.(e.g z test, t test, chi square, anova etc) • Decide whether to reject or fail to reject your null hypothesis. • Present the findings in your results and discussion section.
  • 20. Test of significance • We need to run a test of significance to reach value of p • Z test • Chi square • t test • Anova • Etc etc
  • 21. • Based on the outcome of your statistical test, you will have to decide whether to reject or fail to reject your null hypothesis. • In most cases you will use the p-value generated by your statistical test to guide your decision. And in most cases, your predetermined level of significance for rejecting the null hypothesis will be 0.05 – that is, when there is a less than 5% chance that you would see these results if the null hypothesis were true.
  • 22. Standard error of difference between two proportions (Z test) • Ho: There is no significant difference between the two population proportions, P1 = P2 • H1: There is a significant difference between the two population proportions, P1 ≠ P2 • Z = observed difference between proportions / SE(p1 − p2) • Z = |p1 − p2| / SE(p1 − p2) • SE(p1 − p2) = sqrt{p1q1/n1 + p2q2/n2} • If Z < 1.96 then accept Ho, otherwise reject Ho.
  • 23. • A survey of 400 children in the 0-5 years age group showed the prevalence of protein calorie malnutrition to be 15%. Another study showed a prevalence of 5% in a sample of 300 children of the same age group. Can we say that the difference between the two prevalence rates is statistically significant? • Ans: Given values n1 = 400, p1 = 15%, q1 = 100 − 15 = 85% • n2 = 300, p2 = 5%, q2 = 100 − 5 = 95% • Z-test for difference between two proportions • Ho: There is no significant difference between the two population (prevalence) proportions, P1 = P2 • H1: There is a significant difference between the two population proportions, P1 ≠ P2 • Z = |p1 − p2| / SE(p1 − p2) • SE(p1 − p2) = sqrt{p1q1/n1 + p2q2/n2} = sqrt{15 × 85/400 + 5 × 95/300} = 2.18 • Z = |p1 − p2| / SE(p1 − p2) = |15 − 5| / 2.18 = 4.59 > 1.96 • Here calculated Z > 1.96, hence reject Ho: there is a significant difference in the prevalence of protein calorie malnutrition.
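The malnutrition example can be checked with a short Python sketch. It follows the unpooled standard error formula used on the slides and works with proportions (0.15, 0.05) rather than percentages; the function name `two_proportion_z` is an assumption made for illustration, and the two-sided p-value is obtained from the standard normal distribution via `math.erfc`.

```python
import math


def two_proportion_z(p1, n1, p2, n2):
    """Z test for the difference between two proportions (unpooled SE).

    Hypothetical helper: returns the standard error, the Z statistic
    and the two-sided p-value.
    """
    q1, q2 = 1 - p1, 1 - p2
    # SE(p1 - p2) = sqrt(p1*q1/n1 + p2*q2/n2), as on the slide.
    se = math.sqrt(p1 * q1 / n1 + p2 * q2 / n2)
    z = abs(p1 - p2) / se
    # Two-sided tail area of the standard normal distribution.
    p_value = math.erfc(z / math.sqrt(2))
    return se, z, p_value


# Survey 1: 15% of 400 children; survey 2: 5% of 300 children.
se, z, p_value = two_proportion_z(0.15, 400, 0.05, 300)
print(f"SE = {se:.4f}, Z = {z:.2f}, p = {p_value:.6f}")
print("reject Ho" if z > 1.96 else "do not reject Ho")
```

Computed without intermediate rounding, Z comes out at roughly 4.58; the slide's 4.59 results from rounding the standard error to 2.18 before dividing. The conclusion is the same either way.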
  • 24. Standard error of difference of means • Is the difference between two sample means significant or not? That is, does the difference also exist in the actual populations (significant), meaning that the samples represent two different universes, or not?
  • 25. Drug trial to see effect on kidney weight • Group: number (n), mean, SD • Control group: 12, 318, 10.2 • Experimental group: 12, 370, 24.1
  • 26. • S.E.(d) between means = sqrt(σ1²/n1 + σ2²/n2) • = sqrt{10.2 × 10.2/12 + 24.1 × 24.1/12} • = sqrt{8.67 + 48.4} • = 7.5 • The S.E.(d) between the two means is 7.5. The actual difference between the means is (370 − 318) = 52, which is more than twice the S.E.(d) between means and therefore significant. We conclude that the treatment affects the kidney weight.
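The kidney-weight arithmetic can be reproduced with the following Python sketch, using the group sizes, means and standard deviations from the drug trial slide. The function name `se_diff_means` is an assumption made for illustration.

```python
import math


def se_diff_means(sd1, n1, sd2, n2):
    """Standard error of the difference between two independent means."""
    return math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)


# Control group: n = 12, mean = 318, SD = 10.2
# Experimental group: n = 12, mean = 370, SD = 24.1
se_d = se_diff_means(10.2, 12, 24.1, 12)
diff = 370 - 318
ratio = diff / se_d

print(f"S.E.(d) = {se_d:.2f}")
print(f"difference = {diff}, difference / S.E.(d) = {ratio:.2f}")
# Rule of thumb from the slide: significant if the difference
# exceeds twice the standard error.
print("significant" if ratio > 2 else "not significant")
```

The standard error comes out at about 7.55, which the slide reports as 7.5.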
  • 27. t test • A t-test (also known as Student's t-test) is a tool for evaluating the means of one or two populations using hypothesis testing. A t-test may be used to evaluate whether a single group differs from a known value (a one-sample t-test), whether two groups differ from each other (an independent two-sample t-test), or whether there is a significant difference in paired measurements (a paired, or dependent samples t-test). • The test was introduced in 1908 by William Sealy Gosset. • Gosset published his mathematical work under the pseudonym “Student”.
  • 28. Assumptions of t-Test • Dependent variables are interval or ratio. • The population from which samples are drawn is normally distributed. • Samples are randomly selected. • The groups have equal variance (homogeneity of variance).
  • 29. Applications of t test • To test whether a sample mean is different from a hypothesized value. • To compare the means of two samples. • To compare two sample means by group. • To calculate a confidence interval for a population mean from a sample.
  • 30. Types of “t” test • Single sample t test – we have only 1 group and want to test against a hypothetical mean. • Independent samples t test – we have 2 means from 2 groups, with no relation between the groups. E.g. when we want to compare the mean of a treatment group with that of a placebo group. • Paired t test – it consists of samples of matched pairs of similar units, or one group of units tested twice. E.g. difference of means pre and post drug intervention.
  • 31. One Sample t-test • It is used to test whether a sample mean differs significantly from a hypothesized value. • For example, a research scholar might hypothesize that on average it takes 3 minutes for people to drink a standard cup of coffee. • He conducts an experiment and measures how long it takes his subjects to drink a standard cup of coffee. • The one sample t-test measures whether the mean amount of time it took the experimental group to complete the task varies significantly from the hypothesized value of 3 minutes.
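A one-sample t-test for the coffee example might look like the Python sketch below. The ten drinking times are invented purely for illustration (the slides give no data), the function name `one_sample_t` is an assumption, and the critical value 2.262 is the standard two-sided 5% table value for 9 degrees of freedom.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return the t statistic and degrees of freedom for H0: mean == mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # sample SD (divides by n - 1)
    se = sd / math.sqrt(n)         # standard error of the mean
    return (mean - mu0) / se, n - 1


# Hypothetical drinking times in minutes (illustrative values only).
times = [3.4, 2.9, 3.8, 3.1, 3.6, 3.3, 2.7, 3.9, 3.5, 3.2]
t_stat, df = one_sample_t(times, mu0=3.0)

T_CRITICAL = 2.262  # t-table value for df = 9, two-sided alpha = 0.05
print(f"t = {t_stat:.2f}, df = {df}")
print("reject H0" if abs(t_stat) > T_CRITICAL else "fail to reject H0")
```

With a different sample size the critical value must be looked up again for the new degrees of freedom.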
  • 35. • The independent sample t-test compares the mean values of continuous-level (interval or ratio) data that are normally distributed. • The independent sample t-test compares two means. • The independent samples t-test is also called the unpaired t-test or two sample t test. • It is the t-test to be used when two separate, independent and identically distributed variables are measured. • E.g. 1. Comparison of improvement in quality of life for patients who took the drug valproate as opposed to patients who took the drug levetiracetam in myoclonic seizures. • 2. Comparison of mean cholesterol levels in the treatment group with the placebo group after administration of the test drug.
  • 36. Assumptions • A random sample of each population is used. • The random samples are each made up of independent observations. • The samples are independent of one another. • The distribution of each population must be nearly normal, or the sample size must be large.
  • 37. • To test the null hypothesis that the two population means, μ1 and μ2, are equal: • 1. Calculate the difference between the two sample means, x̄1 − x̄2. • 2. Calculate the pooled standard deviation: sp = sqrt{[(n1 − 1)s1² + (n2 − 1)s2²] / (n1 + n2 − 2)} • 3. Calculate the standard error of the difference between the means: SE(x̄1 − x̄2) = sp × sqrt(1/n1 + 1/n2) • 4. Calculate the T-statistic, which is given by T = (x̄1 − x̄2) / SE(x̄1 − x̄2). This statistic follows a t-distribution with n1 + n2 − 2 degrees of freedom. • 5. Use tables of the t-distribution to compare your value for T to the t(n1 + n2 − 2) distribution. This will give the p-value for the unpaired t-test.
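The five steps above can be sketched in Python from summary statistics alone. The figures are taken from the kidney-weight drug trial slide (n = 12 per group, means 318 and 370, SDs 10.2 and 24.1) and serve only to illustrate the arithmetic: the two SDs differ considerably, so the equal-variance assumption is questionable for these data. The function name `pooled_t` is an assumption, the pooled SD and standard error use the standard textbook definitions, and 2.074 is the two-sided 5% table value for 22 degrees of freedom.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Unpaired t-test from summary statistics, assuming equal variances."""
    diff = mean1 - mean2                                  # step 1
    df = n1 + n2 - 2
    sp = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)  # step 2
    se = sp * math.sqrt(1 / n1 + 1 / n2)                  # step 3
    t_stat = diff / se                                    # step 4
    return sp, se, t_stat, df


sp, se, t_stat, df = pooled_t(370, 24.1, 12, 318, 10.2, 12)

T_CRITICAL = 2.074  # t-table value for df = 22, two-sided alpha = 0.05
print(f"sp = {sp:.2f}, SE = {se:.2f}, t = {t_stat:.2f}, df = {df}")
# Step 5: compare |t| with the tabulated critical value.
print("reject H0" if abs(t_stat) > T_CRITICAL else "fail to reject H0")
```

Because the two groups are the same size, the pooled standard error here equals the unpooled S.E.(d) of about 7.55 computed from the same data.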
  • 43. • The two independent samples t-test and z-test are both statistical tests used to compare the means of two independent samples. However, the choice between the two tests depends on the characteristics of the data and the assumptions that we can make about the population. • In general, a two independent samples z-test is appropriate when we know the population standard deviation and the sample sizes are large. This is because, when sample sizes are large, the sample means are approximately normally distributed, so the test statistic can be compared with the standard normal distribution. • On the other hand, a two samples t-test is more appropriate when we do not know the population standard deviation and the sample sizes are small. This is because the standard deviation must then be estimated from the samples, and the t-distribution accounts for the extra uncertainty that this introduces. • Here is the summary of which test, z-test or t-test, to use in which scenario: • Two independent samples z-test: • Large sample size (typically > 30) • Known population standard deviation • Normally distributed population
  • 44. Paired t test • A paired t-test is used to compare two population means where you have two samples in which observations in one sample can be paired with observations in the other sample. • A comparison of two different methods of measurement or two different treatments where the measurements/treatments are applied to the same subjects. • E.g. 1. Pre-test/post-test samples in which a factor is measured before and after an intervention. • 2. Cross-over trials in which individuals are randomized to two treatments and then the same individuals are crossed over to the alternative treatment. • 3. Matched samples, in which individuals are matched on personal characteristics such as age and sex.
  • 45. Paired t test • Suppose a sample of “n” subjects were given an antihypertensive drug and we want to check blood pressure before and after treatment. We want to find out the effectiveness of the treatment by comparing the mean pre and post treatment. • To test the null hypothesis that the true mean difference is zero, the procedure is as follows: • 1. Calculate the difference (di = yi − xi) between the two observations on each pair.
  • 47. • Calculate the mean difference, d̄. • 4. Calculate the t-statistic, which is given by T = d̄ / S.E.(d̄), where S.E.(d̄) = sd/sqrt(n) and sd is the standard deviation of the differences. Under the null hypothesis, this statistic follows a t-distribution with n − 1 degrees of freedom. • 5. Use tables of the t-distribution to compare your value for T to the t(n − 1) distribution. This will give the p-value for the paired t-test.
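The paired procedure can be illustrated with a short Python sketch. The blood pressure readings below are invented for the example (the slides give no data), the function name `paired_t` is an assumption, and the standard error of the mean difference is taken as the standard deviation of the differences divided by sqrt(n). The critical value 2.365 is the two-sided 5% table value for 7 degrees of freedom.

```python
import math
import statistics


def paired_t(before, after):
    """Paired t-test: returns mean difference, t statistic and df."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    # Differences d_i = y_i - x_i for each pair (after minus before).
    diffs = [y - x for x, y in zip(before, after)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    se_d = statistics.stdev(diffs) / math.sqrt(n)  # S.E. of mean difference
    return mean_d, mean_d / se_d, n - 1


# Hypothetical systolic blood pressure (mmHg) for 8 subjects.
before = [150, 160, 148, 155, 162, 158, 149, 165]
after = [140, 152, 145, 147, 150, 151, 146, 155]

mean_d, t_stat, df = paired_t(before, after)

T_CRITICAL = 2.365  # t-table value for df = 7, two-sided alpha = 0.05
print(f"mean difference = {mean_d:.3f}, t = {t_stat:.2f}, df = {df}")
print("reject H0" if abs(t_stat) > T_CRITICAL else "fail to reject H0")
```

A negative mean difference here indicates that blood pressure was lower after treatment, since each difference is computed as after minus before.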