Hypothesis Testing
• Goal: Make statement(s) regarding unknown population
parameter values based on sample data
• Elements of a hypothesis test:
– Null hypothesis - Statement regarding the value(s) of unknown
parameter(s). Typically will imply no association between
explanatory and response variables in our applications (will
always contain an equality)
– Alternative hypothesis - Statement contradictory to the null
hypothesis (will always contain an inequality)
– Test statistic - Quantity based on sample data and null
hypothesis used to test between null and alternative hypotheses
– Rejection region - Values of the test statistic for which we
reject the null in favor of the alternative hypothesis
Hypothesis Testing
                      Test Result
True State     H0 True              H0 False
H0 True        Correct Decision     Type I Error
H0 False       Type II Error        Correct Decision

$\alpha = P(\text{Type I Error}) \qquad \beta = P(\text{Type II Error})$
• Goal: Keep $\alpha$, $\beta$ reasonably small
Example - Efficacy Test for New drug
• Drug company has new drug, wishes to compare it
with current standard treatment
• Federal regulators tell company that they must
demonstrate that new drug is better than current
treatment to receive approval
• Firm runs clinical trial where some patients receive
new drug, and others receive standard treatment
• Numeric response of therapeutic effect is obtained
(higher scores are better).
• Parameter of interest: $\mu_{New} - \mu_{Std}$
Example - Efficacy Test for New drug
• Null hypothesis - New drug is no better than standard treatment:
$H_0: \mu_{New} - \mu_{Std} \le 0 \qquad (\mu_{New} - \mu_{Std} = 0)$
• Alternative hypothesis - New drug is better than standard treatment:
$H_A: \mu_{New} - \mu_{Std} > 0$
• Experimental (Sample) data:
$\bar{y}_{New},\ \bar{y}_{Std} \qquad s_{New},\ s_{Std} \qquad n_{New},\ n_{Std}$
Sampling Distribution of Difference in Means
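• In large samples, the difference in two sample means is
approximately normally distributed (the second argument of N is the
standard deviation, here the standard error of the difference):
$\bar{Y}_1 - \bar{Y}_2 \sim N\left(\mu_1 - \mu_2,\ \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}\right)$
• Under the null hypothesis, $\mu_1 - \mu_2 = 0$ and:
$Z = \frac{\bar{Y}_1 - \bar{Y}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} \sim N(0, 1)$
• $\sigma_1^2$ and $\sigma_2^2$ are unknown and estimated by $s_1^2$ and $s_2^2$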
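The large-sample claim above can be checked informally by simulation. The sketch below is an illustration only and is not part of the original slides: it assumes two skewed (exponential) populations with means 2 and 1, sample sizes 40 and 50, and a fixed random seed, all chosen arbitrarily. It compares the simulated mean and standard deviation of the difference in sample means with the theoretical values $\mu_1 - \mu_2$ and $\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$.

```python
import math
import random
import statistics


def simulate_mean_differences(n1, n2, reps, seed=0):
    """Return `reps` simulated values of ybar1 - ybar2.

    Assumption for illustration: group 1 is exponential with mean 2 (sd 2),
    group 2 is exponential with mean 1 (sd 1), so the raw data are skewed.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        y1 = [rng.expovariate(1 / 2.0) for _ in range(n1)]
        y2 = [rng.expovariate(1 / 1.0) for _ in range(n2)]
        diffs.append(statistics.fmean(y1) - statistics.fmean(y2))
    return diffs


n1, n2 = 40, 50
diffs = simulate_mean_differences(n1, n2, reps=5000)

# Theoretical mean and standard error of the difference in sample means
theory_mean = 2.0 - 1.0
theory_se = math.sqrt(2.0**2 / n1 + 1.0**2 / n2)

print(f"simulated mean {statistics.fmean(diffs):.3f} vs theory {theory_mean:.3f}")
print(f"simulated sd   {statistics.stdev(diffs):.3f} vs theory {theory_se:.3f}")
```

The two pairs of numbers should agree closely, even though the individual measurements are far from normal.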
Example - Efficacy Test for New drug
• Type I error - Concluding that the new drug is better than the
standard (HA) when in fact it is no better (H0). Ineffective drug is
deemed better.
– Traditionally $\alpha = P(\text{Type I error}) = 0.05$
• Type II error - Failing to conclude that the new drug is better
(HA) when in fact it is. Effective drug is deemed to be no better.
– Traditionally a clinically important difference ($\Delta$) is assigned
and sample sizes chosen so that:
$\beta = P(\text{Type II error} \mid \mu_1 - \mu_2 = \Delta) \le .20$
Elements of a Hypothesis Test
• Test Statistic - Difference between the Sample means,
scaled to number of standard deviations (standard errors)
from the null difference of 0 for the Population means:
T.S.: $z_{obs} = \frac{\bar{y}_1 - \bar{y}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
• Rejection Region - Set of values of the test statistic that are
consistent with HA, such that the probability it falls in this
region when H0 is true is $\alpha$ (we will always set $\alpha = 0.05$)
R.R.: $z_{obs} \ge z_{\alpha} \qquad (\alpha = 0.05 \Rightarrow z_{\alpha} = 1.645)$
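As a rough illustration of the test statistic and rejection region described above, the following Python sketch computes $z_{obs}$ from summary statistics and compares it with the critical value $z_\alpha$. The function names and the input numbers are hypothetical, made up for illustration; only the standard library is used.

```python
import math
from statistics import NormalDist


def z_statistic(ybar1, s1, n1, ybar2, s2, n2):
    """Difference in sample means divided by its estimated standard error."""
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)
    return (ybar1 - ybar2) / se


def critical_value(alpha=0.05):
    """Upper-tail critical value z_alpha of the standard normal."""
    return NormalDist().inv_cdf(1 - alpha)


def in_rejection_region(z_obs, alpha=0.05):
    """One-sided rule: reject H0 when z_obs >= z_alpha."""
    return z_obs >= critical_value(alpha)


# Hypothetical summary statistics, not taken from any study
z_obs = z_statistic(ybar1=12.0, s1=4.0, n1=50, ybar2=10.0, s2=5.0, n2=50)

print(f"z_alpha = {critical_value(0.05):.3f}")
print(f"z_obs = {z_obs:.2f}, reject H0: {in_rejection_region(z_obs)}")
```

Using `inv_cdf` instead of a hard-coded 1.645 keeps the rule usable for other values of $\alpha$.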
P-value (aka Observed Significance Level)
• P-value - Measure of the strength of evidence the sample
data provides against the null hypothesis:
P(Evidence this strong or stronger against H0 | H0 is true)
P-val: $p = P(Z \ge z_{obs})$
Large-Sample Test H0: $\mu_1-\mu_2 = 0$ vs HA: $\mu_1-\mu_2 > 0$
• H0: $\mu_1-\mu_2 = 0$ (No difference in population means)
• HA: $\mu_1-\mu_2 > 0$ (Population Mean 1 > Pop Mean 2)
T.S.: $z_{obs} = \frac{\bar{y}_1 - \bar{y}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
R.R.: $z_{obs} \ge z_{\alpha}$
P-value: $P(Z \ge z_{obs})$
• Conclusion - Reject H0 if test statistic falls in rejection region,
or equivalently the P-value is $\le \alpha$
Example - Botox for Cervical Dystonia
• Patients - Individuals suffering from cervical dystonia
• Response - Tsui score of severity of cervical dystonia
(higher scores are more severe) at week 8 of Tx
• Research (alternative) hypothesis - Botox A
decreases mean Tsui score more than placebo
• Groups - Placebo (Group 1) and Botox A (Group 2)
• Experimental (Sample) Results:
Placebo (Group 1): $\bar{y}_1 = 10.1 \quad s_1 = 3.6 \quad n_1 = 33$
Botox A (Group 2): $\bar{y}_2 = 7.7 \quad s_2 = 3.4 \quad n_2 = 35$
Source: Wissel, et al (2001)
Example - Botox for Cervical Dystonia
Test whether Botox A produces lower mean Tsui
scores than placebo ($\alpha = 0.05$)
$H_0: \mu_1 - \mu_2 = 0 \qquad H_A: \mu_1 - \mu_2 > 0$
T.S.: $z_{obs} = \frac{10.1 - 7.7}{\sqrt{\frac{(3.6)^2}{33} + \frac{(3.4)^2}{35}}} = \frac{2.4}{0.85} = 2.82$
R.R.: $z_{obs} \ge z_{.05} = 1.645$
P-val: $P(Z \ge 2.82) = .0024$
Conclusion: Botox A produces lower mean Tsui scores than
placebo (since 2.82 > 1.645 and P-value < 0.05)
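The arithmetic in this example can be reproduced with a few lines of Python. This is a sketch that uses only the summary statistics quoted on the slides from Wissel, et al (2001); the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Summary statistics as quoted on the slide
ybar1, s1, n1 = 10.1, 3.6, 33   # Placebo (Group 1)
ybar2, s2, n2 = 7.7, 3.4, 35    # Botox A (Group 2)

# Estimated standard error of the difference in sample means
se = math.sqrt(s1**2 / n1 + s2**2 / n2)

# Test statistic, one-sided critical value, and upper-tail P-value
z_obs = (ybar1 - ybar2) / se
z_crit = NormalDist().inv_cdf(0.95)
p_value = 1 - NormalDist().cdf(z_obs)

print(f"se = {se:.2f}, z_obs = {z_obs:.2f}, z_crit = {z_crit:.3f}")
print(f"P-value = {p_value:.4f}, reject H0: {z_obs >= z_crit}")
```

The values agree with the slide up to rounding: the slide rounds $z_{obs}$ to 2.82 before looking up the P-value.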
2-Sided Tests
• Many studies don’t assume a direction with respect to the
difference $\mu_1-\mu_2$
• H0: $\mu_1-\mu_2 = 0$ HA: $\mu_1-\mu_2 \ne 0$
• Test statistic is the same as before
• Decision Rule:
– Conclude $\mu_1-\mu_2 > 0$ if $z_{obs} \ge z_{\alpha/2}$ ($\alpha = 0.05 \Rightarrow z_{\alpha/2} = 1.96$)
– Conclude $\mu_1-\mu_2 < 0$ if $z_{obs} \le -z_{\alpha/2}$ ($\alpha = 0.05 \Rightarrow -z_{\alpha/2} = -1.96$)
– Do not reject $\mu_1-\mu_2 = 0$ if $-z_{\alpha/2} \le z_{obs} \le z_{\alpha/2}$
• P-value: $2P(Z \ge |z_{obs}|)$
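The two-sided decision rule and P-value can be sketched as a small Python function. The function name, the returned strings, and the example values of $z_{obs}$ are hypothetical and serve only as an illustration.

```python
from statistics import NormalDist


def two_sided_decision(z_obs, alpha=0.05):
    """Apply the two-sided rule; return (conclusion, P-value)."""
    std_normal = NormalDist()
    z_half = std_normal.inv_cdf(1 - alpha / 2)          # z_{alpha/2}
    p_value = 2 * (1 - std_normal.cdf(abs(z_obs)))      # 2 P(Z >= |z_obs|)
    if z_obs >= z_half:
        conclusion = "mu1 - mu2 > 0"
    elif z_obs <= -z_half:
        conclusion = "mu1 - mu2 < 0"
    else:
        conclusion = "do not reject mu1 - mu2 = 0"
    return conclusion, p_value


# Hypothetical observed statistics
for z in (2.5, -2.5, 1.0):
    conclusion, p = two_sided_decision(z)
    print(f"z_obs = {z:+.1f}: {conclusion} (P-value = {p:.4f})")
```

Note that H0 is rejected exactly when the P-value is at most $\alpha$, so the two branches of the rule and the P-value comparison always agree.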
Power of a Test
• Power - Probability a test rejects H0 (depends on $\mu_1-\mu_2$)
– H0 True: Power = P(Type I error) = $\alpha$
– H0 False: Power = 1-P(Type II error) = $1-\beta$
• Example:
• H0: $\mu_1-\mu_2 = 0$ HA: $\mu_1-\mu_2 > 0$
• $\sigma_1^2 = \sigma_2^2 = 25 \qquad n_1 = n_2 = 25$
• Decision Rule: Reject H0 (at $\alpha = 0.05$ significance level) if:
$z_{obs} = \frac{\bar{y}_1-\bar{y}_2}{\sqrt{\frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}}} = \frac{\bar{y}_1-\bar{y}_2}{\sqrt{2}} \ge 1.645 \ \Rightarrow\ \bar{y}_1-\bar{y}_2 \ge 2.326$
Power of a Test
• Now suppose in reality that $\mu_1-\mu_2 = 3.0$ (HA is true)
• Power now refers to the probability we (correctly)
reject the null hypothesis. Note that the sampling
distribution of the difference in sample means is
approximately normal, with mean 3.0 and standard
deviation (standard error) 1.414.
• Decision Rule (from last slide): Conclude population
means differ if the sample mean for group 1 is at least
2.326 higher than the sample mean for group 2
• Power for this case can be computed as:
$P(\bar{Y}_1 - \bar{Y}_2 \ge 2.326) \qquad \bar{Y}_1 - \bar{Y}_2 \sim N(3,\ \sqrt{2.0} = 1.414)$
Power of a Test
$\text{Power} = P(\bar{Y}_1 - \bar{Y}_2 \ge 2.326) = P\left(Z \ge \frac{2.326 - 3}{1.41} = -0.48\right) = .6844$
• All else being equal:
• As sample sizes increase, power increases
• As population variances decrease, power increases
• As the true mean difference increases, power increases
Power of a Test
[Figure: Distribution (H0) and Distribution (HA)]
Power of a Test
Power Curves for group sample sizes of 25, 50, 75, 100 and
varying true values $\mu_1-\mu_2$ with $\sigma_1 = \sigma_2 = 5$.
• For given $\mu_1-\mu_2$, power increases with sample size
• For given sample size, power increases with $\mu_1-\mu_2$
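The power calculation for the one-sided test with known variances can be sketched in Python as follows. The function name `power_one_sided` is hypothetical. The first call uses the example values from the slides (true difference 3.0, $\sigma_1 = \sigma_2 = 5$, $n_1 = n_2 = 25$); the loop then tabulates power for the group sizes named in the power-curve caption, with the grid of true differences chosen arbitrarily.

```python
import math
from statistics import NormalDist


def power_one_sided(delta, sigma1, sigma2, n1, n2, alpha=0.05):
    """Power of the one-sided z-test when the true mean difference is delta.

    H0 is rejected when ybar1 - ybar2 >= z_alpha * se, and under the true
    difference ybar1 - ybar2 is approximately N(delta, se).
    """
    se = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    return 1 - NormalDist().cdf(z_alpha - delta / se)


# Example from the slides: about 0.68 (the slide gets .6844 after rounding)
example_power = power_one_sided(3.0, 5, 5, 25, 25)
print(f"power at difference 3.0, n = 25 per group: {example_power:.3f}")

# Power for several sample sizes and true differences, sigma1 = sigma2 = 5
for n in (25, 50, 75, 100):
    row = [power_one_sided(d, 5, 5, n, n) for d in (1.0, 2.0, 3.0)]
    print(n, [round(p, 3) for p in row])
```

When the true difference is 0 the function returns $\alpha$, matching the statement that power equals P(Type I error) when H0 is true.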
Sample Size Calculations for Fixed Power
• Goal - Choose sample sizes to have a favorable chance of
detecting a clinically meaningful difference
• Step 1 - Define an important difference in means:
– Case 1: $\sigma$ approximated from prior experience or pilot study - difference
can be stated in units of the data
– Case 2: $\sigma$ unknown - difference must be stated in units of standard
deviations of the data
$\delta = \frac{\mu_1 - \mu_2}{\sigma}$
• Step 2 - Choose the desired power to detect the clinically
meaningful difference ($1-\beta$, typically at least .80). For 2-sided test:
$n_1 = n_2 = \frac{2\left(z_{\alpha/2} + z_{\beta}\right)^2}{\delta^2}$
Example - Rosiglitazone for HIV-1
Lipoatrophy
• Treatments - Rosiglitazone vs Placebo
• Response - Change in limb fat mass
• Clinically Meaningful Difference - $\delta = 0.5$ (std dev’s)
• Desired Power - $1-\beta = 0.80$
• Significance Level - $\alpha = 0.05$
$z_{\alpha/2} = 1.96 \qquad z_{\beta} = z_{.20} = 0.84$
$n_1 = n_2 = \frac{2(1.96 + 0.84)^2}{(0.5)^2} \approx 63$
Source: Carr, et al (2004)
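The sample size formula for a two-sided test can be evaluated directly. The sketch below uses a hypothetical helper name, `n_per_group`, and assumes equal group sizes and a difference stated in standard deviation units, as in this example. It rounds up to the next whole subject.

```python
import math
from statistics import NormalDist


def n_per_group(delta, power=0.80, alpha=0.05):
    """Per-group sample size n1 = n2 = 2 (z_{alpha/2} + z_beta)^2 / delta^2.

    delta is the clinically meaningful difference in standard deviation units.
    """
    z_half = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    z_beta = NormalDist().inv_cdf(power)           # z_beta, with beta = 1 - power
    return math.ceil(2 * (z_half + z_beta) ** 2 / delta**2)


# Values from the example: delta = 0.5, power = 0.80, alpha = 0.05
print(n_per_group(0.5))
```

Because $\delta$ enters as a square in the denominator, halving the difference to be detected roughly quadruples the required sample size.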
Confidence Intervals
• Normally Distributed data - approximately 95% of
individual measurements lie within 2 standard
deviations of the mean
• Difference between 2 sample means is
approximately normally distributed in large
samples (regardless of shape of distribution of
individual measurements):
$\bar{Y}_1 - \bar{Y}_2 \sim N\left(\mu_1 - \mu_2,\ \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}\right)$
• Thus, we can expect (with 95% confidence) that our sample
mean difference lies within 2 standard errors of the true difference
$(1-\alpha)100\%$ Confidence Interval for $\mu_1-\mu_2$
• Large sample Confidence Interval for $\mu_1-\mu_2$:
$(\bar{y}_1 - \bar{y}_2) \pm z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
• Standard level of confidence is 95% ($z_{.025} = 1.96 \approx 2$)
• $(1-\alpha)100\%$ CI’s and 2-sided tests reach the same
conclusions regarding whether $\mu_1-\mu_2 = 0$
Example - Viagra for ED
• Comparison of Viagra (Group 1) and Placebo (Group 2)
for ED
• Data pooled from 6 double-blind trials
• Subjects - White males
• Response - Percent of successful intercourse attempts in
past 4 weeks (Each subject reports his own percentage)
Viagra (Group 1): $\bar{y}_1 = 63.2 \quad s_1 = 41.3 \quad n_1 = 264$
Placebo (Group 2): $\bar{y}_2 = 23.5 \quad s_2 = 42.3 \quad n_2 = 240$
95% CI for $\mu_1-\mu_2$:
$(63.2 - 23.5) \pm 1.96\sqrt{\frac{(41.3)^2}{264} + \frac{(42.3)^2}{240}} \equiv 39.7 \pm 7.3 \equiv (32.4,\ 47.0)$
Source: Carson, et al (2002)
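The interval in this example can be recomputed from the summary statistics. The following is a minimal sketch with a hypothetical helper, `large_sample_ci`, that applies the large-sample formula (difference in sample means plus or minus $z_{\alpha/2}$ standard errors) to the figures quoted from Carson, et al (2002).

```python
import math
from statistics import NormalDist


def large_sample_ci(ybar1, s1, n1, ybar2, s2, n2, conf=0.95):
    """Large-sample confidence interval for mu1 - mu2."""
    z_half = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # 1.96 for 95%
    diff = ybar1 - ybar2
    margin = z_half * math.sqrt(s1**2 / n1 + s2**2 / n2)
    return diff - margin, diff + margin


# Summary statistics as quoted on the slide: Viagra (1) vs Placebo (2)
lower, upper = large_sample_ci(63.2, 41.3, 264, 23.5, 42.3, 240)
print(f"95% CI: ({lower:.1f}, {upper:.1f})")

# The interval excludes 0, so the 2-sided test at alpha = 0.05 rejects H0
print("contains 0:", lower <= 0 <= upper)
```

Checking whether 0 lies inside the interval gives the same conclusion as the 2-sided test at the matching significance level.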
