WEEK 5 HOMEWORK 5
THIS WEEK INVOLVES READING NEW TABLES, THE
t-TABLES. UNLIKE THE z-TABLES, WHERE YOU JUST
NEEDED TO CALCULATE A z-VALUE TO FIND THE
PROBABILITY IN THE TABLE, WITH THE t-TABLE YOU
NEED THE t-VALUE AND THE DEGREES OF FREEDOM (df),
WHICH IS SIMPLY N – 1. THE t-VALUES DEPEND ON THE
SIZE OF THE SAMPLE. THE t-DISTRIBUTION IS A BELL
CURVE, SYMMETRICAL JUST LIKE THE NORMAL
DISTRIBUTION, BUT FOR SMALL SAMPLES IT IS LOWER
IN THE MIDDLE WITH FATTER TAILS. THE LARGER THE
SAMPLE SIZE, THE MORE THE t-DISTRIBUTION SHAPE
LOOKS LIKE THE NORMAL DISTRIBUTION SHAPE.
THE t-TABLES ARE IN OUR COURSE CONTENT > COURSE
RESOURCES (NEAR THE BOTTOM OF CONTENT) >
STATISTICAL RESOURCES. I SUGGEST YOU PRINT IT
OUT AND REVIEW IT AS YOU READ THESE
INSTRUCTIONS:
LET’S SAY YOU WANT TO BE 99% CONFIDENT (LEVEL
OF CONFIDENCE) IN YOUR CONFIDENCE INTERVAL.
THIS LEAVES A 1% CHANCE THAT THE POPULATION
PARAMETER YOU ARE ESTIMATING, LIKE THE TRUE
MEAN, IS NOT IN THAT INTERVAL. KEEP IN MIND THAT
THE CI HAS AN UPPER LIMIT AND A LOWER LIMIT, SO
THIS 1% COVERS BOTH ENDS. THIS MEANS THAT THE
AREA IN EACH TAIL IS 1% / 2 = 0.5% (OR 0.005). THINK
ABOUT IT. NOTE THAT IN THE TABLE THE 99% CI
(BOTTOM LINE) EQUATES TO 0.005 ON THE TOP LINE.
ALL WE NEED NOW IS THE DEGREES OF FREEDOM,
WHICH IS N – 1, TO GET OUR CRITICAL t-VALUE FROM
THE BODY OF THE TABLE. IN THIS EXAMPLE THE df IS
14, SO THE t-VALUE IS 2.977. THIS t-VALUE IS A
CRITICAL VALUE LIKE A z-VALUE.
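THE TAIL-AREA ARITHMETIC ABOVE CAN BE SKETCHED IN A FEW LINES OF PYTHON. THIS IS ONLY A CHECK ON THE REASONING, NOT A REPLACEMENT FOR THE TABLE LOOKUP. THE FUNCTION NAME tail_area IS MADE UP FOR THIS ILLUSTRATION, AND N = 15 IS INFERRED FROM THE df OF 14 IN THE EXAMPLE.

```python
def tail_area(confidence_level):
    """Return the area left in EACH tail of a two-sided confidence interval."""
    alpha = 1 - confidence_level   # total area outside the interval
    return alpha / 2               # split evenly between the two tails


# 99% confidence -> 1% outside the interval -> 0.5% in each tail
print(round(tail_area(0.99), 4))   # 0.005

# Degrees of freedom for the example: df = N - 1, so df = 14 means N = 15
n = 15
df = n - 1
print(df)                          # 14
```

WITH THE TAIL AREA (0.005) AND THE df (14) IN HAND, THE CRITICAL t-VALUE ITSELF STILL COMES FROM THE PRINTED TABLE.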
THE t-TABLES ARE DIFFERENT FROM THE z-TABLES IN
THAT WITH THE “t” WE SPECIFY THE PROBABILITY WE
WANT AND THEN FIND THE CRITICAL t-VALUE. WITH
THE NORMAL DISTRIBUTION, WE CALCULATE THE
z-VALUE FIRST AND THEN COMPARE IT TO THE CRITICAL
z-VALUE FOR OUR DESIRED CONFIDENCE LEVEL
(90%, 95%, OR 99%) TO SEE IF OUR CALCULATED
z-VALUE IS FURTHER OUT IN EITHER TAIL, HENCE
“UNUSUAL”.
THE NORMAL DISTRIBUTION CURVE IS ONE CURVE AND
DOES NOT CHANGE WITH SAMPLE SIZE, HENCE WE CAN
HAVE THE SAME CRITICAL z-VALUES FOR EVERY
PROBLEM (99%, 95%, AND 90%). THE t-DISTRIBUTION
CURVE STARTS OUT AS A LOW, WIDE BELL CURVE WITH
HEAVY TAILS AND SLOWLY NARROWS AS THE SAMPLE
SIZE (THE df) INCREASES. YOU CAN SEE FROM THE
t-TABLE THAT WHEN THE df APPROACHES 1000, WE ARE
VERY CLOSE TO THE NORMAL DISTRIBUTION’S BELL
SHAPE. BUT, SINCE THE t-SHAPE CHANGES, THERE ARE
DIFFERENT CRITICAL t-VALUES DEPENDING ON THE df
(SAMPLE SIZES) INVOLVED.
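TO SEE WHY THE CRITICAL z-VALUES NEVER CHANGE, HERE IS A SMALL SKETCH THAT RECOVERS THEM FROM THE STANDARD NORMAL CURVE USING PYTHON'S BUILT-IN statistics MODULE. THE HELPER NAME critical_z IS AN ASSUMPTION FOR THIS EXAMPLE. THE STANDARD LIBRARY HAS NO EQUIVALENT FOR THE t-DISTRIBUTION, SO THE CRITICAL t-VALUES STILL COME FROM THE TABLE, ONE ROW PER df.

```python
from statistics import NormalDist


def critical_z(confidence_level):
    """Two-sided critical z-value for the standard normal distribution."""
    tail = (1 - confidence_level) / 2          # area in each tail
    return NormalDist().inv_cdf(1 - tail)      # z with that area to its right


# The same three values apply to every problem, whatever the sample size.
for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {critical_z(level):.3f}")
# 90%: z = 1.645
# 95%: z = 1.960
# 99%: z = 2.576
```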
HERE IS WHAT THE CONFIDENCE INTERVAL MEANS
GRAPHICALLY: IN THIS SAMPLE PROBLEM WE WANTED
A CONFIDENCE INTERVAL AT THE 95% CONFIDENCE LEVEL
BASED ON A SAMPLE WITH A MEAN OF 3.26, A STD DEV
OF 1.019 AND A SAMPLE SIZE OF 39 (df = 39 – 1 = 38).
THE CRITICAL t-VALUE FROM THE TABLE IS 2.023 (THE
TABLE HAS NO ROW FOR 38 df, SO WE APPROXIMATED
FROM THE NEIGHBORING ROWS). USE THE TABLE AND
NOT SOFTWARE FOR THESE PROBLEMS. NOTE THAT
THE 95% LEVEL MEANS 2.5% OF THE AREA UNDER THE
t-CURVE LIES IN EACH TAIL, BEYOND THE LIMITS.
NOW WE HAVE OUR t-VALUE = 2.023. REMEMBER WHEN
WE STANDARDIZED OUR RAW DATA POINTS (THE
X-VALUES) USING: Z = (X – MEAN) / STD DEV. IF WE WERE
GIVEN THE z-VALUE WE COULD BACK-CALCULATE THE
X-VALUE (THE ACTUAL DATA POINT) THAT LIES THAT
MANY STANDARD DEVIATIONS FROM THE MEAN:
X = Z * STD DEV + MEAN.
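THE TWO FORMULAS ABOVE UNDO EACH OTHER, WHICH A SHORT ROUND TRIP MAKES CONCRETE. THIS SKETCH REUSES THE MEAN AND STD DEV FROM THE SAMPLE PROBLEM; THE DATA POINT X = 4.0 AND THE FUNCTION NAMES ARE HYPOTHETICAL, CHOSEN ONLY FOR ILLUSTRATION.

```python
def to_z(x, mean, std_dev):
    """Standardize a raw data point: Z = (X - MEAN) / STD DEV."""
    return (x - mean) / std_dev


def from_z(z, mean, std_dev):
    """Back-calculate the raw data point: X = Z * STD DEV + MEAN."""
    return z * std_dev + mean


mean, std_dev = 3.26, 1.019    # sample values from the worked example
x = 4.0                        # hypothetical data point

z = to_z(x, mean, std_dev)
print(round(z, 3))                          # 0.726
print(round(from_z(z, mean, std_dev), 3))   # 4.0 (back where we started)
```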
WITH THE t-VALUE IT’S A LITTLE DIFFERENT SINCE
SAMPLE SIZE MATTERS. HERE WE MUST FIRST
CALCULATE AN “ERROR BOUND” FOR THE MEAN (EBM),
WHICH EQUALS: t-VALUE * STD DEV / SQ RT OF THE
SAMPLE SIZE N. PLUGGING IN WE HAVE:
EBM = 2.023 * 1.019 / √39 = 0.3301. SIMPLY ADD AND
SUBTRACT 0.3301 FROM THE SAMPLE MEAN TO GET THE
CONFIDENCE INTERVAL LIMITS IN ACTUAL X-VALUES:
3.26 – 0.3301 = 2.9299 AND 3.26 + 0.3301 = 3.5901.
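THE ERROR-BOUND CALCULATION CAN BE CHECKED WITH A FEW LINES OF PYTHON. THIS IS A MINIMAL SKETCH: THE CRITICAL t-VALUE IS TYPED IN FROM THE TABLE (NOT COMPUTED), AND THE FUNCTION NAME error_bound_mean IS AN ASSUMPTION FOR THIS EXAMPLE.

```python
import math


def error_bound_mean(t_critical, std_dev, n):
    """EBM = t-value * std dev / square root of the sample size."""
    return t_critical * std_dev / math.sqrt(n)


sample_mean = 3.26
std_dev = 1.019
n = 39
t_critical = 2.023   # read from the t-table, as given in the text

ebm = error_bound_mean(t_critical, std_dev, n)
lower = sample_mean - ebm
upper = sample_mean + ebm

print(f"EBM = {ebm:.3f}")                  # EBM = 0.330
print(f"CI = ({lower:.2f}, {upper:.2f})")  # CI = (2.93, 3.59)
```

NOTICE THAT THE STD DEV IS DIVIDED BY √N BEFORE IT IS MULTIPLIED BY THE t-VALUE. THAT DIVISION IS WHERE THE SAMPLE SIZE ENTERS, AND IT IS WHY LARGER SAMPLES GIVE NARROWER INTERVALS.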
STATING WHAT THE CONFIDENCE INTERVAL MEANS IS
DIFFERENT THAN YOU MIGHT FIRST THINK. SAY:
“WE ARE 95% CONFIDENT THAT THE TRUE POPULATION
MEAN FALLS IN THIS CONFIDENCE INTERVAL.”
AND NOT: “95% OF OUR SAMPLE MEANS WILL
FALL IN THIS INTERVAL”. BE CAREFUL.
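ONE WAY TO SEE WHAT THE “95%” REFERS TO IS TO REPEAT THE SAMPLING MANY TIMES AND COUNT HOW OFTEN THE INTERVAL CAPTURES THE TRUE POPULATION MEAN. THE SIMULATION BELOW IS A SKETCH UNDER STATED ASSUMPTIONS: IT PRETENDS THE POPULATION IS NORMAL WITH A KNOWN MEAN OF 3.26 AND A KNOWN STD DEV OF 1.019, SO IT USES THE SIMPLER z-INTERVAL RATHER THAN THE t-INTERVAL. THE FUNCTION NAME coverage AND THE NUMBER OF TRIALS ARE MADE UP FOR ILLUSTRATION.

```python
import math
import random
from statistics import NormalDist, mean


def coverage(true_mean, sigma, n, confidence_level, trials, seed=0):
    """Fraction of simulated intervals that contain the true mean."""
    rng = random.Random(seed)                  # seeded so runs are repeatable
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    half_width = z * sigma / math.sqrt(n)      # error bound with known sigma
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        x_bar = mean(sample)
        # Each sample gives a different interval; the true mean stays fixed.
        if x_bar - half_width <= true_mean <= x_bar + half_width:
            hits += 1
    return hits / trials


# Expect a value close to 0.95: about 95% of the intervals catch the true mean.
print(coverage(3.26, 1.019, 39, 0.95, trials=2000))
```

THE INTERVAL MOVES FROM SAMPLE TO SAMPLE; THE POPULATION MEAN DOES NOT. THE CONFIDENCE LEVEL DESCRIBES HOW OFTEN THE PROCEDURE SUCCEEDS.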
HERE ARE THE WEEK 5 HOMEWORK PROBLEMS.
LANE-CHAPTER 10 TYPE PROBLEMS.
#1. BRIEFLY EXPLAIN THE RELATIONSHIPS AMONG
SAMPLES, POPULATIONS, STATISTICS, AND
PARAMETERS.
#2. HOW ARE BIAS AND ACCURACY RELATED?
#3. WHY DOES A CONFIDENCE INTERVAL GET WIDER
THE MORE CONFIDENT YOU WANT TO BE?
#4. THE STUDENT t DISTRIBUTION WITH ITS t-TABLES
IS INTRODUCED THIS WEEK. WE HAVE PREVIOUSLY
BEEN USING THE z-TABLES FOR THE NORMAL
DISTRIBUTION. HOW DO WE KNOW WHICH TO USE?
#5. IN THE CONCEPTS AND RELATED PROBLEMS THIS
WEEK WE INTRODUCE CONFIDENCE INTERVALS AND
THE “STANDARD ERROR”. WE USE THIS “SE” TO
DETERMINE THE LIMITS OF OUR CONFIDENCE
INTERVAL. OF COURSE THE WIDTH OF THE “CI” ALSO
DEPENDS ON JUST HOW “CONFIDENT” WE WANT TO BE:
90%, 95% OR 99%. AND, YOU EXPLAINED ABOVE WHY
THE CI GETS WIDER AS OUR REQUIRED CONFIDENCE
LEVEL INCREASES.
SO, WHAT IS THE STANDARD ERROR (NOT THE
FORMULA) AND WHAT DOES IT MEASURE? ARE WE
STILL JUST MEASURING DISTANCES FROM SOME LINE
AND SQUARING THEM TO GET RID OF NEGATIVES? IS
THERE A SQUARE ROOT INVOLVED? WHAT DO YOU
THINK?
#6. YOU ALL HAVE ENJOYED DOING THE COMPLEX
CALCULATIONS RELATED TO BINOMIAL PROBLEMS ;-)
THE BINOMIAL PROBLEMS DEALT WITH DISCRETE
DATA LIKE “HEADS/TAILS” OR “WIN/LOSE” AND YOU
HAD TO USE THE COMPLEX EQUATION TO FIND THE
PROBABILITY OF EACH SITUATION LIKE TOSSING 4 OR
5 HEADS OUT OF 5 TOSSES. YOU HAVE ALSO USED
FORMULAS THAT ALLOW US TO TREAT DISCRETE DATA
AS “CONTINUOUS” DATA (NORMAL APPROXIMATION TO
THE BINOMIAL).
A “WRINKLE” FOR CONFIDENCE INTERVALS IS THAT
THERE IS A “CORRECTION FACTOR” THAT MUST BE
APPLIED (CHECK LANE AROUND PAGE 360, UNDER THE
SECTION ENTITLED “PROPORTIONS”, FOR THAT
CORRECTION FORMULA). PROPORTIONS MEANS
PERCENTAGES LIKE 35 OUT OF 100 = 35%.
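AS A SKETCH OF HOW THE CORRECTION WORKS, THE CODE BELOW BUILDS A 95% INTERVAL FOR THE “35 OUT OF 100” ILLUSTRATION. IT ASSUMES THE CORRECTION IS THE 0.5/N ADJUSTMENT THAT WIDENS EACH LIMIT (LOWER = p – z * SE – 0.5/N, UPPER = p + z * SE + 0.5/N); CHECK THAT AGAINST THE FORMULA IN LANE BEFORE RELYING ON IT. THE FUNCTION NAME proportion_ci IS HYPOTHETICAL.

```python
import math
from statistics import NormalDist


def proportion_ci(successes, n, confidence_level=0.95):
    """Normal-approximation CI for a proportion, widened by 0.5/N per side."""
    p = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    se = math.sqrt(p * (1 - p) / n)    # standard error of the proportion
    correction = 0.5 / n               # continuity correction (assumed form)
    return p - z * se - correction, p + z * se + correction


lower, upper = proportion_ci(35, 100)
print(f"({lower:.3f}, {upper:.3f})")   # (0.252, 0.448)
```

THE CORRECTION SHRINKS AS N GROWS, SO IT MATTERS MOST FOR SMALL SAMPLES.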
THIS IS A PROBLEM THAT REQUIRES USING THAT
CORRECTION FACTOR TO SOLVE A DISCRETE DATA
PROBLEM AS A CONTINUOUS DATA PROBLEM (SOLVE
IT):
A person claims to be able to predict the outcome of flipping a
coin. This person is correct 18/24 times. Compute the 95%
confidence interval on the proportion of times this person can
predict coin flips correctly. What conclusion can you draw
about this test of his ability to predict the future (at least
regarding coin toss results)?
#7. THIS PROBLEM HAS TWO PARTS: THE FIRST USES
THE z-TABLES FOR THE NORMAL DISTRIBUTION AND
THE SECOND REQUIRES THE t-TABLES. DO YOU
UNDERSTAND WHY? (THIS IS VERY IMPORTANT.)
You take a sample of 30 from a large population of test scores,
and the mean of your sample is 70.
(a) You know that the standard deviation of the population is
12. What is the 99% confidence interval on the population
mean?
(b) Now assume that you do NOT know the population standard
deviation, but the standard deviation of your sample is 12. What
is the 99% confidence interval on the mean now?
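THE TWO PARTS USE THE SAME RECIPE (MEAN ± CRITICAL VALUE * STD DEV / √N) AND DIFFER ONLY IN WHERE THE CRITICAL VALUE COMES FROM. THE SKETCH BELOW SHOWS THAT WITH A HYPOTHETICAL SAMPLE (N = 16, MEAN = 50, STD DEV = 8), NOT THE HOMEWORK NUMBERS. THE CRITICAL VALUES ARE TYPED IN FROM THE TABLES: 2.576 FROM THE z-TABLE AND 2.947 FROM THE t-TABLE ROW FOR 15 df, BOTH AT 99%.

```python
import math


def interval(sample_mean, std_dev, n, critical_value):
    """Confidence interval: mean +/- critical value * std dev / sqrt(n)."""
    half_width = critical_value * std_dev / math.sqrt(n)
    return sample_mean - half_width, sample_mean + half_width


sample_mean, std_dev, n = 50.0, 8.0, 16   # hypothetical sample

z_99 = 2.576        # z-table: population std dev is KNOWN
t_99_df15 = 2.947   # t-table, df = n - 1 = 15: std dev estimated from sample

z_interval = interval(sample_mean, std_dev, n, z_99)
t_interval = interval(sample_mean, std_dev, n, t_99_df15)

print(f"z: ({z_interval[0]:.3f}, {z_interval[1]:.3f})")   # z: (44.848, 55.152)
print(f"t: ({t_interval[0]:.3f}, {t_interval[1]:.3f})")   # t: (44.106, 55.894)
```

THE t-INTERVAL IS WIDER BECAUSE ESTIMATING THE STD DEV FROM THE SAMPLE ADDS UNCERTAINTY.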
#8. WE ARE DEFINITELY INTO “INFERENTIAL”
STATISTICS NOW AND ENTERING INTO THE MOST
IMPORTANT TOPIC WE COVER: HYPOTHESIS TESTING.
IN THIS PROBLEM WE USE THE CONFIDENCE INTERVAL
TO ACTUALLY TEST A HYPOTHESIS OR “GUT FEELING”
THAT WE HAVE.
You read about a survey in a newspaper and find that 75% of
the 275 people sampled prefer Candidate A. You are surprised
by this survey because you thought that only 50% of the
population would prefer this candidate. Based on this sample, is
50% a possible population proportion? Compute the 95%
confidence interval to be sure and decide for yourself whether
based on this sample, 50% is a possible population proportion
(proportion simply refers to percentage).
AN HYPOTHESIS IS KIND OF AN EDUCATED GUESS AND
IN HYPOTHESIS TESTING THERE ARE ALWAYS TWO
HYPOTHESES: THE NULL HYPOTHESIS (Ho) WHICH
MUST ALWAYS HAVE THE EQUALS IN IT AND THE
ALTERNATE HYPOTHESIS (Ha) WHICH MUST ACCOUNT
FOR THE REMAINING POSSIBILITIES. HERE OUR Ho IS
THAT THE PROPORTION EQUALS (OR IS GREATER THAN)
75% AND THE ALTERNATE HYPOTHESIS Ha IS THAT THE
PROPORTION IS LESS THAN 75% (LIKE 50% WOULD BE).
AS YOU WILL LEARN IN THE UPCOMING WEEKS, WE
COVER THREE WAYS TO PERFORM AN HYPOTHESIS
TEST: CONFIDENCE INTERVALS, TEST STATISTICS,
AND PROBABILITY VALUES (P-VALUES). THE LATTER
TWO ALWAYS LEAD TO THE SAME CONCLUSION, BUT
CONFIDENCE INTERVALS MAY NOT AGREE.
AFTER WE DO OUR ANALYSIS WE EITHER “ACCEPT” OR
“REJECT” Ho. (YOU SHOULD USE THOSE TERMS TO BE
CLEAR). BE CAREFUL THOUGH, REJECTING Ho DOES
NOT AUTOMATICALLY MEAN WE ACCEPT Ha. THERE
MAY BE OTHER WRINKLES.
ILLOWSKY CHAPTER 8
#9. BELOW IS A TABLE SHOWING THE NUMBER OF ZIKA
MOSQUITOES FOUND IN 5 NEIGHBORHOODS IN A SMALL
CITY. WE WANT TO CALCULATE THE CONFIDENCE
INTERVAL AROUND THE TRUE MEAN AT A CONFIDENCE
LEVEL OF 99%.
# SAMPLES    MOSQ / SAMPLE    TOTAL MOSQ
    2              1               2
    3              2               6
    4              3              12
    5              4              20
    1              5               5
   15                             45
THE TRICKY PART HERE IS TO GET THE VARIANCE (AND
STD DEV). NOTE THAT THERE ARE 15 SAMPLES AND
YOU ARE GIVEN THE NUMBER OF MOSQUITOES IN EACH
OF THEM. TRY THIS TABLE:
SAMPLE #    # MOSQ IN SAMPLE    # MOSQ - MEAN    (# MOSQ - MEAN)^2
    1              1                -2.00               4.00
    2              1
    3              2
    4              2
    5              2
    6              3
    7              3
    8              3
    9              3
   10              4
   11              4
   12              4
   13              4
   14              4
   15              5
            TOTAL MOSQ                           TOTAL
MEAN = _______________, VARIANCE = ___________________,
STANDARD DEVIATION = _______________
CRITICAL t-VALUE IS ____________, THE EBM IS _________
AND THE 99% CI IS __________________________
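THE WORKSHEET ABOVE TURNS A FREQUENCY TABLE INTO ONE ROW PER SAMPLE AND THEN SUMS THE SQUARED DISTANCES FROM THE MEAN. THE SAME STEPS CAN BE SKETCHED IN PYTHON. THE DATA HERE ARE HYPOTHETICAL (ONE SAMPLE OF 2, TWO SAMPLES OF 4, ONE SAMPLE OF 6), NOT THE MOSQUITO COUNTS, SO YOU CAN COMPARE THE METHOD WITHOUT BEING HANDED THE ANSWER.

```python
import math
from statistics import mean, variance

# Hypothetical frequency table: {value per sample: number of samples}
frequency_table = {2: 1, 4: 2, 6: 1}

# Expand the table into one observation per sample, like the worksheet rows.
observations = [value
                for value, count in frequency_table.items()
                for _ in range(count)]

x_bar = mean(observations)
s2 = variance(observations)    # SAMPLE variance: divides by n - 1
s = math.sqrt(s2)              # sample standard deviation

print(observations)            # [2, 4, 4, 6]
print(x_bar)                   # 4
print(round(s2, 4))            # 2.6667
print(round(s, 4))             # 1.633
```

BY HAND: THE SQUARED DISTANCES ARE 4, 0, 0 AND 4, WHICH TOTAL 8, AND 8 / (4 – 1) = 2.6667. THE DIVISOR IS N – 1 BECAUSE THIS IS A SAMPLE, NOT THE WHOLE POPULATION.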
#10. AD AGENCIES are interested in knowing the population
percent (PROPORTION) of women who make the majority of
CAR BUYING decisions. If you were designing this study to
determine this population proportion, what is the minimum
number (N) you would need to survey to be 95% confident that
the population proportion is estimated to within 5% (0.05)?
THIS IS A z-VALUE PROPORTION (PERCENTAGE)
PROBLEM (NOT t-VALUE)
b. WHAT IS THE FORMULA WE USE TO FIND “n”?
c. WHAT ARE THE VALUES FOR p’ AND q’? (HINT: THIS
IS A BINOMIAL SITUATION: MALE/FEMALE, SO ASSUME
50:50 OR 0.5 AND 0.5.)
d. WHAT ARE THE VALUES OF EBP AND EBP^2?
e. WHAT ARE THE VALUES OF α AND α / 2?
f. USING THE z-TABLES (NOT THE t-TABLES), WHAT IS
THE CRITICAL z-VALUE FOR THIS α / 2?
g. PLUG THESE VALUES INTO THE FORMULA FOR “n”
AND YOU GET n = _________
(YOU CAN REFER TO ILLOWSKY Section 8.3 and SAMPLE
PROBLEM 8.14 AROUND PAGE 463 FOR OTHER
GUIDANCE)
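THE STEPS b THROUGH g CAN BE SKETCHED AS ONE SMALL FUNCTION, ASSUMING THE FORMULA IS n = z^2 * p’ * q’ / EBP^2 (CONFIRM THIS AGAINST ILLOWSKY SECTION 8.3). THE EXAMPLE CALL USES HYPOTHETICAL REQUIREMENTS (90% CONFIDENCE, WITHIN 3%) RATHER THAN THE HOMEWORK VALUES, AND THE FUNCTION NAME min_sample_size IS MADE UP FOR ILLUSTRATION.

```python
import math
from statistics import NormalDist


def min_sample_size(confidence_level, error_bound, p_prime=0.5):
    """Smallest n so a proportion is estimated to within error_bound."""
    alpha = 1 - confidence_level
    z = NormalDist().inv_cdf(1 - alpha / 2)   # critical z for alpha / 2
    q_prime = 1 - p_prime
    n = z ** 2 * p_prime * q_prime / error_bound ** 2
    return math.ceil(n)                       # always round UP to a whole person


# Hypothetical requirement: 90% confident, within 3% (0.03)
print(min_sample_size(0.90, 0.03))   # 752
```

USING p’ = q’ = 0.5 GIVES THE LARGEST POSSIBLE VALUE OF p’ * q’, SO IT IS THE SAFE CHOICE WHEN NOTHING IS KNOWN ABOUT THE TRUE PROPORTION.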
  • 7. 6 4 3 12 5 4 20 1 5 5 15 45 THE TRICKY PART HERE IS TO GET THE VARIANCE (AND STD DEV) NOTE THAT THERE ARE 15 SAMPLES AND YOU ARE GIVEN THE NUMBER OF MOSQUITOES IN EACH OF THEM. TRY THIS TABLE: SAMPLE # # MOSQ IN SAMPLE # MOSQ - MEAN (# MOSQ - MEAN)^2 1 1 -2.00 4.00 2 1 3 2
  • 9. 13 4 14 4 15 5 TOTAL MOSQ TOTAL MEAN = _______________, VARIANCE = ___________________, STANDARD DEVIATION = _______________ CRITICAL t- VALUE IS ____________, THE EBM IS _________ AND THE 99% CI IS __________________________ #10. AD AGENCIES are interested in knowing the population percent (PROPORTION) of women who make the majority of CAR BUYING decisions. If you were designing this study to determine this population proportion, what is the minimum number (N) you would need to survey to be 95% confident that the population proportion is estimated to within 5% (0.05)? THIS IS A z-VALUE PROPORTION (PERCENTAGE) PROBLEM (NOT t-VALUE) b. WHAT IS THE FORMULA WE USE TO FIND “n” ?
c. WHAT ARE THE VALUES FOR p’ AND q’? (HINT: THIS IS A BINOMIAL SITUATION: MALE/FEMALE, SO ASSUME 50:50 OR 0.5 AND 0.5)
d. WHAT ARE THE VALUES OF EBP AND EBP²?
e. WHAT ARE THE VALUES OF α AND α / 2?
f. USING THE z-TABLES (NOT THE t-TABLES), WHAT IS THE CRITICAL z-VALUE FOR THIS α / 2?
g. PLUG THESE VALUES INTO THE FORMULA FOR “n” AND YOU GET n = _________
(YOU CAN REFER TO ILLOWSKY SECTION 8.3 AND SAMPLE PROBLEM 8.14 AROUND PAGE 463 FOR OTHER GUIDANCE)
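
IF YOU WANT TO CHECK YOUR HAND ARITHMETIC, HERE IS A MINIMAL PYTHON SKETCH OF TWO STEPS DESCRIBED ABOVE: THE ERROR-BOUND CALCULATION FROM THE WORKED EXAMPLE (MEAN 3.26, STD DEV 1.019, N = 39, t = 2.023) AND THE DEVIATION TABLE SUGGESTED IN #9. IT IS ONLY A SKETCH UNDER STATED ASSUMPTIONS: THE FUNCTION NAMES ARE MY OWN AND NOT FROM LANE OR ILLOWSKY, THE FREQUENCY TABLE IN PART 2 IS MADE-UP DATA (NOT THE HOMEWORK DATA), AND THE VARIANCE IS ASSUMED TO BE THE SAMPLE VARIANCE (DIVIDE BY N - 1). THE CRITICAL t-VALUE IS STILL READ FROM THE PRINTED t-TABLE AND TYPED IN; THE CODE DOES NOT LOOK IT UP.

```python
import math


def error_bound_mean(t_critical, std_dev, n):
    """EBM = t-value * std dev / square root of the sample size."""
    return t_critical * std_dev / math.sqrt(n)


def confidence_interval(sample_mean, ebm):
    """Lower and upper limits: sample mean minus and plus the error bound."""
    return sample_mean - ebm, sample_mean + ebm


def deviation_table(frequencies):
    """Expand {value: count} into one row per sample.

    Each row is (sample #, value, value - mean, (value - mean)^2).
    """
    values = [v for v, count in sorted(frequencies.items()) for _ in range(count)]
    mean = sum(values) / len(values)
    rows = [(i, x, x - mean, (x - mean) ** 2) for i, x in enumerate(values, start=1)]
    return mean, rows


def variance_and_std_dev(rows):
    """Assumption: sample statistics, so the summed squares are divided by n - 1."""
    n = len(rows)
    variance = sum(squared for _, _, _, squared in rows) / (n - 1)
    return variance, math.sqrt(variance)


# Part 1: the worked example. 2.023 is the table-read critical t-value.
ebm = error_bound_mean(2.023, 1.019, 39)
lower, upper = confidence_interval(3.26, ebm)
print(f"EBM = {ebm:.4f}, 95% CI = ({lower:.4f}, {upper:.4f})")

# Part 2: a made-up frequency table (value -> number of samples), NOT problem #9.
made_up = {2: 1, 4: 3, 6: 1}
mean, rows = deviation_table(made_up)
variance, std_dev = variance_and_std_dev(rows)
for number, value, deviation, squared in rows:
    print(f"{number:>3} {value:>5} {deviation:>8.2f} {squared:>8.2f}")
print(f"MEAN = {mean:.2f}, VARIANCE = {variance:.2f}, STD DEV = {std_dev:.4f}")
```

TO USE IT ON YOUR OWN NUMBERS, REPLACE THE MADE-UP FREQUENCY TABLE WITH YOUR DATA, THEN PASS THE RESULTING STANDARD DEVIATION, THE SAMPLE SIZE, AND THE CRITICAL VALUE YOU READ FROM THE t-TABLE TO error_bound_mean.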