Sample Size Determination  Deliverable 10A
Analyze Module Roadmap
Define
- 1D – Define VOC, VOB, and CTQ’s
- 2D – Define Project Boundaries
- 3D – Quantify Project Value
- 4D – Develop Project Mgmt. Plan
Measure
- 5M – Document Process
- 6M – Prioritize List of X’s
- 7M – Create Data Collection Plan
- 8M – Validate Measurement System
- 9M – Establish Baseline Process Cap.
Analyze
- 10A – Determine Critical X’s
Improve
- 12I – Prioritized List of Solutions
- 13I – Pilot Best Solution
Control
- 14C – Create Control System
- 15C – Finalize Project Documentation
Green
- 11G – Identify Root Cause Relationships
Objectives – Sample Size
Upon completion of this module, the student should be able to:
- List and define the variables that contribute to determining the correct sample size.
- Calculate the appropriate sample size for a defined set of variables.
Key Variables in Sample Size
An optimal sample size is determined by four key factors:
- Alpha risk (α): the maximum risk the business is willing to take of rejecting the null hypothesis when it is true.
- Beta risk (β): the risk of failing to reject the null hypothesis when it is false.
- Delta or difference (δ): the minimum difference we want to detect between populations.
- Proportion (p) or standard deviation (s):
  - Proportion: your best estimate of the defect rate with discrete data.
  - Standard deviation: your best estimate from available continuous data.
Alpha Risk  (  ) Alpha risk is decided by the Black Belt Our choice of    will determine when to reject the null hypothesis Typical    values for general business applications are between 0.05 and 0.10. As the cost of incorrect conclusions go up, you may choose to lower   . e.g. Pharmaceutical companies have tremendous risk to consumer health issues and often use an    of 0.01 The    value should depend on practical considerations such as financial or safety risk, or risk to the customer “ Significance” is defined as 1- 
Beta Risk (  )  Beta risk can be selected by the Black Belt, but we don‘t control it the same way we do    risk. The best we can do is adjust sample size so that    is no greater than a specified value. When a beta error (  ) occurs, we have missed detecting a difference (good or bad).  Power is defined as (1 -   ). It represents the probablitily that we can detect an important effect in the process Typical values of power in experiments are between 0.80 to 0.90 We will use 0.90 for most work at JEA
Delta (  )  Delta (  ) is the minimum change that needs to be detected during analysis Example: if the average cycle time to perform a laboratory test was 120 minutes, you as the supervisor may not be concerned if the average time shifted to 121 minutes, but you would want to know if it increased to 130 minutes. In this case, 10 minutes is the smallest increment of concern (   = 10 minutes). It is the acceptable window of uncertainty around the estimate As delta decreases (more precision), the sample size increases  As delta increases (less precision), the sample size decreases
Signal to Noise Ratio
- If you consider δ (the difference to detect) and σ (the standard deviation), the ratio of the two is much like a signal-to-noise ratio.
- If the “signal” is large relative to the noise, we can “hear” the signal.
- Sample size will increase dramatically as the δ/σ ratio drops.
[Figure: panels labeled Low δ/σ and High δ/σ]
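The effect of the signal-to-noise ratio on sample size can be made concrete with a standard textbook approximation. This formula is not in the original slides; it assumes a two-sided comparison of two means with equal group sizes, a common standard deviation σ, and a normal approximation, where z denotes a standard normal quantile:

```latex
n_{\text{per group}} \approx \frac{2\,\left(z_{1-\alpha/2} + z_{1-\beta}\right)^{2}}{\left(\delta/\sigma\right)^{2}}
```

Because the ratio δ/σ is squared in the denominator, halving it roughly quadruples the required sample size, which is why small differences in noisy processes are expensive to detect.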
Minitab Versus Excel
Minitab uses an “infinite population” approach:
- Minitab calculators assume the population is relatively infinite.
- “Relatively infinite” means the population is at least ten times larger than the sample used.
- The result is a “safe” sample size (larger than a finite population approach would give).
Excel calculators are able to use a “finite population” approach:
- They have a “finite population correction factor”.
- The factor adjusts the sample size when we are sampling a significant portion of the population.
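A finite population correction can be sketched in a few lines of Python. This uses one common textbook form, n = n0 / (1 + (n0 − 1)/N); whether the Excel calculators referred to here use exactly this expression is an assumption, and the function name is purely illustrative.

```python
from math import ceil


def finite_population_adjust(n_infinite: int, population: int) -> int:
    """Shrink an 'infinite population' sample size for a finite population.

    Assumed form (a common textbook correction, not taken from the slides):
        n = n0 / (1 + (n0 - 1) / N)
    """
    if n_infinite < 1 or population < 1:
        raise ValueError("sample size and population must be positive")
    adjusted = n_infinite / (1 + (n_infinite - 1) / population)
    # Round up so the adjusted size never falls short of the requirement.
    return ceil(adjusted)


# A requirement of 100 samples drawn from a population of only 1,000 items
print(finite_population_adjust(100, 1000))  # prints 91
```

When the population is many times larger than the sample, the adjustment is negligible, which is consistent with the "at least ten times larger" rule of thumb above.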
Calculating Sample Size in Minitab
Stat > Power and Sample Size > {select the test as needed}
Fill in any two of the three fields (sample size, difference, power); Minitab will calculate the value for the third. Multiple values can be entered in a field, separated by spaces.
Wastewater Sample Size Example
You are going to perform a statistical test to determine if there is a difference in the average suspended solids level for two processing lines at a wastewater treatment plant. A suspended solids difference of 10 units or less is unimportant to you for the purpose of this test, but you would like to detect a difference greater than 10. The historical process standard deviation is 5.
Wastewater Sample Size Example
Stat > Power and Sample Size > 2-Sample t
Minitab Output
Power and Sample Size
2-Sample t Test
Testing mean 1 = mean 2 (versus not =)
Calculating power for mean 1 = mean 2 + difference
Alpha = 0.05  Assumed standard deviation = 5

Difference  Sample Size  Target Power  Actual Power
        10            7           0.9      0.929070

The sample size is for each group.
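For a rough cross-check of this output outside Minitab, the calculation can be sketched in Python with a normal approximation. This is not Minitab's method: Minitab works from the t distribution, so the approximation below is expected to come out slightly smaller. The function name and defaults are illustrative assumptions.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta: float, sigma: float,
                alpha: float = 0.05, power: float = 0.90) -> int:
    """Approximate per-group sample size for a two-sided 2-sample comparison.

    Normal approximation (an assumption, not Minitab's exact t-based method):
        n = 2 * ((z_{1-alpha/2} + z_{power}) * sigma / delta) ** 2
    """
    z = NormalDist()  # standard normal
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)


# Wastewater example: difference of 10 units, standard deviation of 5
print(n_per_group(delta=10, sigma=5))  # prints 6; Minitab's t-based answer is 7
```

The one-sample gap between the two answers reflects the extra uncertainty from estimating the standard deviation with small samples, which the t-based calculation accounts for and the normal approximation ignores.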
Wastewater Sample Size – Pt. 2
“Wow! Seven samples are not that many. I was prepared to gather 25 samples. How small of a difference can I detect if I collect 10, 15, 20, or the entire 25 samples?”

Power and Sample Size
2-Sample t Test
Testing mean 1 = mean 2 (versus not =)
Calculating power for mean 1 = mean 2 + difference
Alpha = 0.05  Assumed standard deviation = 5

Sample Size  Power  Difference
         10    0.9     7.66846
         15    0.9     6.13222
         20    0.9     5.25996
         25    0.9     4.67878

The sample size is for each group.
Notice how sample size increases dramatically as the difference to detect becomes smaller and smaller.
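The same relationship can be turned around to estimate the detectable difference for a given sample size. The sketch below again relies on a normal approximation rather than Minitab's t-based calculation, so its values should be read as ballpark figures that run a little below the Minitab column above; the function name is an illustrative assumption.

```python
from math import sqrt
from statistics import NormalDist


def detectable_difference(n: int, sigma: float,
                          alpha: float = 0.05, power: float = 0.90) -> float:
    """Approximate smallest detectable difference for n samples per group.

    Normal approximation (an assumption, not Minitab's method):
        delta = (z_{1-alpha/2} + z_{power}) * sigma * sqrt(2 / n)
    """
    z = NormalDist()  # standard normal
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return z_sum * sigma * sqrt(2 / n)


# Sample sizes from the example, standard deviation of 5
for n in (10, 15, 20, 25):
    print(n, round(detectable_difference(n, sigma=5), 2))
```

Because the difference shrinks only with the square root of n, going from 10 to 25 samples per group buys a fairly modest gain in sensitivity.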
Class Exercise
Recalculate the sample size for the previous problem using a 1% α and a 0.80 power.
10 min
Homework - Back to Pat’s Invoice Problem
Our old friend Pat is starting to wonder about the validity of a great number of past decisions. In this case, Pat now realizes that the past practice of guessing at the number of invoices to inspect (as was done in previous modules) wasn’t the most reliable.
How many invoices will Pat need to inspect to determine whether the process defect rate differs from 10%, if the alternative defect rate to detect is 12%, 15%, 20%, or 25%?
Selecting Data for the Stat Test
Now that we know how many data points to include in the statistical test, we need to identify which samples should be placed in the test.
Assume you have several hundred data points collected over time, but the sample size calculation showed you need only 35 for the statistical test. How do we pick the appropriate 35?
- The 35 “best” or “worst” will certainly skew our conclusions.
- 35 from the center of the data will not show the appropriate variability.
Let’s have Minitab do it for us!
Generating Random Data
Use Minitab to generate 300 randomly distributed data points having a mean of 100 and a standard deviation of 10.
Calc > Random Data > Normal
Selecting Data at Random
Use the following to select 35 random data points:
Calc > Random Data > Sample from Columns
Randomly Selected Data
This procedure works equally well with text or numerical values (a wonderful way to select the sequence for Black Belts to present their projects in class).
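For readers without Minitab, the two steps above can be approximated in Python. This is a minimal sketch, not a reproduction of Minitab's generator; it assumes sampling without replacement, and the seed value is arbitrary and only there to make the run repeatable.

```python
import random

rng = random.Random(42)  # arbitrary fixed seed so the sketch is repeatable

# Analogue of Calc > Random Data > Normal:
# 300 points with a mean of 100 and a standard deviation of 10
data = [rng.gauss(100, 10) for _ in range(300)]

# Analogue of Calc > Random Data > Sample from Columns:
# 35 points chosen at random, without replacement (an assumption)
sample = rng.sample(data, 35)

print(len(data), len(sample))  # prints 300 35
```

As with the Minitab procedure, the same call works on text values: passing a list of names to `rng.sample` returns them in a random order when the sample size equals the list length.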
Learning Check – Sample Size
Upon completion of this module, the student should be able to:
- List and define the variables that contribute to determining the correct sample size.
- Calculate the appropriate sample size for a defined set of variables.

A04 Sample Size

Editor's Notes

  • #17: Power and Sample Size
    Test for One Proportion
    Testing proportion = 0.1 (versus not = 0.1)
    Alpha = 0.05

    Alternative Proportion  Sample Size  Target Power  Actual Power
                      0.12         2523           0.9      0.900079
                      0.15          438           0.9      0.900409
                      0.20          122           0.9      0.901723
                      0.25           59           0.9      0.903729
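The sample sizes in this note can be cross-checked in Python with the usual normal-approximation formula for a two-sided one-proportion test. That Minitab uses this same formula is an assumption; the function name and defaults are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_one_proportion(p0: float, p1: float,
                     alpha: float = 0.05, power: float = 0.90) -> int:
    """Approximate sample size to detect a shift from p0 to p1 (two-sided).

    Normal approximation (assumed, not confirmed as Minitab's method):
        n = ((z_{1-alpha/2} * sqrt(p0 * (1 - p0))
              + z_{power} * sqrt(p1 * (1 - p1))) / (p1 - p0)) ** 2
    """
    z = NormalDist()  # standard normal
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    numerator = z_alpha * sqrt(p0 * (1 - p0)) + z_beta * sqrt(p1 * (1 - p1))
    return ceil((numerator / (p1 - p0)) ** 2)


# Hypothesized 10% defect rate against each alternative in the note
for p1 in (0.12, 0.15, 0.20, 0.25):
    print(p1, n_one_proportion(0.10, p1))
# prints 2523, 438, 122 and 59, matching the Sample Size column above
```

The steep drop from 2523 to 59 shows the same signal-to-noise effect as in the continuous case: a 2-point shift in defect rate is far harder to detect than a 15-point shift.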