Bessel's Correction: Effects of (n-1) as the Denominator in Standard Deviation
Saradindu Sengupta
PyData Global 2022
Senior ML Engineer @Nunam, where I work on building learning systems to forecast the health and failure of Li-ion batteries.
Mean, Variance and Standard Deviation
Sample Variance
Sample Standard Deviation
Population Variance
Population Standard Deviation
Mean
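The formulas that accompanied these labels did not survive extraction. The standard textbook definitions are given below as a reference, on the assumption that they match the slide. The symbols are the usual convention rather than the slide's own: x_i for an observation, \bar{x} for the sample mean, n for the sample size, N for the population size, \mu and \sigma^2 for the population mean and variance, and s^2 for the sample variance.

```latex
\begin{align*}
\text{Mean:} \quad & \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \\
\text{Population variance:} \quad & \sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2 \\
\text{Population standard deviation:} \quad & \sigma = \sqrt{\sigma^2} \\
\text{Sample variance:} \quad & s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2 \\
\text{Sample standard deviation:} \quad & s = \sqrt{s^2}
\end{align*}
```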
Why (n-1)
Assume a sample of n values is drawn from a population of size N with mean μ and
variance σ^2.
The sample variance and standard deviation computed with n in the denominator are
biased, because the sample mean (x̄) is used in place of the population mean (μ),
which is usually unknown.
The values in a sample tend to lie closer to their own sample mean (x̄) than to the
population mean (μ). The sum of squared deviations used in the variance and standard
deviation is therefore never smaller, and usually larger, about the population mean (μ)
than about the sample mean (x̄), so dividing the latter by n underestimates the
population variance on average.
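The claim about the two sums of squares can be checked numerically. The sketch below is only an illustration: the population parameters, the sample size and the seed are hypothetical values, and a normal population is assumed for convenience. It relies on the algebraic identity that the sum of squares about μ exceeds the sum about the sample mean by exactly n times the squared distance between the two means.

```python
import numpy as np

# Hypothetical population parameters and sample size, chosen for illustration.
MU, SIGMA, N_SAMPLE = 10.0, 2.0, 5

rng = np.random.default_rng(seed=0)
sample = rng.normal(loc=MU, scale=SIGMA, size=N_SAMPLE)
x_bar = sample.mean()

# Sum of squared deviations about the sample mean and about the population mean.
ss_about_sample_mean = np.sum((sample - x_bar) ** 2)
ss_about_population_mean = np.sum((sample - MU) ** 2)

# Identity: the two sums differ by exactly n * (x_bar - mu)^2, which is >= 0.
gap = N_SAMPLE * (x_bar - MU) ** 2

print(f"about sample mean:     {ss_about_sample_mean:.4f}")
print(f"about population mean: {ss_about_population_mean:.4f}")
print(f"n * (x_bar - mu)^2:    {gap:.4f}")
```

Because the gap is a squared quantity, the sum about the sample mean can never exceed the sum about the population mean, whatever sample is drawn.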
What to do
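To get an unbiased estimator of the variance, multiply the uncorrected sample
variance (the one with n in the denominator) by the constant n/(n-1), which gives
s^2 = (1/(n-1)) Σ (x_i - x̄)^2, where x̄ is the sample mean.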
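When to use
Use the correction when the sample size is small compared to the population, because that is where
the underestimation in the sum of squared deviations is most severe. As n grows, dividing by n-1 and
dividing by n give nearly the same result, so the correction matters less.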
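In NumPy the choice of denominator is controlled by the `ddof` argument of `np.var` and `np.std`. The sketch below uses a small, made-up data set to show that the corrected variance equals the uncorrected one scaled by n/(n-1), and how quickly that factor approaches 1 as the sample size grows. The data values and the list of sizes are hypothetical.

```python
import numpy as np

# Hypothetical measurements, used only to illustrate the two denominators.
data = np.array([4.0, 7.0, 13.0, 16.0])
n = data.size

biased_var = np.var(data, ddof=0)    # divides by n
unbiased_var = np.var(data, ddof=1)  # divides by n - 1 (Bessel's correction)

# The corrected estimate equals the uncorrected one scaled by n / (n - 1).
rescaled = biased_var * n / (n - 1)

print("divide by n:     ", biased_var)
print("divide by n - 1: ", unbiased_var)
print("rescaled:        ", rescaled)

# The correction factor approaches 1 as the sample size grows.
factors = {size: size / (size - 1) for size in (2, 5, 10, 100, 1000)}
for size, factor in factors.items():
    print(f"n={size:>4}: n/(n-1) = {factor:.4f}")
```

Note that `np.var` and `np.std` default to `ddof=0`, while pandas' `Series.var` and `Series.std` default to `ddof=1`, so the same data can give different results depending on the library.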
Demo
Link: https://pydata-global-2022-lightning-talk.streamlit.app/
[Demo screenshots: Low Sample Size and High Sample Size]
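The low and high sample size comparison can be approximated offline with a small simulation. This is a sketch under stated assumptions, not the code behind the linked app: the normal population, the true variance, the number of trials and the two sample sizes are all hypothetical choices, and `mean_variance_estimates` is an illustrative helper.

```python
import numpy as np

# Hypothetical settings; the talk's demo may use different values.
SIGMA2 = 4.0       # true population variance
N_TRIALS = 5_000   # number of repeated samples drawn per sample size

rng = np.random.default_rng(seed=42)


def mean_variance_estimates(sample_size):
    """Average the uncorrected and corrected variance estimates over many samples."""
    samples = rng.normal(loc=0.0, scale=np.sqrt(SIGMA2), size=(N_TRIALS, sample_size))
    biased = samples.var(axis=1, ddof=0).mean()     # denominator n
    corrected = samples.var(axis=1, ddof=1).mean()  # denominator n - 1
    return biased, corrected


# One low and one high sample size.
results = {size: mean_variance_estimates(size) for size in (5, 500)}
for size, (biased, corrected) in results.items():
    print(f"n={size:>3}: biased={biased:.3f}  corrected={corrected:.3f}  true={SIGMA2}")
```

With the small sample size, the average uncorrected estimate should sit noticeably below the true variance, close to (n-1)/n times it, while the corrected estimate should average close to the true value. With the large sample size, the two estimates should be almost indistinguishable.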
Thank You
/in/saradindusengupta
@iamsaradindu /saradindusengupta

PyData Global 2022 - Lightning Talk - Bessel's Correction