Mean and Variance of the HyperGeometric Distribution Page 1
Al Lehnen Madison Area Technical College 11/30/2011
In a drawing of n distinguishable objects without replacement from a set of N (n < N)
distinguishable objects, a of which have characteristic A (a < N), the probability that
exactly x objects in the draw of n have the characteristic A is a ratio of counts. The
numerator is the number of different ways the x objects can be chosen from the a available,
times the number of different ways the n-x objects in the draw which don't have A can be
chosen from the N-a available. The denominator is the number of different ways n
distinguishable objects can be chosen from a set of N. The resulting probability
distribution for the random variable x is called the hypergeometric distribution. In symbols,
$$
f(x) = \frac{\binom{a}{x}\binom{N-a}{n-x}}{\binom{N}{n}}.
$$
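As a quick numerical illustration of this formula, the following Python sketch evaluates f(x) exactly with rational arithmetic. The function names `binom` and `hypergeom_pmf` and the example values N = 10, a = 4, n = 3 are illustrative assumptions, not part of the original handout. The helper `binom` returns zero outside 0 <= j <= k, so impossible draws get probability zero.

```python
from fractions import Fraction
from math import comb


def binom(k: int, j: int) -> int:
    """Binomial coefficient C(k, j), taken to be zero when j < 0 or j > k."""
    if j < 0 or j > k:
        return 0
    return comb(k, j)


def hypergeom_pmf(x: int, N: int, a: int, n: int) -> Fraction:
    """Exact f(x) = C(a, x) * C(N - a, n - x) / C(N, n)."""
    return Fraction(binom(a, x) * binom(N - a, n - x), binom(N, n))


# Hypothetical example: N = 10 objects, a = 4 with characteristic A, draw n = 3.
N, a, n = 10, 4, 3
probabilities = [hypergeom_pmf(x, N, a, n) for x in range(n + 1)]
print(probabilities)
print(sum(probabilities))
```

For these values the four probabilities are 1/6, 1/2, 3/10 and 1/30, which add up to 1.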
The binomial coefficient $\binom{k}{j} = \frac{k!}{j!\,(k-j)!}$ is defined to be zero if either j or k-j is
negative, so that the probability of the null event of drawing more objects than those
available is zero. To prove that
$$
\sum_{x=0}^{n} f(x) = \sum_{x=0}^{n} \frac{\binom{a}{x}\binom{N-a}{n-x}}{\binom{N}{n}} = 1,
$$
consider the factorization $(B+C)^{N} = (B+C)^{a}(B+C)^{N-a}$. From the binomial theorem,
$$
\begin{aligned}
(B+C)^{a}(B+C)^{N-a}
&= \sum_{j=0}^{a}\binom{a}{j}B^{a-j}C^{j}\,\sum_{l=0}^{N-a}\binom{N-a}{l}B^{N-a-l}C^{l} \\
&= \sum_{j=0}^{a}\sum_{l=0}^{N-a}\binom{a}{j}\binom{N-a}{l}B^{N-l-j}C^{l+j}.
\end{aligned}
$$
Now use the diagonal rearrangement suggested by the figure below: set $l = n - j$, with the
intercept n running from 0 to N and j running from 0 to a. This generates more than the
$(a+1)(N-a+1)$ terms in the above sum. However, all of the new terms generated vanish,
since they have either $l > N-a$ or $l < 0$, and the factor $\binom{N-a}{l}$ is then zero.
$$
(B+C)^{a}(B+C)^{N-a} = \sum_{n=0}^{N}\sum_{j=0}^{a}\binom{a}{j}\binom{N-a}{n-j}B^{N-n}C^{n}
$$
Now, for $n > a$, extending the sum over j to n would only add terms which are zero because
of the $\binom{a}{j}$ factor. Similarly, if $n < a$, the terms in the sum over j from j = n + 1 to j = a
are all zero due to the $\binom{N-a}{n-j}$ factor. Thus,
$$
(B+C)^{a}(B+C)^{N-a}
= \sum_{n=0}^{N}\sum_{j=0}^{a}\binom{a}{j}\binom{N-a}{n-j}B^{N-n}C^{n}
= \sum_{n=0}^{N}\sum_{j=0}^{n}\binom{a}{j}\binom{N-a}{n-j}B^{N-n}C^{n}.
$$
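The diagonal rearrangement can be mimicked numerically: multiplying out the two binomial expansions and collecting every product with j + l = n gives the coefficient of B^(N-n) C^n. The sketch below is only an illustration. The function name `product_coefficients` and the values N = 7, a = 3 are assumptions chosen for the example.

```python
from math import comb


def product_coefficients(a: int, N: int) -> list[int]:
    """Coefficients of B^(N-n) C^n in (B+C)^a (B+C)^(N-a), for n = 0..N.

    Every pair (j, l) from the double sum is added to the diagonal n = j + l.
    """
    coeffs = [0] * (N + 1)
    for j in range(a + 1):
        for l in range(N - a + 1):
            coeffs[j + l] += comb(a, j) * comb(N - a, l)
    return coeffs


# Hypothetical example values.
N, a = 7, 3
diagonal_sums = product_coefficients(a, N)
print(diagonal_sums)  # [1, 7, 21, 35, 35, 21, 7, 1]
```

The collected diagonal sums coincide with the coefficients of (B+C)^7, which is the comparison the argument makes next.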
But from a second use of the binomial theorem,
$$
(B+C)^{a}(B+C)^{N-a}
= \sum_{n=0}^{N}\sum_{j=0}^{n}\binom{a}{j}\binom{N-a}{n-j}B^{N-n}C^{n}
= (B+C)^{N}
= \sum_{n=0}^{N}\binom{N}{n}B^{N-n}C^{n}.
$$
The only way the two sums can be equal for all values of B and C is for
$$
\sum_{j=0}^{n}\binom{a}{j}\binom{N-a}{n-j} = \binom{N}{n}. \tag{1}
$$
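As a worked instance of Equation (1), take the illustrative values N = 10, a = 4 and n = 3 (chosen here only as an example, not taken from the handout):

$$
\sum_{j=0}^{3}\binom{4}{j}\binom{6}{3-j}
= \binom{4}{0}\binom{6}{3} + \binom{4}{1}\binom{6}{2} + \binom{4}{2}\binom{6}{1} + \binom{4}{3}\binom{6}{0}
= 20 + 60 + 36 + 4 = 120 = \binom{10}{3}.
$$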
This in turn implies that the hypergeometric probabilities do indeed form a valid
probability distribution, i.e.
$$
\sum_{x=0}^{n} f(x) = \sum_{x=0}^{n} \frac{\binom{a}{x}\binom{N-a}{n-x}}{\binom{N}{n}} = 1.
$$
The mean or expected value of the hypergeometric random variable is given by
$$
\mu_x = \langle x \rangle = \sum_{x=0}^{n} x\,f(x) = \binom{N}{n}^{-1}\sum_{x=0}^{n} x\binom{a}{x}\binom{N-a}{n-x}.
$$
Now, using Equation (1),
$$
\begin{aligned}
\sum_{x=0}^{n} x\binom{a}{x}\binom{N-a}{n-x}
&= \sum_{x=1}^{n} x\,\frac{a!}{x!\,(a-x)!}\binom{N-a}{n-x}
= \sum_{x=1}^{n} \frac{a\,(a-1)!}{(x-1)!\,[(a-1)-(x-1)]!}\binom{(N-1)-(a-1)}{(n-1)-(x-1)} \\
&= a\sum_{x=0}^{n-1} \frac{(a-1)!}{x!\,[(a-1)-x]!}\binom{(N-1)-(a-1)}{(n-1)-x}
= a\sum_{x=0}^{n-1} \binom{a-1}{x}\binom{(N-1)-(a-1)}{(n-1)-x} \\
&= a\binom{N-1}{n-1}.
\end{aligned}
$$
This gives that
$$
\mu_x = \langle x \rangle = \sum_{x=0}^{n} x\,f(x)
= a\binom{N}{n}^{-1}\binom{N-1}{n-1}
= a\,\frac{(N-1)!}{(n-1)!\,(N-n)!}\cdot\frac{n!\,(N-n)!}{N!}
= \frac{na}{N}.
$$
Using the notation of the binomial distribution that $p = \frac{a}{N}$, we see that the expected value
of x is the same for both drawing without replacement (the hypergeometric distribution)
and with replacement (the binomial distribution).
$$
\mu_x = \langle x \rangle = \frac{na}{N} = np \tag{2}
$$
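Equation (2) is easy to check by brute force. The sketch below sums x f(x) exactly and compares the result with na/N. The helper name `hypergeom_mean` and the tested parameter ranges are illustrative assumptions.

```python
from fractions import Fraction
from math import comb


def hypergeom_mean(N: int, a: int, n: int) -> Fraction:
    """Mean of x computed directly as the sum of x * f(x) for x = 0..n."""
    total = comb(N, n)
    return sum(
        (Fraction(x * comb(a, x) * comb(N - a, n - x), total) for x in range(n + 1)),
        Fraction(0),
    )


# Compare the direct sum with n*a/N for every small case with n < N and a < N.
for N in range(2, 15):
    for a in range(N):
        for n in range(N):
            assert hypergeom_mean(N, a, n) == Fraction(n * a, N)

print(hypergeom_mean(10, 4, 3))  # 6/5, i.e. 3 * 4 / 10
```

Exact fractions are used instead of floating point so that the comparison with na/N is an equality test rather than a tolerance check.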
The variance of the hypergeometric distribution can be computed from the generic
formula that $\sigma_x^2 = \langle [x - \langle x \rangle]^2 \rangle = \langle x^2 \rangle - \langle x \rangle^2$. Again from Equation (1),
$$
\begin{aligned}
\sum_{x=0}^{n} x(x-1)\binom{a}{x}\binom{N-a}{n-x}
&= \sum_{x=2}^{n} x(x-1)\,\frac{a!}{x!\,(a-x)!}\binom{N-a}{n-x} \\
&= \sum_{x=2}^{n} \frac{a(a-1)\,(a-2)!}{(x-2)!\,[(a-2)-(x-2)]!}\binom{(N-2)-(a-2)}{(n-2)-(x-2)} \\
&= a(a-1)\sum_{x=0}^{n-2} \frac{(a-2)!}{x!\,[(a-2)-x]!}\binom{(N-2)-(a-2)}{(n-2)-x} \\
&= a(a-1)\sum_{x=0}^{n-2} \binom{a-2}{x}\binom{(N-2)-(a-2)}{(n-2)-x} \\
&= a(a-1)\binom{N-2}{n-2}.
\end{aligned}
$$
So,
$$
\begin{aligned}
\langle x(x-1) \rangle
&= \binom{N}{n}^{-1}\sum_{x=0}^{n} x(x-1)\binom{a}{x}\binom{N-a}{n-x}
= a(a-1)\binom{N}{n}^{-1}\binom{N-2}{n-2} \\
&= a(a-1)\,\frac{(N-2)!}{(n-2)!\,(N-n)!}\cdot\frac{n!\,(N-n)!}{N!}
= \frac{a(a-1)\,n(n-1)}{N(N-1)}
\end{aligned}
$$
and
$$
\langle x^2 \rangle = \langle x(x-1) \rangle + \langle x \rangle
= \frac{a(a-1)\,n(n-1)}{N(N-1)} + \frac{an}{N}
= \frac{an}{N}\left[\frac{(a-1)(n-1)}{N-1} + 1\right].
$$
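As a worked check of the expected values of x(x-1) and x^2, take the illustrative values N = 10, a = 4, n = 3 (an example chosen here, for which the definition of f gives f(0) = 1/6, f(1) = 1/2, f(2) = 3/10 and f(3) = 1/30):

$$
\langle x(x-1) \rangle = 2\cdot\frac{3}{10} + 6\cdot\frac{1}{30} = \frac{4}{5} = \frac{4\cdot 3\cdot 3\cdot 2}{10\cdot 9},
\qquad
\langle x^2 \rangle = \frac{1}{2} + 4\cdot\frac{3}{10} + 9\cdot\frac{1}{30} = 2 = \frac{4\cdot 3}{10}\left[\frac{3\cdot 2}{9} + 1\right].
$$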
Thus,
$$
\begin{aligned}
\sigma_x^2 &= \langle x^2 \rangle - \langle x \rangle^2
= \frac{an}{N}\left[\frac{(a-1)(n-1)}{N-1} + 1\right] - \left(\frac{an}{N}\right)^2 \\
&= \frac{an}{N}\left[\frac{N(a-1)(n-1)}{N(N-1)} + \frac{N(N-1)}{N(N-1)} - \frac{an(N-1)}{N(N-1)}\right] \\
&= \frac{an}{N}\left[\frac{Nan - Na - Nn + N + N^2 - N - Nan + an}{N(N-1)}\right]
= \frac{an}{N}\left[\frac{N^2 - Na - Nn + an}{N(N-1)}\right] \\
&= \frac{an}{N}\left[\frac{N(N-a) - n(N-a)}{N(N-1)}\right]
= \frac{an}{N}\left[\frac{(N-a)(N-n)}{N(N-1)}\right]
= \frac{an}{N}\left(\frac{N-a}{N}\right)\left(\frac{N-n}{N-1}\right) \\
&= \frac{an}{N}\left(1 - \frac{a}{N}\right)\left(\frac{N-n}{N-1}\right)
= np(1-p)\left(\frac{N-n}{N-1}\right)
\end{aligned}
$$
The last factor $\frac{N-n}{N-1}$ is called the "finite population correction" and is the reason that
the variance of the binomial distribution, $np(1-p)$, differs from that of the hypergeometric
distribution. For N large compared to the sample size n, the correction is close to 1 and
the two distributions are essentially identical.
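The variance formula and the behaviour of the finite population correction can be checked numerically. In the sketch below the function names `hypergeom_variance` and `closed_form_variance`, the tested ranges, and the choice n = 3 with p = 2/5 are all illustrative assumptions rather than part of the handout.

```python
from fractions import Fraction
from math import comb


def hypergeom_variance(N: int, a: int, n: int) -> Fraction:
    """Variance computed directly from the probabilities f(x)."""
    total = comb(N, n)
    pmf = [Fraction(comb(a, x) * comb(N - a, n - x), total) for x in range(n + 1)]
    mean = sum(x * f for x, f in enumerate(pmf))
    mean_square = sum(x * x * f for x, f in enumerate(pmf))
    return mean_square - mean ** 2


def closed_form_variance(N: int, a: int, n: int) -> Fraction:
    """n p (1 - p) (N - n) / (N - 1) with p = a / N."""
    p = Fraction(a, N)
    return n * p * (1 - p) * Fraction(N - n, N - 1)


# The direct computation agrees with the closed form for all small cases.
for N in range(2, 15):
    for a in range(N):
        for n in range(N):
            assert hypergeom_variance(N, a, n) == closed_form_variance(N, a, n)

# Fixed n = 3 and p = 2/5: ratio of the hypergeometric variance to the
# binomial variance n p (1 - p) as the population size N grows.
binomial_variance = 3 * Fraction(2, 5) * Fraction(3, 5)
for N in (10, 100, 1000):
    a = 2 * N // 5
    ratio = hypergeom_variance(N, a, 3) / binomial_variance
    print(N, ratio, float(ratio))
```

The printed ratios are 7/9, 97/99 and 997/999, that is (N - n)/(N - 1) for each N, and they move toward 1 as N increases.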
