Study Material
for
Probability and Statistics
AAOC ZC111
Distance Learning Programmes Division
Birla Institute of Technology & Science
Pilani – 333031 (Rajasthan)
July 2003
Course Developed by
M.S.Radhakrishnan
Word Processing & Typesetting by
Narendra Saini
Ashok Jitawat
Contents

INTRODUCTION, SAMPLE SPACES & EVENTS
Probability
Events
AXIOMS OF PROBABILITY
Some elementary consequences of the Axioms
Finite Sample Space (in which all outcomes are equally likely)
CONDITIONAL PROBABILITY
Independent events
Theorem on Total Probability
BAYES’ THEOREM
MATHEMATICAL EXPECTATION & DECISION MAKING
RANDOM VARIABLES
Discrete Random Variables
Binomial Distribution
Cumulative Binomial Probabilities
Binomial Distribution – Sampling with replacement
Mode of a Binomial distribution
Hypergeometric Distribution (Sampling without replacement)
Binomial distribution as an approximation to the Hypergeometric Distribution
THE MEAN AND VARIANCE OF PROBABILITY DISTRIBUTIONS
The mean of a Binomial Distribution
Digression
Chebyshev’s theorem
Law of large numbers
Poisson Distribution
Poisson approximation to binomial distribution
Cumulative Poisson distribution
Poisson Process
The Geometric Distribution
Multinomial Distribution
Simulation
CONTINUOUS RANDOM VARIABLES
Probability Density Function (pdf)
Normal Distribution
Normal Approximation to Binomial Distribution
Correction for Continuity
Other Probability Densities
The Uniform Distribution
Gamma Function
Properties of Gamma Function
The Gamma Distribution
Exponential Distribution
Beta Distribution
The Log-Normal Distribution
JOINT DISTRIBUTIONS – TWO AND HIGHER DIMENSIONAL RANDOM VARIABLES
Conditional Distribution
Independence
Two-Dimensional Continuous Random Variables
Marginal and Conditional Densities
Independence
The Cumulative Distribution Function
Properties of Expectation
Sample Mean
Sample Variance
SAMPLING DISTRIBUTION
Statistical Inference
Statistics
The Sampling Distribution of the Sample Mean X̄
Inferences Concerning Means
Point Estimation
Estimation of n
Estimation of Sample proportion
Large Samples
Tests of Statistical Hypothesis
Notation
REGRESSION AND CORRELATION
Regression
Correlation
Sample Correlation Coefficient
INTRODUCTION, SAMPLE SPACES & EVENTS
Probability
Let E be a random experiment (where we ‘know’ all possible outcomes but can’t predict
what the particular outcome will be when the experiment is conducted). The set of all
possible outcomes is called a sample space for the random experiment E.
Example 1:
Let E be the random experiment:
Toss two coins and observe the sequence of heads and tails. A sample space for this
experiment could be S = {HH, HT, TH, TT}. If however we only observe the number
of heads got, the sample space would be S = {0, 1, 2}.
Example 2:
Let E be the random experiment:
Toss two fair dice and observe the two numbers on the top. A sample space would be
S = { (1,1), (1,2), (1,3), …, (1,6),
      (2,1), (2,2), (2,3), …, (2,6),
      ………………………………,
      (6,1), (6,2), (6,3), …, (6,6) }
If however, we are interested only in the sum of the two numbers on the top, the
sample space could be S = { 2, 3, …, 12}.
Example 3:
Let E be the random experiment:
Count the number of machines produced by a factory until a defective machine is
produced. A sample space for this experiment could be S = {1, 2, 3, ……}.
Example 4:
Let E be the random experiment:
Count the life length of a bulb produced by a factory.
Here S will be {t | t ≥ 0} = [0, ∞).
Events
An event is a subset of the sample space.
Example 5:
Suppose a balanced die is rolled and we observe the number on the top. Let A be the
event: an even number occurs.
Thus in symbols,
A = {2, 4, 6} ⊂ S = {1, 2, 3, 4, 5, 6}
Two events are said to be mutually exclusive if they cannot occur together; that is there
is no element common between them.
In the above example, if B is the event: an odd number occurs, i.e. B = {1, 3, 5}, then A and
B are mutually exclusive.
Solved Examples
Example 1:
A manufacturer of small motors is concerned with three major types of defects. If A is
the event that the shaft size is too large, B is the event that the windings are improper and
C is the event that the electrical connections are unsatisfactory, express in words what
events are represented by the following regions of the Venn diagram given below:
(a) region 2 (b) regions 1 and 3 together (c) regions 3, 5, 6 and 8 together.
Solution:
(a) Since this region is contained in A and B but not in C, it represents the event that
the shaft is too large and the windings improper but the electrical connections are
satisfactory.
(b) Since this region is common to B and C, it represents the event that the windings
are improper and the electrical connections are unsatisfactory.
(c) Since this is the entire region outside A, it represents the event that the shaft size is not too large.
Example 2:
A carton of 12 rechargeable batteries contains one that is defective. In how many ways can
the inspector choose three of the batteries and
(a) get the one that is defective
(b) not get the one that is defective.
Solution:
(a) The one defective battery can be chosen in one way, and two good ones can be chosen in
11C2 = 55 ways. Hence one defective and two good ones can be chosen in 1 × 55 = 55 ways.
(b) Three good ones can be chosen in 11C3 = 165 ways.
[Venn diagram: the events A, B and C divide the rectangle into eight numbered regions, 1 to 8.]
AXIOMS OF PROBABILITY
Let E be a random experiment. Suppose to each event A, we associate a real number
P(A) satisfying the following axioms:
(i) 0 ≤ P(A) ≤ 1
(ii) P(S) = 1
(iii) If A and B are any two mutually exclusive events, then P(A ∪ B) = P(A) + P(B)
(iv) If {A1, A2, …, An, …} is a sequence of pair-wise mutually exclusive
events, then P(A1 ∪ A2 ∪ … ∪ An ∪ …) = P(A1) + P(A2) + … + P(An) + …
We call P(A) the probability of the event A.
Axiom 1 says that the probability of an event is always a number between 0 and 1.
Axiom 2 says that the probability of the certain event S is 1. Axiom 3 says that the
probability is an additive set function.
Some elementary consequences of the Axioms
1. P(φ) = 0
Proof: S = S ∪ φ. Now S and φ are disjoint.
Hence P(S) = P(S) + P(φ), so P(φ) = 0. Q.E.D.
2. If A1, A2, …, An are any n pair-wise mutually exclusive events, then
P(A1 ∪ A2 ∪ … ∪ An) = Σ_{i=1}^{n} P(Ai).
Proof: By induction on n.
Def.: If A is an event, A′, the complementary event, = S − A (everything in S outside A).
3. P(A′) = 1 − P(A)
Proof: S = A ∪ A′.
Now P(S) = P(A) + P(A′) as A and A′ are disjoint, so 1 = P(A) + P(A′).
Thus P(A′) = 1 − P(A). Q.E.D.
4. Probability is a subtractive set function; i.e. if A ⊂ B, then P(B − A) = P(B) − P(A).
5. Probability is a monotone set function; i.e. A ⊂ B ⟹ P(A) ≤ P(B).
Proof: B = A ∪ (B − A), where A and B − A are disjoint.
Thus P(B) = P(A) + P(B − A) ≥ P(A).
6. If A, B are any two events,
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
Proof:
A ∪ B = A ∪ (A′ ∩ B), where A and A′ ∩ B are disjoint.
Hence P(A ∪ B) = P(A) + P(A′ ∩ B).
But B = (A ∩ B) ∪ (A′ ∩ B), a union of two disjoint sets,
so P(B) = P(A ∩ B) + P(A′ ∩ B), or P(A′ ∩ B) = P(B) − P(A ∩ B).
∴ P(A ∪ B) = P(A) + P(B) − P(A ∩ B). Q.E.D.
7. If A, B, C are any three events,
P(A ∪ B ∪ C) = P(A) + P(B) + P(C) − P(A ∩ B) − P(B ∩ C) − P(A ∩ C) + P(A ∩ B ∩ C).
Proof:
P(A ∪ B ∪ C) = P((A ∪ B) ∪ C) = P(A ∪ B) + P(C) − P((A ∪ B) ∩ C)
= P(A) + P(B) − P(A ∩ B) + P(C) − P((A ∩ C) ∪ (B ∩ C))
= P(A) + P(B) + P(C) − P(A ∩ B) − P(A ∩ C) − P(B ∩ C) + P(A ∩ B ∩ C).
More generally,
8. If A1, A2, …, An are any n events,
P(A1 ∪ A2 ∪ … ∪ An) = Σ_{i=1}^{n} P(Ai) − Σ_{1≤i<j≤n} P(Ai ∩ Aj) + Σ_{1≤i<j<k≤n} P(Ai ∩ Aj ∩ Ak) − … + (−1)^(n−1) P(A1 ∩ A2 ∩ … ∩ An).
Finite Sample Space (in which all outcomes are equally likely)
Let E be a random experiment having only a finite number of outcomes.
Let all the (finite no. of) outcomes be equally likely.
If S = {a1, a2, …, an} (a1, a2, …, an being equally likely outcomes), then S = {a1} ∪ {a2} ∪ … ∪ {an}, a
union of mutually exclusive events.
Hence P(S) = P({a1}) + P({a2}) + … + P({an}).
But P({a1}) = P({a2}) = … = P({an}) = p (say).
Hence 1 = p + p + … + p (n terms), or p = 1/n.
Hence if A is a subset consisting of ‘k’ of these outcomes,
A = {a1, a2, …, ak}, then
P(A) = k/n = (No. of favourable outcomes)/(Total no. of outcomes).
Example 1:
If a card is drawn from a well-shuffled pack of 52 cards find the probability of drawing
(a) a red king Ans: 2/52
(b) a 3, 4, 5 or 6 Ans: 16/52
(c) a black card Ans: 1/2
(d) a red ace or a black queen Ans: 4/52
Example 2:
When a pair of balanced dice is thrown, find the probability of getting a sum equal to
(a) 7. Ans: 6/36 = 1/6 (The total number of equally likely outcomes is 36 and the favourable number of outcomes is 6, namely (1,6), (2,5), …, (6,1).)
(b) 11 Ans: 2/36
(c) 7 or 11 Ans: 8/36
(d) 2, 3 or 12 Ans: 1/36 + 2/36 + 1/36 = 4/36.
Example 3:
10 persons in a room are wearing badges marked 1 through 10. 3 persons are chosen at
random and asked to leave the room simultaneously and their badge nos are noted. Find
the probability that
(a) the smallest badge number is 5.
(b) the largest badge number is 5.
Solution:
(a) 3 persons can be chosen in 10C3 equally likely ways. If the smallest badge
number is to be 5, the badge numbers should be 5 and any two of the 5
numbers 6, 7, 8, 9, 10. Now 2 numbers out of 5 can be chosen in 5C2 ways.
Hence the probability that the smallest badge number is 5 is 5C2 / 10C3.
(b) Ans. 4C2 / 10C3.
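(These combinatorial answers are easy to check numerically; the short Python sketch below, with our own variable names, uses math.comb for the binomial coefficients.)

from math import comb

total = comb(10, 3)                  # ways of choosing 3 persons out of 10
p_smallest_5 = comb(5, 2) / total    # badge 5 together with any 2 of {6, 7, 8, 9, 10}
p_largest_5 = comb(4, 2) / total     # badge 5 together with any 2 of {1, 2, 3, 4}
print(p_smallest_5, p_largest_5)     # 0.0833... and 0.05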
Example 4:
A lot consists of 10 good articles, 4 articles with minor defects and 2 with major defects.
Two articles are chosen at random. Find the probability that
(a) both are good Ans: 10C2 / 16C2
(b) both have major defects Ans: 2C2 / 16C2
(c) at least one is good Ans: 1 − P(none is good) = 1 − 6C2 / 16C2
(d) exactly one is good Ans: (10C1 × 6C1) / 16C2
(e) at most one is good Ans: P(none is good) + P(exactly one is good) = 6C2 / 16C2 + (10C1 × 6C1) / 16C2
(f) neither has major defects Ans: 14C2 / 16C2
(g) neither is good Ans: 6C2 / 16C2
Example 5:
From 6 positive and 8 negative integers, 4 integers are chosen at random and multiplied.
Find the probability that their product is positive.
Solution:
The product is positive if all the 4 integers are positive or all of them are negative or two
of them are positive and the other two are negative. Hence the probability is
[6C4 + 8C4 + 6C2 × 8C2] / 14C4
Example 6:
If, A, B are mutually exclusive events and if P(A) = 0.29, P(B) = 0.43, then
(a) P(A′) = 1 − 0.29 = 0.71
(b) P(A ∪ B) = 0.29 + 0.43 = 0.72
(c) P(A ∩ B′) = P(A) = 0.29 [as A is a subset of B′, since A and B are m.e.]
(d) P(A′ ∩ B′) = 1 − P(A ∪ B) = 1 − 0.72 = 0.28
Example 7:
P(A) = 0.35, P(B) = 0.73, P(A ∩ B) = 0.14. Find
(a) P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = 0.94
(b) P(A′ ∩ B) = P(B) − P(A ∩ B) = 0.59
(c) P(A ∩ B′) = P(A) − P(A ∩ B) = 0.21
(d) P(A′ ∪ B′) = 1 − P(A ∩ B) = 1 − 0.14 = 0.86
Example 8:
A, B, C are 3 mutually exclusive events. Is this assignment of probabilities possible?
P(A) = 0.3, P(B) = 0.4, P(C) = 0.5
Ans. P(A ∪ B ∪ C) = P(A) + P(B) + P(C) >1 NOT POSSIBLE
Example 9:
Three newspapers are published in a city. A recent survey of readers indicated the
following:
20% read A 8% read A and B 2% read all
16% read B 5% read A and C
14% read C 4% read B and C
Find probability that an adult chosen at random reads
(a) none of the papers.
Ans. 1 − P(A ∪ B ∪ C) = 1 − (20 + 16 + 14 − 8 − 5 − 4 + 2)/100 = 0.65
(b) reads exactly one paper.
P(reading exactly one paper) = (9 + 6 + 7)/100 = 0.22
(c) reads at least A and B given he reads at least one of the papers.
P(at least A and B | reads at least one paper) = P(A ∩ B)/P(A ∪ B ∪ C) = 8/35
[Venn diagram of A, B, C with the percentage in each region: A only 9, B only 6, C only 7, A and B only 6, A and C only 3, B and C only 2, all three 2.]
CONDITIONAL PROBABILITY
Let, A, B be two events. Suppose P(B) ≠ 0. The conditional probability of A occurring
given that B has occurred is defined as
P(A | B) = probability of A given B = P(A ∩ B)/P(B).
Similarly, we define P(B | A) = P(A ∩ B)/P(A) if P(A) ≠ 0.
Hence we get the multiplication theorem
P(A ∩ B) = P(A) P(B | A)  (if P(A) ≠ 0)
         = P(B) P(A | B)  (if P(B) ≠ 0)
Example 10
A bag contains 4 red balls and 6 black balls. 2 balls are chosen at random one by one
without replacement. Find the probability that both are red.
Solution
Let A be the event that the first ball drawn is red, B the event the second ball drawn is
red. Hence the probability that both balls drawn are red is
P(A ∩ B) = P(A) × P(B | A) = (4/10) × (3/9) = 2/15.
Independent events:
Definition: We say two events A, B are independent if P(A∩ B) = P(A). P(B)
Equivalently A and B are independent if P(B | A) = P(B) or P(A | B) = P(A)
Theorem If, A, B are independent, then
(a) A′ , B are independent
(b) A, B′ are independent
(c) A′, B′ are independent
Proof: B = (A ∩ B) ∪ (A′ ∩ B), a union of mutually exclusive events.
Hence P(B) = P(A ∩ B) + P(A′ ∩ B), so
P(A′ ∩ B) = P(B) − P(A ∩ B)
= P(B) − P(A) P(B)    (since A, B are independent)
= P(B) [1 − P(A)]
= P(B) P(A′)
∴ A′ and B are independent. By the same reasoning, A and B′ are independent, and then A′ and B′ are also independent.
Example 11
Find the probability of getting 8 heads in a row in 8 tosses of a fair coin.
Solution
If Ai is the event of getting a head in the ith toss, A1, A2, …, A8 are independent and
P(Ai) = 1/2 for all i. Hence P(getting all heads) =
P(A1) P(A2) … P(A8) = (1/2)⁸.
Example 12
It is found that in manufacturing a certain article, defects of one type occur with
probability 0.1 and defects of other type occur with probability 0.05. Assume
independence between the two types of defects. Find the probability that an article chosen
at random has exactly one type of defect given that it is defective.
Let A be the event that the article has exactly one type of defect.
Let B be the event that the article is defective.
Required: P(A | B) = P(A ∩ B)/P(B).
P(B) = P(D ∪ E), where D is the event it has a type-one defect and E is the event it has a type-two defect
= P(D) + P(E) − P(D ∩ E) = 0.1 + 0.05 − (0.1)(0.05) = 0.145
P(A ∩ B) = P(article has exactly one type of defect)
= P(D) + P(E) − 2 P(D ∩ E) = 0.1 + 0.05 − 2(0.1)(0.05) = 0.14
∴ Probability = 0.14/0.145
[Note: If A and B are two events, the probability that exactly one of them occurs
is P(A) + P(B) − 2 P(A ∩ B).]
Example 13
An electronic system has 2 subsystems A and B. It is known that
P (A fails) = 0.2
P (B fails alone) = 0.15
P (A and B fail) = 0.15
Find (a) P (A fails | B has failed)
(b) P (A fails alone)
Solution
(a) P(B has failed) = P(B fails alone) + P(A and B fail) = 0.15 + 0.15 = 0.30.
P(A fails | B has failed) = P(A and B fail)/P(B has failed) = 0.15/0.30 = 1/2
(b) P(A fails alone) = P(A fails) − P(A and B fail) = 0.20 − 0.15 = 0.05
Example 14
A binary number is a number having digits 0 and 1. Suppose a binary number is made up
of ‘n’ digits. Suppose the probability of forming an incorrect binary digit is p. Assume
independence between errors. What is the probability of forming an incorrect binary
number?
Ans 1 − P(forming a correct number) = 1 − (1 − p)^n.
Example 15
A question paper consists of 5 Multiple choice questions each of which has 4 choices (of
which only one is correct). If a student answers all the five questions randomly, find the
probability that he answers all questions correctly.
Ans (1/4)⁵.
Theorem on Total Probability
Let B1, B2, …, Bn be n mutually exclusive events of which one must occur. If A is any
other event, then
P(A) = P(A ∩ B1) + P(A ∩ B2) + … + P(A ∩ Bn)
     = Σ_{i=1}^{n} P(Bi) P(A | Bi)
(For a proof, see your text book.)
Example 16
There are 2 urns. The first one has 4 red balls and 6 black balls. The second has 5 red
balls and 4 black balls. A ball is chosen at random from the 1st urn and put in the 2nd.
Now a ball is drawn at random from the 2nd urn. Find the probability that it is red.
Solution:
Let B1 be the event that the first ball drawn is red and B2 be the event that the first ball
drawn is black. Let A be the event that the second ball drawn is red. By the theorem on
total probability,
P(A) = P(B1) P(A | B1) + P(B2) P(A | B2) = (4/10) × (6/10) + (6/10) × (5/10) = 54/100 = 0.54.
Example 17:
A consulting firm rents cars from three agencies D, E, F. 20% of the cars are rented from
D, 20% from E and the remaining 60% from F. If 10% of cars rented from D, 12% of
cars rented from E, 4% of cars rented from F have bad tires, find the probability that a
car rented from the consulting firm will have bad tires.
Ans. (0.2) (0.1) + (0.2) (0.12) + (0.6) (0.04)
Example 18:
A bolt factory has three divisions B1, B2, B3 that manufacture bolts. 25% of output is
from B1, 35% from B2 and 40% from B3. 5% of the bolts manufactured by B1 are
defective, 4% of the bolts manufactured by B2 are defective and 2% of the bolts
manufactured by B3 are defective. Find the probability that a bolt chosen at random from
the factory is defective.
Ans. (25/100) × (5/100) + (35/100) × (4/100) + (40/100) × (2/100)
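(The theorem on total probability is easy to evaluate numerically; here is an illustrative Python sketch for Example 18, with our own variable names.)

# P(defective) = sum over divisions of P(division) x P(defective | division)
priors = [0.25, 0.35, 0.40]          # P(B1), P(B2), P(B3)
defect_rates = [0.05, 0.04, 0.02]    # P(A | B1), P(A | B2), P(A | B3)
p_defective = sum(p * d for p, d in zip(priors, defect_rates))
print(p_defective)                   # 0.0345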
BAYES’ THEOREM
Let B1, B2, ……….Bn be n mutually exclusive events of which one of them must occur.
If A is any event, then
P(Bk | A) = P(A ∩ Bk)/P(A) = P(Bk) P(A | Bk) / Σ_{i=1}^{n} P(Bi) P(A | Bi)
Example 19
Miss ‘X’ is fond of seeing films. The probability that she sees a film on the day before
the test is 0.7. Miss X is any way good at studies. The probability that she maxes the test
is 0.3 if she sees the film on the day before the test and the corresponding probability is
0.8 if she does not see the film. If Miss ‘X’ maxed the test, find the probability that she
saw the film on the day before the test.
Solution
Let B1 be the event that Miss X saw the film before the test and let B2 be the
complementary event. Let A be the event that she maxed the test.
Required: P(B1 | A) = P(B1) P(A | B1) / [P(B1) P(A | B1) + P(B2) P(A | B2)]
= (0.7 × 0.3)/(0.7 × 0.3 + 0.3 × 0.8).
Example 20
At an electronics firm, it is known from past experience that the probability that a new worker
who attended the company’s training program meets the production quota is 0.86. The
corresponding probability for a new worker who did not attend the training program is
0.35. It is also known that 80% of all new workers attend the company’s training
program. Find the probability that a new worker who met the production quota attended the company’s training programme.
Solution
Let B1 be the event that a new worker attended the company’s training programme. Let
B2 be the complementary event, namely a new worker did not attend the training
programme. Let A be the event that a new worker met the production quota. Then we
want P(B1 | A) = (0.8 × 0.86)/(0.8 × 0.86 + 0.2 × 0.35).
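(A small Python sketch, with our own variable names, evaluates this Bayes’ theorem calculation.)

p_attend = 0.80                  # P(B1)
p_quota_given_attend = 0.86      # P(A | B1)
p_quota_given_not = 0.35         # P(A | B2)
numerator = p_attend * p_quota_given_attend
p_quota = numerator + (1 - p_attend) * p_quota_given_not
print(numerator / p_quota)       # about 0.908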
Example 21
A printing machine can print any one of n letters L1, L2,……….Ln. It is operated by
electrical impulses, each letter being produced by a different impulse. Assume that there
is a constant probability p that any impulse prints the letter it is meant to print. Also
assume independence. One of the impulses is chosen at random and fed into the machine
twice. Both times, the letter L1 was printed. Find the probability that the impulse chosen
was meant to print the letter L1.
Solution:
Let B1 be the event that the impulse chosen was meant to print the letter L1. Let B2 be the
complementary event. Let A be the event that both times the letter L1 was printed.
P(B1) = 1/n and P(A | B1) = p². Now the probability that an impulse prints a wrong letter is (1 − p).
Since there are n − 1 ways of printing a wrong letter, the probability that an impulse not meant for L1
prints L1 in one operation is (1 − p)/(n − 1), so P(A | B2) = [(1 − p)/(n − 1)]². Hence
P(B1 | A) = P(B1) P(A | B1) / [P(B1) P(A | B1) + P(B2) P(A | B2)]
= (1/n) p² / { (1/n) p² + [(n − 1)/n] [(1 − p)/(n − 1)]² }
= p² / [ p² + (1 − p)²/(n − 1) ].
This is the required probability.
Miscellaneous problems
1 (a). Suppose the digits 1,2,3 are written in a random order. Find probability that at
least one digit occupies its proper place.
Solution
There are 3! = 6 ways of arranging the 3 digits (namely 123, 132, 213, 231, 312, 321), out of which in 4
arrangements at least one digit occupies its proper place. Hence the probability is 4/3! = 4/6.
(Remark. An arrangement like 231, where no digit occupies its proper place is
called a derangement.)
(b) Same as (a) but with the 4 digits 1, 2, 3, 4. Ans. 15/24. (Try proving this.)
Solution
Let A1 be the event that the 1st digit occupies its proper place, A2 the event that the 2nd digit occupies
its proper place, A3 the event that the 3rd digit occupies its proper place, and A4 the event that the 4th
digit occupies its proper place.
P(at least one digit occupies its proper place)
= P(A1 ∪ A2 ∪ A3 ∪ A4)
= P(A1) + P(A2) + P(A3) + P(A4)   (there are 4C1 such terms, each with the same probability)
− P(A1 ∩ A2) − P(A1 ∩ A3) − … − P(A3 ∩ A4)   (there are 4C2 such terms, each with the same probability)
+ P(A1 ∩ A2 ∩ A3) + P(A1 ∩ A2 ∩ A4) + … + P(A2 ∩ A3 ∩ A4)   (there are 4C3 such terms, each with the same probability)
− P(A1 ∩ A2 ∩ A3 ∩ A4)
= 4C1 (3!/4!) − 4C2 (2!/4!) + 4C3 (1!/4!) − 4C4 (0!/4!)
= 1 − 1/2 + 1/6 − 1/24 = (24 − 12 + 4 − 1)/24 = 15/24.
(c) Same as (a) but with n digits.
Solution
Let Ai be the event that the i-th digit occupies its proper place, i = 1, 2, …, n.
P(at least one digit occupies its proper place)
= P(A1 ∪ A2 ∪ … ∪ An)
= nC1 (n−1)!/n! − nC2 (n−2)!/n! + nC3 (n−3)!/n! − … + (−1)^(n−1) nCn (1/n!)
= 1 − 1/2! + 1/3! − 1/4! + … + (−1)^(n−1) (1/n!) ≈ 1 − e^(−1)  (for n large).
2. In a party there are ‘n’ married couples. If each male chooses at random a
female for dancing, find the probability that no man chooses his wife.
Ans 1 − [1 − 1/2! + 1/3! − 1/4! + … + (−1)^(n−1) (1/n!)].
3. A and B play the following game. They throw a pair of dice alternately.
Whosoever gets sum of the two numbers on the top as seven wins the game
and the game stops. Suppose A starts the game. Find the probability (a) A
wins the game (b) B wins the game.
Solution
A wins the game if he gets seven in the 1st throw, or in the 3rd throw, or in the 5th throw, or …. Hence
P(A wins) = 1/6 + (5/6)(5/6)(1/6) + (5/6)(5/6)(5/6)(5/6)(1/6) + …
= (1/6)/(1 − 25/36) = 6/11.
P(B wins) = complementary probability = 5/11.
4. Birthday Problem
There are n persons in a room. Assume that nobody is born on 29th February and that any one
birthday is as likely as any other. Find the probability that no two persons have the same birthday.
Solution
If n > 365, at least two will have the same birthday and hence the probability
that no two have the same birthday is 0.
If n ≤ 365, the desired probability is
[365 × 364 × … × (365 − (n − 1))] / (365)^n.
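(The product above is easy to compute for any n; the following Python sketch, with our own function name, does so.)

def no_shared_birthday(n):
    # probability that no two of n persons share a birthday (365 equally likely days)
    if n > 365:
        return 0.0
    prob = 1.0
    for k in range(n):
        prob *= (365 - k) / 365
    return prob

print(no_shared_birthday(23))    # about 0.493, so with 23 people a shared birthday is already more likely than not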
5. A die is rolled until all the faces have appeared on top.
(a) What is the probability that exactly 6 throws are needed? Ans. 6!/6⁶
(b) What is the probability that exactly n throws are needed? (n > 6)
6. Polya’s urn problem
An urn contains g green balls and r red balls. A ball is chosen at random and
its color is noted. Then the ball is returned to the urn and c more balls of same
color are added. Now a ball is drawn. Its color is noted and the ball is
replaced. This process is repeated.
(a) Find the probability that the 1st ball drawn is green.
Ans. g/(g + r)
(b) Find the probability that the 2nd ball drawn is green.
Ans. [g/(g + r)] × [(g + c)/(g + r + c)] + [r/(g + r)] × [g/(g + r + c)] = g/(g + r)
(c) Find the probability that the nth ball drawn is green.
The surprising answer is g/(g + r).
7. There are n urns and each urn contains a white and b red balls. A ball is
chosen from Urn 1 and put into Urn 2. Now a ball is chosen at random from
urn 2 and put into urn 3 and this is continued. Finally a ball drawn from Urn n.
Find the probability that it is white.
Solution
Let pr = probability that the ball drawn from Urn r is white.
∴ pr = p_(r−1) × (a + 1)/(a + b + 1) + (1 − p_(r−1)) × a/(a + b + 1),  r = 2, 3, …, n.
This is a recurrence relation for pr. Noting that p1 = a/(a + b), we can find pn.
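(The recurrence can be iterated numerically; the Python sketch below uses illustrative values a = 3, b = 2, n = 5 of our own choosing.)

def p_white_from_urn_n(a, b, n):
    p = a / (a + b)                                    # p1 for Urn 1
    for _ in range(2, n + 1):
        p = p * (a + 1) / (a + b + 1) + (1 - p) * a / (a + b + 1)
    return p

print(p_white_from_urn_n(3, 2, 5))                     # 0.6 = a/(a+b); the probability stays the same for every urn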
MATHEMATICAL EXPECTATION & DECISION MAKING
Suppose we roll a die n times. What is the average of the n numbers that appear on the
top?
Suppose 1 occurs on the top n1 times
Suppose 2 occurs on the top n2 times
Suppose 3 occurs on the top n3 times
Suppose 4 occurs on the top n4 times
Suppose 5 occurs on the top n5 times
Suppose 6 occurs on the top n6 times
Total of the n numbers on the top = 1 × n1 + 2 × n2 + … + 6 × n6
∴ Average of the n numbers = (1 × n1 + 2 × n2 + … + 6 × n6)/n = 1 × (n1/n) + 2 × (n2/n) + … + 6 × (n6/n)
Here clearly n1, n2, …, n6 are unknown. But by the relative frequency definition of
probability, we may approximate n1/n by P(getting 1 on the top) = 1/6, n2/n by
P(getting 2 on the top) = 1/6, and so on. So we can ‘expect’ the average of the n
numbers to be 1 × 1/6 + 2 × 1/6 + … + 6 × 1/6 = 7/2 = 3.5. We call this the Mathematical Expectation of the number
on the top.
Definition
Let E be a random experiment with n outcomes a1, a2, …, an. Suppose P({a1}) = p1,
P({a2}) = p2, …, P({an}) = pn. Then we define the mathematical expectation as
a1 × p1 + a2 × p2 + … + an × pn.
Problems
1. If a service club sells 4000 raffle tickets for a cash prize of $800, what is the
mathematical expectation of a person who buys one of these tickets?
Solution. 800 × (1/4000) + 0 × (3999/4000) = 1/5 = 0.2
2. A charitable organization raises funds by selling 2000 raffle tickets for a 1st prize
worth $5000 and a second prize worth $100. What is the mathematical expectation of a
person who buys one of the tickets?
Solution. 5000 × (1/2000) + 100 × (1/2000) + 0 × (1998/2000)
3. A game between 2 players is called fair if each player has the same mathematical
expectation. If some one gives us $5 whenever we roll a 1 or a 2 with a balanced
die, what we must pay him when we roll a 3, 4, 5 or 6 to make the game fair?
Solution. Suppose we pay $x when we roll a 3, 4, 5 or 6. For the game to be fair,
5 × (2/6) = x × (4/6), or x = 2.5. That is, we must pay $2.50.
4. Gambler’s Ruin
A and B are betting on repeated flips of a balanced coin. At the beginning, A has
m dollars and B has n dollars. After each flip the loser pays the winner 1 dollar
and the game stops when one of them is ruined. Find probability that A will win
B’s n dollars before he loses his m dollars.
Solution.
Let p be the probability that A wins (so that 1 − p is the probability that B wins).
Since the game is fair, A’s mathematical expectation = B’s mathematical expectation.
Thus n × p + 0 × (1 − p) = m × (1 − p) + 0 × p, or p = m/(m + n).
5. An importer is offered a shipment of machines for $140,000. The probability that
he will sell them for $180,000, $170,000 (or) $150,000 are respectively 0.32,
0.55, and 0.13. What is his expected profit?
Solution. Expected profit
= 40,000 × 0.32 + 30,000 × 0.55 + 10,000 × 0.13
= $30,600
6. The manufacturer of a new battery additive has to decide whether to sell her
product for $0.80 a can, or for $1.20 a can with a ‘double your money back if not
satisfied’ guarantee. How does she feel about the chances that a person will ask
for double his/her money back if
(a) she decides to sell the product for $0.80
(b) she decides to sell the product for $1.20
(c) she can not make up her mind?
Solution. Let p be the probability that a person will ask for double his money back.
In the 1st case, she gets a fixed amount of $0.80 a can.
In the 2nd case, she expects to get, for each can, (1.20)(1 − p) + (−1.20)(p) = 1.20 − 2.40 p.
(a) happens if 0.80 > 1.20 − 2.40 p, i.e. if p > 1/6
(b) happens if p < 1/6
(c) happens if p = 1/6
7. A manufacturer buys an item for $1.20 and sells it for $4.50. The probabilities for
a demand of 0, 1, 2, 3, 4, “5 or more” items are 0.05, 0.15, 0.30, 0.25, 0.15, 0.10
respectively. How many items must he stock to maximize his expected profit?
No. of items stocked   No. sold (with prob.)                              Exp. profit
0                      0 (1.00)                                           0
1                      0 (0.05), 1 (0.95)                                 2.175
2                      0 (0.05), 1 (0.15), 2 (0.80)                       3.675
3                      0 (0.05), 1 (0.15), 2 (0.30), 3 (0.50)             3.825
4                                                                         2.85
5                                                                         0.525
6                                                                         0.45
Hence he must stock 3 items to maximize his expected profit.
8. A contractor has to choose between 2 jobs. The 1st job promises a profit of
$240,000 with probability 0.75 and a loss of $60,000 with probability 0.25. The
2nd job promises a profit of $360,000 with probability 0.5 and a loss of $90,000
with probability 0.5.
(a) Which job should the contractor choose to maximize his expected profit?
i. Exp. profit for job 1 = 240,000 × (3/4) − 60,000 × (1/4) = $165,000
ii. Exp. profit for job 2 = 360,000 × (1/2) − 90,000 × (1/2) = $135,000
Go in for job 1.
(b) What job would the contractor probably choose if her business is in bad
shape and she goes broke unless she makes a profit of $300,000 on her
next job?
Ans:- She takes job 2, as it is the only job that can give her a profit of at least $300,000.
RANDOM VARIABLES
Let E be a random experiment. A random variable (r.v) X is a function that associates to
each outcome s, a unique real number X (s).
Example 1
Let E be the random experiment of tossing a fair coin 3 times. We see that there are
2³ = 8 outcomes TTT, HTT, THT, TTH, HHT, HTH, THH, HHH, all of which are
equally likely. Let X be the random variable that ‘counts’ the number of heads obtained.
Thus X can take only 4 values: 0, 1, 2, 3. We note that
P(X = 0) = 1/8, P(X = 1) = 3/8, P(X = 2) = 3/8, P(X = 3) = 1/8.
This is called the probability distribution of the rv X. Thus the probability distribution of a rv X is the
listing of the probabilities with which X takes all its values.
Example 2
Let E be the random experiment of rolling a pair of balanced dice. There are 36 possible
equally likely outcomes, namely (1,1), (1,2), …, (6,6). Let X be the rv that gives the sum
of the two numbers on the top. Hence X takes 11 values, namely 2, 3, …, 12. We note that the
probability distribution of X is
P(X = 2) = P(X = 12) = 1/36, P(X = 3) = P(X = 11) = 2/36,
P(X = 4) = P(X = 10) = 3/36, P(X = 5) = P(X = 9) = 4/36,
P(X = 6) = P(X = 8) = 5/36, P(X = 7) = 6/36 = 1/6.
Example 3
Let E be the random experiment of rolling a die till a 6 appears on the top. Let X be the
no of rolls needed to get the “first” six. Thus X can take values 1,2,3…… Here X takes
an infinite number of values. So it is not possible to list all the probabilities with which X
takes its values. But we can give a formula.
P(X = x) = (5/6)^(x−1) (1/6),  x = 1, 2, …
(Justification: X = x means the first (x − 1) rolls each gave a number other than 6 and
the x-th roll gave the first 6. Hence
P(X = x) = (5/6) × (5/6) × … × (5/6) [(x − 1) times] × (1/6) = (5/6)^(x−1) (1/6).)
Discrete Random Variables
We say X is a discrete rv if it can take only a finite number of values (as in Examples 1 and 2
above) or a “countably” infinite number of values (as in Example 3).
On the other hand, the annual rainfall in a city, the lifelength of an electronic device, the
diameter of washers produced by a factory are all continuous random variables in the
sense they can take (theoretically at least) all values in an ‘interval’ of the x-axis. We
shall discuss continuous rvs a little later.
Probability distribution of a Discrete RV
Let X be a discrete rv with values x1, x2, …
Let f(xi) = P(X = xi), i = 1, 2, …
We say that {f(xi), i = 1, 2, …} is the probability distribution of the rv X.
Properties of the probability distribution
(i) f(xi) ≥ 0 for all i = 1, 2, …
(ii) Σ_i f(xi) = 1
The first condition follows from the fact that probability is always ≥ 0. The second
condition follows from the fact that the probability of the certain event is 1.
Example 4
Determine whether the following can be the probability distribution of a rv which can
take only 4 values 1,2,3 and 4.
(a) f(1) = 0.26, f(2) = 0.26, f(3) = 0.26, f(4) = 0.26.
No, as the sum of all the “probabilities” is greater than 1.
(b) f(1) = 0.15, f(2) = 0.28, f(3) = 0.29, f(4) = 0.28.
Yes, as these are all ≥ 0 and add up to 1.
(c) f(x) = (x + 1)/16, x = 1, 2, 3, 4.
No, as the sum of all the probabilities is less than 1.
Binomial Distribution
Let E be a random experiment having only 2 outcomes, say ‘success’ and ‘failure’.
Suppose that P(success) = p and so P(failure) = q (=1-p). Consider n independent
repetitions of E (This means the outcome in any one repetition is not dependent upon the
outcome in any other repetition). We also make the important assumption that P(success)
= p remains the same for all such independent repetitions of E. Let X be the rv that
’counts’ the number of successes obtained in n such independent repetitions of E. Clearly
X is a discrete rv that can take n + 1 values, namely 0, 1, 2, …, n. We note that there are
2^n outcomes, each of which is a ‘string’ of n letters, each letter being an S or an F (if n = 3, they
are FFF, SFF, FSF, FFS, SSF, SFS, FSS, SSS). X = x means that in such an outcome there are
x successes and (n − x) failures in some order. One such outcome is SS…S FF…F
(x S’s followed by (n − x) F’s). Since all the repetitions are independent, the probability of this
outcome is p^x q^(n−x). Exactly the same probability is associated with any other outcome for
which X = x. But x successes can occur out of n repetitions in nCx mutually exclusive ways. Hence
P(X = x) = nCx p^x q^(n−x),  x = 0, 1, …, n.
We say X has a Binomial distribution with parameters n (≡ the number of repetitions)
and p (the probability of success in any one repetition).
We denote P(X = x) by b(x; n, p) to show its dependence on x, n and p. The letter ‘b’
stands for binomial.
Since the above (n + 1) probabilities are the (n + 1) terms in the expansion of the
binomial (q + p)^n, X is said to have a binomial distribution. We at once see that the sum
of all the binomial probabilities = (q + p)^n = 1^n = 1.
The independent repetitions are usually referred to as “Bernoulli” trials. We note that
b(x; n, p) = b(n − x; n, q)
(LHS = probability of getting x successes in n Bernoulli trials = probability of getting n − x failures in
n Bernoulli trials = RHS.)
Cumulative Binomial Probabilities
Let X have a binomial distribution with parameters n and p.
P(X ≤ x) = P(X = 0) + P(X = 1) + … + P(X = x) = Σ_{k=0}^{x} b(k; n, p)
is denoted by B(x; n, p) and is called the cumulative Binomial distribution function. This
is tabulated in Table 1 of your text book. We note that
b(x; n, p) = P(X = x) = P(X ≤ x) − P(X ≤ x − 1) = B(x; n, p) − B(x − 1; n, p).
Thus b(9; 12, 0.60) = B(9; 12, 0.60) − B(8; 12, 0.60)
= 0.9166 − 0.7747 = 0.1419.
(You can verify this by directly calculating b(9; 12, 0.60).)
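(The binomial probabilities b(x; n, p) and B(x; n, p) can also be computed directly; a short Python sketch with our own function names:)

from math import comb

def b(x, n, p):          # binomial probability b(x; n, p)
    return comb(n, x) * p**x * (1 - p)**(n - x)

def B(x, n, p):          # cumulative binomial probability B(x; n, p)
    return sum(b(k, n, p) for k in range(x + 1))

print(b(9, 12, 0.60))                      # about 0.1419
print(B(9, 12, 0.60) - B(8, 12, 0.60))     # the same value, as claimed above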
Example 5 (Exercise 4.15 of your book)
During one stage in the manufacture of integrated circuit chips, a coating must be
applied. If 70% of the chips receive a thick enough coating find the probability that
among 15 chips.
(a) At least 12 will have thick enough coatings.
(b) At most 6 will have thick enough coatings.
(c) Exactly 10 will have thick enough coatings.
Solution
Among 15 chips, let X be the number of chips that will have thick enough coatings.
Hence X is a rv having Binomial distribution with parameters n =15 and p = 0.70.
(a) P(X ≥ 12) = 1 − P(X ≤ 11) = 1 − B(11; 15, 0.70) = 1 − 0.7031 = 0.2969
(b) P(X ≤ 6) = B(6; 15, 0.70) = 0.0152
(c) P(X = 10) = B(10; 15, 0.70) − B(9; 15, 0.70) = 0.4849 − 0.2784 = 0.2065
Example 6 (Exercise 4.19 of your text book)
A food processor claims that at most 10% of her jars of instant coffee contain less coffee
than printed on the label. To test this claim, 16 jars are randomly selected and contents
weighed. Her claim is accepted if fewer than 3 of the 16 jars contain less coffee (note that
10% of 16 = 1.6 and rounds to 2). Find the probability that the food processor’s claim
will be accepted if the actual percent of the jars containing less coffee is
(a) 5% (b) 10% (c) 15% (d) 20%
Solution:
Let X be the number of jars (among the 16 randomly chosen) that contain less coffee than printed
on the label. Thus X is a random variable having a Binomial distribution with parameters n = 16 and
p (the probability of “success” = the probability that a jar chosen at random has less coffee).
(a) Here p = 5% = 0.05. Hence P(claim is accepted) = P(X ≤ 2) = B(2; 16, 0.05) = 0.9571.
(b) Here p = 10% = 0.10. Hence P(claim is accepted) = B(2; 16, 0.10) = 0.7892.
(c) Here p = 15% = 0.15. Hence P(claim is accepted) = B(2; 16, 0.15) = 0.5614.
(d) Here p = 20% = 0.20. Hence P(claim is accepted) = B(2; 16, 0.20) = 0.3518.
Binomial Distribution – Sampling with replacement
Suppose there is an urn containing 10 marbles of which 4 are white and the rest are black.
Suppose 5 marbles are chosen with replacement. Let X be the rv that counts the no of
white marbles drawn. Thus X = 0,1,2,3,4 or 5 (Remember that we replace each marble in
the urn before drawing the next one. Hence we can draw 5 white marbles)
P(“Success”) = P(drawing a white marble in any one of the 5 draws) = 4/10 (remember
we draw with replacement).
Thus X has a Binomial distribution with parameters n = 5 and p = 4/10.
Hence P(X = x) = b(x; 5, 4/10).
Mode of a Binomial distribution
We say x0 is the mode of the Binomial distribution with parameters n and p if
P(X = x0) is the greatest. From the binomial tables given in the book we can easily see
that when n = 10, p = 1/2, P(X = 5) is the greatest, i.e. 5 is the mode.
Fact
b(x + 1; n, p) / b(x; n, p) = [(n − x)/(x + 1)] × [p/(1 − p)],
which is > 1 if x < np − (1 − p), = 1 if x = np − (1 − p), and < 1 if x > np − (1 − p).
Thus so long as x < np − (1 − p) the binomial probabilities increase, and if x > np − (1 − p) they
decrease. Hence if np − (1 − p) = x0 is an integer, the modes are x0 and x0 + 1. If np − (1 − p)
is not an integer and x0 = the smallest integer ≥ np − (1 − p), the mode is x0.
Hypergeometric Distribution (Sampling without replacement)
An urn contains 10 marbles of which 4 are white. 5 marbles are chosen at random
without replacement. Let X be the rv that counts the number of white marbles drawn.
Thus X can take 5 values, namely 0, 1, 2, 3, 4. What is P(X = x)? Now out of 10 marbles, 5
can be chosen in 10C5 equally likely ways, out of which there will be 4Cx × 6C(5−x) ways of
drawing x white marbles (and so 5 − x red marbles). (Reason: out of the 4 white marbles, x can
be chosen in 4Cx ways, and out of the 6 red marbles, 5 − x can be chosen in 6C(5−x) ways.)
Hence P(X = x) = [4Cx × 6C(5−x)] / 10C5,  x = 0, 1, 2, 3, 4.
We generalize the above result.
A box contains N marbles out of which a are white. n marbles are chosen without
replacement. Let X be the random variable that counts the number of white marbles
drawn. X can take the values 0,1,2……. n.
33
( ) ....2,1,0=
−
−
== x
n
N
xn
aN
x
a
aXP n
(Note x must be less than or equal to a and n-x must be less than or equal to N-a)
We say the rv X has a hypergeometric distribution with parameters n,a and N. We denote
P(X=x) by h (x;n,a,N).
Example 7 (Exercise 4.22 of your text book)
Among the 12 solar collectors on display, 9 are flat plate collectors and the other three
are concentrating collectors. If a person chooses at random 4 collectors, find the probability that
3 are flat plate ones.
Ans: h(3; 4, 9, 12) = [9C3 × 3C1] / 12C4
Example 8 (Exercise 4.24 of your text book)
If 6 of 18 new buildings in a city violate the building code, what is the probability that a
building inspector, who randomly selects 4 of the new buildings for inspection, will catch
(a) None of the new buildings that violate the building code
Ans: h(0; 4, 6, 18) = 12C4 / 18C4
(b) One of the new buildings that violate the building code
Ans: h(1; 4, 6, 18) = [6C1 × 12C3] / 18C4
(c) Two of the new buildings that violate the building code
Ans: h(2; 4, 6, 18) = [6C2 × 12C2] / 18C4
(d) At least three of the new buildings that violate the building code
Ans: h(3; 4, 6, 18) + h(4; 4, 6, 18)
(Note: We choose 4 buildings out of 18 without replacement. Hence hypergeometric
distribution is appropriate)
Binomial distribution as an approximation to the Hypergeometric Distribution
We can show that h(x; n, a, N) → b(x; n, p) as N → ∞, where p = a/N = probability of a
“success”. Hence if N is large, the hypergeometric probability h(x; n, a, N) can be approximated by
the binomial probability b(x; n, p), where p = a/N.
Example 9 (exercise 4.26 of your text)
A shipment of 120 burglar alarms contains 5 that are defective. If 3 of these alarms are
randomly selected and shipped to a customer, find the probability that the customer will
get one defective alarm.
(a) By using the hypergeometric distribution
(b) By approximating the hypergeometric probability by a binomial probability.
Solution
Here N = 120 (large!), a = 5, n = 3, x = 1.
(a) Reqd prob = h(1; 3, 5, 120) = [5C1 × 115C2] / 120C3 = (5 × 6555)/280840 = 0.1167
(b) h(1; 3, 5, 120) ≈ b(1; 3, 5/120) = 3C1 (5/120) (1 − 5/120)² = 0.1148
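(The comparison in Example 9 can be reproduced with a short Python sketch; the function names are our own.)

from math import comb

def h(x, n, a, N):       # hypergeometric probability h(x; n, a, N)
    return comb(a, x) * comb(N - a, n - x) / comb(N, n)

def b(x, n, p):          # binomial probability b(x; n, p)
    return comb(n, x) * p**x * (1 - p)**(n - x)

print(h(1, 3, 5, 120))   # about 0.1167
print(b(1, 3, 5 / 120))  # about 0.1148, the binomial approximation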
Example 10 (Exercise 4.27 of your text)
Among the 300 employees of a company, 240 are union members, while the others are
not. If 8 of the employees are chosen by lot to serve on the committee which
administrates the provident fund, find the prob that 5 of them will be union members
while the others are not.
(a) Using the hypergeometric distribution
(b) Using the binomial approximation
Solution
Here N = 300, a = 240, n = 8, x = 5.
(a) h(5; 8, 240, 300)
(b) ≈ b(5; 8, 240/300)
THE MEAN AND VARIANCE OF PROBABILITY DISTRIBUTIONS
We know that the equation of a line can be written as y = mx + c. Here m is the slope and
c is the y intercept. Different m,c give different lines. Thus m and c characterize a line.
Similarly we define certain numbers that characterize a probability distribution.
The mean of a probability distribution is simply the mathematical expectation of the
corresponding rv. If a rv X takes on the values x1, x2, … with probabilities
f(x1), f(x2), …, its mathematical expectation or expected value is
Σ_i xi f(xi) = x1 f(x1) + x2 f(x2) + …  (i.e. Σ value × probability).
We use the symbol µ to denote the mean of X.
Thus µ = E(X) = Σ xi P(X = xi)  (summation over all xi in the range of X).
Example 11
Suppose X is a rv having the probability distribution
X:     1    2    3
Prob: 1/2  1/3  1/6
Hence the mean µ of the probability distribution (of X) is
µ = 1 × 1/2 + 2 × 1/3 + 3 × 1/6 = 5/3.
Example 12
Let X be the rv having the distribution
X:    0  1
Prob: q  p
where q = 1 − p. Thus µ = 0 × q + 1 × p = p.
The mean of a Binomial Distribution
Suppose X is a rv having a Binomial distribution with parameters n and p. Then the
mean of X = µ = np.
(Read the proof on pages 107-108 of your text book.)
The mean of a Hypergeometric Distribution
If X is a rv having a hypergeometric distribution with parameters n, a, N, then µ = n a/N.
Digression
The mean of a rv X gives the “average” of the values taken by the rv X. Thus, saying that the
average mark in a test is 40 means the students would have got marks less than 40
and greater than 40, but they average out to 40. But we do not get an idea about the
spread (≡ deviation from the mean) of the marks. This spread is measured by the
variance: informally speaking, the average of the squares of the deviations from the mean.
The variance of the probability distribution of X is defined as the expected value of (X − µ)².
Variance of X = σ² = Σ_{xi ∈ R_X} (xi − µ)² P(X = xi)
Note that the RHS is always ≥ 0 (as it is a sum of non-negative numbers).
The positive square root σ of σ² is called the standard deviation of X and has the
same units as X and µ.
Example 13
For the rv X having the probability distribution given in Example 11, the variance is
σ² = (1 − 5/3)² × 1/2 + (2 − 5/3)² × 1/3 + (3 − 5/3)² × 1/6
   = (4/9) × 1/2 + (1/9) × 1/3 + (16/9) × 1/6 = 5/9.
We could also have used the equivalent formula σ² = E(X²) − µ².
Here E(X²) = 1² × 1/2 + 2² × 1/3 + 3² × 1/6 = 60/18 = 10/3.
∴ σ² = 10/3 − 25/9 = 5/9.
Example 14
For the probability distribution of Example 12,
E(X²) = 0² × q + 1² × p = p
∴ σ² = p − p² = p(1 − p) = pq.
Variance of the Binomial Distribution
σ² = npq
Variance of the Hypergeometric Distribution
σ² = n × (a/N) × (1 − a/N) × (N − n)/(N − 1).
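(The mean and variance of a discrete distribution are easy to compute directly; the Python sketch below checks Examples 11 and 13.)

values = [1, 2, 3]
probs = [1/2, 1/3, 1/6]
mu = sum(x * p for x, p in zip(values, probs))
var = sum((x - mu)**2 * p for x, p in zip(values, probs))
print(mu, var)           # 1.666... (= 5/3) and 0.555... (= 5/9)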
CHEBYSHEV’S THEOREM
Suppose X is a rv with mean µ and variance σ². Chebyshev’s theorem states that: if k
is a constant > 0,
P(|X − µ| ≥ kσ) ≤ 1/k².
In words, the probability of getting a value which deviates from its mean µ by at least kσ is at
most 1/k².
Note: Chebyshev’s theorem gives us an upper bound on the probability of an event. Mostly it is
of theoretical interest.
Example 15 (Exercise 4.44 of your text)
In one out of 6 cases, material for bullet proof vests fails to meet puncture standards. If
405 specimens are tested, what does Chebyshev theorem tell us about the prob of getting
at most 30 or at least 105 cases that do not meet puncture standards?
Here µ = np = 405 × 1/6 = 135/2
σ² = npq = 405 × (1/6) × (5/6), ∴ σ = 15/2
Let X = number of cases out of 405 that do not meet puncture standards.
Reqd: P(X ≤ 30 or X ≥ 105).
Now X ≤ 30 ⟹ X − µ ≤ −75/2, and X ≥ 105 ⟹ X − µ ≥ 75/2.
Thus X ≤ 30 or X ≥ 105 ⟹ |X − µ| ≥ 75/2 = 5σ.
∴ P(X ≤ 30 or X ≥ 105) ≤ P(|X − µ| ≥ 5σ) ≤ 1/5² = 1/25 = 0.04.
Example 16 (Exercise 4.46 of your text)
How many times do we have to flip a balanced coin to be able to assert, with a probability of at
most 0.01, that the difference between the proportion of tails and 0.50 will be at least
0.04?
Solution:
Suppose we flip the coin n times and suppose X is the number of tails obtained. Thus the
proportion of tails = X/n = (No. of tails)/(Total no. of flips). We must find n so that
P(|X/n − 0.50| ≥ 0.04) ≤ 0.01.
Now X = number of tails among n flips of a balanced coin is a rv having a Binomial distribution
with parameters n and 0.5.
Hence µ = E(X) = np = 0.50 n
σ = √(npq) = 0.50 √n  (as p = q = 0.50)
Now |X/n − 0.50| ≥ 0.04 is equivalent to |X − 0.50 n| ≥ 0.04 n.
We know P(|X − µ| ≥ kσ) ≤ 1/k².
Here kσ = 0.04 n, so k = 0.04 n/(0.50 √n) = 0.08 √n.
∴ P(|X/n − 0.50| ≥ 0.04) = P(|X − µ| ≥ kσ) ≤ 1/k² ≤ 0.01
if k² ≥ 100, i.e. if 0.08 √n ≥ 10, or n ≥ 100/(0.08)² = 15625.
Law of Large Numbers
Suppose a factory manufactures items and there is a constant probability p that an item is
defective. Suppose we choose n items at random and let X be the number of defectives found.
Then X is a rv having a binomial distribution with parameters n and p.
∴ mean µ = E(X) = np, variance σ² = npq.
Let ε be any number > 0. Now
P(|X/n − p| ≥ ε) = P(|X − np| ≥ nε) = P(|X − µ| ≥ kσ)  (where kσ = nε)
≤ 1/k²  (by Chebyshev’s theorem) = σ²/(n²ε²) = npq/(n²ε²) = pq/(nε²) → 0 as n → ∞.
Thus the probability that the proportion X/n of defective items differs from the actual probability p
by at least any positive number ε tends to 0 as n → ∞. (This is called the Law of Large Numbers.)
This means “most of the time” the proportion of defectives will be close to the actual
(unknown) probability p that an item is defective, for large n. So we can estimate p by X/n, the
(sample) proportion of defectives.
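(The law of large numbers can be seen in a small simulation; the Python sketch below uses an assumed defect probability p = 0.1 of our own choosing.)

import random

p = 0.1                                       # assumed (in practice unknown) probability of a defective item
for n in (100, 10_000, 1_000_000):
    defectives = sum(random.random() < p for _ in range(n))
    print(n, defectives / n)                  # the sample proportion X/n settles near p as n grows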
POISSON DISTRIBUTION
A random variable X is said to have a Poisson distribution with parameter λ > 0 if its
probability distribution is given by
P(X = x) = f(x; λ) = e^(−λ) λ^x / x!,  x = 0, 1, 2, …
We can easily show: mean of X = µ = λ and variance of X = σ² = λ.
Also, P(X = x) is largest when x = λ − 1 and x = λ if λ is an integer, and when x = [λ] = the
greatest integer ≤ λ (when λ is not an integer). Also note that P(X = x) → 0 as x → ∞.
POISSON APPROXIMATION TO BINOMIAL DISTRIBUTION
Suppose X is a rv having a Binomial distribution with parameters n and p. We can easily
show that b(x; n, p) = P(X = x) → f(x; λ) as n → ∞ (and p → 0) in such a way that np remains a constant
λ.
Hence for n large and p small, the binomial probability b(x; n, p) can be approximated by the
Poisson probability f(x; λ), where λ = np.
Example 17
b(3; 100, 0.03) ≈ f(3; 3) = e^(−3) 3³/3!
Example 18 (Exercise 4.54 of your text)
If 0.8% of the fuses delivered to an arsenal are defective, use the Poisson approximation
to determine the probability that 4 fuses will be defective in a random sample of 400.
Solution
If X is the number of defectives in a sample of 400, X has the binomial distribution with
parameters n = 400 and p = 0.8% = 0.008.
Thus P(4 out of 400 are defective)
= b(4; 400, 0.008) ≈ f(4; λ), where λ = 400 × 0.008 = 3.2,
= e^(−3.2) (3.2)⁴/4! = 0.781 − 0.603 = 0.178 (from Table 2 at the end of the text).
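(The Poisson approximation of Example 18 can be checked directly; a short Python sketch with our own function names:)

from math import comb, exp, factorial

def b(x, n, p):          # exact binomial probability
    return comb(n, x) * p**x * (1 - p)**(n - x)

def f(x, lam):           # Poisson probability f(x; lambda)
    return exp(-lam) * lam**x / factorial(x)

print(b(4, 400, 0.008))  # about 0.179 (exact)
print(f(4, 3.2))         # about 0.178 (Poisson approximation)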
Cumulative Poisson Distribution Function
If X is a rv having a Poisson distribution with parameter λ, the cumulative Poisson probability is
F(x; λ) = P(X ≤ x) = Σ_{k=0}^{x} P(X = k) = Σ_{k=0}^{x} f(k; λ).
For various x and λ, F(x; λ) has been tabulated in Table 2 (of your text book, pages 581
to 585). We use Table 2 as follows:
f(x; λ) = P(X = x) = P(X ≤ x) − P(X ≤ x − 1) = F(x; λ) − F(x − 1; λ).
Thus f(4; 3.2) = F(4; 3.2) − F(3; 3.2) = 0.781 − 0.603 = 0.178.
Poisson Process
There are many situations in which events occur randomly in regular intervals of time.
For example in a time period t, let tX be the number of accidents at a busy road junction
in New Delhi; tX be the number of calls received at a telephone exchange; tX be the
number of radio active particles emitted by a radioactive source etc. In all such examples
we find tX is a discrete rv which can take non-ve integral values 0,1,2,….. The important
thing to note is that all such random variables have “same” distribution except that the
parameter(s) depend on time t.
The collection of random variables {Xt, t > 0} is said to constitute a random process. If
each Xt has a Poisson distribution, we say {Xt} is a Poisson process. We now show that the
rvs Xt which count the number of occurrences of a random phenomenon in a time
period t constitute a Poisson process under suitable assumptions. Suppose that in a time
period t, a random phenomenon which we call a “success” occurs. We let Xt = number of
successes in time period t. We assume:
1. In a small time period ∆t, either no success or one success occurs.
2. The probability of a success in a small time period ∆t is proportional to ∆t, i.e. say
P(X_∆t = 1) = α ∆t. (α → constant of proportionality)
3. The probability of a success during any time period does not depend on what
happened prior to that period.
Divide the time period t into n small time periods, each of length ∆t. Hence by the
assumptions above, we note that Xt = number of successes in time period t is a rv having a
Binomial distribution with parameters n and p = α ∆t. Hence
P(Xt = x) = b(x; n, α ∆t) → f(x; λ) as n → ∞, where λ = α n ∆t = α t.
So we can say that Xt = number of successes in time period t is a rv having a Poisson
distribution with parameter αt.
Meaning of the proportionality constant α
Since the mean of Xt is λ = αt, we find α = mean number of successes in unit time.
(Note: For a more rigorous derivation of the distribution of Xt, you may see Meyer,
Introductory Probability and Statistical Applications, pages 165-169.)
Example 19 (Exercise 4.56 of your text)
Given that the switch board of a consultant’s office receives on the average 0.6 call per
minute, find the probability that
(a) In a given minute there will be at least one call.
(b) In a 4-minute interval, there will be at least 3 calls.
Solution
Xt = number of calls in a t-minute interval is a rv having a Poisson distribution with parameter
αt = 0.6 t.
(a) P(X1 ≥ 1) = 1 − P(X1 = 0) = 1 − e^(−0.6) = 1 − 0.549 = 0.451.
(b) P(X4 ≥ 3) = 1 − P(X4 ≤ 2) = 1 − F(2; 2.4) = 1 − 0.570 = 0.430.
Example 20
Suppose that Xt, the number of particles emitted in t hours from a radio – active source
has a Poisson distribution with parameter 20t. What is the probability that exactly 5
particles are emitted during a 15 minute period?
Solution
15 minutes = 1/4 hour.
Hence, if X_(1/4) = number of particles emitted in 1/4 hour,
P(X_(1/4) = 5) = e^(−(1/4)×20) ((1/4)×20)⁵/5! = e^(−5) 5⁵/5! = 0.616 − 0.440 = 0.176 (from Table 2).
THE GEOMETRIC DISTRIBUTION
Suppose there is a random experiment having only two possible outcomes, called
‘success’ and ‘failure’. Assume that the prob of a success in any one ‘trial’ (≡repetition
of the experiment) is p and remains the same for all trials. Also assume the trials are
independent. The experiment is repeated till a success is got. Let X be the rv that counts
the number of trials needed to get the 1st success. Clearly X = x if the first (x − 1) trials
were failures and the x-th trial gave the first success. Hence
P(X = x) = g(x; p) = (1 − p)^(x−1) p = q^(x−1) p,  x = 1, 2, ……
We say X has a geometric distribution with parameter p (as the respective probabilities
form a geometric progression with common ratio q).
We can show that the mean of this distribution is µ = 1/p and the variance is σ² = q/p².
(For example, suppose a die is rolled till a 6 is got. It is reasonable to expect that on average
we will need 1/(1/6) = 6 rolls, as there are 6 numbers!)
Example 21 (Exercise 4.60 of your text)
An expert hits a target 95% of the time. What is the probability that the expert will miss
the target for the first time on the fifteenth shot?
Solution
Here ‘Success’ means the expert misses the target. Hence p = P(Success) = 5% = 0.05. If
X is the rv that counts the number of shots needed to get ‘a success’, we want
P(X = 15) = q^14 × p = (0.95)^14 × 0.05.
Example 22
The probability of a successful rocket launching is 0.8. Launching attempts are made till
a successful launching has occurred. Find the probability that exactly 6 attempts will be
necessary.
Solution: (0.2)⁵ × 0.8
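(The geometric probabilities in Examples 21 and 22 can be evaluated with a short Python sketch; the function name is our own.)

def g(x, p):             # probability that the first 'success' occurs on trial x
    return (1 - p)**(x - 1) * p

print(g(15, 0.05))       # Example 21: about 0.0244
print(g(6, 0.8))         # Example 22: 0.000256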
Example 23
X has a geometric distribution with parameter p. Show that
(a) P(X ≥ r) = q^(r−1),  r = 1, 2, …
(b) P(X ≥ s + t | X > s) = P(X ≥ t)
Solution
(a) P(X ≥ r) = Σ_{x=r}^{∞} q^(x−1) p = p q^(r−1)/(1 − q) = q^(r−1).
(b) P(X ≥ s + t | X > s) = P(X ≥ s + t)/P(X > s) = q^(s+t−1)/q^s = q^(t−1) = P(X ≥ t).
Application to Queuing Systems
[Diagram: customers arrive at a service facility S in a Poisson fashion and depart after service.]
There is a service facility. Customers arrive in a random fashion and get service if the
server is idle. Else they stand in a Queue and wait to get service.
Examples of Queuing systems
1. Cars arriving at a petrol pump to get petrol
2. Men arriving at a Barber’s shop to get hair cut.
3. Ships arriving at a port to deliver goods.
Questions that one can ask are :
1. At any point of time on an average how many customers are in the system
(getting service and waiting to get service)?
2. What is the mean time a customer waits in the system?
3. What proportion of time a server is idle? And so on.
We shall consider only the simplest queueing system where there is only one server. We
assume that the population of customers is infinite and that there is no limit on the
number of customers that can wait in the queue.
We also assume that the customers arrive in a ‘Poisson fashion’ at the mean rate α.
This means that Xt, the number of customers that arrive in a time period t, is a rv having a
Poisson distribution with parameter αt. We also assume that, so long as the service
station is not empty, customers depart in a Poisson fashion at a mean rate β. This
means that, when there is at least one customer, Yt, the number of customers that depart
(after getting service) in a time period t, is a rv having a Poisson distribution with
parameter βt (where β > α).
Further assumptions are: in a small time interval ∆t, there will be a single arrival or a
single departure but not both. (Note that by the assumptions of a Poisson process, in a small
time interval ∆t there can be at most one arrival and at most one departure.) Let Nt be the
number of customers in the system at time t, and let P(Nt = n) = pn(t). We make another
assumption:
pn(t) → πn as t → ∞. {πn} is known as the steady state probability distribution of the
number of customers in the system. It can be shown that
π0 = 1 − α/β,  πn = (α/β)^n (1 − α/β),  n = 0, 1, 2, …
Thus L = mean number of customers in the system (getting service and waiting to get service)
= Σ_{n=0}^{∞} n πn = α/(β − α)
Lq = mean number of customers in the queue (waiting to get service)
= Σ_{n=1}^{∞} (n − 1) πn = α²/[β(β − α)] = L − α/β
W = mean time a customer spends in the system = 1/(β − α) = L/α
Wq = mean time a customer spends in the queue = α/[β(β − α)] = W − 1/β = Lq/α
(For a derivation of these results, see Operations Research Vol. 3 by Dr. S.
Venkateswaran and Dr. B Singh, EDD Notes of BITS, Pilani).
Example 24 (Exercise 4.64 of your text)
Trucks arrive at a receiving dock in a Poisson fashion at a mean rate of 2 per hour. The
trucks can be unloaded at a mean rate of 3 per hour in a Poisson fashion (so long as the
receiving dock is not empty).
(a) What is the average number of trucks being unloaded and waiting to get
unloaded?
(b) What is the mean no of trucks in the queue?
(c) What is the mean time a truck spends waiting in the queue?
(d) What is the prob that there are no trucks waiting to be unloaded?
(e) What is the prob that an arriving truck need not wait to get unloaded?
Solution
Here α = arrival rate = 2 per hour, β = unloading (service) rate = 3 per hour.
Thus
(a) L = α/(β − α) = 2/(3 − 2) = 2
(b) Lq = α²/[β(β − α)] = 2²/[3(3 − 2)] = 4/3
(c) Wq = α/[β(β − α)] = 2/3 hr
(d) P(no trucks are waiting to be unloaded) = P(number of trucks in the dock is 0 or 1)
= π0 + π1 = (1 − 2/3) + (2/3)(1 − 2/3) = 1/3 + 2/9 = 5/9
(e) P(an arriving truck need not wait) = P(dock is empty) = π0 = 1/3
Example 25
With reference to example 24, suppose that the cost of keeping a truck in the system is
Rs. 15/hour. If it were possible to increase the mean loading rate to 3.5 trucks per hour at
a cost of Rs. 12 per hour, would this be worth while?
Solution
In the old scheme, α = 2, β = 3, L = 2.
∴ Mean cost per hour to the dock = 2 × 15 = Rs. 30/hr.
In the new scheme, α = 2, β = 3.5, so L = 2/(3.5 − 2) = 4/3 (verify!).
∴ Net cost per hour to the dock = (4/3) × 15 + 12 = Rs. 32/hr.
Hence it is not worthwhile to go in for the new scheme.
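(The queueing formulas used in Examples 24 and 25 are easy to package; a Python sketch with our own function name, assuming β > α:)

def queue_metrics(alpha, beta):
    L = alpha / (beta - alpha)                  # mean number of customers in the system
    Lq = alpha**2 / (beta * (beta - alpha))     # mean number in the queue
    W = 1 / (beta - alpha)                      # mean time in the system
    Wq = alpha / (beta * (beta - alpha))        # mean time in the queue
    return L, Lq, W, Wq

print(queue_metrics(2, 3))      # (2.0, 1.33..., 1.0, 0.66...) as in Example 24
print(queue_metrics(2, 3.5))    # L = 1.33..., used in Example 25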
MULTINOMIAL DISTRIBUTION
Consider a random experiment E and suppose it has k possible outcomes A1, A2, …, Ak.
Suppose P(Ai) = pi for all i and that pi remains the same for all independent repetitions
of E. Consider n independent repetitions of E. Suppose A1 occurs X1 times, A2 occurs X2
times, …, Ak occurs Xk times. Then
P(X1 = x1, X2 = x2, …, Xk = xk) = [n!/(x1! x2! … xk!)] p1^x1 p2^x2 … pk^xk
for all non-negative integers x1, x2, …, xk with x1 + x2 + … + xk = n.
Proof. The probability of getting A1 x1 times, A2 x2 times, …, Ak xk times in any one particular
order is p1^x1 p2^x2 … pk^xk, as all the repetitions are independent. Now among the n repetitions,
A1 can occur x1 times in nCx1 = n!/[x1!(n − x1)!] ways.
From the remaining n − x1 repetitions, A2 can occur x2 times in
(n − x1)Cx2 = (n − x1)!/[x2!(n − x1 − x2)!] ways, and so on.
Hence the total number of ways of getting A1 x1 times, A2 x2 times, …, Ak xk times will be
n!/[x1!(n − x1)!] × (n − x1)!/[x2!(n − x1 − x2)!] × … × (n − x1 − … − x(k−1))!/[xk! 0!]
= n!/(x1! x2! … xk!)  (as x1 + x2 + … + xk = n and 0! = 1).
Hence P(X1 = x1, X2 = x2, …, Xk = xk) = [n!/(x1! x2! … xk!)] p1^x1 p2^x2 … pk^xk.
Example 26
A die is rolled 30 times. Find the probability of getting 1 two times, 2 three times, 3 four times,
4 six times, 5 seven times and 6 eight times.
Ans: [30!/(2! 3! 4! 6! 7! 8!)] × (1/6)² (1/6)³ (1/6)⁴ (1/6)⁶ (1/6)⁷ (1/6)⁸
Example 27 (See exercise 4.72 of your text)
The probabilities are, respectively, 0.40, 0.40, and 0.20 that in city driving a certain type
of imported car will average less than 10 kms per litre, anywhere between 10 and 15 kms
per litre, or more than 15 kms per litre. Find the probability that among 12 such cars
tested, 4 will average less than 10 kms per litre, 6 will average anywhere from 10 to 15
kms per litre and 2 will average more than 15 kms per litre.
Solution
( ) ( ) ( ) .20.40.40.
!2!6!4
!12 264
Remark
1. Note that the different probabilities are the various terms in the expansion of the
multinomial
( )n
kppp ......21 ++ .
Hence the name multinomial distribution.
2. The binomial distribution is a special case got by taking k =2.
3. For any fixed ( ) iXkii ≤≤1 (the number of ways of getting iA ) is a random
variable having binomial distribution with parameters n and pi. Thus
( ) ii pnXE = and ( ) ( ) ...k1,2.......i.p1npXV iii =−=
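The multinomial formula above is easy to evaluate directly; the following minimal Python sketch (an illustration, not part of the text) reproduces the Example 27 probability using only the standard library. The helper name multinomial_pmf is mine.

```python
from math import factorial

def multinomial_pmf(counts, probs):
    n = sum(counts)
    coeff = factorial(n)
    for x in counts:
        coeff //= factorial(x)          # n! / (x1! x2! ... xk!)
    p = 1.0
    for x, pi in zip(counts, probs):
        p *= pi ** x                    # p1^x1 * p2^x2 * ... * pk^xk
    return coeff * p

# Example 27: n = 12 cars, probabilities 0.40, 0.40, 0.20, counts 4, 6, 2
print(multinomial_pmf([4, 6, 2], [0.40, 0.40, 0.20]))   # about 0.058
```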
SIMULATION
Nowadays simulation techniques are being applied to many problems in Science and
Engineering. If the processes being simulated involve an element of chance, these
techniques are referred to as Monte Carlo methods. For example, to study the distribution
of the number of calls arriving at a telephone exchange, we can use simulation techniques.
Random Numbers: In simulation problems one uses tables of random numbers to
"generate" random deviates (values assumed by a random variable). A table of random
numbers consists of many pages on which the digits 0, 1, 2, …, 9 are distributed in such a
way that the probability of any one digit appearing is the same, namely $\frac{1}{10} = 0.1$.
Use of random numbers to generate 'heads' and 'tails': For example, choose the 4th
column of the fourth page of Table 7, start at the top and go down the page. Thus we get
6, 2, 7, 5, 5, 0, 1, 8, 6, 3, …. Now we can interpret this as H, H, T, T, T, H, T, H, H, T, because the
probability of getting an odd number = the probability of getting an even number = 0.5. Thus we associate
a head with the occurrence of an even number and a tail with that of an odd number. We can also
associate a head if we get 5, 6, 7, 8, or 9 and a tail otherwise. Then we would say we got
H, T, H, H, H, T, T, H, H, T, …. In problems on simulation we shall adopt the second scheme
as it is easy to use and is easily 'extendable' to more than two outcomes. Suppose, for
example, we have an experiment having 4 outcomes with probabilities 0.1, 0.2, 0.3 and 0.4
respectively.
To simulate the above experiment, we have to allot one of the 10 digits 0, 1, …, 9 to
the first outcome, two of them to the second outcome, three of them to the third outcome
and the remaining four to the fourth outcome. Though this can be done in a variety of
ways, we choose the simplest way as follows:
Associate the first digit 0 to the 1st outcome $O_1$,
associate the next 2 digits 1, 2 to the 2nd outcome $O_2$,
associate the next 3 digits 3, 4, 5 to the 3rd outcome $O_3$,
and associate the last 4 digits 6, 7, 8, 9 to the 4th outcome $O_4$.
Hence the above sequence 6, 2, 7, 5, 5, 0, 1, 8, 6, 3, … of random numbers would correspond to
the sequence of outcomes $O_4, O_2, O_4, O_3, O_3, O_1, O_2, O_4, O_4, O_3, \ldots$
Using two and higher digit Random numbers in Simulation
Suppose we have a random experiment with three outcomes with probabilities 0.80, 0.15
and 0.05 respectively. How can we now use the table of random numbers to simulate this
experiment? We now read 2 digits at a time: say (starting from page 593, row 12,
column 4) 84, 71, 14, 24, 20, 31, 78, 03, …. Since P(any one digit) = $\frac{1}{10}$, P(any two
digits) = $\frac{1}{10} \times \frac{1}{10} = 0.01$. Thus each 2-digit random number occurs with probability 0.01.
Note that there are 100 two-digit random numbers: 00, 01, …, 10, 11, …, 20, 21, …,
98, 99. Thus we associate the first 80 numbers (00, 01, …, 79) with the first outcome, the next
15 numbers (80, 81, …, 94) with the second outcome and the last 5 numbers (95, 96, …, 99)
with the 3rd outcome. Thus the above sequence of 2-digit random numbers would simulate
the outcomes:
$O_2, O_1, O_1, O_1, O_1, O_1, O_1, O_1, \ldots$
We describe the above scheme in a table as follows:

Outcome   Probability   Cumulative Probability*   Random Numbers**
O1        0.80          0.80                      00-79
O2        0.15          0.95                      80-94
O3        0.05          1.00                      95-99

* Cumulative prob is got by adding all the probabilities at that position and above; thus the cumulative
prob at O2 = Prob of O1 + Prob of O2 = 0.80 + 0.15 = 0.95.
** Observe that the beginning random number is 00 for the 1st outcome; and for the remaining
outcomes, it is one more than the ending random number of the immediately preceding outcome.
Also the ending random number for each outcome is "one less than the cumulative probability".
Similarly three-digit random numbers are used if the probability of an outcome has 3 decimal
places. Read the example on page 133 of your text book.
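The table-based scheme above is just a lookup through cumulative probabilities; the sketch below (an illustration, assuming the outcome labels O1, O2, O3 and using random.random() in place of Table 7) shows the same idea in Python.

```python
import random

def simulate(cum_table, u):
    """cum_table: list of (label, cumulative probability) in increasing order."""
    for label, cum_p in cum_table:
        if u < cum_p:          # e.g. u = 0.84 falls in the 0.80-0.95 band -> O2
            return label
    return cum_table[-1][0]

table = [("O1", 0.80), ("O2", 0.95), ("O3", 1.00)]
two_digit = [84, 71, 14, 24, 20, 31, 78, 3]           # the 2-digit numbers read above
print([simulate(table, d / 100) for d in two_digit])  # O2, O1, O1, O1, O1, O1, O1, O1
random.seed(1)
print([simulate(table, random.random()) for _ in range(5)])  # fresh simulated outcomes
```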
Exercise 4.97 on page 136

No. of polluting species   Probability   Cumulative Probability   Random Numbers
0                          0.2466        0.2466                   0000-2465
1                          0.3452        0.5918                   2466-5917
2                          0.2417        0.8335                   5918-8334
3                          0.1128        0.9463                   8335-9462
4                          0.0395        0.9858                   9463-9857
5                          0.0111        0.9969                   9858-9968
6                          0.0026        0.9995                   9969-9994
7                          0.0005        1.0000                   9995-9999

Starting with page 592, Row 14, Column 7, we read off the 4-digit random numbers and the
corresponding simulated counts as:

R. No.   Polluting species      R. No.   Polluting species
5095     1                      2631     1
0150     0                      3033     1
8043     2                      9167     3
9079     3                      4998     1
6440     2                      7036     2

CONTINUOUS RANDOM VARIABLES
In many situations, we come across random variables that take all values lying in a
certain interval of the x axis.
Example
(1) The life length X of a bulb is a continuous random variable that can take all non-negative
real values.
(2) The time between two consecutive arrivals in a queuing system is a random
variable that can take all non-negative real values.
(3) The distance R of the point (where a dart hits) from the centre of a circular board is a
continuous random variable that can take all values in the interval (0, a), where
a is the radius of the board.
It is clear that in all such cases, the probability that the random variable takes any one
particular value is meaningless. For example, when you buy a bulb, you ask the question:
What are the chances that it will work for at least 500 hours?
Probability Density Function (pdf)
If X is a continuous random variable, questions about the probability that X takes
values in an interval (a, b) are answered by defining a probability density function.
Def. Let X be a continuous rv. A real function f(x) is called the probability density function of X
if
(1) $f(x) \ge 0$ for all x
(2) $\int_{-\infty}^{\infty} f(x)\,dx = 1$
(3) $P(a \le X \le b) = \int_a^b f(x)\,dx$.
Condition (1) is needed as probability is always $\ge 0$.
Condition (2) says that the probability of the certain event is 1.
Condition (3) says that to get the probability that X takes a value between a and b, we integrate the
function f(x) between a and b. (This is similar to finding the mass of a rod by integrating
its density function.)
Remarks
1. $P(X = a) = P(a \le X \le a) = \int_a^a f(x)\,dx = 0$
2. Hence $P(a \le X \le b) = P(a < X \le b) = P(a \le X < b) = P(a < X < b)$.
Please note that, unlike the discrete case, it is immaterial whether we include or
exclude one or both of the end points.
3. $P(x \le X \le x + \Delta x) \approx f(x)\,\Delta x$
This is proved using the Mean Value Theorem.
Definition (Cumulative Distribution Function)
If X is a continuous rv and if f(x) is its density,
$$P(X \le x) = P(-\infty < X \le x) = \int_{-\infty}^{x} f(t)\,dt$$
We denote the above by F(x) and call it the cumulative distribution function (cdf) of X.
Properties of cdf
1. $0 \le F(x) \le 1$ for all x.
2. $x_1 < x_2 \Rightarrow F(x_1) \le F(x_2)$, i.e., F(x) is a non-decreasing function of x.
3. $F(-\infty) = \lim_{x \to -\infty} F(x) = 0$ and $F(+\infty) = \lim_{x \to \infty} F(x) = 1$.
4. $\dfrac{d}{dx}F(x) = \dfrac{d}{dx}\int_{-\infty}^{x} f(t)\,dt = f(x)$
(Thus we can get the density function f(x) by differentiating the distribution function F(x).)
Example 1 (Exercise 5.2 of your book)
If the probability density of a rv is given by $f(x) = kx^2$, $0 < x < 1$ (and 0 elsewhere), find the
value of k and the probability that the rv takes on a value
(a) between $\frac{1}{4}$ and $\frac{3}{4}$
(b) greater than $\frac{2}{3}$
Find the distribution function F(x) and hence answer the above questions.
Solution
$\int_{-\infty}^{\infty} f(x)\,dx = 1$ gives (as $f(x) = 0$ if $x < 0$ or $x > 1$)
$$\int_0^1 kx^2\,dx = 1, \quad\text{i.e.}\quad \frac{k}{3} = 1 \quad\text{or}\quad k = 3.$$
Thus $f(x) = 3x^2$, $0 \le x \le 1$, and 0 otherwise.
$$P\!\left(\tfrac{1}{4} < X < \tfrac{3}{4}\right) = \int_{1/4}^{3/4} 3x^2\,dx = \left(\tfrac{3}{4}\right)^3 - \left(\tfrac{1}{4}\right)^3 = \frac{26}{64} = \frac{13}{32}$$
$$P\!\left(X > \tfrac{2}{3}\right) = \int_{2/3}^{1} 3x^2\,dx = 1 - \left(\tfrac{2}{3}\right)^3 = \frac{19}{27}$$
Distribution function $F(x) = \int_{-\infty}^{x} f(t)\,dt$:
Case (i) $x \le 0$. In this case $f(t) = 0$ between $-\infty$ and x, so $F(x) = 0$.
Case (ii) $0 < x < 1$. In this case $f(t) = 3t^2$ between 0 and x and 0 for $t < 0$.
$$\therefore F(x) = \int_{-\infty}^{x} f(t)\,dt = \int_0^x 3t^2\,dt = x^3$$
Case (iii) $x \ge 1$. Now $f(t) = 0$ for $t > 1$,
$$\therefore F(x) = \int_{-\infty}^{x} f(t)\,dt = \int_{-\infty}^{1} f(t)\,dt = 1 \text{ (by case (ii))}.$$
Hence we can say the distribution function is
$$F(x) = \begin{cases} 0 & x \le 0 \\ x^3 & 0 < x \le 1 \\ 1 & x > 1 \end{cases}$$
Now
$$P\!\left(\tfrac{1}{4} < X < \tfrac{3}{4}\right) = P\!\left(X \le \tfrac{3}{4}\right) - P\!\left(X \le \tfrac{1}{4}\right) = F\!\left(\tfrac{3}{4}\right) - F\!\left(\tfrac{1}{4}\right) = \left(\tfrac{3}{4}\right)^3 - \left(\tfrac{1}{4}\right)^3 = \frac{13}{32}$$
$$P\!\left(X > \tfrac{2}{3}\right) = 1 - P\!\left(X \le \tfrac{2}{3}\right) = 1 - F\!\left(\tfrac{2}{3}\right) = 1 - \left(\tfrac{2}{3}\right)^3 = \frac{19}{27}$$
Example 2 (Exercise 5.4 of your book)
The probability density of a rv X is given by
$$f(x) = \begin{cases} x & 0 < x < 1 \\ 2 - x & 1 \le x < 2 \\ 0 & \text{elsewhere} \end{cases}$$
Find the probability that the rv takes a value
(a) between 0.2 and 0.8
(b) between 0.6 and 1.2
Find the distribution function and answer the same questions.
Solution
(a) $P(0.2 < X < 0.8) = \int_{0.2}^{0.8} f(x)\,dx = \int_{0.2}^{0.8} x\,dx = \dfrac{(0.8)^2 - (0.2)^2}{2} = 0.3$
(b) $P(0.6 < X < 1.2) = \int_{0.6}^{1.2} f(x)\,dx = \int_{0.6}^{1} f(x)\,dx + \int_1^{1.2} f(x)\,dx$ (why?)
$$= \int_{0.6}^{1} x\,dx + \int_1^{1.2} (2 - x)\,dx = \frac{1 - 0.36}{2} + \left[-\frac{(2 - x)^2}{2}\right]_1^{1.2} = 0.32 + \left(\frac{1}{2} - \frac{(0.8)^2}{2}\right) = 0.32 + 0.18 = 0.5$$
To find the distribution function $F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$:
Case (i) $x \le 0$. In this case $f(t) = 0$ for $t \le x$, so $F(x) = \int_{-\infty}^{x} f(t)\,dt = 0$.
Case (ii) $0 < x \le 1$. In this case $f(t) = 0$ for $t \le 0$ and $f(t) = t$ for $0 < t \le x$.
Hence $F(x) = \int_{-\infty}^{0} f(t)\,dt + \int_0^x f(t)\,dt = 0 + \int_0^x t\,dt = \dfrac{x^2}{2}$.
Case (iii) $1 < x \le 2$. In this case $f(t) = 0$ for $t \le 0$, $f(t) = t$ for $0 < t \le 1$ and $f(t) = 2 - t$ for $1 < t \le x$.
$$\therefore F(x) = \int_{-\infty}^{1} f(t)\,dt + \int_1^x f(t)\,dt = \frac{1}{2} + \int_1^x (2 - t)\,dt \text{ (by case (ii))} = \frac{1}{2} + \frac{1}{2} - \frac{(2 - x)^2}{2} = 1 - \frac{(2 - x)^2}{2}$$
Case (iv) $x > 2$. In this case $f(t) = 0$ for $2 < t < x$,
$$\therefore F(x) = \int_{-\infty}^{2} f(t)\,dt + \int_2^x f(t)\,dt = 1 + 0 = 1 \text{ (by case (iii))}.$$
Thus
$$F(x) = \begin{cases} 0 & x \le 0 \\ \dfrac{x^2}{2} & 0 < x \le 1 \\ 1 - \dfrac{(2 - x)^2}{2} & 1 < x \le 2 \\ 1 & x > 2 \end{cases}$$
$$\therefore P(0.6 < X < 1.2) = P(X \le 1.2) - P(X \le 0.6) = F(1.2) - F(0.6) = \left(1 - \frac{(0.8)^2}{2}\right) - \frac{(0.6)^2}{2} = 0.68 - 0.18 = 0.5$$
$$P(X > 1.8) = 1 - P(X \le 1.8) = 1 - F(1.8) = 1 - \left(1 - \frac{(0.2)^2}{2}\right) = 0.02$$
The Mean and Variance of a continuous r.v.
Let X be a continuous rv with density f(x).
We define its mean as
$$\mu = E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx$$
We define its variance $\sigma^2$ as
$$\sigma^2 = E\big((X - \mu)^2\big) = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx = E(X^2) - \mu^2$$
Here $E(X^2) = \int_{-\infty}^{\infty} x^2 f(x)\,dx$.
Example 3 The density of a rv X is
$$f(x) = 3x^2 \;(0 < x < 1) \text{ and } 0 \text{ elsewhere.}$$
Its mean is $\mu = E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx = \int_0^1 x \cdot 3x^2\,dx = \dfrac{3}{4}$.
$$E(X^2) = \int_{-\infty}^{\infty} x^2 f(x)\,dx = \int_0^1 x^2 \cdot 3x^2\,dx = \frac{3}{5}$$
Hence $\sigma^2 = \dfrac{3}{5} - \left(\dfrac{3}{4}\right)^2 = 0.0375$.
Hence its sd is $\sigma = 0.1936$.
Example 4 The density of a rv X is
$$f(x) = \begin{cases} \dfrac{1}{20}\,e^{-x/20} & x > 0 \\ 0 & \text{elsewhere} \end{cases}$$
$$\mu = E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx = \frac{1}{20}\int_0^{\infty} x\,e^{-x/20}\,dx$$
Integrating by parts we get
$$\mu = \left[-x\,e^{-x/20} - 20\,e^{-x/20}\right]_0^{\infty} = 20.$$
$$E(X^2) = \int_{-\infty}^{\infty} x^2 f(x)\,dx = \frac{1}{20}\int_0^{\infty} x^2 e^{-x/20}\,dx$$
On integrating by parts (twice) we get
$$E(X^2) = \left[-(x^2 + 40x + 800)\,e^{-x/20}\right]_0^{\infty} = 800.$$
$$\therefore \sigma^2 = E(X^2) - \mu^2 = 800 - 400 = 400, \quad \sigma = 20.$$
NORMAL DISTRIBUTION
A random variable X is said to have the normal distribution (or Gaussian Distribution) if
its density is
$$f(x; \mu, \sigma^2) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{(x - \mu)^2}{2\sigma^2}}, \quad -\infty < x < \infty$$
Here $\mu, \sigma$ are fixed (called parameters) and $\sigma > 0$. The graph of the normal density is
a bell shaped curve:
[Figure: the bell-shaped normal density curve]
It is symmetrical about the line $x = \mu$ and has points of inflection at $x = \mu \pm \sigma$.
One can use integration to show that $\int_{-\infty}^{\infty} f(x)\,dx = 1$. We also see that $E(X) = \mu$ and
the variance of X is $E\big((X - \mu)^2\big) = \sigma^2$.
If $\mu = 0$, $\sigma = 1$, we say that X has the standard normal distribution. We usually use the
symbol Z to denote the variable having the standard normal distribution. Thus when Z is
standard normal, its density is $f(z) = \dfrac{1}{\sqrt{2\pi}}\,e^{-z^2/2}$, $-\infty < z < \infty$.
The cumulative distribution function of Z is
$$F(z) = P(Z \le z) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{z} e^{-t^2/2}\,dt$$
and represents the area under the density up to z. It is the shaded portion in the figure.
[Figure: standard normal density with the area to the left of z shaded]
We at once see from the symmetry of the graph that $F(0) = \dfrac{1}{2} = 0.5$ and
$$F(-z) = 1 - F(z).$$
F(z) for various positive z has been tabulated in Table 3 (at the end of your book).
We thus see from Table 3 that
$$F(0.37) = 0.6443, \quad F(1.645) = 0.95, \quad F(2.33) = 0.99, \quad F(z) \approx 1 \text{ for } z \ge 3.$$
Hence $F(-0.37) = 1 - 0.6443 = 0.3557$,
$F(-1.645) = 1 - 0.95 = 0.05$, etc.
Definition of $z_\alpha$
If Z is standard normal, we define $z_\alpha$ to be that number such that
$$P(Z > z_\alpha) = \alpha \quad\text{or}\quad F(z_\alpha) = 1 - \alpha.$$
Since F(1.645) = 0.95 = 1 - 0.05, we see that
$z_{0.05} = 1.645$.
Similarly $z_{0.01} = 2.33$.
We also note $z_{1-\alpha} = -z_\alpha$.
Thus $z_{0.95} = -z_{0.05} = -1.645$,
$z_{0.99} = -z_{0.01} = -2.33$.
Important
If X is normal with mean $\mu$ and variance $\sigma^2$, it can be shown that the standardized r.v.
$Z = \dfrac{X - \mu}{\sigma}$ has the standard normal distribution. Thus questions about the probability that X
assumes a value between, say, a and b can be translated into the probability that Z assumes
values in a corresponding range. Specifically:
$$P(a < X < b) = P\!\left(\frac{a - \mu}{\sigma} < \frac{X - \mu}{\sigma} < \frac{b - \mu}{\sigma}\right) = P\!\left(\frac{a - \mu}{\sigma} < Z < \frac{b - \mu}{\sigma}\right) = F\!\left(\frac{b - \mu}{\sigma}\right) - F\!\left(\frac{a - \mu}{\sigma}\right)$$
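The standardization rule above can be checked without Table 3 by using the exact relation $F(z) = \tfrac{1}{2}\big(1 + \mathrm{erf}(z/\sqrt{2})\big)$. The short Python sketch below is an illustration only; the function names are mine, and the values differ slightly from two-decimal table look-ups.

```python
from math import erf, sqrt

def std_normal_cdf(z):
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def normal_prob(a, b, mu, sigma):
    """P(a < X < b) for X normal with mean mu and s.d. sigma."""
    return std_normal_cdf((b - mu) / sigma) - std_normal_cdf((a - mu) / sigma)

# Example 1(c) below: P(13.6 < X < 18.8) with mu = 16.2, sigma = 1.25
print(normal_prob(13.6, 18.8, 16.2, 1.25))   # about 0.9625
```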
Example 1 (See Exercise 5.24 on page 152)
Given that X has a normal distribution with mean $\mu = 16.2$ and variance $\sigma^2 = 1.5625$,
find the probability that it will take on a value
(a) greater than 16.8
(b) less than 14.9
(c) between 13.6 and 18.8
(d) between 16.5 and 16.7
Here $\sigma = \sqrt{1.5625} = 1.25$.
(a) $P(X > 16.8) = P\!\left(\dfrac{X - \mu}{\sigma} > \dfrac{16.8 - 16.2}{1.25}\right) = P(Z > 0.48) = 1 - F(0.48) = 1 - 0.6844 = 0.3156$
(b) $P(X < 14.9) = P\!\left(Z < \dfrac{14.9 - 16.2}{1.25}\right) = P(Z < -1.04) = 1 - F(1.04) = 1 - 0.8508 = 0.1492$
(c) $P(13.6 < X < 18.8) = P\!\left(\dfrac{13.6 - 16.2}{1.25} < Z < \dfrac{18.8 - 16.2}{1.25}\right) = P(-2.08 < Z < 2.08) = 2F(2.08) - 1 = 2(0.9812) - 1 = 0.9624$
(Note that $P(-c < Z < c) = 2F(c) - 1$ for $c > 0$.)
(d) $P(16.5 < X < 16.7) = P\!\left(\dfrac{16.5 - 16.2}{1.25} < Z < \dfrac{16.7 - 16.2}{1.25}\right) = P(0.24 < Z < 0.4) = F(0.4) - F(0.24) = 0.6554 - 0.5948 = 0.0606$
Example 2
A rv X has a normal distribution with $\sigma = 10$. If the probability is 0.8212 that it will take on a
value < 82.5, what is the probability that it will take on a value > 58.3?
Solution
Let the (unknown) mean be $\mu$.
Given $P(X < 82.5) = 0.8212$.
Thus $P\!\left(\dfrac{X - \mu}{\sigma} < \dfrac{82.5 - \mu}{10}\right) = 0.8212$, i.e., $F\!\left(\dfrac{82.5 - \mu}{10}\right) = 0.8212$.
From Table 3, $\dfrac{82.5 - \mu}{10} = 0.92$, or $\mu = 82.5 - 9.2 = 73.3$.
Hence
$$P(X > 58.3) = P\!\left(Z > \frac{58.3 - 73.3}{10}\right) = P(Z > -1.5) = 1 - F(-1.5) = 1 - \big(1 - F(1.5)\big) = F(1.5) = 0.9332$$
Example 3 (See Exercise 5.33 on page 152)
In a photographic process the developing time of prints may be looked upon as a r.v. X
having a normal distribution with $\mu = 16.28$ seconds and s.d. $\sigma = 0.12$ second. For which
value is the probability 0.95 that it will be exceeded by the time it takes to develop one of the
prints?
Solution
That is, find a number c so that
$$P(X > c) = 0.95, \quad\text{i.e.}\quad P\!\left(Z > \frac{c - 16.28}{0.12}\right) = 0.95.$$
Hence $P\!\left(Z \le \dfrac{c - 16.28}{0.12}\right) = 0.05$,
$$\therefore \frac{16.28 - c}{0.12} = 1.645, \quad c = 16.28 - 1.645 \times 0.12 \approx 16.08 \text{ seconds.}$$
NORMAL APPROXIMATION TO BINOMIAL DISTRIBUTION
Suppose X is a r.v. having the Binomial distribution with parameters n and p. Then it can be
shown that
$$P\!\left(\frac{X - np}{\sqrt{npq}} \le z\right) \to P(Z \le z) = F(z) \quad\text{as } n \to \infty,$$
i.e., in words, the standardized binomial tends to the standard normal.
Thus when n is large, the binomial probabilities can be approximated using the normal
distribution function.
Example 4 (See Exercise 5.36 on page 153)
A manufacturer knows that on the average 2% of the electric toasters that he makes will
require repairs within 90 days after they are sold. Use the normal approximation to the
binomial distribution to determine the probability that among 1200 of these toasters at least 30
will require repairs within the first 90 days after they are sold.
Solution
Let X = number of toasters (among 1200) that require repairs within the first 90 days after
they are sold. Hence X is a rv having the Binomial Distribution with parameters n = 1200
and $p = \frac{2}{100} = 0.02$. Thus $np = 24$ and $\sqrt{npq} = 4.85$.
Required:
$$P(X \ge 30) = P\!\left(\frac{X - np}{\sqrt{npq}} \ge \frac{30 - 24}{4.85}\right) \approx P(Z \ge 1.24) = 1 - F(1.24) = 1 - 0.8925 = 0.1075$$
Correction for Continuity
Since for continuous rvs $P(Z \ge c) = P(Z > c)$ (which is not true for discrete rvs), when we
approximate a binomial probability by a normal probability, we must ensure that we do not 'lose' the end
point. This is achieved by what we call the continuity correction. In the previous example,
$P(X \ge 30)$ also $= P(X \ge 29.5)$ (read the justification given in your book on page 150,
lines 1 to 7).
$$P(X \ge 29.5) = P\!\left(\frac{X - np}{\sqrt{npq}} \ge \frac{29.5 - 24}{4.85}\right) \approx P(Z \ge 1.13) = 1 - F(1.13) = 1 - 0.8708 = 0.1292$$
(probably a better answer).
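For Example 4 the approximation can be compared with the exact binomial tail. The sketch below assumes scipy is available; the variable names are illustrative, and the printed values are approximate.

```python
from math import sqrt
from scipy.stats import binom, norm

n, p = 1200, 0.02
mu, sd = n * p, sqrt(n * p * (1 - p))

exact = binom.sf(29, n, p)                      # exact P(X >= 30) = P(X > 29)
approx_plain = norm.sf((30 - mu) / sd)          # normal approximation, no correction
approx_cc = norm.sf((29.5 - mu) / sd)           # with the continuity correction
print(exact, approx_plain, approx_cc)           # roughly 0.13, 0.11, 0.13
```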
Example 5 (See Exercise 5.38 on page 153)
A safety engineer feels that 30% of all industrial accidents in her plant are caused by
failure of employees to follow instructions. Find approximately the probability that among 84
industrial accidents anywhere from 20 to 30 (inclusive) will be due to failure of
employees to follow instructions.
Solution
Let X = number of accidents (among 84) due to failure of employees to follow instructions.
Thus X is a rv having the Binomial distribution with parameters n = 84 and p = 0.3.
Thus $np = 25.2$ and $\sqrt{npq} = 4.2$.
Required:
$$P(20 \le X \le 30) = P(19.5 \le X \le 30.5) \;(\text{continuity correction})$$
$$= P\!\left(\frac{19.5 - 25.2}{4.2} \le \frac{X - np}{\sqrt{npq}} \le \frac{30.5 - 25.2}{4.2}\right) \approx P(-1.36 \le Z \le 1.26)$$
$$= F(1.26) - F(-1.36) = F(1.26) + F(1.36) - 1 = 0.8962 + 0.9131 - 1 = 0.8093$$
OTHER PROBABILITY DENSITIES
The Uniform Distribution
A r.v. X is said to have the uniform distribution over the interval $(\alpha, \beta)$ if its density is given
by
$$f(x) = \begin{cases} \dfrac{1}{\beta - \alpha} & \alpha < x < \beta \\ 0 & \text{elsewhere} \end{cases}$$
Thus the graph of the density is a constant over the interval $(\alpha, \beta)$.
If $\alpha < c < d < \beta$,
$$P(c < X < d) = \int_c^d \frac{dx}{\beta - \alpha} = \frac{d - c}{\beta - \alpha}$$
and thus is proportional to the length of the interval (c, d).
You may verify that:
The mean of X is $\mu = E(X) = \dfrac{\alpha + \beta}{2}$ (the midpoint of the interval $(\alpha, \beta)$).
The variance of X is $\sigma^2 = \dfrac{(\beta - \alpha)^2}{12}$. The cumulative distribution function is
$$F(x) = \begin{cases} 0 & x \le \alpha \\ \dfrac{x - \alpha}{\beta - \alpha} & \alpha < x \le \beta \\ 1 & x > \beta \end{cases}$$
Example 6 (See page 165, Exercise 5.46)
In certain experiments, the error X made in determining the solubility of a substance is a
rv having the uniform density with $\alpha = -0.025$ and $\beta = 0.025$. What is the probability that such an
error will be
(a) between 0.010 and 0.015?
(b) between –0.012 and 0.012?
Solution
(a) $P(0.010 < X < 0.015) = \dfrac{0.015 - 0.010}{0.025 - (-0.025)} = \dfrac{0.005}{0.050} = 0.1$
(b) $P(-0.012 < X < 0.012) = \dfrac{0.012 - (-0.012)}{0.025 - (-0.025)} = \dfrac{12}{25} = 0.48$
Example 7 (See Exercise 5.47 on page 165)
From experience, Mr. Harris has found that the low bid on a construction job can be
regarded as a rv X having the uniform density
$$f(x) = \begin{cases} \dfrac{3}{4C} & \dfrac{2C}{3} < x < 2C \\ 0 & \text{elsewhere} \end{cases}$$
where C is his own estimate of the cost of the job. What percentage should Mr. Harris
add to his cost estimate when submitting bids to maximize his expected profit?
Solution
Suppose Mr. Harris adds k% of C when submitting his bid. Thus Mr. Harris gets a profit of
$\dfrac{kC}{100}$ if he gets the contract, which happens if the lowest bid (by others) $\ge C + \dfrac{kC}{100}$, and
gets no profit if the lowest bid $< C + \dfrac{kC}{100}$. Thus the probability that he gets the contract is
$$P\!\left(C + \frac{kC}{100} < X < 2C\right) = \frac{3}{4C}\left(2C - C - \frac{kC}{100}\right) = \frac{3}{4}\left(1 - \frac{k}{100}\right)$$
Thus the expected profit of Mr. Harris is
$$\frac{kC}{100} \times \frac{3}{4}\left(1 - \frac{k}{100}\right) + 0 \times (\ldots) = \frac{3C}{400}\left(k - \frac{k^2}{100}\right)$$
which is a maximum (by using calculus) when k = 50.
Thus Mr. Harris's expected profit is a maximum when he adds 50% of C to C when
submitting bids.
Gamma Function
This is one of the most useful functions in Mathematics. If x > 0, it can be shown that the
improper integral $\int_0^{\infty} e^{-t}\,t^{x-1}\,dt$ converges to a finite real number, which we denote by $\Gamma(x)$
(capital gamma of x). Thus for all real x > 0 we define
$$\Gamma(x) = \int_0^{\infty} e^{-t}\,t^{x-1}\,dt.$$
Properties of the Gamma Function
1. $\Gamma(x + 1) = x\,\Gamma(x)$, x > 0
2. $\Gamma(1) = 1$
3. $\Gamma(2) = 1 \cdot \Gamma(1) = 1 = 1!$, $\Gamma(3) = 2\,\Gamma(2) = 2 = 2!$
More generally $\Gamma(n + 1) = n!$ whenever n is a positive integer or zero.
4. $\Gamma\!\left(\tfrac{1}{2}\right) = \sqrt{\pi}$.
5. $\Gamma(x)$ decreases in the interval (0, 1), increases in the interval $(2, \infty)$ and has a
minimum somewhere between 1 and 2.
THE GAMMA DISTRIBUTION
Let $\alpha, \beta$ be two positive real numbers. A r.v. X is said to have a Gamma Distribution with
parameters $\alpha, \beta$ if its density is
$$f(x) = \begin{cases} \dfrac{1}{\beta^{\alpha}\Gamma(\alpha)}\,x^{\alpha - 1}e^{-x/\beta} & x > 0 \\ 0 & \text{elsewhere} \end{cases}$$
It can be shown that
the mean of X is $\mu = E(X) = \alpha\beta$
(see the working on page 159 of your text book)
and the variance of X is $\sigma^2 = \alpha\beta^2$.
Exponential Distribution
If $\alpha = 1$, we say X has the exponential distribution. Thus X has an exponential distribution
(with parameter $\beta > 0$) if its density is
$$f(x) = \begin{cases} \dfrac{1}{\beta}\,e^{-x/\beta} & x > 0 \\ 0 & \text{elsewhere} \end{cases}$$
We also see easily that:
1. Mean of X: $E(X) = \beta$
2. Variance of X: $\sigma^2 = \beta^2$
3. The cumulative distribution function of X is
$$F(x) = \begin{cases} 1 - e^{-x/\beta} & x > 0 \\ 0 & \text{elsewhere} \end{cases}$$
4. X has the memoryless property:
$$P(X > s + t \mid X > s) = P(X > t), \quad s, t > 0.$$
Proof of (4): $P(X > s) = 1 - P(X \le s) = 1 - F(s) = e^{-s/\beta}$ (by (3)).
$$P(X > s + t \mid X > s) = \frac{P\big((X > s + t) \cap (X > s)\big)}{P(X > s)} = \frac{P(X > s + t)}{P(X > s)} = \frac{e^{-(s + t)/\beta}}{e^{-s/\beta}} = e^{-t/\beta} = P(X > t). \;\text{QED}$$
Example 8 (See Exercise 5.54 on page 166)
In a certain city, the daily consumption of electric power (in millions of kW hours) can be
treated as a r.v. X having a Gamma distribution with $\alpha = 3$, $\beta = 2$. If the power plant in
the city has a daily capacity of 12 million kW hrs, what is the probability that the power supply
will be inadequate on any given day?
Solution
The power supply will be inadequate if demand exceeds the daily capacity.
Hence the probability that the power supply is inadequate is
$$P(X > 12) = \int_{12}^{\infty} f(x)\,dx$$
Now as $\alpha = 3$, $\beta = 2$,
$$f(x) = \frac{1}{2^3\,\Gamma(3)}\,x^2 e^{-x/2} = \frac{1}{16}\,x^2 e^{-x/2}.$$
Hence
$$P(X > 12) = \frac{1}{16}\int_{12}^{\infty} x^2 e^{-x/2}\,dx.$$
Integrating by parts, we get
$$P(X > 12) = \frac{1}{16}\left[-2x^2 e^{-x/2} - 8x\,e^{-x/2} - 16\,e^{-x/2}\right]_{12}^{\infty} = \frac{1}{16}\left(2 \times 144 + 8 \times 12 + 16\right)e^{-6} = \frac{400}{16}\,e^{-6} = 25\,e^{-6} = 0.062.$$
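The hand integration above can be checked numerically. A one-line sketch, assuming scipy is available (scipy's "scale" parameter plays the role of $\beta$ in these notes):

```python
from scipy.stats import gamma

p_inadequate = gamma.sf(12, a=3, scale=2)   # P(X > 12) for alpha = 3, beta = 2
print(p_inadequate)                         # about 0.062, i.e. 25 * exp(-6)
```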
Example 9 (See Exercise 5.58 on page 166)
The amount of time that a surveillance camera will run without having to be reset is a r.v.
X having an exponential distribution with $\beta = 50$ days. Find the probability that such a camera
(a) will have to be reset in less than 20 days,
(b) will not have to be reset in at least 60 days.
Solution
The density of X is
$$f(x) = \frac{1}{50}\,e^{-x/50} \;(x > 0) \text{ and } 0 \text{ elsewhere.}$$
(a) P(the camera has to be reset in < 20 days)
= P(the running time < 20)
$$= P(X < 20) = \int_0^{20} \frac{1}{50}\,e^{-x/50}\,dx = \left[-e^{-x/50}\right]_0^{20} = 1 - e^{-2/5} = 0.3297$$
(b) P(the camera will not have to be reset in at least 60 days)
$$= P(X \ge 60) = \int_{60}^{\infty} \frac{1}{50}\,e^{-x/50}\,dx = e^{-6/5} = 0.3012$$
Example 10 (See Exercise 5.61 on page 166)
Given a Poisson process with on the average $\alpha$ arrivals per unit time, find the probability density
of the inter-arrival time (i.e., the time between two consecutive arrivals).
Solution
Let T be the time between two consecutive arrivals. Thus clearly T is a continuous r.v.
with values > 0. Now T > t if and only if there is no arrival in the time period t.
Thus $P(T > t) = P(X_t = 0)$
($X_t$ = number of arrivals in the time period t)
$= e^{-\alpha t}$ (as $X_t$ has a Poisson distribution with parameter $\lambda = \alpha t$).
Hence the distribution function of T is
$$F(t) = P(T \le t) = 1 - P(T > t) = 1 - e^{-\alpha t}, \quad t > 0$$
(clearly $F(t) = 0$ for all $t \le 0$).
Hence the density of T is
$$f(t) = \frac{d}{dt}F(t) = \begin{cases} \alpha\,e^{-\alpha t} & t > 0 \\ 0 & \text{elsewhere} \end{cases}$$
Hence we would say the inter-arrival time is a continuous rv with exponential density with parameter $\dfrac{1}{\alpha}$.
The Beta Function
If x, y > 0 the beta function B(x, y) (read capital Beta of x, y) is defined by
$$B(x, y) = \int_0^1 t^{x - 1}(1 - t)^{y - 1}\,dt$$
It is well known that
$$B(x, y) = \frac{\Gamma(x)\,\Gamma(y)}{\Gamma(x + y)}, \quad x, y > 0.$$
BETA DISTRIBUTION
A r.v. X is said to have a Beta distribution with parameters $\alpha, \beta > 0$ if its density is
$$f(x) = \begin{cases} \dfrac{1}{B(\alpha, \beta)}\,x^{\alpha - 1}(1 - x)^{\beta - 1} & 0 < x < 1 \\ 0 & \text{elsewhere} \end{cases}$$
It is easily shown that
(1) $E(X) = \mu = \dfrac{\alpha}{\alpha + \beta}$
(2) $V(X) = \sigma^2 = \dfrac{\alpha\beta}{(\alpha + \beta)^2(\alpha + \beta + 1)}$
Example 11 (See Exercise 5.64)
If the annual proportion of erroneous income tax returns can be looked upon as a rv
having a Beta distribution with $\alpha = 2$, $\beta = 9$, what is the probability that in any given year
there will be fewer than 10% erroneous returns?
Solution
Let X = annual proportion of erroneous income tax returns. Thus X has a Beta density
with $\alpha = 2$, $\beta = 9$.
$$\therefore P(X < 0.1) = \int_0^{0.1} f(x)\,dx \;(\text{note the proportion cannot be} < 0) = \frac{1}{B(2, 9)}\int_0^{0.1} x(1 - x)^8\,dx$$
$$B(2, 9) = \frac{\Gamma(2)\,\Gamma(9)}{\Gamma(11)} = \frac{1!\,8!}{10!} = \frac{1}{9 \times 10} = \frac{1}{90}$$
$$\int_0^{0.1} x(1 - x)^8\,dx = \int_0^{0.1}\left[(1 - x)^8 - (1 - x)^9\right]dx = \left[-\frac{(1 - x)^9}{9} + \frac{(1 - x)^{10}}{10}\right]_0^{0.1} = \frac{1}{90} - (0.9)^9\,\frac{19}{900} = 0.00293$$
$$\therefore P(X < 0.1) = 90 \times 0.00293 = 0.264.$$
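A quick numerical check of Example 11, as a sketch assuming scipy is available:

```python
from scipy.stats import beta

print(beta.cdf(0.1, 2, 9))   # P(X < 0.1) for alpha = 2, beta = 9; about 0.264
```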
The Log-Normal Distribution
A r.v. X is said to have a log-normal distribution if its density is
$$f(x) = \begin{cases} \dfrac{1}{x\beta\sqrt{2\pi}}\,e^{-(\ln x - \alpha)^2/(2\beta^2)} & x > 0, \; \beta > 0 \\ 0 & \text{elsewhere} \end{cases}$$
It can be shown that if X has a log-normal distribution, $Y = \ln X$ has a normal distribution
with mean $\mu = \alpha$ and s.d. $\sigma = \beta$.
Thus
$$P(a < X < b) = P(\ln a < \ln X < \ln b) = P\!\left(\frac{\ln a - \alpha}{\beta} < Z < \frac{\ln b - \alpha}{\beta}\right) = F\!\left(\frac{\ln b - \alpha}{\beta}\right) - F\!\left(\frac{\ln a - \alpha}{\beta}\right)$$
where F(z) = cdf of the standard normal variable Z.
Lengthy calculations show that if X has a log-normal distribution, its mean is $E(X) = e^{\alpha + \beta^2/2}$
and its variance is $e^{2\alpha + \beta^2}\left(e^{\beta^2} - 1\right)$.
More problems on Normal Distribution
Example 12
Let X be normal with mean $\mu$ and s.d. $\sigma$. Determine c as a function of $\mu$ and $\sigma$ such
that
$$P(X \le c) = 2\,P(X \ge c)$$
Solution
$P(X \le c) = 2\,P(X \ge c)$ implies $P(X \le c) = 2\big(1 - P(X \le c)\big)$.
Let $P(X \le c) = p$. Thus $3p = 2$, or $p = \dfrac{2}{3}$.
Now
$$P(X \le c) = P\!\left(\frac{X - \mu}{\sigma} \le \frac{c - \mu}{\sigma}\right) = F\!\left(\frac{c - \mu}{\sigma}\right) = \frac{2}{3} = 0.6667$$
implies $\dfrac{c - \mu}{\sigma} = 0.43$ (approximately, from Table 3).
$$\therefore c = \mu + 0.43\,\sigma$$
Example 13
Suppose X is normal with mean 0 and sd 5. Find $P(1 < X^2 < 4)$.
Solution
$$P(1 < X^2 < 4) = P(1 < |X| < 2) = P(1 < X < 2) + P(-2 < X < -1) = 2\,P\!\left(\frac{1}{5} < Z < \frac{2}{5}\right)$$
$$= 2\left[F\!\left(\frac{2}{5}\right) - F\!\left(\frac{1}{5}\right)\right] = 2\,(0.6554 - 0.5793) \;(\text{from Table 3}) = 2 \times 0.0761 = 0.1522$$
Example 14
The annual rainfall in a certain locality is a r.v. X having a normal distribution with mean
29.5" and sd 2.5". How many inches of rain (annually) is exceeded about 5% of the time?
Solution
That is, we have to find a number C such that $P(X > C) = 0.05$,
i.e.
$$P\!\left(\frac{X - \mu}{\sigma} > \frac{C - 29.5}{2.5}\right) = 0.05.$$
Hence $\dfrac{C - 29.5}{2.5} = z_{0.05} = 1.645$,
$$\therefore C = 29.5 + 2.5 \times 1.645 = 33.6125.$$
Example 15
A rocket fuel is to contain a certain percent (say X) of a particular compound. The
specification calls for X to lie between 30 and 35. The manufacturer will make a net
profit on the fuel per gallon which is the following function of X:
$$T(X) = \begin{cases} \$0.10 \text{ per gallon} & \text{if } 30 < X < 35 \\ \$0.05 \text{ per gallon} & \text{if } 35 \le X < 40 \text{ or } 25 < X \le 30 \\ -\$0.10 \text{ per gallon} & \text{elsewhere.} \end{cases}$$
If X has a normal distribution with mean 33 and s.d. 3, find the probability distribution of T and
hence the expected profit per gallon.
Solution
T = 0.10 if 30 < X < 35.
$$\therefore P(T = 0.10) = P(30 < X < 35) = P\!\left(\frac{30 - 33}{3} < Z < \frac{35 - 33}{3}\right) = P\!\left(-1 < Z < \frac{2}{3}\right)$$
$$= F\!\left(\frac{2}{3}\right) - F(-1) = F\!\left(\frac{2}{3}\right) + F(1) - 1 = 0.7486 + 0.8413 - 1 = 0.5899$$
$$P(T = 0.05) = P(35 \le X < 40) + P(25 < X \le 30) = P\!\left(\frac{2}{3} \le Z < \frac{7}{3}\right) + P\!\left(-\frac{8}{3} < Z \le -1\right)$$
$$= \left[F\!\left(\frac{7}{3}\right) - F\!\left(\frac{2}{3}\right)\right] + \left[F\!\left(\frac{8}{3}\right) - F(1)\right] = (0.9901 - 0.7486) + (0.9961 - 0.8413) = 0.3963$$
Hence $P(T = -0.10) = 1 - 0.5899 - 0.3963 = 0.0138$.
Hence expected profit = E(T)
$$= 0.10 \times 0.5899 + 0.05 \times 0.3963 - 0.10 \times 0.0138 = \$0.077425 \text{ per gallon.}$$
JOINT DISTRIBUTIONS – Two and Higher Dimensional Random Variables
Suppose X, Y are two discrete rvs, and suppose X can take values $x_1, x_2, \ldots$ and Y can take
values $y_1, y_2, \ldots$. We refer to the function $f(x, y) = P(X = x, Y = y)$ as the joint probability
distribution of X and Y. The ordered pair (X, Y) is sometimes referred to as a two-
dimensional discrete r.v.
Example 16
Two cards are drawn at random from a pack of 52 cards. Let X be the number of aces
drawn and Y be the number of queens drawn.
Find the joint probability distribution of X and Y.
Solution
Clearly X can take any one of the three values 0, 1, 2 and Y one of the three values 0, 1, 2.
The joint probability distribution of X and Y is depicted in the following 3 × 3 table (every
entry is to be divided by $\binom{52}{2}$):

         x = 0                          x = 1                          x = 2
y = 0    $\binom{44}{2}$                $\binom{4}{1}\binom{44}{1}$     $\binom{4}{2}$
y = 1    $\binom{4}{1}\binom{44}{1}$    $\binom{4}{1}\binom{4}{1}$      0
y = 2    $\binom{4}{2}$                 0                              0

Justification
$P(X = 0, Y = 0)$ = P(no aces and no queens in the 2 cards) $= \dfrac{\binom{44}{2}}{\binom{52}{2}}$
$P(X = 1, Y = 0)$ (the entry in the 2nd column and 1st row)
= P(one ace and one other card which is neither an ace nor a queen) $= \dfrac{\binom{4}{1}\binom{44}{1}}{\binom{52}{2}}$, etc.
Can we write down the distribution of X? X can take any one of the 3 values 0, 1, 2.
What is $P(X = 0)$?
X = 0 means no ace is drawn, but we might draw 2 queens, or 1 queen and one non-queen,
or 2 cards which are neither aces nor queens.
Thus
$$P(X = 0) = P(X = 0, Y = 0) + P(X = 0, Y = 1) + P(X = 0, Y = 2) = \frac{\binom{44}{2} + \binom{4}{1}\binom{44}{1} + \binom{4}{2}}{\binom{52}{2}} = \frac{\binom{48}{2}}{\binom{52}{2}} \;(\text{Verify!})$$
= sum of the probabilities in the 1st column.
Similarly
$$P(X = 1) = P(X = 1, Y = 0) + P(X = 1, Y = 1) + P(X = 1, Y = 2) = \frac{\binom{4}{1}\binom{44}{1} + \binom{4}{1}\binom{4}{1} + 0}{\binom{52}{2}} = \frac{\binom{4}{1}\binom{48}{1}}{\binom{52}{2}} \;(\text{Verify!})$$
= sum of the 3 probabilities in the 2nd column.
$$P(X = 2) = P(X = 2, Y = 0) + P(X = 2, Y = 1) + P(X = 2, Y = 2) = \frac{\binom{4}{2}}{\binom{52}{2}} + 0 + 0$$
= sum of the 3 probabilities in the 3rd column.
The distribution of X derived from the joint distribution of X and Y is referred to as the
marginal distribution of X.
Similarly the marginal distribution of Y is given by the 3 row totals.
Example 17
The joint probability distribution of X and Y is given by

         x = -1   x = 0   x = 1
y = -1   1/8      1/8     1/8
y = 0    1/8      0       1/8
y = 1    1/8      1/8     1/8

Write down the marginal distributions of X and Y. To get the marginal distribution of X, we find
the column totals and write them in the (bottom) margin. Thus the (marginal) distribution
of X is

X      -1     0     1
Prob   3/8    2/8   3/8

(Do you see why we call it the marginal distribution?)
Similarly, to get the marginal distribution of Y, we find the 3 row totals and write them in
the (right) margin.
Thus the marginal distribution of Y is

Y     Prob
-1    3/8
0     2/8
1     3/8
Notation: If $f(x, y) = P(X = x, Y = y)$ is the joint probability distribution of the 2-dimensional
discrete r.v. (X, Y), we denote by g(x) the marginal distribution of X and by h(y) the
marginal distribution of Y.
Thus
$$g(x) = P(X = x) = \sum_{\text{all } y} P(X = x, Y = y) = \sum_{\text{all } y} f(x, y)$$
and
$$h(y) = P(Y = y) = \sum_{\text{all } x} P(X = x, Y = y) = \sum_{\text{all } x} f(x, y).$$
Conditional Distribution
The conditional probability distribution of Y for a given X = x is defined as
$$h(y \mid x) = P(Y = y \mid X = x) \;(\text{read: prob of } Y = y \text{ given } X = x) = \frac{P(X = x, Y = y)}{P(X = x)} = \frac{f(x, y)}{g(x)}$$
where g(x) is the marginal distribution of X.
Thus in the above example 17,
$$h(0 \mid -1) = P(Y = 0 \mid X = -1) = \frac{P(X = -1, Y = 0)}{P(X = -1)} = \frac{1/8}{3/8} = \frac{1}{3}$$
Similarly, the conditional probability distribution of X for a given Y = y is defined as
$$g(x \mid y) = P(X = x \mid Y = y) = \frac{P(X = x, Y = y)}{P(Y = y)} = \frac{f(x, y)}{h(y)}$$
where h(y) is the marginal distribution of Y.
In the above example,
$$g(0 \mid 0) = P(X = 0 \mid Y = 0) = \frac{P(X = 0, Y = 0)}{P(Y = 0)} = \frac{0}{2/8} = 0$$
Independence
We say X, Y are independent if
$$P(X = x, Y = y) = P(X = x)\,P(Y = y) \text{ for all } x, y.$$
Thus X, Y are independent if and only if
$$f(x, y) = g(x)\,h(y) \text{ for all } x \text{ and } y,$$
which is the same as saying $g(x \mid y) = g(x)$ for all x and y, which is the same as saying
$h(y \mid x) = h(y)$ for all x, y.
In the above example X, Y are not independent, as $P(X = 0, Y = 0) \ne P(X = 0)\,P(Y = 0)$.
Example 18
The joint probability distribution of X and Y is given by

         x = 2    x = 0    x = 1
y = 2    0.1      0.2      0.1
y = 0    0.05     0.1      0.15
y = 1    0.1      0.1      0.1

(a) Find the marginal distribution of X.
Ans
X      2      0      1
Prob   0.25   0.4    0.35
(b) Find the marginal distribution of Y.
Ans
Y    Prob
2    0.4
0    0.3
1    0.3
(c) Find $P(X + Y = 2)$.
Ans: $X + Y = 2$ if $(X = 2, Y = 0)$ or $(X = 1, Y = 1)$ or $(X = 0, Y = 2)$.
Thus $P(X + Y = 2) = 0.05 + 0.1 + 0.2 = 0.35$.
(d) Find $P(X - Y = 0)$.
Ans: $X - Y = 0$ if $(X = 2, Y = 2)$ or $(X = 0, Y = 0)$ or $(X = 1, Y = 1)$.
$\therefore P(X - Y = 0) = 0.1 + 0.1 + 0.1 = 0.3$.
(e) Find $P(X \ge 0)$. Ans. 1
(f) Find $P(X - Y = 0 \mid X \ge 0)$. Ans. $\dfrac{0.3}{1} = 0.3$
(g) Find $P(X - Y = 0 \mid X \ge 1)$. Ans. $\dfrac{0.2}{0.6} = \dfrac{1}{3}$
(h) Are X, Y independent?
Ans: No! $P(X = 1, Y = 1) \ne P(X = 1)\,P(Y = 1)$.
Two-Dimensional Continuous Random Variables
Let (X, Y) be a continuous 2-dimensional r.v. This means (X, Y) can take all values in a
certain region of the x, y plane. For example, suppose a dart is thrown at a circular board
of radius 2. Then the position (X, Y) where the dart hits the board is a continuous two-
dimensional r.v., as it can take all values (x, y) such that $x^2 + y^2 \le 4$.
A function $f(x, y)$ is said to be the joint probability density of (X, Y) if
(i) $f(x, y) \ge 0$ for all x, y
(ii) $\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1$
(iii) $P(a \le X \le b, \; c \le Y \le d) = \int_a^b \int_c^d f(x, y)\,dy\,dx$.
Example 19(a)
Let the joint probability density of (X, Y) be
$$f(x, y) = \frac{1}{4}, \quad 0 \le x \le 2, \; 0 \le y \le 2, \text{ and } 0 \text{ elsewhere.}$$
Find $P(X + Y \le 1)$.
Ans: The region $x + y \le 1$ (within the square) is the shaded triangular portion.
$$\therefore P(X + Y \le 1) = \int_0^1 \int_0^{1 - x} \frac{1}{4}\,dy\,dx = \int_0^1 \frac{1 - x}{4}\,dx = \frac{1}{4}\left(1 - \frac{1}{2}\right) = \frac{1}{8}$$
Example 19(b)
The joint probability density of (X, Y) is
$$f(x, y) = \frac{1}{8}(6 - x - y), \quad 0 < x < 2, \; 2 < y < 4, \text{ and } 0 \text{ elsewhere.}$$
Find $P(X < 1, Y < 3)$.
Solution
$$P(X < 1, Y < 3) = \int_{x=0}^{1}\int_{y=2}^{3} f(x, y)\,dy\,dx = \int_0^1 \int_2^3 \frac{1}{8}(6 - x - y)\,dy\,dx$$
$$= \int_0^1 \frac{1}{8}\left[6y - xy - \frac{y^2}{2}\right]_{y=2}^{3} dx = \int_0^1 \frac{1}{8}\left(6 - x - \frac{5}{2}\right)dx = \frac{1}{8}\left[\frac{7x}{2} - \frac{x^2}{2}\right]_0^1 = \frac{1}{8}\left(\frac{7}{2} - \frac{1}{2}\right) = \frac{3}{8}$$
Marginal and Conditional Densities
If f(x, y) is the joint probability density of the 2-dimensional continuous rv (X, Y), we define
the marginal probability density of X as
$$g(x) = \int_{-\infty}^{\infty} f(x, y)\,dy$$
That is, fix x and integrate f(x, y) with respect to y.
Similarly the marginal probability density of Y is
$$h(y) = \int_{-\infty}^{\infty} f(x, y)\,dx$$
The conditional probability density of Y for a given x is
$$h(y \mid x) = \frac{f(x, y)}{g(x)} \quad (\text{defined only for those } x \text{ for which } g(x) \ne 0)$$
The conditional probability density of X for a given y is
$$g(x \mid y) = \frac{f(x, y)}{h(y)} \quad (\text{defined only for those } y \text{ for which } h(y) \ne 0)$$
Independence
We say X, Y are independent if and only if $f(x, y) = g(x)\,h(y)$,
which is the same as saying $g(x \mid y) = g(x)$, or $h(y \mid x) = h(y)$.
Example 20
Consider the density of (X, Y) as given in Example 19(b).
The marginal density of X is
$$g(x) = \int_{y=2}^{4} \frac{1}{8}(6 - x - y)\,dy = \frac{1}{8}\left[6y - xy - \frac{y^2}{2}\right]_{y=2}^{4} = \frac{1}{8}(6 - 2x), \quad 0 < x < 2, \text{ and } 0 \text{ elsewhere.}$$
We verify this is a valid density:
$$g(x) = \frac{1}{8}(6 - 2x) \ge 0 \text{ for } 0 < x < 2.$$
Secondly,
$$\int_0^2 g(x)\,dx = \frac{1}{8}\int_0^2 (6 - 2x)\,dx = \frac{1}{8}\left[6x - x^2\right]_0^2 = \frac{1}{8}(12 - 4) = 1.$$
The marginal density of Y is
$$h(y) = \int_{x=0}^{2} \frac{1}{8}(6 - x - y)\,dx = \frac{1}{8}\left[6x - \frac{x^2}{2} - xy\right]_{x=0}^{2} = \frac{1}{8}(10 - 2y), \quad 2 < y < 4, \text{ and } 0 \text{ elsewhere.}$$
Again $h(y) \ge 0$ and
$$\int_2^4 h(y)\,dy = \frac{1}{8}\int_2^4 (10 - 2y)\,dy = \frac{1}{8}\left[10y - y^2\right]_2^4 = \frac{1}{8}(24 - 16) = 1.$$
The conditional density of Y for X = 1 is
$$h(y \mid 1) = \frac{f(1, y)}{g(1)} = \frac{\frac{1}{8}(6 - 1 - y)}{\frac{1}{8}(6 - 2)} = \frac{1}{4}(5 - y), \quad 2 < y < 4, \text{ and } 0 \text{ elsewhere.}$$
Again this is a valid density, as $h(y \mid 1) \ge 0$ and
$$\int_2^4 h(y \mid 1)\,dy = \frac{1}{4}\int_2^4 (5 - y)\,dy = \frac{1}{4}\left[5y - \frac{y^2}{2}\right]_2^4 = \frac{1}{4}(12 - 8) = 1.$$
Now
$$P(X < 1 \mid Y < 3) = \frac{P(X < 1, Y < 3)}{P(Y < 3)}$$
The numerator $= \dfrac{3}{8}$ (from Example 19(b)).
The denominator is
$$P(Y < 3) = \int_2^3 h(y)\,dy = \frac{1}{8}\int_2^3 (10 - 2y)\,dy = \frac{1}{8}\left[10y - y^2\right]_2^3 = \frac{1}{8}(21 - 16) = \frac{5}{8}.$$
Hence
$$P(X < 1 \mid Y < 3) = \frac{3/8}{5/8} = \frac{3}{5}.$$
The Cumulative Distribution Function
Let f(x, y) be the joint density of (X, Y). We define the cumulative distribution function
as
$$F(x, y) = P(X \le x, Y \le y) = \int_{-\infty}^{x}\int_{-\infty}^{y} f(u, v)\,dv\,du.$$
Example 21 (See Exercise 5.77 on page 180)
The joint probability density of X and Y is given by
$$f(x, y) = \begin{cases} \dfrac{6}{5}(x + y^2) & 0 < x < 1, \; 0 < y < 1 \\ 0 & \text{elsewhere} \end{cases}$$
Find the cumulative distribution function F(x, y).
Solution
Case (i) x < 0.
$$F(x, y) = \int_{-\infty}^{x}\int_{-\infty}^{y} f(u, v)\,dv\,du = 0 \quad(\text{as } f(u, v) = 0 \text{ for any } u < 0).$$
Case (ii) y < 0. Again F(x, y) = 0, whatever be x.
Case (iii) $0 < x < 1$, $0 < y < 1$.
$$F(x, y) = \int_0^x \int_0^y \frac{6}{5}(u + v^2)\,dv\,du \;(\text{as } f(u, v) = 0 \text{ for } u < 0 \text{ or } v < 0) = \frac{6}{5}\int_0^x \left(uy + \frac{y^3}{3}\right)du = \frac{6}{5}\left(\frac{x^2 y}{2} + \frac{x y^3}{3}\right)$$
Case (iv) $0 < x < 1$, $y \ge 1$.
$$F(x, y) = \int_0^x \int_0^1 \frac{6}{5}(u + v^2)\,dv\,du = \frac{6}{5}\int_0^x \left(u + \frac{1}{3}\right)du = \frac{6}{5}\left(\frac{x^2}{2} + \frac{x}{3}\right)$$
Case (v) $x \ge 1$, $0 < y < 1$. As in case (iii) we can show
$$F(x, y) = \frac{6}{5}\left(\frac{y}{2} + \frac{y^3}{3}\right)$$
Case (vi) $x \ge 1$, $y \ge 1$.
$$F(x, y) = \int_0^1 \int_0^1 \frac{6}{5}(u + v^2)\,dv\,du = \frac{6}{5}\int_0^1\left(u + \frac{1}{3}\right)du = \frac{6}{5}\left(\frac{1}{2} + \frac{1}{3}\right) = 1. \;(\text{Did you anticipate this?})$$
Hence
$$P(0.2 < X < 0.5, \; 0.4 < Y < 0.6) = F(0.5, 0.6) - F(0.2, 0.6) - F(0.5, 0.4) + F(0.2, 0.4) \;(\text{Why?})$$
$$= \frac{6}{5}\left[\frac{(0.5)^2 - (0.2)^2}{2}\,(0.6 - 0.4) + \frac{0.5 - 0.2}{3}\left((0.6)^3 - (0.4)^3\right)\right] = \frac{6}{5}\,(0.021 + 0.0152) = \frac{6}{5}\,(0.0362) = 0.04344.$$
Example 22
The joint density of X and Y is
$$f(x, y) = \begin{cases} \dfrac{6}{5}(x + y^2) & 0 < x < 1, \; 0 < y < 1 \\ 0 & \text{elsewhere} \end{cases}$$
(a) Find the conditional probability density $g(x \mid y)$.
(b) Find $g\!\left(x \mid \tfrac{1}{2}\right)$.
(c) Find the mean of the conditional density of X given that $Y = \tfrac{1}{2}$.
Solution
$g(x \mid y) = \dfrac{f(x, y)}{h(y)}$, where h(y) is the marginal density of Y.
Thus
$$h(y) = \int_{x=0}^{1} f(x, y)\,dx = \frac{6}{5}\int_0^1 (x + y^2)\,dx = \frac{6}{5}\left(\frac{1}{2} + y^2\right), \quad 0 < y < 1.$$
Hence
$$g(x \mid y) = \frac{\frac{6}{5}(x + y^2)}{\frac{6}{5}\left(\frac{1}{2} + y^2\right)} = \frac{x + y^2}{\frac{1}{2} + y^2}, \quad 0 < x < 1, \text{ and } 0 \text{ elsewhere.}$$
$$\therefore g\!\left(x \mid \tfrac{1}{2}\right) = \frac{x + \frac{1}{4}}{\frac{1}{2} + \frac{1}{4}} = \frac{4}{3}\left(x + \frac{1}{4}\right), \quad 0 < x < 1.$$
Hence
$$E\!\left(X \mid Y = \tfrac{1}{2}\right) = \int_0^1 x\,g\!\left(x \mid \tfrac{1}{2}\right)dx = \frac{4}{3}\int_0^1 x\left(x + \frac{1}{4}\right)dx = \frac{4}{3}\left[\frac{x^3}{3} + \frac{x^2}{8}\right]_0^1 = \frac{4}{3}\left(\frac{1}{3} + \frac{1}{8}\right) = \frac{4}{3}\cdot\frac{11}{24} = \frac{11}{18}.$$
Example 23
(X, Y) has a joint density which is uniform on the rhombus with vertices $(1, 0)$, $(0, 1)$, $(-1, 0)$, $(0, -1)$
(i.e., the region $|x| + |y| \le 1$). Find
(a) the marginal density of X,
(b) the marginal density of Y,
(c) the conditional density of Y given $X = \tfrac{1}{2}$.
Solution
(X, Y) has uniform density on the rhombus means
$$f(x, y) = \frac{1}{\text{Area of the rhombus}} = \frac{1}{2} \text{ over the rhombus, and } 0 \text{ elsewhere.}$$
(a) Marginal density of X.
Case (i) 0 < x < 1:
$$g(x) = \int_{y=-(1-x)}^{1-x} \frac{1}{2}\,dy = 1 - x$$
Case (ii) –1 < x < 0:
$$g(x) = \int_{y=-(1+x)}^{1+x} \frac{1}{2}\,dy = 1 + x$$
Thus
$$g(x) = \begin{cases} 1 + x & -1 < x < 0 \\ 1 - x & 0 < x < 1 \\ 0 & \text{elsewhere} \end{cases}$$
(b) By symmetry the marginal density of Y is
$$h(y) = \begin{cases} 1 + y & -1 < y < 0 \\ 1 - y & 0 < y < 1 \\ 0 & \text{elsewhere} \end{cases}$$
(c) For $x = \tfrac{1}{2}$, y ranges from $-\tfrac{1}{2}$ to $\tfrac{1}{2}$.
Thus the conditional density of Y for $X = \tfrac{1}{2}$ is
$$h\!\left(y \mid \tfrac{1}{2}\right) = \frac{f\!\left(\tfrac{1}{2}, y\right)}{g\!\left(\tfrac{1}{2}\right)} = \frac{1/2}{1/2} = 1, \quad -\tfrac{1}{2} < y < \tfrac{1}{2}, \text{ and } 0 \text{ elsewhere.}$$
Similarly, for $x = -\tfrac{1}{3}$, Y ranges from $-\tfrac{2}{3}$ to $\tfrac{2}{3}$, so
$$h\!\left(y \mid -\tfrac{1}{3}\right) = \frac{1/2}{2/3} = \frac{3}{4}, \quad -\tfrac{2}{3} < y < \tfrac{2}{3}, \text{ and } 0 \text{ elsewhere.}$$
PROPERTIES OF EXPECTATIONS
Let X be a r.v. and a, b be constants. Then
(a) $E(aX + b) = a\,E(X) + b$
(b) $\mathrm{Var}(aX + b) = a^2\,\mathrm{Var}(X)$
If $X_1, X_2, \ldots, X_n$ are any n rvs,
$$E(X_1 + X_2 + \cdots + X_n) = E(X_1) + E(X_2) + \cdots + E(X_n)$$
But if $X_1, \ldots, X_n$ are n independent rvs, then
$$\mathrm{Var}(X_1 + X_2 + \cdots + X_n) = \mathrm{Var}(X_1) + \mathrm{Var}(X_2) + \cdots + \mathrm{Var}(X_n)$$
In particular, if X, Y are independent,
$$\mathrm{Var}(X + Y) = \mathrm{Var}(X - Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$$
Please note: whether we add X and Y or subtract Y from X, we always must add their
variances.
If X, Y are two rvs, we define their covariance
$$\mathrm{COV}(X, Y) = E\big[(X - \mu_1)(Y - \mu_2)\big], \quad\text{where } \mu_1 = E(X), \; \mu_2 = E(Y).$$
Th. If X, Y are independent, $E(XY) = E(X)\,E(Y)$ and $\mathrm{COV}(X, Y) = 0$.
Sample Mean
Let $X_1, X_2, \ldots, X_n$ be n independent rvs, each having the same mean $\mu$ and the same variance $\sigma^2$.
We define
$$\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$
$\bar{X}$ is called the mean of the rvs $X_1, \ldots, X_n$. Please note that $\bar{X}$ is also a rv.
Theorem
1. $E(\bar{X}) = \mu$
2. $\mathrm{Var}(\bar{X}) = \dfrac{\sigma^2}{n}$
Proof
(1) $E(\bar{X}) = \dfrac{1}{n}\big[E(X_1) + E(X_2) + \cdots + E(X_n)\big] = \dfrac{\mu + \mu + \cdots + \mu \;(n \text{ times})}{n} = \mu$
(2) $\mathrm{Var}(\bar{X}) = \dfrac{1}{n^2}\big[\mathrm{Var}(X_1) + \cdots + \mathrm{Var}(X_n)\big]$ (as the variables are independent)
$= \dfrac{\sigma^2 + \cdots + \sigma^2 \;(n \text{ times})}{n^2} = \dfrac{n\sigma^2}{n^2} = \dfrac{\sigma^2}{n}$
Sample Variance
Let $X_1, \ldots, X_n$ be n independent rvs, each having the same mean $\mu$ and the same variance $\sigma^2$. Let
$\bar{X} = \dfrac{X_1 + X_2 + \cdots + X_n}{n}$ be their sample mean. We define the sample variance as
$$S^2 = \frac{1}{n - 1}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2$$
Note $S^2$ is also a r.v., and
$$E(S^2) = \sigma^2.$$
Proof. Read it on page 179.
Simulation
To simulate the values taken by a continuous r.v. X, we have to use the following
theorem.
Theorem
Let X be a continuous r.v. with density f(x) and cumulative distribution function F(x). Let
$U = F(X)$. Then U is a r.v. having the uniform distribution on (0, 1).
In other words, U is a random number. Thus to simulate the value taken by X, we take a
random number U from Table 7 (now you must put a decimal point before the number) and
solve for X the equation
$$F(X) = U$$
Example 24
Let X have the uniform density on $(\alpha, \beta)$. Simulate the values of X using the 3-digit random
numbers
937, 133, 753, 503, …
Solution
Since X has uniform density on $(\alpha, \beta)$ its density is
$$f(x) = \begin{cases} \dfrac{1}{\beta - \alpha} & \alpha < x < \beta \\ 0 & \text{elsewhere} \end{cases}$$
Thus the cumulative distribution function is
$$F(x) = \begin{cases} 0 & x \le \alpha \\ \dfrac{x - \alpha}{\beta - \alpha} & \alpha < x \le \beta \\ 1 & x > \beta \end{cases}$$
$F(X) = U$ means $\dfrac{X - \alpha}{\beta - \alpha} = U$,
$$\therefore X = \alpha + U(\beta - \alpha).$$
Hence if $U = .937$, $X = \alpha + .937(\beta - \alpha)$;
if $U = .133$, $X = \alpha + .133(\beta - \alpha)$, etc.
Now let X have the exponential density (with parameter $\beta$):
$$f(x) = \begin{cases} \dfrac{1}{\beta}\,e^{-x/\beta} & x > 0 \\ 0 & \text{elsewhere} \end{cases}$$
Hence the cumulative distribution function is
$$F(x) = \begin{cases} 0 & x \le 0 \\ 1 - e^{-x/\beta} & x > 0 \end{cases}$$
Thus solving $F(X) = U$ (i.e., $1 - e^{-X/\beta} = U$) for X, we get
$$X = \beta\,\ln\frac{1}{1 - U}$$
Since U being a random number implies 1 – U is also a random number, we can as well use the
formula
$$X = \beta\,\ln\frac{1}{U} = -\beta\,\ln U.$$
Example 25
X has exponential density with parameter 2. Simulate a few values of X.
Solution
The defining equation for X is
$$X = -2\,\ln U$$
Taking 3-digit random numbers from Table 7, page 595, row 21, col. 3, we get the random
numbers 913, 516, 692, 007, etc.
The corresponding X values are:
$$-2\ln(.913), \quad -2\ln(.516), \quad -2\ln(.692), \ldots$$
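The inverse-transform rule $F(X) = U$ used in Example 25 is only a couple of lines of code. In the sketch below (an illustration, not from the text) random.random() stands in for Table 7, and the seed and helper name are arbitrary choices.

```python
import math
import random

def simulate_exponential(beta, u):
    return -beta * math.log(u)          # X = -beta ln U solves 1 - exp(-X/beta) = 1 - U

# reproduce the hand computation with the tabled numbers ...
print([round(simulate_exponential(2, u), 3) for u in (0.913, 0.516, 0.692, 0.007)])
# ... or draw fresh values
random.seed(0)
print([round(simulate_exponential(2, random.random()), 3) for _ in range(4)])
```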
Example 26
The density of a rv X is given by
$$f(x) = |x|, \; -1 < x < 1, \text{ and } 0 \text{ elsewhere.}$$
Simulate a few values of X.
Solution
First let us find the cumulative distribution function F(x).
Case (i) $x \le -1$. In this case F(x) = 0.
Case (ii) $-1 < x \le 0$.
$$F(x) = \int_{-\infty}^{x} f(t)\,dt = \int_{-1}^{x} (-t)\,dt = \frac{1 - x^2}{2}$$
Case (iii) $0 < x \le 1$. In this case
$$F(x) = \int_{-\infty}^{x} f(t)\,dt = \int_{-1}^{0} (-t)\,dt + \int_0^x t\,dt = \frac{1}{2} + \frac{x^2}{2} = \frac{1 + x^2}{2}$$
Case (iv) x > 1. In this case F(x) = 1.
Thus
$$F(x) = \begin{cases} 0 & x \le -1 \\ \dfrac{1 - x^2}{2} & -1 < x \le 0 \\ \dfrac{1 + x^2}{2} & 0 < x \le 1 \\ 1 & x > 1 \end{cases}$$
To simulate a value for X, we have to solve the equation F(X) = U for X.
Case (i) $0 \le U < \dfrac{1}{2}$. In this case we use the equation
$$F(X) = \frac{1 - X^2}{2} = U \;(\text{why?}), \quad\therefore X = -\sqrt{1 - 2U} \;(\text{why?})$$
Case (ii) $\dfrac{1}{2} \le U < 1$. In this case we solve for X the equation
$$F(X) = \frac{1 + X^2}{2} = U, \quad\therefore X = +\sqrt{2U - 1}$$
Thus the defining conditions are:
if $0 \le U < \dfrac{1}{2}$, $X = -\sqrt{1 - 2U}$, and if $\dfrac{1}{2} \le U < 1$, $X = +\sqrt{2U - 1}$.
Let us consider the 3-digit random numbers on page 594, Row 17, Col. 5:
726, 282, 272, 022, …
$U = .726 \ge \dfrac{1}{2}$; thus $X = +\sqrt{2 \times .726 - 1} = 0.672$.
$U = .282 < \dfrac{1}{2}$; thus $X = -\sqrt{1 - 2 \times .282} = -0.660$, and so on.
Note: Most of the computers have built-in programs which generate random deviates
from important distributions. Especially, we can invoke the random deviates from a
standard normal distribution. You may also want to study how to simulate values from a
standard normal distribution by the Box-Muller-Marsaglia method given on page 190 of the
text book.
Example 27
Suppose the number of hours it takes a person to learn how to operate a certain machine is a
random variable having a normal distribution with $\mu = 5.8$ and $\sigma = 1.2$. Suppose it takes
two persons to operate the machine. Simulate the time it takes four pairs of persons to
learn how to operate the machine. That is, for each pair, calculate the maximum of the
two learning times.
Solution
We use the Box-Muller-Marsaglia method to generate pairs of values $z_1, z_2$ taken by a
standard normal variable:
$$z_1 = \sqrt{-2\ln u_1}\,\cos(2\pi u_2), \qquad z_2 = \sqrt{-2\ln u_1}\,\sin(2\pi u_2)$$
(the angles are expressed in radians). Then we use the formulas
$$x_1 = \mu + \sigma z_1, \qquad x_2 = \mu + \sigma z_2$$
(where $\mu = 5.8$, $\sigma = 1.2$) to simulate the times taken by a pair of persons.
We start with the random numbers from Table 7,
page 593, Row 19, Column 4:
729, 016, 672, 823, 375, 556, 424, 854

U1     U2     Z1       Z2       X1      X2
.729   .016   -0.378   -0.991   5.346   4.611
etc.
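The Box-Muller step in Example 27 can be written directly in code. The sketch below is an illustration only: random.random() replaces Table 7 (so the numbers will not match the hand-worked table above), and the seed and loop count are arbitrary.

```python
import math
import random

def box_muller(u1, u2):
    r = math.sqrt(-2.0 * math.log(u1))
    return r * math.cos(2 * math.pi * u2), r * math.sin(2 * math.pi * u2)

mu, sigma = 5.8, 1.2
random.seed(3)
for _ in range(4):                       # four pairs of persons
    z1, z2 = box_muller(random.random(), random.random())
    x1, x2 = mu + sigma * z1, mu + sigma * z2
    print(round(max(x1, x2), 3))         # time for the pair = slower of the two
```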
Review Exercises
5.108. If the probability density of a r.v. X is given by
$$f(x) = \begin{cases} k(1 - x^2) & 0 < x < 1 \\ 0 & \text{elsewhere} \end{cases}$$
find the value of k and the probabilities
(a) $P(0.1 < X < 0.2)$
(b) $P(X > 0.5)$
Solution
$\int_{-\infty}^{\infty} f(x)\,dx = 1$ gives
$$k\int_0^1 (1 - x^2)\,dx = k\left(1 - \frac{1}{3}\right) = 1, \quad\therefore k = \frac{3}{2}.$$
The cumulative distribution function F(x) of X is:
Case (i) $x \le 0$: $F(x) = 0$.
Case (ii) $0 < x \le 1$: $F(x) = k\int_0^x (1 - t^2)\,dt = \dfrac{3}{2}\left(x - \dfrac{x^3}{3}\right)$.
Case (iii) $x > 1$: $F(x) = 1$.
$$\therefore P(0.1 < X < 0.2) = F(0.2) - F(0.1) = \frac{3}{2}\left[\left(0.2 - \frac{(0.2)^3}{3}\right) - \left(0.1 - \frac{(0.1)^3}{3}\right)\right] = 0.1465$$
$$P(X > 0.5) = 1 - P(X \le 0.5) = 1 - F(0.5) = 1 - \frac{3}{2}\left(0.5 - \frac{(0.5)^3}{3}\right) = 0.3125$$
5.113: The burning time X of an experimental rocket is a r.v. having the normal
distribution with $\mu = 4.76$ sec and $\sigma = 0.04$ sec. What is the probability that this kind of rocket
will burn
(a) < 4.66 sec
(b) > 4.80 sec
(c) anywhere from 4.70 to 4.82 sec?
Solution
(a) $P(X < 4.66) = P\!\left(\dfrac{X - \mu}{\sigma} < \dfrac{4.66 - 4.76}{0.04}\right) = P(Z < -2.5) = 1 - F(2.5) = 1 - 0.9938 = 0.0062$
(b) $P(X > 4.80) = P\!\left(\dfrac{X - \mu}{\sigma} > \dfrac{4.80 - 4.76}{0.04}\right) = P(Z > 1) = 1 - F(1) = 1 - 0.8413 = 0.1587$
(c) $P(4.70 < X < 4.82) = P\!\left(\dfrac{4.70 - 4.76}{0.04} < \dfrac{X - \mu}{\sigma} < \dfrac{4.82 - 4.76}{0.04}\right) = P(-1.5 < Z < 1.5) = 2F(1.5) - 1 = 2 \times 0.9332 - 1 = 0.8664$
5.11: The probability density of the time (in milliseconds) between the emission of beta particles
is a r.v. X having the exponential density
$$f(x) = \begin{cases} 0.25\,e^{-0.25x} & x > 0 \\ 0 & \text{elsewhere} \end{cases}$$
Find the probability that
(a) the time to observe a particle is more than 200 microseconds ($= 200 \times 10^{-3}$ milliseconds),
(b) the time to observe a particle is less than 10 microseconds.
Solution
(a) $P(\text{time} > 200 \text{ microseconds}) = P(X > 200 \times 10^{-3})$
$$= \int_{200 \times 10^{-3}}^{\infty} 0.25\,e^{-0.25x}\,dx = \left[-e^{-0.25x}\right]_{200 \times 10^{-3}}^{\infty} = e^{-50 \times 10^{-3}} = e^{-0.05} \approx 0.951$$
(b) $P(\text{time} < 10 \text{ microseconds}) = P(X < 10 \times 10^{-3})$
$$= \int_0^{10 \times 10^{-3}} 0.25\,e^{-0.25x}\,dx = 1 - e^{-2.5 \times 10^{-3}} \approx 0.0025$$
5.120: If n salespeople are employed in a door-to-door selling campaign, the gross sales
volume in thousands of dollars may be regarded as a r.v. having the Gamma distribution
with $\alpha = 100\sqrt{n}$ and $\beta = \dfrac{1}{2}$. If the sales costs are \$5,000 per salesperson, how many
salespersons should be employed to maximize the profit?
Solution
For a Gamma distribution $\mu = \alpha\beta = 50\sqrt{n}$. Thus (in thousands of dollars) the "average"
profit when n persons are employed is
$$T = 50\sqrt{n} - 5n$$
(5 × 1000 per person is the cost per person).
This is a maximum (using calculus) when n = 25.
5.122: Let the times to breakdown for the processors of a parallel processing machine
have joint density
$$f(x, y) = \begin{cases} 0.04\,e^{-0.2x - 0.2y} & x > 0, \; y > 0 \\ 0 & \text{elsewhere} \end{cases}$$
where X is the time for the first processor and Y is the time for the 2nd processor. Find
(a) the marginal distributions and their means,
(b) the expected value of the sum of X and Y,
(c) verify that the mean of the sum is the sum of the means.
Solution
(a) Marginal density of X:
$$g(x) = \int_{y=-\infty}^{\infty} f(x, y)\,dy = \int_0^{\infty} 0.04\,e^{-0.2x}e^{-0.2y}\,dy = 0.2\,e^{-0.2x}\int_0^{\infty} 0.2\,e^{-0.2y}\,dy = 0.2\,e^{-0.2x}, \; x > 0$$
(and 0 if $x \le 0$).
By symmetry, the marginal distribution of Y is
$$h(y) = \begin{cases} 0.2\,e^{-0.2y} & y > 0 \\ 0 & \text{elsewhere} \end{cases}$$
Since X (and Y) have exponential distributions (with parameter $\beta = \dfrac{1}{0.2} = 5$), E(X) = E(Y) = 5.
Note: since $f(x, y) = g(x)\,h(y)$, X and Y are independent.
(b), (c)
$$E(X + Y) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (x + y)\,f(x, y)\,dx\,dy = \int_0^{\infty}\int_0^{\infty} (x + y)\,0.04\,e^{-0.2x - 0.2y}\,dx\,dy$$
$$= \int_0^{\infty}\int_0^{\infty} x \cdot 0.04\,e^{-0.2x - 0.2y}\,dx\,dy + \int_0^{\infty}\int_0^{\infty} y \cdot 0.04\,e^{-0.2x - 0.2y}\,dx\,dy = 5 + 5 = 10 \;(\text{verify!})$$
$$= E(X) + E(Y)$$
5.123: Two random variables are independent and each has a binomial distribution with
success probability 0.7 and 2 trials.
(a) Find the joint probability distribution.
(b) Find the probability that the 2nd variable is greater than the first.
Solution
Let X, Y be independent, each having the Binomial distribution with parameters n = 2 and
p = 0.7. Thus
$$P(X = k) = \binom{2}{k}(0.7)^k(0.3)^{2-k}, \; k = 0, 1, 2, \qquad P(Y = r) = \binom{2}{r}(0.7)^r(0.3)^{2-r}, \; r = 0, 1, 2.$$
$$\therefore P(X = k, Y = r) = P(X = k)\,P(Y = r) \;(\text{as X, Y are independent}) = \binom{2}{k}\binom{2}{r}(0.7)^{k+r}(0.3)^{4-k-r}, \quad 0 \le k, r \le 2.$$
(b)
$$P(Y > X) = P(Y = 2, X = 0 \text{ or } 1) + P(Y = 1, X = 0)$$
$$= \binom{2}{2}(0.7)^2\left[\binom{2}{0}(0.3)^2 + \binom{2}{1}(0.7)(0.3)\right] + \binom{2}{1}(0.7)(0.3)\binom{2}{0}(0.3)^2$$
$$= 0.49\,(0.09 + 0.42) + 0.42 \times 0.09 = 0.2499 + 0.0378 = 0.2877$$
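Exercise 5.123(b) can also be verified by brute-force enumeration of the 3 × 3 joint distribution. A short sketch (illustrative names; math.comb is available in Python 3.8+):

```python
from math import comb

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

p = 0.7
prob_y_greater = sum(binom_pmf(k, 2, p) * binom_pmf(r, 2, p)
                     for k in range(3) for r in range(3) if r > k)
print(prob_y_greater)   # 0.2877
```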
5.124: If $X_1$ has mean –5 and variance 3 while $X_2$ has mean 1 and variance 4, and the two are
independent, find
(a) $E(3X_1 + 5X_2 + 2)$
(b) $\mathrm{Var}(3X_1 + 5X_2 + 2)$
Ans:
(a) $3(-5) + 5(1) + 2 = -8$
(b) $9 \times 3 + 25 \times 4 = 127$
SAMPLING DISTRIBUTION
Statistical Inference
Suppose we want to know the average height of an Indian or the average life length of a
bulb manufactured by a company, etc. Obviously we cannot burn out every bulb and find
the mean life length. One chooses at random, say, n bulbs, finds their life lengths
$X_1, X_2, \ldots, X_n$ and takes the mean life length $\bar{X} = \dfrac{X_1 + X_2 + \cdots + X_n}{n}$ as an 'approximation'
to the actual (unknown) mean life length. Thus we make a statement about the
"population" (of all life lengths) by looking at a sample of it. This is the basis behind
statistical inference. The whole theory of statistical inference tells us how close we are to
the true (unknown) characteristic of the population.
Random Sample of size n
In the above example, let X be the life length of a bulb manufactured by the company.
Thus X is a rv which can assume values > 0. It will have a certain distribution and a
certain mean $\mu$, etc. When we make n independent observations, we get n values
$x_1, x_2, \ldots, x_n$. Clearly if we again take n observations, we would get different values $y_1, y_2, \ldots, y_n$. Thus we
may say:
Definition
Let X be a random variable. A random sample of size n from X is a finite ordered
sequence $\{X_1, X_2, \ldots, X_n\}$ of n independent rvs such that each $X_i$ has the same
distribution as that of X.
Sampling from a finite population
Suppose there is a universe having a finite number of elements only (like the number of
Indians, the number of females in the USA who are blondes, etc.). A sample of size n from
the above is a subset of n elements such that each subset of n elements has the same probability
of being selected.
Statistics
Whenever we sample, we use a characteristic of the sample to make a statement about the
population. For example, suppose the true mean height of an Indian is $\mu$ (cms). To make a
statement about $\mu$, we randomly select n Indians, find their heights $\{X_1, X_2, \ldots, X_n\}$ and
then their mean, namely
$$\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$
We then use $\bar{X}$ as an estimate of the unknown parameter $\mu$. Remember $\mu$ is a
parameter, a constant that is unchanged. But the sample mean $\bar{X}$ is a r.v. It may assume
different values depending on the sample of n Indians chosen.
Definition: Let X be a r.v. Let $\{X_1, X_2, \ldots, X_n\}$ be a sample of size n from X. A statistic
is a function of the sample $\{X_1, X_2, \ldots, X_n\}$.
Some Important Statistics
1. The sample mean $\bar{X} = \dfrac{X_1 + X_2 + \cdots + X_n}{n}$
2. The sample variance $S^2 = \dfrac{1}{n - 1}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2$
3. The minimum of the sample $K = \min\{X_1, X_2, \ldots, X_n\}$
4. The maximum of the sample $M = \max\{X_1, X_2, \ldots, X_n\}$
5. The range of the sample $R = M - K$
Definition
If $X_1, \ldots, X_n$ is a random sample of size n and if $\hat{X}$ is a statistic, then we remember $\hat{X}$ is
also a r.v. Its distribution is referred to as the sampling distribution of $\hat{X}$.
The Sampling Distribution of the Sample Mean $\bar{X}$
Suppose X is a r.v. with mean $\mu$ and variance $\sigma^2$. Let $X_1, X_2, \ldots, X_n$ be a random sample
of size n from X. Let $\bar{X} = \dfrac{X_1 + X_2 + \cdots + X_n}{n}$ be the sample mean. Then
(a) $E(\bar{X}) = \mu$.
(b) $V(\bar{X}) = \dfrac{\sigma^2}{n}$.
(c) If $X_1, \ldots, X_n$ is a random sample from a finite population with N elements, then
$\mathrm{Var}(\bar{X}) = \dfrac{\sigma^2}{n}\cdot\dfrac{N - n}{N - 1}$.
(d) If X is normal, $\bar{X}$ is also normal.
(e) Whatever be the distribution of X, if n is "large", $\dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ has approximately the
standard normal distribution. (This result is known as the central limit theorem.)
Explanation
(a) tells us that we can "expect" the sample mean $\bar{X}$ to be an approximation to
the population mean $\mu$.
(b) tells us that the spread of $\bar{X}$ about $\mu$ is small when the sample size n is
large.
(d) says that if X has a normal distribution, $\dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ has exactly the standard normal
distribution.
(e) says that whatever be the distribution of X, discrete or continuous,
$\dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ has approximately the standard normal distribution if n is large.
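Statement (e) is easy to see empirically. The sketch below (an illustration, not part of the text) draws repeated samples from a decidedly non-normal exponential population and shows that the sample means concentrate around $\mu$ with spread close to $\sigma/\sqrt{n}$; the population, n and the number of repetitions are arbitrary choices.

```python
import random
import statistics

random.seed(42)
mu = 5.0                     # exponential with mean 5, so sigma = 5 as well
n = 50
means = [statistics.fmean(random.expovariate(1 / mu) for _ in range(n))
         for _ in range(2000)]
print(statistics.fmean(means))   # close to mu = 5
print(statistics.stdev(means))   # close to sigma / sqrt(n) = 5 / sqrt(50) ~ 0.707
```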
Example 1 (See Exercise 6.14, page 207)
The mean of a random sample of size n = 25 is used to estimate the mean of an infinite
population with standard deviation $\sigma = 2.4$. What can we assert about the probability that the
error will be less than 1.2 if we use
(a) Chebyshev's theorem,
(b) the central limit theorem?
Solution
(a) We know the sample mean $\bar{X}$ is a rv with $E(\bar{X}) = \mu$ and $\mathrm{Var}(\bar{X}) = \dfrac{\sigma^2}{n}$.
Chebyshev's theorem tells us that for any r.v. T,
$$P\big(|T - E(T)| < k\sqrt{\mathrm{Var}(T)}\big) \ge 1 - \frac{1}{k^2}$$
Taking $T = \bar{X}$, and noting $E(T) = E(\bar{X}) = \mu$ and $\mathrm{var}(T) = \mathrm{var}(\bar{X}) = \dfrac{\sigma^2}{n} = \dfrac{(2.4)^2}{25}$, we find
$$P\!\left(|\bar{X} - \mu| < k \cdot \frac{2.4}{5}\right) \ge 1 - \frac{1}{k^2}.$$
Desired: $P(|\bar{X} - \mu| < 1.2)$.
$k \cdot \dfrac{2.4}{5} = 1.2$ gives $k = \dfrac{5}{2}$.
Thus we can assert, using Chebyshev's theorem, that
$$P(|\bar{X} - \mu| < 1.2) \ge 1 - \frac{1}{25/4} = 1 - \frac{4}{25} = 0.84$$
(b) The central limit theorem says $\dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \dfrac{\bar{X} - \mu}{2.4/5}$ is approximately standard normal.
Thus
$$P(|\bar{X} - \mu| < 1.2) = P\!\left(\frac{|\bar{X} - \mu|}{2.4/5} < \frac{5}{2}\right) \approx P\!\left(|Z| < \frac{5}{2}\right) = 2F(2.5) - 1 = 2 \times 0.9938 - 1 = 0.9876$$
Example 2 (See Exercise 6.15 on page 207)
A random sample of size 100 is taken from an infinite population having mean $\mu = 76$
and variance $\sigma^2 = 256$. What is the probability that $\bar{X}$ will be between 75 and 78?
Solution
We use the central limit theorem, namely that $\dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ is approximately standard normal.
Required:
$$P(75 < \bar{X} < 78) = P\!\left(\frac{75 - 76}{16/10} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < \frac{78 - 76}{16/10}\right) \approx P\!\left(-\frac{5}{8} < Z < \frac{10}{8}\right)$$
$$= F\!\left(\frac{5}{4}\right) - F\!\left(-\frac{5}{8}\right) = F\!\left(\frac{5}{4}\right) + F\!\left(\frac{5}{8}\right) - 1 = 0.8944 + 0.7340 - 1 = 0.6284$$
Example 3 (See Exercise 6.17 on page 217)
If the distribution of the weights of all men travelling by air between Dallas and El Paso has
a mean of 163 pounds and a s.d. of 18 pounds, what is the probability that the combined gross
weight of 36 men travelling on a plane between these two cities is more than 6000
pounds?
Solution
Let X be the weight of a man travelling by air between D and E. It is given that X is a rv
with mean $\mu = E(X) = 163$ lbs and sd $\sigma = 18$ lbs.
Let $X_1, X_2, \ldots, X_{36}$ be the weights of 36 men travelling on a plane between these two cities.
Thus we can regard $\{X_1, X_2, \ldots, X_{36}\}$ as a random sample of size 36 from X.
Required:
$$P(X_1 + X_2 + \cdots + X_{36} > 6000) = P\!\left(\bar{X} > \frac{6000}{36}\right) = P\!\left(\frac{\bar{X} - 163}{18/6} > \frac{1000/6 - 163}{3}\right) \approx P\!\left(Z > \frac{22}{18}\right)$$
by the central limit theorem
$$= P(Z > 1.22) = 1 - F(1.22) = 1 - 0.8888 = 0.1112$$
The sampling distribution of the sample mean $\bar{X}$ (when $\sigma$ is unknown)
Theorem
Let X be a rv having a normal distribution with mean $E(X) = \mu$. Let $\bar{X}$ be the sample
mean and $S^2$ the sample variance of a random sample of size n from X.
Then the rv
$$t = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$
has (Student's) t-distribution with n – 1 degrees of freedom.
Remark
(1) The shape of the density curve of the t-distribution (with parameter $\nu$, Greek nu)
is like that of the standard normal distribution and is symmetrical about the y-axis.
$t_{\nu,\alpha}$ is that unique number such that
$$P(t > t_{\nu,\alpha}) = \alpha \quad (\nu \to \text{the parameter})$$
By symmetry $t_{\nu,1-\alpha} = -t_{\nu,\alpha}$.
The values of $t_{\nu,\alpha}$ for various $\nu$ and $\alpha$ are tabulated in Table 4.
For $\nu$ large, $t_{\nu,\alpha} \approx z_\alpha$.
Example 4 (See Exercise 6.20 on page 213)
A random sample of size 25 from a normal population has the mean $\bar{x} = 47.5$ and the s.d.
s = 8.4. Does this information tend to support or refute the claim that the mean of the
population is $\mu = 42.1$?
Solution:
$t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$ has a t-distribution with parameter $\nu = n - 1$.
Here $\mu = 42.1$, $s = 8.4$, $n = 25$.
$$t_{n-1,\,0.005} = t_{24,\,0.005} = 2.797$$
Thus $P(t > 2.797) = 0.005$,
or $P\!\left(\dfrac{\bar{X} - \mu}{s/\sqrt{n}} > 2.797\right) = 0.005$,
or $P\!\left(\bar{X} > 42.1 + 2.797 \times \dfrac{8.4}{5}\right) = 0.005$,
or $P(\bar{X} > 46.8) = 0.005$.
This means that when $\mu = 42.1$, only in about 0.5 percent of the cases would we get an
$\bar{X} > 46.8$. Since the observed mean 47.5 exceeds this, we will have to refute the claim $\mu = 42.1$ (in favour of $\mu > 42.1$).
Example 5 (See Exercise 6.21 on page 213)
The following are the times between six calls for an ambulance (in a certain city) and the
patients' arrival at the hospital: 27, 15, 20, 32, 18 and 26 minutes. Use these figures to
judge the reasonableness of the ambulance service's claim that it takes on the average 20
minutes between the call for an ambulance and the patient's arrival at the hospital.
Solution
Let X = time (in minutes) between the call for an ambulance and the patient's arrival at
the hospital. We assume X has a normal distribution. (When nothing is given, we assume
normality.) We want to judge the reasonableness of the claim that $\mu = E(X) = 20$ minutes.
For this we recorded the times for 6 calls. So we have a random sample of size 6 from X
with
$$X_1 = 27, \; X_2 = 15, \; X_3 = 20, \; X_4 = 32, \; X_5 = 18, \; X_6 = 26.$$
Thus
$$\bar{X} = \frac{27 + 15 + 20 + 32 + 18 + 26}{6} = \frac{138}{6} = 23.$$
$$S^2 = \frac{1}{6 - 1}\left[(27 - 23)^2 + (15 - 23)^2 + (20 - 23)^2 + (32 - 23)^2 + (18 - 23)^2 + (26 - 23)^2\right] = \frac{16 + 64 + 9 + 81 + 25 + 9}{5} = \frac{204}{5}$$
Hence $S = \sqrt{\dfrac{204}{5}}$.
We calculate
$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{23 - 20}{\sqrt{204/5}\,/\sqrt{6}} = 1.150$$
Now $t_{n-1,\,\alpha} = t_{5,\,\alpha} = 2.015$ for $\alpha = 0.05$, and $= 1.476$ for $\alpha = 0.10$.
Since our observed $t = 1.150 < t_{5,\,0.10}$,
we can say that it is reasonable to assume that the average time is $\mu = 20$ minutes.
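The arithmetic of Example 5 is a one-sample t statistic, which can be reproduced as follows. This sketch assumes scipy is available; the p-value printed by ttest_1samp is extra information not used in the notes' table comparison.

```python
import statistics
from scipy import stats

times = [27, 15, 20, 32, 18, 26]
xbar = statistics.fmean(times)
s = statistics.stdev(times)                       # uses the n - 1 divisor, as above
t = (xbar - 20) / (s / len(times) ** 0.5)
print(xbar, s**2, round(t, 3))                    # 23, 40.8, 1.15

t_scipy, p_value = stats.ttest_1samp(times, 20)
print(round(t_scipy, 3), round(p_value, 3))       # same t statistic
```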
Example 6
A process for making certain bearings is under control if the diameters of the bearings
have a mean of 0.5000 cm. What can we say about this process if a sample of 10 of these
bearings has a mean diameter of 0.5060 cm and sd 0.0040 cm?
Solution
Hint: $t_{9,\,0.005} = 3.25$, so
$$P\!\left(\left|\frac{\bar{X} - 0.5}{0.004/\sqrt{10}}\right| < 3.25\right) = 0.99, \quad\text{i.e.}\quad P(0.496 < \bar{X} < 0.504) = 0.99 \;\left(\text{since } 3.25 \times \frac{0.004}{\sqrt{10}} \approx 0.004\right).$$
Since the observed $\bar{X} = 0.506 > 0.504$,
the process is not under control.
Sampling Distribution of $S^2$ (the sample variance)
Theorem
If $S^2$ is the sample variance of a random sample of size n taken from a normal
population with (population) variance $\sigma^2$, then
$$\chi^2 = \frac{(n - 1)S^2}{\sigma^2} = \frac{1}{\sigma^2}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2$$
is a random variable having the chi-square distribution with parameter $\nu = n - 1$.
Remark
Since $S^2 > 0$, the rv has positive density only to the right of the origin. $\chi^2_{\nu,\alpha}$ is that unique
number such that $P(\chi^2 > \chi^2_{\nu,\alpha}) = \alpha$, and is tabulated for some $\alpha$'s
and $\nu$ in Table 5.
Example 7 (See Exercise 6.24 on page 213)
A random sample of 10 observations is taken from a normal population having the
variance $\sigma^2 = 42.5$. Find approximately the probability of obtaining a sample standard
deviation S between 3.14 and 8.94.
Solution
Required:
$$P(3.14 < S < 8.94) = P\big((3.14)^2 < S^2 < (8.94)^2\big) = P\!\left(\frac{9}{42.5}(3.14)^2 < \frac{(n - 1)S^2}{\sigma^2} < \frac{9}{42.5}(8.94)^2\right)$$
$$= P(2.088 < \chi^2 < 16.92)$$
(From Table 5, $\chi^2_{9,\,0.05} = 16.919$ and $\chi^2_{9,\,0.99} = 2.088$.)
$$= P(\chi^2 > 2.088) - P(\chi^2 > 16.919) = 0.99 - 0.05 = 0.94 \;(\text{approx}).$$
Example 8 (See Exercise 6.23 on page 213)
The claim that the variance of a normal population is $\sigma^2 = 21.3$ is rejected if the
variance of a random sample of size 15 exceeds 39.74. What is the probability that the claim
will be rejected even though $\sigma^2 = 21.3$?
Solution
The probability that the claim is rejected is
$$P(S^2 > 39.74) = P\!\left(\frac{(n - 1)S^2}{\sigma^2} > \frac{14}{21.3} \times 39.74\right) = P(\chi^2 > 26.12) = 0.025$$
(as from Table 5, $\chi^2_{14,\,0.025} = 26.12$).
Theorem
If $S_1^2, S_2^2$ are the variances of two independent random samples of sizes $n_1, n_2$
respectively, taken from two normal populations having the same variance, then
$$F = \frac{S_1^2}{S_2^2}$$
is a rv having the (Snedecor's) F distribution with parameters $\nu_1 = n_1 - 1$ and $\nu_2 = n_2 - 1$.
Remark
1. $n_1 - 1$ is called the numerator degrees of freedom and $n_2 - 1$ is called the
denominator degrees of freedom.
2. If F is a rv having $(\nu_1, \nu_2)$ degrees of freedom, then $F_{\nu_1,\nu_2,\alpha}$ is that unique number
such that
$$P(F > F_{\nu_1,\nu_2,\alpha}) = \alpha$$
and is tabulated for $\alpha = 0.05$ in Table 6(a) and for $\alpha = 0.01$ in Table 6(b).
We also note the fact:
$$F_{\nu_1,\nu_2,1-\alpha} = \frac{1}{F_{\nu_2,\nu_1,\alpha}}$$
Thus $F_{10,20,0.95} = \dfrac{1}{F_{20,10,0.05}} = \dfrac{1}{2.77} = 0.36$.
Example 9
(a) F_{12,15,0.95} = 1/F_{15,12,0.05} = 1/2.62 = 0.38
(b) F_{6,20,0.99} = 1/F_{20,6,0.01} = 1/7.40 = 0.135
Example 10 (See Exercise on page 213)
If independent random samples of size n₁ = n₂ = 8 come from two normal populations
having the same variance, what is the prob that either sample variance will be at least
seven times as large as the other?
Solution
Let S₁², S₂² be the sample variances of the two samples.
Reqd P(S₁² > 7S₂² or S₂² > 7S₁²)
= P(S₁²/S₂² > 7 or S₂²/S₁² > 7)
= 2 P(F > 7)
where F is a rv having F distribution with (7,7) degrees of freedom
= 2 x 0.01 = 0.02 (from table 6(b)).
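A quick sketch (assuming SciPy) that verifies both the probability in Example 10 and the reciprocal relation F_{ν₁,ν₂,1-α} = 1/F_{ν₂,ν₁,α} noted above; scipy.stats.f is Snedecor's F distribution.

    from scipy.stats import f

    # Example 10: P(either sample variance is at least 7 times the other)
    print(round(2 * f.sf(7, dfn=7, dfd=7), 3))      # about 0.02

    # Reciprocal relation: F_{10,20,0.95} = 1 / F_{20,10,0.05} = 0.36
    upper_95 = f.ppf(0.05, dfn=10, dfd=20)          # point exceeded with prob 0.95
    print(round(upper_95, 2), round(1 / f.ppf(0.95, dfn=20, dfd=10), 2))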
Example 11 (see exercise 6.38 on page 215)
If two independent random samples of sizes n₁ = 9 and n₂ = 16 are taken from a normal
population, what is the prob that the variance of the first sample will be at least four times
as large as the variance of the second sample?
Hint: Reqd prob = P(S₁² > 4S₂²) = P(S₁²/S₂² > 4)
= P(F > 4) = 0.01 (as F_{8,15,0.01} = 4)
Example 12 (See Exercise 6.29 on page 214)
The F distribution with (4,4) degrees of freedom is given by
f(F) = 6F(1 + F)⁻⁴  for F > 0
     = 0            for F ≤ 0
If random samples of size 5 are taken from two normal populations having the same
variance, find the prob that the ratio of the larger to the smaller sample variance will
exceed 3?
Solution
Let S₁², S₂² be the sample variances of the two random samples.
Reqd P(S₁² > 3S₂² or S₂² > 3S₁²)
= 2 P(S₁²/S₂² > 3) = 2 P(F > 3)
where F is a rv having the F distribution with (4,4) degrees of freedom
= 2 ∫₃^∞ 6F(1 + F)⁻⁴ dF
= 12 ∫₃^∞ [ 1/(1 + F)³ - 1/(1 + F)⁴ ] dF
= 12 [ -1/(2(1 + F)²) + 1/(3(1 + F)³) ]₃^∞
= 12 [ 1/(2 × 16) - 1/(3 × 64) ]
= 12 [ 1/32 - 1/192 ] = 12 × 5/192 = 5/16
Inferences Concerning Means
We shall discuss how we can make statements about the mean of a population from the
knowledge of the mean of a random sample. That is, we 'estimate' the mean of a
population based on a random sample.
Point Estimation
Here we use a statistic to estimate the parameter of a distribution representing a
population. For example, if we can assume that the life length of a transistor is a r.v.
having an exponential distribution with (unknown) parameter β, then β can be estimated by
some statistic, say X̄, the mean of a random sample. Or we may say the sample mean is
an estimate of the parameter β.
Definition
Let θ be a parameter associated with the distribution of a r.v. A statistic θ̂ (based on a
random sample of size n) is said to be an unbiased estimate (≡ estimator) of θ if
E(θ̂) = θ. That is, θ̂ will be on the average close to θ.
Example
Let X be a rv; µ the mean of X. If X̄ is the sample mean, then we know E(X̄) = µ. Thus
we may say the sample mean X̄ is an unbiased estimate of µ. (Note X̄ is a rv, a
statistic, X̄ = (X₁ + X₂ + ... + Xₙ)/n, a function of the random sample X₁, X₂, ..., Xₙ.)
If ω₁, ω₂, ..., ωₙ are any n non-negative numbers ≤ 1 such that
ω₁ + ω₂ + ... + ωₙ = 1, then we can easily see that ω₁X₁ + ω₂X₂ + ... + ωₙXₙ is also an
unbiased estimate of µ. (Prove this.) X̄ is got as a special case by taking
ω₁ = ω₂ = ... = ωₙ = 1/n. Thus we have a large number of unbiased estimates for µ.
Hence the question arises: if θ̂₁, θ̂₂ are both unbiased estimates of θ, which one do we
prefer? The answer is given by the following definition.
Definition
Let θ̂₁, θ̂₂ both be unbiased estimates of the parameter θ. We say θ̂₁ is more efficient than
θ̂₂ if Var(θ̂₁) ≤ Var(θ̂₂).
Remark
That is, the above definition says: prefer that unbiased estimate which is "closer" to
θ. Remember the variance is a measure of the "closeness" of θ̂ to θ.
Maximum Error in estimating µ by X̄
Let X̄ be the sample mean of a random sample of size n from a population with
(unknown) mean µ. Suppose we use X̄ to estimate µ. |X̄ - µ| is called the error in
estimating µ by X̄. Can we find an upper bound on this error? We know that if X is normal
(or if n is large) then by the Central Limit Theorem,
(X̄ - µ)/(σ/√n)
is a r.v. having (approximately) the standard normal distribution. And we can say
P( -z_{α/2} < (X̄ - µ)/(σ/√n) < z_{α/2} ) = 1 - α
Thus we can say with prob (1 - α) that the max absolute error |X̄ - µ| in estimating µ by
X̄ is at most z_{α/2} σ/√n. (Here obviously we assume σ, the population s.d., is known; and
z_{α/2} is that unique no. such that P(Z > z_{α/2}) = α/2.)
We also say that we can assert with 100(1 - α) percent confidence that the max. abs. error is
at most z_{α/2} σ/√n. The book denotes this by E.
Estimation of n
Thus to find the size n of the sample so that we may say with 100(1 - α) percent
confidence that the max. abs. error is a given quantity E, we solve for n the equation
z_{α/2} σ/√n = E, or n = ( z_{α/2} σ / E )².
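A minimal sketch of this sample-size formula in Python (assuming SciPy for the normal quantile); the numbers reproduce Example 2 below.

    from scipy.stats import norm
    import math

    def sample_size(sigma, E, conf=0.95):
        """Smallest n for which the maximum error is at most E with the given confidence."""
        z = norm.ppf(1 - (1 - conf) / 2)          # z_{alpha/2}
        return math.ceil((z * sigma / E) ** 2)

    print(sample_size(sigma=20.0, E=3.0, conf=0.95))   # 171, as in Example 2 below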
Example 1
What is the maximum error one can expect to make with prob 0.90 when using the mean
of a random sample of size n = 64 to estimate the mean of a population with σ² = 2.56?
Solution
Substituting n = 64, σ = 1.6 and z_{α/2} = z_{0.05} = 1.645 (note that 1 - α = 0.90 implies α/2 = 0.05)
in the formula for the maximum error E = z_{α/2} σ/√n, we get
E = 1.645 × 1.6/√64 = 1.645 × 1.6/8 = 1.645 × 0.2 = 0.3290
Thus the maximum error one can expect to make with prob 0.90 is 0.3290.
Example 2
If we want to determine the average mechanical aptitude of a large group of workers,
how large a random sample will we need to be able to assert with prob 0.95 that the
sample mean will not differ from the population mean by more than 3.0 points? Assume
that it is known from past experience that σ = 20.0.
Solution
Here 1 - α = 0.95 so that α/2 = 0.025, hence z_{α/2} = z_{0.025} = 1.96.
Thus we want n so that we can assert with prob 0.95 that the max error is E = 3.0:
n = ( z_{α/2} σ / E )² = ( 1.96 × 20 / 3 )² = 170.74
Since n must be an integer, we take it as 171.
Small Samples
If the population is normal and we take a random sample of size n (n small) from it, we
note that
t = (X̄ - µ)/(S/√n)   (X̄ = sample mean, S = sample s.d.)
is a rv having the t-distribution with (n - 1) degrees of freedom.
Thus we can assert with prob 1 - α that |t| ≤ t_{n-1,α/2}, where t_{n-1,α/2} is that unique no. such that
P(t > t_{n-1,α/2}) = α/2. Thus if we use X̄ to estimate µ, we can assert with prob (1 - α) that
the max error will be
E = t_{n-1,α/2} S/√n
(Note: If n is large, then t is approx standard normal. Thus for n large, the above
formula becomes E = z_{α/2} S/√n.)
Example 3
20 fuses were subjected to a 20% overload, and the times it took them to blow had a
mean x = 10.63 minutes and a s.d. S = 2.48 minutes. If we use x = 10.63 minutes as a
point estimate of the true average it takes for such fuses to blow with a 20% overload,
what can we assert with 95% confidence about the maximum error?
Solution
Here n = 20 (fuses), x̄ = 10.63, S = 2.48
1 - α = 95/100 = 0.95 so that α/2 = 0.025
Hence t_{n-1,α/2} = t_{19,0.025} = 2.093
Hence we can assert with 95% confidence (i.e. with prob 0.95) that the max error will be
E = t_{n-1,α/2} S/√n = 2.093 × 2.48/√20 = 1.16
Interval Estimation
If X̄ is the mean of a random sample of size n from a population with known sd σ, then
we know by the central limit theorem that
Z = (X̄ - µ)/(σ/√n)
is (approximately) standard normal. So we can say with prob (1 - α) that
-z_{α/2} < (X̄ - µ)/(σ/√n) < z_{α/2}
which can be rewritten as
X̄ - z_{α/2} σ/√n < µ < X̄ + z_{α/2} σ/√n
Thus we can assert with prob (1 - α) (i.e. with 100(1 - α)% confidence) that µ lies in
the interval ( X̄ - z_{α/2} σ/√n , X̄ + z_{α/2} σ/√n ).
We refer to the above interval as a 100(1 - α)% confidence interval for µ. The end
points X̄ ± z_{α/2} σ/√n are known as the 100(1 - α)% confidence limits for µ.
Example 4
Suppose the mean of a random sample of size 25 from a normal population (with σ = 2)
is x̄ = 78.3. Obtain a 99% confidence interval for µ, the population mean.
Solution
Here n = 25, σ = 2, 1 - α = 99/100 = 0.99
∴ α/2 = 0.005 ∴ z_{α/2} = z_{0.005} = 2.575, and x̄ = 78.3
Hence a 99% confidence interval for µ is
( x̄ - z_{α/2} σ/√n , x̄ + z_{α/2} σ/√n )
= ( 78.3 - 2.575 × 2/√25 , 78.3 + 2.575 × 2/√25 )
= ( 78.3 - 1.0300 , 78.3 + 1.0300 )
= ( 77.27 , 79.33 )
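A sketch reproducing the interval of Example 4 (SciPy is assumed, and used only for the normal quantile z_{α/2}).

    from scipy.stats import norm
    import math

    xbar, sigma, n, conf = 78.3, 2.0, 25, 0.99
    z = norm.ppf(1 - (1 - conf) / 2)               # z_{0.005} = 2.575...
    half_width = z * sigma / math.sqrt(n)
    print(round(xbar - half_width, 2), round(xbar + half_width, 2))   # 77.27, 79.33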
σ unknown
Suppose X̄ is the sample mean and S is the sample sd of a random sample of size n taken
from a normal population with (unknown) mean µ. Then we know the r.v.
t = (X̄ - µ)/(S/√n)
has a t-distribution with (n - 1) degrees of freedom. Thus we can say with prob 1 - α that
-t_{n-1,α/2} < t < t_{n-1,α/2}
or  -t_{n-1,α/2} < (X̄ - µ)/(S/√n) < t_{n-1,α/2}
or  X̄ - t_{n-1,α/2} S/√n < µ < X̄ + t_{n-1,α/2} S/√n
Thus a 100(1 - α)% confidence interval for µ is
( X̄ - t_{n-1,α/2} S/√n , X̄ + t_{n-1,α/2} S/√n )
Note:
(1) If n is large, t has approx the standard normal distribution, in which case the
100(1 - α)% confidence interval for µ will be ( x̄ - z_{α/2} S/√n , x̄ + z_{α/2} S/√n ).
(2) If nothing is mentioned, we assume that the sample is taken from a normal
population so that the above is valid.
Example 5
Material manufactured continuously before being cut and wound into large rolls must be
monitored for thickness (caliper). A sample of ten measurements on paper, in mm,
yielded
32.2, 32.0, 30.4, 31.0, 31.2, 31.2, 30.3, 29.6, 30.5, 30.7
Obtain a 95% confidence interval for the mean thickness.
Solution
Here n = 10, x̄ = 30.9, S = 0.7880
1 - α = 0.95 or α/2 = 0.025
∴ t_{n-1,α/2} = t_{9,0.025} = 2.262
Hence a 95% confidence interval for µ is
( 30.9 - 2.262 × 0.7880/√10 , 30.9 + 2.262 × 0.7880/√10 ) = ( 30.34 , 31.46 )
Example 6:
Ten bearings made by a certain process have a mean diameter of 0.5060 cm with a sd of
0.0040 cm. Assuming that the data may be looked upon as a random sample from a
normal population, construct a 99% confidence interval for the actual average diameter of
bearings made by this process.
Solution
Here n = 10, x̄ = 0.5060, S = 0.0040
1 - α = 99/100 = 0.99. Hence α/2 = 0.005. ∴ t_{n-1,α/2} = t_{9,0.005} = 3.250
Thus a 99% confidence interval for the mean is
( x̄ - t_{n-1,α/2} S/√n , x̄ + t_{n-1,α/2} S/√n )
= ( 0.5060 - 3.250 × 0.0040/√10 , 0.5060 + 3.250 × 0.0040/√10 )
= ( 0.5019 , 0.5101 )
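The t-based interval of Example 6 can likewise be checked with a short sketch (assumes SciPy for the t quantile).

    from scipy.stats import t
    import math

    xbar, s, n, conf = 0.5060, 0.0040, 10, 0.99
    t_crit = t.ppf(1 - (1 - conf) / 2, df=n - 1)   # t_{9,0.005} = 3.250
    half_width = t_crit * s / math.sqrt(n)
    print(round(xbar - half_width, 4), round(xbar + half_width, 4))   # 0.5019, 0.5101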
Example 7
In a random sample of 100 batteries the lifetimes have a mean of 148.2 hours with a s.d.
of 24.9 hours. Construct a 76.60% confidence interval for the mean life of the batteries.
Solution
Here n = 100, x̄ = 148.2, S = 24.9
1 - α = 76.60/100 = 0.7660 so that α/2 = 0.1170
Thus t_{n-1,α/2} = t_{99,0.1170} ≈ z_{0.1170} = 1.19
Hence a 76.60% confidence interval is
( 148.2 - 1.19 × 24.9/√100 , 148.2 + 1.19 × 24.9/√100 ) = ( 145.2 , 151.2 )
Example 8
A random sample of 100 teachers in a large metropolitan area revealed a mean weekly
salary of $487 with a sd of $48. With what degree of confidence can we assert that the
average weekly salary of all teachers in the metropolitan area is between $472 and $502?
Solution
Suppose the degree of confidence is 100(1 - α)%.
Thus x̄ + t_{n-1,α/2} S/√n = $502
Here x̄ = 487, S = 48, n = 100, so t_{99,α/2} ≈ z_{α/2}
Thus we get 487 + z_{α/2} × 48/10 = 502
Or z_{α/2} = 15/4.8 = 3.125
∴ α/2 = 0.0009, i.e. 1 - α = 0.9982
∴ We can assert with 99.82% confidence that the true mean salaries will be between
$472 and $502.
Maximum Likelihood Estimates (See exercise 7.23, 7.24)
Definition
Let X be a rv. Let f(x, θ) = P(X = x) be the point prob function if X is discrete, and let
f(x, θ) be the pdf of X if X is continuous (here θ is a parameter). Let X₁, X₂, ..., Xₙ be a
random sample of size n from X. Then the likelihood function based on the random
sample is defined as
L(θ) = L(x₁, x₂, ..., xₙ; θ) = f(x₁, θ) f(x₂, θ) ... f(xₙ, θ).
Thus the likelihood function L(θ) = P(X₁ = x₁) P(X₂ = x₂) ... P(Xₙ = xₙ) if X is discrete, and it
is the joint pdf of X₁, ..., Xₙ when X is continuous. The maximum likelihood estimate
(MLE) of θ is that θ̂ which maximizes L(θ).
Example 8
Let X be a rv having a Poisson distribution with parameter λ.
Thus f(x, λ) = P(X = x) = e^{-λ} λˣ / x! ;  x = 0, 1, 2, ...
Hence the likelihood function is
L(λ) = (e^{-λ} λ^{x₁}/x₁!)(e^{-λ} λ^{x₂}/x₂!) ... (e^{-λ} λ^{xₙ}/xₙ!)
     = e^{-nλ} λ^{x₁+x₂+...+xₙ} / (x₁! x₂! ... xₙ!) ;  xᵢ = 0, 1, 2, ...
To find λ̂, the value of λ which maximizes L(λ), we use calculus.
First we take ln (log to base e, the natural logarithm):
ln L(λ) = -nλ + (x₁ + ... + xₙ) ln λ - ln(x₁! ... xₙ!)
Differentiating w.r.t. λ (noting that x₁, ..., xₙ are not to be varied), we get
(1/L) ∂L/∂λ = -n + (x₁ + ... + xₙ)/λ
= 0 gives λ = (x₁ + ... + xₙ)/n
We can 'easily' verify that ∂²L/∂λ² < 0 for this λ.
Hence the MLE of λ is λ̂ = (x₁ + ... + xₙ)/n = x̄ (the sample mean).
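The calculus result (the MLE of λ is the sample mean) can also be checked numerically. The sketch below maximizes the Poisson log-likelihood over a fine grid for a small made-up sample; the data values are purely illustrative, not from the notes.

    import math

    data = [2, 0, 3, 1, 4, 2, 1, 3]               # hypothetical Poisson counts
    n, s = len(data), sum(data)

    def log_lik(lam):
        # ln L(lambda) = -n*lambda + (sum x_i) ln(lambda) - sum ln(x_i!)
        return -n * lam + s * math.log(lam) - sum(math.lgamma(x + 1) for x in data)

    grid = [k / 1000 for k in range(1, 10001)]     # lambda from 0.001 to 10.000
    best = max(grid, key=log_lik)
    print(best, s / n)                             # both equal the sample mean 2.0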
Example 9 MLE of Proportion
Suppose p is the proportion of defective bolts produced by a factory. To estimate p, we
proceed as follows. We take n bolts at random and calculate fD = Sample proportion of
defectives.
= (No. of defectives found among the n chosen ones)/n
We show that fD is the MLE of p.
We define a rv X as follows:
X = 1 if the bolt chosen is defective, and X = 0 if the bolt chosen is not defective.
Thus X has the prob distribution
x      0      1
Prob   1-p    p
It is clear that the point prob function f(x; p) of X is given by
f(x; p) = pˣ (1 - p)^{1-x} ;  x = 0, 1
(Note f(0; p) = P(X = 0) = 1 - p and f(1; p) = P(X = 1) = p.)
Choosing n bolts at random amounts to choosing a random sample {X₁, X₂, ..., Xₙ} from X,
where Xᵢ = 0 if the ith bolt chosen is not defective and Xᵢ = 1 if it is defective (i = 1, 2, ..., n).
Hence X₁ + X₂ + ... + Xₙ (can you guess?) = no. of defective bolts among the n chosen.
The likelihood function of the sample is
L(p) = f(x₁; p) f(x₂; p) ... f(xₙ; p)
     = p^{x₁+...+xₙ} (1 - p)^{n-(x₁+...+xₙ)}   (xᵢ = 0 or 1 for all i = 1, ..., n)
     = pˢ (1 - p)^{n-s}   where s = x₁ + ... + xₙ
Taking ln and differentiating (partially) wrt p, we get
(1/L) ∂L/∂p = s/p - (n - s)/(1 - p)
For a maximum, ∂L/∂p = 0, or s/p = (n - s)/(1 - p)
(i.e.) p = s/n = (x₁ + x₂ + ... + xₙ)/n
= (No. of defectives among the n chosen)/n = sample proportion of defectives.
(One can easily see this p makes ∂²L/∂p² < 0, so that L is a maximum for this p.)
Example 10
Let X be a rv having an exponential distribution with parameter β (unknown). Hence the
density of X is f(x; β) = (1/β) e^{-x/β}  (x > 0).
Let {X₁, X₂, ..., Xₙ} be a random sample of size n. Hence the likelihood function is
L(β) = f(x₁; β) f(x₂; β) ... f(xₙ; β) = (1/βⁿ) e^{-(x₁+x₂+...+xₙ)/β}   (xᵢ > 0)
Taking ln and differentiating (partially) w.r.t. β, we get
(1/L) ∂L/∂β = -n/β + (x₁ + ... + xₙ)/β² = 0 (for a maximum)
which gives β = (x₁ + x₂ + ... + xₙ)/n = x̄
Thus the sample mean x̄ is the MLE of β.
Example 11
A r.v. X has density
f(x; β) = (β + 1) xᵝ ;  0 < x < 1
Obtain the ML estimate of β based on a random sample {X₁, X₂, ..., Xₙ} of size n from
X.
Solution
The likelihood function is
L(β) = (β + 1)ⁿ (x₁ x₂ ... xₙ)ᵝ ;  0 < xᵢ < 1
Taking ln and differentiating (partially) wrt β, we get
(1/L) ∂L/∂β = n/(β + 1) + ln(x₁ ... xₙ) = 0 for L to be a maximum,
which gives β = -1 - n/ln(x₁ ... xₙ),
which is the ML estimate for β.
So far we have considered situations where the ML estimate is got by differentiating L
(and equating the derivative to zero). The following example is one where
differentiation will not work.
Example 12
A rv X has uniform density over [0, β],
(i.e.) the density of X is f(x; β) = 1/β ;  0 ≤ x ≤ β (and 0 elsewhere).
The likelihood function based on a random sample of size n from X is
L(β) = f(x₁; β) f(x₂; β) ... f(xₙ; β) = 1/βⁿ ;  0 ≤ x₁ ≤ β, 0 ≤ x₂ ≤ β, ..., 0 ≤ xₙ ≤ β
This is a maximum when the denominator βⁿ is least, (i.e.) when β is least. But β ≥ xᵢ for all
i = 1, 2, ..., n. Hence the least such β is max{x₁, ..., xₙ}, which is the MLE of β.
Estimation of Sample proportion
We have just seen above that if p = population proportion (i.e. the proportion of persons,
things, etc. having a characteristic), then the ML estimate of p is the sample proportion. Now
we would like to find a 100(1 - α)% confidence interval for p.
(This is treated in chapter 9 of your text book.)
Large Samples
Suppose we have a 'dichotomous' universe; that is, a population whose members are
either "haves" or "have-nots"; that is, a member either has a property or does not.
For example, we can think of the population of all bulbs produced by a factory. Any bulb is
either a "have" (i.e. defective) or a "have-not" (i.e. it is good), and p = proportion of haves
= prob that a randomly chosen member is a "have".
As another example, we can think of the population of all females in the USA. A member is a
"have" if she is a blonde and a "have-not" if she is not. As a last example, consider the
population of all voters in India. A member is a "have" if he follows the BJP and is a
"have-not" otherwise.
To estimate p, we choose n members at random and count the number X of “haves”. Thus
X is a rv having binomial distribution with parameters n and p!
P(X = x) = f(x; p) = C(n, x) pˣ (1 - p)^{n-x} ;  x = 0, 1, 2, ..., n
and if n is large, we know that the standardized Binomial is approximately standard normal,
(i.e.) for large n,
(X - np)/√(np(1 - p))
has approx the standard normal distribution. So we can say with prob (1 - α) that
-z_{α/2} < (x - np)/√(np(1 - p)) < z_{α/2}
or  -z_{α/2} < (x/n - p)/√(p(1 - p)/n) < z_{α/2}
or  x/n - z_{α/2} √(p(1 - p)/n) < p < x/n + z_{α/2} √(p(1 - p)/n)
In the end points, we replace p by the MLE X/n (= the sample proportion).
Thus we can say with prob (1 - α) that
x/n - z_{α/2} √( (x/n)(1 - x/n)/n ) < p < x/n + z_{α/2} √( (x/n)(1 - x/n)/n )
Hence a 100(1 - α)% confidence interval for p is
( x/n - z_{α/2} √( (x/n)(1 - x/n)/n ) , x/n + z_{α/2} √( (x/n)(1 - x/n)/n ) )
Remark: We can say with prob (1 - α) that the max error |X/n - p| in approximating p by
X/n is
E = z_{α/2} √( p(1 - p)/n )
We can replace p by X/n and say that the
max error = z_{α/2} √( (X/n)(1 - X/n)/n )
Or we note that p(1 - p) (for 0 ≤ p ≤ 1) has a maximum of 1/4 (which is attained when p = 1/2).
Thus we can also say with prob (1 - α) that the max error = z_{α/2} √(1/(4n)).
This last equation tells us that to assert with prob (1 - α) that the max error is E, n must be
n = (1/4)( z_{α/2} / E )².
Example 13
In a random sample of 400 industrial accidents, it was found that 231 were due at least
partially to unsafe working conditions. Construct a 99% confidence interval for the
corresponding true proportion p.
Solution
Here n = 400, x = 231, 1 - α = 0.99 so that α/2 = 0.005; hence z_{α/2} = 2.575.
Thus a 99% confidence interval for p will be
( x/n - z_{α/2} √( (x/n)(1 - x/n)/n ) , x/n + z_{α/2} √( (x/n)(1 - x/n)/n ) )
= ( 231/400 - 2.575 √( (231/400)(1 - 231/400)/400 ) , 231/400 + 2.575 √( (231/400)(1 - 231/400)/400 ) )
= ( 0.5139 , 0.6411 )
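A sketch of the same large-sample interval in Python (SciPy is used only for z_{α/2}).

    from scipy.stats import norm
    import math

    x, n, conf = 231, 400, 0.99
    p_hat = x / n
    z = norm.ppf(1 - (1 - conf) / 2)
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    print(round(p_hat - half_width, 4), round(p_hat + half_width, 4))   # 0.5139, 0.6411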
Example 14
In a sample survey of the ‘safety explosives’ used in certain mining operations,
explosives containing potassium nitrate were found to be used in 95 out of 250 cases. If
95/250 = 0.38 is used as an estimate of the corresponding true proportion, what can we say
with 95% confidence about the maximum error?
Solution
Here n = 250, X = 95, 1 - α = 0.95 so that α/2 = 0.025; hence z_{α/2} = 1.96.
Hence we can say with 95% confidence that the max. error is
E = z_{α/2} √( (x/n)(1 - x/n)/n ) = 1.96 × √(0.38 × 0.62/250) = 0.0602
Example 15:
Among 100 fish caught in a large lake, 18 were inedible due to the pollution of the
environment. If we use 18/100 = 0.18 as an estimate of the corresponding true proportion,
with what confidence can we assert that the error of this estimate is at most 0.065?
Solution
Here n = 100, X = 18, max error E = 0.065.
We note
E = z_{α/2} √( (X/n)(1 - X/n)/n ) = z_{α/2} √(0.18 × 0.82/100) = 0.03842 z_{α/2}
∴ z_{α/2} = 0.065/0.03842 = 1.69
Hence α/2 = 1 - 0.9545 = 0.0455
∴ α = 0.0910 or 1 - α = 0.9090
So we can assert with 100(1 - α)% ≈ 90.9% confidence that the error is at most 0.065.
Example 16
What is the size of the smallest sample required to estimate an unknown proportion to
within a max. error of 0.06 with at least 95% confidence?
Solution
Here E = 0.06; 1 - α = 0.95 or α/2 = 0.025
∴ z_{α/2} = z_{0.025} = 1.96
Hence the smallest sample size n is
n = (1/4)( z_{α/2} / E )² = (1/4)( 1.96/0.06 )² = 266.77
Since n must be an integer, we take the size to be 267.
Remark
Read the relevant material in your text on pages 279-281 on finding the confidence
interval for the proportion in the case of small samples.
Tests of Statistical Hypothesis
In many problems, instead of estimating the parameter, we must decide whether a
statement concerning a parameter is true or false. For instance, one may like to test the
truth of the statement: the mean life length of a bulb is 500 hours.
In fact we may even have to decide whether the mean life is 500 hours or more (!)
In such situations, we have a statement whose truth or falsity we want to test. We then
say we want to test the null hypothesis H0: the mean life length is 500 hours. (Here
onwards, when we say we want to test a statement, it shall mean we want to test whether
the statement is true.) We then have another (usually called the alternative) hypothesis. We make
some 'experiment' and on the basis of that we will 'decide' whether to accept the null
hypothesis or reject it. (When we reject the null hypothesis we automatically accept the
alternative hypothesis.)
Example
Suppose we wish to test the null hypothesis H0 = The mean life length of a bulb is 500
hours against the alternative H1 = The mean life length is > 500 hours. Suppose we take a
random sample of 50 bulbs and found that the sample mean is 520 hours. Should we
accept H0 or reject H0 ? We have to note that even though the population mean is 500
hours the sample mean could be more or less. Similarly even though the population mean
is > 500 hours, say 550 hours, even then the sample mean could be less than 550 hours.
Thus whatever decision we may make, there is a possibility of making an error. That is
falsely rejecting H0 (when it should have been accepted) and falsely accepting H0 (when
it should have been rejected). We put this in a tabular form as follows:
Accept H0 Reject H0
H0 is true Correct Decision Type I error
H0 is false Type II Error Correct Decision
Thus the type I error is the error of falsely rejecting H0 and the type II error is the error of
falsely accepting H0. A good decision (≡ test) is one where the prob of making the errors
is small.
Notation
The prob of committing a type I error is denoted by α . It is also referred to as the size of
the test or the level of significance of the test. The prob of committing Type II error is
denoted by β .
Example 1
Suppose we want to test the null hypothesis µ = 80 against the alternative hypothesis µ = 83 on
the basis of a random sample of size n = 100 (assume that the population s.d. σ = 8.4).
The null hypothesis is rejected if the sample mean x̄ > 82; otherwise it is accepted. What is the
prob of a Type I error; the prob of a Type II error?
Solution
We know that when µ = 80 (and σ = 8.4) the r.v. (X̄ - µ)/(σ/√n) has a standard normal
distribution. Thus,
P(Type I error)
= P(Rejecting the null hyp when it is true)
= P(X̄ > 82 given µ = 80)
= P( (X̄ - µ)/(σ/√n) > (82 - 80)/(8.4/10) )
= P(Z > 2.38)
= 1 - P(Z ≤ 2.38) = 1 - 0.9913 = 0.0087
Thus in roughly about 1% of the cases we will be (falsely) rejecting H0. Recall this is also
called the size of the test or level of significance of the test.
P (Type II error) = P (Falsely accepting H0)
= P (Accepting H0 when it is false)
= P(X̄ ≤ 82 given µ = 83)
= P( (X̄ - µ)/(σ/√n) ≤ (82 - 83)/(8.4/10) )
= P(Z ≤ -1.19)
= 1 - P(Z ≤ 1.19) = 1 - 0.8830 = 0.1170
Thus roughly in 12% of the cases we will be falsely accepting H0.
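Both probabilities in Example 1 come directly from the standard normal cdf; a short sketch (assuming SciPy):

    from scipy.stats import norm
    import math

    mu0, mu1, sigma, n, cutoff = 80, 83, 8.4, 100, 82
    se = sigma / math.sqrt(n)

    alpha = norm.sf((cutoff - mu0) / se)    # P(Xbar > 82 | mu = 80) = P(Z > 2.38)
    beta = norm.cdf((cutoff - mu1) / se)    # P(Xbar <= 82 | mu = 83) = P(Z <= -1.19)
    # approximately 0.0086 and 0.1169; the table-based values above are 0.0087 and 0.1170
    print(round(alpha, 4), round(beta, 4))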
Definition (Critical Region)
In the previous example we rejected the null hypothesis when x̄ > 82, (i.e.) when x̄ lies in
the 'region' x > 82 (of the x axis). This portion of the horizontal axis is then called the
critical region and is denoted by C. Thus the critical region for the above situation is
C = {x : x > 82}, and remember we reject H0 when the (test) statistic X̄ lies in the critical
region (i.e. takes a value > 82). So the size of the critical region (≡ prob that X̄ lies in C)
is the size of the test, or level of significance.
[Figure: the shaded portion (x̄ > 82) is the critical region; the remaining portion is the region of
acceptance of H0.]
Critical regions for Hypothesis Concerning the means
Let X be a rv having a normal distribution with (unknown) mean µ and (known) s.d. σ.
Suppose we wish to test the null hypothesis µ = µ₀.
The following table gives the critical regions (criteria for rejecting H0) for the various
alternative hypotheses.
Null hypothesis: µ = µ₀ (Normal population, σ known),  Z = (x̄ - µ₀)/(σ/√n)

Alternative hypothesis H1 | Reject H0 if                 | Prob of Type I error | Prob of Type II error
µ = µ₁ (µ₁ < µ₀)          | Z < -z_α                     | α                    | 1 - F( √n(µ₀ - µ₁)/σ - z_α )
µ < µ₀                    | Z < -z_α                     | α                    | -
µ = µ₁ (µ₁ > µ₀)          | Z > z_α                      | α                    | F( √n(µ₀ - µ₁)/σ + z_α )
µ > µ₀                    | Z > z_α                      | α                    | -
µ ≠ µ₀                    | Z < -z_{α/2} or Z > z_{α/2}  | α                    | -

F(x) = cdf of the standard normal distribution.
Remark:
The prob of Type II error is left blank in case H1 (the alternative hypothesis) is one of the
following three: µ < µ₀, µ > µ₀, µ ≠ µ₀. This is because the Type II error can then
happen in various ways and so we cannot determine the prob of its occurrence.
Example 2:
According to norms established for a mechanical aptitude test, persons who are 18 years
old should average 73.2 with a standard deviation of 8.6. If 45 randomly selected persons
averaged 76.7, test the null hypothesis µ = 73.2 against the alternative µ > 73.2 at the
0.01 level of significance.
Solution
Step I: Null hypothesis H0: µ = 73.2
        Alternative hypothesis H1: µ > 73.2
        (Thus here µ₀ = 73.2)
Step II: The level of significance α = 0.01
Step III: Reject the null hypothesis if Z > z_α = z_{0.01} = 2.33
Step IV: Calculations
Z = (x̄ - µ₀)/(σ/√n) = (76.7 - 73.2)/(8.6/√45) = 2.73
Step V: Decision. Since Z = 2.73 > z_α = 2.33,
we reject H0 (at the 0.01 level of significance),
(i.e.) we would say µ > 73.2 (and the prob of falsely saying this is ≤ 0.01).
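The five steps of Example 2 in code form (a sketch; SciPy supplies the critical value):

    from scipy.stats import norm
    import math

    mu0, xbar, sigma, n, alpha = 73.2, 76.7, 8.6, 45, 0.01

    z = (xbar - mu0) / (sigma / math.sqrt(n))    # observed test statistic
    z_crit = norm.ppf(1 - alpha)                 # 2.33 for this right-tailed test

    print(round(z, 2), round(z_crit, 2), z > z_crit)   # 2.73, 2.33, True -> reject H0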
Example 3
It is desired to test the null hypothesis µ = 100 against the alternative hypothesis
µ < 100 on the basis of a random sample of size n = 40 from a population with σ = 12.
For what values of x̄ must the null hypothesis be rejected if the prob of Type I error is to
be α = 0.01?
Solution
z_α = z_{0.01} = 2.33. Hence from the table we reject H0 if Z < -z_α = -2.33, where
Z = (x̄ - µ₀)/(σ/√n) = (x̄ - 100)/(12/√40) < -2.33
gives x̄ < 100 - 2.33 × 12/√40 = 95.58
Example 4
To test a paint manufacturer's claim that the average drying time of his new "fast-drying"
paint is 20 minutes, a 'random sample' of 36 boards is painted with his new paint and his
claim is rejected if the mean drying time x̄ > 20.50 minutes. Find
(a) the prob of Type I error;
(b) the prob of Type II error when µ = 21 minutes.
(Assume that σ = 2.4 minutes.)
Solution
Here the null hypothesis is H0: µ = 20
and the alt hypothesis is H1: µ > 20
(a) P(Type I error) = P(Rejecting H0 when it is true)
Now when H0 is true, µ = 20 and hence
(X̄ - µ)/(σ/√n) = (X̄ - 20)/(2.4/√36) = (6/2.4)(X̄ - 20)
is standard normal.
Thus P(Type I error)
= P(X̄ > 20.50 given that µ = 20)
= P( (X̄ - µ)/(σ/√n) > (20.50 - 20)/(2.4/√36) )
= P(Z > 1.25) = 1 - P(Z ≤ 1.25) = 1 - F(1.25) = 1 - 0.8944 = 0.1056
(b) P(Type II error when µ = 21)
= P(Accepting H0 when µ = 21)
= P(X̄ ≤ 20.50 when µ = 21)
= P( (X̄ - µ)/(σ/√n) ≤ (20.50 - 21)/(2.4/√36) )
= P(Z ≤ -1.25) = P(Z > 1.25) = 0.1056
Example 5
It is desired to test the null hypothesis µ = 100 pounds against the alternative hypothesis
µ < 100 pounds on the basis of a random sample of size n = 50 from a population with
σ = 12. For what values of x̄ must the null hypothesis be rejected if the prob of Type I
error is to be α = 0.01?
Solution
We want to test the null hypothesis H0: µ = 100 against the alt hypothesis H1: µ < 100,
given σ = 12, n = 50.
Suppose we reject H0 when x̄ < C.
Thus P(Type I error)
= P(Rejecting H0 when it is true)
= P(X̄ < C given µ = 100)
= P( (X̄ - µ)/(σ/√n) < (C - 100)/(12/√50) )
= P( Z < (C - 100)/(12/√50) )
= F( (C - 100)/(12/√50) ) = 0.01
implies (C - 100)/(12/√50) = -2.33
Or C = 100 - 2.33 × 12/√50 = 96.05
Thus reject H0 if X̄ < 96.05
Example 6
Suppose that for a given population with σ = 8.4 in², we want to test the null hypothesis
µ = 80.0 in² against the alternative hypothesis µ < 80.0 in² on the basis of a random
sample of size n = 100.
(a) If the null hypothesis is rejected for x̄ < 78.0 in² and otherwise it is accepted,
what is the probability of Type I error?
(b) What is the answer to part (a) if the null hypothesis is µ ≥ 80.0 in² instead of
µ = 80.0 in²?
Solution
(a) Null hypothesis H0: µ = 80
Alt hypothesis H1: µ < 80
Given σ = 8.4, n = 100
P(Type I error) = P(Rejecting H0 when it is true)
= P(X̄ < 78.0 given µ = 80)
= P( (X̄ - µ)/(σ/√n) < (78.0 - 80.0)/(8.4/√100) )
= P(Z < -2.38) = 1 - F(2.38)
= 1 - 0.9913 = 0.0087
(b) In this case we define the Type I error as the max prob of rejecting H0 when it is
true, i.e. max P(x̄ < 78.0 given µ), where µ is a number ≥ 80.0.
Now P(x̄ < 78.0 when the population mean is µ)
= P( (x̄ - µ)/(σ/√n) < (78.0 - µ)/(8.4/√100) )
= P( Z < (10/8.4)(78 - µ) )
= F( 1.19(78 - µ) )
We note that the cdf of Z, viz. F(z), is an increasing function of z. Thus when
µ ≥ 80, F(1.19(78 - µ)) is largest when µ is smallest, i.e. µ = 80. Hence
P(Type I error) = max over µ ≥ 80 of F(1.19(78 - µ)) = F(1.19 × (78 - 80)) = F(-2.38) = 0.0087
Example 7
If the null hypothesis µ = µ₀ is to be tested against the one-sided alternative hypothesis
µ < µ₀ (or µ > µ₀), and if the prob of Type I error is to be α and the prob of Type II
error is to be β when µ = µ₁, it can be shown that this is possible when the required
sample size is
n = σ²( z_α + z_β )² / ( µ₁ - µ₀ )²
where σ² is the population variance.
(a) It is desired to test the null hypothesis µ = 40 against the alternative hypothesis
µ < 40 on the basis of a large random sample from a population with σ = 4.
If the prob of Type I error is to be 0.05 and the prob of Type II error is to be 0.12
for µ = 38, find the required size of the sample.
(b) Suppose we want to test the null hypothesis µ = 64 against the alternative
hypothesis µ < 64 for a population with standard deviation σ = 7.2. How large a
sample must we take if α is to be 0.05 and β is to be 0.01 for µ = 61? Also, for
what values of x̄ will the null hypothesis have to be rejected?
Solution
(a) Here α = 0.05, β = 0.12, µ₀ = 40, µ₁ = 38, σ = 4
z_α = z_{0.05} = 1.645, z_β = z_{0.12} = 1.175
Thus the required sample size is
n = 16(1.645 + 1.175)²/(38 - 40)² = 31.8  ∴ n ≥ 32.
(b) Here α = 0.05, β = 0.01, µ₀ = 64, µ₁ = 61, σ = 7.2
n ≥ (7.2)²(1.645 + 2.33)²/(61 - 64)² = 91.0  ∴ n ≥ 92
We reject H0 if Z < -z_α, i.e. if (X̄ - 64)/(7.2/√92) < -1.645, or X̄ < 62.76
Tests concerning mean when the sample is small
If X̄ is the sample mean and S the sample s.d. of a (small) random sample of size n from
a normal population (with mean µ₀), we know that the statistic
t = (X̄ - µ₀)/(S/√n)
has a t-distribution with (n - 1) degrees of freedom. Thus to test the null hypothesis H0: µ = µ₀
against the alternative hypothesis H1: µ > µ₀, we note that when H0 is true, (i.e.) when
µ = µ₀, P(t > t_{n-1,α}) = α.
Thus if we reject the null hypothesis when t > t_{n-1,α}, (i.e.) when X̄ > µ₀ + t_{n-1,α} S/√n, we
shall be committing a Type I error with prob α.
The corresponding tests when the alternative hypothesis is µ < µ₀ (and µ ≠ µ₀) are
described below.
Note: If n is large, we can approximate t_{n-1,α} by z_α in these tests.
Critical Regions for Testing H0: µ = µ₀ (Normal population, unknown σ)

Alt Hypothesis | Reject the null hypothesis if
µ < µ₀         | t < -t_{n-1,α}
µ > µ₀         | t > t_{n-1,α}
µ ≠ µ₀         | t < -t_{n-1,α/2} or t > t_{n-1,α/2}

t = (X̄ - µ₀)/(s/√n)   (n → sample size)
In each case P(Type I error) = α
Example 8
A random sample of six steel beams has a mean compressive strength of 58,392 psi
(pounds per square inch) with a s.d. of 648 psi. Use this information and the level of
significance 05.0=α to test whether the true average compressive strength of the steel
from which this sample came is 58,000 psi. Assume normality.
Solution
1. Null Hypothesis 000,580 == µµ
Alt hypothesis ( )!000,58 why>µ
2. Level of significance 05.0=α
3. Criterion : Reject the null hypothesis if 015.205.0,5,1 ==> − ttt n α
4. Calculations
160
48.1
6.
648
000,58392,58
n
S
X
t 0
=
−
=
µ−
=
5. Decision
= 1.48 015.2≤
Since observedt
we cannot reject the null hypothesis. That is we can say the true average compressive
strength is 58,000 psi.
Example 9
Test runs with six models of an experimental engine showed that they operated for
24,28,21,23,32 and 22 minutes with a gallon of a certain kind of fuel. If the prob of type I
error is to be at most 0.01, is this evidence against a hypothesis that on the average this
kind of engine will operate for at least 29 minutes per gallon with this kind of fuel?
Assume normality.
Solution
1. Null hypothesis H0: µ ≥ µ₀ = 29
   Alt hypothesis H1: µ < µ₀
2. Level of significance α ≤ 0.01
3. Criterion: Reject the null hypothesis if t < -t_{n-1,α} = -t_{5,0.01} = -3.365 (note n = 6),
   where t = (X̄ - µ₀)/(S/√n)
4. Calculations
X̄ = (24 + 28 + 21 + 23 + 32 + 22)/6 = 25
S² = [1/(6 - 1)] [(24-25)² + (28-25)² + (21-25)² + (23-25)² + (32-25)² + (22-25)²] = 17.6
∴ t = (25 - 29)/√(17.6/6) = -2.34
5. Decision
Since t_observed = -2.34 ≥ -3.365, we cannot reject the null hypothesis. That is, we can
say that this kind of engine will operate for at least 29 minutes per gallon with this
kind of fuel.
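For comparison, scipy.stats provides a ready-made one-sample t test; a sketch applied to the engine data of Example 9 (the 'alternative' keyword requires SciPy 1.6 or later):

    from scipy.stats import ttest_1samp

    times = [24, 28, 21, 23, 32, 22]            # minutes per gallon (Example 9)

    # H0: mu = 29 against H1: mu < 29
    t_stat, p_value = ttest_1samp(times, popmean=29, alternative='less')
    # about -2.34 and 0.033; since 0.033 > 0.01, H0 is not rejected
    print(round(t_stat, 2), round(p_value, 3))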
Example 10
A random sample from a company's very extensive files shows that orders for a certain
piece of machinery were filled, respectively, in 10, 12, 19, 14, 15, 18, 11 and 13 days. Use the
level of significance α = 0.01 to test the claim that on the average such orders are filled
in 10.5 days. Choose the alternative hypothesis so that rejection of the null hypothesis
µ = 10.5 indicates that it takes longer than indicated. Assume normality.
Solution
1. Null hypothesis H0: µ = µ₀ = 10.5
   Alt hypothesis H1: µ > 10.5
2. Level of significance α = 0.01
3. Criterion: Reject the null hypothesis if t > t_{n-1,α} = t_{7,0.01} = 2.998,
   where t = (X̄ - µ₀)/(S/√n)   (µ₀ = 10.5, n = 8)
4. Calculations
X̄ = (10 + 12 + 19 + 14 + 15 + 18 + 11 + 13)/8 = 14
S² = [1/(8 - 1)] [(10-14)² + (12-14)² + (19-14)² + (14-14)² + (15-14)² + (18-14)² + (11-14)² + (13-14)²] = 10.29
∴ t = (14 - 10.5)/√(10.29/8) = 3.09
5. Decision
Since t_observed = 3.09 > 2.998, we have to reject the null hypothesis. That is, we can
say that, on the average, such orders are filled in more than 10.5 days.
Example 11
Tests performed with a random sample of 40 diesel engines produced by a large
manufacturer show that they have a mean thermal efficiency of 31.4% with a sd of 1.6%.
At the 0.01 level of significance, test the null hypothesis µ = 32.3% against the
alternative hypothesis µ ≠ 32.3%.
Solution
1. Null hypothesis: µ = µ₀ = 32.3
   Alt hypothesis: µ ≠ 32.3
2. Level of significance α = 0.01
3. Criterion: Reject H0 if t < -t_{n-1,α/2} or t > t_{n-1,α/2}, (i.e.) if t < -t_{39,0.005} or t > t_{39,0.005}.
   Now t_{39,0.005} ≈ z_{0.005} = 2.575.
   Thus we reject H0 if t < -2.575 or t > 2.575, where t = (X̄ - µ₀)/(S/√n)
4. Calculations
t = (31.4 - 32.3)/(1.6/√40) = -3.558
5. Decision
Since t_observed = -3.558 < -2.575,
we reject H0; that is, we can say the mean thermal efficiency ≠ 32.3%.
Example 12
In 64 randomly selected hours of production, the mean and the s.d. of the number of
acceptable pieces produced by an automatic stamping machine are
X̄ = 1,038 and S = 146. At the 0.05 level of significance, does this enable us to reject the
null hypothesis µ = 1000 against the alt hypothesis µ > 1000?
Solution
1. Null hypothesis H0: µ = µ₀ = 1000
   Alt hypothesis H1: µ > 1000
2. Level of significance α = 0.05
3. Criterion: Reject H0 if t > t_{n-1,α} = t_{63,0.05}
   Now t_{63,0.05} ≈ z_{0.05} = 1.645.
   Thus we reject H0 if t > 1.645.
4. Calculations: t = (X̄ - µ₀)/(S/√n) = (1,038 - 1,000)/(146/√64) = 2.082
5. Decision: Since t_obs = 2.082 > 1.645,
   we reject H0 at the 0.05 level of significance.
REGRESSION AND CORRELATION
Regression
A major objective of many statistical investigations is to establish relationships that make
it possible to predict one or more variables in terms of others. Thus studies
are made to predict the potential sales of a new product in terms of the money spent on
advertising, the patient's weight in terms of the number of weeks he/she has been on a
diet, the marks obtained by a student in terms of the number of classes he attended, etc.
Although it is desirable to predict a quantity exactly in terms of the others, this is
seldom possible, and in most cases we have to be satisfied with predicting average or
expected values. Thus we would like to predict the average sales in terms of the money
spent on advertising, or the average income of a college graduate in terms of the number of
years he/she has been out of college.
Thus given two random variables X, Y, and given that X takes the value x, the basic
problem of bivariate regression is to determine the conditional expected value E(Y|x) as a
function of x. In most cases, we may find that E(Y|x) is a linear function of x:
E(Y|x) = α + βx, where the constants α, β are called the regression coefficients.
Denoting E(X) = µ₁, E(Y) = µ₂, √Var(X) = σ₁, √Var(Y) = σ₂, cov(X,Y) = σ₁₂, and
ρ = σ₁₂/(σ₁σ₂), we can show:
Theorem: (a) If the regression of Y on X is linear, then
E(Y|x) = µ₂ + ρ (σ₂/σ₁)(x - µ₁)
(b) If the regression of X on Y is linear, then
E(X|y) = µ₁ + ρ (σ₁/σ₂)(y - µ₂)
Note: ρ is called the correlation coefficient between X and Y.
In actual situations, we have to "estimate" the regression coefficients α, β from a random
sample {(x₁,y₁), (x₂,y₂), ..., (xₙ,yₙ)} of size n from the 2-dimensional random variable
(X, Y). We now "fit" a straight line y = a + bx for the above data by the method of "least
squares". The method of least squares says: choose constants a and b for which the
sum of the squares of the "vertical deviations" of the sample points (xᵢ, yᵢ) from the line
y = a + bx is a minimum. I.e. find a, b so that
T = Σᵢ₌₁ⁿ [yᵢ - (a + bxᵢ)]²
is a minimum. Using 2-variable calculus, we should determine a, b so that ∂T/∂a = 0 and
∂T/∂b = 0. Thus we get the following two equations:
Σᵢ₌₁ⁿ (-2)[yᵢ - (a + bxᵢ)] = 0   and   Σᵢ₌₁ⁿ (-2xᵢ)[yᵢ - (a + bxᵢ)] = 0.
Simplifying, we get the so-called "normal equations":
na + (Σᵢ₌₁ⁿ xᵢ) b = Σᵢ₌₁ⁿ yᵢ
(Σᵢ₌₁ⁿ xᵢ) a + (Σᵢ₌₁ⁿ xᵢ²) b = Σᵢ₌₁ⁿ xᵢyᵢ
Solving, we get
b = [ n Σ xᵢyᵢ - (Σ xᵢ)(Σ yᵢ) ] / [ n Σ xᵢ² - (Σ xᵢ)² ] ;   a = [ Σ yᵢ - b Σ xᵢ ] / n.
These constants a and b are used to estimate the unknown regression coefficients α, β.
Now if x = x_g, we predict y as y_g = a + b x_g.
Problem 1.
Various doses of a poisonous substance were given to groups of 25 mice and the
following results were observed:
Dose (mg)
x
Number of deaths
y
4 1
6 3
8 6
10 8
12 14
14 16
16 20
(a) Find the equation of the least squares line fit to these data
(b) Estimate the number of deaths in a group of 25 mice who receive a 7 mg dose of
this poison.
Solution:
(a) n = number of sample pairs (xᵢ, yᵢ) = 7
Σxᵢ = 70, Σyᵢ = 68, Σxᵢ² = 812, Σxᵢyᵢ = 862
Hence b = {7 × 862 - 70 × 68} / {7 × 812 - (70)²} = 1274/784 = 1.625
a = {68 - 70 × 1.625}/7 = -6.536
Thus the least squares line that fits the given data is: y = -6.536 + 1.625x
(b) If x = 7, y = -6.536 + 1.625 × 7 = 4.839.
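The normal-equation formulas can be checked with NumPy; this sketch reproduces the fit of Problem 1.

    import numpy as np

    x = np.array([4, 6, 8, 10, 12, 14, 16], dtype=float)   # dose (mg)
    y = np.array([1, 3, 6, 8, 14, 16, 20], dtype=float)    # number of deaths

    n = len(x)
    b = (n * np.sum(x * y) - np.sum(x) * np.sum(y)) / (n * np.sum(x ** 2) - np.sum(x) ** 2)
    a = (np.sum(y) - b * np.sum(x)) / n

    print(round(a, 3), round(b, 3))    # about -6.536 and 1.625
    print(round(a + b * 7, 3))         # predicted deaths for a 7 mg dose, about 4.839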
Problem 2:
The following are the scores that 12 students obtained in the midterm and final
examinations in a course in Statistics:
Mid Term Examination
x
Final Examination
y
71 83
49 62
80 76
73 77
93 89
85 74
58 48
82 78
64 76
32 51
87 73
80 89
(a) Fit a straight line to the above data
(b) Hence predict the final exam score of a student who received a score of 84 in the
midterm examination.
Solution:
(a) n = number of sample pairs (xᵢ, yᵢ) = 12
Σxᵢ = 854, Σyᵢ = 876, Σxᵢ² = 64222, Σxᵢyᵢ = 64346
Hence b = {12 × 64346 - 854 × 876} / {12 × 64222 - (854)²} = 24048/41348 = 0.5816
a = {876 - 854 × 0.5816}/12 = 31.609
Thus the least squares line that fits the given data is: y = 31.609 + 0.5816x
(b) If x = 84, y = 31.609 + 0.5816 × 84 = 80.46
Correlation
If X, Y are two random variables, the correlation coefficient ρ between X and Y is
defined as
ρ = cov(X, Y) / √( Var(X) Var(Y) ).
It can be shown that
(a) -1 ≤ ρ ≤ 1
(b) If Y is a linear function of X, ρ = ± 1
(c) If X and Y are independent, then ρ = 0
(d) If X, Y have bivariate normal distribution and if ρ = 0, then X and Y are
independent.
Sample Correlation Coefficient
If {(x₁,y₁), (x₂,y₂), ..., (xₙ,yₙ)} is a random sample of size n from the 2-dimensional
random variable (X, Y), then the sample correlation coefficient r is defined by
r = Σᵢ₌₁ⁿ (xᵢ - x̄)(yᵢ - ȳ) / √( Σᵢ₌₁ⁿ (xᵢ - x̄)² Σᵢ₌₁ⁿ (yᵢ - ȳ)² ).
We shall use r to estimate the (unknown) population correlation coefficient ρ. If (X, Y)
has a bivariate normal distribution, we can show that the random variable
Z = (1/2) ln[ (1 + r)/(1 - r) ]
is approximately normal with mean (1/2) ln[ (1 + ρ)/(1 - ρ) ] and variance 1/(n - 3).
Note: A computational formula for r is given by
r = S_xy / √( S_xx S_yy ),
where
S_xx = Σᵢ₌₁ⁿ (xᵢ - x̄)² = Σᵢ₌₁ⁿ xᵢ² - (Σᵢ₌₁ⁿ xᵢ)²/n ,
S_yy = Σᵢ₌₁ⁿ (yᵢ - ȳ)² = Σᵢ₌₁ⁿ yᵢ² - (Σᵢ₌₁ⁿ yᵢ)²/n ,
S_xy = Σᵢ₌₁ⁿ (xᵢ - x̄)(yᵢ - ȳ) = Σᵢ₌₁ⁿ xᵢyᵢ - (Σᵢ₌₁ⁿ xᵢ)(Σᵢ₌₁ⁿ yᵢ)/n .
Problem 3.
Calculate r for the data { (8, 3), (1, 4), (5, 0), (4, 2), (7, 1) }.
Solution
x̄ = 25/5 = 5,  ȳ = 10/5 = 2.
Σᵢ (xᵢ - x̄)(yᵢ - ȳ) = 3×1 + (-4)×2 + 0×(-2) + (-1)×0 + 2×(-1) = -7
Σᵢ (xᵢ - x̄)² = 9 + 16 + 0 + 1 + 4 = 30
Σᵢ (yᵢ - ȳ)² = 1 + 4 + 4 + 0 + 1 = 10
Hence r = -7/√( (30)(10) ) = -0.404.
Problem 4.
The following are the measurements of the air velocity and evaporation coefficient of
burning fuel droplets in an impulse engine:
Air velocity
x
Evaporation Coefficient
y
20 0.18
60 0.37
100 0.35
140 0.78
180 0.56
220 0.75
260 1.18
300 1.36
340 1.17
380 1.65
Find the sample correlation coefficient, r.
Solution.
S_xx = Σ xᵢ² - (Σ xᵢ)²/n = 532000 - (2000)²/10 = 132000
S_yy = Σ yᵢ² - (Σ yᵢ)²/n = 9.1097 - (8.35)²/10 = 2.13745
S_xy = Σ xᵢyᵢ - (Σ xᵢ)(Σ yᵢ)/n = 2175.4 - (2000)(8.35)/10 = 505.4
Hence r = S_xy / √( S_xx S_yy ) = 505.4 / √( (132000)(2.13745) ) = 0.9515.
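The value of r can be confirmed with NumPy's corrcoef (note the 1.36 entry, which is the value the sums used in the solution correspond to).

    import numpy as np

    x = np.array([20, 60, 100, 140, 180, 220, 260, 300, 340, 380], dtype=float)
    y = np.array([0.18, 0.37, 0.35, 0.78, 0.56, 0.75, 1.18, 1.36, 1.17, 1.65])

    r = np.corrcoef(x, y)[0, 1]
    print(round(r, 4))                 # about 0.9515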
**************

More Related Content

PPSX
Show ant-colony-optimization-for-solving-the-traveling-salesman-problem
PDF
C-V characteristics of MOS Capacitor
PPTX
Complex analysis
PPTX
Recursive Algorithm (in Azerbaijani)
PPTX
Skip lists (Advance Data structure)
PPT
Hash tables
PDF
Capítulo 38 (5th edition)con soluciones difraccion y polarizacion
PDF
Laplace table
Show ant-colony-optimization-for-solving-the-traveling-salesman-problem
C-V characteristics of MOS Capacitor
Complex analysis
Recursive Algorithm (in Azerbaijani)
Skip lists (Advance Data structure)
Hash tables
Capítulo 38 (5th edition)con soluciones difraccion y polarizacion
Laplace table

What's hot (20)

PPTX
Application of greedy method
PPTX
Beta and gamma function
PDF
Matematika Diskrit - 10 pohon - 04
PPT
Master method theorem
PPTX
01 Knapsack using Dynamic Programming
PPTX
07.03 cartesian product
PPT
Algoritma Matriks
PPTX
EC8353 ELECTRONIC DEVICES AND CIRCUITS Unit 1
PDF
Covariance and contravariance. Say what?! (Agile Talks #22)
PDF
Лекция 6: Словари. Хеш-таблицы
PPTX
Linked list in Data Structure and Algorithm
PPTX
Prims and kruskal algorithms
PDF
Import and Export Excel files using XLConnect in R Studio
PPT
12-greedy.ppt
PDF
Analisis Algoritma - Penerapan Strategi Algoritma Brute Force
PDF
Matematika Diskrit - 07 teori bilangan - 01
PDF
Generation and Recombination related to Carrier Transport
PPTX
Combinatorial Optimization
PDF
Acer A315 Quanta Z8G DA0Z8GMB8D0 Rev 1A Schematic.pdf
Application of greedy method
Beta and gamma function
Matematika Diskrit - 10 pohon - 04
Master method theorem
01 Knapsack using Dynamic Programming
07.03 cartesian product
Algoritma Matriks
EC8353 ELECTRONIC DEVICES AND CIRCUITS Unit 1
Covariance and contravariance. Say what?! (Agile Talks #22)
Лекция 6: Словари. Хеш-таблицы
Linked list in Data Structure and Algorithm
Prims and kruskal algorithms
Import and Export Excel files using XLConnect in R Studio
12-greedy.ppt
Analisis Algoritma - Penerapan Strategi Algoritma Brute Force
Matematika Diskrit - 07 teori bilangan - 01
Generation and Recombination related to Carrier Transport
Combinatorial Optimization
Acer A315 Quanta Z8G DA0Z8GMB8D0 Rev 1A Schematic.pdf
Ad

Similar to R4 m.s. radhakrishnan, probability &amp; statistics, dlpd notes. (20)

PPT
Discrete probability
PDF
mathes probabality mca syllabus for probability and stats
PDF
Course material mca
PPTX
PRP - Unit 1.pptx
PPTX
6ca48f6f-1ed7-4f2f-aaae-9903755c47e7.pptx
PPT
Indefinite integration class 12
PDF
Note 1 probability
PPT
CHapter One -Basic_Probability_Theory.ppt
KEY
Probability Review
PDF
SOMEN MONDAL(EmmmmmmmmmmmmmmCE)ROLL.23800321001.pdf
PPT
Probability notes for the UG/PG students
PPTX
STA- 321 Lecture # 8 (Probability Part II).pptx
PPTX
01_Module_1-ProbabilityTheory.pptx
PDF
STAB52 Introduction to probability (Summer 2025) Lecture 1
PPTX
Unit II PPT.pptx
DOCX
Probability[1]
PDF
PTSP notes by jntu hyderabad syallabus r18
DOCX
Lecture Notes MTH302 Before MTT Myers.docx
PPTX
Information Theory and coding - Lecture 1
Discrete probability
mathes probabality mca syllabus for probability and stats
Course material mca
PRP - Unit 1.pptx
6ca48f6f-1ed7-4f2f-aaae-9903755c47e7.pptx
Indefinite integration class 12
Note 1 probability
CHapter One -Basic_Probability_Theory.ppt
Probability Review
SOMEN MONDAL(EmmmmmmmmmmmmmmCE)ROLL.23800321001.pdf
Probability notes for the UG/PG students
STA- 321 Lecture # 8 (Probability Part II).pptx
01_Module_1-ProbabilityTheory.pptx
STAB52 Introduction to probability (Summer 2025) Lecture 1
Unit II PPT.pptx
Probability[1]
PTSP notes by jntu hyderabad syallabus r18
Lecture Notes MTH302 Before MTT Myers.docx
Information Theory and coding - Lecture 1
Ad

Recently uploaded (20)

PDF
3rd Neelam Sanjeevareddy Memorial Lecture.pdf
PDF
Pre independence Education in Inndia.pdf
PDF
2.FourierTransform-ShortQuestionswithAnswers.pdf
PDF
O5-L3 Freight Transport Ops (International) V1.pdf
PPTX
Cell Types and Its function , kingdom of life
PDF
Supply Chain Operations Speaking Notes -ICLT Program
PDF
The Lost Whites of Pakistan by Jahanzaib Mughal.pdf
PDF
FourierSeries-QuestionsWithAnswers(Part-A).pdf
PPTX
human mycosis Human fungal infections are called human mycosis..pptx
PDF
O7-L3 Supply Chain Operations - ICLT Program
PDF
TR - Agricultural Crops Production NC III.pdf
PPTX
Introduction_to_Human_Anatomy_and_Physiology_for_B.Pharm.pptx
PDF
01-Introduction-to-Information-Management.pdf
PPTX
Pharma ospi slides which help in ospi learning
PDF
Sports Quiz easy sports quiz sports quiz
PPTX
Institutional Correction lecture only . . .
PDF
Microbial disease of the cardiovascular and lymphatic systems
PDF
Abdominal Access Techniques with Prof. Dr. R K Mishra
PDF
Module 4: Burden of Disease Tutorial Slides S2 2025
PPTX
Renaissance Architecture: A Journey from Faith to Humanism
3rd Neelam Sanjeevareddy Memorial Lecture.pdf
Pre independence Education in Inndia.pdf
2.FourierTransform-ShortQuestionswithAnswers.pdf
O5-L3 Freight Transport Ops (International) V1.pdf
Cell Types and Its function , kingdom of life
Supply Chain Operations Speaking Notes -ICLT Program
The Lost Whites of Pakistan by Jahanzaib Mughal.pdf
FourierSeries-QuestionsWithAnswers(Part-A).pdf
human mycosis Human fungal infections are called human mycosis..pptx
O7-L3 Supply Chain Operations - ICLT Program
TR - Agricultural Crops Production NC III.pdf
Introduction_to_Human_Anatomy_and_Physiology_for_B.Pharm.pptx
01-Introduction-to-Information-Management.pdf
Pharma ospi slides which help in ospi learning
Sports Quiz easy sports quiz sports quiz
Institutional Correction lecture only . . .
Microbial disease of the cardiovascular and lymphatic systems
Abdominal Access Techniques with Prof. Dr. R K Mishra
Module 4: Burden of Disease Tutorial Slides S2 2025
Renaissance Architecture: A Journey from Faith to Humanism

R4 m.s. radhakrishnan, probability &amp; statistics, dlpd notes.

  • 1. Study Material for Probability and Statistics AAOC ZC111 Distance Learning Programmes Division Birla Institute of Technology & Science Pilani – 333031 (Rajasthan) July 2003
  • 2. Course Developed by M.S.Radhakrishnan Word Processing & Typesetting by Narendra Saini Ashok Jitawat
  • 3. Contents Page No. INTRODUCTION, SAMPLE SPACES & EVENTS 1 Probability 1 Events 2 AXIOMS OF PROBABILITY 4 Some elementary consequences of the Axioms 4 Finite Sample Space (in which all outcomes are equally likely) 6 CONDITIONAL PROBABILITY 11 Independent events 11 Theorem on Total Probability 14 BAYE’S THEOREM 16 MATHEMATICAL EXPECTATION & DECISION MAKING 22 RANDOM VARIABLES 26 Discrete Random Variables 27 Binomial Distribution 28 Cumulative Binomial Probabilities 29 Binomial Distribution – Sampling with replacement 31 Mode of a Binomial distribution 31 Hyper Geometric Distribution (Sampling without replacement) 32 Binomial distribution as an approximation to the Hypergeometric Distribution 34 THE MEAN AND VARIANCE OF PROBABILITY DISTRIBUTIONS 36 The mean of a Binomial Distribution 37 Digression 37 Chebychevs theorem 39 Law of large numbers 41 Poisson Distribution 42 Poisson approximation to binomial distribution 42
  • 4. Cumulative Poisson distribution 43 Poisson Process 43 The Geometric Distribution 46 Multinomial Distribution 52 Simulation 54 CONTINUOUS RANDOM VARIABLES 56 Probability Density Function (pdf) 57 Normal Distribution 64 Normal Approximation to Binomial Distribution 69 Correction for Continuity 70 Other Probability Densities 71 The uniform Distribution 71 Gamma Function 73 Properties of Gamma Function 74 The Gamma Distribution 74 Exponential Distribution 74 Beta Distribution 78 The Log-Normal Distribution 79 JOINT DISTRIBUTIONS – TWO AND HIGHER DIMENSIONAL RANDOM VARIABLES 83 Conditional Distribution 86 Independence 87 Two-Dimensional Continuous Random Variables 88 Marginal and Conditional Densities 90 Independence 91 The Cumulative Distribution Function 93 Properties of Expectation 100 Sample Mean 101 Sample Variance 102
  • 5. SAMPLING DISTRIBUTION 115 Statistical Inference 115 Statistics 116 The Sampling Distribution of the Sample Mean X . 117 Inferences Concerning Means 128 Point Estimation 128 Estimation of n 130 Estimation of Sample proportion 143 Large Samples 143 Tests of Statistical Hypothesis 148 Notation 149 REGRESSION AND CORRELATION 164 Regression 164 Correlation 167 Sample Correlation Coefficient 167
  • 6. 1 INTRODUCTION, SAMPLE SPACES & EVENTS Probability Let E be a random experiment (where we ‘know’ all possible outcomes but can’t predict what the particular outcome will be when the experiment is conducted). The set of all possible outcomes is called a sample space for the random experiment E. Example 1: Let E be the random experiment: Toss two coins and observe the sequence of heads and tails. A sample space for this experiment could be { }TTHTTHHHS ,,,= . If however we only observe the number of heads got, the sample space would be S = {0, 1, 2}. Example 2: Let E be the random experiment: Toss two fair dice and observe the two numbers on the top. A sample space would be ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) −−−−−−−−−− −−−−− −−−−−− = )6,6(,1,6 | ,1,3 ,3,2,2,2,1,2 6,1,,3,1,2,1,1,1 S If however, we are interested only in the sum of the two numbers on the top, the sample space could be S = { 2, 3, …, 12}. Example 3: Let E be the random experiment: Count the number of machines produced by a factory until a defective machine is produced. A sample space for this experiment could be { }−−−−−−= ,3,2,1S .
  • 7. 2 Example 4: Let E be the random experiment: Count the life length of a bulb produced by a factory. Here S will be { } ).,0[0| ∞=≥tt Events An event is a subset of the sample space. Example 5: Suppose a balanced die is rolled and we observe the number on the top. Let A be the event: an even number occurs. Thus in symbols, { } { }6,5,4,3,2,16,4,2 =⊂= SA Two events are said to be mutually exclusive if they cannot occur together; that is there is no element common between them. In the above example if B is the event: an odd number occurs, i.e. B = { }5,3,1 , then A and B are mutually exclusive. Solved Examples Example 1: A manufacturer of small motors is concerned with three major types of defects. If A is the event that the shaft size is too large, B is the event that the windings are improper and C is the event that the electrical connections are unsatisfactory, express in words what events are represented by the following regions of the Venn diagram given below: (a) region 2 (b) regions 1 and 3 together (c) regions 3, 5, 6 and 8 together.
  • 8. 3 Solution: (a) Since this region is contained in A and B but not in C, it represents the event that the shaft is too large and the windings improper but the electrical connections are satisfactory. (b) Since this region is common to B and C, it represents the event that the windings are improper and the electrical connections are unsatisfactory. (c) Since this is the entire region outside A, it represents the event that the shaft size is not too large. Example 2: A carton of 12 rechargeable batteries contain one that is defective. In how many ways can the inspector choose three of the batteries and (a) get the one that is defective (b) not get the one that is defective. Solution: (a) one defective can be chosen in one way and two good ones can be chosen in 55 2 11 = ways. Hence one defective and two good can be chosen in 1 x 55 = 55 ways. (b) Three good ones can be chosen in 165 3 11 = ways 8 A 7 B 2 5 1 4 3 C 6
  • 9. 4 AXIOMS OF PROBABILITY Let E be a random experiment. Suppose to each event A, we associate a real number P(A) satisfying the following axioms: (i) ( ) 10 ≤≤ AP (ii) ( ) 1=SP (iii) If A and B are any two mutually exclusive events, then ( ) ( ) ( )BPAPBAP +=∪ (iv) If {A1, A2 - - - - - -An , …} is a sequence of pair- wise mutually exclusive events, then ...)(...)()(...)...( 2121 ++++=∪∪∪∪ nn APAPAPAAAP We call P(A) the probability of the event A. Axiom 1 says that the probability of an event is always a number between 0 and 1. Axiom 2 says that the probability of the certain event S is 1. Axiom 3 says that the probability is an additive set function. Some elementary consequences of the Axioms 1. ( ) 0=φP Proof: S= φ∪S .. Now S and φ are disjoint. Hence .0)()()()( =+= φφ PPSPSP Q.E.D. 2. If nAAA ,...,, 21 are any n pair-wise mutually exclusive events, then ( ) ( ) = =∪∪∪ n i in APAAAP 1 21 ... . Proof: By induction on n. Def.: If A is an event A′ the complementary event = S-A (It is the shaded portion in the figure below) A
  • 10. 5 3. )(1)( APAP −=′ Proof: AAS ′∪= Now )()()( APAPSP ′+= as A and A′ are disjoint or 1 = )()( APAP ′+ . Thus )(1)( APAP −=′ . Q.E.D. 4. Probability is a subtractive set function; i.e. If BA ⊂ , then )()()( APBPABP −=− . 5. Probability is a monotone set function: i.e. )()( BPAPBA ≤⊂ Proof: ( )ABAB −∪= where A, B-A are disjoint. Thus ).()()()( APABPAPBP ≥−+= BA ∩ 6. If A, B are any two events, ( ) ( )BAPBPAPBAP ∩−+=∩ )()( Proof: ( ) ( )BAABA ∩′∪=∪ ) where A and BA ∩′ are disjoint Hence ( ) ( )BAPAPBAP ∩′+=∪ )( But ( ) ( ),BABAB ∩′∪∩= union of two disjoint sets ( ) ( ) ( ) ( ) ( ). )( BAPBPBAPor BAPBAPBP ∩−=∩′ ∩′+∩= ( ) ( )BAPBPAPBAP ∩−+=∪∴ )()( . Q.E.D. 7. If A, B, C are any three events, ( ) )CBA(P)AC(P)CB(P)BA(P)C(P)B(P)A(PCBAP ∩∩+∩−∩−∩−++=∪∪ . B A A B BA ∩′
  • 11. 6 Proof: ( )( ) )CBA(P)CB(P)CA(P)BA(P)C(P)B(P)A(P ))CB()CA((P)BA(P)C(P)B(P)A(P )C)BA((P)C(P)BA(P)B(P)A(P CBAP)C(P)BA(P)CBA(P ∩∩+∩−∩−∩−++= ∩∪∩−∩−++= ∩∪−+∩−+= ∩∪−+∪=∪∪ More generally, 8. If nAAA ,...,, 21 are any n events. )AAA(P)1( ...)AAA(P)AA(P)A(P )A...AA(P n21 1n nkji1 kji nj1i ji n 1i I n21 ∩−−−−−−−∩∩−+ −∩∩+∩−= ∪∪∪ − <≤<≤≤<≤= Finite Sample Space (in which all outcomes are equally likely) Let E be a random experiment having only a finite number of outcomes. Let all the (finite no. of) outcomes be equally likely. If { }naaaS ,...,, 21= ( naaa ,...,, 21 are equally likely outcomes), { } { } { }n21 a.......aaS ∪= .a union of m.e. events. Hence { } { }( )naPaPaPSP −−−+= 21})({)( But P({a1})=P({a2})= …= P({an}) = p (say) Hence 1 = p+ p+ . . . +p (n terms) or p = 1/n Hence if A is a subset consisting of ‘k’ of these outcomes, A ={a1, a2………ak}, then n k AP =)( = outcomesofno.Total outcomesfavorableofNo. .
  • 12. 7 Example 1: If a card is drawn from a well-shuffled pack of 52 cards find the probability of drawing (a) a red king Ans: 52 2 (b) a 3, 4, 5 or 6 Ans: 52 16 (c) a black card Ans: 2 1 (d) a red ace or a black queen Ans: 52 4 Example 2: When a pair of balanced die is thrown, find probability of getting a sum equal to (a) 7. Ans: 6 1 36 6 = (Total number of equally likely outcomes is 36 & the favourable number of outcomes = 6, namely (1,6), (2,5),, …(6,1).) (b) 11 Ans: 36 2 (c) 7 or 11 Ans: 36 8 (d) 2, 3 or 12 Ans: = 36 4 36 1 36 2 36 1 =++ . Example 3: 10 persons in a room are wearing badges marked 1 through 10. 3 persons are chosen at random and asked to leave the room simultaneously and their badge nos are noted. Find the probability that (a) the smallest badge number is 5. (b) the largest badge number is 5.
  • 13. 8 Solution: (a) 3 persons can be chosen in 10 C3 equally likely ways. If the smallest badge number is to be 5, the badge numbers should be 5 and any two of the 5 numbers 6, 7, 8, 9,10. Now 2 numbers out of 5 can be chosen in 5 C2 ways. Hence the probability that the smallest badge number is 5 is 5 C2 /10 C3 . (b) Ans. 4 C2 /10 C3 . Example 4: A lot consists of 10 good articles, 4 articles with minor defects and 2 with major defects. Two articles are chosen at random. Find the probability that (a) both are good Ans: 2 16 2 10 C C (b) both have major defects Ans: 2 16 2 2 C C (c) At least one is good Ans: 1 – P(none is good) = 2 1 16 6 1 c c − (d) Exactly one is good Ans: 2 11 16 6.10 c cc (e) At most one is good Ans. P(none is good) + P(exactly one is good) = 2 11 2 2 16 6.10 16 6 c cc c c + (f) Neither has major defects Ans: 2 2 16 14 c c (g) Neither is good Ans: 2 2 16 6 c c
  • 14. 9 Example 5: From 6 positive and 8 negative integers, 4 integers are chosen at random and multiplied. Find the probability that their product is positive. Solution: The product is positive if all the 4 integers are positive or all of them are negative or two of them are positive and the other two are negative. Hence the probability is ++ 4 14 2 8 2 6 4 14 4 8 4 14 4 6 Example 6: If, A, B are mutually exclusive events and if P(A) = 0.29, P(B) = 0.43, then (a) 0.710.291)AP( =−=′ (b) P(A∪B) = 0.29 + 0.43 = 0.72 (c) P( ) m.e.)areBandAsince,BofsubsetaisA(as[0.29BA ′==′∩ )A(P (d) 0.280.721B)P(A1)BAP( =−=∪−=′∩′ Example 7: P(A) = 0.35, P(B) = 0.73, P 0.14B)(A =∩ . Find (a) P (A ∪ B) = P(A) + P(B) - P( A ∩ B) = 0.94. (b) 0.59B)P(AP(B)B)A(P =∩−=∩′ (c) 0.21B)P(AP(A))B(AP =∩−=′∩ (d) 0.860.141B)P(A1)BAP( =−=∩−=′∪′ Example 8: A, B, C are 3 mutually exclusive events. Is this assignment of probabilities possible? P(A) = 0.3, P(B) = 0.4, P(C) = 0.5
  • 15. 10 Ans. P(A ∪ B ∪ C) = P(A) + P(B) + P(C) >1 NOT POSSIBLE Example 9: Three newspapers are published in a city. A recent survey of readers indicated the following: 20% read A 8% read A and B 2% read all 16% read B 5% read A and C 14% read C 4% read B and C Find probability that an adult chosen at random reads (a) none of the papers. Ans. 0.65 100 2458141620 1C)BP(A1 = +−−−+++ −=∪∪− (b) reads exactly one paper. P (Reading exactly one paper) 0.22 100 769 = ++ = (c) reads at least A and B given he reads at least one of the papers. P (At least reading A and B given he reads at least one of the papers) = 35 8 C)BP(A B)P(A = ∪∪ ∩ A B C 9 6 3 6 22 7
  • 16. 11 CONDITIONAL PROBABILITY Let, A, B be two events. Suppose P(B) ≠ 0. The conditional probability of A occurring given that B has occurred is defined as P(A | B) = probability of A given B = . P(B) B)P(A ∩ Similarly we define P(B | A) = P(A) B)P(A ∩ if P(A) ≠ 0. Hence we get the multiplication theorem 0)P(A)(if)P(A).P(B/AB)P(A ≠=∩ ) 0)P(B)(if)P(B).P(A/B ≠= Example 10 A bag contains 4 red balls and 6 black balls. 2 balls are chosen at random one by one without replacement. Find the probability that both are red. Solution Let A be the event that the first ball drawn is red, B the event the second ball drawn is red. Hence the probability that both balls drawn are red = 15 2 9 3 10 4 A)|P(BP(A)B)P(A =×=×=∩ Independent events: Definition: We say two events A, B are independent if P(A∩ B) = P(A). P(B) Equivalently A and B are independent if P(B | A) = P(B) or P(A | B) = P(A) Theorem If, A, B are independent, then (a) A′ , B are independent (b) A, B′ are independent (c) B,A ′′ are independent
  • 17. 12 Proof We can write B = (A ∩ B) ∪ (A′ ∩ B), a union of mutually exclusive events. Hence P(B) = P(A ∩ B) + P(A′ ∩ B), so that P(A′ ∩ B) = P(B) − P(A ∩ B) = P(B) − P(A) P(B) (since A and B are independent) = P(B) [1 − P(A)] = P(B) P(A′). ∴ A′ and B are also independent. By the same reasoning, A and B′ are independent, and applying the result once more, A′ and B′ are independent. Example 11 Find the probability of getting 8 heads in a row in 8 tosses of a fair coin. Solution If Ai is the event of getting a head in the ith toss, A1, A2, …, A8 are independent and P(Ai) = 1/2 for all i. Hence P(getting all heads) = P(A1) P(A2)…P(A8) = (1/2)^8. Example 12 It is found that in manufacturing a certain article, defects of one type occur with probability 0.1 and defects of the other type occur with probability 0.05. Assume independence between the two types of defects. Find the probability that an article chosen at random has exactly one type of defect given that it is defective.

  • 18. 13 Let A be the event that article has exactly one type of defect. Let B be the event that the article is defective. Required P(B) B)P(A B)|P(A ∩ = P(B) = P(D ∪ E) where D is the event it has type one defect E is the event it has type two defect = P(D) + P(E) – P(D ∩ E) = 0.1 + 0.05 - (0.1) (0.05) = 0.145 P(A∩ B) = P (article is having exactly one type of defect) = P(D) + P(E) – 2 P(D ∩ E) = 0.1 + 0.05 - 2 (0.1) (0.05) = 0.14 ∴Probability = 145.0 14.0 [Note: If A and B are two events, probability that exactly only one of them occurs is P(A) + P(B) – 2P(A∩ B)] Example 13 An electronic system has 2 subsystems A and B. It is known that P (A fails) = 0.2 P (B fails alone) = 0.15 P (A and B fail) = 0.15 Find (a) P (A fails | B has failed) (b) P (A fails alone)
  • 19. 14 Solution (a) P(A fails | B has failed) = P(A and B fail) / P(B failed) = 0.15 / 0.30 = 1/2 (note P(B failed) = P(B fails alone) + P(A and B fail) = 0.15 + 0.15 = 0.30). (b) P(A fails alone) = P(A fails) − P(A and B fail) = 0.20 − 0.15 = 0.05 Example 14 A binary number is a number having digits 0 and 1. Suppose a binary number is made up of ‘n’ digits. Suppose the probability of forming an incorrect binary digit is p. Assume independence between errors. What is the probability of forming an incorrect binary number? Ans 1 − P(forming a correct no.) = 1 − (1 − p)^n. Example 15 A question paper consists of 5 multiple choice questions, each of which has 4 choices (of which only one is correct). If a student answers all the five questions randomly, find the probability that he answers all questions correctly. Ans (1/4)^5. Theorem on Total Probability Let B1, B2, …, Bn be n mutually exclusive events of which one must occur. If A is any other event, then P(A) = P(A ∩ B1) + P(A ∩ B2) + … + P(A ∩ Bn) = Σ (i = 1 to n) P(Bi) P(A | Bi) (For a proof, see your text book.) Example 16 There are 2 urns. The first one has 4 red balls and 6 black balls. The second has 5 red balls and 4 black balls. A ball is chosen at random from the 1st and put in the 2nd. Now a ball is drawn at random from the 2nd urn. Find the probability it is red.
  • 20. 15 Solution: Let B1 be the event that the first ball drawn is red and B2 be the event that the first ball drawn is black. Let A be the event that the second ball drawn is red. By the theorem on total probability, P(A) = P(B1) P(A | B1) + P(B2) P(A | B2) = 100 54 10 5 10 6 10 6 10 4 =×+× =0.54. Example 17: A consulting firm rents cars from three agencies D, E, F. 20% of the cars are rented from D, 20% from E and the remaining 60% from F. If 10% of cars rented from D, 12% of cars rented from E, 4% of cars rented from F have bad tires, find the probability that a car rented from the consulting firm will have bad tires. Ans. (0.2) (0.1) + (0.2) (0.12) + (0.6) (0.04) Example 18: A bolt factory has three divisions B1, B2, B3 that manufacture bolts. 25% of output is from B1, 35% from B2 and 40% from B3. 5% of the bolts manufactured by B1 are defective, 4% of the bolts manufactured by B2 are defective and 2% of the bolts manufactured by B3 are defective. Find the probability that a bolt chosen at random from the factory is defective. Ans. 100 2 100 40 100 4 100 35 100 5 100 25 ×+×+×
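The theorem on total probability is easy to verify numerically. The sketch below computes the answer to Example 16 exactly and also by a small simulation; the function names and the number of trials are our own choices.

import random

def p_second_red_exact():
    p_b1 = 4 / 10                                # first ball drawn is red
    p_b2 = 6 / 10                                # first ball drawn is black
    return p_b1 * (6 / 10) + p_b2 * (5 / 10)     # urn 2 holds 10 balls after the transfer

def p_second_red_simulated(trials=100_000):
    hits = 0
    for _ in range(trials):
        first_red = random.random() < 4 / 10           # colour of the transferred ball
        reds_in_urn2 = 5 + (1 if first_red else 0)     # urn 2 started with 5 red, 4 black
        hits += random.random() < reds_in_urn2 / 10
    return hits / trials

print(p_second_red_exact())        # 0.54
print(p_second_red_simulated())    # close to 0.54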
  • 21. 16 BAYES’ THEOREM Let B1, B2, ……….Bn be n mutually exclusive events of which one of them must occur. If A is any event, then )B|)P(AP(B )B|)P(AP(B P(A) )BP(A A)|P(B ii 1i kkk k n = = ∩ = Example 19 Miss ‘X’ is fond of seeing films. The probability that she sees a film on the day before the test is 0.7. Miss X is any way good at studies. The probability that she maxes the test is 0.3 if she sees the film on the day before the test and the corresponding probability is 0.8 if she does not see the film. If Miss ‘X’ maxed the test, find the probability that she saw the film on the day before the test. Solution Let B1 be the event that Miss X saw the film before the test and let B2 be the complementary event. Let A be the event that she maxed the test. Required. P(B1 | A) )B|P(AP(B))B|P(A)P(B )B|)P(AP(B 211 11 ×+× = Example 20 At an electronics firm, it is known from past experience that the probability a new worker who attended the company’s training program meets the production quota is 0.86. The corresponding probability for a new worker who did not attend the training program is 0.35. It is also known that 80% of all new workers attend the company’s training 8.03.03.07.0 3.07.0 ×+× × =
  • 22. 17 program. Find the probability that a new worker who met the production quota would have attended the company’s training programme. Solution Let B1 be the event that a new worker attended the company’s training programme. Let B2 be the complementary event, namely that a new worker did not attend the training programme. Let A be the event that a new worker met the production quota. Then we want P(B1 | A) = (0.8 × 0.86) / (0.8 × 0.86 + 0.2 × 0.35). Example 21 A printing machine can print any one of n letters L1, L2, …, Ln. It is operated by electrical impulses, each letter being produced by a different impulse. Assume that there is a constant probability p that any impulse prints the letter it is meant to print. Also assume independence. One of the impulses is chosen at random and fed into the machine twice. Both times, the letter L1 was printed. Find the probability that the impulse chosen was meant to print the letter L1. Solution: Let B1 be the event that the impulse chosen was meant to print the letter L1. Let B2 be the complementary event. Let A be the event that both times the letter L1 was printed. P(B1) = 1/n, so P(B2) = (n − 1)/n, and P(A | B1) = p². Now the probability that an impulse prints a wrong letter is (1 − p), and since there are n − 1 ways of printing a wrong letter, the probability that a wrongly-assigned impulse prints L1 in one feed is (1 − p)/(n − 1); hence P(A | B2) = [(1 − p)/(n − 1)]². Hence P(B1 | A) = P(B1) P(A | B1) / [P(B1) P(A | B1) + P(B2) P(A | B2)] = (p²/n) / [p²/n + (1 − p)²/(n(n − 1))]. This is the required probability.
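Bayes' theorem is a one-line computation once the priors and likelihoods are listed. A minimal sketch, applied to Example 20 (the helper name is ours):

def bayes_posterior(priors, likelihoods, k):
    # P(B_k | A) for mutually exclusive, exhaustive events B_i
    total = sum(p * l for p, l in zip(priors, likelihoods))   # P(A) by total probability
    return priors[k] * likelihoods[k] / total

# B1 = attended training, B2 = did not attend; A = met the production quota
print(bayes_posterior([0.8, 0.2], [0.86, 0.35], 0))   # about 0.908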
  • 23. 18 Miscellaneous problems 1 (a). Suppose the digits 1,2,3 are written in a random order. Find probability that at least one digit occupies its proper place. Solution There are 3! = 6 ways of arranging 3 digits (See the figure), out of which in 4 arrangements , at least one digit occupies its proper place. Hence the probability is 3! 4 = 6 4 . 123 213 312 132 231 321 (Remark. An arrangement like 231, where no digit occupies its proper place is called a derangement.) (b) Same as (a) but with 4 digits 1,2,3,4 Ans. 24 15 (Try proving this.) Solution Let A1 be the Event 1st digit occupies its proper place A2 be the Event 2nd digit occupies its proper place A3 be the Event 3rd digit occupies its proper place A4 be the Event 4th digit occupies its proper place P(at least one digit occupies its proper place) =P(A1∪A2 ∪A3 ∪A4) =P(A1) + P(A2) + P(A3) + P(A4) (There are 4C1 terms each with the same probability) )AAP(...)AP(A)AP(A)AP(A 43413121 ∩−−∩−∩−∩− (There are 4C2 terms each with the same probability) )AAP(A...)AAP(A).AAP(A 432421321 ∩∩++∩∩+∩∩+ (There are 4C3 terms each with the same probability) - )AAAAP( 4321 ∩∩∩ 4! 0! 4c 4! 1! 4c 4! 2! 4c 4! 3! 4c 4321 −+−=
  • 24. 19 24 1 6 1 2 1 1 −+−= 24 15 24 141224 = −+− = (c) Same as (a) but with n digits. Solution Let A1 be the Event 1st digit occupies its proper place A2 be the Event 2nd digit occupies its proper place …………………… An be the Event nth digit occupies its proper place P(at least one digit occupies its proper place) = P(A1∪A2 ∪ … ∪An) = ! 1 (-1)......- n! 3)!(n nc n! 2)!(n nc n! 1)!(n nc 1-n 321 n + − + − − − ! 1 1)(.......... 4! 1 3! 1 2! 1 1 1n n − −−+−= ≈ ! e1 − − (for n large). 2. In a party there are ‘n’ married couples. If each male chooses at random a female for dancing, find the probability that no man chooses his wife. Ans 1-( ! 1 1)(.......... 4! 1 3! 1 2! 1 1 1n n − −−+− ). 3. A and B play the following game. They throw alternatively a pair of dice. Whosoever gets sum of the two numbers on the top as seven wins the game and the game stops. Suppose A starts the game. Find the probability (a) A wins the game (b) B wins the game.
  • 25. 20 Solution A wins the game if he gets seven in the 1st throw or in the 3rd throw or in the 5th throw or …. Hence P(A wins) = 6 1 6 5 6 5 6 5 6 5 6 1 6 5 6 5 6 1 ××××+××+ + … = . 11 6 36 2536 6 1 6 5 1 6 1 2 = − = − P(B wins) = complementary probability = 11 5 . 4. Birthday Problem There are n persons in a room. Assume that nobody is born on 29th Feb. Assume that any one birthday is as likely as any other birth day. Find the probability that no two persons will have same birthday. Solution If n > 365, at least two will have the same birthday and hence the probability that no two will have the same birthday is 0. If n ≤ 365, the desired probability is ( )[ ] n (365) 1n365.........364365 −−×× = . 5. A die is rolled until all the faces have appeared on top. (a) What is probability that exactly 6 throws are needed? Ans. 6 6 !6 (b) What is probability that exactly ‘n’ throws are needed? ( )6n >
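For the birthday problem (problem 4 above), the product 365 × 364 × … × (365 − n + 1) / 365^n is easy to evaluate directly. A short sketch, with a few illustrative values of n chosen by us:

def p_all_distinct(n):
    if n > 365:
        return 0.0
    p = 1.0
    for i in range(n):
        p *= (365 - i) / 365      # the (i+1)-th person must avoid the i birthdays already used
    return p

for n in (10, 23, 50):
    print(n, round(p_all_distinct(n), 4))   # for n = 23 the value is about 0.4927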
  • 26. 21 6. Polya’s urn problem An urn contains g green balls and r red balls. A ball is chosen at random and its color is noted. Then the ball is returned to the urn and c more balls of same color are added. Now a ball is drawn. Its color is noted and the ball is replaced. This process is repeated. (a) Find probability that 1st ball drawn is green. Ans. rg g + (b) Find the probability that the 2nd ball drawn is green. Ans. rg g crg g rg r crg cg rg g + = +++ + ++ + × + (c) Find the probability that the nth ball drawn is green. The surprising answer is rg g + . 7. There are n urns and each urn contains a white and b red balls. A ball is chosen from Urn 1 and put into Urn 2. Now a ball is chosen at random from urn 2 and put into urn 3 and this is continued. Finally a ball drawn from Urn n. Find the probability that it is white. Solution Let pr = Probability that the ball drawn from Urn r is white. ∴ 1 )p(1 1 1 pp 1r1rr ++ ×−+ ++ + ×= −− aa a ba a ; r = 1, 2, …, n. This is a recurrence relation for pr. Noting that p1 = ba a + , we can find pn.
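For problem 7, iterating the recurrence shows that p_r never moves away from its starting value a/(a + b) (the value a/(a + b) is a fixed point of the recurrence), so p_n = a/(a + b) for every n, mirroring the surprising answer in Polya's urn problem. A small sketch with illustrative values of a and b chosen by us:

def p_white_from_urn(n, a, b):
    p = a / (a + b)                                        # p_1
    for _ in range(2, n + 1):
        p = p * (a + 1) / (a + b + 1) + (1 - p) * a / (a + b + 1)
    return p

print(p_white_from_urn(1, 3, 2))     # 0.6
print(p_white_from_urn(10, 3, 2))    # still 0.6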
  • 27. 22 MATHEMATICAL EXPECTATION & DECISION MAKING Suppose we roll a die n times. What is the average of the n numbers that appear on the top? Suppose 1 occurs on the top n1 times Suppose 2 occurs on the top n2 times Suppose 3 occurs on the top n3 times Suppose 4 occurs on the top n4 times Suppose 5 occurs on the top n5 times Suppose 6 occurs on the top n6 times Total of the n numbers on the top = 621 n....6..........n1n1 ×+×+× ∴Average of the n numbers, nnnn 621621 n 6... n 2 n 1 n6..........n2n1 ×++×+×= ××+× = Here clearly n1, n2, …, n6 are unknown. But by the relative frequency definition of probability, we may approximate n n1 by P(getting 1 on the top) = 6 1 , n n2 by P(getting 2 on the top) = 6 1 , and so on. So we can ‘expect’ the average of the n numbers to be 5.3 2 7 = . We call this the Mathematical Expectation of the number on the top. Definition Let E be a random experiment with n outcomes a1, a2 ……….an. Suppose P({a1})=p1, P({a2})=p2, …, P({an})=pn. Then we define the mathematical expectation as nn2211 pa.........papa ×+×+×
  • 28. 23 Problems 1. If a service club sells 4000 raffle tickets for a cash prize of $800, what is the mathematical expectation of a person who buys one of these tickets? Solution. 800 × (1/4000) + 0 × (3999/4000) = 1/5 = $0.20. 2. A charitable organization raises funds by selling 2000 raffle tickets for a 1st prize worth $5000 and a second prize worth $100. What is the mathematical expectation of a person who buys one of the tickets? Solution. 5000 × (1/2000) + 100 × (1/2000) + 0 × (1998/2000) = $2.55. 3. A game between 2 players is called fair if each player has the same mathematical expectation. If someone gives us $5 whenever we roll a 1 or a 2 with a balanced die, what must we pay him when we roll a 3, 4, 5 or 6 to make the game fair? Solution. If we pay $x when we roll a 3, 4, 5 or 6, then for the game to be fair, x × (4/6) = 5 × (2/6), or x = 2.5. That is, we must pay $2.50. 4. Gambler’s Ruin A and B are betting on repeated flips of a balanced coin. At the beginning, A has m dollars and B has n dollars. After each flip the loser pays the winner 1 dollar and the game stops when one of them is ruined. Find the probability that A will win B’s n dollars before he loses his m dollars. Solution. Let p be the probability that A wins (so that 1 − p is the probability that B wins). Since the game is fair, A’s math exp = B’s math exp. Thus n × p + 0 × (1 − p) = m × (1 − p) + 0 × p, or p = m/(m + n).
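The gambler's ruin answer p = m/(m + n) can be checked by playing the game many times. A simulation sketch (the stakes m, n and the number of trials are arbitrary illustrative choices):

import random

def a_wins(m, n):
    a = m
    while 0 < a < m + n:
        a += 1 if random.random() < 0.5 else -1   # A gains or loses one dollar per flip
    return a == m + n                             # True if B has been ruined

m, n, trials = 3, 7, 20_000
wins = sum(a_wins(m, n) for _ in range(trials))
print(wins / trials, m / (m + n))                 # both should be near 0.3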
  • 29. 24 5. An importer is offered a shipment of machines for $140,000. The probabilities that he will sell them for $180,000, $170,000 or $150,000 are respectively 0.32, 0.55 and 0.13. What is his expected profit? Solution. Expected profit = 40,000 × 0.32 + 30,000 × 0.55 + 10,000 × 0.13 = $30,600. 6. The manufacturer of a new battery additive has to decide whether to sell her product for $0.80 a can, or for $1.20 a can with a ‘double your money back if not satisfied’ guarantee. How does she feel about the chances that a person will ask for double his/her money back if (a) she decides to sell the product for $0.80 (b) she decides to sell the product for $1.20 (c) she cannot make up her mind? Solution. Let p be the prob that a person will ask for double his money back. In the 1st case, she gets a fixed amount of $0.80 a can. In the 2nd case, she expects to get for each can (1.20)(1 − p) + (−1.20)(p) = 1.20 − 2.40p. (a) happens if 0.80 > 1.20 − 2.40p, i.e. if p > 1/6 (b) happens if p < 1/6 (c) happens if p = 1/6
  • 30. 25 7. A manufacturer buys an item for $1.20 and sells it for $4.50. The probabilities for a demand of 0, 1, 2, 3, 4, “5 or more” items are 0.05, 0.15, 0.30, 0.25, 0.15, 0.10 respectively. How many items must he stock to maximize his expected profit?
No. of items stocked | No. sold (with prob.) | Exp. profit
0 | 0 (1.00) | 0
1 | 0 (0.05), 1 (0.95) | 0 × 0.05 + 4.5 × 0.95 − 1.2 = 3.075
2 | 0 (0.05), 1 (0.15), 2 (0.80) | 0 × 0.05 + 4.5 × 0.15 + 9 × 0.80 − 2.4 = 5.475
3 | 0 (0.05), 1 (0.15), 2 (0.30), 3 (0.50) | 0 × 0.05 + 4.5 × 0.15 + 9 × 0.30 + 13.5 × 0.50 − 3.6 = 6.525
4 | 0 (0.05), 1 (0.15), 2 (0.30), 3 (0.25), 4 (0.25) | 0 × 0.05 + 4.5 × 0.15 + 9 × 0.30 + 13.5 × 0.25 + 18 × 0.25 − 4.8 = 6.45
Hence he must stock 3 items to maximize his expected profit. 8. A contractor has to choose between 2 jobs. The 1st job promises a profit of $240,000 with probability 0.75 and a loss of $60,000 with probability 0.25. The 2nd job promises a profit of $360,000 with probability 0.5 and a loss of $90,000 with probability 0.5. (a) Which job should the contractor choose to maximize his expected profit? i. Exp. profit for job 1 = 240,000 × 3/4 − 60,000 × 1/4 = $165,000 ii. Exp. profit for job 2 = 360,000 × 1/2 − 90,000 × 1/2 = $135,000 Go in for job 1. (b) What job would the contractor probably choose if her business is in bad shape and she goes broke unless she makes a profit of $300,000 on her next job? Ans: She takes job 2, since it is the only one that can give her a profit of at least $300,000 (job 1 pays at most $240,000).
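The stocking table in problem 7 is just an expected-profit calculation repeated for each stock level. A minimal sketch of that calculation, treating the “5 or more” demand as selling out whatever is stocked:

demand_prob = {0: 0.05, 1: 0.15, 2: 0.30, 3: 0.25, 4: 0.15, 5: 0.10}
cost, price = 1.20, 4.50

def expected_profit(stock):
    ep = -cost * stock                        # all stocked items are paid for
    for d, p in demand_prob.items():
        ep += price * min(d, stock) * p       # can sell at most what is stocked
    return ep

for s in range(6):
    print(s, round(expected_profit(s), 3))    # maximum expected profit occurs at s = 3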
  • 31. 26 RANDOM VARIABLES Let E be a random experiment. A random variable (r.v) X is a function that associates to each outcome s, a unique real number X (s). Example 1 Let E be the random experiment of tossing a fair coin 3 times. We see that there are 823 = outcomes TTT, HTT, THT, TTH, HHT, HTH, THH, HHH all of which are equally likely. Let X be the random variable that ‘counts’ the number of heads obtained. Thus X can take only 4 values 0,1,2,3. We note that ( ) ( ) ( ) ( ) . 8 1 3, 8 3 2, 8 3 1, 8 1 0 ======== XPXPXPXP This is called the probability distribution of the rv X. Thus the probability distribution of a rv X is the listing of the probabilities with which X takes all its values. Example 2 Let E be the random experiment of rolling a pair of balanced die. There are 36 possible equally likely outcomes, namely (1,1), (1,2)…… (6,6). Let X be the rv that gives the sum of the two nos on the top. Hence X take 11 values namely 2,3……12. We note that the probability distribution of X is ( ) ( ) ( ) ( ) 36 2 11XP3XP, 36 1 12XP2XP ======== , ( ) ( ) , 36 3 10XP4XP ==== ( ) ( ) 36 4 9XP5XP ==== . ( ) ( ) ( ) . 6 1 36 6 7XP, 36 5 8XP6XP ======= Example 3 Let E be the random experiment of rolling a die till a 6 appears on the top. Let X be the no of rolls needed to get the “first” six. Thus X can take values 1,2,3…… Here X takes an infinite number of values. So it is not possible to list all the probabilities with which X takes its values. But we can give a formula.
  • 32. 27 ( ) ( ).....2,1 6 1 6 5 1 === − xxXP x (Justification: X = x means the first (x-1) rolls gave a number (other than 6) and the xth roll gave the first 6. Hence ( ) ) 6 1 6 5 6 1 6 5 ... 6 5 6 5 1 1 − − =×××== x timesx xXP Discrete Random Variables We say X is a discrete rv of it can take only a finite number of values (as in example 1,2 above) or a “countably” infinite values (as in example 3). On the other hand, the annual rainfall in a city, the lifelength of an electronic device, the diameter of washers produced by a factory are all continuous random variables in the sense they can take (theoretically at least) all values in an ‘interval’ of the x-axis. We shall discuss continuous rvs a little later. Probability distribution of a Discrete RV Let X be a discrete rv with values ......, 21 xx Let ( ) ( )( ).....2,1ixXPxf ii === We say that ( ){ } ....2,1iixf = is the probability distribution of the rv X. Properties of the probability distribution (i) ( ) .....2,1iallfor0xf i =≥ (ii) ( ) 1xf i i = The first condition follows from the fact that the probability is always .0≥ The second condition follows from the fact that the probability of the certain event = 1.
  • 33. 28 Example 4 Determine whether the following can be the probability distribution of a rv which can take only 4 values 1,2,3 and 4. (a) ( ) ( ) ( ) ( ) 26.0426.0326.0226.01 ==== ffff . No as the sum of all the “probabilities” > 1. (b) ( ) ( ) ( ) ( ) 28.0429.03,28.0215.01 ==== ffff . Yes as these are all 0≥ and add up to 1. (c) ( ) 4,3,2,1 16 1 = + = x x xf . No as the sum of all the probabilities < 1. Binomial Distribution Let E be a random experiment having only 2 outcomes, say ‘success’ and ‘failure’. Suppose that P(success) = p and so P(failure) = q (=1-p). Consider n independent repetitions of E (This means the outcome in any one repetition is not dependent upon the outcome in any other repetition). We also make the important assumption that P(success) = p remains the same for all such independent repetitions of E. Let X be the rv that ’counts’ the number of successes obtained in n such independent repetitions of E. Clearly X is a discrete rv that can take n+1 values namely 0,1,2,….n. We note that there are n 2 outcomes each of which is a ‘string’ of n letters each of which is an S or F (if n =3, it will be FFF, SFF, FSF, FFS, SSF, SFS, FSS, SSS). X = x means in any such outcome there are x successes and (n-x) failures in some order. One such will be xnx FFFFSSSS − .... . Since all the repetitions are independent prob of this outcome will be xnx qp − . Exactly the same prob would be associated with any other outcome for which X = x. But x successes can occur out of n repetitions in x n mutually exclusive ways. Hence ( ) ( )....n1,0,xqp x n xXP xnx === −
  • 34. 29 We say X has a Binomial distribution with parameters n (≡ the number of repetitions) and p (Prob of success in any one repetition). We denote ( ) ( )p,n;xbxXP by= to show its dependence on x, n and p. The letter ‘b’ stands for binomial. Since all the above (n+1) probabilities are the (n+1) terms in the expansion of the binomial ( )n pq + , X is said to have a binomial distribution. We at once see that the sum of all the binomial probabilities = ( ) .11 ==+ nn pq The independent repetitions are usually referred to as the “Bernoulli” trials. We note that ( ) ( )q,n;xnbp,n;xb −= (LHS = Prob of getting x successes in n Bernoulli trials = prob of getting n-x failures in n Bernoulli trials = R.H.S.) Cumulative Binomial Probabilities Let X have a binomial distribution with parameters n and p. ( ) ( ) P0XPxXP +==≤ ( ) ( )xXP......1X =+= = ( )p,n;kb x 0k= is denoted by ( )pnxB ,; and is called the cumulative Binomial distribution function. This is tabulated in Table 1 of your text book. We note that ( ) ( ) ( ) ( ) ( ) ( )p,n;1xBp,n;xB 1xXPxXPxXpp,n;xb −−= −≤−≤=== Thus ( )60.00,12;9b = ( ) ( )60.0,12;860.0,12;9 BB − 1419.0 7747.09166.0 = −= (You can verify this by directly calculating ( )).9;12,0.60b
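Instead of Table 1, the binomial and cumulative binomial probabilities can be computed directly. A short sketch using only the standard library, reproducing b(9; 12, 0.60) = 0.1419 from above:

from math import comb

def b(x, n, p):
    return comb(n, x) * p**x * (1 - p)**(n - x)

def B(x, n, p):
    return sum(b(k, n, p) for k in range(x + 1))

print(round(b(9, 12, 0.60), 4))                    # 0.1419
print(round(B(9, 12, 0.60) - B(8, 12, 0.60), 4))   # the same value, via the cumulative form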
  • 35. 30 Example 5 (Exercise 4.15 of your book) During one stage in the manufacture of integrated circuit chips, a coating must be applied. If 70% of the chips receive a thick enough coating find the probability that among 15 chips. (a) At least 12 will have thick enough coatings. (b) At most 6 will have thick enough coatings. (c) Exactly 10 will have thick enough coatings. Solution Among 15 chips, let X be the number of chips that will have thick enough coatings. Hence X is a rv having Binomial distribution with parameters n =15 and p = 0.70. (a) ( ) ( )11XP112XP ≤−=≥ ( ) 3969.07031.01 70.0,15;111 =−= −= B (b) ( ) ( )70.0,15;6B6XP =≤ 0152.0= (c) ( ) ( ) ( )70.0,15;9B70.0,15;10B10XP −== 2065.0 2784.04849.0 = −= Example 6 (Exercise 4.19 of your text book) A food processor claims that at most 10% of her jars of instant coffee contain less coffee than printed on the label. To test this claim, 16 jars are randomly selected and contents weighed. Her claim is accepted if fewer than 3 of the 16 jars contain less coffee (note that 10% of 16 = 1.6 and rounds to 2). Find the probability that the food processor’s claim will be accepted if the actual percent of the jars containing less coffee is (a) 5% (b) 10% (c) 15% (d) 20% Solution: Let X be the number of jars that contain less coffee (than printed on the label) (among the 16 jars randomly chosen. Thus X is a random variable having a Binomial distribution
  • 36. 31 with parameters n = 16 and p (the prob of “success” = the prob that a jar chosen at random will have less coffee). (a) Here p = 5% = 0.05. Hence P(claim is accepted) = P(X ≤ 2) = B(2; 16, 0.05) = 0.9571. (b) Here p = 10% = 0.10. Hence P(claim is accepted) = B(2; 16, 0.10) = 0.7892. (c) Here p = 15% = 0.15. Hence P(claim is accepted) = B(2; 16, 0.15) = 0.5614. (d) Here p = 20% = 0.20. Hence P(claim is accepted) = B(2; 16, 0.20) = 0.3518. Binomial Distribution – Sampling with replacement Suppose there is an urn containing 10 marbles of which 4 are white and the rest are black. Suppose 5 marbles are chosen with replacement. Let X be the rv that counts the no of white marbles drawn. Thus X = 0, 1, 2, 3, 4 or 5. (Remember that we replace each marble in the urn before drawing the next one; hence we can draw 5 white marbles.) P(“Success”) = P(drawing a white marble in any one of the 5 draws) = 4/10 (remember we draw with replacement). Thus X has a Binomial distribution with parameters n = 5 and p = 4/10. Hence P(X = x) = b(x; 5, 4/10). Mode of a Binomial distribution We say x0 is the mode of the Binomial distribution with parameters n and p if P(X = x0) is the greatest. From the binomial tables given in the book we can easily see that
  • 37. 32 when n = 10, p = 1/2, P(X = 5) is the greatest, i.e. 5 is the mode. Fact b(x + 1; n, p) / b(x; n, p) = [(n − x)/(x + 1)] × [p/(1 − p)], which is > 1 if x < np − (1 − p), = 1 if x = np − (1 − p), and < 1 if x > np − (1 − p). Thus so long as x < np − (1 − p) the binomial probabilities increase, and if x > np − (1 − p) they decrease. Hence if np − (1 − p) = x0 is an integer, then the modes are x0 and x0 + 1. If np − (1 − p) is not an integer and x0 = the smallest integer ≥ np − (1 − p), the mode is x0. Hypergeometric Distribution (Sampling without replacement) An urn contains 10 marbles of which 4 are white. 5 marbles are chosen at random without replacement. Let X be the rv that counts the number of white marbles drawn. Thus X can take 5 values, namely 0, 1, 2, 3, 4. What is P(X = x)? Now out of 10 marbles, 5 can be chosen in 10C5 equally likely ways, out of which there will be 4Cx × 6C(5 − x) ways of drawing x white marbles (and so 5 − x red marbles). (Reason: out of 4 white marbles, x can be chosen in 4Cx ways and out of 6 red marbles, 5 − x can be chosen in 6C(5 − x) ways.) Hence P(X = x) = 4Cx × 6C(5 − x) / 10C5, x = 0, 1, 2, 3, 4. We generalize the above result. A box contains N marbles out of which a are white. n marbles are chosen without replacement. Let X be the random variable that counts the number of white marbles drawn. X can take the values 0, 1, 2, …, n.
  • 38. 33 P(X = x) = aCx × (N − a)C(n − x) / NCn, x = 0, 1, 2, …, n. (Note x must be less than or equal to a and n − x must be less than or equal to N − a.) We say the rv X has a hypergeometric distribution with parameters n, a and N. We denote P(X = x) by h(x; n, a, N). Example 7 (Exercise 4.22 of your text book) Among the 12 solar collectors on display, 9 are flat plate collectors and the other three are concentrating collectors. If a person chooses at random 4 collectors, find the prob that 3 are flat plate ones. Ans h(3; 4, 9, 12) = 9C3 × 3C1 / 12C4 Example 8 (Exercise 4.24 of your text book) If 6 of 18 new buildings in a city violate the building code, what is the probability that a building inspector, who randomly selects 4 of the new buildings for inspection, will catch (a) None of the new buildings that violate the building code Ans h(0; 4, 6, 18) = 12C4 / 18C4 (b) One of the new buildings that violate the building code
  • 39. 34 Ans ( )= 4 18 3 12 1 6 18,6,4;1h (c) Two of the new buildings that violate the building code Ans ( )= 4 18 2 12 2 6 18,6,4;2h (d) At least three of the new buildings that violate the building code Ans ( ) ( )18,6,4;418,6,4;3 hh + (Note: We choose 4 buildings out of 18 without replacement. Hence hypergeometric distribution is appropriate) Binomial distribution as an approximation to the Hypergeometric Distribution We can show that ( ) ( ) ∞→→ NaspnxbNanxh ,;,,; (Where )"" successaofprob N a p == . Hence if N is large the hypergeometric probability ( )N,a,n;xh can be approximated by the binomial probability ( )p,n;xb where . N a p = Example 9 (exercise 4.26 of your text) A shipment of 120 burglar alarms contains 5 that are defective. If 3 of these alarms are randomly selected and shipped to a customer, find the probability that the customer will get one defective alarm. (a) By using the hypergemetric distribution (b) By approximating the hypergeometric probability by a binomial probability.
  • 40. 35 Solution Here N = 120 (Large!) a = 5 n = 3 x =1 (a) Reqd prob = ( )120,5,3;1h 1167.0 280840 65555 3 120 2 115 1 5 = × == (b) ( )≈ 120 5 ,3;1120,5,3;1 bh 1148.0 120 5 1 120 5 1 3 2 =−= Example 10 (Exercise 4.27 of your text) Among the 300 employees of a company, 240 are union members, while the others are not. If 8 of the employees are chosen by lot to serve on the committee which administrates the provident fund, find the prob that 5 of them will be union members while the others are not. (a) Using hypergemoretric distribution (b) Using binomial approximation Solution Here N = 300, a = 240, n = 8 x = 5 (a) ( )300,240,8;5h (b) ≈ 300 240 8,5;b
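The closeness of the binomial approximation in Example 9 is easy to see numerically. A short sketch computing both the exact hypergeometric probability and its binomial approximation with p = a/N:

from math import comb

def h(x, n, a, N):
    return comb(a, x) * comb(N - a, n - x) / comb(N, n)

def b(x, n, p):
    return comb(n, x) * p**x * (1 - p)**(n - x)

print(round(h(1, 3, 5, 120), 4))       # 0.1167, the exact hypergeometric value
print(round(b(1, 3, 5 / 120), 4))      # 0.1148, the binomial approximation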
  • 41. 36 THE MEAN AND VARIANCE OF PROBABILITY DISTRIBUTIONS We know that the equation of a line can be written as .cmxy += Here m is the slope and c is the y intercept. Different m,c give different lines. Thus m and c characterize a line. Similarly we define certain numbers that characterize a probability distribution. The mean of a probability distribution is simply the mathematical expectation of the corresponding r.v. If a rv X takes on the values .....xx 2,1 with probabilities ( ) ( )....,xf,xf 21 its mathematical expectation or expected value is ( ) ( ) ( ) obabilityvaluexxPxxxfxxfx ii i Pr......2211 ×===++ We use the symbol µ to denote the mean of X. Thus ( ) ( )ii xxPxXE ===µ (Summation over all xi in the Range of X) Example 11 Suppose X is a rv having the probability distribution X 1 2 3 Prob 2 1 3 1 6 1 Hence the mean µ of the prob distribution (of X) is 3 5 6 1 3 3 1 2 2 1 1 =×+×+×=µ Example 12 Let X be the rv having the distribution X 0 1 Prob q p
  • 42. 37 where .10Thus.1 ppqpq =×+×=−= µ The mean of a Binomial Distribution Suppose X is a rv having Binomial distribution with parameters n and p. Then Mean of .npX =µ= (Read the proof on pages 107-108 of your text book) The mean of a hypergeometric Distribution If X is a rv having hypergeometric distribution with parameters .,,, N a nthenanN =µ Digression The mean of a rv x give the “average” of the values taken by the rv. X. Thus the average marks in a test is 40 means the students would have got marks less than 40 and greater than 40 but it averages out to be 40. But we do not get an idea about the spread (≡deviation from the mean) of the marks. This spread is measured by the variance. Informally speaking by the average of the squares of deviation from the mean. Variance of a Probability Distribution of X is defined as the expected value of ( )2 X µ− Variance of 2 X σ= ( ) ( ) Xi i 2 i Rx xXPx ∈ =−= Note that R.H.S is always 0≥ (as it is the sum of non-ve numbers) The positive square root 2 of σσ is called the standard deviation of X and has the same units as X and .µ
  • 43. 38 Example 13 For the rv X having the prob distribution given in example 11, the variance is 9 5 6 1 9 16 3 1 9 1 2 1 9 4 6 1 3 5 3 3 1 3 5 2 2 1 3 5 1 222 =×+×+= ×−+×−+×− x We could have also used the equivalent formula ( ) ( ) ( ) . 9 5 9 25 3 10 3 10 18 60 6 9 3 4 2 1 6 1 3 3 1 2 2 1 1XEHere XEXE 2 2222 2222 =−=σ∴ ==++=×+×+×= µ−=µ−=σ Example 14 For the probability distribution of example 12, ( ) ( ) pqp1ppp pp1qoXE 22 222 =−=−=σ∴ =×+×= Variance of the Binomial Distribution npq=2 σ Variance of the Hypergeometric Distribution . 1 .12 − − −= N nN N a N a nσ
  • 44. 39 CHEBYCHEV’S THEOREM Suppose X is a rv with mean µ and variance 2 σ . Chebychev’s theorem states that: If k is a constant > 0, ( ) 2 k 1 k|X|P ≤σ≥µ− In words the prob of getting a value which deviates from its mean µ by at least σk is at most 2 1 k . Note: Chebyshev’s Theorem gives us an upper bound of the prob of an event. Mostly it is of theoretical interest. Example 15 (Exercise 4.44 of your text) In one out of 6 cases, material for bullet proof vests fails to meet puncture standards. If 405 specimens are tested, what does Chebyshev theorem tell us about the prob of getting at most 30 or at least 105 cases that do not meet puncture standards? Here 2 135 6 1 405 =×==npµ 2 15 6 5 6 1 4052 =∴ ××== σ σ qpn Let X = no of cases out of 405 that do not meet puncture standards Reqd ( )105Xor30XP ≥≤ Now 2 75 X30X −≤µ−≤ 2 75 X105X ≥µ−≥ Thus σ=≥µ−≥≤ 5 2 75 |X|105Xor30X
  • 45. 40 ∴ P(X ≤ 30 or X ≥ 105) = P(|X − µ| ≥ 5σ) ≤ 1/5² = 1/25 = 0.04 Example 16 (Exercise 4.46 of your text) How many times do we have to flip a balanced coin to be able to assert with a probability of at most 0.01 that the difference between the proportion of tails and 0.50 will be at least 0.04? Solution: Suppose we flip the coin n times and suppose X is the no of tails obtained. Thus the proportion of tails = X/n = (no. of tails)/(total no. of flips). We must find n so that P(|X/n − 0.50| ≥ 0.04) ≤ 0.01. Now X = no of tails among n flips of a balanced coin is a rv having Binomial distribution with parameters n and 0.5. Hence µ = E(X) = np = n × 0.50 and σ = √(npq) = 0.50 √n (as p = q = 0.50). Now |X/n − 0.50| ≥ 0.04 is equivalent to |X − 0.50 n| ≥ 0.04 n. We know P(|X − µ| ≥ kσ) ≤ 1/k². Here kσ = 0.04 n, ∴ k = 0.04 n / (0.50 √n) = 0.08 √n.
  • 46. 41 ∴ P(|X/n − 0.50| ≥ 0.04) = P(|X − µ| ≥ kσ) ≤ 1/k² ≤ 0.01 if k² ≥ 1/0.01 = 100, i.e. if 0.08 √n ≥ 10, or n ≥ (10/0.08)² = 15625. Law of Large Numbers Suppose a factory manufactures items. Suppose there is a constant prob p that an item is defective. Suppose we choose n items at random and let X be the no of defectives found. Then X is a rv having binomial distribution with parameters n and p. ∴ mean µ = E(X) = np, variance σ² = npq. Let ε be any number > 0. Now P(|X/n − p| ≥ ε) = P(|X − np| ≥ nε) = P(|X − µ| ≥ kσ) (where kσ = nε) ≤ 1/k² (by Chebyshev’s theorem) = σ²/(n²ε²) = npq/(n²ε²) = pq/(nε²) → 0 as n → ∞. Thus we can say that the prob that the proportion of defective items differs from the actual prob p by any +ve number ε → 0 as n → ∞. (This is called the Law of Large Numbers.) This means “most of the time” the proportion of defectives will be close to the actual (unknown) prob p that an item is defective, for large n. So we can estimate p by X/n, the (sample) proportion of defectives.
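The law of large numbers can be seen in a few lines of simulation: as n grows, the sample proportion X/n settles near the true p. In the sketch below the defect probability p = 0.1 is an arbitrary illustrative value.

import random

p = 0.1
for n in (100, 1_000, 10_000, 100_000):
    x = sum(random.random() < p for _ in range(n))   # number of defectives among n items
    print(n, x / n)                                  # proportions approach 0.1 as n grows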
  • 47. 42 POISSON DISTRIBUTION A random variable X is said to have a Poisson distribution with parameter 0>λ if its probability distribution is given by ( ) ( ) ......2,1,0 ! ; ==== − x x exfxXP x λ λ λ We can easily show: mean of λ=µ=X and variance of .X 2 λ=σ= Also ( )xXP = is largest when λλ−λ= ifand1x is an integer and when [ ]λ=x = the greatest integer λ≤ (when λ is not an integer). Also note that ( ) .0 ∞→→= xasxXP POISSON APPROXIMATION TO BINOMIAL DISTRIBUTION Suppose X is a rv having Binomial distribution with parameters n and p. We can easily show ( ) ( ) ( ) ∞→→== nasx;fxXPpn,x;b in such a way that np remains a constant .λ Hence for n large, p small, the binomial prob ( )pnxb ,; can be approximated by the Poisson prob ( )λ;xf where .np=λ Example 17 ( )03.0,100;3b ( ) !3 3 3;3 33− =≈ e f Example 18 (Exercise 4.54 of your text) If 0.8% of the fuses delivered to an arsenal are defective, use the Poisson approximation to determine the probability that 4 fuses will be defective in a random sample of 400. Solution If X is the number of defectives in a sample of 400, X has the binomial distribution with parameters n = 400 and p = 0.8% = 0.008.
  • 48. 43 Thus P (4 out of 400 are defective) ( ) ( ) ( ) ( ) 603.0781.0 !4 2.3 e 2.3008.0400Where;4f008.0,400;4b 4 2.3 −= = =×=λλ≈= − (from table 2 at the end of the text) = 0.178 Cumulative Poisson Distribution Function If X is a rv having Poisson Distribution with parameter ,λ the cumulative Poisson Prob ( ) ( ) ( ) ( )λ===≤=λ= == ;kfkXPxXP;xF x 0k x 0k For various ( )λλ ;xF,xand has been tabulated in table 2 (of your text book on page 581 to 585) .We use the table 2 as follows. ( ) ( ) ( ) ( ) ( ) ( )λ−−λ= −≤−≤===λ ;1xF;xF 1xXPxXPxXP;xf Thus ( ) ( ) ( ) .178.0603.0781.02.3;32.3;42.3;4 =−=−= FFf Poisson Process There are many situations in which events occur randomly in regular intervals of time. For example in a time period t, let tX be the number of accidents at a busy road junction in New Delhi; tX be the number of calls received at a telephone exchange; tX be the number of radio active particles emitted by a radioactive source etc. In all such examples we find tX is a discrete rv which can take non-ve integral values 0,1,2,….. The important thing to note is that all such random variables have “same” distribution except that the parameter(s) depend on time t. The collection of random variables ( )tX t > 0 is said to constitute a random process. If each ( )tX has a Poisson Distribution, we say ( )tX is a Poisson process. Now we show the rvs ( )tX which counts the number of occurrences of a random phenomena in a time
  • 49. 44 period t constitute a Poisson process under suitable assumptions. Suppose in a time period t, a random phenomenon which we call “success” occurs. We let Xt = number of successes in time period t. We assume : 1. In a small time period ,t∆ either no success or one success occurs. 2. The prob of a success in a small time period t∆ is proportional to t∆ i.e. say ( ) tXP t ∆==∆ α1 . ( →α constant of proportionality) 3. The prob of a success during any time period does not depend on what happened prior to that period. Divide the time period t into n small time periods each of length t∆ . Hence by assumptions above, we note that Xt = no of successes in time period t is a rv having Binomial distribution with parameters n and tp ∆= α . Hence ( ) ( )t,n;xbxXP t ∆α== ( ) tn.where nasx;f = ∞→→ So we can say that Xt = no of successes in time period t is a rv having Poisson distribution with parameter .tα Meaning of the proportaratility constant α Since mean of tisXt α=λ , We find α = mean no of successes in unit time. (Note: For a more rigorous derivation of the distribution of Xt, you may see Meyer, Introductory probability and statistical applications, pages 165-169). Example 19 (Exercise 4.56 of your text) Given that the switch board of a consultant’s office receives on the average 0.6 call per minute, find the probability that (a) In a given minute there will be at least one call. (b) In a 4-minute interval, there will be at least 3 calls.
  • 50. 45 Solution Xt= no of calls in a t-minute interval is a rv having Poisson distribution with parameter tt 6.0=α (a) ( ) ( ) .451.0549.01e10XP11XP 6.0 11 =−=−==−=≥ − (b) ( ) ( ) ( ) 430.0570.014.2;2F12XP13XP 44 =−=−=≤−=≥ Example 20 Suppose that Xt, the number of particles emitted in t hours from a radio – active source has a Poisson distribution with parameter 20t. What is the probability that exactly 5 particles are emitted during a 15 minute period? Solution 15 minutes = hour 4 1 Hence 4 1Xif = no of particles emitted in hour 4 1 ( ) )2tablefrom(176.0440.0616.0 !5 5 e !5 20 4 1 e5XP 5 5 5 204 1 4 1 =−= = × == −×−
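The table look-ups in Example 19 can be replaced by direct computation of the Poisson probabilities. A minimal sketch:

from math import exp, factorial

def f(x, lam):
    return exp(-lam) * lam**x / factorial(x)        # Poisson probability f(x; lambda)

def F(x, lam):
    return sum(f(k, lam) for k in range(x + 1))     # cumulative Poisson probability

print(round(1 - f(0, 0.6), 3))    # (a) P(at least one call in a minute), about 0.451
print(round(1 - F(2, 2.4), 3))    # (b) P(at least 3 calls in 4 minutes), about 0.430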
  • 51. 46 THE GEOMETRIC DISTRIBUTION Suppose there is a random experiment having only two possible outcomes, called ‘success’ and ‘failure’. Assume that the prob of a success in any one ‘trial’ (≡ repetition of the experiment) is p and remains the same for all trials. Also assume the trials are independent. The experiment is repeated till a success is got. Let X be the rv that counts the number of trials needed to get the 1st success. Clearly X = x if the first (x − 1) trials were failures and the xth trial gave the first success. Hence P(X = x) = g(x; p) = p(1 − p)^(x−1) = p q^(x−1) (x = 1, 2, …). We say X has a geometric distribution with parameter p (as the respective probabilities form a geometric progression with common ratio q). We can show the mean of this distribution is µ = 1/p and the variance is σ² = q/p². (For example, suppose a die is rolled till a 6 is got. It is reasonable to expect that on an average we will need 1/(1/6) = 6 rolls, as there are 6 numbers!) Example 21 (Exercise 4.60 of your text) An expert hits a target 95% of the time. What is the probability that the expert will miss the target for the first time on the fifteenth shot? Solution Here ‘success’ means the expert misses the target. Hence p = P(success) = 5% = 0.05. If X is the rv that counts the no. of shots needed to get ‘a success’, we want P(X = 15) = q^14 × p = (0.95)^14 × 0.05.
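A quick numerical check of Example 21 and of the mean 1/p, using the geometric probabilities g(x; p) = p q^(x−1) directly (the truncation of the sum at 2000 terms is our own choice; the omitted tail is negligible for p = 0.05):

p = 0.05                                  # 'success' = the expert misses the target
g = lambda x: p * (1 - p) ** (x - 1)

print(round(g(15), 4))                                       # about 0.0244
print(round(sum(x * g(x) for x in range(1, 2000)), 3))       # about 20 = 1/p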
  • 52. 47 Example 22 The probability of a successful rocket launching is 0.8. Launching attempts are made till a successful launching has occurred. Find the probability that exactly 6 attempts will be necessary. Solution ( ) 8.02.0 5 × Example 23 X has a geometric distribution with parameter p. show (a) ( ) ,.........2,11 ==≥ − rqrXP r (b) ( ) ( )tXPsxtsxP ≥=>+≥ | Solution (a) ( ) .q q1 pq .pqrXP 1r 1r rx 1x − −∞ = − = − ==≥ (b) ( ) ( ) ( ) ( ).1 1 tXPq q q sXP tsXP sXtsXP t s ts ≥=== > +≥ =>+≥ − −+ Application to Queuing Systems Service facility Customers arrive in a Depart after service Poisson Fashion There is a service facility. Customers arrive in a random fashion and get service if the server is idle. Else they stand in a Queue and wait to get service. Examples of Queuing systems 1. Cars arriving at a petrol pump to get petrol 2. Men arriving at a Barber’s shop to get hair cut. 3. Ships arriving at a port to deliver goods. S
  • 53. 48 Questions that one can ask are : 1. At any point of time on an average how many customers are in the system (getting service and waiting to get service)? 2. What is the mean time a customer waits in the system? 3. What proportion of time a server is idle? And so on. We shall consider only the simplest queueing system where there is only one server. We assume that the population of customers is infinite and that there is no limit on the number of customers that can wait in the queue. We also assume that the customers arrive in a ‘Poission fashion’ at the mean rate of α . This means that tX the number of customers that arrive in a time period t is a rv having Poisson distribution with parameter tα . We also assume that so long as the service station is not empty, customers depart in a Poisson fashion at a mean rate of β . This means, when there is at least one customer, tY , the number of customers that depart (after getting service) in a time period t is a r.v. having Poisson distribution with parameter tβ (where αβ > ). Further assumptions are : In a small time interval ,t∆ there will be a single arrival or a single departure but not both. (Note that by assumptions of Poisson process in a small time interval ,t∆ there can be at most one arrival and at most one departure). Let at time t, tN be the number of customers in the system. Let ( ) ( ).tpnNP nt == We make another assumption: ( ) nnn .tastp π∞→π→ is known as the steady state probability distribution of the number of customers in the system. It can be shown: ( )...,2,1,01 1 =−= −= n n n o β α β α π β α π Thus L = Mean number of customers in the system getting service and waiting to get service)
  • 54. 49 αβ α π − == ∞ = n n n. 0 qL = Mean no of customers in the queue (waiting to get service) ( ) ( ) β α αββ α π −= − =−= ∞ = Ln n n 2 1 1 W = mean time a customer spends in the system ααβ L = − = 1 qW = Mean time a customer spends in the queue. ( ) . 1 βααββ α −== − = W Lq (For a derivation of these results, see Operations Research Vol. 3 by Dr. S. Venkateswaran and Dr. B Singh, EDD Notes of BITS, Pilani). Example 24 (Exercise 4.64 of your text) Trucks arrive at a receiving dock in a Poisson fashion at a mean rate of 2 per hour. The trucks can be unloaded at a mean rate of 3 per hour in a Poisson fashion (so long as the receiving dock is not empty). (a) What is the average number of trucks being unloaded and waiting to get unloaded? (b) What is the mean no of trucks in the queue? (c) What is the mean time a truck spends waiting in the queue? (d) What is the prob that there are no trucks waiting to be unloaded? (e) What is the prob that an arriving truck need not wait to get unloaded?
  • 55. 50 Solution Here α = arrival rate = 2 per hour β = departure rate = 3 per hour. Thus (a) 2 23 2 = − = − = αβ α L (b) ( ) ( ) 3 4 13 222 == − = αββ α qL (c) ( ) hrWq 3 2 = − = αββ α (d) P (no trucks are waiting to be unloaded) = (No of trucks in the dock is 0 or 1) 3 2 3 2 1 3 2 11110 −+−=−+−=+= β α β α β α ππ 9 5 9 2 3 1 =+= (e) P (arriving truck need not wait) = P (dock is empty) = 3 1 0 =π Example 25 With reference to example 24, suppose that the cost of keeping a truck in the system is Rs. 15/hour. If it were possible to increase the mean loading rate to 3.5 trucks per hour at a cost of Rs. 12 per hour, would this be worth while?
  • 56. 51 Solution In the old scheme, α = 2, β = 3, L = 2. ∴ Mean cost per hour to the dock = 2 × 15 = Rs. 30/hr. In the new scheme α = 2, β = 3.5, L = 4/3 (verify!). ∴ Net cost per hour to the dock = 15 × 4/3 + 12 = Rs. 32/hr. Hence it is not worthwhile to go in for the new scheme.
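The four single-server queue formulas above are conveniently collected in one helper, checked here against Examples 24 and 25 (a sketch; the function name is ours, and it assumes α < β as the text requires).

def mm1_measures(alpha, beta):
    L = alpha / (beta - alpha)                     # mean number in the system
    Lq = alpha ** 2 / (beta * (beta - alpha))      # mean number waiting in the queue
    W = 1 / (beta - alpha)                         # mean time spent in the system
    Wq = alpha / (beta * (beta - alpha))           # mean time spent waiting in the queue
    return L, Lq, W, Wq

print(mm1_measures(2, 3))      # (2.0, 1.333..., 1.0, 0.666...) as in Example 24
print(mm1_measures(2, 3.5))    # L = 1.333..., the value used in Example 25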
  • 57. 52 MULTINOMIAL DISTRIBUTION Consider a random experiment E and suppose it has k possible outcomes .,...., 21 kAAA Suppose ( ) ii pAP = for all i and that pi remains the same for all independent repetitions of E. Consider n independent repetitions of E. Suppose A1 occurs X1 times, A2 occurs X2 times, …, Ak occurs Xk times. Then ( )kk xXxXxXP === ,...., 2211 = kx k xx k ppp xxx n ..... !!......! ! 21 21 21 for all non-ve integers xxxxwithxxx kk =+++ .....,, 2121 Proof. The probability of getting 11 xA times, 22 xA times, kk xA times in any one way is kx k xx ppp ......21 21 as all the repetitions are independent. Now among the n repetitions 1A occurs 1x times in ( )!! ! 111 xnx n x n − = ways. From the remaining 1xn − repetitions 2A can occur 2x times in ( ) ( )!! ! 212 1 2 1 xxnx xn x xn −− − = − ways and so on. Hence the total number of ways of getting 11 xA times, 22 xA times, …. kk xA times will be ( ) ( ) ( ) ( ) ( )!....! !..... ... !! ! !! ! 121 121 212 1 11 kkk k xxxxnx xxxn xxnx xn xnx n −−− −− × −− − × − − − 1!0..... !!......! ! 21 21 ==++= andnxxxas xxx n k k Hence ( ) kx k xx k kk ppp xxx n xXxXxXP .... !!....! ! ,....., 21 21 21 2211 ====
  • 58. 53 Example 26 A die is rolled 30 times. Find the probability of getting 1 2 times, 2 3 times, 3 4 times, 4 6 times, 5 7 times and 6 8 times. Ans 876432 6 1 6 1 6 1 6 1 6 1 6 1 !8!7!6!4!3!2 !30 × Example 27 (See exercise 4.72 of your text) The probabilities are, respectively, 0.40, 0.40, and 0.20 that in city driving a certain type of imported car will average less than 10 kms per litre, anywhere between 10 and 15 kms per litre, or more than 15 kms per litre. Find the probability that among 12 such cars tested, 4 will average less than 10 kms per litre, 6 will average anywhere from 10 to 15 kms per litre and 2 will average more than 15 kms per litre. Solution ( ) ( ) ( ) .20.40.40. !2!6!4 !12 264 Remark 1. Note that the different probabilities are the various terms in the expansion of the multinomial ( )n kppp ......21 ++ . Hence the name multinomial distribution. 2. The binomial distribution is a special case got by taking k =2. 3. For any fixed ( ) iXkii ≤≤1 (the number of ways of getting iA ) is a random variable having binomial distribution with parameters n and pi. Thus ( ) ii pnXE = and ( ) ( ) ...k1,2.......i.p1npXV iii =−=
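The multinomial probability is a factorial coefficient times a product of powers, so it is straightforward to compute. A minimal sketch, applied to Example 27 (the function name is ours):

from math import factorial

def multinomial(counts, probs):
    n = sum(counts)
    coeff = factorial(n)
    for c in counts:
        coeff //= factorial(c)                 # n! / (x1! x2! ... xk!)
    prob = coeff
    for c, p in zip(counts, probs):
        prob *= p ** c
    return prob

print(round(multinomial([4, 6, 2], [0.40, 0.40, 0.20]), 4))   # about 0.0581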
  • 59. 54 SIMULATION Nowadays simulation techniques are being applied to many problems in Science and Engineering. If the processes being simulated involve an element of chance, these techniques are referred to as Monte Carlo methods. For example to study the distribution of number of calls arriving at a telephone exchange, we can use simulation techniques. Random Numbers : In simulation problems one uses the tables of random numbers to “generate” random deviates (values assumed by a random variable). Table of random numbers consists of many pages on which the digits 0,1,2….. 9 are distributed in such a was that the probability of any one digit appearing is the same, namely 10 1 1.0 = . Use of random numbers to generate ‘heads’ and ‘tails’. For example choose the 4th column of the four page of table 7, start at the top and go down the page. Thus we get 6,2,7,5,5,0,1,8,6,3….. Now we can interpret this as H,H,T, T,T, H, T, H, H,T, because the prob of getting an odd no. = the propagating an even number = 0.5 Thus we associate head to the occurrence of an even number and tail to that of an odd no. We can also associate a head if we get 5,6,7,8, or 9 and tail otherwise. The use can say we got H,T,H,H,H,T,T,H,H,T….. In problems on simulation we shall adopt the second scheme as it is easy to use and is easily ‘extendable’ for more than two outcomes. Suppose for example, we have an experiment having 4 outcomes with prob. 0.1, 0.2, 0.3 and 0.4 respectively. Thus to simulate the above experiment, we have to allot one of the 10 digits 0,1….9 to the first outcome, two of them to the second outcome, three of them to the third outcome and the remaining four to the fourth outcome. Though this can be done in a variety of ways, we choose the simplest way as follows: Associate the first digit 0 to the 1st outcome 10 Associate the next 2 digits 1,2 to the 2nd outcome 20 Associate the next 3 digits 3,4,5 to the 3rd outcome 30 . And associate the last 4 digits 6,7,8,9 to the 4th outcome 40 . Hence the above sequence 6,2,7,5,5,0,1,8,6,3… of random numbers would correspond to the sequence of outcomes ..............,,,,,,,,, 3442133424 OOOOOOOOOO Using two and higher – digit Random numbers in Simulation
  • 60. 55 Suppose we have a random experiment with three outcomes with probabilities 0.80, 0.15 and 0.05 respective. How can we now use the table of random numbers to simulate this experiment? We now read 2 numbers at a time : say (starting from page 593 room 12, column 4) 84,71,14,24,20,31,78, 03………….. Since P (anyone digit) = 10 1 , P (any two digits) = 01.0 10 1 10 1 =× . Thus each 2 digit random number occurs with prob 0.01. Now that there will be 100 2 digit random numbers : 00, 01, …, 10, 11, …, 20, 21, …, 98, 99. Thus we associate the first 80 numbers 00,01…79 to the first out come, the next 15 numbers (80, 81, …94) to the second outcome and the last 5 numbers (95, 96, …, 99) to the 3rd outcome. Thus the above sequence of 2 digit random numbers would simulate the outcomes: .......,,,,,,, 11111112 OOOOOOOO We describe the above scheme in a diagram as follows: Outcome Probability Cumulative Probability* Random Numbers** O1 0.80 0.80 00-79 O2 0.15 0.95 80-94 O3 0.05 1.00 95-99 * Cumulative prob is got by adding all the probabilities at that position and above thus cumulative prob at O2 = Prob of O1 + Prob O2 = 0.80 + 0.15 = 0.95. ** You observe the beginning random number is 00 for the 1st outcome; and for the remaining outcomes, it is one more than the ending random numbers of the immediately preceding outcome. Also the ending random number for each outcome is “one less than the cumulative probability”. Similarly three digit random numbers are used if the prob of an outcome has 3 decimal places. Read the example on page 133 of your text book.
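The scheme just described is simply a table look-up against the cumulative probabilities. The sketch below reproduces the read-off of the two-digit random numbers 84, 71, 14, … quoted above (the function name is ours):

def outcome(r, cum_probs):
    # r is a two-digit random number 00-99; cum_probs are cumulative probabilities
    u = r / 100                              # 00-79 -> O1, 80-94 -> O2, 95-99 -> O3
    for i, c in enumerate(cum_probs, start=1):
        if u < c:
            return f"O{i}"
    return f"O{len(cum_probs)}"

cum = [0.80, 0.95, 1.00]
for r in [84, 71, 14, 24, 20, 31, 78, 3]:
    print(r, outcome(r, cum))                # O2, then O1 seven times, as in the text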
  • 61. 56 Exercise 4.97 on page 136 The allocation of four-digit random numbers to the number of polluting species is:
No. of polluting species | Probability | Cumulative Probability | Random Numbers
0 | 0.2466 | 0.2466 | 0000-2465
1 | 0.3452 | 0.5918 | 2466-5917
2 | 0.2417 | 0.8335 | 5918-8334
3 | 0.1128 | 0.9463 | 8335-9462
4 | 0.0395 | 0.9858 | 9463-9857
5 | 0.0111 | 0.9969 | 9858-9968
6 | 0.0026 | 0.9995 | 9969-9994
7 | 0.0005 | 1.0000 | 9995-9999
Starting with page 592, Row 14, Column 7, we read off the 4-digit random numbers and the corresponding simulated numbers of polluting species:
R. No. | Polluting species | R. No. | Polluting species
5095 | 1 | 2631 | 1
0150 | 0 | 3033 | 1
8043 | 2 | 9167 | 3
9079 | 3 | 4998 | 1
6440 | 2 | 7036 | 2
CONTINUOUS RANDOM VARIABLES In many situations, we come across random variables that take all values lying in a certain interval of the x axis. Example (1) The life length X of a bulb is a continuous random variable that can take all non-negative real values. (2) The time between two consecutive arrivals in a queuing system is a random variable that can take all non-negative real values.
  • 62. 57 (3) The distance R of the point (where a dart hits) (from the centre) is a continuous random variable that can take all values in the interval (0,a) where a is the radius of the board. It is clear that in all such cases, the probability that the random variable takes any one particular value is meaningless. For example, when you buy a bulb, you ask the question? What are the chances that it will work for at least 500 hours? Probability Density function (pdf) If X is a continuous random variable, the questions about the probability that X takes values in an interval (a,b) are answered by defining a probability density function. Def Let X be a continuous rv. A real function f(x) is called the prob density function of X if (1) ( ) xallforxf 0≥ (2) ( ) 1= ∞ ∞− dxxf (3) ( ) ( ) .dxxfbXaP b a =≤≤ Condition (1) is needed as probability is always .0≥ Condition (2) says that the probability of the certain event is 1. Condition (3) says to get the prob that X takes a value between a and b, integrate the function f(x) between a and b. (This is similar to finding the mass of a rod by integrating its density function). Remarks 1. ( ) ( ) ( ) 0==≤≤== dxxfaXaPaXP a a 2. Hence ( ) ( ) ( ) ( )bXaPbXaPbXaPbXaP <<=<≤=≤<=≤≤ Please note that unlike discrete case, it is immaterial whether we include or exclude one or both the end points. 3. ( ) ( ) xxfxxXxP ∆≈∆+≤≤
  • 63. 58 This is proved using Mean value theorem. Definition (Cumulative Distribution function) If X is a continuous rv and if f(x) is its density, ( ) ( ) ( )dttfxXPxXP x ∞− =≤<∞−=≤ We denote the above by F(x) and call it the cumulative distribution function (cdf) of X. Properties of cdf 1. ( ) .10 xallforxF ≤≤ 2. ( ) ( )2121 xFxFxx ≤< i.e., F(x) is a non-decreasing function of x. 3. ( ) ( ) ( ) ( ) .1;0 limlim ==∞+==∞− ∞→−∞→ xFfxfF xx 4. ( ) ( ) ( )xfdttf dx d xF dx d x == ∞− (Thus we can get density function f(x) by differentiating the distribution function F(x)). Example 1 (Exercise 5.2 of your book) If the prob density of a rv is given by ( ) 102 <<= xkxxf (and 0 elsewhere) find the value of k and the probability that the rv takes on a value (a) Between 4 3 4 1 and (b) Greater than 3 2 Find the distribution function F(x) and hence answer the above questions.
  • 64. 59 Solution ( ) 1= ∞ ∞− dxxf gives ( ) ( )( ) .31 3 1 1.. 1001 2 1 0 1 0 === ><== korkordxkxei orxifxfasdxxf Thus ( ) .0103 2 otherwiseandxxxf ≤≤= 32 13 64 26 4 1 4 3 3 4 3 4 1 33 24 3 4 1 ==−==<< dxxXP 27 19 3 2 1 31 3 2 3 2 3 3 2 1 3 2 =−= =<<=> dxxXPXP Distribution function ( ) ( )dttfxF x ∞− = Case (i) 0≤x . In this case ( ) 0=tf between ( ) 0=∴∞− xFxand Case (ii) 0<x<1. In this case ( ) 2 3ttf = between 0 and x and 0 for t<0. ( ) ( ) .3 32 0 xdttdttfxF xx ===∴ ∞− Case (iii) x > 1 Now ( ) 10 >= tfortf
  • 65. 60 ( ) ( ) ( ) )(1 1 iicasebydttfdttfxF x ===∴ ∞−∞− Hence we can say the distribution function ( ) > ≤< ≤ = 01 10 00 3 x xx x xF Now ≤−<=<< 4 1 4 3 4 3 4 1 XPXPXP = ≤−≤ 4 1 4 3 XPXP = 32 13 4 1 4 3 4 1 4 3 33 =−=− FF 27 19 3 2 1 3 2 1 3 2 1 3 2 3 =−=−= ≤−=> F XPXP Example 2 (Exercise 5.4 of your book) The prob density of a rv X is given by ( ) <≤− << = elsewhere xx xx xf 0 212 10 Find the prob that the rv takes a value (a) between 0.2 and 0.8 (b) between 0.6 and 1.2 Find the distribution function and answer the same questions.
  • 66. 61 Solution (a) ( ) ( )dxxfXP =<< 8.0 2.0 8.02.0 = 3.0 2 2.0 2 8.0 22 8.0 2.0 =−=dxx (b) ( ) ( )dxxfXP =<< 2.1 6.0 2.16.0 = ( ) ( ) ( )? 2.1 4 1 6.0 whydxxfdxxf + = ( ) 2.1 1 222 2.1 1 1 6.0 2 2 2 6.0 2 1 2 − −+−=−+ x dxxdxx ( ) 5.018.032.0 2 8. 2 1 32.0 2 =+==+= To Find the distribution function ( ) ( ) ( )dttfxxPxF x ∞− =≤= Case (i) 0≤x In this case ( ) xtfortf ≤= 0 ( ) ( ) .0==∴ ∞− dttfxF x Case (ii) 10 ≤< x In this case ( ) xtfortandttfortf ≤==≤= 00 Hence ( ) ( ) ( ) ( )dttfdttfdttfxF xx +== ∞−∞− 0 0 1 = 2 0 2 0 x dtt x =+ Case (iii) 21 ≤< x In this case ( ) 00 ≤= ttf xtt tt ≤<− ≤< 12 10
  • 67. 62 ( ) ( )dttfxF x ∞− =∴ = ( ) ( )dttfdttf x + ∞− `1 1 = ( ) ( )dttiicaseby x −+ 2 2 1 1 = ( ) ( ) 2 2 1 2 2 2 1 2 1 22 xx − −= − −+ Case (iv) x > 2 In this case ( ) xtfortf <<= 20 ( ) ( )dttfxF x ∞− =∴ ( ) ( ) ( ) 101 2 2 2 =+= += ∞− dtiiicaseby dttfdttf x x Thus ( ) ( ) > ≤< − − ≤< ≤ = 21 21 2 2 1 10 2 00 2 2 x x x x x x xF ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) 5.0 2 6.0 2 8.0 1 6.02.1 6.02.1 6.02.12.16.0 22 = −−= −= ≤−≤= ≤−<=<<∴ FF XPXP XPXPXP
  • 68. 63 ( ) ( ) ( ) ( ) 02.0 2 2. 118.11 8.118.1 2 =−−=−= ≤−=> F XPXP The mean and Variance of a continuous r.v Let X be a continuous rv with density f(x) We define its mean as ( ) ( )dxxfxXE ∞ ∞− ==µ We define its variance 2 σ as ( ) ( ) ( ) ( ) 22 22 µ µµ −= −=− ∞ ∞− XE dxxfxxE Here ( ) ( )dxxfxXE 22 ∞ ∞− = Example 3 The density of a rv X is ( ) ( )elsewhereandxxxF 0103 2 <<= Its mean ( ) ( ) . 4 3 3. 2 1 0 ==== ∞ ∞− dxxxdxxfxXEµ ( ) ( ) 5 3 3. 22 1 0 22 == = ∞ ∞− dxxx dxxfxXE Hence 0375.0 4 3 5 3 2 2 =−=σ Hence its sd is .1936.0=σ
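The integrals defining the mean and variance of a continuous rv can also be approximated numerically. A minimal sketch, checking Example 3 with a simple midpoint Riemann sum (the step count is an arbitrary choice):

def integrate(g, a, b, steps=100_000):
    h = (b - a) / steps
    return sum(g(a + (i + 0.5) * h) for i in range(steps)) * h

f = lambda x: 3 * x ** 2                              # the density of Example 3 on (0, 1)
mean = integrate(lambda x: x * f(x), 0, 1)            # 3/4
ex2 = integrate(lambda x: x * x * f(x), 0, 1)         # 3/5
var = ex2 - mean ** 2                                 # 3/80 = 0.0375
print(round(mean, 4), round(var, 4), round(var ** 0.5, 4))   # 0.75, 0.0375, 0.1936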
  • 69. 64 Example 4 The density of a rv X is ( ) > = − elsewhere xe xf x 0 0 20 1 20/ ( ) ( ) dxexdxxfxXE x 20/ 0 20 1 . − ∞∞ ∞− ===µ Integrating by parts we get ( )[ ] .20 20. 0 20/20/ = −−= ∞−− xx eex ( ) ( ) dxex dxxfxXE x 20/2 0 22 20 1 − ∞ ∞ ∞− = = On integrating by parts we get ( ) ( )( ) ( )[ ] ( ) .20 400400800 800 400.2202 222 0 20/20/20/2 =∴ =−=−=∴ = −+−− ∞−−− σ µσ XE eexex xxx NORMAL DISTRIBUTION A random variable X is said to have the normal distribution (or Gaussian Distribution) if its density is ( ) ( ) ∞<<∞−= − − xexf x 2 2 22 2 1 ,; σ µ σπ σµ Hence σµ, are fixed (called parameters) and .0>σ The graph of the normal density is a bell shaped curve:
• 70. 65 Figure
It is symmetrical about the line x = μ and has points of inflection at x = μ ± σ.
One can use integration and show that ∫ from −∞ to ∞ of f(x) dx = 1. We also see that E(X) = μ and the variance of X, E(X − μ)², equals σ².
If μ = 0, σ = 1, we say that X has the standard normal distribution. We usually use the symbol Z to denote the variable having the standard normal distribution. Thus when Z is standard normal, its density is
f(z) = (1/√(2π)) e^(−z²/2),  −∞ < z < ∞.
The cumulative distribution function of Z is
F(z) = P(Z ≤ z) = ∫ from −∞ to z of (1/√(2π)) e^(−t²/2) dt
and represents the area under the density up to z. It is the shaded portion in the figure.
Figure
We at once see from the symmetry of the graph that
F(0) = 1/2 = 0.5 and F(−z) = 1 − F(z).
• 71. 66 F(z) for various positive z has been tabulated in Table 3 (at the end of your book). We thus see from Table 3 that
F(0.37) = 0.6443, F(1.645) = 0.95, F(2.33) ≈ 0.99, and F(z) ≈ 1 for z ≥ 3.
Hence F(−0.37) = 1 − 0.6443 = 0.3557, F(−1.645) = 1 − 0.95 = 0.05, etc.
Definition of z_α
If Z is standard normal, we define z_α to be that number such that
P(Z > z_α) = α, or F(z_α) = 1 − α.
Since F(1.645) = 0.95 = 1 − 0.05, we see that z_0.05 = 1.645. Similarly z_0.01 = 2.33.
We also note z_(1−α) = −z_α. Thus z_0.95 = −z_0.05 = −1.645 and z_0.99 = −z_0.01 = −2.33.
Important
If X is normal with mean μ and variance σ², it can be shown that the standardized r.v.
Z = (X − μ)/σ
has the standard normal distribution. Thus questions about the prob that X assumes a value between say a and b can be translated into the prob that Z assumes values in a corresponding range. Specifically:
P(a < X < b) = P((a − μ)/σ < Z < (b − μ)/σ) = F((b − μ)/σ) − F((a − μ)/σ).
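If you have access to Python, the Table 3 values used above can be reproduced from the error function in the standard library (a small sketch; the helper name phi is ours, not from the text):

import math

def phi(z):
    # cdf F(z) of the standard normal variable, via the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

print(phi(0.37))    # about 0.6443
print(phi(1.645))   # about 0.9500
print(phi(-1.645))  # about 0.0500 = 1 - F(1.645)
print(phi(2.33))    # about 0.9901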
  • 72. 67 − − − = − << − = − < − < − = σ µ σ µ σ µ σ µ σ µ σ µ σ µ a F b F b Z a P bXa P Example 1 (See Exercise 5.24 on page 152) Given that X has a normal distribution with mean 2.16=µ and variance ,5625.12 =σ find the prob that it will take on a value (a) > 16.8 (b) < 14.9 (c) between 13.6 and 18.8 (d) between 16.5 and 16.7 Here 25.15625.1 ==σ Thus ( ) − > − => 25.1 2.168.16 8.16 σ µX PXP ( ) ( ) ( ) 3156.06844.01 48.0148.01 48.0 25.1 6. =−= −=≤−= >=>= FzP ZPZP (b) ( ) − < − =< 25.1 2.169.14 9.14 σ µX PXP ( ) ( ) ( ) 1492.8508.0104.1104.1 04.1 25.1 3.1 =−=−=−= −<=−<= FF ZPZP ( ) − < − < − << 25.1 2.168.18 25.1 2.166.13 8.186.13 σ µX P XP
  • 73. 68 ( ) ( ) ( ) ( ) ( )( ) ( ) 9624.19812.02108.22 08.2108.208.208.2 08.208.2 25.1 6.2 25.1 6.2 =−×=−= −−=−−= <<−=<<−= F FFFF ZPZP (Note that ( ) ( ) 012 >−=<<− cforcFcZcP ) ( ) ( ) ( ) ( ) 606.05948.06554.0 24.04.04.024.0 25.1 5. 25.1 3. 25.1 2.167.16 25.1 2.165.16 7.165.16 =−= −=<<= <<= − < − < − =<< FFzP ZP X PXP σ µ Example 2 A rv X has a normal distribution with .10=σ If the prob is 0.8212 that it will take on a value < 82.5, what is the prob that it will take on a value > 58.3? Solution Let the mean (unknown) be µ . Given ( ) 8212.05.82 =<XP Thus 8212.0 10 5.82 = − < − µ σ µX P Or 8212.0 10 5.82 = − < µ ZP 8212.0 10 5.82 = − µ F From table 3, 92.0 10 5.82 = − µ Or 3.732.95.82 =−=µ Hence ( )3.58>XP
  • 74. 69 ( )5/1 10 3.733.58 >= − > − = ZP X P σ µ ( ) ( ) ( )( ) ( ) 9332.05.15.111 5.115.11 ==−−= −−=−≤−= FF FZP Example 3 (See Exercise 5.33 on page 152) In a Photographic process the developing time of prints may be looked upon as a r.v. X having normal distribution with 28.16=µ seconds and s.d. of 0.12 second. For which value is the prob 0.95 that it will be exceeded by the time it takes to develop one of the prints. Solution That is find a number c so that ( ) 95.0=> cXP i.e 95.0 2.1 28.16 = − > − cX P σ µ i.e. 95.0 2.1 28.16 = − > c ZP Hence 05.0 2.1 28.16 = − ≤ c ZP .306.14645.12.128.16 645.1 2.1 28.16 =×−=∴ = − ∴ c c NORMAL APPROXIMATION TO BINOMIAL DISTRIBUTION Suppose X is a r.v. having Binomial distribution with parameters n and p. Then it can be shown that ( ) ( ) .∞→=≤→≤ − naszFzZPz npq npX P i.e in words, standardized binomial tends to standard normal.
• 75. 70 Thus when n is large, the binomial probabilities can be approximated using the normal distribution function.
Example 4 (See Exercise 5.36 on page 153)
A manufacturer knows that on the average 2% of the electric toasters that he makes will require repairs within 90 days after they are sold. Use the normal approximation to the binomial distribution to determine the prob that among 1200 of these toasters at least 30 will require repairs within the first 90 days after they are sold.
Solution
Let X = number of toasters (among 1200) that require repairs within the first 90 days after they are sold. Hence X is a rv having the Binomial distribution with parameters n = 1200 and p = 2/100 = 0.02.
Here np = 24 and √(npq) = √(1200 × 0.02 × 0.98) ≈ 4.85.
Required: P(X ≥ 30) = P((X − np)/√(npq) ≥ (30 − 24)/4.85)
≈ P(Z ≥ 1.24) = 1 − P(Z < 1.24) = 1 − F(1.24) = 1 − 0.8925 = 0.1075.
Correction for Continuity
Since for continuous rvs P(Z ≥ c) = P(Z > c) (which is not true for discrete rvs), when we approximate a binomial prob by a normal prob, we must ensure that we do not 'lose' the end point. This is achieved by what we call the continuity correction. In the previous example, P(X ≥ 30) also = P(X ≥ 29.5) (read the justification given in your book on page 150, lines 1 to 7).
P(X ≥ 29.5) = P((X − np)/√(npq) ≥ (29.5 − 24)/4.85) ≈ P(Z ≥ 1.13) = 1 − F(1.13) = 1 − 0.8708 = 0.1292
(probably a better answer).
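For comparison, the exact binomial tail can be computed directly and set against the two normal approximations. The following Python sketch (ours, not part of the text; phi is the standard normal cdf built from the error function) does this for the toaster example; the continuity-corrected value should be the closer of the two, as suggested above.

import math

def phi(z):
    # standard normal cdf F(z)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

n, p = 1200, 0.02
mu, sigma = n * p, math.sqrt(n * p * (1 - p))   # 24 and about 4.85

# exact P(X >= 30) for a Binomial(1200, 0.02) variable
exact = 1.0 - sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(30))

print(exact)
print(1.0 - phi((30.0 - mu) / sigma))   # plain normal approximation (0.1075)
print(1.0 - phi((29.5 - mu) / sigma))   # with continuity correction (0.1292)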
  • 76. 71 Example 5 (See Exercise 5.38 on page 153) A safety engineer feels that 30% of all industrial accidents in her plant are caused by failure of employees to follow instructions. Find approximately the prob that among 84 industrial accidents anywhere from 20 to 30 (inclusive) will be due to failure of employees to follow instructions. Solution Let X = no. of accidents (among 84) due to failure of employees to follow instructions. Thus X is a rv having Binomial distribution with parameters n = 84 and p = 0.3. Thus 2.42.25 == npqandnp Required ( )3020 ≤≤ XP ( )5.305.19 ≤≤= XP (continuity correction) ( ) ( ) ( ) ( ) ( ) 8093.019131.08962.0 136.126.136.126.1 26.136.1 2.4 2.255.30 2.4 2.255.19 =−+= −+=−−= ≤≤−≈ − ≤ − ≤ − = FFFF ZP npq npX P OTHER PROBABILITY DENSITIES The Uniform Distribution A r.v X is said to have uniform distribution over the interval ( )βα, if its density is given by ( ) << −= elsewhere x xf 0 1 βα αβ
• 77. 72 Thus the graph of the density is a constant over the interval (α, β).
If α < c < d < β,
P(c < X < d) = ∫ from c to d of 1/(β − α) dx = (d − c)/(β − α),
and thus is proportional to the length of the interval (c, d).
You may verify that:
The mean of X = E(X) = μ = (α + β)/2 (the midpoint of the interval (α, β)).
The variance of X = σ² = (β − α)²/12.
The cumulative distribution function is
F(x) = 0 for x ≤ α,
     = (x − α)/(β − α) for α < x ≤ β,
     = 1 for x > β.
Example 6 (See Exercise 5.46 on page 165)
In certain experiments, the error X made in determining the solubility of a substance is a rv having the uniform density with α = −0.025 and β = 0.025. What is the prob that such an error will be
(a) between 0.010 and 0.015?
(b) between −0.012 and 0.012?
Solution
(a) P(0.010 < X < 0.015) = (0.015 − 0.010)/(0.025 − (−0.025)) = 0.005/0.050 = 0.1.
(b) P(−0.012 < X < 0.012) = (0.012 − (−0.012))/(0.025 − (−0.025)) = 12/25 = 0.48.
  • 78. 73 Example 7 (See exercise 5.47 on page 165) From experience, Mr. Harris has found that the low bid on a construction job can be regarded as a rv X having uniform density ( ) << = elsewhere Cx C Cxf 0 2 3 2 4 3 where C is his own estimate of the cost of the job. What percentage should Mr. Harris add to his cost estimate when submitting bids to maximize his expected profit? Solution Suppose Mr. Harris adds k% of C when submitting his bid. Thus Mr. Harris gets a profit 100 kC if he gets the contract which happens if the lowest bid (by others) +≥ 100 kC C and gets no profit if the lowest bid 100 kC C +< . Thus the prob that he gets the bid −=+−×=<<+= 100 1 4 3 100 2 4 3 2 100 kkC CC C CX kC CP Thus the expected profit of Mr. Harris is ( )....0 100 1 4 3 100 ×+−× kkC −= 100400 3 2 k k C which is maximum (by using calculus) when k =50. Thus Mr. Harris’s expected profit is a maximum when he adds 50% of C to C, when submitting bids. Gamma Function This is one of the most useful functions in Mathematics. If x > 0, it is shown that the improper integral dtte xt 1 0 −− ∞ converges to a fuite real number which we denote by ( )xΓ (Capital gamma of x). Thus for all real no x > 0, we define ( ) .1 0 dttex xt −− ∞ =Γ
• 79. 74 Properties of the Gamma Function
1. Γ(x + 1) = x Γ(x), x > 0.
2. Γ(1) = 1.
3. Γ(2) = 1·Γ(1) = 1, Γ(3) = 2·Γ(2) = 2 = 2!. More generally Γ(n + 1) = n! whenever n is a +ve integer or zero.
4. Γ(1/2) = √π.
5. Γ(x) decreases in the interval (0, 1), increases in the interval (2, ∞), and has a minimum somewhere between 1 and 2.
THE GAMMA DISTRIBUTION
Let α, β be 2 +ve real numbers. A r.v. X is said to have a Gamma distribution with parameters α, β if its density is
f(x) = (1/(β^α Γ(α))) x^(α−1) e^(−x/β) for x > 0, and 0 elsewhere.
It can be shown that
Mean of X = E(X) = μ = αβ (see the working on page 159 of your text book),
Variance of X = σ² = αβ².
Exponential Distribution
If α = 1, we say X has the exponential distribution. Thus X has an exponential distribution (with parameter β > 0) if its density is
f(x) = (1/β) e^(−x/β) for x > 0, and 0 elsewhere.
• 80. 75 We also see easily that:
1. Mean of X = E(X) = β.
2. Variance of X = σ² = β².
3. The cumulative distribution function of X is
F(x) = 1 − e^(−x/β) for x > 0, and 0 elsewhere.
4. X has the memoryless property:
P(X > s + t | X > s) = P(X > t), s, t > 0.
Proof of (4):
P(X > s) = 1 − P(X ≤ s) = 1 − F(s) = e^(−s/β) (by (3)).
P(X > s + t | X > s) = P(X > s + t and X > s)/P(X > s) = P(X > s + t)/P(X > s)
= e^(−(s+t)/β)/e^(−s/β) = e^(−t/β) = P(X > t).  (QED)
Example 8 (See Exercise 5.54 on page 166)
In a certain city, the daily consumption of electric power (in millions of kw hours) can be treated as a r.v. X having a Gamma distribution with α = 3, β = 2. If the power plant in the city has a daily capacity of 12 million kw hrs, what is the prob that the power supply will be inadequate on any given day?
Solution
The power supply will be inadequate if the demand exceeds the daily capacity. Hence the prob that the power supply is inadequate = P(X > 12) = ∫ from 12 to ∞ of f(x) dx.
  • 81. 76 Now as ( ) ( ) 132 3 32 1 ,2,3 − − Γ === xexf x βα 22 16 1 x ex − = Hence ( ) ∞ − => 12 22 10 1 12 dxexXP x Integrating by parts, we get [ ] 062.025 10 400 16128122 16 1 82422 10 1 66 6662 12 2222 === +××+××= −+−−= −− −−− ∞ −−− ee eee eexex xxx Example 9 (see exercise 5.58 on Page 166) The amount of time that a surveillance camera will run without having to be reset is a r.v. X having exponential distribution with 50=β days. Find the prob that such a camera (a) will have to be reset in less than 20 days. (b) will not have to be reset in at least 60 days. Solution The density of X is ( ) )0(0 50 1 50 elsewhereandxexf x >= − (a) P (The camera has to be reset in < 20 days) = P (the running time < 20)
  • 82. 77 ( ) 3297.011 50 1 20 5 2 50 20 20 0 20 0 5050 =−=−= −==<= −− −− ee edxeXP xx (b) P (The camera will not have to be reset in at least 60 days.) ( ) 3012.0 50 1 60 5 6 60 50 60 50 ==−= =>= − ∞ − ∞ − ee dxeXP x x Example 10 (See exercise 5.61 on page 166) Given a Poisson process with the average α arrivals per unit time, find the prob density of the inter arrival time (i.e the time between two consecutive arrivals). Solution Let T be the time between two consecutive arrivals. Thus clearly T is a continuous r.v. with values > 0. Now T > t No arrival in time period t. Thus ( ) ( )0==> tXPtTP ( tX = Number of arrivals in time period t) t e α− = (as tX has a Poisson distribution with parameter tαλ = ) Hence the distribution function of T ( ) ( ) ( ) 011 >−=>−=≤== tettPtTPtF tα ( )( )00 ≤= tallforclearlytF
  • 83. 78 Hence the density of ( ) ( )tF dt d tfT =, > = − elsewhere tife t 0 0α α Hence we would say the IAT is a continuous rv. with exponential density with parameter α 1 . The Beta Function If x,y>0 the beta function, ( )yxB , (read capital Beta x,y), is defined by ( ) ( ) −− −= 1 0 11 1, dtttyxB yx It is well-known that ( ) ( ) ( ) ( ) .0,,, > +Γ ΓΓ = yx yx yx yxB BETA DISTRIBUTION A r.v. X is said to have a Beta distribution with parameter 0, >βα if its density is ( ) ( ) ( ) elsewhere xxx B xf 0 101, , 1 11 <<−= −− βα βα It is easily shown that (1) ( ) βα α µ + ==XE (2) ( ) ( ) ( )1 2 2 +++ == βαβα αβ σXV
  • 84. 79 Example 11 (See Exercise 5.64) If the annual proportion of erroneous income tax returns can be looked upon as a rv having a Beta distribution with ,9,2 == βα what is the prob that in any given year, there will be fewer than 10% of erroneous returns? Solution Let X = annual proportion of erroneous income tax returns. Thus X has a Gamma density with .9,2 == βα ( ) ( )=<∴ 1.0 0 1.0 dxxfXP (Note the proportion can not be < 0) ( ) ( ) −− −= 1.0 0 1912 1 9,2 1 dxxx B ( ) ( ) ( ) ( ) 990 1 11109 1 !11 !81 11 92 9,2 = ×× = × = Γ ΓΓ =B ( ) ( ) ( )[ ]−−−=− 1.0 0 1.0 0 988 111. dxxxdxxx ( ) ( ) ( ) ( ) ( ) ( ) 00293.0 900 19 9. 90 1 10 1 9 1 9 1 10 9. 9. 10 1 10 9. 9 1 9 9. 10 1 9 1 99 1091.0 0 109 = ×−=−+−= −++ − = − − − − − = xx The Log –Normal Distribution A r.v X is said to have a log normal distribution if its density is ( ) ( ) >> = −−− elsewhere xex xf x 0 0,0 2 1 22 2/ln1 β βπ βα
  • 85. 80 It can be shown that if X has log-normal distribution, XlnY = has a normal distribution with mean αµ = and s.d. .βσ = Thus ( )bXaP << ( )bXap lnlnln <<= − − − = − << − = β α β α β α β α a F b F b Z a p lnlnlnln Where ( ) cdfzF = of the standard normal variable Z. Lengthy calculations show that if X has log-normal distribution, its mean ( ) 2 2β α + = eXE and its variance = ( )1 22 2 −+ ββα ee More problems on Normal Distribution Example 12 Let X be normal with mean .sdand Determine c as a function of σµ and such that ( ) ( )cXPcXP ≥=≤ 2 Solution ( ) ( )cxPcXP ≥=≤ 2 Implies ( ) ( )( )cXPcXP <−=≤ 12 Let ( ) pcXP =≤ Thus 3 2 23 == porp Now ( ) 6667. 3 2 == − = − ≤ − =≤ σ µ σ µ σ µ c F cX PcXP Implies 43.0= − σ µc (approx from Table 3) σµ 43.0+=∴c
  • 86. 81 Example 13 Suppose X is normal with mean 0 and sd 5. Find ( )41 2 << XP Solution ( ) ( ) <−<=<<= <<= << 5 1 5 2 5 2 5 1 21 41 2 ZPZPZP XP XP −=−−−= 5 1 5 2 21 5 1 21 5 2 2 FFFF ( )5793.06554.02 −= from Table 3 ( ) 1522.00761.2 =×= Example 14 The annual rain fall in a certain locality is a r.v. X having normal distribution with mean 29.5” and sd 2.5”. How many inches of rain (annually) is exceeded about 5% of the time? Solution That is we have to find a number C such that ( ) 6125.33 645.15.25.29 645.1 5.2 5.29 05.0 5.2 5.29 . 05.0 05.0 = ×+=∴ == − = − > − => C z C Hence CX Pei CXP σ µ
  • 87. 82 Example 15 A rocket fuel is to contain a certain percent (say X) of a particular compound. The specification calls for X to lie between 30 and 35. The manufacturer will make a net profit on the fuel per gallon which is the following function of X. ( ) ≤<<≤ << = 3025403505.0$ 353010.0$ XorXifgallonper Xifgallonper XT -$0.10 per gallon elsewhere. If X has a normal distribution with mean 33and s.d. 3, find the prob distribution of T and hence the expected profit per gallon. Solution T = 0.10 if 30 < X < 35 ( ) ( ) ( ) ( ) 5899.018413.07486.0 11 3 2 1 3 2 3 2 1 3 3335 3 3330 353010.0 =−+= −+=−−= <<−= − < − < − = <<==∴ FFFF ZP X P XPTP σ µ ( ) ( ) ( ) ( ) ( )1 3 8 3 2 3 7 3 8 1 3 2 3 7 1 3 8 3 7 3 2 5 3330 3 3325 3 3340 3 3335 3025403505.0 FFFF FFFF ZPZP X P X P XPXPTP −+−= −−−+−= −≤< − +<≤= − < − < − + − < − ≤ − = ≤<+<≤== σ µ σ µ ( ) 0138.0 3963.05899.0110.0 3963.08413.09961.07486.09901.0 = −−=−= =−+−= TPHence Hence expected profit = E(T)
  • 88. 83 ( ) 077425.0$ 0138.010.03963.005.05899.10.0 = ×−+×+×= JOINT DISTRIBUTIONS – Two and higher dimensional Random Variables Suppose X,Y are 2 discrete rvs and suppose X can take values Yandxx ......., 21 can take values ........., 21 yy we refer to the function ( ) ( )yYxYPyxf === ,, as the joint prob distribution of X and Y. The ordered pair (X,Y) is sometimes referred to as a two – dimensional discrete r.v. Example 16 Two cards are drawn at random from a pack of 52 cards. Let X be the number of aces drawn and Y be the number of Queens drawn. Find the joint prob distribution of X and Y. Solution Clearly X can take any one of the three values 0,1,2 and Y one of the three values, 0,1,2. The joint prob distribution of X, and Y is depicted in the following 3 x 3 table x 0 1 2 0 2 52 2 44 2 52 2 44 1 4 2 52 2 4 1 2 52 1 44 1 4 2 52 1 4 1 4 0y 2 2 52 2 4 0 0
  • 89. 84 Justification ( )0,0 == yxP = P (no aces and no queens in t he 2 cards) = 2 52 2 44 ( )0,1 == YXP (the entry in the 2nd col and 1st row) =P (one ace and one other card which is neither ace nor a queen) . 2 52 1 44 1 44 etc= Can we write down the distribution of X? X can take any one of the 3 values 0,1,2 What is ( )?0=XP X = 0 means no ace is drawn but we might draw 2 queens, or 1 queen and one non queen or 2 cards which are neither aces nor queens. Thus ( ) ( ) ( ) ( ) ( )! 2 52 2 48 2 52 2 4 2 52 1 44 1 4 2 52 2 44 1.3 1,01,00,00 Verify colinprobtheofSum YXPYXPYXPXP =++ = ==+==+==== Similarly ( ) ( ) ( ) ( )2,11,10,11 ==+==+==== YXPYXPYXPXP
  • 90. 85 = Sum of the 3 probabilities in 2nd col. ( ) ( ) ( ) ( ) ( )2,21,20,22 ! 2 52 1 48 1 4 0 2 52 1 4 1 4 2 52 1 44 1 4 ==+==+==== =++= YXPYXPYXPXP Verify = Sum of the 3 probabilities in 3rd col =++= 2 52 2 4 00 2 52 2 4 The distribution of X derived from the joint distribution of X and Y is referred to as the marginal distribution of X.. Similarly the marginal distribution of Y are the 3 row totals. Example 17 The joint prob distribution of X and Y is given by x -1 0 1 -1 8 1 8 1 8 1 8 3 0 8 1 0 8 1 8 2 y 1 8 1 8 1 8 1 8 3 Marginal Distribution of X 8 3 8 2 8 3 Write the marginal distribution of X and Y. To get the marginal distribution of X, we find the column totals and write them in the (bottom) margin. Thus the (marginal) distribution of X is X -1 0 1 Prob 8 3 8 2 8 3
  • 91. 86 (Do you see why we call it the marginal distribution) Similarly to get the marginal distribution of Y, we find the 3 row totals and write them in the (right) margin. Thus the marginal distribution of y is Y Prob -1 8 3 0 8 2 1 8 3 Notation: If ( ) ( )yYxXPyxf === ,, is the joint prob distribution of the 2-dimensional discrete r.v (X.Y), we denote by g (x) the marginal distribution of X and by h(y) the marginal distribution of Y. Thus ( ) ( ) ( ) ( )yxfyYxXPxXPxg ,, 1 1 1 1 ====== All y all y And ( ) ( ) ( ) ( ) xallxall yxfyYxXPyYPyh ,, 1 1 1 1 ====== Conditional Distribution The conditional prob distribution of Y for a given X = x is defined as ( ) ( ) ( ) ( ) ( ) ( )xg yxf xXP yYxXP xXgivenyYofprobreadxXyYPxyh ,, )( = = == = ===== where g (x) is the marginal distribution of X. Thus in the above example 17, ( ) ( ) ( ) ( ) 3 1 1 0,1 1|01|0 8 3 8 1 == = == ==== XP YXP XYPh Similarly, the conditional prob distribution of X for a given Y = y is defined as
  • 92. 87 ( ) ( ) ( ) ( ) ( ) ( )yh yxf yYP yYxXP yYxXPyxg ,, || = = == ==== Where h(y) is the marginal distribution of Y. In the above example, ( ) ( ) ( ) ( ) 0 8 2 0 0 0,0 0|00|0 == = == ==== YP yYP YXPg Independence We say X,Y are independent if ( ) ( ) ( ) .,, yxallforyYPxXPyYxXP ===== Thus X,Y are independent if and only if ( ) ( ) ( ) yandxallforyhxgyxf =, which is the same as saying of g(x|y) =g(x) for all x and y which is the same as saying ( ) ( )yhxyh =| for all x,y. In the above example X,Y are not independent as ( ) ( ) ( )000,0 ==≠== YPXPYXP Example 18 The joint prob distribution of X and Y is given by X 2 0 1 2 0.1 0.2 0.1 0 0.05 0.1 0.15 Y 1 0.1 0.1 0.1 (a) Find the marginal distribution of x. Ans X 2 0 1 Prob 0.25 0.4 0.35
  • 93. 88 (b) Find the marginal distribution of Y Ans Y Prob 2 0.4 0 0.3 1 0.3 (c) Find ( )2=+YXP Ans ( ) ( ) ( )2,01,10,22 =======+ YXorYXorYXifYX Thus ( ) 35.02.01.005.02 =++==+YXP (d) Find ( )0=−YXP Ans : ( ) ( ) ( )1,10,02,20 =======− YXorYXorYXifYX ( ) 3.01.01.01.00 =++==−∴ YXP (e) Find ( )0≥XP Ans. 1 (f) Find ( ) 3.0 1 3.0 .00 =≥=− AnsXYXP (g) Find ( ) 3 1 6.0 2.0 .10 =≥=− AnsXYXP (h) Are X,Y independent? Ans No! ( ) ( ) ( ).111,1 ==≠== YPXPYXP Two-Dimensional Continuous Random Variables Let (X,Y) be a continuous 2-dimensional r.v. This means (X,Y) can take all values in a certain region of the X,Y plane. For example, suppose a dart is thrown at a circular board of radius 2. Then the position where the dart hits the board (X,Y) is a continuous two dimensional r.v as it can take all values (x,y) such that .422 ≤+ yx A function ( )yxf , is said to be the joint prob density of (X,Y) if (i) ( ) yxallforyxf ,0, ≥
  • 94. 89 (ii) ( ) 1, = ∞ ∞− ∞ ∞− dxdyyxf (iii) ( ) ( ) .,, dxdyyxfdYcbXaP b a d c =≤≤≤≤ Example 19(a) Let the joint prob density of (X,Y) be ( ) elsewhere yxyxf 0 20,20 4 1 , ≤≤≤≤= Find ( )1≤+YXP Ans : The region 1≤+ yx is given by the shaded portion. ( ) ( ) ( ) =−−=−= ≤+∴ − == 1 0 1 0 2 1 0 1 0 . 8 1 1 8 1 1 4 1 4 1 1 xdxx dxdyyxP x yx Example 19(b) The joint prob density of (X,Y) is ( ) ( ) ( )3,1 40,206 8 1 , << <<<<−−= YXPFind yxyxyxf Solution ( ) dxdyyxf yx , 3 2 1 0 ==
  • 95. 90 ( ) ( ) ( ) ( ) 8 3 18 2 5 2 25 8 1 2 5 2 6 8 1 2 5 6 8 1 2 6 8 1 6 8 1 1 0 2 1 0 3 2 21 0 3 2 1 0 =+−−=− − −= −−= −− −−= = = == x dxx dx y yx dxdyyx x x yx If ( )yxf , is the joint prob density of the 2-dimensional continuous rv (X,Y), we define the marginal prob density of X as ( ) ( )dyyxfxg , ∞ ∞− = That is fix x and integrate f(x,y) w.r.t y Similarly the marginal prob density of Y is ( ) ( )dxyxfyh , ∞ ∞− = The conditional prob density of Y for a given x is ( ) ( ) ( )xg yxf xyh , | = (Defined only for those x for which g(x) ≠ 0) The conditional prob density of X for a given y is ( ) ( ) ( ) ( ) )0( , | ≠= yhwhichforythoseforonlydefined yh yxf yxg Marginal and Conditional Densities
  • 96. 91 We say X,Y are independent if and only if ( ) ( ) ( )yhxgyxf =, which is the same as saying ( ) ( ) ( ) ( ).|| yhxyhorxgyxg == Example 20 Consider the density of (X,Y) as given in example 19. The marginal density of x = ( ) ( )dyyxxg y −−= = 6 8 1 4 2 ( ) 4 2 2 2 6 8 1 −− y yx ( )[ ] elsewhereand xx 0 20662 8 1 = <<−−= We verify this is a valid density. ( ) ( ) 20026 8 1 <<≥−= xforxxg Secondly ( ) ( )dxxdxxg 26 8 1 2 0 2 0 −= [ ] [ ] 1412 8 1 6 8 1 2 0 2 =−=−= xx The marginal density of Y is ( ) ( )dxyxyh x −− = 6 8 1 2 0 ( ) ( )[ ]262 8 1 2 6 8 1 2 0 2 −−=−−= = y x xy x Independence
  • 97. 92 = ( ) <<− elsewhere yory 0 4210 8 1 Again ( ) ( )dyyhandyh ≥ 4 2 0 ( ) [ ] [ ] 11220 8 1 10 8 1 210 8 1 4 2 2 4 2 =−=−=−= yydyy The conditional density of Y for X = 1 is ( ) ( ) ( ) ( ) ( ) ( ) 42,5 4 1 26 8 1 16 8 1 1 , 1| <<−= − − == yy y g yxf yh And 0 elsewhere Again this is a valid density as ( ) 01| ≥yh And ( ) ( )dyydyyh −= 5 4 1 1| 4 2 4 2 ( ) ( ) ( ) ( )3 3,1 3|1 1 2 1 2 9 4 1 2 5 4 1 4 2 2 < << =<< =−= − −= YP YXP YxP y Now 8 3 =Nr ( ) ( ) ( ) ( ) ( ) 8 5 2 4 2 9 4 1 2 5 4 1 5 4 1 210 8 1 3 3 2 23 2 3 2 3 2 =−= − −=−= −==<= y dyy dyydyyhYPDr
  • 98. 93 The conditional density of Y for X = 1 Is ( ) ( ) ( ) ( ) ( ) ( ) 425 4 1 26 8 1 16 8 1 1 ,1 1| <<−= − − == yy y g yf yh And 0 elsewhere Again this is a valid density as ( ) 01| ≥yh and ( ) ( )dyydyyh −= 5 4 1 1| 4 2 4 2 ( ) ( ) ( ) ( )3 3,1 3|1 1 2 1 2 9 4 1 2 5 4 1 4 2 2 < << =<< =−= − −= YP yxP YXP y Now Numerator 8 3 = ( ) ( ) ( ) ( ) ( ) 8 5 2 4 2 9 4 1 2 5 4 1 5 4 1 210 8 1 3rDenominato 3 2 23 2 3 2 3 2 =−= − −=−= −==<= y dyy dyydyyhYP Hence ( ) 5 3 8 5 8 3 3,1 ==<< YXP Let ( )yxf , be the joint density of (X,Y). We define the cumulative distribution function as ( ) ( )yYxXPyxF ≤≤= ,, ( ) ., dvduvuf yx ∞−∞− = The Cumulative Distribution Function
  • 99. 94 Example 21 (See Exercise 5.77 on page 180) The joint prob density of X and Y is given by ( ) ( ) <<<<+ = elsewhere yxyx yxf 0 10,10 , 2 5 6 Find the cumulative distribution function F(x,y) Case (i) x < 0 ( ) ( ) ( ) 0, 0,0 ,, < == = ∞−∞− vuany forvufas dvduvufyxF yx Case (ii) y < 0. Again ( ) 0, =yxF whatever be x. Case (iii) ( ) ( ) ( ) ( ) ( )( )000, ,, 10,10 2 5 6 00 <<=+= = <<<< == ∞− voruforvufasdvduvu dvduvufyxF yx y v x u y du v uv yx u 0 3 0 35 6 += = +=+= = 325 6 35 6 323 0 xyyx du y uy x u . Solution
  • 100. 95 Case (iv) 1,10 ≥<< yx ( ) ( ) ( ) +=+= += = = == ∞−∞− 325 6 3 1 5 6 5 6 ,, 2 0 2 1 00 xx duu dudvvu dudvvufyxF x u v x u yx Case (v) 10,1 <<≥ yx as in case (iii) we can show ( ) += 325 6 , 3 yy yxF Case (v) 1,1 ≥≥ yx ( ) ( ) ( )dvduvududvvufyxF vu yx 2 1 0 1 0 5 6 ,, +== ==∞−∞− 1 3 1 2 1 5 6 3 1 5 6 1 0 =+=+= − duu u (Did you anticipate this?) Hence ( )6.04.0,5.02.0 <<<< YXP ( ) ( ) ( ) ( ) ( )?4.0,2.0 4.0,5.06.0,2.0 6.0,5.0 WhyF FF F + −− =
  • 101. 96 ( ) ( ) ( )( ) ( ) ( ) ( )( ) ( ) ( ) ( )( ) ( ) ( ) ( )( ) 3 4.02.0 2 4.02.0 3 4.05.0 2 4.05.0 3 6.02.0 2 6.02.0 3 6.05.0 2 6.05. 5 6 3232 3232 ++−− −−+= ( ) ( ) ( ) ( )( ) ( ) ( ) ( ) ( )− −×−−+×= 3 4.06.0 2.01.2.04.06.0 3 5.0 15.0 5 6 2 2332 ( ) ( )[ ] ( ) ( ) ( )[ ][ ] ( ) ( ) ( ) ( )[ ] [ ] 04344.0 362.01.0 5 6 4.06.02.05.01.0 5 6 4.06.01.01.02.05.0 5 6 3322 3322 = ××= −+−×= −+×−= Example 22 The joint density of X and Y is ( ) ( ) <<<<+ = elsewhere yxyx yxf 0 10,10 , 2 5 6 (a) Find the conditional prob density g (x | y) (b) Find 2 1 |xg (c) Find the mean of the conditional density of X given that 2 1 =Y Solution ( ) ( ) ( ) ( )yhwhere yh yxf yxg , | = is the marginal density of y.
  • 102. 97 Thus ( ) ( ) ( ) .10 2 1 5 6 , 2 2 5 6 1 0 1 0 <<+= +== == yy dxyxdxyxfyh xx Hence ( ) ( ) ( ) ( ) 10, 4 1 3 4 2 1 | 0 .10,| 4 1 2 1 4 1 2 2 1 2 2 2 1 5 6 2 5 6 <<+= + + =∴ << + + = + + = xx x xg elsewhereand x y yx y yx yxg Hence dxx dxxgx yxE +×= = = 4 1 3 4 2 1 | 2 1 | 1 0 1 0 8 11 8 1 3 1 3 4 833 4 1 0 23 =+=+= xx
  • 103. 98 Example 23 (X,Y) has a joint density which is uniform on the rhombus find (a) Marginal density of X. (b) Marginal density of Y (c) The conditional density of Y given 2 1 =X Solution (X,Y) has uniform density on the rhombus means ( ) bushomrtheofArea 1 y,xf = bushomrtheover 2 1 = and 0 elsewhere (a) Marginal Density of X Case (i) 0<x<1 ( ) ( )xdyxf x xy −== − −= 1 2 1 1 1 Case (ii) –1<x<0 ( ) xdyxf x xy +== + −−= 1 2 1 1 1 Thus ( ) <<− <<−+ = elsewhere0 1x0x1 0x1x1 xg (b) By symmetry marginal density of Y is
  • 104. 99 ( ) <<− <<−+ = elsewhere yy yy yh 0 101 011 (c) 2 1 2 1 , 2 1 tofromrangesyxfor −= Thus conditional density of Y for is 2 1 X = ( ) ( ) ( ) <<− == elsewhere0 y1 f ,xf |yh 2 1 2 1 2 1 2 1 2 1 for 3 2 to 3 2 fromrangsY 3 1 x −= ( ) <<−= =∴ elsewhere0 3 2 y 3 2 4 3 |yh 3 2 2 1 3 1
• 105. 100 PROPERTIES OF EXPECTATION
Let X be a r.v. and a, b be constants. Then
(a) E(aX + b) = a E(X) + b
(b) Var(aX + b) = a² Var(X)
If X1, X2, ..., Xn are any n rvs,
E(X1 + X2 + ... + Xn) = E(X1) + E(X2) + ... + E(Xn).
But if X1, ..., Xn are n independent rvs, then
Var(X1 + X2 + ... + Xn) = Var(X1) + Var(X2) + ... + Var(Xn).
In particular if X, Y are independent,
Var(X + Y) = Var(X − Y) = Var(X) + Var(Y).
Please note: whether we add X and Y or subtract Y from X, we always must add their variances.
If X, Y are two rvs, we define their covariance
COV(X, Y) = E[(X − μ1)(Y − μ2)], where μ1 = E(X), μ2 = E(Y).
Theorem: If X, Y are independent, E(XY) = E(X) E(Y) and COV(X, Y) = 0.
• 106. 101 Sample Mean
Let X1, X2, ..., Xn be n independent rvs each having the same mean μ and the same variance σ². We define
X̄ = (X1 + X2 + ... + Xn)/n.
X̄ is called the mean of the rvs X1, ..., Xn. Please note that X̄ is also a rv.
Theorem
1. E(X̄) = μ.
2. Var(X̄) = σ²/n.
Proof
(1) E(X̄) = (1/n)[E(X1) + E(X2) + ... + E(Xn)] = (μ + μ + ... + μ, n times)/n = μ.
(2) Var(X̄) = (1/n²)[Var(X1) + Var(X2) + ... + Var(Xn)] (as the variables are independent)
= (σ² + σ² + ... + σ², n times)/n² = nσ²/n² = σ²/n.
• 107. 102 Sample Variance
Let X1, ..., Xn be n independent rvs each having the same mean μ and the same variance σ². Let X̄ = (X1 + X2 + ... + Xn)/n be their sample mean. We define the sample variance as
S² = (1/(n − 1)) Σ (from i = 1 to n) (Xi − X̄)².
Note S² is also a r.v., and E(S²) = σ².
Proof: read it on page 179.
Simulation
To simulate the values taken by a continuous r.v. X, we have to use the following theorem.
Theorem
Let X be a continuous r.v. with density f(x) and cumulative distribution function F(x). Let U = F(X). Then U is a r.v. having the uniform distribution on (0, 1). In other words, U is a random number.
Thus to simulate the value taken by X, we take a random number U from Table 7 (now you must put a decimal point before the number) and solve for X the equation F(X) = U.
  • 108. 103 Example 24 Let X have uniform density on ( )βα, . Simulate the values of X using the 3-digit random numbers. 937, 133, 753, 503, ….. Solution Since X has uniform density on ( )βα, its density is ( ) << = − elsewhere n xf 0 1 βααβ Thus the cumulative distribution function is ( ) > ≤< ≤ = − − β βα α αβ α x x x xF x 1 0 ( ) = α−β α− = X meansXF ( ) __ αβα −+=∴X Hence if ( )937.X,937. α−β+α== ( ) .etc 133.X,133. α−β+α== Let x have exponential density (with parameter β ) ( ) >= β − β elsewhere0 0xexf x 1
  • 109. 104 Hence the cumulative distribution function is ( ) >− ≤ = − 0xe1 0x0 xF p x Thus solving ( ) getwe,XforUe1)ie(,UXF x =−= β − − β= U1 1 lnX Since U is a random number implies 1-U is also a random number, we can as well use the formula .Uln U 1 lnX β−= β= Example 25 X has exponential density with parameter 2. Simulate a few values of X. Solution The defining equation for X is ln2X −= Taking 3 digit random numbers form table 7 page 595 row 21 col. 3, we get the random numbers : 913, 516, 692, 007 etc. The corresponding X values are : ( ) ( ) ( )..........692.ln2,516.ln2,913.ln2 −−−
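The same inverse-transform step can be automated instead of reading random numbers from Table 7. A minimal Python sketch (the seed and sample size are arbitrary choices of ours):

import math
import random

random.seed(7)
beta = 2.0   # the exponential parameter of Example 25

def draw_exponential(beta):
    # inverse-transform step: solve F(x) = 1 - exp(-x/beta) = u, i.e. x = -beta*ln(1 - u)
    u = random.random()
    return -beta * math.log(1.0 - u)   # since 1 - U is also a random number, -beta*ln(u) works too

sample = [draw_exponential(beta) for _ in range(100_000)]
print(sum(sample) / len(sample))       # should be close to beta = 2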
  • 110. 105 Example 26 The density of a rv X is given by ( ) elsewhere xxxf 0 11 = <<−= Simulate a few values of X. Solution First let us find the cumulative distribution function F(x). Case (i) 1≤x In this case F(x) = 0 Case (ii) .01 ≤<− x ( ) ( ) 2 1 2 1 1 x dtt dttdttfxF x xx − =−= == − −∞− Case (iii) 10 ≤< x In this case ( ) ( )dttfxF x ∞− = 2 1 22 1 0 0 22 0 0 1 1 xx tdtdttdt x + =++= +−+= − − ∞−
  • 111. 106 Case (iv) x>1. In this case F(x) =1 Thus ( ) 1x1 1x0 0x1 1x0 xF 2 x1 2 x1 2 2 > ≤< ≤<− −≤ = + − To simulate a value for X, we have to solve the equation F(x) = U for X Case (i) 2 1 0 <≤ U In this case we use the equation ( ) ( ) ( )?whyU21X ?whyU 2 x1 xF 2 −−=∴ = − = Case (ii) 1 2 1 <≤ U In this case we solve for X, the equation ( ) 1U2X U 2 X1 XF 2 −+=∴ = + = Thus the defining conditions are : + − −=<≤ −=<≤ 1U2x,1U 2 1 If and U21X, 2 1 U0If
  • 112. 107 Let us consider the 3 digit random numbers on page 594 Row 17 Col. 5 726, 282, 272, 022,……. 662.0281.221XThus 2 1 281.U 672.01726.2XThus 2 1 726.U −=×−=<= =−×=≥= − + Note : Most of the computers have built in programs which generate random deviates from important distributions. Especially, we can invoke the random deviates from a standard normal distribution. You may also want to study how to simulate values from a standard normal distribution by Box-Muller-Marsaglia method given on page 190 of the text book. Example 27 Suppose the no of hours it takes a person to learn how to operate a certain machine is a random variable having normal distribution with .2.18.5 == σµ and Suppose it takes two person to operate the machine. Simulate the time it takes four pairs of persons to learn how to operate the machine. That is, for each pair, calculate the maximum of the two learning times. Solution We use Box-Muller-Marsaglia Method to generate pairs of values 21, zz taken by a standard normal distribution. Then we use the formula 22 11 zx zx σµ σµ += += to simulate the time taken by a pair of persons. (where 2.1,8.5 == σµ ) We start with the random numbers from Table 7
  • 113. 108 Page 593, Row 19, Column 4 729, 016, 672, 823, 375, 556, 424, 854 Note ( ) ( ) ( )122 121 u2Sinuln2z 2Cosuln2z π−= πµ−= The angles are expressed in radians. U1 U2 Z1 Z2 X1 X2 .729 .016 -0.378 -0.991 5.346 4.611 etc. Review Exercises 5.108. If the probability density of a r.v. X is given by ( ) ( ) <<− = elsewhere0 1x0x1k xf 2 Find the value of k and the probabilities (a) ( )2.0X1.0P << (b) ( )5.0XP > Solution ( ) ( ) 2 3 1 3 1 1 111 2 1 0 =∴ =− =−= ∞ ∞− k kor dxxkwgivesdxxf s
  • 114. 109 The cumulative distribution function F(x) of X is: Case (i) ( ) 00 =∴≤ xFx Case (ii) ( ) ( )dtt1kxF,1x0 2 x 0 −=≤< −= 32 3 3 x x . Case (iii) ( ) 1xF.1x => ( ) ( ) ( ) ( ) ( ) ( ) ( )−−−= −=<<∴ 3 1.0 1.0 2 3 3 2.0 2.0 2 3 1.0F2.0F2.0X1.0P 33 2 ( ) ( ) ( ) ( ) ( )−−=−= ≤−=< 3 5.0 5.0 2 3 15.0F1 5.0XP15.0XP 3 5.113: The burning time X of an experimental rocket is a r.v. having the normal distribution with .sec04.0sec76.4 == σµ and What is the prob that this kind of rocket will burn (a) <4.66 Sec (b) > 4.80 se (c) anywhere from 4.70 to 4.82 sec? Solution (a) ( ) − < σ µ− =< 04.0 76.466.4X P66.4XP ( ) ( ) ( ) 040135987.0125.01 25.0125.0 =−=−= <−=−<= F ZPZP
  • 115. 110 (b) ( ) − > σ µ− => 04.0 76.480.4X P80.4XP ( ) ( ) 1587.08413.01111 =−=−=>= FZP (c) ( )82.4X70.4P << ( ) ( ) 8664.019332.0215.1F2 5.1Z5.1P 04.0 76.482.4X 04.0 76.470.4 P =−×=−= <<−= − < σ µ− < − = 5.11 The prob density of the time (in milliseconds) between the emission of beta particles is a r.v. X having the exponential density ( ) > = − elsewhere xe xf 0 025.0 25.0 Find the probability that (a) The time to observe a particle is more than 200 microseconds (=200x 10-3 milliseconds) (b) The time to observe a particle is < 10 microseconds Solution (a) ( ) ( )secmilli10200XPsecmicro200P 3− ×>=> [ ] 3 3 3 1050 10200 25.025.0 10200 25.0 − − − ×− ∞ × −− ∞ × = −== e edxe xx
  • 116. 111 (b) ( ) ( )3 1010XPondssecmicro10XP − ×<=< [ ] 3 3 3 105.2 1010 0 25.0 1010 0 25.0 1 25.0 − − − ×− ×− × − −= −== e edxe bx 5.120: If n sales people are employed in a door-to-door selling campaign, the gross sales volume in thousands of dollars may be regarded as a r.v. having the Gamma distribution with . 2 1 100 == βα andn If the sales costs are $5,000 per salesperson, how many sales persons should be employed to maximize the profit. Solution For a Gamma distribution .50 n== αβµ Thus (in thousands of dollars) the “average” profit when n persons are employed. nnT 550 −== (5 x 1000 per person is the cost per person) This is a maximum (using calculus) when n = 25. 5.122: Let the times to breakdown for the processors of a parallel processing machine have joint density ( ) >> = −− elsewhere yxe yxf yx 0 0,004.0 , 2.02.0 where X is the time for the first processor and Y is the time for the 2nd processor. Find (a) The marginal distributions and their means (b) The expected value of the sum of the X and Y. (c) Verify that the mean of a sum is the sum of the means.
  • 117. 112 Solution (a) Marginal density of X ( ) ( ) ( )0xif0and 0x,e2.0dye2.0e2.0 dye04.0dyy,xfxg x2.0y2.0 0y x2.0 y2.0x2.0 0yy ≤= >== === −− ∞ = − −− ∞ = ∞ −∞= By symmetry, the marginal distribution of Y is ( ) > = − elsewhere ye yh y 0 02.0 2.0 Since X (& Y) have exponential distributions (with parameters 5 2.0 1 = ) E(X) = E(Y) = 5. E since f(x,y) = g (x) h (y), X,Y are independent. ( ) ( ) ( ) ( )( ) dydxex dydxeyx dydxyxfyxYXE yx yx yx yx 02.02.0 00 2.02.0 00 04.0. 04.0 , −− ∞ = ∞ = −− ∞ = ∞ = ∞ ∞− ∞ ∞− = += +=+
  • 118. 113 ( ) ( ) ( )YEXE !verify1055 dydxe04.0y y2.0x2.0 0y0x += =+= ×+ −− +∞ = ∞ = 5.123: Two random variable are independent and each has binomial distribution with success prob 0.7 and 2 trials. (a) Find the joint prob distribution. (b) Find the prob that the 2nd variable is greater than the first. Solution Let X,Y be independent and have Binomial distribution with parameters n = 2, and p = 0.7 Thus ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) 2r,k0 3.7.0 r 2 k 2 .tindependenareY,XasrYPkXPrY,kXP 2,1,0r3.07.0 r 2 rYP 2,1,0k3.07.0 k 2 kXP rk4rk r2r k2k ≤≤ = =====∴ === === +−+ − −
  • 119. 114 (b) ( )XYP > ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( )+ += ==+=== 2011 112002 3.07.0 0 2 3.07.0 1 2 3.07.0 1 2 3.07.0 0 2 3.7.0 2 2 0X,1YP1or0X,2YP 5.124 If X1 has mean – 5, variance 3 while X2 has mean 1 and variance 4, and the two are independent, find (a) ( )2X5X3E 21 ++ (b) ( )2X5X3Var 21 ++ Ans: (a) ( ) ( ) 821553 −=++− (b) 12742539 =×+×
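Exercise 5.123 is also convenient to verify by brute force. A minimal Python sketch (the dictionary layout and helper name are our own choices):

import math

def binom_pmf(k, n, p):
    # P(X = k) for a Binomial(n, p) variable
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 2, 0.7
# joint distribution of two independent Binomial(2, 0.7) variables
joint = {(k, r): binom_pmf(k, n, p) * binom_pmf(r, n, p) for k in range(3) for r in range(3)}

print(sum(joint.values()))                               # 1.0 (a valid joint distribution)
print(sum(pr for (k, r), pr in joint.items() if r > k))  # P(Y > X), about 0.2877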
• 120. 115 Sampling Distribution
Statistical Inference
Suppose we want to know the average height of an Indian, or the average life length of a bulb manufactured by a company, etc. Obviously we cannot burn out every bulb and find the mean life length. One chooses at random, say, n bulbs, finds their life lengths X1, X2, ..., Xn and takes the mean life length
X̄ = (X1 + X2 + ... + Xn)/n
as an 'approximation' to the actual (unknown) mean life length. Thus we make a statement about the "population" (of all life lengths) by looking at a sample of it. This is the basis behind statistical inference. The whole theory of statistical inference tells us how close we are to the true (unknown) characteristic of the population.
Random Sample of size n
In the above example, let X be the life length of a bulb manufactured by the company. Thus X is a rv which can assume values > 0. It will have a certain distribution and a certain mean μ, etc. When we make n independent observations, we get n values x1, x2, ..., xn. Clearly if we again take n observations, we would get another set of values y1, y2, ..., yn. Thus we may say:
Definition
Let X be a random variable. A random sample of size n from X is a finite ordered sequence {X1, X2, ..., Xn} of n independent rvs such that each Xi has the same distribution as that of X.
Sampling from a finite population
Suppose there is a universe having a finite number of elements only (like the number of Indians, the number of females in the USA who are blondes, etc.). A sample of size n from the above is a subset of n elements such that each subset of n elements has the same prob of being selected.
  • 121. 116 Statistics Whenever we sample, we use a characteristic of the sample to make a statement about the population. For example suppose the true mean height of an Indian is µ (cms). To make a statement about µ , we randomly select n Indians, Find their heights { }n21 X....,X,X and then their mean namely n X.....XX X n21 +++ = We use then X as an estimate of the unknown parameter µ . Remember µ is a parameter, a constant that is unchanged. But the sample mean X is a r.v. It may assume different values depending on the sample of n Indians chosen. Definition : Let X be a r.v. Let { }n21 X.....X,X be a sample of size n from X. A statistic is a function of the sample { }n21 X,....,X,X . Some Important Statistics 1. The sample mean n X.....XX X n21 +++ = 2. The sample Variance ( )2 i n 1i 2 XX 1n 1 S − − = = 3. The minimum of the sample { }n21 X,....,X,XminK = 4. The maximum of the sample { }.X,......X,XmaxM n21= 5. The Range of the sample KMR −= Definition If n1 X,.....X is a random sample of size n and if ∧ X is a statistic, then we remember ∧ X is also a r.v. Its distribution is referred to as the sampling distribution of ∧ X .
• 122. 117 The Sampling Distribution of the Sample Mean X̄
Suppose X is a r.v. with mean μ and variance σ². Let X1, X2, ..., Xn be a random sample of size n from X. Let X̄ = (X1 + X2 + ... + Xn)/n be the sample mean. Then
(a) E(X̄) = μ.
(b) Var(X̄) = σ²/n.
(c) If X1, ..., Xn is a random sample from a finite population with N elements, then Var(X̄) = (σ²/n) · (N − n)/(N − 1).
(d) If X is normal, X̄ is also normal.
(e) Whatever be the distribution of X, if n is "large", (X̄ − μ)/(σ/√n) has approximately the standard normal distribution. (This result is known as the central limit theorem.)
Explanation
(a) tells us that we can "expect" the sample mean X̄ to be an approximation to the population mean μ.
(b) tells us that the spread of X̄ about μ is small when the sample size n is large.
(d) says that if X has a normal distribution, (X̄ − μ)/(σ/√n) has exactly the standard normal distribution.
(e) says that whatever be the distribution of X, discrete or continuous, (X̄ − μ)/(σ/√n) has approximately the standard normal distribution if n is large.
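Properties (a) and (b) are easy to see empirically. The sketch below (Python; the exponential population, seed and sample sizes are our own choices, not from the text) draws many samples of size 36 and looks at the resulting sample means.

import math
import random

random.seed(11)
mu, n, reps = 5.0, 36, 5000      # exponential population with mean 5 (so sigma = 5 as well)

means = []
for _ in range(reps):
    sample = [random.expovariate(1.0 / mu) for _ in range(n)]
    means.append(sum(sample) / n)

grand_mean = sum(means) / reps
sd = math.sqrt(sum((m - grand_mean) ** 2 for m in means) / (reps - 1))
print(grand_mean)              # close to mu = 5, as in (a)
print(sd, mu / math.sqrt(n))   # close to sigma/sqrt(n) = 5/6, as in (b)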
  • 123. 118 Example 1 (See exercise 6.14, page 207) The mean of a random sample of size n = 25 is used to estimate the mean of an infinite population with standard deviation .4.2=σ What can we assert about the prob that the error will be less than 1.2 if we use (a) Chebyshev’s theorem (b) The central limit theorem? Solution (a) We know the sample mean X is a rv with ( ) ( ) n XVarandXE 2 σ =µ= Chebyshev’s theorem tell us that for any r.v. T, ( ) ( )( ) 2 k 1 1TVark|TET|P −≥− Taking ,XT = and noting ( ) ( ) ,XETE µ== ( ) ( ) ( ) , 25 4.2 n XvarTvar 22 = σ == we find . k 1 1 5 4.2 .kXP 2 −≥<µ− Desired ( )?2.1XP <µ− 2 5 kgives2.1 5 4.2 .k == Thus we can assert using Chebyshev’s theorem that ( ) 84.0 25 211 12.1XP 4 25 ==−≥<µ−
  • 124. 119 (b) Central limit theorem says 5 4.2 n XX µ− = µ− σ is approximately standard normal. Thus ( )2.1XP <µ− ( ) 9876.019938.0215.2F2 1 2 5 F2 2 5 ZP 2.1X P 5 4.2 n =−×=−×= −=<≈ < µ− = σ Example 2 (See exercise 6.15 on page 207) A random sample of size 100 is taken from an infinite population having mean 76=µ and variance .2562 =σ What is the prob that X will be between 75 and 78? Solution We use central limit theorem namely n X σ µ− is approximately standard normal. Required ( )78X75P << − < µ− < − = σ 10 16 n10 16 7678X7675 P 8284.017340.08944.0 1 8 5 4 5 8 5 4 5 4 5 8 5 16 20 16 10 =−+= −+=−−= <<−=<<−≈ FFFF ZPZP
  • 125. 120 Example 3 (See Exercise 6.17 on page 217) If the distribution of weights of all men travelling by air between Dallas and El Paso has a mean of 163 pounds and a s.d .of 18 pounds, what is the prob. That the combined gross weight of 36 men travelling on a plane between these two cities is more than 6000 pounds? Solution Let X be the weight of a man traveling by air between D and E. It is given that X is a rv with mean ( ) 163XE =µ= lbs and sd .lbs18=σ Let 3621 X.....X,X be the weights of 36 men traveling on a plane between these two cities. Thus we can regard { }3621 X.....,X,X as a random sample of size 36 from X. Required ( )6000X.....XXP 3621 >+++ >≈ − > µ− = >= σ 18 22 ZP 163X P 36 6000 XP 6 18 6 1000 n by central limit theorem ( ) 1112.08888.01 22.11 18 22 1 =−= −=≤−= FZP
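The same calculation can be scripted with the phi helper used earlier for the standard normal cdf (a sketch; the numbers are those of Example 3):

import math

def phi(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu, sigma, n, total = 163.0, 18.0, 36, 6000.0
z = (total / n - mu) / (sigma / math.sqrt(n))   # (166.67 - 163) / 3, about 1.22
print(1.0 - phi(z))                             # about 0.11, matching the value above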
• 126. 121 The sampling distribution of the sample mean X̄ (when σ is unknown)
Theorem
Let X be a rv having a normal distribution with mean E(X) = μ. Let X̄ be the sample mean and S² the sample variance of a random sample of size n from X. Then the rv
t = (X̄ − μ)/(S/√n)
has (Student's) t-distribution with n − 1 degrees of freedom.
Remark
(1) The shape of the density curve of the t-distribution (with parameter ν, greek nu) is like that of the standard normal distribution and is symmetrical about the y-axis. t_(ν,α) is that unique number such that
P(t > t_(ν,α)) = α   (ν → the parameter).
By symmetry t_(ν,1−α) = −t_(ν,α).
The values of t_(ν,α) for various ν and α are tabulated in Table 4. For ν large, t_(ν,α) ≈ z_α.
Example 4 (See Exercise 6.20 on page 213)
A random sample of size 25 from a normal population has the mean x̄ = 47.5 and the s.d. s = 8.4. Does this information tend to support or refute the claim that the mean of the population is μ = 42.1?
  • 127. 122 Solution: n s x t µ− = has a t-distribution with parameter 1−= nν Here 25,4.8,1.42 === nsµ 797.2tt 005.0,24005.0,1n ==α− Thus ( ) 005.0797.2 =>tP Or 005.0797.2 X P n s => µ− Or 005.0 5 4.8 797.21.42XP =×+> Or ( ) 005.078.46XP => This means when 21.4=µ only in about 0.5 percent of the cases we may get an 78.46X > . Thus we will have to refute the claim 1.42=µ (in favour of )1.42>µ Example 5 (See exercise 6.21 on page 213) The following are the times between six calls for an ambulance (in a certain city) and the patients arrival at the hospital : 27, 15,20, 32, 18 and 26 minutes. Use these figures to judge the reasonableness of the ambulance service’s claim that it takes on the average 20 minutes between the call for an ambulance and the patients arrival at the hospital. Solution Let X = time (in minutes) between the call for an ambulance and the patient’s arrival at the hospital. We assume X has a normal distribution. (When nothing is given, we assume normality). We want to judge the reasonableness of the claim that ( ) 20XE =µ= minutes. For this we recorded the times for 6 calls. So we have a random sample of size 6 from X with
  • 128. 123 ( ) .23 6 138 6/261832201527XThus.26X,18X,32X,20X,15X,27X 654321 == +++++======= ( ) ( ) ( ) ( ) ( ) ( )[ ] [ ] 5 204 9258196416 5 1 232623182332232023152327 16 1 2222222 =+++++= −+−+−+−+−+− − =S Hence 5 204 =S We calculate 150.1 6/ 2023x t 5 204 n s = − = µ− = Now 05.0015.2,5,1 ===− ααα forttn = 10.0476.1 =αfor Since our observed 10.5150.1 tt <= We can say that it is reasonable to assume that the average time is 20=µ minutes Example 6 A process for making certain bearings is under control if the diameters of the bearings have a mean of 0.5000 cm. What can we say about this process if a sample of 10 of these bearings has a mean diameter of 0.5060 cm and sd 0.0040 cm? ( ) ,504.0506.0 01.0504.0492.0 01.025.3 5.0 25.3.int 10 004. >= =<< =< − <− XSince xPor X PH the process is not under control.
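Example 5's t value is easy to recompute. A short Python sketch (using the standard statistics module; the comparison with Table 4 is still done by hand):

import math
import statistics

times = [27, 15, 20, 32, 18, 26]   # the six ambulance times of Example 5
n = len(times)
xbar = statistics.mean(times)      # 23
s = statistics.stdev(times)        # sample s.d. = sqrt(204/5)

t = (xbar - 20) / (s / math.sqrt(n))
print(xbar, s, t)                  # t is about 1.15, well below t_(5, 0.05) = 2.015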
• 129. 124 Sampling Distribution of S² (the sample variance)
Theorem
If S² is the sample variance of a random sample of size n taken from a normal population with (population) variance σ², then
χ² = (n − 1)S²/σ² = Σ (from i = 1 to n) (Xi − X̄)²/σ²
is a random variable having the chi-square distribution with parameter ν = n − 1.
Remark
Since S² > 0, the rv has +ve density only to the right of the origin. χ²_(ν,α) is that unique number such that P(χ² > χ²_(ν,α)) = α, and is tabulated for some α's and ν's in Table 5.
Example 7 (See Exercise 6.24 on page 213)
A random sample of 10 observations is taken from a normal population having the variance σ² = 42.5. Find approximately the prob of obtaining a sample standard deviation S between 3.14 and 8.94.
Solution
Required: P(3.14 < S < 8.94) = P(3.14² < S² < 8.94²)
= P(9 × 3.14²/42.5 < (n − 1)S²/σ² < 9 × 8.94²/42.5)
= P(2.088 < χ² < 16.925).
(From Table 5, χ²_(9, 0.05) = 16.919 and χ²_(9, 0.99) = 2.088.)
= P(χ² > 2.088) − P(χ² > 16.919) = 0.99 − 0.05 = 0.94 (approx).
  • 130. 125 Example 8 (See exercise 6.23 on page 213) The claim that the variance of a normal population is 3.212 =σ is rejected if the variance of a random sample of size 15 exceeds 39.74. What is the prob that the claim will be rejected even though ?3.212 =σ Solution The prob that the claim is rejected ( ) ( ) ( ) ( )12.21,5tablefromAs025.0 12.21P74.39 3.21 14 S 1n P 74.29SP 2 025.0,14 22 2 2 =Χ= >Χ=×> σ − = >= Theorem If 2 2 2 1 ,SS are the variances of two independent random samples of sizes 21 ,nn respectively taken from two normal populations having the same variance, then 2 2 2 1 S S F = is a rv having the (Snedecor’s) F distribution with parameters 11 2211 −=−= nandn νν Remark 1. 11 −n is called the numerator degrees of freedom and 12 −n is called the denominator degrees of freedom. 2. If F is a rv having ( )21 ,νν degrees of freedom, then ανν ,, 21 F is that unique number such that
  • 131. 126 ( ) αανν => ,21 FFP and is tabulated for 05.0=α in table 6(a) and for 01.0=α in table 6(b). We also note the fact : ανν ανν − = 1,, ,, 21 22 1 F F Thus 36.0 77.2 11 05.0,10,20 95.0,20,10 === F F Example 9 (a) 38.0 62.2 11 05.0,12,15 95.0,15,12 === F F (b) 135.0 40.7 11 01.0,6,20 99.0,20,6 === F F Example 10 (See Exercise on page 213) If independent random samples of size 821 == nn come from two normal populations having the same variance, what is the prob that either sample variance will be at least seven times as large as the other? Solution Let 2 2 2 1 ,SS be the sample variances of the two samples. Reqd ( )2 1 2 2 2 2 2 1 S7SORS7SP >> ( )72 77 2 1 2 2 2 2 2 1 >= >>= FP S S or S S P where F is a rv having F distribution with (7,7) degrees of freedom = 2 x 0.01 = 0.02 (from table 6(b)).
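A rough Monte Carlo check of Example 10 (a Python sketch; the seed and number of repetitions are arbitrary, so the printed value will only be near 0.02):

import random
import statistics

random.seed(2)
reps, n = 20_000, 8
count = 0
for _ in range(reps):
    # two independent samples of size 8 from the same normal population
    s1 = statistics.variance([random.gauss(0.0, 1.0) for _ in range(n)])
    s2 = statistics.variance([random.gauss(0.0, 1.0) for _ in range(n)])
    if s1 > 7 * s2 or s2 > 7 * s1:
        count += 1
print(count / reps)   # roughly 0.02, as in Example 10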
  • 132. 127 Example 11 (see exercise 6.38 on page 215) If two independent random samples of size 169 21 == nandn are taken from a normal population, what is the prob that the variance of the first sample will be at least four times as large as the variance of the second sample? Hint : Reqd prob ( )2 2 2 1 4SSP >= ( ) ( )4Fas01.0 4FP4 S S P 01.0,15,8 2 2 2 1 == >=>= Example 12 (See Exercise 6.29 on page 214) The F distribution with (4,4) degrees of freedom is given by ( ) ( ) ≤ >+ = − 00 016 4 F FFF Ff If random samples of size 5 are taken from two normal populations having the same variance, find the prob that the ratio of the larger to the smaller sample variance will exceed 3? Solution Let 2 2 2 1 S,S be the sample variance of the two random samples. Reqd ( )2 1 2 2 2 2 2 1 33 SSorSSP >> ( )3232 2 2 2 1 >=>= FP S S P where F is a rv having (4,4) degrees of freedom
  • 133. 128 ( ) ( ) ( ) ( ) ( ) 16 5 192 125 192 1 32 1 12 F13 1 F12 1 12 dF F1 1 F1 1 12dF F1 F6 2 32 3 434 3 = × =−= + + + −= + − + = + = ∞∞ Inferences Concerning Means We shall discuss how we can make statement about the mean of a population from the knowledge about the mean of a random sample. That is we ‘estimate’ the mean of a population based on a random sample. Point Estimation Here we use a statistic to estimate the parameter of a distribution representing a population. For example if we can assume that the lifelength of a transistor is a r.v. having exponential distribution with (unknown) parameter ββ, can be estimated by some statistic, say X the mean of a random sample. Or we may say the sample mean is an estimate of the parameter β . Definition Let θ be a parameter associated with the distribution of a r.v. A statistic ∧ θ (based on a random sample of size n) is said to be an unbiased estimate (≡estimator) of θ if θθ = ∧ E . That is, ∧ θ will be on the average close to θ . Example Let X be a rv; µ the mean of X. If X is the sample mean then we know ( ) µ=XE . Thus we may say the sample mean X is an unbiased estimate of µ (Note X is a rv, a statistic, n X.....XX X n21 +++ = a function of the random sample
• 134. 129 X1, X2, ..., Xn.)
If ω1, ω2, ..., ωn are any n non-negative numbers ≤ 1 such that ω1 + ω2 + ... + ωn = 1, then we can easily see that ω1 X1 + ω2 X2 + ... + ωn Xn is also an unbiased estimate of μ. (Prove this.) X̄ is got as a special case by taking ω1 = ω2 = ... = ωn = 1/n.
Thus we have a large number of unbiased estimates for μ. Hence the question arises: if θ̂1, θ̂2 are both unbiased estimates of θ, which one do we prefer? The answer is given by the following definition.
Definition
Let θ̂1, θ̂2 both be unbiased estimates of the parameter θ. We say θ̂1 is more efficient than θ̂2 if Var θ̂1 ≤ Var θ̂2.
Remark
That is, the above definition says: prefer that unbiased estimate which is "closer" to θ. Remember the variance is a measure of the "closeness" of θ̂ to θ.
Maximum Error in estimating μ by X̄
Let X̄ be the sample mean of a random sample of size n from a population with (unknown) mean μ. Suppose we use X̄ to estimate μ. X̄ − μ is called the error in estimating μ by X̄. Can we find an upper bound on this error?
We know if X is normal (or if n is large), then by the Central Limit Theorem (X̄ − μ)/(σ/√n) is a r.v. having (approximately) the standard normal distribution. And we can say
P(−z_(α/2) < (X̄ − μ)/(σ/√n) < z_(α/2)) = 1 − α.
• 135. 130 Thus we can say with prob (1 − α) that the maximum absolute error |X̄ − μ| in estimating μ by X̄ is at most
E = z_(α/2) σ/√n.
(Here obviously we assume σ, the population s.d., is known; and z_(α/2) is that unique number such that P(Z > z_(α/2)) = α/2.) We also say that we can say with (1 − α)100 percent confidence that the maximum absolute error is at most z_(α/2) σ/√n. The book denotes this by E.
Estimation of n
Thus to find the size n of the sample so that we may say with (1 − α)100 percent confidence that the maximum absolute error is a given quantity E, we solve for n the equation
z_(α/2) σ/√n = E, i.e. n = (z_(α/2) σ/E)².
Example 1
What is the maximum error one can expect to make with prob 0.90 when using the mean of a random sample of size n = 64 to estimate the mean of a population with σ² = 2.56?
Solution
Substituting n = 64, σ = 1.6 and z_(α/2) = z_0.05 = 1.645 (note 1 − α = 0.90 implies α/2 = 0.05) in the formula for the maximum error E = z_(α/2) σ/√n, we get
E = 1.645 × 1.6/√64 = 1.645 × 0.2 = 0.3290.
Thus the maximum error one can expect to make with prob 0.90 is 0.3290.
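The two formulas above are one-liners in code. A small Python sketch (the function names are ours); the second call reproduces Example 2 below.

import math

def max_error(sigma, n, z_half_alpha):
    # maximum error E = z_(alpha/2) * sigma / sqrt(n), asserted with prob 1 - alpha
    return z_half_alpha * sigma / math.sqrt(n)

def sample_size(sigma, E, z_half_alpha):
    # smallest n with z_(alpha/2) * sigma / sqrt(n) <= E
    return math.ceil((z_half_alpha * sigma / E) ** 2)

print(max_error(1.6, 64, 1.645))      # Example 1: about 0.329
print(sample_size(20.0, 3.0, 1.96))   # 171, as in Example 2 below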
  • 136. 131 Example 2 If we want to determine the average mechanical aptitude of a large group of workers, how large a random sample will we need to be able to assert with prob 0.95 that the sample mean will not differ from the population mean by more than 3.0. points? Assume that it is known from past experience that .200=σ Solution Here 95.01 =−α so that 025.0 2 = α , hence 96.1025.0 2 == ZZα Thus we want n so that we can assert with prob 0.95 that the max error E = 3.0 74.170 3 2096.1 2 2 2 = × ==∴ E Z n σα Since n must be an integer, we take it as 171. Small Samples If the population is normal and we take a random sample of size n (n small) from it, we note n s X t µ− = (X sample mean, S = Sample s.d) is a rv having t-distribution with (n-1) degrees of freedom. Thus we can assert with prob α−1 that 22 ,1,1 αα −− ≤ nn twherett is that unique no such that ( ) 22 ,1 α α => −n ttP . Thus if we use X to estimate µ , we can assert with prob ( )α−1 that the max error will be n S tE n 2 ,1 α − = (Note : If n is large, then t is approx standard normal. Thus for n large, the above formula will become n S ZE 2 α= )
  • 137. 132 Example 3 20 fuses were subjected to a 20% overload, and the times it took them to blow had a mean x = 10.63 minutes and a s.d. S = 2.48 minutes. If we use x = 10.63 minutes as a point estimate of the true average it takes for such fuses to blow with a 20% overload, what can we assert with 95% confidence about the maximum error? Solution Here n = 20 (fuses) x = 10.63, S = 2.478 95.0 100 95 1 ==−α so that 025.0 2 = α Hence 093.2025.0,19,1 2 ==− ttn α Hence we can assert with 95% confidence (ie with prob 0.95) that the max error will be 16.1 20 48.2 093.2 2 ,1 =×== − n S tE n α Interval Estimation If X is the mean of a random sample of size n from a population with known sd σ , then we know by central limit theorem, n X Z σ µ− = is (approximately) standard normal. So we can say with prob ( )α−1 that 22 Z X Z n αα < µ− <− σ . which can be rewritten as 22 Z n XZ n X αα σ +<µ< σ −
• 138. 133 Thus we can assert with prob (1 − α) (i.e. with (1 − α)100% confidence) that μ lies in the interval
(X̄ − z_(α/2) σ/√n, X̄ + z_(α/2) σ/√n).
We refer to the above interval as a (1 − α)100% confidence interval for μ. The end points X̄ ± z_(α/2) σ/√n are known as the (1 − α)100% confidence limits for μ.
Example 4
Suppose the mean of a random sample of size 25 from a normal population (with σ = 2) is x̄ = 78.3. Obtain a 99% confidence interval for μ, the population mean.
Solution
Here n = 25, σ = 2, 1 − α = 99/100 = 0.99.
Hence α/2 = 0.005, z_(α/2) = z_0.005 = 2.575, and x̄ = 78.3.
Hence a 99% confidence interval for μ is
(x̄ − z_(α/2) σ/√n, x̄ + z_(α/2) σ/√n) = (78.3 − 2.575 × 2/√25, 78.3 + 2.575 × 2/√25)
= (78.3 − 1.0300, 78.3 + 1.0300) = (77.27, 79.33).
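A minimal Python sketch of the same interval (the function name normal_ci is ours; the z value is read from Table 3 as in the solution):

import math

def normal_ci(xbar, sigma, n, z_half_alpha):
    # (1 - alpha)100% confidence interval for mu when sigma is known
    margin = z_half_alpha * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin

# Example 4: n = 25, sigma = 2, xbar = 78.3, 99% confidence (z_0.005 = 2.575)
print(normal_ci(78.3, 2.0, 25, 2.575))   # about (77.27, 79.33)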
  • 139. 134 σ unknown Suppose X is the sample mean and S is the sample sd of a random sample of size n taken from a normal population with (unknown) mean µ . Then we know the r.v. n s X t µ− = has a t-distribution with (n-1) degrees of freedom. Thus we can say with prob α−1 that 22 ,1,1 αα −− <<− nn ttt or 2 ,1n,1n t n S X t 2 α −− < µ− <− α or n S tX n S tX 2 ,1n 2 ,1n α − α − +<µ<− Thus a ( ) %1001 α− confidence interval for µ is +− α − α − n S tX, n S tX 2 ,1n 2 ,1n Note : (1) If n is large, t has approx the standard normal distribution. In which case the ( ) %1001 α− confidence interval for µ will be +− n S Zx n S Zx 22 , αα (2) If nothing is mentioned, we assume that the sample is taken from a normal population so that the above is valid.
  • 140. 135 Example 5 Material manufactured continuously before being cut and wound into large rolls must be monitored for thickness (caliper). A sample of ten measurements on paper, in mm, yielded 32.2, 32.0, 30.4, 31.0, 31.2, 31.2, 30.3, 29.6, 30.5, 30.7 Obtain a 95% confidence interval for the mean thickness. Solution Here n = 10 262.2 025.0 2 95.01 7880.041.30 0025.0,9 2 ,1 ==∴ ==− == − tt or Sx n α α α Hence a 95% confidence interval for µ is ( )46.31,34.30 10 7880.0 262.29.30, 10 7880.0 262.29.30 = ×+×− Example 6: Ten bearings made by a certain process have a mean diameter of 0.5060 cm with a sd of 0.0040 cm. Assuming that the data may be looked upon as a random sample from a normal population, construct a 99% confidence interval for the actual average diameter of bearings made by this process.
  • 141. 136 Solution Here 0040.0,5060.0,10 === Sxn ( ) 250.3tt 005.0Hence.99.0 100 99 1 005.0,9 2 ,1n ==∴ =α==α− α − Thus a 99% confidence interval for the mean ( )5101.0,5019.0 10 0040.0 250.35060.0, 10 0040.0 250.35060.0 , 2 ,1 2 ,1 = ×+×−= +−= −− n s tx n S tx nn αα Example 7 In a random sample of 100 batteries the lifetimes have a mean of 148.2 hours with a s.d. of 24.9 hours. Construct a 76.60% confidence interval for the mean life of the batteries. Solution Here 9.24,2.148,100 === Sxn 19.1 1170.0 2 7660. 100 60.76 1 1170.01170.0,99 2 ,1 =≈= ===− − ZttThus thatso n α α α Hence a 76.60% confidence interval is ( ).2.151,2.145 100 9.24 19.12.148, 100 9.24 19.12.148 = ×+×−
  • 142. 137 Example 8 A random sample of 100 teachers in a large metropolitan area revealed a mean weekly salary of $487 with a sd of $48. With what degree of confidence can we assert that the average weekly salary of all teachers in the metropolitan area is between $472 and $502? Solution Suppose the degree of confidence is ( ) %1001 ×−α Thus 502$ 2 ,1 =+ − n S tx n α Here 100,48,487 === nSx 22 ,99 αα Zt ≈∴ Thus we get 502 10 48 487 2 =+ αZ Or 125.3 8.4 15 2 ==αZ 9982.010009.0 2 =−=∴ α α or ∴ We can assert with 99.82% confidence that the true mean salaries will be between $472 and $502. Maximum Likelihood Estimates (See exercise 7.23, 7.24) Definition Let X be a rv. Let ( ) ( )xXPxf ==θ, be the point prob function if X is discrete and let ( )θ,xf be the pdf of X if X is continuous (here θ is a parameter). Let n21 X.....X,X be a random sample of size n from X. Then the likelihood function based on the random sample is defined as
$L = L(x_1, x_2, \ldots, x_n;\theta) = f(x_1,\theta)\,f(x_2,\theta)\cdots f(x_n,\theta).$
Thus the likelihood function $L(\theta) = P(X_1 = x_1)\,P(X_2 = x_2)\cdots P(X_n = x_n)$ if X is discrete, and is the joint pdf of $X_1, \ldots, X_n$ when X is continuous. The maximum likelihood estimate (MLE) of $\theta$ is that $\hat{\theta}$ which maximizes $L(\theta)$.

Example 8
Let X be a rv having the Poisson distribution with parameter $\lambda$. Thus
$f(x,\lambda) = P(X = x) = \dfrac{e^{-\lambda}\lambda^{x}}{x!},\quad x = 0, 1, 2, \ldots$
Hence the likelihood function is
$L(\lambda) = \dfrac{e^{-\lambda}\lambda^{x_1}}{x_1!}\,\dfrac{e^{-\lambda}\lambda^{x_2}}{x_2!}\cdots\dfrac{e^{-\lambda}\lambda^{x_n}}{x_n!} = \dfrac{e^{-n\lambda}\,\lambda^{x_1+\cdots+x_n}}{x_1!\,x_2!\cdots x_n!},\quad x_i = 0, 1, 2, \ldots$
To find $\hat{\lambda}$, the value of $\lambda$ which maximizes $L(\lambda)$, we use calculus. First we take ln (log to base e, the natural logarithm):
$\ln L(\lambda) = -n\lambda + (x_1+\cdots+x_n)\ln\lambda - \ln(x_1!\cdots x_n!).$
Differentiating w.r.t. $\lambda$ (noting that $x_1, \ldots, x_n$ are not to be varied), we get
$\dfrac{1}{L}\dfrac{\partial L}{\partial\lambda} = -n + \dfrac{x_1+\cdots+x_n}{\lambda};$
setting this equal to 0 gives $\lambda = \dfrac{x_1+\cdots+x_n}{n}$.
We can 'easily' verify that $\dfrac{\partial^{2}}{\partial\lambda^{2}}\ln L < 0$ for this $\lambda$. Hence the MLE of $\lambda$ is
$\hat{\lambda} = \dfrac{x_1+\cdots+x_n}{n} = \bar{x}$ (the sample mean).

Example 9 (MLE of a Proportion)
Suppose p is the proportion of defective bolts produced by a factory. To estimate p, we proceed as follows. We take n bolts at random and calculate
$f_D$ = sample proportion of defectives $= \dfrac{\text{number of defectives found among the n chosen ones}}{n}.$
We show that $f_D$ is the MLE of p. We define a rv X as follows:
$X = \begin{cases}0 & \text{if the bolt chosen is not defective}\\ 1 & \text{if the bolt chosen is defective}\end{cases}$
Thus X has the prob distribution
x: 0, 1
Prob: $1-p$, $p$
It is clear that the point prob function $f(x;p)$ of X is given by
$f(x;p) = p^{x}(1-p)^{1-x},\quad x = 0, 1.$
(Note $f(0;p) = P(X=0) = 1-p$ and $f(1;p) = P(X=1) = p$.)
Choosing n bolts at random amounts to choosing a random sample $\{X_1, X_2, \ldots, X_n\}$ from X, where $X_i = 0$ if the i-th bolt chosen is not defective and $X_i = 1$ if it is defective ($i = 1, 2, \ldots, n$).
Hence $X_1 + X_2 + \cdots + X_n$ (can you guess?) = number of defective bolts among the n chosen. The likelihood function of the sample is
$L(p) = f(x_1;p)\,f(x_2;p)\cdots f(x_n;p) = p^{x_1+\cdots+x_n}(1-p)^{n-(x_1+\cdots+x_n)} = p^{s}(1-p)^{n-s},$
where $s = x_1+\cdots+x_n$ and $x_i = 0$ or 1 for all $i = 1, \ldots, n$.
Taking ln and differentiating (partially) w.r.t. p, we get
$\dfrac{1}{L}\dfrac{\partial L}{\partial p} = \dfrac{s}{p} - \dfrac{n-s}{1-p}.$
For a maximum, $\dfrac{\partial L}{\partial p} = 0$, or $\dfrac{s}{p} = \dfrac{n-s}{1-p}$, i.e.
$p = \dfrac{s}{n} = \dfrac{x_1+x_2+\cdots+x_n}{n} = \dfrac{\text{number of defectives among the n chosen}}{n}$ = sample proportion of defectives.
(One can easily see that this p makes $\dfrac{\partial^{2}}{\partial p^{2}}\ln L < 0$, so that L is a maximum for this p.)

Example 10
Let X be a rv having the exponential distribution with parameter $\beta$ (unknown). Hence the density of X is
$f(x;\beta) = \dfrac{1}{\beta}\,e^{-x/\beta},\quad x > 0.$
Let $\{X_1, X_2, \ldots, X_n\}$ be a random sample of size n. Hence the likelihood function is
$L(\beta) = f(x_1;\beta)\,f(x_2;\beta)\cdots f(x_n;\beta) = \dfrac{1}{\beta^{n}}\,e^{-(x_1+x_2+\cdots+x_n)/\beta},\quad x_i > 0.$
Taking ln and differentiating (partially) w.r.t. $\beta$, we get
$\dfrac{1}{L}\dfrac{\partial L}{\partial\beta} = -\dfrac{n}{\beta} + \dfrac{x_1+\cdots+x_n}{\beta^{2}} = 0$ for a maximum,
which gives $\beta = \dfrac{x_1+x_2+\cdots+x_n}{n} = \bar{x}$. Thus the sample mean $\bar{x}$ is the MLE of $\beta$.

Example 11
A r.v. X has density $f(x;\beta) = (1+\beta)\,x^{\beta},\; 0 < x < 1$. Obtain the ML estimate of $\beta$ based on a random sample $\{X_1, X_2, \ldots, X_n\}$ of size n from X.
Solution
The likelihood function is
$L(\beta) = (1+\beta)^{n}\,(x_1 x_2\cdots x_n)^{\beta},\quad 0 < x_i < 1.$
Taking ln and differentiating (partially) w.r.t. $\beta$, we get
$\dfrac{1}{L}\dfrac{\partial L}{\partial\beta} = \dfrac{n}{1+\beta} + \ln(x_1\cdots x_n) = 0$ for L to be a maximum,
which gives $1+\beta = -\dfrac{n}{\ln(x_1\cdots x_n)}$, i.e. $\hat{\beta} = -1 - \dfrac{n}{\ln(x_1\cdots x_n)}$, which is the ML estimate of $\beta$.

So far we have considered situations where the ML estimate is obtained by differentiating L and equating the derivative to zero. The following example is one where differentiation will not work.

Example 12
A rv X has the uniform density over $[0,\beta]$, i.e. the density of X is
$f(x;\beta) = \dfrac{1}{\beta},\quad 0 \le x \le \beta$ (and 0 elsewhere).
The likelihood function based on a random sample of size n from X is
$L(\beta) = f(x_1;\beta)\,f(x_2;\beta)\cdots f(x_n;\beta) = \dfrac{1}{\beta^{n}},\quad 0 \le x_1 \le \beta,\; 0 \le x_2 \le \beta,\; \ldots,\; 0 \le x_n \le \beta.$
This is a maximum when the denominator $\beta^{n}$ is least, i.e. when $\beta$ is least. But $\beta \ge x_i$ for all $i = 1, 2, \ldots, n$. Hence the least such $\beta$ is $\max\{x_1, \ldots, x_n\}$, which is the MLE of $\beta$.
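The closed-form MLEs of Examples 8 and 12 can be checked numerically. The sketch below (Python with NumPy/SciPy; the simulated data, seed and parameter values are our own illustration, not from the text) maximizes the Poisson log-likelihood directly and compares the result with the sample mean, and then takes the sample maximum for the uniform model:

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import poisson

rng = np.random.default_rng(42)
counts = rng.poisson(lam=3.2, size=500)        # simulated data, true lambda = 3.2

# Example 8: maximize the Poisson log-likelihood numerically, compare with the sample mean
neg_log_lik = lambda lam: -poisson.logpmf(counts, lam).sum()
res = minimize_scalar(neg_log_lik, bounds=(1e-6, 20), method="bounded")
print(res.x, counts.mean())                    # the two values agree to several decimals

# Example 12: for the uniform [0, beta] model the MLE is simply the sample maximum
u = rng.uniform(0, 5.0, size=200)              # true beta = 5.0
print(u.max())                                 # slightly below 5.0, as expected
```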
Estimation of a Sample Proportion

We have just seen above that if p = population proportion (i.e. the proportion of persons, things, etc. having a characteristic), then the ML estimate of p is the sample proportion. Now we would like to find a $(1-\alpha)100\%$ confidence interval for p. (This is treated in chapter 9 of your text book.)

Large Samples

Suppose we have a 'dichotomous' universe, that is, a population whose members are either "haves" or "have-nots"; a member either has a property or does not. For example, we can think of the population of all bulbs produced by a factory. Any bulb is either a "have" (i.e. defective) or a "have-not" (i.e. good), and p = proportion of haves = prob that a randomly chosen member is a "have". As another example, we can think of the population of all females in the USA. A member is a "have" (= is a blonde) or a "have-not" (= is not a blonde). As a last example, consider the population of all voters in India. A member is a "have" if he follows the BJP and a "have-not" otherwise.

To estimate p, we choose n members at random and count the number X of "haves". Thus X is a rv having the binomial distribution with parameters n and p:
$P(X = x) = f(x;p) = \binom{n}{x}p^{x}(1-p)^{n-x},\quad x = 0, 1, 2, \ldots, n,$
and if n is large, we know that the standardized binomial is approximately standard normal, i.e. for large n,
$\dfrac{X - np}{\sqrt{np(1-p)}}$
has approximately the standard normal distribution. So we can say with prob $(1-\alpha)$ that
$-z_{\alpha/2} < \dfrac{X - np}{\sqrt{np(1-p)}} < z_{\alpha/2}$, or $-z_{\alpha/2} < \dfrac{\frac{X}{n} - p}{\sqrt{\frac{p(1-p)}{n}}} < z_{\alpha/2}$,
or
$\dfrac{X}{n} - z_{\alpha/2}\sqrt{\dfrac{p(1-p)}{n}} < p < \dfrac{X}{n} + z_{\alpha/2}\sqrt{\dfrac{p(1-p)}{n}}.$
In the end points, we replace p by its MLE $\dfrac{X}{n}$ (= sample proportion). Thus we can say with prob $(1-\alpha)$ that
$\dfrac{X}{n} - z_{\alpha/2}\sqrt{\dfrac{\frac{X}{n}\left(1-\frac{X}{n}\right)}{n}} < p < \dfrac{X}{n} + z_{\alpha/2}\sqrt{\dfrac{\frac{X}{n}\left(1-\frac{X}{n}\right)}{n}}.$
Hence a $(1-\alpha)100\%$ confidence interval for p is
$\left(\dfrac{X}{n} - z_{\alpha/2}\sqrt{\dfrac{\frac{X}{n}\left(1-\frac{X}{n}\right)}{n}},\; \dfrac{X}{n} + z_{\alpha/2}\sqrt{\dfrac{\frac{X}{n}\left(1-\frac{X}{n}\right)}{n}}\right).$

Remark: We can say with prob $(1-\alpha)$ that the maximum error $\left|\dfrac{X}{n} - p\right|$ in approximating p by $\dfrac{X}{n}$ is
$E = z_{\alpha/2}\sqrt{\dfrac{p(1-p)}{n}}.$
We can replace p by $\dfrac{X}{n}$ and say that the
maximum error $= z_{\alpha/2}\sqrt{\dfrac{\frac{X}{n}\left(1-\frac{X}{n}\right)}{n}}$.
Or we note that $p(1-p)$ (for $0 \le p \le 1$) has the maximum value $\dfrac{1}{4}$, attained when $p = \dfrac{1}{2}$. Thus we can also say with prob $(1-\alpha)$ that the maximum error is
$E = z_{\alpha/2}\sqrt{\dfrac{1}{4n}}.$
This last equation tells us that to assert with prob $(1-\alpha)$ that the maximum error is at most E, n must be at least $\dfrac{1}{4}\left(\dfrac{z_{\alpha/2}}{E}\right)^{2}$.
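These formulas are easy to check numerically. Below is a small sketch in Python with SciPy (the helper names are ours); the printed values agree with the worked examples that follow (Examples 13 and 16):

```python
from math import sqrt, ceil
from scipy.stats import norm

def proportion_ci(x, n, conf):
    """Large-sample 100*conf % confidence interval for p, using p_hat = x/n."""
    p_hat = x / n
    z = norm.ppf(1 - (1 - conf) / 2)
    half = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half

def sample_size_for_error(conf, max_error):
    """Smallest n guaranteeing the stated maximum error, using p(1-p) <= 1/4."""
    z = norm.ppf(1 - (1 - conf) / 2)
    return ceil(0.25 * (z / max_error) ** 2)

print(proportion_ci(231, 400, 0.99))        # about (0.514, 0.641), as in Example 13 below
print(sample_size_for_error(0.95, 0.06))    # 267, as in Example 16 below
```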
Example 13
In a random sample of 400 industrial accidents, it was found that 231 were due at least partially to unsafe working conditions. Construct a 99% confidence interval for the corresponding true proportion p.

Solution
Here $n = 400$, $x = 231$, $1-\alpha = 0.99$, so $\alpha/2 = 0.005$ and hence $z_{\alpha/2} = 2.575$. Thus a 99% confidence interval for p is
$\left(\dfrac{x}{n} - z_{\alpha/2}\sqrt{\dfrac{\frac{x}{n}\left(1-\frac{x}{n}\right)}{n}},\; \dfrac{x}{n} + z_{\alpha/2}\sqrt{\dfrac{\frac{x}{n}\left(1-\frac{x}{n}\right)}{n}}\right) = \left(\dfrac{231}{400} - 2.575\sqrt{\dfrac{\frac{231}{400}\left(1-\frac{231}{400}\right)}{400}},\; \dfrac{231}{400} + 2.575\sqrt{\dfrac{\frac{231}{400}\left(1-\frac{231}{400}\right)}{400}}\right) = (0.5139,\; 0.6411).$

Example 14
In a sample survey of the 'safety explosives' used in certain mining operations, explosives containing potassium nitrate were found to be used in 95 out of 250 cases. If $\dfrac{95}{250} = 0.38$ is used as an estimate of the corresponding true proportion, what can we say with 95% confidence about the maximum error?

Solution
Here $n = 250$, $X = 95$, $1-\alpha = 0.95$, so $\alpha/2 = 0.025$; hence $z_{\alpha/2} = 1.96$. Hence we can say with 95% confidence that the maximum error is
$E = z_{\alpha/2}\sqrt{\dfrac{\frac{x}{n}\left(1-\frac{x}{n}\right)}{n}} = 1.96\sqrt{\dfrac{0.38\times 0.62}{250}} = 0.0602.$

Example 15
Among 100 fish caught in a large lake, 18 were inedible due to the pollution of the environment. If we use $\dfrac{18}{100} = 0.18$ as an estimate of the corresponding true proportion, with what confidence can we assert that the error of this estimate is at most 0.065?
Solution
Here $n = 100$, $X = 18$, maximum error $E = 0.065$. We note
$E = z_{\alpha/2}\sqrt{\dfrac{\frac{X}{n}\left(1-\frac{X}{n}\right)}{n}} = z_{\alpha/2}\sqrt{\dfrac{0.18\times 0.82}{100}},$
so $0.065 = z_{\alpha/2}\times 0.03842$, which gives $z_{\alpha/2} = \dfrac{0.065}{0.03842} = 1.69$. Hence $\dfrac{\alpha}{2} = 1 - 0.9545 = 0.0455$, so $\alpha = 0.0910$, i.e. $1-\alpha = 0.9190$. So we can assert with $(1-\alpha)\times 100\% = 91.9\%$ confidence that the error is at most 0.065.

Example 16
What is the size of the smallest sample required to estimate an unknown proportion to within a maximum error of 0.06 with at least 95% confidence?

Solution
Here $E = 0.06$, $1-\alpha = 0.95$, so $\alpha/2 = 0.025$ and $z_{\alpha/2} = z_{0.025} = 1.96$. Hence the smallest sample size n is
$n = \dfrac{1}{4}\left(\dfrac{z_{\alpha/2}}{E}\right)^{2} = \dfrac{1}{4}\left(\dfrac{1.96}{0.06}\right)^{2} = 266.77.$
Since n must be an integer, we take the sample size to be 267.

Remark
Read the relevant material in your text on pages 279-281 on finding the confidence interval for a proportion in the case of small samples.

Tests of Statistical Hypothesis

In many problems, instead of estimating a parameter, we must decide whether a statement concerning the parameter is true or false. For instance, one may like to test the truth of the statement: "the mean life length of a bulb is 500 hours". In fact we may even have to decide whether the mean life is 500 hours or more(!). In such situations, we have a statement whose truth or falsity we want to test. We then say we want to test the null hypothesis $H_0$: the mean life length is 500 hours. (Here onwards, when we say we want to test a statement, it shall mean we want to test whether the statement is true.) We then have another (usually called the alternative) hypothesis. We perform some 'experiment' and on the basis of that we 'decide' whether to accept the null hypothesis or reject it. (When we reject the null hypothesis, we automatically accept the alternative hypothesis.)

Example
Suppose we wish to test the null hypothesis $H_0$: the mean life length of a bulb is 500 hours, against the alternative $H_1$: the mean life length is > 500 hours. Suppose we take a random sample of 50 bulbs and find that the sample mean is 520 hours. Should we accept $H_0$ or reject $H_0$? We have to note that even though the population mean is 500 hours, the sample mean could be more or less. Similarly, even if the population mean is > 500 hours, say 550 hours, the sample mean could still be less than 550 hours. Thus whatever decision we make, there is a possibility of making an error, namely
falsely rejecting $H_0$ (when it should have been accepted), or falsely accepting $H_0$ (when it should have been rejected). We put this in tabular form as follows:

|                | Accept $H_0$     | Reject $H_0$     |
| $H_0$ is true  | Correct decision | Type I error     |
| $H_0$ is false | Type II error    | Correct decision |

Thus the Type I error is the error of falsely rejecting $H_0$, and the Type II error is the error of falsely accepting $H_0$. A good decision procedure (≡ test) is one where the prob of making these errors is small.

Notation
The prob of committing a Type I error is denoted by $\alpha$. It is also referred to as the size of the test or the level of significance of the test. The prob of committing a Type II error is denoted by $\beta$.

Example 1
Suppose we want to test the null hypothesis $\mu = 80$ against the alternative hypothesis $\mu = 83$ on the basis of a random sample of size n = 100 (assume that the population s.d. $\sigma = 8.4$). The null hypothesis is rejected if the sample mean $\bar{x} > 82$; otherwise it is accepted. What is the prob of a Type I error? The prob of a Type II error?

Solution
We know that when $\mu = 80$ (and $\sigma = 8.4$) the r.v. $\dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ has a standard normal distribution. Thus,
P(Type I error) = P(rejecting the null hypothesis when it is true)
$= P(\bar{X} > 82 \text{ given } \mu = 80) = P\!\left(\dfrac{\bar{X}-\mu}{\sigma/\sqrt{n}} > \dfrac{82-80}{8.4/10}\right) = P(Z > 2.38) = 1 - P(Z \le 2.38) = 1 - 0.9913 = 0.0087.$
Thus in roughly 1% of the cases we will be (falsely) rejecting $H_0$. Recall this is also called the size of the test or the level of significance of the test.

P(Type II error) = P(falsely accepting $H_0$) = P(accepting $H_0$ when it is false)
$= P(\bar{X} \le 82 \text{ given } \mu = 83) = P\!\left(\dfrac{\bar{X}-\mu}{\sigma/\sqrt{n}} \le \dfrac{82-83}{8.4/10}\right) = P(Z \le -1.19) = 1 - P(Z \le 1.19) = 1 - 0.8830 = 0.1170.$
Thus in roughly 12% of the cases we will be falsely accepting $H_0$.
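The two error probabilities of Example 1 can be reproduced directly from the normal cdf; a minimal Python sketch (variable names are ours):

```python
from scipy.stats import norm

mu0, mu1, sigma, n, cutoff = 80.0, 83.0, 8.4, 100, 82.0
se = sigma / n ** 0.5                        # standard error of the sample mean

alpha = 1 - norm.cdf((cutoff - mu0) / se)    # P(reject H0 | mu = 80) = P(Xbar > 82)
beta = norm.cdf((cutoff - mu1) / se)         # P(accept H0 | mu = 83) = P(Xbar <= 82)
print(round(alpha, 4), round(beta, 4))       # about 0.0087 and 0.1170, up to table rounding
```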
Definition (Critical Region)
In the previous example we rejected the null hypothesis when $\bar{x} > 82$, i.e. when $\bar{x}$ lies in the 'region' $\bar{x} > 82$ (of the horizontal axis). This portion of the horizontal axis is called the critical region and is denoted by C. Thus the critical region for the above situation is $C = \{\bar{x} > 82\}$, and remember that we reject $H_0$ when the (test) statistic $\bar{X}$ lies in the critical region (i.e. takes a value > 82). So the size of the critical region (≡ the prob, under $H_0$, that $\bar{X}$ lies in C) is the size of the test or level of significance.

(Figure: the shaded portion of the axis, $\bar{x} > 82$, is the critical region; the remaining portion is the region of acceptance of $H_0$.)

Critical Regions for Hypotheses Concerning the Mean

Let X be a rv having a normal distribution with (unknown) mean $\mu$ and (known) s.d. $\sigma$. Suppose we wish to test the null hypothesis $\mu = \mu_0$. The following table gives the critical regions (criteria for rejecting $H_0$) for various alternative hypotheses.

Null hypothesis: $\mu = \mu_0$ (normal population, $\sigma$ known); test statistic $Z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$.

| Alternative hypothesis $H_1$ | Reject $H_0$ if | Prob of Type I error | Prob of Type II error |
| $\mu = \mu_1\ (\mu_1 < \mu_0)$ | $Z < -z_\alpha$ | $\alpha$ | $F\!\left(z_\alpha - \dfrac{(\mu_0-\mu_1)\sqrt{n}}{\sigma}\right)$ |
| $\mu < \mu_0$ | $Z < -z_\alpha$ | $\alpha$ | (not determined) |
| $\mu = \mu_1\ (\mu_1 > \mu_0)$ | $Z > z_\alpha$ | $\alpha$ | $F\!\left(z_\alpha - \dfrac{(\mu_1-\mu_0)\sqrt{n}}{\sigma}\right)$ |
| $\mu > \mu_0$ | $Z > z_\alpha$ | $\alpha$ | (not determined) |
| $\mu \ne \mu_0$ | $Z < -z_{\alpha/2}$ or $Z > z_{\alpha/2}$ | $\alpha$ | (not determined) |
Here F(x) denotes the cdf of the standard normal distribution.

Remark: The prob of a Type II error is left blank when $H_1$ (the alternative hypothesis) is one of $\mu < \mu_0$, $\mu > \mu_0$, $\mu \ne \mu_0$. This is because under such a composite alternative the Type II error can happen in various ways, and so a single value for its probability cannot be determined.

Example 2
According to norms established for a mechanical aptitude test, persons who are 18 years old should average 73.2 with a standard deviation of 8.6. If 45 randomly selected persons averaged 76.7, test the null hypothesis $\mu = 73.2$ against the alternative $\mu > 73.2$ at the 0.01 level of significance.

Solution
Step I: Null hypothesis $H_0: \mu = 73.2$; alternative hypothesis $H_1: \mu > 73.2$ (thus here $\mu_0 = 73.2$).
Step II: Level of significance $\alpha = 0.01$.
Step III: Reject the null hypothesis if $Z > z_\alpha = z_{0.01} = 2.33$.
Step IV: Calculations: $Z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \dfrac{76.7 - 73.2}{8.6/\sqrt{45}} = 2.73$.
Step V: Decision: since $Z = 2.73 > 2.33 = z_\alpha$, we reject $H_0$ (at the 0.01 level of significance), i.e. we would say $\mu > 73.2$ (and the prob of falsely saying this is $\le 0.01$).
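Example 2 is a one-sided z-test and takes only a few lines in Python (a sketch; the variable names are ours):

```python
from scipy.stats import norm

mu0, sigma, n, xbar, alpha = 73.2, 8.6, 45, 76.7, 0.01

z = (xbar - mu0) / (sigma / n ** 0.5)        # observed test statistic
z_crit = norm.ppf(1 - alpha)                 # one-sided critical value, about 2.33
print(round(z, 2), z > z_crit)               # 2.73 True  -> reject H0
```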
Example 3
It is desired to test the null hypothesis $\mu = 100$ against the alternative hypothesis $\mu < 100$ on the basis of a random sample of size n = 40 from a population with $\sigma = 12$. For what values of $\bar{x}$ must the null hypothesis be rejected if the prob of a Type I error is to be $\alpha = 0.01$?

Solution
$z_\alpha = z_{0.01} = 2.33$. Hence, from the table, we reject $H_0$ if $Z < -z_\alpha = -2.33$, where $Z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \dfrac{\bar{x} - 100}{12/\sqrt{40}}$. This gives
$\bar{x} < 100 - 2.33\times\dfrac{12}{\sqrt{40}} = 95.58.$

Example 4
To test a paint manufacturer's claim that the average drying time of his new "fast-drying" paint is 20 minutes, a 'random sample' of 36 boards is painted with his new paint and his claim is rejected if the mean drying time is $\bar{x} > 20.50$ minutes. Find
(a) the prob of a Type I error,
(b) the prob of a Type II error when $\mu = 21$ minutes.
(Assume that $\sigma = 2.4$ minutes.)

Solution
Here the null hypothesis is $H_0: \mu = 20$ and the alternative hypothesis is $H_1: \mu > 20$.
P(Type I error) = P(rejecting $H_0$ when it is true). Now when $H_0$ is true, $\mu = 20$ and hence
$\dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \dfrac{\bar{X} - 20}{2.4/6} = \dfrac{6}{2.4}(\bar{X} - 20)$ is standard normal. Thus
P(Type I error) $= P(\bar{X} > 20.50 \text{ given that } \mu = 20) = P\!\left(\dfrac{\bar{X}-\mu}{\sigma/\sqrt{n}} > \dfrac{20.50 - 20}{2.4/\sqrt{36}}\right) = P(Z > 1.25) = 1 - P(Z \le 1.25) = 1 - F(1.25) = 1 - 0.8944 = 0.1056.$

(b) P(Type II error when $\mu = 21$) = P(accepting $H_0$ when $\mu = 21$)
$= P(\bar{X} \le 20.50 \text{ when } \mu = 21) = P\!\left(\dfrac{\bar{X}-\mu}{\sigma/\sqrt{n}} \le \dfrac{20.50 - 21}{2.4/\sqrt{36}}\right) = P(Z \le -1.25) = P(Z > 1.25) = 0.1056.$
Example 5
It is desired to test the null hypothesis $\mu = 100$ pounds against the alternative hypothesis $\mu < 100$ pounds on the basis of a random sample of size n = 50 from a population with $\sigma = 12$. For what values of $\bar{x}$ must the null hypothesis be rejected if the prob of a Type I error is to be $\alpha = 0.01$?

Solution
We want to test the null hypothesis $H_0: \mu = 100$ against the alternative hypothesis $H_1: \mu < 100$, given $\sigma = 12$, $n = 50$. Suppose we reject $H_0$ when $\bar{x} < C$. Then
P(Type I error) = P(rejecting $H_0$ when it is true) $= P(\bar{X} < C \text{ given } \mu = 100) = P\!\left(\dfrac{\bar{X}-\mu}{\sigma/\sqrt{n}} < \dfrac{C-100}{12/\sqrt{50}}\right) = P\!\left(Z < \dfrac{C-100}{12/\sqrt{50}}\right).$
Setting $F\!\left(\dfrac{C-100}{12/\sqrt{50}}\right) = 0.01$ implies $\dfrac{C-100}{12/\sqrt{50}} = -2.33$, or $C = 100 - 2.33\times\dfrac{12}{\sqrt{50}} = 96.05$.
Thus reject $H_0$ if $\bar{X} < 96.05$.
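Both rejection cutoffs (Example 3 with n = 40 and Example 5 with n = 50) follow from the same one-line computation; a small sketch (the helper name is ours):

```python
from scipy.stats import norm

def lower_tail_cutoff(mu0, sigma, n, alpha):
    """Reject H0: mu = mu0 in favour of mu < mu0 when the sample mean falls below this value."""
    return mu0 + norm.ppf(alpha) * sigma / n ** 0.5

print(round(lower_tail_cutoff(100, 12, 40, 0.01), 2))   # about 95.59 (Example 3 gives 95.58 with z = 2.33)
print(round(lower_tail_cutoff(100, 12, 50, 0.01), 2))   # about 96.05 (Example 5)
```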
Example 6
Suppose that for a given population with $\sigma = 8.4\ \text{in}^2$, we want to test the null hypothesis $\mu = 80.0\ \text{in}^2$ against the alternative hypothesis $\mu < 80.0\ \text{in}^2$ on the basis of a random sample of size n = 100.
(a) If the null hypothesis is rejected for $\bar{x} < 78.0\ \text{in}^2$ and otherwise accepted, what is the probability of a Type I error?
(b) What is the answer to part (a) if the null hypothesis is $\mu \ge 80.0\ \text{in}^2$ instead of $\mu = 80.0\ \text{in}^2$?

Solution
(a) Null hypothesis $H_0: \mu = 80$; alternative hypothesis $H_1: \mu < 80$; given $\sigma = 8.4$, $n = 100$.
P(Type I error) = P(rejecting $H_0$ when it is true) $= P(\bar{X} < 78.0 \text{ given } \mu = 80) = P\!\left(\dfrac{\bar{X}-\mu}{\sigma/\sqrt{n}} < \dfrac{78.0 - 80.0}{8.4/10}\right) = P(Z < -2.38) = 1 - F(2.38) = 1 - 0.9913 = 0.0087.$

(b) In this case we take the probability of a Type I error to be the maximum prob of rejecting $H_0$ when it is true, i.e. $P(\bar{x} < 78.0$ given that $\mu$ is a number $\ge 80.0)$. Now $P(\bar{x} < 78.0$ when the population mean is $\mu)$
$= P\!\left(\dfrac{\bar{x}-\mu}{\sigma/\sqrt{n}} < \dfrac{78.0 - \mu}{8.4/10}\right) = P\!\left(Z < \dfrac{10}{8.4}(78 - \mu)\right) = F\big(1.19\,(78 - \mu)\big).$
We note that the cdf of Z, viz. F(z), is an increasing function of z. Thus, when $\mu \ge 80$, $F\big(1.19(78-\mu)\big)$ is largest when $\mu$ is smallest, i.e. when $\mu = 80$. Hence
P(Type I error) $= \max_{\mu\ge 80} F\big(1.19(78-\mu)\big) = F\big(1.19\times(78-80)\big) = 0.0087.$

Example 7
If the null hypothesis $\mu = \mu_0$ is to be tested against the one-sided alternative hypothesis $\mu < \mu_0$ (or $\mu > \mu_0$), and if the prob of a Type I error is to be $\alpha$ and the prob of a Type II error is to be $\beta$ when $\mu = \mu_1$, it can be shown that this is possible when the required sample size is
$n = \dfrac{\sigma^{2}(z_\alpha + z_\beta)^{2}}{(\mu_1 - \mu_0)^{2}},$
where $\sigma^{2}$ is the population variance.
(a) It is desired to test the null hypothesis $\mu = 40$ against the alternative hypothesis $\mu < 40$ on the basis of a large random sample from a population with $\sigma = 4$. If the prob of a Type I error is to be 0.05 and the prob of a Type II error is to be 0.12 for $\mu = 38$, find the required size of the sample.
(b) Suppose we want to test the null hypothesis $\mu = 64$ against the alternative hypothesis $\mu < 64$ for a population with standard deviation $\sigma = 7.2$. How large a sample must we take if $\alpha$ is to be 0.05 and $\beta$ is to be 0.01 for $\mu = 61$? Also, for what values of $\bar{x}$ will the null hypothesis have to be rejected?
Solution
(a) Here $\alpha = 0.05$, $\beta = 0.12$, $\mu_0 = 40$, $\mu_1 = 38$, $\sigma = 4$; $z_\alpha = z_{0.05} = 1.645$, $z_\beta = z_{0.12} = 1.175$. Thus the required sample size is
$n = \dfrac{16\,(1.645 + 1.175)^{2}}{(38 - 40)^{2}} = 31.8,$
so we take $n = 32$.
(b) Here $\alpha = 0.05$, $\beta = 0.01$, $\mu_0 = 64$, $\mu_1 = 61$, $\sigma = 7.2$; $z_{0.05} = 1.645$, $z_{0.01} = 2.33$. Thus
$n \ge \dfrac{(7.2)^{2}(1.645 + 2.33)^{2}}{(61 - 64)^{2}} = 91.02,$
so we take $n = 92$. We reject $H_0$ if $Z < -z_\alpha$, i.e. if $\dfrac{\bar{X} - 64}{7.2/\sqrt{92}} < -1.645$, i.e. if $\bar{X} < 62.76$.
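The sample-size formula of Example 7 is straightforward to code; a sketch (helper name ours), with a note on how table rounding affects part (b):

```python
from math import ceil
from scipy.stats import norm

def required_n(mu0, mu1, sigma, alpha, beta):
    """n = sigma^2 (z_alpha + z_beta)^2 / (mu1 - mu0)^2, rounded up to an integer."""
    z_a, z_b = norm.ppf(1 - alpha), norm.ppf(1 - beta)
    return ceil((sigma * (z_a + z_b) / (mu1 - mu0)) ** 2)

print(required_n(40, 38, 4.0, 0.05, 0.12))   # part (a): 32
# Part (b) gives 91 with exact normal quantiles; the rounded table values 1.645 and 2.33
# used in the text give 91.02, hence the choice n = 92 there.
```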
Tests Concerning the Mean when the Sample is Small

If $\bar{X}$ is the sample mean and S the sample s.d. of a (small) random sample of size n from a normal population (with mean $\mu_0$), we know that the statistic
$t = \dfrac{\bar{X} - \mu_0}{S/\sqrt{n}}$
has a t-distribution with $(n-1)$ degrees of freedom. Thus, to test the null hypothesis $H_0: \mu = \mu_0$ against the alternative hypothesis $H_1: \mu > \mu_0$, we note that when $H_0$ is true, $P(t > t_{n-1,\alpha}) = \alpha$. Thus, if we reject the null hypothesis when $t > t_{n-1,\alpha}$, i.e. when $\bar{X} > \mu_0 + t_{n-1,\alpha}\,\dfrac{S}{\sqrt{n}}$, we shall be committing a Type I error with prob $\alpha$.

The corresponding tests when the alternative hypothesis is $\mu < \mu_0$ (or $\mu \ne \mu_0$) are described below.
Note: if n is large, we can approximate $t_{n-1,\alpha}$ by $z_\alpha$ in these tests.

Critical Regions for Testing $H_0: \mu = \mu_0$ (normal population, $\sigma$ unknown)

| Alternative hypothesis | Reject the null hypothesis if |
| $\mu < \mu_0$ | $t < -t_{n-1,\alpha}$ |
| $\mu > \mu_0$ | $t > t_{n-1,\alpha}$ |
| $\mu \ne \mu_0$ | $t < -t_{n-1,\alpha/2}$ or $t > t_{n-1,\alpha/2}$ |

where $t = \dfrac{\bar{X} - \mu_0}{S/\sqrt{n}}$ (n = sample size). In each case P(Type I error) = $\alpha$.

Example 8
A random sample of six steel beams has a mean compressive strength of 58,392 psi (pounds per square inch) with a s.d. of 648 psi. Use this information and the level of significance $\alpha = 0.05$ to test whether the true average compressive strength of the steel from which this sample came is 58,000 psi. Assume normality.

Solution
1. Null hypothesis: $\mu = \mu_0 = 58{,}000$. Alternative hypothesis: $\mu > 58{,}000$ (why?).
2. Level of significance: $\alpha = 0.05$.
3. Criterion: reject the null hypothesis if $t > t_{n-1,\alpha} = t_{5,0.05} = 2.015$.
4. Calculations:
$t = \dfrac{\bar{X} - \mu_0}{S/\sqrt{n}} = \dfrac{58{,}392 - 58{,}000}{648/\sqrt{6}} = 1.48.$
5. Decision: since $t_{\text{observed}} = 1.48 \le 2.015$, we cannot reject the null hypothesis. That is, we can say the true average compressive strength is 58,000 psi.

Example 9
Test runs with six models of an experimental engine showed that they operated for 24, 28, 21, 23, 32 and 22 minutes with a gallon of a certain kind of fuel. If the prob of a Type I error is to be at most 0.01, is this evidence against the hypothesis that on the average this kind of engine will operate for at least 29 minutes per gallon with this kind of fuel? Assume normality.

Solution
1. Null hypothesis $H_0: \mu \ge \mu_0 = 29$; alternative hypothesis $H_1: \mu < \mu_0$.
2. Level of significance: $\alpha \le 0.01$.
3. Criterion: reject the null hypothesis if $t < -t_{n-1,\alpha} = -t_{5,0.01} = -3.365$ (note n = 6), where $t = \dfrac{\bar{X} - \mu_0}{S/\sqrt{n}}$.
4. Calculations:
$\bar{X} = \dfrac{24 + 28 + 21 + 23 + 32 + 22}{6} = 25$
$S^{2} = \dfrac{1}{6-1}\big[(24-25)^{2} + (28-25)^{2} + (21-25)^{2} + (23-25)^{2} + (32-25)^{2} + (22-25)^{2}\big] = 17.6$
$\therefore\ t = \dfrac{25 - 29}{\sqrt{17.6/6}} = -2.34.$
5. Decision: since $t_{\text{obs}} = -2.34 \ge -3.365$, we cannot reject the null hypothesis. That is, we can say that this kind of engine will operate for at least 29 minutes per gallon with this kind of fuel.

Example 10
A random sample from a company's very extensive files shows that orders for a certain piece of machinery were filled, respectively, in 10, 12, 19, 14, 15, 18, 11 and 13 days. Use the level of significance $\alpha = 0.01$ to test the claim that on the average such orders are filled in 10.5 days. Choose the alternative hypothesis so that rejection of the null hypothesis $\mu = 10.5$ indicates that it takes longer than indicated. Assume normality.

Solution
1. Null hypothesis $H_0: \mu = \mu_0 = 10.5$; alternative hypothesis $H_1: \mu > 10.5$.
2. Level of significance $\alpha = 0.01$.
3. Criterion: reject the null hypothesis if $t > t_{n-1,\alpha} = t_{7,0.01} = 2.998$, where $t = \dfrac{\bar{X} - \mu_0}{S/\sqrt{n}}$ (with $\mu_0 = 10.5$, n = 8).
4. Calculations:
$\bar{X} = \dfrac{10 + 12 + 19 + 14 + 15 + 18 + 11 + 13}{8} = 14$
$S^{2} = \dfrac{1}{8-1}\big[(10-14)^{2} + (12-14)^{2} + (19-14)^{2} + (14-14)^{2} + (15-14)^{2} + (18-14)^{2} + (11-14)^{2} + (13-14)^{2}\big] = 10.29$
$\therefore\ t = \dfrac{14 - 10.5}{\sqrt{10.29/8}} = 3.09.$
5. Decision: since $t_{\text{observed}} = 3.09 > 2.998$, we have to reject the null hypothesis. That is, we can say that on the average such orders are filled in more than 10.5 days.

Example 11
Tests performed with a random sample of 40 diesel engines produced by a large manufacturer show that they have a mean thermal efficiency of 31.4% with a s.d. of 1.6%. At the 0.01 level of significance, test the null hypothesis $\mu = 32.3\%$ against the alternative hypothesis $\mu \ne 32.3\%$.

Solution
1. Null hypothesis: $\mu = \mu_0 = 32.3$. Alternative hypothesis: $\mu \ne 32.3$.
2. Level of significance $\alpha = 0.01$.
3. Criterion: reject $H_0$ if $t < -t_{n-1,\alpha/2}$ or $t > t_{n-1,\alpha/2}$, i.e. if $t < -t_{39,0.005}$ or $t > t_{39,0.005}$. Now $t_{39,0.005} \approx z_{0.005} = 2.575$, so we reject $H_0$ if $t < -2.575$ or $t > 2.575$, where $t = \dfrac{\bar{X} - \mu_0}{S/\sqrt{n}}$.
4. Calculations: $t = \dfrac{31.4 - 32.3}{1.6/\sqrt{40}} = -3.558.$
5. Decision: since $t_{\text{observed}} = -3.558 < -2.575$, we reject $H_0$; that is, we can say the mean thermal efficiency is not 32.3%.
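Small-sample t-tests such as Example 10 can be reproduced with a few lines of Python (a sketch; variable names are ours):

```python
import numpy as np
from scipy.stats import t

days = np.array([10, 12, 19, 14, 15, 18, 11, 13], dtype=float)   # Example 10 data
mu0, alpha = 10.5, 0.01
n = days.size

t_obs = (days.mean() - mu0) / (days.std(ddof=1) / np.sqrt(n))
t_crit = t.ppf(1 - alpha, df=n - 1)          # about 2.998
print(round(t_obs, 2), t_obs > t_crit)       # 3.09 True  -> reject H0
```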
Example 12
In 64 randomly selected hours of production, the mean and the s.d. of the number of acceptable pieces produced by an automatic stamping machine are $\bar{X} = 1{,}038$ and $S = 146$. At the 0.05 level of significance, does this enable us to reject the null hypothesis $\mu = 1000$ against the alternative hypothesis $\mu > 1000$?

Solution
1. Null hypothesis $H_0: \mu = \mu_0 = 1000$. Alternative hypothesis $H_1: \mu > 1000$.
2. Level of significance $\alpha = 0.05$.
3. Criterion: reject $H_0$ if $t > t_{n-1,\alpha} = t_{63,0.05}$. Now $t_{63,0.05} \approx z_{0.05} = 1.645$, so we reject $H_0$ if $t > 1.645$.
4. Calculations:
$t = \dfrac{\bar{X} - \mu_0}{S/\sqrt{n}} = \dfrac{1{,}038 - 1{,}000}{146/\sqrt{64}} = 2.082.$
5. Decision: since $t_{\text{obs}} = 2.082 > 1.645$, we reject $H_0$ at the 0.05 level of significance.
REGRESSION AND CORRELATION

Regression
A major objective of many statistical investigations is to establish relationships that make it possible to predict one or more variables in terms of others. Thus, studies are made to predict the potential sales of a new product in terms of the money spent on advertising, the patient's weight in terms of the number of weeks he/she has been on a diet, the marks obtained by a student in terms of the number of classes he attended, etc.

Although it is desirable to predict a quantity exactly in terms of the others, this is seldom possible, and in most cases we have to be satisfied with predicting averages or expected values. Thus we would like to predict the average sales in terms of the money spent on advertising, or the average income of a college graduate in terms of the number of years he/she has been out of college.

Thus, given two random variables X and Y, and given that X takes the value x, the basic problem of bivariate regression is to determine the conditional expected value E(Y|x) as a function of x. In many cases we may find that E(Y|x) is a linear function of x:
$E(Y|x) = \alpha + \beta x,$
where the constants $\alpha$, $\beta$ are called the regression coefficients. Denoting $E(X) = \mu_1$, $E(Y) = \mu_2$, $\sqrt{\operatorname{Var}(X)} = \sigma_1$, $\sqrt{\operatorname{Var}(Y)} = \sigma_2$, $\operatorname{cov}(X,Y) = \sigma_{12}$, $\rho = \dfrac{\sigma_{12}}{\sigma_1\sigma_2}$, we can show:

Theorem:
(a) If the regression of Y on X is linear, then $E(Y|x) = \mu_2 + \rho\,\dfrac{\sigma_2}{\sigma_1}(x - \mu_1)$.
(b) If the regression of X on Y is linear, then $E(X|y) = \mu_1 + \rho\,\dfrac{\sigma_1}{\sigma_2}(y - \mu_2)$.

Note: $\rho$ is called the correlation coefficient between X and Y.

In actual situations, we have to "estimate" the regression coefficients $\alpha$, $\beta$ from a random sample $\{(x_1,y_1), (x_2,y_2), \ldots, (x_n,y_n)\}$ of size n from the 2-dimensional random variable (X, Y). We now "fit" a straight line y = a + bx to the above data by the method of "least
squares". The method of least squares says: choose constants a and b for which the sum of the squares of the "vertical deviations" of the sample points $(x_i, y_i)$ from the line y = a + bx is a minimum, i.e. find a, b so that
$T = \sum_{i=1}^{n}\big[y_i - (a + bx_i)\big]^{2}$
is a minimum. Using two-variable calculus, we determine a, b so that $\dfrac{\partial T}{\partial a} = 0$ and $\dfrac{\partial T}{\partial b} = 0$. Thus we get the two equations
$\sum_{i=1}^{n}(-2)\big[y_i - (a + bx_i)\big] = 0$ and $\sum_{i=1}^{n}(-2x_i)\big[y_i - (a + bx_i)\big] = 0.$
Simplifying, we get the so-called "normal equations":
$na + \Big(\sum_{i=1}^{n}x_i\Big)b = \sum_{i=1}^{n}y_i$
$\Big(\sum_{i=1}^{n}x_i\Big)a + \Big(\sum_{i=1}^{n}x_i^{2}\Big)b = \sum_{i=1}^{n}x_i y_i$
Solving, we get
$b = \dfrac{n\sum_{i=1}^{n}x_i y_i - \big(\sum_{i=1}^{n}x_i\big)\big(\sum_{i=1}^{n}y_i\big)}{n\sum_{i=1}^{n}x_i^{2} - \big(\sum_{i=1}^{n}x_i\big)^{2}},\qquad a = \dfrac{\sum_{i=1}^{n}y_i - b\sum_{i=1}^{n}x_i}{n}.$
These constants a and b are used to estimate the unknown regression coefficients $\alpha$, $\beta$. Now if $x = x_0$, we predict y as $y_0 = a + bx_0$.

Problem 1
Various doses of a poisonous substance were given to groups of 25 mice and the following results were observed:

| Dose (mg) x | Number of deaths y |
| 4  | 1  |
| 6  | 3  |
| 8  | 6  |
| 10 | 8  |
| 12 | 14 |
| 14 | 16 |
| 16 | 20 |
(a) Find the equation of the least-squares line fitted to these data.
(b) Estimate the number of deaths in a group of 25 mice that receive a 7 mg dose of this poison.

Solution
(a) n = number of sample pairs $(x_i, y_i)$ = 7; $\sum x_i = 70$, $\sum y_i = 68$, $\sum x_i^{2} = 812$, $\sum x_i y_i = 862$. Hence
$b = \dfrac{7\times 862 - 70\times 68}{7\times 812 - (70)^{2}} = \dfrac{1274}{784} = 1.625,\qquad a = \dfrac{68 - 70\times 1.625}{7} = -6.536.$
Thus the least-squares line that fits the given data is y = -6.536 + 1.625x.
(b) If x = 7, y = -6.536 + 1.625 × 7 = 4.839.
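The normal-equation solution of Problem 1 can be verified in a few lines of Python with NumPy (a sketch; variable names are ours, and np.polyfit(x, y, 1) would give the same coefficients):

```python
import numpy as np

x = np.array([4, 6, 8, 10, 12, 14, 16], dtype=float)   # dose (mg), Problem 1
y = np.array([1, 3, 6, 8, 14, 16, 20], dtype=float)    # number of deaths

n = x.size
b = (n * np.sum(x * y) - x.sum() * y.sum()) / (n * np.sum(x ** 2) - x.sum() ** 2)
a = (y.sum() - b * x.sum()) / n
print(round(a, 3), round(b, 3))    # about -6.536 and 1.625
print(round(a + b * 7, 3))         # predicted deaths at a 7 mg dose, about 4.839
```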
Problem 2
The following are the scores that 12 students obtained in the midterm and final examinations in a course in Statistics:

| Midterm examination x | Final examination y |
| 71 | 83 |
| 49 | 62 |
| 80 | 76 |
| 73 | 77 |
| 93 | 89 |
| 85 | 74 |
| 58 | 48 |
| 82 | 78 |
| 64 | 76 |
| 32 | 51 |
| 87 | 73 |
| 80 | 89 |

(a) Fit a straight line to the above data.
(b) Hence predict the final exam score of a student who received a score of 84 in the midterm examination.

Solution
(a) n = number of sample pairs $(x_i, y_i)$ = 12; $\sum x_i = 854$, $\sum y_i = 876$, $\sum x_i^{2} = 64222$, $\sum x_i y_i = 64346$. Hence
$b = \dfrac{12\times 64346 - 854\times 876}{12\times 64222 - (854)^{2}} = \dfrac{24048}{41348} = 0.5816,\qquad a = \dfrac{876 - 854\times 0.5816}{12} = 31.609.$
Thus the least-squares line that fits the given data is y = 31.609 + 0.5816x.
(b) If x = 84, y = 31.609 + 0.5816 × 84 = 80.46.

Correlation
If X, Y are two random variables, the correlation coefficient $\rho$ between X and Y is defined as
$\rho = \dfrac{\operatorname{cov}(X,Y)}{\sqrt{\operatorname{Var}(X)\operatorname{Var}(Y)}}.$
It can be shown that
(a) $-1 \le \rho \le 1$;
(b) if Y is a linear function of X, then $\rho = \pm 1$;
(c) if X and Y are independent, then $\rho = 0$;
(d) if X, Y have a bivariate normal distribution and $\rho = 0$, then X and Y are independent.

Sample Correlation Coefficient
If $\{(x_1,y_1), (x_2,y_2), \ldots, (x_n,y_n)\}$ is a random sample of size n from the 2-dimensional random variable (X, Y), then the sample correlation coefficient r is defined by
$r = \dfrac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^{2}\,\sum_{i=1}^{n}(y_i - \bar{y})^{2}}}.$
We shall use r to estimate the (unknown) population correlation coefficient $\rho$. If (X, Y) has a bivariate normal distribution, we can show that the random variable
$Z = \dfrac{1}{2}\ln\dfrac{1+r}{1-r}$
is approximately normal with mean $\dfrac{1}{2}\ln\dfrac{1+\rho}{1-\rho}$ and variance $\dfrac{1}{n-3}$.

Note: A computational formula for r is
$r = \dfrac{S_{xy}}{\sqrt{S_{xx}S_{yy}}},$
where
$S_{xx} = \sum(x_i - \bar{x})^{2} = \sum x_i^{2} - \dfrac{\big(\sum x_i\big)^{2}}{n},\qquad S_{yy} = \sum(y_i - \bar{y})^{2} = \sum y_i^{2} - \dfrac{\big(\sum y_i\big)^{2}}{n},\qquad S_{xy} = \sum(x_i - \bar{x})(y_i - \bar{y}) = \sum x_i y_i - \dfrac{\big(\sum x_i\big)\big(\sum y_i\big)}{n}.$

Problem 3
Calculate r for the data {(8, 3), (1, 4), (5, 0), (4, 2), (7, 1)}.

Solution
$\bar{x} = 25/5 = 5$, $\bar{y} = 10/5 = 2$.
$\sum(x_i - \bar{x})(y_i - \bar{y}) = 3\times 1 + (-4)\times 2 + 0\times(-2) + (-1)\times 0 + 2\times(-1) = -7$
$\sum(x_i - \bar{x})^{2} = 9 + 16 + 0 + 1 + 4 = 30$
$\sum(y_i - \bar{y})^{2} = 1 + 4 + 4 + 0 + 1 = 10$
Hence $r = \dfrac{-7}{\sqrt{(30)(10)}} = -0.404.$
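The sample correlation coefficient is equally quick to compute by machine; a short Python sketch for Problem 3 (variable names are ours):

```python
import numpy as np

x = np.array([8, 1, 5, 4, 7], dtype=float)    # Problem 3 data
y = np.array([3, 4, 0, 2, 1], dtype=float)

s_xy = np.sum((x - x.mean()) * (y - y.mean()))
s_xx = np.sum((x - x.mean()) ** 2)
s_yy = np.sum((y - y.mean()) ** 2)
print(round(s_xy / np.sqrt(s_xx * s_yy), 3))  # about -0.404
print(np.corrcoef(x, y)[0, 1])                # the same value straight from NumPy
```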
Problem 4
The following are measurements of the air velocity and evaporation coefficient of burning fuel droplets in an impulse engine:

| Air velocity x | Evaporation coefficient y |
| 20  | 0.18 |
| 60  | 0.37 |
| 100 | 0.35 |
| 140 | 0.78 |
| 180 | 0.56 |
| 220 | 0.75 |
| 260 | 1.18 |
| 300 | 1.36 |
| 340 | 1.17 |
| 380 | 1.65 |

Find the sample correlation coefficient r.

Solution
$S_{xx} = \sum x_i^{2} - \dfrac{\big(\sum x_i\big)^{2}}{n} = 532000 - \dfrac{(2000)^{2}}{10} = 132000$
$S_{yy} = \sum y_i^{2} - \dfrac{\big(\sum y_i\big)^{2}}{n} = 9.1097 - \dfrac{(8.35)^{2}}{10} = 2.13745$
$S_{xy} = \sum x_i y_i - \dfrac{\big(\sum x_i\big)\big(\sum y_i\big)}{n} = 2175.4 - \dfrac{(2000)(8.35)}{10} = 505.4$
Hence
$r = \dfrac{S_{xy}}{\sqrt{S_{xx}S_{yy}}} = \dfrac{505.4}{\sqrt{(132000)(2.13745)}} = 0.9515.$

**************