Measures of central tendency
• Measures of central tendency are summary statistics used to indicate the central location of
a group of data values.
• A measure of central tendency may also be called a center or location of the distribution.
1. Objectives of central tendency:
To facilitate comparison.
To describe the characteristics of the entire group.
To help in decision making.
To learn about the universe (population) from a sample.
2. Requisites for an ideal measure of central tendency:
It should be rigidly defined.
It should be simple to understand.
It should be easy to calculate.
It should be suitable for further mathematical treatment.
It should be least affected by fluctuations of sampling.
It should not be affected by extreme observations.
3. Types of central tendency:
Mean (Mathematical average)
Median (Positional average)
Mode (Positional average)
A. Arithmetic mean:
The arithmetic mean (A.M) of a set of data may be defined as the sum of the observations divided
by the number of observations.
The mean is the most commonly used measure of central tendency.
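$\bar{X} = \dfrac{\sum X}{n}$
1. Individual series
(a) Direct method: $\bar{X} = \dfrac{\sum X}{n}$
(b) Shortcut method: $\bar{X} = A + \dfrac{\sum d}{n}$
Where, A = assumed mean, d = X - A and n = number of observations
2. Discrete series
(a) Direct method: $\bar{X} = \dfrac{\sum fX}{N}$
(b) Shortcut method: $\bar{X} = A + \dfrac{\sum fd}{N}$
Where, A = assumed mean, d = X - A and N = total frequency
3. Continuous series
(a) Direct method: $\bar{X} = \dfrac{\sum fm}{N}$
Where, m = mid value of the class
(b) Shortcut method: $\bar{X} = A + \dfrac{\sum fd}{N}$
Where, A = assumed mean, d = m - A and N = total frequency
(c) Step-deviation method: $\bar{X} = A + \dfrac{\sum fd'}{N} \times h$
Where, $d' = \dfrac{m - A}{h}$ and h = class size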
Example:
Q. The health expenditure of 4 families, in rupees, is given below. Calculate the arithmetic mean.
Family                     A      B      C      D
Health expenditure (Rs)    3000   4000   1500   3500
Solution:
Family    Health expenditure (Rs), X
A         3000
B         4000
C         1500
D         3500
Total     ΣX = 12000
Here, n = 4 and ΣX = 12000.
$\text{A.M} = \bar{X} = \dfrac{\sum X}{n} = \dfrac{12000}{4} = 3000$
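The direct and shortcut methods can be checked against each other in a few lines. The following is a minimal sketch, not part of the original slides; the assumed mean A = 3500 is an arbitrary choice made for illustration.

```python
# Health expenditure (Rs) of the four families in the example above.
expenditure = {"A": 3000, "B": 4000, "C": 1500, "D": 3500}
values = list(expenditure.values())
n = len(values)

# Direct method: sum of the observations divided by their number.
mean_direct = sum(values) / n

# Shortcut method: choose an assumed mean A, then average the deviations d = X - A.
A = 3500  # assumed mean (arbitrary, chosen only for this illustration)
mean_shortcut = A + sum(x - A for x in values) / n

print(mean_direct, mean_shortcut)
```

Both methods give the same value whatever assumed mean is chosen, because the deviations are measured from A and A is added back at the end.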
4)central tendency and dispersion biostatistics
Combined mean
If $\bar{X}_1$ and $\bar{X}_2$ are the means of two different groups having frequencies $N_1$ and
$N_2$, the combined mean is given by:
$\bar{X}_{12} = \dfrac{N_1\bar{X}_1 + N_2\bar{X}_2}{N_1 + N_2}$
Example:
Q. The mean monthly salary of 10 lady doctors is Rs. 400 and that of 20 male doctors
is Rs. 600. Calculate the mean monthly salary of all doctors taken together.
Solution:
Here, $N_1 = 10$, $N_2 = 20$, $\bar{X}_1 = 400$ and $\bar{X}_2 = 600$.
Hence,
$\bar{X}_{12} = \dfrac{N_1\bar{X}_1 + N_2\bar{X}_2}{N_1 + N_2} = \dfrac{10 \times 400 + 20 \times 600}{10 + 20} = \dfrac{4000 + 12000}{30} = 533.33$
Therefore, the mean monthly salary = Rs. 533.33
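The combined mean formula extends naturally to any number of groups. The sketch below is an illustration only; the function name `combined_mean` is hypothetical and not from the slides.

```python
def combined_mean(means, sizes):
    """Combined mean of several groups from their means and sizes."""
    # Each group contributes (group mean x group size) to the grand total.
    total = sum(m * n for m, n in zip(means, sizes))
    return total / sum(sizes)

# The doctors' salary example: 10 lady doctors at Rs. 400, 20 male doctors at Rs. 600.
salary = combined_mean([400, 600], [10, 20])
print(round(salary, 2))
```

Note that the result is not the simple average of 400 and 600, because the larger group pulls the combined mean towards its own mean.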
Weighted arithmetic mean
The arithmetic average discussed above is the simple arithmetic average, in which all the items are
assumed to be equally important in the distribution. In practice this may not be so: some items
in a distribution may be more important than others. In such cases proper weight should be
given to the various items. We therefore define the weighted arithmetic average, in which the
weight of each item is taken into account.
Let $w_1, w_2, w_3, \ldots, w_n$ be the weights given to the variate values $x_1, x_2, x_3, \ldots, x_n$
respectively. Then their weighted arithmetic mean, denoted by $\bar{x}_w$, is defined by
$\bar{x}_w = \dfrac{w_1x_1 + w_2x_2 + \cdots + w_nx_n}{w_1 + w_2 + \cdots + w_n} = \dfrac{\sum wx}{\sum w}$
Example:
Q. A contractor employs three types of workers: male, female and children. He pays a male
worker Rs. 10 per day, a female worker Rs. 8 per day and a child worker Rs. 3 per day. If the
numbers of male, female and child workers employed are 20, 15 and 5 respectively, what is the
average wage?
Solution:
Here, the suitable average is the weighted mean.
Calculation of weighted average
Wages per day (X)    No. of workers (W)    WX
10                   20                    200
8                    15                    120
3                    5                     15
Total                ΣW = 40               ΣWX = 335
We have,
$\bar{x}_w = \dfrac{\sum WX}{\sum W} = \dfrac{335}{40} = 8.375 \approx 8.38$
Hence, the average wage is Rs. 8.38 per day.
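The effect of the weights is easiest to see by comparing the weighted mean with the simple mean of the three wage rates. This is a minimal sketch using the figures of the example; the variable names are illustrative.

```python
wages = [10, 8, 3]      # Rs per day: male, female, child (X)
workers = [20, 15, 5]   # number of workers of each type (W)

# Weighted mean: sum of W*X divided by sum of W.
weighted_mean = sum(w * x for w, x in zip(workers, wages)) / sum(workers)

# Simple mean of the three rates, which ignores how many workers earn each rate.
simple_mean = sum(wages) / len(wages)

print(weighted_mean, simple_mean)
```

The simple mean understates the average wage here because most workers are paid at the two higher rates.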
Geometric mean:
The geometric mean of n positive variate values is the nth root of their product.
$G = \sqrt[n]{x_1 \cdot x_2 \cdot x_3 \cdots x_n}$
It is used for rates, ratios, percentage variate values and exponentially expressed values.
Harmonic mean:
The harmonic mean is the reciprocal of the arithmetic mean of the reciprocals of a set of non-zero
variate values. It is used for rate and ratio type variables.
$H = \dfrac{n}{\sum \frac{1}{x}}$
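As a rough illustration of the two definitions, the sketch below computes the arithmetic, geometric and harmonic means of a small data set. The values 2, 4 and 8 are hypothetical and chosen only because their product is a perfect cube.

```python
import math

values = [2, 4, 8]  # hypothetical positive values, for illustration only
n = len(values)

am = sum(values) / n                      # arithmetic mean
gm = math.prod(values) ** (1 / n)         # nth root of the product
hm = n / sum(1 / x for x in values)       # reciprocal of the mean of the reciprocals

print(am, gm, hm)
```

For positive values the three means always satisfy A.M ≥ G.M ≥ H.M, with equality only when all the values are equal.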
Merits And Demerits Of Arithmetic Mean
1. MERITS:
It is rigidly defined.
It is based on all observations.
It is simple to understand and easy to calculate.
It is suitable for further mathematical treatment.
It is least affected by fluctuations of sampling.
2. DEMERITS:
It is very much affected by extreme observations.
It cannot be computed in case of open-end classes.
It sometimes gives fallacious conclusions.
It cannot be determined by inspection or by graphical method.
It cannot be used if we are dealing with qualitative characteristics which cannot be
measured quantitatively.
B. MEDIAN
The median is the value which divides the distribution into two equal
parts, provided the observations are arranged in order
of magnitude.
If the number of observations in a series is odd, the median is
the middle value; if the number of observations is even,
the median is the midpoint between the two middle values.
A. For individual series:
Arrange the data in ascending or descending order
of magnitude and apply the formula
$\text{Median} = \text{size of } \left(\dfrac{n+1}{2}\right)^{\text{th}} \text{ item}$
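B. For discrete series:
Arrange the data in ascending or descending order of magnitude.
Obtain the cumulative frequency (c.f.).
Apply the formula
$\text{Median} = \text{size of } \left(\dfrac{N+1}{2}\right)^{\text{th}} \text{ item}$
Now look at the c.f. column and find the c.f. which is either equal to or just
greater than $\dfrac{N+1}{2}$; the corresponding value of the variate is the median.
C. For continuous series:
$\text{Median} = L + \dfrac{\frac{N}{2} - c.f.}{f} \times h$
Where, L = lower limit of the median class, c.f. = cumulative frequency of the class preceding
the median class, f = frequency of the median class and h = size of the median class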
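The continuous-series formula can be sketched as a short function. This is only an illustration: the function name `grouped_median` and the frequency table are hypothetical, and the classes are assumed to be contiguous and listed in ascending order.

```python
def grouped_median(classes):
    """Median of a continuous series.

    classes: list of (lower_limit, upper_limit, frequency) in ascending order.
    """
    N = sum(f for _, _, f in classes)
    cf = 0  # cumulative frequency of the classes before the current one
    for lower, upper, f in classes:
        # The median class is the first class whose cumulative frequency reaches N/2.
        if f > 0 and cf + f >= N / 2:
            h = upper - lower  # class size
            return lower + (N / 2 - cf) / f * h
        cf += f
    raise ValueError("frequency table is empty")

# Hypothetical frequency table, for illustration only.
table = [(0, 10, 5), (10, 20, 8), (20, 30, 12), (30, 40, 5)]
median = grouped_median(table)
print(round(median, 2))
```

Here N = 30, so N/2 = 15 falls in the class 20 to 30, whose preceding cumulative frequency is 13.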
1. Merits:
It is easy to understand and simple to compute.
It is rigidly defined.
It is not influenced by extreme values.
It can be computed even for open-end classes.
The median is the only average that can be used while dealing with
qualitative characteristics such as intelligence, beauty
etc.
2. Demerits:
Arrangement of data is necessary.
It is not based on all the observations.
It is not suitable for further mathematical treatment.
The exact value cannot be determined when the number of observations is even.
C. Mode:
The mode is the variate value which repeats the maximum number of times.
It is used in business for forecasting rates of goods, but is rarely used in
medical sciences.
For individual and discrete series the mode can easily be found by
inspection. For continuous series, we use the following formula
to calculate the mode.
$\text{Mode} = L + \dfrac{\Delta_1}{\Delta_1 + \Delta_2} \times h$
Where, L = lower limit of the modal class,
$\Delta_1 = f_1 - f_0$,
$\Delta_2 = f_1 - f_2$,
$f_1$ = maximum frequency (frequency of the modal class),
$f_2$ = frequency following the modal class,
$f_0$ = frequency preceding the modal class,
h = size of the modal class
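A minimal sketch of the continuous-series mode formula follows. The function name `grouped_mode` and the frequency table are hypothetical; treating a missing neighbouring class as frequency 0 is an assumption made here, not something stated in the slides.

```python
def grouped_mode(classes):
    """Mode of a continuous series.

    classes: list of (lower_limit, upper_limit, frequency) in ascending order.
    """
    freqs = [f for _, _, f in classes]
    i = freqs.index(max(freqs))            # first class with the maximum frequency
    lower, upper, f1 = classes[i]
    # Assumption: a missing neighbouring class is treated as having frequency 0.
    f0 = freqs[i - 1] if i > 0 else 0
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0
    d1, d2 = f1 - f0, f1 - f2              # delta_1 and delta_2 of the formula
    return lower + d1 / (d1 + d2) * (upper - lower)

# Hypothetical frequency table, for illustration only.
table = [(0, 10, 5), (10, 20, 8), (20, 30, 12), (30, 40, 5)]
mode = grouped_mode(table)
print(round(mode, 2))
```

The modal class is 20 to 30 with f1 = 12, f0 = 8 and f2 = 5, so the mode lies nearer the lower limit of that class.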
Merits:
It is easy to calculate and simple to understand.
It is not affected by extreme observations.
It can be calculated for open end classes.
It can be obtained by inspection or by graph.
Demerits :
Mode is not rigidly defined.
Mode is not based on all observations.
Mode is not suitable for further mathematical treatment.
Mode is affected by fluctuations of sampling.
Relation between various measures of central tendency:
1. Mode = 3 Median – 2 Mean (empirical relationship)
2. A.M ≥ G.M ≥ H.M
Example:
Q. The mean weight of 150 students in a certain class is 60 kg. The mean
weight of boys in the class is 70 kg and that of girls is 55 kg. Find the
number of boys and girls in the class.
Solution:
Let $N_1$ be the number of boys and $N_2$ the number of girls.
Here, $\bar{X}_{12} = 60$, $N_1 + N_2 = 150$ so that $N_1 = 150 - N_2$, $\bar{X}_1 = 70$, $\bar{X}_2 = 55$, $N_1 = ?$ and $N_2 = ?$
We have,
$\bar{X}_{12} = \dfrac{N_1\bar{X}_1 + N_2\bar{X}_2}{N_1 + N_2}$ ……….. (i)
Or, $60 = \dfrac{70N_1 + 55N_2}{150}$
Or, $9000 = 70N_1 + 55N_2$ ……….. (ii)
Putting $N_1 = 150 - N_2$ in eq. (ii),
$9000 = 70(150 - N_2) + 55N_2$
Or, $9000 = 10500 - 70N_2 + 55N_2$
Or, $15N_2 = 10500 - 9000$
Or, $15N_2 = 1500$
∴ $N_2 = 100$
Putting the value of $N_2$ in $N_1 + N_2 = 150$,
$N_1 + 100 = 150$
∴ $N_1 = 50$
Hence, the class has 50 boys ($N_1 = 50$) and 100 girls ($N_2 = 100$).
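The same substitution can be written once in closed form: solving the combined mean equation for the second group gives N2 = N (X̄1 − X̄) / (X̄1 − X̄2). The sketch below applies it; the function name `split_sizes` is hypothetical.

```python
def split_sizes(total, overall_mean, mean1, mean2):
    """Sizes of two groups from the total size, the combined mean and both group means."""
    # From total * overall_mean = mean1 * (total - n2) + mean2 * n2, solved for n2.
    n2 = total * (mean1 - overall_mean) / (mean1 - mean2)
    return total - n2, n2

# The class example: 150 students, combined mean 60 kg, boys 70 kg, girls 55 kg.
boys, girls = split_sizes(150, 60, 70, 55)
print(boys, girls)
```

The formula requires the two group means to differ; if they are equal, the combined mean carries no information about the group sizes.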
Selection of an average:
No single average is suitable for all conditions. The selection of an
average depends upon the nature of the data.
1. Mean, median and mode can all be used for symmetrical data.
2. Mean is suggested when:
i) The average of quantitative data is to be calculated, but it should
not be used under the following conditions:
a. When the data are highly skewed.
b. When the distribution has open-end classes.
c. When the distribution has extreme observations.
3. Median is used for:
i) Qualitative data such as intelligence, beauty etc.
ii) Open-end classes.
iii) Highly skewed distributions.
4. Mode: It is useful for finding the most repeated value, particularly in business,
and for highly skewed distributions.
5. Geometric mean:
i) It is used to calculate averages of rates, ratios and percentages.
ii) It is used in the construction of index numbers.
6. Harmonic mean:
i) It is used in computing averages related to rates and ratios where a
time factor is involved.
Numerical problems:
1. In a class of 50 students, 10 have failed and their average mark is
2.5. The total marks secured by the entire class were 281. Find the
average marks of those who have passed. (Ans = 6.4)
2. The mean age of a combined group of men and women is 30 years.
If the mean age of the group of men is 32 and that of the group of women is
27, find the percentage of men and women in the group. (Ans = 60% men and 40% women)
3. The mean of 100 items was 50. Later it was found that two items
were misread as 92 and 8 instead of 192 and 88. Find the correct
mean. (Ans = 51.8)
4. The arithmetic mean of 98 items is 50. Two items, 60 and 70, were left out
at the time of calculation. What is the correct mean of all the items?
(Ans = 50.3)
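Each of the four problems reduces to working with totals (mean × number of items). The sketch below is one possible way to check the stated answers; the variable names are illustrative.

```python
# Problem 1: remove the failed students' total from the class total.
passed_mean = (281 - 10 * 2.5) / (50 - 10)

# Problem 2: combined mean 30 lies between 32 (men) and 27 (women);
# the share of men is (30 - 27) / (32 - 27).
percent_men = (30 - 27) / (32 - 27) * 100
percent_women = 100 - percent_men

# Problem 3: take out the misread values and put in the correct ones.
corrected_mean = (100 * 50 - (92 + 8) + (192 + 88)) / 100

# Problem 4: add the two omitted items to the total and to the count.
full_mean = (98 * 50 + 60 + 70) / (98 + 2)

print(passed_mean, percent_men, percent_women, corrected_mean, full_mean)
```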
Measures of position (partition values):
Partition values are those variate values which divide the total
number of observations into a number of equal parts. The number of equal
parts may be four, ten, a hundred etc.
1. Quartiles
Quartiles are those positional values which divide the ordered series
into four equal parts, so there are 3 quartiles.
Fig: Presentation of data by quartiles (Lowest value = 0%, Q1 = 25%, Q2 = Md = 50%, Q3 = 75%, Highest value = 100%)
a. First quartile (lower quartile):
Q1 has 25% of the observations before it and 75% of the observations after it.
b. Second quartile (Median):
Q2 coincides with the median. The median divides the total number of
observations into two halves.
c. Third quartile (upper quartile):
Q3 has 75% of the observations before it and 25% of the observations after it.
a. For individual and discrete series:
$Q_i = \text{value of } \left(\dfrac{i(N+1)}{4}\right)^{\text{th}} \text{ item}$
Where, i = 1, 2, 3
b. For continuous series:
$Q_i = L + \dfrac{\frac{iN}{4} - c.f.}{f} \times h$
Where, i = 1, 2, 3
Deciles:
Deciles are those positional values which divide the ordered
series into 10 equal parts. So there are 9 deciles.
a. For individual series and discrete series:
$D_j = \text{value of } \left(\dfrac{j(N+1)}{10}\right)^{\text{th}} \text{ item}$
Where, j = 1, 2, 3, ……, 9
b. For continuous series:
$D_j = L + \dfrac{\frac{jN}{10} - c.f.}{f} \times h$
Where, j = 1, 2, 3, ……, 9
Percentiles:
Percentiles are those positional values which divide the ordered series
into 100 equal parts. So, there are 99 percentiles.
a. For individual series and discrete series:
$P_k = \text{value of } \left(\dfrac{k(N+1)}{100}\right)^{\text{th}} \text{ item}$
Where, k = 1, 2, 3, ……, 99
b. For continuous series:
$P_k = L + \dfrac{\frac{kN}{100} - c.f.}{f} \times h$
Where, k = 1, 2, 3, ……, 99
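Quartiles, deciles and percentiles for an individual series all use the same position rule, i(N+1)/parts, so one helper can cover them. The sketch below is illustrative: the function name `partition_value` and the data are hypothetical, and the handling of fractional positions (linear interpolation between neighbouring items, clamped to the smallest and largest item) is an assumption, since the slides only give the position formula.

```python
def partition_value(data, i, parts):
    """i-th partition value of an individual series split into `parts` equal parts.

    parts = 4 gives quartiles, 10 deciles, 100 percentiles.
    """
    xs = sorted(data)
    pos = i * (len(xs) + 1) / parts   # 1-based position of the required item
    lo = int(pos)                     # whole part of the position
    frac = pos - lo                   # fractional part of the position
    # Assumption: positions outside the series are clamped to its ends.
    if lo < 1:
        return xs[0]
    if lo >= len(xs):
        return xs[-1]
    # Assumption: fractional positions are linearly interpolated.
    return xs[lo - 1] + frac * (xs[lo] - xs[lo - 1])

data = [2, 4, 6, 8, 10, 12, 14]       # hypothetical ordered series, N = 7
q1 = partition_value(data, 1, 4)      # 2nd item
q3 = partition_value(data, 3, 4)      # 6th item
d5 = partition_value(data, 5, 10)     # 4th item, same as the median
print(q1, q3, d5)
```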
Measures of dispersion (variability)
Averages give us an idea of the concentration of the items around the
central part of the distribution. But averages do not give a clear
picture of the distribution, because two distributions with the same
averages may differ in the scatter of the items about the central value.
Series   Items                    Mean   Mode (Mo)   Median (Md)
A        25 26 27 27 27 28 29     27     27          27
B        0  10 18 27 27 27 80     27     27          27
From the above table, we see that the mean, median and mode of the two
series A and B are the same. From these results alone we cannot say that the
two series are similar, because the differences of the items from
the average are larger in B than in A. In series A the items are
concentrated closely around the central value, whereas in series B the
items are scattered more widely about it. Hence, though the two
series have the same averages, they cannot be called similar because
they are differently constituted.
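The table can be reproduced with Python's standard `statistics` module. In this sketch the spread is summarised simply as largest item minus smallest item, which is enough to show that the two series differ; the helper name `summarize` is hypothetical.

```python
import statistics

series_a = [25, 26, 27, 27, 27, 28, 29]
series_b = [0, 10, 18, 27, 27, 27, 80]

def summarize(series):
    """Averages of a series plus a simple indication of its spread."""
    return {
        "mean": statistics.mean(series),
        "median": statistics.median(series),
        "mode": statistics.mode(series),
        "spread": max(series) - min(series),  # largest item minus smallest item
    }

summary_a = summarize(series_a)
summary_b = summarize(series_b)
print(summary_a)
print(summary_b)
```

All three averages agree between the two series, while the spread of B is many times that of A.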
Definition:
Dispersion is the scatter of the items about the central value, or a
measure of the variation of the items from the central value.
Main objectives of measuring variability are:
1. To determine the reliability of an average.
2. To compare two or more series with regard to their variability.
3. To help in using other statistical measures.
Absolute Measure Of Dispersion:
A measure of dispersion is said to be absolute if it is
expressed in terms of the original units of the data.
Relative Measure Of Dispersion:
A measure of dispersion is said to be relative if it is independent of
units of the data.
Requisites of ideal measure of dispersion:
It should be rigidly defined.
It should be simple to understand and easy to calculate.
It should be based on all observations.
It should be suitable for further mathematical treatment.
It should be least affected by fluctuation of sampling.
It should not be affected by extreme values.
Methods of measuring dispersion:
1. Range
2. Quartile deviation (Semi-interquartile range)
3. Mean deviation(Average deviation)
4. Standard deviation
Range:
Range is the simplest measure of dispersion. It is defined as the
difference between the largest item and the smallest item in a set of
observations.
Range = Largest item – Smallest item
= L – S
Coefficient of Range:
Coefficient of range is the relative measure corresponding to the range.
It can be used to compare two distributions with different units.
$\text{Coefficient of Range} = \dfrac{L - S}{L + S}$
Merits:
It is rigidly defined.
It is simple to understand and easy to calculate.
Variation can be understood in short time.
Demerits
It is not based on all observations.
It is affected by fluctuations of sampling.
It is affected by extreme values.
It is not suitable for further mathematical treatment.
It cannot be calculated in case of open end classes.
Quartile deviation (Semi-interquartile range)
Interquartile range = $Q_3 - Q_1$
$\text{Quartile deviation (Q.D)} = \dfrac{1}{2}(Q_3 - Q_1)$
$\text{Coefficient of Q.D} = \dfrac{Q_3 - Q_1}{Q_3 + Q_1}$
Merits
• Rigidly defined
• Not affected by extreme values
• Can be calculated in open-end class distribution
• It is a better measure of dispersion than the range, as
it is based on the central 50% of the values.
Demerits:
• It is not based on all observations.
• It is affected by fluctuations of sampling.
• It is not suitable for further mathematical treatment.
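Example: Find the Q.D of the given data.
2, 4, 6, 8, 10, 12, 14
Solution:
Here, N = 7. We have,
$Q_1 = \text{value of } \left(\dfrac{N+1}{4}\right)^{\text{th}} \text{ item} = \text{value of } \left(\dfrac{7+1}{4}\right)^{\text{th}} \text{ item}$ = 2nd item = 4
$Q_3 = \text{value of } 3\left(\dfrac{N+1}{4}\right)^{\text{th}} \text{ item}$ = 6th item = 12
$\text{Quartile deviation (Q.D)} = \dfrac{1}{2}(Q_3 - Q_1) = \dfrac{1}{2}(12 - 4) = 4$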
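The result can be cross-checked with the standard library. `statistics.quantiles` with `method="exclusive"` places the i-th quartile at position i(N+1)/4, which corresponds to the rule used above; using this library routine is a substitution made for this sketch, not the method of the slides.

```python
import statistics

data = [2, 4, 6, 8, 10, 12, 14]

# The "exclusive" method locates the cut points at positions i(N+1)/4.
q1, q2, q3 = statistics.quantiles(data, n=4, method="exclusive")

quartile_deviation = (q3 - q1) / 2
coefficient_qd = (q3 - q1) / (q3 + q1)

print(q1, q2, q3, quartile_deviation, coefficient_qd)
```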
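Mean deviation (M.D)
Mean deviation from an average (A)
For individual series:
$\text{M.D} = \dfrac{\sum |X - A|}{n}$
For discrete and continuous series:
$\text{M.D} = \dfrac{\sum f|X - A|}{N}$
Where, A = mean, median or mode (for continuous series, X is the mid value of the class)
$\text{Coefficient of mean deviation from mean} = \dfrac{\text{mean deviation from mean}}{\text{mean}}$
$\text{Coefficient of mean deviation from median} = \dfrac{\text{mean deviation from median}}{\text{median}}$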
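A minimal sketch of the mean deviation for an individual series follows. The function name `mean_deviation` and the data are assumptions made for illustration; the deviations are taken in absolute value, so their signs are ignored.

```python
import statistics

def mean_deviation(data, center):
    """Mean of the absolute deviations of the items from `center`."""
    return sum(abs(x - center) for x in data) / len(data)

data = [2, 4, 6, 8, 10, 12, 14]   # hypothetical individual series

mean = statistics.mean(data)
median = statistics.median(data)

md_from_mean = mean_deviation(data, mean)
md_from_median = mean_deviation(data, median)
coefficient_from_mean = md_from_mean / mean

print(md_from_mean, md_from_median, coefficient_from_mean)
```

For this symmetrical series the mean and median coincide, so the two mean deviations are equal; for skewed data they generally differ.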
Merits:
1. It is based on all observations.
2. It is easy to understand and calculate.
3. It is less affected by extreme items as compared to other
dispersions.
Demerits:
1. Cannot be computed in open end classes.
2. It is less reliable because of ignoring the sign.
3. It is not suitable for further mathematical treatment.
4. Not suitable measure when mode is ill defined.
Standard Deviation (Root-Mean square deviation)
It is said to be the best measure of dispersion as it satisfies most of the
requisites of a good measure of dispersion.
Definition:
It is defined as the positive square root of the mean of the squares of the
deviations taken from the arithmetic mean. It is denoted by $\sigma$.
1. For individual series:
$\sigma = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n}}$
Where, n = total no. of observations
2. For discrete series:
$\sigma = \sqrt{\dfrac{\sum f(x - \bar{x})^2}{N}}$
Where, N = total frequency
3. For continuous series:
$\sigma = \sqrt{\dfrac{\sum f(m - \bar{x})^2}{N}}$
Where, m = mid value of the class and N = total frequency
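The individual-series and discrete-series formulas can be sketched as follows. The function names and the data are hypothetical. Note that these formulas divide by n (or N), which corresponds to `statistics.pstdev` in Python's standard library rather than `statistics.stdev`.

```python
import math

def std_individual(xs):
    """Standard deviation of an individual series (divides by n)."""
    mean = sum(xs) / len(xs)
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))

def std_discrete(values, freqs):
    """Standard deviation of a discrete series with frequencies (divides by N)."""
    N = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / N
    return math.sqrt(sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / N)

data = [2, 4, 6, 8, 10, 12, 14]   # hypothetical series, mean = 8
sigma = std_individual(data)

# The same series written as a frequency table with every frequency equal to 1.
sigma_freq = std_discrete(data, [1] * len(data))

print(sigma, sigma_freq)
```

For a continuous series the same `std_discrete` function can be applied with the class mid values in place of the variate values.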
Merits:
• It is rigidly defined.
• It is based on all observations.
• It is least affected by fluctuations of sampling.
• It is suitable for further mathematical treatment.
• It helps in calculating standard error.
Demerits:
• It is difficult to compute.
• It is very much affected by extreme values.
• It cannot be calculated for open-end classes.
Variance:
The square of the standard deviation is called the variance.
$\sigma^2 = \dfrac{\sum f(x - \bar{x})^2}{N}$
Coefficient of variation:
The coefficient of variation is a relative measure of dispersion.
It is the ratio of the standard deviation to the mean, expressed as a percentage.
$\text{Coefficient of variation (C.V)} = \dfrac{\sigma}{\bar{X}} \times 100$
It is independent of units. So, two distributions can be easily compared
with the help of the coefficient of variation.
The lower the C.V, the greater the uniformity, consistency etc.
The higher the C.V, the lower the uniformity, consistency etc.
Example:
Q. In two series, of adults aged 21 years and children 3 months old, the following values were obtained
for height. Which series shows greater variation?
Persons     Mean height    SD
Adults      160 cm         10 cm
Children    60 cm          5 cm
Solution:
$\text{C.V} = \dfrac{\sigma}{\bar{X}} \times 100$
C.V of adults = $\dfrac{10}{160} \times 100 = 6.25\%$
C.V of children = $\dfrac{5}{60} \times 100 = 8.33\%$
Hence, C.V of children > C.V of adults, i.e. the height of children shows greater variation than
the height of adults.
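The comparison above amounts to one division per series. A minimal sketch follows, with a hypothetical helper name `coefficient_of_variation`:

```python
def coefficient_of_variation(sd, mean):
    """Standard deviation as a percentage of the mean."""
    return sd / mean * 100

cv_adults = coefficient_of_variation(10, 160)    # heights in cm
cv_children = coefficient_of_variation(5, 60)

more_variable = "children" if cv_children > cv_adults else "adults"
print(round(cv_adults, 2), round(cv_children, 2), more_variable)
```

Although the adults have the larger standard deviation in centimetres, the children vary more relative to their own mean height.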
Example:
Q. Suppose two groups of human males yield the following information.
               Group A      Group B
Age            24 years     15 years
Mean weight    145 lbs      80 lbs
Variance       100 lbs²     100 lbs²
Which is more variable, the weight of the 24-year-olds or the weight of the 15-year-olds?
Solution:
For group A: $\bar{X}_1 = 145$ lbs, $\sigma_1^2 = 100$, so $\sigma_1 = 10$
For group B: $\bar{X}_2 = 80$ lbs, $\sigma_2^2 = 100$, so $\sigma_2 = 10$
C.V of A = $\dfrac{\sigma_1}{\bar{X}_1} \times 100 = \dfrac{10}{145} \times 100 = 6.9\%$
C.V of B = $\dfrac{\sigma_2}{\bar{X}_2} \times 100 = \dfrac{10}{80} \times 100 = 12.5\%$
Here, the C.V of B is greater than the C.V of A, i.e. the weight of the 15-year-olds shows more
variation than the weight of the 24-year-olds.
More Related Content

PPTX
MEASURE OF CENTRAL TENDENCY
PPTX
#3Measures of central tendency
PPTX
**Measures of Central Tendency**:.pptx
PPTX
Topic 2 Measures of Central Tendency.pptx
PPTX
Measures of central tendency
PPTX
Topic 2 Measures of Central Tendency.pptx
PPTX
Ch3MCT24.pptx measure of central tendency
PPTX
Biostatistics Measures of central tendency
MEASURE OF CENTRAL TENDENCY
#3Measures of central tendency
**Measures of Central Tendency**:.pptx
Topic 2 Measures of Central Tendency.pptx
Measures of central tendency
Topic 2 Measures of Central Tendency.pptx
Ch3MCT24.pptx measure of central tendency
Biostatistics Measures of central tendency

Similar to 4)central tendency and dispersion biostatistics (20)

PPTX
Measures of central tendency
PPT
Basic of Statistics
PPTX
Measures of central tendency mean
PPTX
Measures of central tendency and dispersion
PDF
Measures for Central Tendency/location/averages
PPTX
Measures of central tendency
PPT
Business statistics
PDF
Central Tendancy.pdf
PDF
Measures of central tendency
PPTX
Measures of Central tendency
PPTX
Research Methodology
PDF
Unit 1 - Mean Median Mode - 18MAB303T - PPT - Part 1.pdf
PDF
Basic statistics
DOCX
PPTX
03. Summarizing data biostatic - Copy.pptx
PPTX
Measures of Central Tendency.pptx for UG
PPT
Measure of Central Tendency
PPTX
Module 2 Measures of Central Tendency.pptx
PPTX
Measures of central tendency and dispersion
Measures of central tendency
Basic of Statistics
Measures of central tendency mean
Measures of central tendency and dispersion
Measures for Central Tendency/location/averages
Measures of central tendency
Business statistics
Central Tendancy.pdf
Measures of central tendency
Measures of Central tendency
Research Methodology
Unit 1 - Mean Median Mode - 18MAB303T - PPT - Part 1.pdf
Basic statistics
03. Summarizing data biostatic - Copy.pptx
Measures of Central Tendency.pptx for UG
Measure of Central Tendency
Module 2 Measures of Central Tendency.pptx
Measures of central tendency and dispersion
Ad

Recently uploaded (20)

PDF
Worlds Next Door: A Candidate Giant Planet Imaged in the Habitable Zone of ↵ ...
PPTX
TORCH INFECTIONS in pregnancy with toxoplasma
PPTX
endocrine - management of adrenal incidentaloma.pptx
PPT
Heredity-grade-9 Heredity-grade-9. Heredity-grade-9.
PDF
Looking into the jet cone of the neutrino-associated very high-energy blazar ...
PPTX
Presentation1 INTRODUCTION TO ENZYMES.pptx
PDF
CHAPTER 2 The Chemical Basis of Life Lecture Outline.pdf
PDF
CHAPTER 3 Cell Structures and Their Functions Lecture Outline.pdf
PPTX
BODY FLUIDS AND CIRCULATION class 11 .pptx
PPTX
SCIENCE 4 Q2W5 PPT.pptx Lesson About Plnts and animals and their habitat
PPTX
limit test definition and all limit tests
PDF
Packaging materials of fruits and vegetables
PDF
Assessment of environmental effects of quarrying in Kitengela subcountyof Kaj...
PDF
Wound infection.pdfWound infection.pdf123
PDF
Is Earendel a Star Cluster?: Metal-poor Globular Cluster Progenitors at z ∼ 6
PPT
LEC Synthetic Biology and its application.ppt
PPTX
gene cloning powerpoint for general biology 2
PPTX
GREEN FIELDS SCHOOL PPT ON HOLIDAY HOMEWORK
PPT
1. INTRODUCTION TO EPIDEMIOLOGY.pptx for community medicine
PPTX
A powerpoint on colorectal cancer with brief background
Worlds Next Door: A Candidate Giant Planet Imaged in the Habitable Zone of ↵ ...
TORCH INFECTIONS in pregnancy with toxoplasma
endocrine - management of adrenal incidentaloma.pptx
Heredity-grade-9 Heredity-grade-9. Heredity-grade-9.
Looking into the jet cone of the neutrino-associated very high-energy blazar ...
Presentation1 INTRODUCTION TO ENZYMES.pptx
CHAPTER 2 The Chemical Basis of Life Lecture Outline.pdf
CHAPTER 3 Cell Structures and Their Functions Lecture Outline.pdf
BODY FLUIDS AND CIRCULATION class 11 .pptx
SCIENCE 4 Q2W5 PPT.pptx Lesson About Plnts and animals and their habitat
limit test definition and all limit tests
Packaging materials of fruits and vegetables
Assessment of environmental effects of quarrying in Kitengela subcountyof Kaj...
Wound infection.pdfWound infection.pdf123
Is Earendel a Star Cluster?: Metal-poor Globular Cluster Progenitors at z ∼ 6
LEC Synthetic Biology and its application.ppt
gene cloning powerpoint for general biology 2
GREEN FIELDS SCHOOL PPT ON HOLIDAY HOMEWORK
1. INTRODUCTION TO EPIDEMIOLOGY.pptx for community medicine
A powerpoint on colorectal cancer with brief background
Ad

4)central tendency and dispersion biostatistics

  • 1. Measures of central tendency  Measure of central tendency are summary statistics used to indicate the central location of group a of data values.  It may also be called a center or location of the distribution. Objectives of central tendency To facilitate comparison. To describe characteristic of entire group. To help in decision making. To know about universe from a sample.
  • 2. 2. Requisites for an ideal measure of central tendency: It should be rigidly defined. It should be simple to understand. It should be easy to calculate. It should be suitable for further mathematical treatment. It should be least affected by fluctuation of sample. It should not be affected by extreme observations. 3. Types of central tendency: Mean (Mathematical average) Median (Positional average) Mode (Positional average)
  • 3. A. Arithmetic mean: Arithmetic mean (A.M) of a set of data may be defined as the sum of observation divided by the number of observation. The mean is the most commonly used measure of central tendency. = = 1. Individual series = (a) (direct method) (b) = A + (Short cut method)
  • 4. 2.Discrete series = (a) (Direct method) = (Shortcut method) (b) 3. Continuous series = (a) Where, A=assumed mean d=X-A and N=total frequency Where, m=mid value (b) = (Direct method) A + Where, A = assumed mean d=X-A and N=total frequency (c) A + = A + X h Where, d' = h=class size
  • 5. The health expenditure of 5 families in rupees are given below. Family A B C D Health expenditure (RS) 3000 4000 1500 3500 Calculate arithmetic mean Solution: Family Items Health expenditures(RS) A 3000 B 4000 C 1500 D 3500 Total Ʃ𝑋 = 12000 Here, n=4, Ʃ𝑋 = 12000 A.M = 𝑋 = Ʃ𝑋 𝑛 = 12000 4 =3000
  • 8. Combined mean If X1 and X2 are the mean of two different groups having frequencies N1 and N2,the combined mean is given by: = X1N1 + X2 N2 N1+N2
  • 9. Example: Q. The mean monthly salary of 10 lady doctors is Rs. 400 per month and that of 20 male doctors is Rs. 600 per month, calculate the mean monthly salary of all doctors taken together. solution Here, N1 = 10 N2= 20 X1 X2 = = 400 600 Hence, X1 X2 N1 + N2 N1+N2 = = 10 x 400 20 x 600 + 10 + 20 = 4000 + 12000 30 533.33 = Therefore mean monthly salary = Rs.533.33
  • 10. Weighted arithmetic mean The arithmetic average discussed above is simple arithmetic average in which all the items are assumed to be equally important in the distribution. But in practice, this may not be so. The importance of some items in a distribution may be greater than the other. so in such cases proper weightage should be given to various items. Now, we defined the following weighted arithmetic average in which proper weight is considered. Let be the weights given to the variate values X1,x2,x3……..;xn respectively, Then, their weighted arithmetic mean denoted by xw is defined by, W1 ,W2 ,W3,………;Wn xw = w1x1+w2x2+……+wnxn w1+w2+……+wn =
  • 11. Example: Q. A contractor employees three types of workers- male, female and children .To a male Worker he pays Rs.10 per day, to a female worker Rs.8 per day and to a child worker Rs.3 per day . If the number of male, female and child workers employees is 20, 15 and 5 respectively . What is the average wage? solution Here ,the suitable average is weighted mean. We have, xw = Calculation of weighted average Wages per day(X) No. of workers(W) (WX) 10 20 200 8 15 120 3 5 15 W=40 WX=335 Hence, the average wage is Rs.8.38
  • 12. Geometric mean: The geometric mean of the n non-zero and non-negative variate values is the nth root of their product. G= It is used for rates ,ratios ,percentage variate values and exponentially expressed values. x1.x2.x3…………..xn n
  • 13. Harmonic mean: Harmonic mean is the reciprocal of arithmetic mean of the reciprocal of the set of non-zero Variate values .Harmonic mean is used for rates and ratios type of variables. H= n ∑
  • 14. Merits And Demerits Of Arithmetic Mean MERITS : It is rigidly defined. It is based on all observations. It is simple to understand and easy to calculate. It is suitable for further mathematical treatments. It is least affected by fluctuation of sampling. 2.DEMERITS : It is very much affected by extreme observations. It cannot be computed in case of open end classes. It gives sometimes fallacious conclusion. It cannot be determined by inspection or by graphical method. It cannot be used if we are dealing with qualitative characteristics which cannot be measured quantitatively.
  • 15. B. MEDIAN The values which divides the distribution into two equal parts, provided the observations are arranged in the order of magnitude. If the number of observations in a series is odd , then the median is the middle value and if the number of the observation is even, then the median is the midpoint between the two middle values. A.For Individual series: Arrange the data in ascending order or descending order of magnitude and apply the formulae, Median= Size of th item
  • 18. B. For Discrete series : Arrange the data in ascending or descending order of magnitude. Obtain c.f. (cumulative frequency). Apply the formulae, M.d. = size of Now look at the c.f. column and see that value which is either equal to or th item greater than ,this gives the value of C. For continuous series: Median = L+ _ c.f f x h Median.
  • 21. 1.Merits: Easy to understand and simple to compute. Is rigidly defined. Not influenced by the extreme values. Can be computed even for open . Median is only the average to be used while dealing with while Dealing with qualitative characteristics such as intelligence, beauty etc. 2.Demerits: Arrangement of data is necessary. Not based on all the observations. Not suitable for further mathematical treatment. Exact value cannot be determined in case of even numbers.
  • 22. C. Mode: Mode is that variate value which repeats maximum number of times . It is used in business for forecasting rates of goods ,But rarely used in medical sciences. In case of individual and discrete series the mode can easily found out by inspection But in case of continuous series, we use the following formula to calculate the mode. Mode= L + 1 1 2 + x h Where L= lower limit of modal class 1 = f1 – f0 2 = f1 – f2 f1 = maximum frequency, f2 = frequency following the modal class, f0 = frequency preceding modal class , h= size of modal class
  • 26. Merits: It is easy to calculate and simple to understand. It is not affected by extreme observations. It can be calculated for open end classes. It can be obtained by inspection or by graph. Demerits : Mode is not rigidly defined. mode is not based on all observations. Mode is not suitable for further mathematical treatment. Mode is affected by fluctuation of sampling.
  • 27. Relation between various measures of central tendency: Mode = 3Median – 2Mean ( Empirical relationship) A.M ≥ G.M ≥ H.M 1. 2.
  • 28. The mean weight of 150 students in a certain class is 60 kg .The mean weight of boys in the class is 70 kg and that of girls is 55 kg .Find the no. of boys and girls in the class. Q. Here, = 60 N1 = 150 N2 + N1 = N2 150 - X1 = 70 X2 = 55 N1 N2 = ? = ? We have, = N1X1 + N2X2 N1 + N2 ……….. ( i )
  • 29. Or, 60 = 70 N1 + 55 N2 150 Or, 9000 = 70 N1 + 55 N2……………. ( ii ) Putting N1 = ( 150 - N2 ) in eq n ( ii ) 9000= 70 (150 -N2) + 55 N2 9000 = 10500 - 70 N2+ 55 N2 15N2 = 10500 - 9000 15 N2 = 1500 N2 = 100 Or, Or, Or, Or, . . . Putting value of N2 In eq n ( i ) N1 N2 + = 150 N1 + 100 = 150 . . . N1 = 50 Hence, N1= 50 and N2 = 100
  • 30. Selection of an average: No single average is suitable for all conditions. The selection of an average depends upon the nature of the data. 1. All mean, median and mode can be used for symmetrical data. 2. Mean is suggested when: i) The average of quantitative data is to be calculated but it should not be used on the following condition: a. When the data is highly skewed. b. When the distribution have open end classes. c. When the distribution have extreme observations. 3. Median is used when: i) Qualitative data such as intelligence, beauty etc. ii)Open end classes iii)Highly skewed distribution. 4. Mode: It is useful for most repeated value, particularly in business and for highly skewed distribution.
  • 31. 5.Geometric Mean: i) It is used to calculate average rates ratio and percentage. ii) Construction of index number. 6.Harmonic Mean: i) It is used in computing averages related to rates and ratios where time factor is available.
  • 32. Numerical problems: 1. In a class of 50 students 10 have failed and their average of marks is 2.5 .The total marks secured by the entire class were 281. Find the average marks of those who have passed. (Ans=6.4) 2. The mean age of combined group of men and women is 30 years. If the mean age of the group of men is 32 and that of group of women is 27. Find out the percentage of men and women in the group.(60 n 40) 3. Mean of 100 items was 50. Later on it was found that two items were misread as 92 and 8 instead of 192 and 88. Find the correct mean. (Ans=51.8) 4. Arithmetic mean of 98 items is 50. Two items 60 and 70 were left out at the time of calculation. What is the correct mean of all the items? (Ans=50.3)
  • 35. Measure of position (partition) values: The partition value are those variate values which divides the total number of observations into equal number of parts .The equal number of parts may be four, ten, hundred etc. 1. Quartiles Quartiles are those positional values which divides the ordered series into four equal parts, so there are 3 quartiles. Q1 Q2 Q3 Lowest value Highest value 0% 25% 75% 50% 100% Presentation of data by quartiles Md = fig:
  • 36. a. First quartile (lower quartile): Q1 has 25% observation before it and 75% observation after it. b. Second quartile (Median): Q2 Coincides with Median. Median divides the total number of observations into two halves. c. Third quartile: Q3 has 75% observation before it and 25% observation after it.
  • 37. a. For individual and discrete series: $Q_i = \text{value of the } \left(\frac{i(N+1)}{4}\right)^{\text{th}} \text{ item}$, where $i = 1, 2, 3$. b. For continuous series: $Q_i = L + \frac{\frac{iN}{4} - \text{c.f.}}{f} \times h$, where $i = 1, 2, 3$.
  • 38. Deciles: Deciles are those positional values which divide the ordered series into 10 equal parts, so there are 9 deciles. a. For individual and discrete series: $D_j = \text{value of the } \left(\frac{j(N+1)}{10}\right)^{\text{th}} \text{ item}$, where $j = 1, 2, 3, \dots, 9$. b. For continuous series: $D_j = L + \frac{\frac{jN}{10} - \text{c.f.}}{f} \times h$, where $j = 1, 2, 3, \dots, 9$.
  • 39. Percentiles: Percentiles are those positional values which divide the ordered series into 100 equal parts, so there are 99 percentiles. a. For individual and discrete series: $P_k = \text{value of the } \left(\frac{k(N+1)}{100}\right)^{\text{th}} \text{ item}$, where $k = 1, 2, 3, \dots, 99$. b. For continuous series: $P_k = L + \frac{\frac{kN}{100} - \text{c.f.}}{f} \times h$, where $k = 1, 2, 3, \dots, 99$.
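Quartiles, deciles and percentiles all follow the same pattern, so one helper can cover the three cases. The sketch below is illustrative only: the function names are hypothetical, linear interpolation for fractional positions and clamping of positions outside 1..N are assumptions the slides do not state, and the grouped data in the usage lines are made up for the example.

```python
def partition_value(data, j, parts):
    """Value of the j(N+1)/parts-th item of the ordered series (1-based position)."""
    ordered = sorted(data)
    n = len(ordered)
    position = j * (n + 1) / parts
    # Assumption: positions outside 1..N are clamped to the nearest end.
    position = min(max(position, 1), n)
    lower = int(position)            # whole part of the position
    fraction = position - lower      # fractional part, if any
    if fraction == 0:
        return ordered[lower - 1]
    # Assumption: interpolate linearly between the two neighbouring items.
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])


def grouped_partition_value(lower_limit, cf_before, class_freq, class_size,
                            j, total_freq, parts):
    """L + ((jN/parts - c.f.) / f) * h for a continuous series."""
    return lower_limit + (j * total_freq / parts - cf_before) / class_freq * class_size


items = [2, 4, 6, 8, 10, 12, 14]
q1 = partition_value(items, 1, 4)     # first quartile
q3 = partition_value(items, 3, 4)     # third quartile
d5 = partition_value(items, 5, 10)    # fifth decile
p50 = partition_value(items, 50, 100) # fiftieth percentile

# Hypothetical grouped data: classes 0-10, 10-20, 20-30, 30-40 with
# frequencies 5, 10, 20, 5 (N = 40). N/2 = 20 falls in class 20-30,
# which has c.f. = 15 before it, f = 20 and h = 10.
grouped_median = grouped_partition_value(20, 15, 20, 10, 2, 40, 4)
```

For this series the second quartile, fifth decile and fiftieth percentile all land on the 4th item, which is the median.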
  • 40. Measures of dispersion (variability): Averages give us an idea of the concentration of the items around the central part of the distribution. But averages do not give a clear picture of the distribution, because two distributions with the same averages may differ in the scatter of the items around the central value.
      Series   Items                         $\bar{X}$   Mo   Md
      A        25, 26, 27, 27, 27, 28, 29    27          27   27
      B        0, 10, 18, 27, 27, 27, 80     27          27   27
    From the above table, we see that the mean, median and mode of the two series A and B are the same. From these results alone we cannot say that the two series are similar, because the differences of the items from the average are larger in B than in A. In series A the items are concentrated closely around the central value, whereas in series B they are scattered more widely. Hence, although series A and B have the same averages, they cannot be called similar because they are differently constituted.
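The comparison of series A and B can be reproduced with Python's standard `statistics` module. This is a small sketch; the `summary` dictionary and its keys are illustrative names only.

```python
from statistics import mean, median, mode

series = {
    "A": [25, 26, 27, 27, 27, 28, 29],
    "B": [0, 10, 18, 27, 27, 27, 80],
}

summary = {}
for name, items in series.items():
    centre = mean(items)
    summary[name] = {
        "mean": centre,
        "median": median(items),
        "mode": mode(items),
        # Deviation of each item from the mean shows the scatter.
        "deviations": [x - centre for x in items],
    }
    print(name, summary[name])
```

Both series report 27 for all three averages, while the deviations in B are far larger than those in A.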
  • 41. Definition: Dispersion is the scatter of the items around a central value, i.e. a measure of the variation of the items from the central value. The main objectives of measuring variability are: 1. To determine the reliability of an average. 2. To compare two or more series with regard to their variability. 3. To help in using other statistical measures.
  • 42. Absolute measure of dispersion: A measure of dispersion is said to be absolute if it is expressed in the original units of the data. Relative measure of dispersion: A measure of dispersion is said to be relative if it is independent of the units of the data.
  • 43. Requisites of an ideal measure of dispersion: It should be rigidly defined. It should be simple to understand and easy to calculate. It should be based on all observations. It should be suitable for further mathematical treatment. It should be least affected by fluctuations of sampling. It should not be affected by extreme values. Methods of measuring dispersion: 1. Range 2. Quartile deviation (semi-interquartile range) 3. Mean deviation (average deviation) 4. Standard deviation
  • 44. Range: Range is the simplest measure of dispersion. It is defined as the difference between the largest item and the smallest item in a set of observations. Range = Largest item - Smallest item = $L - S$. Coefficient of range: The coefficient of range is the relative measure corresponding to the range. It can be used to compare two distributions with different units. Coefficient of range = $\frac{L - S}{L + S}$.
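As a quick illustration, the range and its coefficient can be computed for series A and B from the dispersion table on slide 40. The function names below are hypothetical, chosen only for this sketch.

```python
def value_range(items):
    """Range = largest item - smallest item."""
    return max(items) - min(items)


def coefficient_of_range(items):
    """Coefficient of range = (L - S) / (L + S)."""
    largest, smallest = max(items), min(items)
    return (largest - smallest) / (largest + smallest)


series_a = [25, 26, 27, 27, 27, 28, 29]
series_b = [0, 10, 18, 27, 27, 27, 80]

range_a, range_b = value_range(series_a), value_range(series_b)
coeff_a, coeff_b = coefficient_of_range(series_a), coefficient_of_range(series_b)
print(range_a, coeff_a)
print(range_b, coeff_b)
```

Series B has the much larger range, which matches the observation that its items are more scattered.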
  • 45. Merits: It is rigidly defined. It is simple to understand and easy to calculate. Variation can be understood in a short time. Demerits: It is not based on all observations. It is affected by fluctuations of sampling. It is affected by extreme values. It is not suitable for further mathematical treatment. It cannot be calculated in the case of open-end classes.
  • 46. Quartile deviation (semi-interquartile range): Interquartile range = $Q_3 - Q_1$. Quartile deviation (Q.D) = $\frac{1}{2}(Q_3 - Q_1)$. Coefficient of Q.D = $\frac{Q_3 - Q_1}{Q_3 + Q_1}$. Merits: • Rigidly defined • Not affected by extreme values • Can be calculated for open-end class distributions • It is a better measure of dispersion than the range as it is based on the central 50% of the values.
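  • 47. Demerits: • It is not based on all observations. • It is affected by fluctuations of sampling. • It is not suitable for further mathematical treatment. Example: Find the Q.D of the given data: 2, 4, 6, 8, 10, 12, 14. We have $Q_1 = \text{value of the } \left(\frac{N+1}{4}\right)^{\text{th}} \text{ item} = \left(\frac{7+1}{4}\right)^{\text{th}} \text{ item} = 2^{\text{nd}} \text{ item} = 4$ and $Q_3 = \text{value of the } 3\left(\frac{N+1}{4}\right)^{\text{th}} \text{ item} = 6^{\text{th}} \text{ item} = 12$. Quartile deviation (Q.D) = $\frac{1}{2}(Q_3 - Q_1) = \frac{1}{2}(12 - 4) = 4$.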
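The worked example can be retraced step by step in code. This sketch relies on the fact that for N = 7 both quartile positions are whole numbers, so no interpolation is needed; the variable names are illustrative.

```python
data = [2, 4, 6, 8, 10, 12, 14]
ordered = sorted(data)
n = len(ordered)

# Positions of the quartiles in the ordered series (1-based).
q1_position = (n + 1) / 4        # 2nd item
q3_position = 3 * (n + 1) / 4    # 6th item

# Both positions are whole numbers here, so the items are read off directly.
q1 = ordered[int(q1_position) - 1]
q3 = ordered[int(q3_position) - 1]

interquartile_range = q3 - q1
quartile_deviation = interquartile_range / 2
coefficient_of_qd = (q3 - q1) / (q3 + q1)

print(q1, q3, quartile_deviation, coefficient_of_qd)
```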
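  • 48. Mean deviation (M.D): The mean deviation is the mean of the absolute deviations of the items from an average $A$. For individual series: $\text{M.D} = \frac{\sum |X - A|}{n}$. For discrete and continuous series: $\text{M.D} = \frac{\sum f\,|X - A|}{N}$, taking $X$ as the mid value for continuous series. Where $A$ = mean, median or mode. Coefficient of mean deviation from mean = $\frac{\text{mean deviation from mean}}{\text{mean}}$. Coefficient of mean deviation from median = $\frac{\text{mean deviation from median}}{\text{median}}$.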
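A short sketch of the calculation, taking the mean deviation to be the mean of the absolute deviations from a chosen average (mean, median or mode). It reuses series A and B from slide 40; the function names and the small frequency table in the last lines are hypothetical.

```python
from statistics import mean, median


def mean_deviation(items, centre):
    """Mean of |X - A| over an individual series."""
    return sum(abs(x - centre) for x in items) / len(items)


def mean_deviation_grouped(values, freqs, centre):
    """Sum of f|X - A| divided by N for a discrete series
    (or mid values of a continuous series)."""
    total_freq = sum(freqs)
    return sum(f * abs(x - centre) for x, f in zip(values, freqs)) / total_freq


series_a = [25, 26, 27, 27, 27, 28, 29]
series_b = [0, 10, 18, 27, 27, 27, 80]

md_a = mean_deviation(series_a, mean(series_a))
md_b = mean_deviation(series_b, mean(series_b))
md_b_from_median = mean_deviation(series_b, median(series_b))

# Relative measure: mean deviation divided by the average it was taken from.
coeff_md_a = md_a / mean(series_a)
coeff_md_b = md_b / mean(series_b)

# Hypothetical discrete series: values 1, 2, 3 with frequencies 1, 2, 1 (mean 2).
md_grouped = mean_deviation_grouped([1, 2, 3], [1, 2, 1], 2)
```

Although both series share the same mean of 27, the mean deviation of B comes out many times larger than that of A.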
  • 51. Merits: 1. It is based on all observations. 2. It is easy to understand and calculate. 3. It is less affected by extreme items than other measures of dispersion. Demerits: 1. It cannot be computed for open-end classes. 2. It is less reliable because it ignores the signs of the deviations. 3. It is not suitable for further mathematical treatment. 4. It is not a suitable measure when the mode is ill defined.
  • 52. Standard deviation (root-mean-square deviation): It is said to be the best measure of dispersion as it satisfies most of the requisites of a good measure of dispersion. Definition: It is defined as the positive square root of the mean of the squares of the deviations taken from the arithmetic mean. It is denoted by $\sigma$. 1. For individual series: $\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$, where $n$ = total number of observations.
  • 53. 2. For discrete series: $\sigma = \sqrt{\frac{\sum f (x - \bar{x})^2}{N}}$, where $N$ = total frequency. 3. For continuous series: $\sigma = \sqrt{\frac{\sum f (m - \bar{x})^2}{N}}$, where $m$ = mid value and $N$ = total frequency.
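The standard deviation formulas can be sketched directly from the definition: the positive square root of the mean of the squared deviations from the arithmetic mean, with divisor n (or N), as on the slides. The function names are illustrative, series A and B are taken from slide 40, and the small frequency table is hypothetical.

```python
from math import sqrt


def standard_deviation(items):
    """Root of the mean squared deviation from the mean (individual series)."""
    n = len(items)
    mean = sum(items) / n
    return sqrt(sum((x - mean) ** 2 for x in items) / n)


def standard_deviation_grouped(values, freqs):
    """Same definition with frequencies; pass mid values for a continuous series."""
    total_freq = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / total_freq
    squared = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs))
    return sqrt(squared / total_freq)


series_a = [25, 26, 27, 27, 27, 28, 29]
series_b = [0, 10, 18, 27, 27, 27, 80]

sd_a = standard_deviation(series_a)
sd_b = standard_deviation(series_b)

# Hypothetical discrete series: values 1, 2, 3 with frequencies 1, 2, 1.
sd_grouped = standard_deviation_grouped([1, 2, 3], [1, 2, 1])

print(round(sd_a, 3), round(sd_b, 3), round(sd_grouped, 3))
```

Dividing by n matches `statistics.pstdev` in Python's standard library; `statistics.stdev` divides by n - 1 instead and would give slightly larger values.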
  • 54. Merits: It is rigidly defined. It is based on all observations. It is least affected by fluctuations of sampling. It is suitable for further mathematical treatment. It helps in calculating the standard error. Demerits: It is difficult to compute. It is very much affected by extreme values. It cannot be calculated for open-end classes.
  • 58. Variance: The square of the standard deviation is called the variance. $\sigma^2 = \frac{\sum f (x - \bar{x})^2}{N}$. Coefficient of variation: The coefficient of variation is a relative measure of dispersion. It is the ratio of the standard deviation to the mean, expressed as a percentage. Coefficient of variation (C.V) = $\frac{\sigma}{\bar{X}} \times 100$. It is independent of units, so two distributions can be easily compared with the help of the coefficient of variation. The lower the C.V, the greater the uniformity, consistency etc. The higher the C.V, the lower the uniformity, consistency etc.
  • 59. Example: Q. In two series, of adults aged 21 years and of children 3 months old, the following values were obtained for height. Find which series shows greater variation.
      Persons    Mean height   SD
      Adults     160 cm        10 cm
      Children   60 cm         5 cm
    Solution: $\text{C.V} = \frac{\sigma}{\bar{X}} \times 100$. C.V of adults = $\frac{10}{160} \times 100 = 6.25\%$. C.V of children = $\frac{5}{60} \times 100 = 8.33\%$. Hence C.V of children > C.V of adults, i.e. the height of children shows greater variation than the height of adults.
  • 60. Q. Suppose two groups of human males yield the following information.
                    Group A     Group B
      Age           24 years    15 years
      Mean weight   145 lbs     80 lbs
      Variance      100 lbs²    100 lbs²
    Find which is more variable, the weight of the 24-year-olds or the weight of the 15-year-olds. Solution: For group A: $\bar{X}_1 = 145$ lbs, $\sigma_1^2 = 100$, so $\sigma_1 = 10$ and $\text{C.V} = \frac{\sigma_1}{\bar{X}_1} \times 100 = \frac{10}{145} \times 100 = 6.9\%$. For group B: $\bar{X}_2 = 80$ lbs, $\sigma_2^2 = 100$, so $\sigma_2 = 10$ and $\text{C.V} = \frac{\sigma_2}{\bar{X}_2} \times 100 = \frac{10}{80} \times 100 = 12.5\%$.
  • 61. Here, the C.V of group B is greater than the C.V of group A, i.e. the weights of the 15-year-olds show more variation than the weights of the 24-year-olds.
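Both coefficient-of-variation examples reduce to the same one-line calculation, the standard deviation divided by the mean and multiplied by 100. The sketch below recomputes them; the function and variable names are illustrative only.

```python
from math import sqrt


def coefficient_of_variation(sd, mean):
    """C.V = (standard deviation / mean) * 100, as a percentage."""
    return sd / mean * 100


# Heights: adults (mean 160 cm, SD 10 cm) and children (mean 60 cm, SD 5 cm).
cv_adults = coefficient_of_variation(10, 160)
cv_children = coefficient_of_variation(5, 60)

# Weights: the variance is given, so the SD is its positive square root.
cv_group_a = coefficient_of_variation(sqrt(100), 145)
cv_group_b = coefficient_of_variation(sqrt(100), 80)

print(round(cv_adults, 2), round(cv_children, 2))
print(round(cv_group_a, 1), round(cv_group_b, 1))
```

Because the C.V is unit free, the same function serves for heights in centimetres and weights in pounds.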