UNIT 1
AVERAGES
WHAT IS AN AVERAGE?
• SINGLE VALUE
• USED TO REPRESENT ALL VALUES IN A DATA SERIES
• LIES IN BETWEEN THE EXTREME VALUES
• ALSO KNOWN AS MEASURE OF CENTRAL TENDENCY
WHY DETERMINE THE AVERAGE?
• DESCRIBE THE CHARACTERISTICS OF A SERIES
• DRAW STATISTICAL INFERENCES
• COMPARE TWO OR MORE SERIES
• FACILITATE IMPORTANT DECISIONS
IDEAL CHARACTERISTICS OF AN AVERAGE
• SIMPLE TO COMPUTE
• CLOSE REPRESENTATION
• OBJECTIVELY DETERMINED
• CAPABLE OF FURTHER ALGEBRAIC OPERATIONS
STAT PPT 1 AVERAGES (useful in analysing)
MATHEMATICAL AVERAGES
CALCULATED USING AN ALGEBRAIC FORMULA
BASED ON ALL VALUES OF THE DATA SERIES
ARITHMETIC MEAN
• DEFINED AS SUM OF ALL OBSERVATIONS DIVIDED BY THE TOTAL NUMBER
OF OBSERVATIONS
• SIMPLEST FORM OF AVERAGE
• MOST POPULARLY USED
• CAN BE COMPUTED FOR INDIVIDUAL/DISCRETE AND CONTINUOUS SERIES
EXAMPLE 1: INDIVIDUAL SERIES
ROLL NO.   MARKS (X)   DEVIATIONS d = (X - A)
1          5           -20
2          15          -10
3          25 (= A)    0
4          35          10
5          45          20
6          55          30
TOTAL      180         30
MEAN CALCULATION USING DIRECT METHOD:
$\bar{x} = \frac{\sum x}{n} = \frac{180}{6} = 30$
MEAN CALCULATION USING ASSUMED MEAN METHOD (A = 25):
$\bar{x} = A + \frac{\sum d}{n} = 25 + \frac{30}{6} = 25 + 5 = 30$
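THE TWO METHODS OF EXAMPLE 1 CAN BE SKETCHED IN PYTHON AS BELOW. THIS IS A MINIMAL ILLUSTRATION AND NOT PART OF THE ORIGINAL SLIDES; THE FUNCTION NAMES mean_direct AND mean_assumed ARE HYPOTHETICAL.

```python
def mean_direct(values):
    """Direct method: sum of observations divided by their count."""
    return sum(values) / len(values)


def mean_assumed(values, assumed):
    """Assumed mean method: A + (sum of deviations from A) / n."""
    deviations = [x - assumed for x in values]
    return assumed + sum(deviations) / len(values)


marks = [5, 15, 25, 35, 45, 55]    # marks (X) from Example 1
print(mean_direct(marks))          # 180 / 6
print(mean_assumed(marks, 25))     # 25 + 30 / 6
```

ANY CHOICE OF A GIVES THE SAME MEAN; A VALUE NEAR THE CENTRE OF THE DATA ONLY KEEPS THE DEVIATIONS SMALL.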
DISCRETE SERIES
FEW DATA POINTS ALONG WITH FREQUENCY
• DIRECT METHOD
$\bar{x} = \frac{\sum fx}{\sum f}$
WHERE,
$\bar{x}$ IS A.M.
$\sum fx$ IS SUM OF PRODUCT OF ALL OBSERVATIONS WITH THE FREQUENCY
$\sum f$ IS THE TOTAL NUMBER OF OBSERVATIONS
• ASSUMED MEAN METHOD
$\bar{x} = A + \frac{\sum fd}{\sum f}$
WHERE,
$\bar{x}$ IS A.M.
A IS THE ASSUMED MEAN VALUE
$\sum fd$ IS SUM OF THE PRODUCT OF FREQUENCY AND DEVIATIONS FROM THE ASSUMED MEAN VALUE
$\sum f$ IS THE TOTAL NUMBER OF OBSERVATIONS
EXAMPLE 2: DISCRETE SERIES
MARKS (X)   FREQUENCY (f)   DEVIATIONS d = (X - A)   fx     fd
5           10              -20                      50     -200
15          20              -10                      300    -200
25 (= A)    30              0                        750    0
35          50              10                       1750   500
45          40              20                       1800   800
55          30              30                       1650   900
TOTAL       180                                      6300   1800
MEAN CALCULATION USING DIRECT METHOD:
$\bar{x} = \frac{\sum fx}{\sum f} = \frac{6300}{180} = 35$
MEAN CALCULATION USING ASSUMED MEAN METHOD (A = 25):
$\bar{x} = A + \frac{\sum fd}{\sum f} = 25 + \frac{1800}{180} = 25 + 10 = 35$
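A SHORT PYTHON SKETCH OF THE SAME CALCULATION FOR A DISCRETE SERIES FOLLOWS. THE FUNCTION NAMES ARE ILLUSTRATIVE ASSUMPTIONS, NOT TAKEN FROM THE SLIDES; THE DATA ARE THOSE OF EXAMPLE 2.

```python
def mean_discrete_direct(values, freqs):
    """Direct method: sum(f * x) / sum(f)."""
    return sum(f * x for x, f in zip(values, freqs)) / sum(freqs)


def mean_discrete_assumed(values, freqs, assumed):
    """Assumed mean method: A + sum(f * d) / sum(f), with d = x - A."""
    total_fd = sum(f * (x - assumed) for x, f in zip(values, freqs))
    return assumed + total_fd / sum(freqs)


marks = [5, 15, 25, 35, 45, 55]      # X, from Example 2
freqs = [10, 20, 30, 50, 40, 30]     # f, from Example 2
print(mean_discrete_direct(marks, freqs))        # 6300 / 180
print(mean_discrete_assumed(marks, freqs, 25))   # 25 + 1800 / 180
```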
CONTINUOUS SERIES
CLASS INTERVAL AND FREQUENCY IS GIVEN
• DIRECT METHOD
$\bar{x} = \frac{\sum fm}{\sum f}$
WHERE,
$\bar{x}$ IS A.M.
$\sum fm$ IS SUM OF PRODUCT OF MIDPOINT OF CLASS INTERVAL WITH THE FREQUENCY
$\sum f$ IS THE TOTAL NUMBER OF OBSERVATIONS
• ASSUMED MEAN METHOD
$\bar{x} = A + \frac{\sum fd}{\sum f}$
WHERE,
$\bar{x}$ IS A.M.
A IS THE ASSUMED MEAN VALUE
$\sum fd$ IS SUM OF THE PRODUCT OF FREQUENCY AND DEVIATIONS OF THE MIDPOINTS FROM THE ASSUMED MEAN VALUE, d = (m - A)
$\sum f$ IS THE TOTAL NUMBER OF OBSERVATIONS
• STEP DEVIATION METHOD
$\bar{x} = A + \frac{\sum fd'}{\sum f} \times i$
WHERE,
$\bar{x}$ IS A.M.
A IS THE ASSUMED MEAN VALUE
$\sum fd'$ IS SUM OF THE PRODUCT OF FREQUENCY AND STEP DEVIATIONS, d' = (m - A)/i
$\sum f$ IS THE TOTAL NUMBER OF OBSERVATIONS
i IS THE CLASS SIZE
EXAMPLE 3
CLASS INTERVAL   MIDPOINT (m)   FREQUENCY (f)   d = (m - A)   d' = (m - A)/i   fm     fd     fd'
0-10             5              10              -20           -2               50     -200   -20
10-20            15             20              -10           -1               300    -200   -20
20-30            25 (= A)       30              0             0                750    0      0
30-40            35             50              10            1                1750   500    50
40-50            45             40              20            2                1800   800    80
50-60            55             30              30            3                1650   900    90
TOTAL                           180                                            6300   1800   180
SOLUTION USING STEP DEVIATION METHOD
• STEP DEVIATION METHOD
$\bar{x} = A + \frac{\sum fd'}{\sum f} \times i = 25 + \frac{180}{180} \times 10 = 35$
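THE STEP DEVIATION CALCULATION CAN BE CHECKED WITH THE SKETCH BELOW. IT ASSUMES EQUAL CLASS WIDTHS; THE NAME mean_step_deviation AND THE (LOWER, UPPER) TUPLE LAYOUT ARE ASSUMPTIONS MADE FOR ILLUSTRATION.

```python
def mean_step_deviation(classes, freqs, assumed, width):
    """A + (sum(f * d') / sum(f)) * i, with d' = (midpoint - A) / i."""
    midpoints = [(lower + upper) / 2 for lower, upper in classes]
    steps = [(m - assumed) / width for m in midpoints]
    total_fd = sum(f * d for f, d in zip(freqs, steps))
    return assumed + (total_fd / sum(freqs)) * width


classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
freqs = [10, 20, 30, 50, 40, 30]           # from Example 3
print(mean_step_deviation(classes, freqs, assumed=25, width=10))
```

DIVIDING THE DEVIATIONS BY THE CLASS SIZE KEEPS THE HAND ARITHMETIC SMALL; THE RESULT IS THE SAME AS WITH THE DIRECT METHOD.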
OBSERVATIONS
• USING THE SAME DATA, WE GET SAME ANSWER WITH ALL THE
METHODS
• THIS IS THE OBJECTIVITY TRAIT OF THE METHOD
• TYPE OF SERIES HAS AN EFFECT ON CHOICE OF METHOD
• USE ANY METHOD UNLESS SPECIFIED
• IN CASE OF CONTINUOUS SERIES, STEP DEVIATION METHOD
IS MOST APPROPRIATE
PROBLEM 1
FIND THE MISSING FREQUENCIES FROM THE GIVEN DATA OF 1000 FAMILIES. THE VALUE OF MEAN IS 87.5.
DAILY EXPENSE   FREQUENCY
40-59           50
60-79           ?
80-99           500
100-119         ?
120-139         50
PROBLEM 1
CALCULATION OF MEAN
DAILY EXPENSE   CLASS INTERVAL   FREQUENCY (f)   MIDPOINT (m)   d' = (m - A)/i   fd'
40-59           39.5 - 59.5      50              49.5           -2               -100
60-79           59.5 - 79.5      X               69.5           -1               -X
80-99           79.5 - 99.5      500             89.5 (= A)     0                0
100-119         99.5 - 119.5     400 - X         109.5          1                400 - X
120-139         119.5 - 139.5    50              129.5          2                100
TOTAL                            1000                                            400 - 2X
SOLUTION
• STEP DEVIATION METHOD
$\bar{x} = A + \frac{\sum fd'}{\sum f} \times i$
$87.5 = 89.5 + \frac{400 - 2X}{1000} \times 20$
X = 250
400 - X = 150
DAILY EXPENSE   FREQUENCY
40-59           50
60-79           250
80-99           500
100-119         150
120-139         50
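THE MISSING FREQUENCIES OF PROBLEM 1 CAN ALSO BE OBTAINED BY SOLVING THE MEAN EQUATION DIRECTLY, AS SKETCHED BELOW. THE FUNCTION NAME AND ARGUMENT LAYOUT ARE HYPOTHETICAL; THE SKETCH USES THE DIRECT FORMULA (Σfm / Σf) RATHER THAN STEP DEVIATIONS, WHICH GIVES THE SAME ANSWER.

```python
def solve_two_missing(mean, total, known, mid_a, mid_b):
    """Find the two unknown class frequencies from the mean and the total.

    known        -- (midpoint, frequency) pairs for classes with given frequency
    mid_a, mid_b -- midpoints of the two classes whose frequency is missing
    """
    known_sum = sum(m * f for m, f in known)
    remaining = total - sum(f for _, f in known)    # freq_a + freq_b
    # mean * total = known_sum + mid_a * freq_a + mid_b * (remaining - freq_a)
    freq_a = (mean * total - known_sum - mid_b * remaining) / (mid_a - mid_b)
    return freq_a, remaining - freq_a


known = [(49.5, 50), (89.5, 500), (129.5, 50)]      # classes 40-59, 80-99, 120-139
freq_60_79, freq_100_119 = solve_two_missing(87.5, 1000, known, 69.5, 109.5)
print(freq_60_79, freq_100_119)
```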
PROBLEM 2
THE SUM OF DEVIATIONS OF A CERTAIN NUMBER OF
OBSERVATIONS MEASURED FROM 4 IS 72 AND THE SUM OF
DEVIATIONS OF SAME OBSERVATIONS MEASURED FROM 7 IS -3.
FIND THE NUMBER OF OBSERVATIONS AND THEIR MEAN.
SOLUTION
THE SUM OF DEVIATIONS IS EXPLAINED AS:
$\sum d = \sum (X - A)$
ACCORDINGLY, $\sum (X - 4) = 72$ AND $\sum (X - 7) = -3$
LET n BE THE NUMBER OF OBSERVATIONS AND $\bar{x}$ BE THE MEAN OF THE SERIES
THE VALUES CAN BE DETERMINED BY FORMING THE EQUATIONS
SOLUTION
EQUATION 1
$\bar{x} = A + \frac{\sum d}{n}$
$\bar{x} = 4 + \frac{72}{n}$
$n\bar{x} - 4n = 72$
EQUATION 2
$\bar{x} = A + \frac{\sum d}{n}$
$\bar{x} = 7 + \frac{-3}{n}$
$n\bar{x} - 7n = -3$
SOLVING THE TWO EQUATIONS SIMULTANEOUSLY, n = 25 AND $\bar{x}$ = 6.88
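THE ELIMINATION STEP BEHIND THIS RESULT CAN BE WRITTEN OUT AS A SMALL PYTHON SKETCH. SUBTRACTING EQUATION 2 FROM EQUATION 1 REMOVES THE TERM n x̄ AND LEAVES 3n = 75. THE FUNCTION NAME count_and_mean IS AN ASSUMPTION MADE FOR ILLUSTRATION.

```python
def count_and_mean(origin_1, sum_dev_1, origin_2, sum_dev_2):
    """Recover n and the mean from sums of deviations about two origins.

    sum(X - A) = sum(X) - n * A, so the difference of the two sums
    equals n * (origin_2 - origin_1).
    """
    n = (sum_dev_1 - sum_dev_2) / (origin_2 - origin_1)
    mean = origin_1 + sum_dev_1 / n
    return n, mean


n, mean = count_and_mean(4, 72, 7, -3)
print(n, round(mean, 2))
```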
MATHEMATICAL PROPERTIES OF A.M.
• $\sum (x - \bar{x}) = 0$
• $\sum (x - \bar{x})^2$ IS MINIMUM
• IF EACH VALUE OF THE DATA SERIES IS CHANGED (+/ - / x / ÷)
BY A CONSTANT, THE MEAN VALUE UNDERGOES A SIMILAR
CHANGE
• CALCULATION OF COMBINED AVERAGE
COMBINED MEAN = $\frac{N_1 X_1 + N_2 X_2}{N_1 + N_2}$
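THE FIRST THREE PROPERTIES CAN BE VERIFIED NUMERICALLY ON THE MARKS OF EXAMPLE 1. THE SKETCH BELOW IS ILLUSTRATIVE ONLY; THE VARIABLE AND FUNCTION NAMES ARE ASSUMPTIONS.

```python
data = [5, 15, 25, 35, 45, 55]          # marks from Example 1
mean = sum(data) / len(data)


def sum_of_squares(about):
    """Sum of squared deviations of the data about a chosen point."""
    return sum((x - about) ** 2 for x in data)


# Property 1: deviations from the mean add up to zero.
deviation_total = sum(x - mean for x in data)

# Property 2: squared deviations are smallest when taken about the mean.
about_mean = sum_of_squares(mean)
about_other = sum_of_squares(mean + 1)

# Property 3: adding or multiplying by a constant changes the mean likewise.
shifted_mean = sum(x + 10 for x in data) / len(data)
scaled_mean = sum(x * 2 for x in data) / len(data)

print(deviation_total, about_mean, about_other, shifted_mean, scaled_mean)
```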
PROBLEM 3
THE AVERAGE MONTHLY WAGE OF ALL WORKERS IN A FACTORY IS Rs.
444. IF THE AVERAGE WAGES PAID TO MALE AND FEMALE ARE Rs. 480
AND Rs. 360 RESPECTIVELY, FIND THE PERCENTAGE OF MALE AND
FEMALE WORKERS IN THE FACTORY.
SOLUTION:
X1 = 480 ; X2 = 360; X12 = 444
LET THE NUMBER OF MALE AND FEMALE WORKERS BE N1 AND N2
RESPECTIVELY
SOLUTION
COMBINED MEAN = $\frac{N_1 X_1 + N_2 X_2}{N_1 + N_2}$
$444 = \frac{480 N_1 + 360 N_2}{N_1 + N_2}$
$444 N_1 + 444 N_2 = 480 N_1 + 360 N_2$
$36 N_1 - 84 N_2 = 0$
$N_1 / N_2 = 84/36$
$N_1 : N_2 = 7 : 3$
THUS, THE PERCENTAGE OF MALE AND FEMALE WORKERS IN
THE FACTORY IS 70% AND 30% RESPECTIVELY.
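THE COMBINED MEAN FORMULA AND THE PROPORTION DERIVED FROM IT CAN BE SKETCHED AS BELOW. THE FUNCTION NAMES ARE HYPOTHETICAL; THE SHARE FORMULA IS SIMPLY THE COMBINED MEAN EQUATION REARRANGED FOR N1/(N1 + N2).

```python
def combined_mean(n1, mean1, n2, mean2):
    """(N1 * X1 + N2 * X2) / (N1 + N2)."""
    return (n1 * mean1 + n2 * mean2) / (n1 + n2)


def share_of_first_group(mean1, mean2, combined):
    """Fraction of all items that belong to group 1, given the three means."""
    return (combined - mean2) / (mean1 - mean2)


male_share = share_of_first_group(480, 360, 444)     # 84 / 120
print(round(male_share * 100), round((1 - male_share) * 100))
print(combined_mean(70, 480, 30, 360))               # check with 70 : 30
```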
PROBLEM 4
THE MEAN MARKS OF 99 STUDENTS WAS FOUND TO BE 60.
LATER, IT IS DISCOVERED THAT A SCORE OF 35 WAS MISREAD AS
53, ANOTHER SCORE WAS TAKEN AS 63 INSTEAD OF 36 AND
SCORE OF 5 WAS NOT TAKEN INTO ACCOUNT. FIND THE
CORRECT MEAN.
n = 99 ; $\bar{X}$ = 60
ACTUAL   WRONG   DIFFERENCE (ACTUAL - WRONG)   CHANGE IN n
35       53      -18                           --
36       63      -27                           --
5        0       +5                            +1
TOTAL            -40                           +1
SOLUTION
• SUM OF ALL OBSERVATIONS AS RECORDED = n X MEAN = 99 X 60 = 5940
• CORRECTED n = 99 + 1 = 100
• CORRECT MEAN = (5940 - 40)/100 = 59
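THE CORRECTION PROCEDURE OF PROBLEM 4 CAN BE SKETCHED AS A SMALL FUNCTION. THE NAME corrected_mean AND ITS ARGUMENT LAYOUT ARE ASSUMPTIONS FOR ILLUSTRATION.

```python
def corrected_mean(n, wrong_mean, misread, omitted):
    """Correct a mean for misread and omitted observations.

    misread -- (recorded_value, actual_value) pairs
    omitted -- values that were left out of the original count
    """
    total = n * wrong_mean                       # total as originally recorded
    for recorded, actual in misread:
        total += actual - recorded               # undo each misreading
    total += sum(omitted)                        # add the values left out
    return total / (n + len(omitted))


print(corrected_mean(99, 60, misread=[(53, 35), (63, 36)], omitted=[5]))
```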
WEIGHTED A.M.
• ITEMS CARRY DIFFERENT WEIGHTS (RELATIVE IMPORTANCE)
• EACH VALUE IN THE DATA SERIES IS ASSIGNED A WEIGHT
• USEFUL IN CONSTRUCTION OF INDEX NUMBERS;
STANDARDISED RATES LIKE BIRTH RATE
FORMULA AND CALCULATION
1. MULTIPLY THE NUMBERS IN THE GIVEN DATA SET BY THEIR RESPECTIVE WEIGHTS
2. SUM UP THE PRODUCTS
3. DIVIDE THE SUM OF THE PRODUCTS OBTAINED BY THE TOTAL OF THE WEIGHTS
$\bar{x}_w = \frac{\sum wx}{\sum w}$
PROBLEM 5
• A student takes three 100-point exams in statistics and scores 80, 80 and 95.
The last exam is much easier than the first two, so the teacher has assigned it
less weight. The weights for the three exams are:
• Exam 1: 40 % of grade obtained.
• Exam 2: 40 % of grade obtained.
• Exam 3: 20 % of grade obtained.
• What is the student’s final weighted average for statistics?
SOLUTION
SCORE (x)   WEIGHT (w)   PRODUCT (wx)
80          0.4          32
80          0.4          32
95          0.2          19
TOTAL       1.0          83
$\bar{x}_w = \frac{\sum wx}{\sum w} = \frac{83}{1} = 83$
THUS, THE FINAL GRADE OF THE STUDENT IS 83.
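THE THREE STEPS OF THE WEIGHTED A.M. CAN BE SKETCHED IN PYTHON AS BELOW. THE FUNCTION NAME IS ILLUSTRATIVE; THE WEIGHTS ARE ENTERED AS PERCENTAGES (40, 40, 20), WHICH IS AN ASSUMPTION MADE TO KEEP THE ARITHMETIC EXACT AND DOES NOT CHANGE THE RESULT.

```python
def weighted_mean(values, weights):
    """sum(w * x) / sum(w)."""
    products = [w * x for x, w in zip(values, weights)]   # step 1
    return sum(products) / sum(weights)                   # steps 2 and 3


scores = [80, 80, 95]       # from Problem 5
weights = [40, 40, 20]      # exam weights in percent
print(weighted_mean(scores, weights))
```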
LIMITATIONS OF A.M.
• UNDULY AFFECTED BY EXTREME VALUES
• IN AN OPEN-ENDED DISTRIBUTION, A.M. MAY NOT BE THE APPROPRIATE MEASURE
• FOR NON-HOMOGENEOUS DATA, THE AVERAGE MAY GIVE A MISLEADING CONCLUSION
GEOMETRIC MEAN
• DEFINED AS THE nth ROOT OF THE PRODUCT OF ALL THE n ITEMS IN A SERIES
• USED TO MEASURE THE AVERAGE CHANGE IN A VARIABLE OVER A PERIOD OF TIME
• G.M. = $\sqrt[n]{x_1 \cdot x_2 \cdots x_n}$
USE OF LOG AND ANTILOG
• TO SIMPLIFY THE CALCULATION OF G.M. WE USE LOG AND ANTILOG (A.L.)
• INDIVIDUAL SERIES:
G.M. = A.L. [($\sum \log X$)/n]
• DISCRETE SERIES:
G.M. = A.L. [($\sum f \log X$)/n], WHERE n = $\sum f$
• CONTINUOUS SERIES:
G.M. = A.L. [($\sum f \log m$)/n], WHERE n = $\sum f$
PROBLEM 6
THE ANNUAL RATE OF DEPRECIATION OF A MACHINE IS 20% IN
YEAR 1, 15% IN YEAR 2, 10% IN YEAR 3 AND 5% IN THE 4TH YEAR.
CALCULATE THE AVERAGE ANNUAL RATE OF DEPRECIATION.
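X IS THE VALUE AT THE END OF THE YEAR, TAKING THE VALUE AT THE START OF THE YEAR AS 100
YEAR    RATE OF DEP (%)   VALUE (X)   log X
I       20                80          1.9031
II      15                85          1.9294
III     10                90          1.9542
IV      5                 95          1.9777
TOTAL                                 7.7644
SOLUTION
INDIVIDUAL SERIES:
G.M. = A.L. [($\sum \log X$)/n]
= A.L. (7.7644/4)
= A.L. (1.9411)
= 87.32, I.E. ON AVERAGE 87.32% OF THE VALUE IS RETAINED EACH YEAR
AVERAGE ANNUAL RATE OF DEPRECIATION = 100 – 87.32 = 12.68%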
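THE LOG AND ANTILOG ROUTE CAN BE REPRODUCED WITH PYTHON'S math MODULE, AS IN THE SKETCH BELOW. THE FUNCTION NAME geometric_mean_log IS AN ASSUMPTION; THE DATA ARE THE YEAR-END VALUES (PER 100) OF PROBLEM 6.

```python
import math


def geometric_mean_log(values):
    """Antilog of the mean of the base-10 logs of the values."""
    mean_log = sum(math.log10(x) for x in values) / len(values)
    return 10 ** mean_log


retained = [80, 85, 90, 95]                 # value left per 100 after each year
gm = geometric_mean_log(retained)
print(round(gm, 2), round(100 - gm, 2))     # average retained %, average depreciation %
```

A SIMPLE ARITHMETIC MEAN OF THE FOUR RATES WOULD GIVE 12.5%, WHICH SLIGHTLY UNDERSTATES THE COMPOUND RATE OBTAINED FROM THE G.M.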
HARMONIC MEAN
• DEFINED AS RECIPROCAL OF THE A.M. OF THE RECIPROCALS OF THE ITEMS IN
A SERIES
• USEFUL TO ASSIGN LARGER WEIGHTS TO SMALLER VALUES
• INDIVIDUAL SERIES:
H.M. = $n \div \sum (1/X)$
• DISCRETE SERIES:
H.M. = $\sum f \div \sum (f/X)$
• CONTINUOUS SERIES:
H.M. = $\sum f \div \sum (f/m)$
PROBLEM 7
IN A CERTAIN OFFICE A LETTER IS TYPED BY A IN 4 MINUTES.
THE SAME LETTER IS TYPED BY B, C, D IN 5, 6 AND 10 MINUTES
RESPECTIVELY. WHAT IS THE AVERAGE TIME TAKEN IN
COMPLETING A LETTER? HOW MANY LETTERS CAN BE TYPED IN
A WORKING DAY COMPRISING 8 HOURS?
SOLUTION
EMPLOYEE   TIME TAKEN (X)   (1/X)
A          4                0.250
B          5                0.200
C          6                0.167
D          10               0.100
TOTAL                       0.717
(i) H.M. = $n \div \sum (1/X)$ ; n = 4
= 4/0.717
= 5.58 MINUTES PER LETTER
(ii) NO. OF LETTERS TYPED IN AN 8 HOUR DAY (PER TYPIST, AT THE AVERAGE RATE)
= (8 X 60)/5.58 = 86 LETTERS
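THE HARMONIC MEAN OF PROBLEM 7 CAN BE CHECKED WITH THE SKETCH BELOW. THE FUNCTION NAME IS ILLUSTRATIVE; THE LETTER COUNT IS PER TYPIST AT THE AVERAGE RATE, WHICH IS HOW THE FIGURE OF 86 ARISES.

```python
def harmonic_mean(values):
    """n divided by the sum of reciprocals."""
    return len(values) / sum(1 / x for x in values)


minutes = [4, 5, 6, 10]                      # time per letter for A, B, C, D
hm = harmonic_mean(minutes)
letters_per_day = (8 * 60) / hm              # one typist working 8 hours
print(round(hm, 2), round(letters_per_day))
```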
RELATIONSHIP BETWEEN
A.M. ,G.M. AND H.M.
• FOR ANY SET OF POSITIVE NUMBERS THAT ARE NOT ALL EQUAL, THE FOLLOWING RELATIONSHIP HOLDS:
A.M. > G.M. > H.M.
• IF ALL VALUES IN THE DATA SET ARE EQUAL, THEN THE THREE AVERAGES WOULD ALSO BE EQUAL
• FOR TWO POSITIVE NUMBERS, THE GEOMETRIC MEAN IS EQUAL TO THE GEOMETRIC MEAN OF THEIR ARITHMETIC AND HARMONIC MEANS, I.E. $G.M. = \sqrt{A.M. \times H.M.}$
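BOTH STATEMENTS CAN BE CHECKED NUMERICALLY WITH PYTHON'S STANDARD statistics MODULE (geometric_mean REQUIRES PYTHON 3.8 OR LATER). THE NUMBERS 4 AND 9 ARE AN ARBITRARY PAIR CHOSEN FOR ILLUSTRATION, NOT TAKEN FROM THE SLIDES.

```python
from statistics import geometric_mean, harmonic_mean, mean

times = [4, 5, 6, 10]                        # data of Problem 7
am, gm, hm = mean(times), geometric_mean(times), harmonic_mean(times)
print(am > gm > hm)                          # ordering for unequal positive values

pair = [4, 9]                                # arbitrary pair of positive numbers
am2, gm2, hm2 = mean(pair), geometric_mean(pair), harmonic_mean(pair)
print(abs(gm2 ** 2 - am2 * hm2) < 1e-9)      # G.M.^2 = A.M. x H.M. for two numbers
```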
POSITIONAL AVERAGES
REPRESENTS THE DATA SERIES
NOT BASED ON ALL ITEMS OF THE SERIES
CONSIDERED AS AVERAGE DUE TO SPECIFIC PROPERTY/POSITION
MATHEMATICAL vs. POSITIONAL AVERAGES
• THE POSITIONAL AVERAGES ARE NOT BASED ON ALL OBSERVATIONS BUT THE
MATHEMATICAL AVERAGES ARE BASED ON THE WHOLE DATA SET
• COMPUTATION OF MATHEMATICAL AVERAGES CALLS FOR APPLICATION OF WELL-
DEFINED FORMULAE WHILE POSITIONAL AVERAGES CAN BE OBTAINED BY MERE
INSPECTION (IN RAW DATA AND IN DISCRETE FREQUENCY DISTRIBUTIONS)
• WHILE POSITIONAL AVERAGES CAN BE OBTAINED BY USING GRAPHS (IN CASE OF
CONTINUOUS FREQUENCY DISTRIBUTIONS), THE MATHEMATICAL AVERAGES
NEED TO BE CALCULATED
• THE POSITIONAL AVERAGES CAN BE CALCULATED IN OPEN-ENDED
DISTRIBUTIONS WHILE THE MATHEMATICAL AVERAGES CANNOT BE
• ALGEBRAIC MANIPULATION OF POSITIONAL AVERAGES IS NOT POSSIBLE. ON THE
OTHER HAND, THE MATHEMATICAL AVERAGES ARE AMENABLE TO FURTHER
ALGEBRAIC TREATMENT AS THEY HAVE WELL-DEFINED MATHEMATICAL
PROPERTIES
• THE MATHEMATICAL AVERAGES CAN BE DISTINGUISHED AS BEING SIMPLE AND
WEIGHTED. THERE IS NO QUESTION OF ANY POSITIONAL AVERAGES BEING
‘WEIGHTED’.
• IN COMPARISON TO MATHEMATICAL AVERAGES, THE POSITIONAL AVERAGES ARE
AFFECTED TO A LARGER EXTENT BY FLUCTUATIONS OF SAMPLING
• SAMPLING FLUCTUATIONS MEAN FLUCTUATIONS IN THE VALUES OF AN AVERAGE
OBTAINED FROM DIFFERENT SAMPLES OF SAME SIZE TAKEN FROM A GIVEN
POPULATION
MEDIAN
• MIDDLE VALUE OF A DISTRIBUTION
• 50% OF THE VALUES LIE BELOW THE MEDIAN AND 50% LIE ABOVE IT
• MEDIAN SPLITS THE DATA SERIES INTO TWO EQUAL HALVES
FORMULA
• INDIVIDUAL SERIES (n IS ODD)
(i) ARRANGE ALL THE ITEMS IN THE SERIES IN ASCENDING OR DESCENDING ORDER
(ii) THE MEDIAN POSITION WILL BE CALCULATED AS FOLLOWS:
MEDIAN POSITION = [(n + 1)/2]TH ITEM
WHERE,
n IS THE TOTAL NUMBER OF OBSERVATIONS
THUS, THE MIDDLE VALUE AMONG THE ARRANGED DATA SET IS THE VALUE OF THE MEDIAN
• INDIVIDUAL SERIES (n IS EVEN)
(i) ARRANGE ALL THE ITEMS IN THE SERIES IN ASCENDING OR DESCENDING ORDER
(ii) THE MEDIAN POSITION WILL BE CALCULATED AS FOLLOWS:
MEDIAN POSITION = [(n + 1)/2]TH ITEM, WHICH FALLS BETWEEN THE (n/2)TH AND [(n/2) + 1]TH ITEMS
WHERE,
n IS THE TOTAL NUMBER OF OBSERVATIONS
MEDIAN VALUE = MEAN OF THE (n/2)TH AND [(n/2) + 1]TH ITEMS
PROBLEM 8
• FIND THE MEDIAN OF THE DAILY WAGES OF 7 WORKERS:
1000, 1500, 800, 900, 1600, 2000, 1400
ALSO, FIND THE MEDIAN IF ANOTHER WORKER WITH A DAILY
WAGE OF ₹ 984 IS ADDED TO THE ABOVE DISTRIBUTION.
SOLUTION
• ARRANGE ALL OBSERVATIONS IN ASCENDING ORDER:
800, 900, 1000, 1400, 1500, 1600, 2000
• Me position = ((n + 1)/2)th item = (7 + 1)/2 = 4th item, i.e. Me = 1400
SOLUTION (WITH THE ADDITIONAL WAGE OF ₹ 984)
• ARRANGE ALL OBSERVATIONS IN ASCENDING ORDER:
800, 900, 984, 1000, 1400, 1500, 1600, 2000
• Me position = ((n + 1)/2)th item = (8 + 1)/2 = 4.5th item
• Me = Mean of 4th and 5th item of the series
• Me = (1000 + 1400)/2 = 1200
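THE ODD AND EVEN CASES CAN BE COMBINED IN ONE SMALL FUNCTION, SKETCHED BELOW. THE NAME median_individual IS AN ASSUMPTION; PYTHON LISTS ARE INDEXED FROM 0, SO THE kTH ITEM OF THE SLIDES IS ordered[k - 1].

```python
def median_individual(values):
    """Median of raw observations: middle item, or mean of the two middle items."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]                          # ((n + 1) / 2)th item
    return (ordered[middle - 1] + ordered[middle]) / 2  # (n/2)th and (n/2 + 1)th


wages = [1000, 1500, 800, 900, 1600, 2000, 1400]        # Problem 8
print(median_individual(wages))
print(median_individual(wages + [984]))
```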
DISCRETE FREQUENCY DISTRIBUTION
• MEDIAN IS GIVEN BY THE VALUE OF THE [(n + 1)/2]TH ITEM, WHERE n IS THE TOTAL FREQUENCY
• TO DETERMINE MEDIAN, THE ‘LESS THAN’ CUMULATIVE FREQUENCIES ARE
OBTAINED
• LOCATE THE VALUE OF THIS ITEM HAVING REFERENCE TO THE CUMULATIVE
FREQUENCIES COLUMN
• FOR THIS, WE START FROM THE TOP AND MOVE DOWNWARD AS LONG AS THE
VALUE (n+1)/2 EXCEEDS THE CUMULATIVE FREQUENCY VALUES, AND STOP ONCE
THIS VALUE IS LOCATED.
PROBLEM 9
FIND THE MEDIAN OF THE FOLLOWING DISTRIBUTION OF MARKS OF STUDENTS:
MARKS                100   150   80   200   250   180
NUMBER OF STUDENTS   24    26    16   20    6     30
SOLUTION
1. ARRANGE THE ABOVE DISTRIBUTION IN ASCENDING ORDER:
MARKS                80    100   150   180   200   250
NUMBER OF STUDENTS   16    24    26    30    20    6
2. CALCULATE CUMULATIVE FREQUENCY:
MARKS                  80    100   150   180   200   250
NUMBER OF STUDENTS     16    24    26    30    20    6
CUMULATIVE FREQUENCY   16    40    66    96    116   122
MEDIAN IS GIVEN BY THE VALUE OF THE [(n + 1)/2]TH ITEM
Me POSITION = (122 + 1)/2 = 61.5TH ITEM
THE FIRST CUMULATIVE FREQUENCY THAT REACHES 61.5 IS 66, SO THE 61.5TH ITEM IS 150
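THE CUMULATIVE FREQUENCY SEARCH CAN BE SKETCHED AS BELOW. THE FUNCTION NAME median_discrete IS HYPOTHETICAL, AND THE SKETCH FOLLOWS THE RULE OF THE SLIDES (FIRST CUMULATIVE FREQUENCY THAT REACHES THE (n + 1)/2 POSITION).

```python
from itertools import accumulate


def median_discrete(values, freqs):
    """Value at the ((n + 1) / 2)th position of a discrete frequency distribution."""
    pairs = sorted(zip(values, freqs))                  # ascending order of value
    position = (sum(freqs) + 1) / 2
    running = accumulate(f for _, f in pairs)           # 'less than' cumulative frequencies
    for (value, _), cf in zip(pairs, running):
        if cf >= position:                              # stop once the position is reached
            return value


marks = [100, 150, 80, 200, 250, 180]                   # Problem 9
students = [24, 26, 16, 20, 6, 30]
print(median_discrete(marks, students))
```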
ALGEBRAIC METHOD
(USED IN CONTINUOUS DISTRIBUTION)
• TO OBTAIN THE VALUE OF MEDIAN FIND THE CUMULATIVE FREQUENCIES
• CALCULATE n/2 AND THEN, WITH REFERENCE TO THE CUMULATIVE FREQUENCIES
COLUMN, DETERMINE THE CLASS INTERVAL IN WHICH MEDIAN FALLS.
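• THIS IS TERMED AS THE MEDIAN CLASS. THE FOLLOWING FORMULA IS USED:
$Me = l_1 + \frac{\frac{n}{2} - c.f.}{f} \times i$
WHERE,
$l_1$ IS THE LOWER LIMIT OF THE MEDIAN CLASS
c.f. IS THE CUMULATIVE FREQUENCY OF THE CLASS PRECEDING THE MEDIAN CLASS
f IS THE FREQUENCY OF THE MEDIAN CLASS
i IS THE WIDTH OF THE MEDIAN CLASS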
PROBLEM 10
THE FOLLOWING IS THE DISTRIBUTION OF A SAMPLE OF 200 COMPANIES
ACCORDING TO THE PROFITS EARNED BY THEM IN THE LAST YEAR.
DETERMINE MEDIAN PROFIT FROM THE DISTRIBUTION.
PROFITS (IN ₹ LAKH) 0-25 25-50 50-75 75-100
NUMBER OF COMPANIES 30 50 80 40
SOLUTION
1. COMPUTE THE CUMULATIVE FREQUENCY
PROFITS (IN ₹ LAKH)    0-25   25-50   50-75   75-100
NUMBER OF COMPANIES    30     50      80      40
CUMULATIVE FREQUENCY   30     80      160     200
2. DETERMINE THE MEDIAN CLASS AS THE CLASS INTERVAL IN WHICH THE (n/2)th ITEM, I.E. THE 100th ITEM, LIES: 50-75
3. $Me = l_1 + \frac{\frac{n}{2} - c.f.}{f} \times i = 50 + \frac{100 - 80}{80} \times 25 = 56.25$
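THE INTERPOLATION FORMULA CAN BE SKETCHED AS A FUNCTION THAT FINDS THE MEDIAN CLASS AND THEN APPLIES THE FORMULA. THE NAME median_grouped AND THE (LOWER, UPPER) TUPLE LAYOUT ARE ASSUMPTIONS FOR ILLUSTRATION.

```python
def median_grouped(classes, freqs):
    """Me = l1 + ((n/2 - c.f.) / f) * i for a continuous frequency distribution."""
    half = sum(freqs) / 2
    cum = 0                                   # c.f. of the classes before the current one
    for (lower, upper), f in zip(classes, freqs):
        if f > 0 and cum + f >= half:         # the (n/2)th item lies in this class
            return lower + (half - cum) / f * (upper - lower)
        cum += f


classes = [(0, 25), (25, 50), (50, 75), (75, 100)]     # profits in ₹ lakh, Problem 10
companies = [30, 50, 80, 40]
print(median_grouped(classes, companies))
```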
PROPERTIES OF MEDIAN
• THE SUM OF ABSOLUTE DEVIATIONS FROM MEDIAN IS THE MINIMUM.
• THAT IS TO SAY, THE AGGREGATE ABSOLUTE DEVIATIONS FROM A POINT
OTHER THAN MEDIAN CANNOT BE SMALLER THAN THE CORRESPONDING
AGGREGATE FROM MEDIAN OF THE OBSERVATIONS.
• SYMBOLICALLY, Σ|X –Me| IS MINIMUM
OTHER POSITIONAL VALUES
• MEDIAN IS THE VALUE THAT DIVIDES A DISTRIBUTION INTO TWO EQUAL
PARTS
• THERE ARE OTHER MEASURES AS WELL WHICH ARE USED TO DIVIDE THE
DISTRIBUTION IN MORE THAN TWO PARTS. THESE MEASURES INCLUDING
MEDIAN ARE KNOWN AS PARTITION VALUES
• THE VALUES THAT DIVIDE A SERIES OR DISTRIBUTION INTO FOUR PARTS
ARE TERMED AS QUARTILES,
• MEASURES THAT DIVIDE A SERIES INTO 10 AND 100 PARTS ARE CALLED,
RESPECTIVELY, DECILES AND PERCENTILES.
• FOR A DISTRIBUTION, THERE ARE THREE QUARTILES, NINE DECILES AND
NINETY-NINE PERCENTILES
• IT MAY BE NOTED THAT THE VALUES OF THE SECOND QUARTILE, FIFTH DECILE AND 50TH PERCENTILE ARE IDENTICAL TO THE MEDIAN VALUE.
• THE FIRST AND THIRD QUARTILES OF A DISTRIBUTION ARE RESPECTIVELY
CALLED LOWER QUARTILE AND UPPER QUARTILE.
• THE PARTITION VALUES REFLECT HOW THE VALUES IN DATA ARE
DISTRIBUTED AND THEY HELP TO LOCATE THE RELATIVE POSITION OF A
CERTAIN VALUE IN A GIVEN SERIES/DISTRIBUTION.
CALCULATION OF POSITIONAL VALUES
• THE PRINCIPLES AND METHODOLOGY USED FOR COMPUTING VARIOUS PARTITION VALUES ARE THE SAME AS FOR COMPUTATION OF MEDIAN.
• AS IN THE CASE OF MEDIAN, THE INDIVIDUAL OBSERVATIONS ARE ARRANGED IN ASCENDING ORDER OF MAGNITUDE.
• THEN n IS DIVIDED BY 4, 10 OR 100, ACCORDING AS QUARTILES, DECILES OR PERCENTILES ARE REQUIRED, AND MULTIPLIED BY k, WHERE THE kTH QUARTILE, DECILE OR PERCENTILE IS DESIRED.
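• THE FORMULAE, IN GENERAL FORM, FOR OBTAINING VARIOUS PARTITION VALUES ARE:
• QUARTILE k: $Q_k = l_1 + \frac{\frac{k \cdot n}{4} - c.f.}{f} \times i$
• DECILE k: $D_k = l_1 + \frac{\frac{k \cdot n}{10} - c.f.}{f} \times i$
• PERCENTILE k: $P_k = l_1 + \frac{\frac{k \cdot n}{100} - c.f.}{f} \times i$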
PROBLEM 11
DETERMINE THE MEDIAN FOR THE FOLLOWING DISTRIBUTION
OF WEEKLY INCOME IN AN OFFICE:
WEEKLY INCOME (IN ₹ ‘000)
(EQUAL TO OR MORE THAN)     12   11   10   8    6    4    3    2    1
NUMBER OF WORKERS           0    0    14   26   42   54   62   70   80
CALCULATE:
• NUMBER OF WORKERS WITH WEEKLY INCOME BETWEEN ₹ 2,400 AND ₹ 10,500
• MAXIMUM INCOME OF THE LOWEST 25% WORKERS
SOLUTION
REWRITE THE GIVEN TABLE IN THE FORM OF CLASS INTERVALS:
WEEKLY INCOME (IN ₹ ‘000)   0-1   1-2   2-3   3-4   4-6   6-8   8-10   10-11   11-12
NUMBER OF WORKERS           0     10    8     8     12    16    12     14      0
CUMULATIVE FREQUENCY        0     10    18    26    38    54    66     80      80
MEDIAN CLASS WILL BE 6,000 – 8,000
$Me = l_1 + \frac{\frac{n}{2} - c.f.}{f} \times i = 6000 + \frac{40 - 38}{16} \times 2000 = 6250$, I.E. ₹ 6,250
SOLUTION
TO DETERMINE NUMBER OF WORKERS BETWEEN ₹ 2,400 AND ₹ 10,500:
WEEKLY INCOME (IN ₹ ‘000)   0-1   1-2   2-3   3-4   4-6   6-8   8-10   10-11   11-12
NUMBER OF WORKERS           0     10    8     8     12    16    12     14      0
CUMULATIVE FREQUENCY        0     10    18    26    38    54    66     80      80
ADD THE FREQUENCY BETWEEN THE GIVEN LIMITS, TAKING A PROPORTIONATE SHARE OF THE TWO CLASSES THAT ARE ONLY PARTLY COVERED:
8 X (600/1000) + 8 + 12 + 16 + 12 + 14 X (500/1000) = 59.8, I.E. 60 WORKERS APPROXIMATELY
SOLUTION
TO DETERMINE MAXIMUM INCOME OF THE LOWEST 25% WORKERS:
WEEKLY INCOME (IN ₹ ‘000)   0-1   1-2   2-3   3-4   4-6   6-8   8-10   10-11   11-12
NUMBER OF WORKERS           0     10    8     8     12    16    12     14      0
CUMULATIVE FREQUENCY        0     10    18    26    38    54    66     80      80
DETERMINE FIRST QUARTILE: (n/4)th ITEM = 80/4 = 20th ITEM LIES IN INTERVAL 3,000 – 4,000
$Q_1 = l_1 + \frac{\frac{n}{4} - c.f.}{f} \times i = 3000 + \frac{20 - 18}{8} \times 1000 = 3250$, I.E. ₹ 3,250
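THE TWO PREPARATORY STEPS OF PROBLEM 11, CONVERTING THE 'EQUAL TO OR MORE THAN' TABLE INTO CLASS FREQUENCIES AND COUNTING WORKERS BETWEEN TWO LIMITS, CAN BE SKETCHED AS BELOW. THE FUNCTION NAME count_between IS HYPOTHETICAL, INCOMES ARE EXPRESSED IN RUPEES, AND THE EMPTY 0-1 CLASS IS OMITTED. THE COUNT ASSUMES THAT WORKERS ARE SPREAD EVENLY WITHIN EACH CLASS.

```python
lower_limits = [1000, 2000, 3000, 4000, 6000, 8000, 10000, 11000, 12000]
at_least = [80, 70, 62, 54, 42, 26, 14, 0, 0]      # workers earning this much or more

# Consecutive limits form the classes; consecutive differences give the frequencies.
classes = list(zip(lower_limits, lower_limits[1:]))
freqs = [a - b for a, b in zip(at_least, at_least[1:])]


def count_between(classes, freqs, low, high):
    """Estimated number of items between low and high (uniform spread within classes)."""
    total = 0.0
    for (lower, upper), f in zip(classes, freqs):
        overlap = max(0, min(upper, high) - max(lower, low))
        total += f * overlap / (upper - lower)
    return total


print(freqs)
print(round(count_between(classes, freqs, 2400, 10500)))
```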
PROBLEM 12
USE THE FOLLOWING DATA TO COMPUTE THE UPPER AND
LOWER QUARTILES, D2, P5 AND P90.
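MARKS       BELOW 10   10-20   20-40   40-60   60-80   ABOVE 80
FREQUENCY   8          10      22      25      10      5
STEP 1: COMPUTE CUMULATIVE FREQUENCY
MARKS                  BELOW 10   10-20   20-40   40-60   60-80   ABOVE 80
FREQUENCY              8          10      22      25      10      5
CUMULATIVE FREQUENCY   8          18      40      65      75      80
SOLUTION
STEP 2: ASCERTAIN THE CI FOR EACH POSITIONAL VALUE (n = 80):
MARKS                  BELOW 10   10-20   20-40   40-60   60-80   ABOVE 80
FREQUENCY              8          10      22      25      10      5
CUMULATIVE FREQUENCY   8          18      40      65      75      80
POSITIONAL VALUE       P5         D2      Q1      Q3      P90
SOLUTION
• D2: (2n/10) = 16TH ITEM, CI 10-20: D2 = 10 + [(16 - 8)/10] X 10 = 18
• P5: (5n/100) = 4TH ITEM, CI BELOW 10 (TAKEN AS 0-10): P5 = 0 + [(4 - 0)/8] X 10 = 5
• Q1: (n/4) = 20TH ITEM, CI 20-40: Q1 = 20 + [(20 - 18)/22] X 20 = 21.82
• Q3: (3n/4) = 60TH ITEM, CI 40-60: Q3 = 40 + [(60 - 40)/25] X 20 = 56
• P90: (90n/100) = 72ND ITEM, CI 60-80: P90 = 60 + [(72 - 65)/10] X 20 = 74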
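ALL FIVE RESULTS COME FROM THE SAME GENERAL FORMULA, WHICH CAN BE SKETCHED AS ONE FUNCTION. THE NAME partition_value IS AN ASSUMPTION; 'BELOW 10' IS TAKEN AS 0-10 AND 'ABOVE 80' AS 80-100, THE LATTER BEING A PLACEHOLDER THAT NONE OF THE REQUIRED VALUES DEPENDS ON.

```python
def partition_value(classes, freqs, k, parts):
    """kth partition value: l1 + ((k * n / parts - c.f.) / f) * i.

    parts = 4 for quartiles, 10 for deciles, 100 for percentiles.
    """
    target = k * sum(freqs) / parts
    cum = 0                                   # c.f. of the classes before the current one
    for (lower, upper), f in zip(classes, freqs):
        if f > 0 and cum + f >= target:
            return lower + (target - cum) / f * (upper - lower)
        cum += f


classes = [(0, 10), (10, 20), (20, 40), (40, 60), (60, 80), (80, 100)]
freqs = [8, 10, 22, 25, 10, 5]                # Problem 12

q1 = partition_value(classes, freqs, 1, 4)
q3 = partition_value(classes, freqs, 3, 4)
d2 = partition_value(classes, freqs, 2, 10)
p5 = partition_value(classes, freqs, 5, 100)
p90 = partition_value(classes, freqs, 90, 100)
print(round(q1, 2), q3, d2, p5, p90)
```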
MODE
• MODE IS THE VALUE IN THE DATA WHICH OCCURS MOST FREQUENTLY
• IT IS TYPICAL IN THE SENSE THAT IT IS THE MOST PROBABLE VALUE
• IN THE CONTEXT OF A CONTINUOUS FREQUENCY DISTRIBUTION, MODE IS
DEFINED AS THAT VALUE OF THE VARIABLE WHERE THERE IS HIGHEST
CONCENTRATION OF OBSERVATIONS
• IT IS THE VALUE OF THE VARIABLE UNDER CONSIDERATION CORRESPONDING TO
THE HIGHEST ORDINATE OF THE SMOOTH FREQUENCY CURVE
CALCULATION OF MODE
• IN AN INDIVIDUAL SERIES, THE VALUE OF MODE IS THE SIZE OF THE ITEM
REPEATING THE HIGHEST NUMBER OF TIMES
• IN THE FOLLOWING DATA SET:
10, 6, 9, 7, 10, 8, 11, 7, 10, AND 12
THE VALUE 10 APPEARS MORE FREQUENTLY THAN ANY OTHER VALUE
THEREFORE, MODE = 10
• IN CASE OF DISCRETE FREQUENCY DISTRIBUTIONS, THE VALUE OF
MODE IS THE VALUE CORRESPONDING TO WHICH THE FREQUENCY IS
HIGHEST
EXAMPLE,
MARKS 10 14 16 22
FREQUENCY 8 18 40 65
THE VALUE 22 HAS THE HIGHEST FREQUENCY
THEREFORE, MODE = 22 MARKS
CALCULATION OF MODE
• ALGEBRAIC METHOD OF CALCULATION OF MODE:
$Mo = l_1 + \frac{f_1 - f_0}{|2f_1 - f_0 - f_2|} \times i$
• HERE, $l_1$ IS THE LOWER LIMIT OF THE MODAL CLASS,
$f_1$ IS FREQUENCY OF THE MODAL CLASS,
$f_0$ AND $f_2$ ARE RESPECTIVE FREQUENCIES OF THE PRE-MODAL AND POST-MODAL CLASSES, AND
i = WIDTH OF THE MODAL CLASS.
PROBLEM 13
CALCULATE THE VALUE OF MODE FOR THE GIVEN DATA
WEIGHT      10-20   20-30   30-40   40-50   50-60   60-70   70-80   80-90
FREQUENCY   4       12      40      41      27      13      9       4
STEP 1: DETERMINE THE MODAL CLASS:
SINCE THE FREQUENCIES OF 2 CLASS INTERVALS ARE VERY CLOSE, PREPARE THE GROUPING AND CLASSIFICATION TABLES.
GROUPING TABLE
TO DETERMINE THE CI WITH HIGHEST FREQUENCY
(EACH GROUP TOTAL IS SHOWN UNDER THE FIRST CLASS OF THE GROUP)
WEIGHT                      10-20   20-30   30-40   40-50   50-60   60-70   70-80   80-90
I   (FREQUENCY)             4       12      40      41      27      13      9       4
II  (SUMS OF 2, FROM 1ST)   16              81              40              13
III (SUMS OF 2, FROM 2ND)   -       52              68              22              -
IV  (SUMS OF 3, FROM 1ST)   56                      81                      -       -
V   (SUMS OF 3, FROM 2ND)   -       93                      49                      -
VI  (SUMS OF 3, FROM 3RD)   -       -       108                     26
CLASSIFICATION TABLE
TALLY MARKS UNDER THE CI WITH HIGHEST FREQUENCY
CI      10-20   20-30   30-40   40-50   50-60   60-70   70-80   80-90
I                               1
II                      1       1
III                             1       1
IV                              1       1       1
V               1       1       1
VI                      1       1       1
TOTAL   0       1       3       6       3       1       0       0
MODAL CLASS: 40-50 (HIGHEST TALLY TOTAL)
COMPUTATION OF MODE
• ALGEBRAIC METHOD OF CALCULATION OF MODE:
$Mo = l_1 + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times i$
$Mo = 40 + \frac{41 - 40}{2 \times 41 - 40 - 27} \times 10 = 40.67$
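THE GROUPING METHOD AND THE MODE FORMULA CAN BE SKETCHED TOGETHER AS BELOW. THE FUNCTION NAMES grouping_tally AND mode_grouped ARE ASSUMPTIONS; THE TALLY COUNTS, FOR EACH OF THE SIX COLUMNS OF THE GROUPING TABLE, THE CLASSES THAT MAKE UP THE LARGEST GROUP TOTAL.

```python
def grouping_tally(freqs):
    """Tally totals of the classification table built from six grouping columns."""
    n = len(freqs)
    tally = [0] * n
    for size in (1, 2, 3):                     # single values, sums of 2, sums of 3
        for start in range(size):              # groups starting at 1st, 2nd, 3rd class
            groups = [(sum(freqs[i:i + size]), i)
                      for i in range(start, n - size + 1, size)]
            best = max(total for total, _ in groups)
            for total, i in groups:
                if total == best:
                    for j in range(i, i + size):
                        tally[j] += 1
    return tally


def mode_grouped(classes, freqs, modal):
    """Mo = l1 + ((f1 - f0) / |2*f1 - f0 - f2|) * i for the class at index modal."""
    lower, upper = classes[modal]
    f1 = freqs[modal]
    f0 = freqs[modal - 1] if modal > 0 else 0
    f2 = freqs[modal + 1] if modal + 1 < len(freqs) else 0
    return lower + (f1 - f0) / abs(2 * f1 - f0 - f2) * (upper - lower)


classes = [(10, 20), (20, 30), (30, 40), (40, 50),
           (50, 60), (60, 70), (70, 80), (80, 90)]
freqs = [4, 12, 40, 41, 27, 13, 9, 4]          # Problem 13
tally = grouping_tally(freqs)
modal = tally.index(max(tally))                # class with the highest tally total
print(tally, classes[modal], round(mode_grouped(classes, freqs, modal), 2))
```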
EMPIRICAL METHOD
IN CASE A SERIES IS NOT UNIMODAL
• MODE CAN ALSO BE ESTIMATED BY USING MEAN AND MEDIAN
• ACCORDING TO THE EMPIRICAL RULE,
MODE = 3 MEDIAN – 2 MEAN
CHOICE OF AVERAGE
• MOST COMMONLY USED IS ARITHMETIC MEAN
• G.M. AND H.M. TO BE USED WHERE THE LARGER WEIGHTS
ARE TO BE ASSIGNED TO LOWER VALUES AND VICE-VERSA
• MEDIAN TO BE USED IN CASE OF EXTREME VALUES IN A
SERIES AND OPEN ENDED DISTRIBUTION
• MODE IS SUITABLE WHERE THE PURPOSE IS TO DETERMINE
THE MOST TYPICAL VALUE

STAT PPT 1 AVERAGES (useful in analysing)

  • 2. WHAT IS AN AVERAGE? • SINGLE VALUE • USED TO REPRESENT ALL VALUES IN A DATA SERIES • LIES IN BETWEEN THE EXTREME VALUES • ALSO KNOWN AS MEASURE OF CENTRAL TENDENCY
  • 3. WHY DETERMINE THE AVERAGE? • DESCRIBE THE CHARACTERISTICS OF A SERIES • DRAW STATISTICAL INFERENCES • COMPARE TWO OR MORE SERIES • FACILITATE IMPORTANT DECISIONS
  • 4. IDEAL CHARACTERISTICS OF AN AVERAGE • SIMPLE TO COMPUTE • CLOSE REPRESENTATION • OBJECTIVELY DETERMINED • CAPABLE OF FURTHER ALGAEBRIC OPERATIONS
  • 6. MATHEMATICAL AVERAGES CALCULATED USING AN ALGAEBRIC FORMAULA BASED ON ALL VALUES OF THE DATA SERIES
  • 7. ARITHMETIC MEAN • DEFINED AS SUM OF ALL OBSERVATIONS DIVIDED BY THE TOTAL NUMBER OF OBSERVATIONS • SIMPLEST FORM OF AVERAGE • MOST POPULARLY USED • CAN BE COMPUTED FOR INDIVIDUAL/DISCRETE AND CONTINUOUS SERIES
  • 8. ROLL NO. MARKS (X) DEVIATIONS (X-A) 1 5 -20 2 15 -10 3 25 0 4 35 10 5 45 20 6 55 30 TOTAL 180 30 MEAN CALCULATION USING DIRECT METHOD: ҧ 𝑥 = ෍ 𝑥 𝑛 = 180/6 = 30 MEAN CALCULATION USING ASSUMED MEAN METHOD: ҧ 𝑥 = 𝐴 + ൗ σ 𝑑 𝑛 = 25 + 30/6 =25 + 5 =30 = A EXAMPLE 1: INDIVIDUAL SERIES
  • 9. DISCRETE SERIES FEW DATA POINTS ALONG WITH FREQUENCY • DIRECT METHOD ҧ 𝑥 = ൗ 𝛴𝑓𝑥 𝛴𝑓 WHERE, ҧ 𝑥 IS A.M. σ 𝑓𝑋 IS SUM OF PRODUCT OF ALL OBSERVATIONS WITH THE FREQUENCY 𝛴𝑓 IS THE TOTAL NUMBER OF OBSERVATIONS • ASSUMED MEAN METHOD ҧ 𝑥 = 𝐴 + ൘ σ f𝑑 𝛴𝑓 WHERE, ҧ 𝑥 IS A.M. A IS THE ASSUMED MEAN VALUE σ f𝑑 IS SUM OF THE PRODUCT OF FREQUENCY AND DEVIATIONS FROM THE ASSUMED MEAN VALUE 𝛴𝑓 IS THE TOTAL NUMBER OF OBSERVATIONS
  • 10. MARKS (X) FREQUEN CY (f) DEVIATIONS d = (X-A) fx fd 5 10 -20 50 -200 15 20 -10 300 -200 25 30 0 750 0 35 50 10 1750 500 45 40 20 1800 800 55 30 30 1650 900 180 6300 1800 MEAN CALCULATION USING DIRECT METHOD: ҧ 𝑥 = ൗ 𝛴𝑓𝑥 𝛴𝑓 = 6300/180 = 35 MEAN CALCULATION USING ASSUMED MEAN METHOD: ҧ 𝑥 = 𝐴 + ൘ σ f𝑑 𝛴𝑓 = 25 + 1800/180 =25 + 10 =35 = A EXAMPLE 2: DISCRETE SERIES
  • 11. CONTINUOUS SERIES CLASS INTERVAL AND FREQUENCY IS GIVEN • DIRECT METHOD ҧ 𝑥 = ൗ 𝛴𝑓𝑚 𝛴𝑓 WHERE, ҧ 𝑥 IS A.M. σ 𝑓𝑚 IS SUM OF PRODUCT OF MIDPOINT OF CLASS INTERVAL WITH THE FREQUENCY 𝛴𝑓 IS THE TOTAL NUMBER OF OBSERVATIONS • ASSUMED MEAN METHOD ҧ 𝑥 = 𝐴 + ൘ σ f𝑑 𝛴𝑓 WHERE, ҧ 𝑥 IS A.M. A IS THE ASSUMED MEAN VALUE σ f𝑑 IS SUM OF THE PRODUCT OF FREQUENCY AND DEVIATIONS FROM THE ASSUMED MEAN VALUE 𝛴𝑓 IS THE TOTAL NUMBER OF OBSERVATIONS
  • 12. • STEP DEVIATION METHOD ҧ 𝑥 = 𝐴 + ൘ {σ f𝑑′ 𝛴𝑓} 𝑥 𝑖 WHERE, ҧ 𝑥 IS A.M. A IS THE ASSUMED MEAN VALUE σ f𝑑’ IS SUM OF THE PRODUCT OF FREQUENCY AND DEVIATIONS FROM THE ASSUMED MEAN VALUE 𝛴𝑓 IS THE TOTAL NUMBER OF OBSERVATIONS i IS THE CLASS SIZE CONTINUOUS SERIES CLASS INTERVAL AND FREQUENCY IS GIVEN
  • 13. EXAMPLE 3 CLASS INTERVAL MIDPOINT (m) FREQUENCY (f) DEVIATIONS d = (m-A) d’ = (m-A)/i fm fd fd’ 0-10 5 10 -20 -2 50 -200 -20 10-20 15 20 -10 -1 300 -200 -20 20-30 25 30 0 0 750 0 0 30-40 35 50 10 1 1750 500 50 40-50 45 40 20 2 1800 800 80 50-60 55 30 30 3 1650 900 90 180 6300 1800 180 = A
  • 14. SOLUTION USING STEP DEVIATION METHOD • STEP DEVIATION METHOD ҧ 𝑥 = 𝐴 + ൘ {σ f𝑑′ 𝛴𝑓} 𝑥 𝑖 = 25 + [180/180] x 10 = 35
  • 15. OBSERVATIONS • USING THE SAME DATA, WE GET SAME ANSWER WITH ALL THE METHODS • THIS IS THE OBJECTIVITY TRAIT OF THE METHOD • TYPE OF SERIES HAS AN EFFECT ON CHOICE OF METHOD • USE ANY METHOD UNLESS SPECIFIED • IN CASE OF CONTINUOUS SERIES, STEP DEVIATION METHOD IS MOST APPROPRIATE
  • 16. PROBLEM 1 DAILY EXPENSE FREQUENCY 40-59 50 60-79 ? 80-99 500 100-119 ? 120- 139 50 FIND THE MISSING FREQUENCY FROM THE GIVEN DATA OF 1000 FAMILIES: THE VALUE OF MEAN IS 87.5.
  • 17. PROBLEM 1 CALCULATION OF MEAN DAILY EXPENSE CLASS INTERVAL FREQUENCY MIDPOINT (m) d’ = (m – A)/i fd’ 40-59 39.5 – 59.5 50 49.5 -2 - 100 60-79 59.5 – 79.5 X 69.5 -1 - X 80-99 79.5 – 99.5 500 A = 89.5 0 0 100-119 99.5 – 119.5 400 – X 109.5 1 400 - X 120- 139 119.5 – 139.5 50 129.5 2 100 TOTAL = 1000 400 – 2X
  • 18. SOLUTION • STEP DEVIATION METHOD ҧ 𝑥 = 𝐴 + ൘ {σ f𝑑′ 𝛴𝑓} 𝑥 𝑖 87.5 = 89.5 + [400 -2X/1000] x 20 X = 250 400 – X = 150 DAILY EXPENSE FREQUENCY 40-59 50 60-79 250 80-99 500 100-119 150 120- 139 50
  • 19. PROBLEM 2 THE SUM OF DEVIATIONS OF A CERTAIN NUMBER OF OBSERVATIONS MEASURED FROM 4 IS 72 AND THE SUM OF DEVIATIONS OF SAME OBSERVATIONS MEASURED FROM 7 IS -3. FIND THE NUMBER OF OBSERVATIONS AND THEIR MEAN.
  • 20. SOLUTION THE SUM OF DEVIATIONS IS EXPLAINED AS: ෍ 𝑑 = ෍(𝑋 − 𝐴) ACCORDINGLY, σ 𝑋 − 4 = 72 𝐴𝑁𝐷 σ 𝑋 − 7 = −3 LET n BE THE NUMBER OF OBSERVATIONS AND ҧ 𝑥 BE THE MEAN OF THE SERIES VALUE CAN BE DETERMINED BY FORMING THE EQUATION
  • 21. SOLUTION EQUATION 1 ҧ 𝑥 = 𝐴 + ൗ σ 𝑑 𝑛 ҧ 𝑥 = 4 + ൗ 72 𝑛 n ҧ 𝑥 − 4n = 72 EQUATION 2 ҧ 𝑥 = 𝐴 + ൗ σ 𝑑 𝑛 ҧ 𝑥 = 7 + ൗ − 3 𝑛 n ҧ 𝑥 − 7n = - 3 SOLVING THE TWO EQUATIONS SIMULTANEOUSLY, n = 25 AND ത 𝐱 = 6.88
  • 22. MATHEMATICAL PROPERTIES OF A.M. • 𝛴 𝑥 − ഥ 𝑥 = 0 • 𝛴 𝑥 − ഥ 𝑥 2 IS MINIMUM • IF EACH VALUE OF THE DATA SERIES IS CHANGED (+/ - / x / ÷) BY A CONSTANT, THE MEAN VALUE UNDERGOES A SIMILAR CHANGE • CALCULATION OF COMBINED AVERAGE COMBINED MEAN = N1X1 + N2X2/ N1 +N2
  • 23. PROBLEM 3 THE AVERAGE MONTHLY WAGE OF ALL WORKERS IN A FACTORY IS Rs. 444. IF THE AVERAGE WAGES PAID TO MALE AND FEMALE ARE Rs. 480 AND Rs. 360 RESPECTIVELY, FIND THE PERCENTAGE OF MALE AND FEMALE WORKERS IN THE FACTORY. SOLUTION: X1 = 480 ; X2 = 360; X12 = 444 LET THE NUMBER OF MALE AND FEMALE WORKERS BE N1 AND N2 RESPECTIVELY
  • 24. SOLUTION COMBINED MEAN = N1X1 + N2X2/ N1 +N2 444 = 480N1 + 360N2/ N1 +N2 36N1 - 84N2 = 0 N1/ N2 =84/36 N1: N2 =7 : 3 THUS, THE PERCENTAGE OF MALE AND FEMALE WORKERS IN THE FACTORY IS 70% AND 30% RESPECTIVELY.
  • 25. PROBLEM 4 THE MEAN MARKS OF 99 STUDENTS WAS FOUND TO BE 60. LATER, IT IS DISCOVERED THAT A SCORE OF 35 WAS MISREAD AS 53, ANOTHER SCORE WAS TAKEN AS 63 INSTEAD OF 36 AND SCORE OF 5 WAS NOT TAKEN INTO ACCOUNT. FIND THE CORRECT MEAN. n = 99 ; X = 60 ACTUAL WRONG DIFFERENCE CHANGE IN n 35 53 -18 -- 36 63 - 27 -- 5 0 + 5 + 1 TOTAL = - 40 +1
  • 26. SOLUTION • SUM OF ALL OBSERVATIONS = n X MEAN = 99 X 60 = 5940 • CHANGE IN n = 99 +1 = 100 • CORRECT MEAN = 5940 -40 / 100 = 59
  • 27. WEIGHTED A.M. • DIFFERENT WEIGHTS ( RELATIVE IMPORTANCE) • ALL THE DATA IN A SERIES VALUES ARE ASSIGNED WEIGHTS • USEFUL IN CONSTRUCTION OF INDEX NUMBERS; STANDARDISED RATES LIKE BIRTH RATE
  • 28. FORMULA AND CALCULATION 1.MULTIPLY THE NUMBERS IN THE GIVEN DATA SET BY THEIR RESPECTIVE WEIGHTS 2.SUM UP THE PRODUCTS 3.DIVIDE THE SUM OF THE PRODUCTS OBTAINED BY THE TOTAL OF THE WEIGHTS ഥ 𝒙𝒘 = σ 𝒘𝒙 𝜮𝒘
  • 29. PROBLEM 5 • A student takes three 100-point exams in statistics and score 80, 80 and 95. The last exam is much easier than the first two, so the teacher has assigned it less weight. The weights for the three exams are: • Exam 1: 40 % of grade obtained. • Exam 2: 40 % of grade obtained. • Exam 3: 20 % of grade obtained. • What is the student’s final weighted average for statistics?
  • 30. SOLUTION SCORE WEIGHTS PRODUCT 80 0.4 32 80 0.4 32 95 0.2 19 1.0 83 ഥ 𝒙𝒘 = σ 𝒘𝒙 𝜮𝒘 ഥ 𝒙𝒘 = 𝟖𝟑 𝟏 THUS, THE FINAL GRADE OF THE STUDENT IS 83.
  • 31. LIMITATIONS OF A.M. • UNDULY AFFECTED BY EXTREME VALUES • IN OPEN DISTRIBUTION A.M. MAY NOT BE THE APPROPRIATE MEASURE • NON-HOMOGENOUS DATA, AVERAGE MAY GIVE MISLEADING CONCLUSION
  • 32. GEOMETRIC MEAN • DEFINED AS THE nth OF THE PRODUCT OF ALL THE n ITEMS IN A SERIES • USED TO MEASURE THE AVERAGE CHANGE IN A VARIABLE OVERA PERIOD OF TIME • G.M. = 𝑥1 ⋅ 𝑥2 −−−−−− ⋅ 𝑥𝑛
  • 33. USE OF LOG AND ANTILOG • TO SIMPLIFY THE CALCULATION OF G.M. WE USE LOG AND ANTILOG • INDIVIDUAL SERIES: G.M. = A.L. (𝛴log X)/n • DISCRETE SERIES: G.M. = A.L. (𝛴𝑓x log X)/n • CONTINUOUS SERIES: G.M. = A.L. (𝛴 𝑓 x log m)/n
  • 34. PROBLEM 6 THE ANNUAL RATE OF DEPRECIATION OF A MACHINE IS 20% IN YEAR 1, 15% IN YEAR 2, 10% IN YEAR 3 AND 5% IN YEAR 4. CALCULATE THE AVERAGE ANNUAL RATE OF DEPRECIATION. TABLE (YEAR, RATE OF DEPRECIATION IN %, VALUE REMAINING PER 100 AT START OF YEAR (X), log X): (I, 20, 80, 1.9031); (II, 15, 85, 1.9294); (III, 10, 90, 1.9542); (IV, 5, 95, 1.9777). TOTAL: $\sum \log X$ = 7.7644
  • 35. SOLUTION INDIVIDUAL SERIES: G.M. = A.L. [($\sum \log X$)/n] = A.L. (7.7644/4) = A.L. (1.9411) = 87.32, I.E. ON AVERAGE 87.32% OF THE VALUE IS RETAINED EACH YEAR. AVERAGE ANNUAL DEPRECIATION = 100 − 87.32 = 12.68%
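The log and antilog route to the G.M. in Problem 6 can be reproduced with a short Python sketch. The function name `geometric_mean` is chosen here for illustration; base-10 logs are used to mirror the log tables on the slide.

```python
import math


def geometric_mean(values):
    """G.M. as antilog of the mean of base-10 logs (individual series)."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs a non-empty list of positive values")
    mean_log = sum(math.log10(v) for v in values) / len(values)
    return 10 ** mean_log                            # antilog


retained = [80, 85, 90, 95]                          # value left per 100 after each year
gm = geometric_mean(retained)
print(round(gm, 2))                                  # 87.32
print(round(100 - gm, 2))                            # 12.68 (% average depreciation)
```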
  • 36. HARMONIC MEAN • DEFINED AS THE RECIPROCAL OF THE A.M. OF THE RECIPROCALS OF THE ITEMS IN A SERIES • USEFUL TO ASSIGN LARGER WEIGHTS TO SMALLER VALUES • INDIVIDUAL SERIES: H.M. = n ÷ $\sum (1/X)$ • DISCRETE SERIES: H.M. = $\sum f$ ÷ $\sum (f/X)$ • CONTINUOUS SERIES: H.M. = $\sum f$ ÷ $\sum (f/m)$
  • 37. PROBLEM 7 IN A CERTAIN OFFICE A LETTER IS TYPED BY A IN 4 MINUTES. THE SAME LETTER IS TYPED BY B, C, D IN 5, 6 AND 10 MINUTES RESPECTIVELY. WHAT IS THE AVERAGE TIME TAKEN IN COMPLETING A LETTER? HOW MANY LETTERS CAN BE TYPED IN A WORKING DAY COMPRISING 8 HOURS?
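  • 38. SOLUTION TABLE (EMPLOYEE, TIME TAKEN IN MINUTES (X), 1/X): (A, 4, 0.250); (B, 5, 0.200); (C, 6, 0.167); (D, 10, 0.100). TOTAL: $\sum (1/X)$ = 0.717. (i) H.M. = n ÷ $\sum (1/X)$, n = 4: H.M. = 4/0.717 = 5.58 MINUTES PER LETTER (ii) NO. OF LETTERS TYPED IN AN 8-HOUR DAY AT THIS AVERAGE RATE = (8 × 60)/5.58 = 86 LETTERS PER TYPIST (APPROXIMATELY)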
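A minimal Python sketch of the H.M. calculation for Problem 7 follows. The name `harmonic_mean` is illustrative, and the letters-per-day figure is read as output per typist at the average rate, which is an interpretation of the slide.

```python
def harmonic_mean(values):
    """H.M. = n / sum(1/x) for an individual series of positive values."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("harmonic mean needs a non-empty list of positive values")
    return len(values) / sum(1 / v for v in values)


minutes_per_letter = [4, 5, 6, 10]                   # typists A, B, C, D
hm = harmonic_mean(minutes_per_letter)
print(round(hm, 2))                                  # 5.58 minutes per letter
print(round(8 * 60 / hm))                            # 86 letters in an 8-hour day
```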
  • 39. RELATIONSHIP BETWEEN A.M., G.M. AND H.M. • FOR ANY SET OF UNEQUAL POSITIVE NUMBERS, THE FOLLOWING RELATIONSHIP HOLDS: A.M. > G.M. > H.M. • IF ALL VALUES IN THE DATA SET ARE EQUAL, THEN THE THREE AVERAGES ARE ALSO EQUAL • FOR ANY TWO POSITIVE NUMBERS, THE GEOMETRIC MEAN IS EQUAL TO THE GEOMETRIC MEAN OF THEIR ARITHMETIC AND HARMONIC MEANS, I.E. G.M.² = A.M. × H.M.
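For two positive numbers $a$ and $b$, the last relationship above can be checked directly. The short derivation below is an added illustration, with $A$, $G$ and $H$ standing for A.M., G.M. and H.M.

```latex
\[
A = \frac{a+b}{2}, \qquad
G = \sqrt{ab}, \qquad
H = \frac{2}{\frac{1}{a}+\frac{1}{b}} = \frac{2ab}{a+b}
\]
\[
A \cdot H = \frac{a+b}{2} \cdot \frac{2ab}{a+b} = ab = G^{2}
\quad\Longrightarrow\quad
G = \sqrt{A \cdot H}
\]
```

For example, with $a = 4$ and $b = 9$: A.M. = 6.5, G.M. = 6 and H.M. = 72/13 ≈ 5.54, so A.M. > G.M. > H.M. and 6.5 × 72/13 = 36 = 6².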
  • 40. POSITIONAL AVERAGES • REPRESENT THE DATA SERIES • NOT BASED ON ALL ITEMS OF THE SERIES • CONSIDERED AS AVERAGES DUE TO A SPECIFIC PROPERTY/POSITION
  • 41. MATHEMATICAL vs. POSITIONAL AVERAGES • THE POSITIONAL AVERAGES ARE NOT BASED ON ALL OBSERVATIONS BUT THE MATHEMATICAL AVERAGES ARE BASED ON THE WHOLE DATA SET • COMPUTATION OF MATHEMATICAL AVERAGES CALLS FOR APPLICATION OF WELL-DEFINED FORMULAE WHILE POSITIONAL AVERAGES CAN BE OBTAINED BY MERE INSPECTION (IN RAW DATA AND IN DISCRETE FREQUENCY DISTRIBUTIONS) • WHILE POSITIONAL AVERAGES CAN BE OBTAINED BY USING GRAPHS (IN CASE OF CONTINUOUS FREQUENCY DISTRIBUTIONS), THE MATHEMATICAL AVERAGES NEED TO BE CALCULATED
  • 42. MATHEMATICAL vs. POSITIONAL AVERAGES • THE POSITIONAL AVERAGES CAN BE CALCULATED IN OPEN-ENDED DISTRIBUTIONS WHILE THE MATHEMATICAL AVERAGES CANNOT BE • ALGEBRAIC MANIPULATION OF POSITIONAL AVERAGES IS NOT POSSIBLE. ON THE OTHER HAND, THE MATHEMATICAL AVERAGES ARE AMENABLE TO FURTHER ALGEBRAIC TREATMENT AS THEY HAVE WELL-DEFINED MATHEMATICAL PROPERTIES • THE MATHEMATICAL AVERAGES CAN BE DISTINGUISHED AS BEING SIMPLE AND WEIGHTED. THERE IS NO QUESTION OF ANY POSITIONAL AVERAGES BEING ‘WEIGHTED’.
  • 43. MATHEMATICAL vs. POSITIONAL AVERAGES • IN COMPARISON TO MATHEMATICAL AVERAGES, THE POSITIONAL AVERAGES ARE AFFECTED TO A LARGER EXTENT BY FLUCTUATIONS OF SAMPLING • SAMPLING FLUCTUATIONS MEAN FLUCTUATIONS IN THE VALUES OF AN AVERAGE OBTAINED FROM DIFFERENT SAMPLES OF SAME SIZE TAKEN FROM A GIVEN POPULATION
  • 44. MEDIAN • MIDDLE VALUE OF A DISTRIBUTION • 50% OF THE VALUES LIE BELOW THE MEDIAN AND 50% LIE ABOVE IT • MEDIAN SPLITS THE DATA SERIES INTO TWO EQUAL HALVES
  • 45. FORMULA • INDIVIDUAL SERIES (n IS ODD) (i) ARRANGE ALL THE ITEMS IN THE SERIES IN ASCENDING OR DESCENDING ORDER (ii) THE MEDIAN POSITION IS CALCULATED AS FOLLOWS: MEDIAN POSITION = [(n + 1)/2]TH ITEM, WHERE n IS THE TOTAL NUMBER OF OBSERVATIONS. THUS, THE MIDDLE VALUE OF THE ARRANGED DATA SET IS THE VALUE OF THE MEDIAN • INDIVIDUAL SERIES (n IS EVEN) (i) ARRANGE ALL THE ITEMS IN THE SERIES IN ASCENDING OR DESCENDING ORDER (ii) THE MEDIAN POSITION IS CALCULATED AS FOLLOWS: MEDIAN POSITION = [(n + 1)/2]TH ITEM, WHERE n IS THE TOTAL NUMBER OF OBSERVATIONS. THIS POSITION FALLS BETWEEN TWO ITEMS, SO MEDIAN VALUE = MEAN OF THE (n/2)TH AND [(n/2) + 1]TH ITEMS
  • 46. PROBLEM 8 • FIND THE MEDIAN OF THE DAILY WAGES OF 7 WORKERS: 1000, 1500, 800, 900, 1600, 2000, 1400. ALSO, FIND THE MEDIAN IF ANOTHER WORKER WITH A DAILY WAGE OF ₹ 984 IS ADDED TO THE ABOVE DISTRIBUTION.
  • 47. SOLUTION • ARRANGE ALL OBSERVATIONS IN ASCENDING ORDER: 800, 900, 1000, 1400, 1500, 1600, 2000 • Me position = [(n + 1)/2]th item = (7 + 1)/2 = 4th item, i.e. Me = 1400
  • 48. SOLUTION • ARRANGE ALL OBSERVATIONS IN ASCENDING ORDER: 800, 900, 984, 1000, 1400, 1500, 1600, 2000 • Me position = [(n + 1)/2]th item = (8 + 1)/2 = 4.5th item • Me = Mean of 4th and 5th items of the series • Me = (1000 + 1400)/2 = 1200
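The odd and even cases of the median for an individual series, as worked in Problem 8, can be sketched in Python as follows. The function name `median` is chosen here for illustration; Python's standard `statistics.median` behaves the same way.

```python
def median(values):
    """Median of an individual series: middle item, or mean of the two middle items."""
    ordered = sorted(values)                         # arrange in ascending order
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty series is undefined")
    mid = n // 2
    if n % 2 == 1:                                   # odd n: the [(n+1)/2]th item
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2     # even n: mean of (n/2)th and next


wages = [1000, 1500, 800, 900, 1600, 2000, 1400]
print(median(wages))                                 # 1400
print(median(wages + [984]))                         # 1200.0
```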
  • 49. DISCRETE FREQUENCY DISTRIBUTION • MEDIAN IS GIVEN BY THE VALUE OF THE [(n+1)/2]TH ITEM, WHERE n IS THE TOTAL FREQUENCY • TO DETERMINE THE MEDIAN, THE ‘LESS THAN’ CUMULATIVE FREQUENCIES ARE OBTAINED • LOCATE THE VALUE OF THIS ITEM WITH REFERENCE TO THE CUMULATIVE FREQUENCIES COLUMN • FOR THIS, START FROM THE TOP AND MOVE DOWNWARD AS LONG AS (n+1)/2 EXCEEDS THE CUMULATIVE FREQUENCY, AND STOP AT THE FIRST VALUE WHOSE CUMULATIVE FREQUENCY EQUALS OR EXCEEDS (n+1)/2; THAT VALUE IS THE MEDIAN.
  • 50. PROBLEM 9 FIND THE MEDIAN OF THE FOLLOWING DISTRIBUTION OF MARKS OF STUDENTS: MARKS: 100, 150, 80, 200, 250, 180; NUMBER OF STUDENTS: 24, 26, 16, 20, 6, 30. 1. ARRANGE THE ABOVE DISTRIBUTION IN ASCENDING ORDER: MARKS: 80, 100, 150, 180, 200, 250; NUMBER OF STUDENTS: 16, 24, 26, 30, 20, 6
  • 51. SOLUTION 2. CALCULATE CUMULATIVE FREQUENCY: MARKS: 80, 100, 150, 180, 200, 250; NUMBER OF STUDENTS: 16, 24, 26, 30, 20, 6; CUMULATIVE FREQUENCY: 16, 40, 66, 96, 116, 122. MEDIAN IS GIVEN BY THE VALUE OF THE [(n+1)/2]TH ITEM. Me POSITION = (122 + 1)/2 = 61.5TH ITEM. THE FIRST CUMULATIVE FREQUENCY TO REACH 61.5 IS 66, SO THE 61.5TH ITEM IS 150, I.E. Me = 150
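The cumulative-frequency search for the median of a discrete distribution can be sketched as below, using Problem 9. The name `discrete_median` is hypothetical. When (n+1)/2 is fractional the sketch averages the two neighbouring items, which is an assumption added here; in Problem 9 both neighbours are the same value.

```python
import math
from itertools import accumulate


def discrete_median(values, freqs):
    """Median of a discrete frequency distribution via 'less than' cumulative frequencies."""
    pairs = sorted(zip(values, freqs))               # ascending order of the variable
    cum_freqs = list(accumulate(f for _, f in pairs))
    n = cum_freqs[-1]                                # total frequency

    def item_at(k):
        # first value whose cumulative frequency reaches the k-th item
        for (value, _), cf in zip(pairs, cum_freqs):
            if cf >= k:
                return value

    position = (n + 1) / 2
    return (item_at(math.floor(position)) + item_at(math.ceil(position))) / 2


marks = [100, 150, 80, 200, 250, 180]
students = [24, 26, 16, 20, 6, 30]
print(discrete_median(marks, students))              # 150.0
```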
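  • 52. ALGEBRAIC METHOD (USED IN CONTINUOUS DISTRIBUTION) • TO OBTAIN THE VALUE OF MEDIAN, FIND THE CUMULATIVE FREQUENCIES • CALCULATE n/2 AND THEN, WITH REFERENCE TO THE CUMULATIVE FREQUENCIES COLUMN, DETERMINE THE CLASS INTERVAL IN WHICH THE MEDIAN FALLS. • THIS IS TERMED THE MEDIAN CLASS. THE FOLLOWING FORMULA IS USED: $Me = l_1 + \frac{\frac{n}{2} - c.f.}{f} \times i$, WHERE $l_1$ IS THE LOWER LIMIT OF THE MEDIAN CLASS, c.f. IS THE CUMULATIVE FREQUENCY OF THE CLASS PRECEDING THE MEDIAN CLASS, f IS THE FREQUENCY OF THE MEDIAN CLASS AND i IS THE WIDTH OF THE MEDIAN CLASS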
  • 53. PROBLEM 10 THE FOLLOWING IS THE DISTRIBUTION OF A SAMPLE OF 200 COMPANIES ACCORDING TO THE PROFITS EARNED BY THEM IN THE LAST YEAR. DETERMINE MEDIAN PROFIT FROM THE DISTRIBUTION. PROFITS (IN ₹ LAKH): 0-25, 25-50, 50-75, 75-100; NUMBER OF COMPANIES: 30, 50, 80, 40
  • 54. SOLUTION 1. COMPUTE THE CUMULATIVE FREQUENCY: PROFITS (IN ₹ LAKH): 0-25, 25-50, 50-75, 75-100; NUMBER OF COMPANIES: 30, 50, 80, 40; CUMULATIVE FREQUENCY: 30, 80, 160, 200. 2. DETERMINE THE MEDIAN CLASS AS THE CLASS INTERVAL IN WHICH THE (n/2)th ITEM LIES: n/2 = 100, SO THE MEDIAN CLASS IS 50-75. 3. $Me = l_1 + \frac{\frac{n}{2} - c.f.}{f} \times i = 50 + \frac{100 - 80}{80} \times 25 = 56.25$
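The interpolation formula for the median of a continuous distribution can be sketched in Python as follows, using Problem 10. The function name `grouped_median` and the `(lower, upper)` tuple layout are assumptions made for this illustration.

```python
def grouped_median(classes, freqs):
    """Median of a continuous distribution: l1 + ((n/2 - c.f.) / f) * i.

    classes: list of (lower, upper) limits in ascending order
    freqs:   class frequencies in the same order
    """
    target = sum(freqs) / 2                          # the (n/2)th item
    cum_before = 0                                   # c.f. of the preceding classes
    for (lower, upper), f in zip(classes, freqs):
        if f > 0 and cum_before + f >= target:       # median class found
            return lower + (target - cum_before) * (upper - lower) / f
        cum_before += f
    raise ValueError("distribution has no observations")


profit_classes = [(0, 25), (25, 50), (50, 75), (75, 100)]
companies = [30, 50, 80, 40]
print(grouped_median(profit_classes, companies))     # 56.25
```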
  • 55. PROPERTIES OF MEDIAN • THE SUM OF ABSOLUTE DEVIATIONS FROM MEDIAN IS THE MINIMUM. • THAT IS TO SAY, THE AGGREGATE ABSOLUTE DEVIATIONS FROM A POINT OTHER THAN MEDIAN CANNOT BE SMALLER THAN THE CORRESPONDING AGGREGATE FROM MEDIAN OF THE OBSERVATIONS. • SYMBOLICALLY, Σ|X –Me| IS MINIMUM
  • 56. OTHER POSITIONAL VALUES • MEDIAN IS THE VALUE THAT DIVIDES A DISTRIBUTION INTO TWO EQUAL PARTS • THERE ARE OTHER MEASURES AS WELL WHICH ARE USED TO DIVIDE THE DISTRIBUTION INTO MORE THAN TWO PARTS. THESE MEASURES, INCLUDING MEDIAN, ARE KNOWN AS PARTITION VALUES • THE VALUES THAT DIVIDE A SERIES OR DISTRIBUTION INTO FOUR EQUAL PARTS ARE TERMED QUARTILES • MEASURES THAT DIVIDE A SERIES INTO 10 AND 100 EQUAL PARTS ARE CALLED, RESPECTIVELY, DECILES AND PERCENTILES.
  • 57. OTHER POSITIONAL VALUES • FOR A DISTRIBUTION, THERE ARE THREE QUARTILES, NINE DECILES AND NINETY-NINE PERCENTILES • IT MAY BE NOTED THAT THE VALUES OF THE SECOND QUARTILE, FIFTH DECILE AND 50TH PERCENTILE ARE IDENTICAL TO THE MEDIAN VALUE. • THE FIRST AND THIRD QUARTILES OF A DISTRIBUTION ARE RESPECTIVELY CALLED THE LOWER QUARTILE AND THE UPPER QUARTILE. • THE PARTITION VALUES REFLECT HOW THE VALUES IN DATA ARE DISTRIBUTED AND THEY HELP TO LOCATE THE RELATIVE POSITION OF A CERTAIN VALUE IN A GIVEN SERIES/DISTRIBUTION.
  • 58. CALCULATION OF POSITIONAL VALUES • THE PRINCIPLES AND METHODOLOGY USED FOR COMPUTING VARIOUS PARTITION VALUES ARE THE SAME AS FOR COMPUTATION OF MEDIAN. • AS IN THE CASE OF MEDIAN, THE INDIVIDUAL OBSERVATIONS ARE ARRANGED IN ASCENDING ORDER OF MAGNITUDE. • THEN n IS DIVIDED BY 4, 10 OR 100 ACCORDING AS QUARTILES, DECILES OR PERCENTILES ARE REQUIRED, AND MULTIPLIED BY k, WHERE THE kTH QUARTILE, DECILE OR PERCENTILE IS DESIRED.
  • 59. CALCULATION OF POSITIONAL VALUES • THE FORMULAE, IN GENERAL FORM, FOR OBTAINING VARIOUS PARTITION VALUES ARE: • QUARTILE: $Q_k = l_1 + \frac{\frac{kn}{4} - c.f.}{f} \times i$ • DECILE: $D_k = l_1 + \frac{\frac{kn}{10} - c.f.}{f} \times i$ • PERCENTILE: $P_k = l_1 + \frac{\frac{kn}{100} - c.f.}{f} \times i$
  • 60. PROBLEM 11 DETERMINE THE MEDIAN FOR THE FOLLOWING DISTRIBUTION OF WEEKLY INCOME IN AN OFFICE: WEEKLY INCOME (IN ₹ ‘000) (EQUAL TO OR MORE THAN): 12, 11, 10, 8, 6, 4, 3, 2, 1; NUMBER OF WORKERS: 0, 0, 14, 26, 42, 54, 62, 70, 80. ALSO CALCULATE: • NUMBER OF WORKERS WITH WEEKLY INCOME BETWEEN ₹ 2,400 AND ₹ 10,500 • MAXIMUM INCOME OF THE LOWEST 25% WORKERS
  • 61. SOLUTION REWRITE THE GIVEN TABLE IN THE FORM OF CLASS INTERVALS: WEEKLY INCOME (IN ₹ ‘000): 0-1, 1-2, 2-3, 3-4, 4-6, 6-8, 8-10, 10-11, 11-12; NUMBER OF WORKERS: 0, 10, 8, 8, 12, 16, 12, 14, 0; CUMULATIVE FREQUENCY: 0, 10, 18, 26, 38, 54, 66, 80, 80. n/2 = 80/2 = 40, SO THE MEDIAN CLASS IS 6,000 – 8,000. $Me = l_1 + \frac{\frac{n}{2} - c.f.}{f} \times i = 6000 + \frac{40 - 38}{16} \times 2000$ = ₹ 6,250
  • 62. SOLUTION TO DETERMINE NUMBER OF WORKERS BETWEEN ₹ 2,400 AND ₹ 10,500: WEEKLY INCOME (IN ₹ ‘000): 0-1, 1-2, 2-3, 3-4, 4-6, 6-8, 8-10, 10-11, 11-12; NUMBER OF WORKERS: 0, 10, 8, 8, 12, 16, 12, 14, 0; CUMULATIVE FREQUENCY: 0, 10, 18, 26, 38, 54, 66, 80, 80. ADD THE FREQUENCIES BETWEEN THE GIVEN LIMITS, TAKING A PROPORTIONATE SHARE OF THE PARTLY COVERED CLASSES 2-3 AND 10-11: 8 × (600/1000) + 8 + 12 + 16 + 12 + 14 × (500/1000) = 4.8 + 48 + 7 = 59.8, I.E. 60 WORKERS APPROXIMATELY
  • 63. SOLUTION TO DETERMINE MAXIMUM INCOME OF THE LOWEST 25% WORKERS: WEEKLY INCOME (IN ₹ ‘000): 0-1, 1-2, 2-3, 3-4, 4-6, 6-8, 8-10, 10-11, 11-12; NUMBER OF WORKERS: 0, 10, 8, 8, 12, 16, 12, 14, 0; CUMULATIVE FREQUENCY: 0, 10, 18, 26, 38, 54, 66, 80, 80. DETERMINE THE FIRST QUARTILE: (n/4)th ITEM = 80/4 = 20th ITEM, WHICH LIES IN THE INTERVAL 3,000 – 4,000. $Q_1 = l_1 + \frac{\frac{n}{4} - c.f.}{f} \times i = 3000 + \frac{20 - 18}{8} \times 1000$ = ₹ 3,250
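Two steps of Problem 11 lend themselves to a short Python sketch: turning the 'equal to or more than' table into class frequencies, and estimating the number of workers between two income limits. The function name `count_between` is hypothetical, and the estimate assumes observations are spread evenly within each class, as the slide's proportional shares imply.

```python
def count_between(classes, freqs, low, high):
    """Estimate how many observations fall between low and high,
    assuming an even spread of observations inside each class."""
    total = 0.0
    for (lower, upper), f in zip(classes, freqs):
        overlap = min(upper, high) - max(lower, low)
        if overlap > 0:
            total += f * overlap / (upper - lower)   # proportionate share of the class
    return total


# 'Equal to or more than' table, rewritten in ascending order (income in rupees)
lower_limits = [1000, 2000, 3000, 4000, 6000, 8000, 10000, 11000, 12000]
at_least = [80, 70, 62, 54, 42, 26, 14, 0, 0]

classes = list(zip(lower_limits, lower_limits[1:]))
freqs = [a - b for a, b in zip(at_least, at_least[1:])]
print(freqs)                                                  # [10, 8, 8, 12, 16, 12, 14, 0]
print(round(count_between(classes, freqs, 2400, 10500), 1))   # 59.8
```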
  • 64. PROBLEM 12 USE THE FOLLOWING DATA TO COMPUTE THE UPPER AND LOWER QUARTILES, D2, P5 AND P90. MARKS: BELOW 10, 10-20, 20-40, 40-60, 60-80, ABOVE 80; FREQUENCY: 8, 10, 22, 25, 10, 5. STEP 1: COMPUTE CUMULATIVE FREQUENCY: MARKS: BELOW 10, 10-20, 20-40, 40-60, 60-80, ABOVE 80; FREQUENCY: 8, 10, 22, 25, 10, 5; CUMULATIVE FREQUENCY: 8, 18, 40, 65, 75, 80
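  • 65. SOLUTION STEP 2: ASCERTAIN THE CLASS INTERVAL (CI) FOR EACH POSITIONAL VALUE: MARKS: BELOW 10, 10-20, 20-40, 40-60, 60-80, ABOVE 80; FREQUENCY: 8, 10, 22, 25, 10, 5; CUMULATIVE FREQUENCY: 8, 18, 40, 65, 75, 80. CI CONTAINING EACH POSITIONAL VALUE: P5 (4TH ITEM) IN BELOW 10; D2 (16TH ITEM) IN 10-20; Q1 (20TH ITEM) IN 20-40; Q3 (60TH ITEM) IN 40-60; P90 (72ND ITEM) IN 60-80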
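  • 66. SOLUTION • D2 = 10 + [(16 − 8)/10] × 10 = 18 • P5 = 0 + [(4 − 0)/8] × 10 = 5 (TAKING THE FIRST CLASS AS 0-10) • Q1 = 20 + [(20 − 18)/22] × 20 = 21.82 • Q3 = 40 + [(60 − 40)/25] × 20 = 56 • P90 = 60 + [(72 − 65)/10] × 20 = 74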
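The general partition-value formula can be sketched in Python as below and applied to Problem 12. The function name `partition_value` is hypothetical, and the open-ended classes are closed as 0-10 and 80-100 purely as an assumption so that limits exist; only the 0-10 choice affects any of the answers (P5).

```python
def partition_value(classes, freqs, k, parts):
    """k-th partition value: l1 + ((k*n/parts - c.f.) / f) * i.

    parts = 4 for quartiles, 10 for deciles, 100 for percentiles.
    """
    target = k * sum(freqs) / parts                  # position of the required item
    cum_before = 0                                   # c.f. of the preceding classes
    for (lower, upper), f in zip(classes, freqs):
        if f > 0 and cum_before + f >= target:
            return lower + (target - cum_before) * (upper - lower) / f
        cum_before += f
    raise ValueError("distribution has no observations")


# Assumption: 'below 10' treated as 0-10 and 'above 80' as 80-100
classes = [(0, 10), (10, 20), (20, 40), (40, 60), (60, 80), (80, 100)]
freqs = [8, 10, 22, 25, 10, 5]

print(round(partition_value(classes, freqs, 1, 4), 2))     # Q1  = 21.82
print(round(partition_value(classes, freqs, 3, 4), 2))     # Q3  = 56.0
print(round(partition_value(classes, freqs, 2, 10), 2))    # D2  = 18.0
print(round(partition_value(classes, freqs, 5, 100), 2))   # P5  = 5.0
print(round(partition_value(classes, freqs, 90, 100), 2))  # P90 = 74.0
```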
  • 67. MODE • MODE IS THE VALUE IN THE DATA WHICH OCCURS MOST FREQUENTLY • IT IS TYPICAL IN THE SENSE THAT IT IS THE MOST PROBABLE VALUE • IN THE CONTEXT OF A CONTINUOUS FREQUENCY DISTRIBUTION, MODE IS DEFINED AS THAT VALUE OF THE VARIABLE WHERE THERE IS HIGHEST CONCENTRATION OF OBSERVATIONS • IT IS THE VALUE OF THE VARIABLE UNDER CONSIDERATION CORRESPONDING TO THE HIGHEST ORDINATE OF THE SMOOTH FREQUENCY CURVE
  • 68. CALCULATION OF MODE • IN AN INDIVIDUAL SERIES, THE VALUE OF MODE IS THE SIZE OF THE ITEM REPEATING THE HIGHEST NUMBER OF TIMES • IN THE FOLLOWING DATA SET: 10, 6, 9, 7, 10, 8, 11, 7, 10 AND 12, THE VALUE 10 APPEARS MORE FREQUENTLY (THREE TIMES) THAN ANY OTHER VALUE. THEREFORE, MODE = 10
  • 69. CALCULATION OF MODE • IN CASE OF DISCRETE FREQUENCY DISTRIBUTIONS, THE VALUE OF MODE IS THE VALUE CORRESPONDING TO WHICH THE FREQUENCY IS HIGHEST. EXAMPLE: MARKS: 10, 14, 16, 22; FREQUENCY: 8, 18, 40, 65. THE VALUE 22 HAS THE HIGHEST FREQUENCY. THEREFORE, MODE = 22 MARKS
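Finding the mode by inspection, for both an individual series and a discrete frequency distribution, can be sketched in Python as follows. The helper name `modes` is illustrative; it returns a list so that a series with more than one mode is not silently reduced to one value.

```python
from collections import Counter


def modes(values):
    """All values that occur the highest number of times in an individual series."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([10, 6, 9, 7, 10, 8, 11, 7, 10, 12]))    # [10]

# Discrete frequency distribution: pick the value with the highest frequency
marks = [10, 14, 16, 22]
frequency = [8, 18, 40, 65]
print(marks[frequency.index(max(frequency))])        # 22
```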
  • 70. CALCULATION OF MODE • ALGEBRAIC METHOD OF CALCULATION OF MODE: $Mo = l_1 + \frac{f_1 - f_0}{|2f_1 - f_0 - f_2|} \times i$ • HERE, $l_1$ IS THE LOWER LIMIT OF THE MODAL CLASS, $f_1$ IS THE FREQUENCY OF THE MODAL CLASS, $f_0$ AND $f_2$ ARE THE RESPECTIVE FREQUENCIES OF THE PRE-MODAL AND POST-MODAL CLASSES, AND i IS THE WIDTH OF THE MODAL CLASS.
  • 71. PROBLEM 13 CALCULATE THE VALUE OF MODE FOR THE GIVEN DATA: WEIGHT: 10-20, 20-30, 30-40, 40-50, 50-60, 60-70, 70-80, 80-90; FREQUENCY: 4, 12, 40, 41, 27, 13, 9, 4. STEP 1: DETERMINE THE MODAL CLASS: SINCE THE FREQUENCIES OF TWO CLASS INTERVALS (30-40 AND 40-50) ARE VERY CLOSE, PREPARE THE GROUPING AND CLASSIFICATION TABLES.
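  • 72. GROUPING TABLE TO DETERMINE THE CI WITH HIGHEST FREQUENCY WEIGHT: 10-20, 20-30, 30-40, 40-50, 50-60, 60-70, 70-80, 80-90. COLUMN I (FREQUENCY): 4, 12, 40, 41, 27, 13, 9, 4. COLUMN II (SUMS OF 2 VALUES, STARTING FROM THE 1ST CLASS): 16, 81, 40, 13. COLUMN III (SUMS OF 2 VALUES, STARTING FROM THE 2ND CLASS): 52, 68, 22. COLUMN IV (SUMS OF 3 VALUES, STARTING FROM THE 1ST CLASS): 56, 81. COLUMN V (SUMS OF 3 VALUES, STARTING FROM THE 2ND CLASS): 93, 49. COLUMN VI (SUMS OF 3 VALUES, STARTING FROM THE 3RD CLASS): 108, 26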
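  • 73. CLASSIFICATION TABLE TALLY MARKS UNDER THE CI WITH HIGHEST FREQUENCY IN EACH COLUMN OF THE GROUPING TABLE: COLUMN I (f): 40-50. COLUMN II: 30-40, 40-50. COLUMN III: 40-50, 50-60. COLUMN IV: 40-50, 50-60, 60-70. COLUMN V: 20-30, 30-40, 40-50. COLUMN VI: 30-40, 40-50, 50-60. TOTAL TALLIES FOR CI 10-20, 20-30, 30-40, 40-50, 50-60, 60-70, 70-80, 80-90: 0, 1, 3, 6, 3, 1, 0, 0. MODAL CLASS: 40-50 (HIGHEST TOTAL, 6)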
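The grouping and classification tables can be generated mechanically. The Python sketch below is one possible reading of the method shown on the slides: six columns (single frequencies, pairs from the 1st and 2nd class, triples from the 1st, 2nd and 3rd class), with one tally for every class inside the highest-sum group of each column. The function name `grouping_tallies` is hypothetical, and tallying every tied group is an assumption.

```python
def grouping_tallies(freqs):
    """Tally count per class from the six-column grouping method."""
    n = len(freqs)
    tallies = [0] * n
    for size in (1, 2, 3):                           # group sizes used in the table
        for start in range(size):                    # shifted starting class
            groups = [
                (sum(freqs[i:i + size]), range(i, i + size))
                for i in range(start, n - size + 1, size)
            ]
            if not groups:
                continue
            best = max(total for total, _ in groups)
            for total, members in groups:
                if total == best:                    # highest group in this column
                    for j in members:
                        tallies[j] += 1
    return tallies


class_limits = ["10-20", "20-30", "30-40", "40-50", "50-60", "60-70", "70-80", "80-90"]
freqs = [4, 12, 40, 41, 27, 13, 9, 4]
tallies = grouping_tallies(freqs)
print(tallies)                                       # [0, 1, 3, 6, 3, 1, 0, 0]
print(class_limits[tallies.index(max(tallies))])     # 40-50
```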
  • 74. COMPUTATION OF MODE • ALGEBRAIC METHOD OF CALCULATION OF MODE: $Mo = l_1 + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times i = 40 + \frac{41 - 40}{2 \times 41 - 40 - 27} \times 10 = 40 + \frac{1}{15} \times 10 = 40.67$
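The modal-class interpolation can be sketched in Python as follows, using the figures of Problem 13. The function name `grouped_mode` and its parameter names are illustrative assumptions; the absolute value in the denominator follows the formula given on the earlier slide.

```python
def grouped_mode(lower, width, f1, f0, f2):
    """Mode = l1 + ((f1 - f0) / |2*f1 - f0 - f2|) * i.

    lower: lower limit of the modal class (l1)
    width: width of the modal class (i)
    f1, f0, f2: frequencies of the modal, pre-modal and post-modal classes
    """
    denominator = abs(2 * f1 - f0 - f2)
    if denominator == 0:
        raise ValueError("formula is undefined when 2*f1 equals f0 + f2")
    return lower + (f1 - f0) * width / denominator


print(round(grouped_mode(lower=40, width=10, f1=41, f0=40, f2=27), 2))  # 40.67
```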
  • 75. EMPIRICAL METHOD IN CASE A SERIES IS NOT UNIMODAL • WHEN A SERIES IS NOT UNIMODAL, MODE CAN ALSO BE ESTIMATED BY USING MEAN AND MEDIAN • ACCORDING TO THE EMPIRICAL RULE (WHICH HOLDS APPROXIMATELY FOR MODERATELY SKEWED DISTRIBUTIONS), MODE = 3 MEDIAN − 2 MEAN
  • 76. CHOICE OF AVERAGE • ARITHMETIC MEAN IS THE MOST COMMONLY USED AVERAGE • G.M. AND H.M. ARE TO BE USED WHERE LARGER WEIGHTS ARE TO BE ASSIGNED TO LOWER VALUES AND VICE-VERSA • MEDIAN IS TO BE USED WHERE A SERIES HAS EXTREME VALUES AND IN OPEN-ENDED DISTRIBUTIONS • MODE IS SUITABLE WHERE THE PURPOSE IS TO DETERMINE THE MOST TYPICAL VALUE