Biostatistics (Biol5001)
Methods of Data Organization and Presentation
Instructor: Tatek Getachew (PhD)
() Biol5001 1 / 29
Methods of Data Organization and Presentation
Introduction
Data collected from any source should be assembled in a way that is
easy to understand and appealing to read.
This can be done by classification, tabulation, and drawing graphs and
diagrams.
The first step in the analysis is to classify and tabulate the
information collected.
Objectives
To condense a mass of data in such a way that similarities and
dissimilarities can be readily apprehended
To facilitate comparisons and the drawing of inferences
To eliminate unnecessary detail
Classification and Tabulation
Classification: the first step of tabulation; the grouping of related
facts into groups or classes.
Tabulation: a systematic arrangement of statistical data into columns
and rows (tables).
Frequency Distribution
A frequency distribution (f.d) is a special type of tabular representation in which
the values of a variable are classified into a set of classes with their corresponding
frequencies of occurrence.
E.g. Frequency distribution of the age of science students
Age      No. of students
15-19    150
20-24     70
25-29    300
Terms associated with a f.d
Frequency: the number of times a certain value of a variable occurs in the data.
Ungrouped data: data in its original raw form.
The presentation of data is classified into the following two
categories:
Tabular presentation
Diagrammatic and graphic presentation
The process of arranging data into classes or categories according to
their similarities is technically called classification.
Definitions
Raw data: recorded information in its original collected form, whether
counts or measurements.
Class: a description of a group of similar values in a data set.
Frequency: the number of times a value of the variable is repeated.
Frequency distribution: the organization of raw data in table form
using classes and frequencies.
Frequency Distributions
There are three basic types of frequency distribution:
Categorical frequency distribution
Ungrouped frequency distribution
Grouped frequency distribution
Categorical Frequency Distribution:
Used for qualitative data, i.e. data measured on a nominal or ordinal scale,
e.g. marital status, blood type.
Example: Distribution of Blood Types
Twenty-five army inductees were given a blood test to determine their
blood type. The data set is
A B B AB O
O O B AB B
B B O A O
A O O O AB
AB A O B A
Solution
Since the data are categorical, the classes are the four blood types: A, B, O, and
AB.
Step 1. Make a table as shown.
A        B        C            D
Class    Tally    Frequency    Percentage
A
B
O
AB
Step 2. Tally the data and place the results in column B.
Step 3. Count the tallies and place the results in column C.
Step 4. Find the percentage of values in each class by using the formula
$\% = \dfrac{f}{n} \times 100\%$    (1)
where f is the frequency of the class and n is the total number of values.
Step 5. Find the totals for columns C (frequency) and D (percent). The
completed table is shown.
A        B             C            D
Class    Tally         Frequency    Percentage
A        /////          5            20
B        ///// //       7            28
O        ///// ////     9            36
AB       ////           4            16
Total                  25           100
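The tally-and-percentage steps above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `categorical_distribution` is a hypothetical helper, not something the slides define, and the data list is copied from the blood-type example.

```python
from collections import Counter

# Blood types of the 25 inductees, copied from the example above.
blood_types = (
    "A B B AB O "
    "O O B AB B "
    "B B O A O "
    "A O O O AB "
    "AB A O B A"
).split()


def categorical_distribution(values):
    """Return {class: (frequency, percentage)} for categorical data."""
    counts = Counter(values)      # Steps 2-3: tally and count each class
    n = len(values)               # total number of observations
    # Step 4: percentage = f / n * 100 for every class
    return {cls: (f, 100 * f / n) for cls, f in counts.items()}


table = categorical_distribution(blood_types)
for cls in ("A", "B", "O", "AB"):
    f, pct = table[cls]
    print(f"{cls:<6}{f:>4}{pct:>8.0f}")
```

Summing the frequency column should give n = 25 and the percentage column should add to 100, which is the check that Step 5 asks for.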
Ungrouped Frequency Distribution
A frequency distribution of numerical (quantitative) data in which each value of the variable
forms a class of its own (i.e. the values of the variable are not grouped).
Example: The following data represent the marks of 20 students.
80 76 90 85 80
70 60 62 70 85
65 60 63 74 75
76 70 70 80 85
Construct an ungrouped frequency distribution.
Arrange the data in ascending order:
60 60 62 63 65 70 70 70 70 74 75 76 76 80 80 80 85 85 85 90
A       B        C            D
Mark    Tally    Frequency    Percentage
60      //        2            10
62      /         1             5
63      /         1             5
65      /         1             5
70      ////      4            20
74      /         1             5
75      /         1             5
76      //        2            10
80      ///       3            15
85      ///       3            15
90      /         1             5
Total            20           100
Grouped data: data presented in the form of a frequency distribution.
Array: data arranged in ascending or descending order.
Unit of measurement (u): the smallest possible difference between any two
consecutive values in the recorded data.
u = 1 if the data are integers
u = 0.1 if the data are recorded to one decimal place
u = 0.01 if the data are recorded to two decimal places
Tally: a traditional method of counting frequencies.
Class limits: the end points of a class
- the smallest and largest values that can belong to the class
- smallest ⇒ lower class limit (Lcl)
- largest ⇒ upper class limit (Ucl)
Class boundaries: the true mathematical boundaries of a class
- the precise points that separate adjacent classes, rather than
values included in any one of the classes
$Lcb = Lcl - \tfrac{1}{2}u$ (lower class boundary)
$Ucb = Ucl + \tfrac{1}{2}u$ (upper class boundary)
Class mark (cm): the midpoint of the class
$cm = \dfrac{Lcl + Ucl}{2} = \dfrac{Ucb + Lcb}{2}$
Class width (interval): the length of the class
$w = Ucb - Lcb = Ucl - Lcl + u$
If the classes have uniform width,
$w = cm_i - cm_{i-1} = Lcl_i - Lcl_{i-1} = Ucl_i - Ucl_{i-1}$
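As a quick numerical check of these formulas, the sketch below computes the boundaries, class mark and width of a single class from its limits. The function name `class_summary` and the sample class (limits 6 and 11, integer data so u = 1) are illustrative assumptions.

```python
def class_summary(lcl, ucl, u):
    """Boundaries, class mark and width of one class, given its limits."""
    lcb = lcl - u / 2        # lower class boundary: Lcb = Lcl - u/2
    ucb = ucl + u / 2        # upper class boundary: Ucb = Ucl + u/2
    cm = (lcl + ucl) / 2     # class mark: midpoint of the limits
    w = ucb - lcb            # class width, equal to Ucl - Lcl + u
    return lcb, ucb, cm, w


lcb, ucb, cm, w = class_summary(6, 11, 1)
print(lcb, ucb, cm, w)
```

Both expressions for the class mark agree here: (6 + 11) / 2 and (5.5 + 11.5) / 2 are both 8.5.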
Types of f.d
Depending on the variable
* Discrete * Continuous
Depending on the information needed
* Absolute * Relative * Cumulative f.ds
Important points
Each observation should go into one and only one class
The smallest and the largest observations must fall within the classification
The classes should not overlap
Whenever possible, make class intervals of the same size
Whenever possible, avoid open-ended classes
For easy computation, reading and use of the distribution, it is advisable to
use a class width of 5, 10, 15 or another multiple of 5
Constructing a f.d
i Arrange the data in ascending or descending order
ii Determine the unit of measurement (u)
iii Determine the range $R = x_{max} - x_{min}$
iv Fix the number of classes (k) arbitrarily
a the most common number of classes is between 5 and 15
b alternatively, use Sturges' rule
$k = 1 + 3.322 \log_{10} n$, where n is the number of observations
v Determine the class width (w) as $w = \dfrac{R}{k}$
vi Determine the lower class limit of the 1st class
- chosen arbitrarily: it may be $x_{min}$ or any number less than $x_{min}$, but
not greater than $x_{min}$
vii Determine the upper class limit of the 1st class
$Ucl_i = Lcl_i + w - u$
Then determine the other classes
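Steps iii to vii can be sketched as two small Python functions. This is a minimal sketch under stated assumptions: the names `sturges_classes` and `class_limits` are hypothetical, the first class is taken to start at the minimum, the width is rounded up to a whole multiple of u, and the final guard (widening the classes if the maximum would otherwise fall outside the last class) is an addition that the steps above do not mention. The sample values n = 20, minimum 6 and maximum 39 are chosen only for illustration.

```python
import math


def sturges_classes(n):
    """Number of classes from Sturges' rule, rounded up to an integer."""
    return math.ceil(1 + 3.322 * math.log10(n))


def class_limits(xmin, xmax, k, u=1):
    """Return [(Lcl, Ucl), ...] for k classes, the first starting at xmin."""
    # Step v: w = R / k, rounded up to a whole multiple of u.
    w = math.ceil((xmax - xmin) / (k * u)) * u
    # Guard (an assumption, not in the slides): if R / k needed no rounding,
    # xmax would fall just outside the last class, so widen by one unit.
    if xmin + k * w - u < xmax:
        w += u
    # Steps vi-vii: Lcl_i = xmin + i * w and Ucl_i = Lcl_i + w - u.
    return [(xmin + i * w, xmin + i * w + w - u) for i in range(k)]


k = sturges_classes(20)
limits = class_limits(6, 39, k)
print(k, limits)
```

A design note: rounding the width up rather than to the nearest value keeps the k classes wide enough to cover the whole range.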
Example:
Construct a frequency distribution for the following data.
11 29 6 33 14 31 22 27 19 20
18 17 22 38 23 21 26 34 39 27
Solution
Arrange the data in ascending order
6 11 14 17 18 19 20 21 22 22
23 26 27 27 29 31 33 34 38 39
Steps:
1: Find the highest and the lowest values: H = 39, L = 6
2: Find the unit of measurement: u = 19 - 18 = 1
3: Find the range: R = H - L = 39 - 6 = 33
4: Select the number of classes using Sturges' formula:
k = 1 + 3.322 log10(20) = 5.32, which is rounded up to 6
5: Find the class width: w = R/k = 33/6 = 5.5, which is rounded up to 6
6: Select the starting point; let it be the minimum observation. Then 6, 12,
18, 24, 30, 36 are the lower class limits.
7: Find the upper class limits; e.g. the first upper class limit = 12 - u = 12 - 1 = 11.
Then 11, 17, 23, 29, 35, 41 are the upper class limits. Combining steps 6
and 7, one can construct the following classes.
The complete frequency distribution follows:
Class limit    Class boundary    Class mark    Tally       Freq.
6-11           5.5-11.5           8.5          //          2
12-17          11.5-17.5         14.5          //          2
18-23          17.5-23.5         20.5          ///// //    7
24-29          23.5-29.5         26.5          ////        4
30-35          29.5-35.5         32.5          ///         3
36-41          35.5-41.5         38.5          //          2
Or
Class limit    Class boundary    Class mark    Tally       Freq.
5-10           4.5-10.5           7.5          /           1
11-16          10.5-16.5         13.5          //          2
17-22          16.5-22.5         19.5          ///// //    7
23-28          22.5-28.5         25.5          ////        4
29-34          28.5-34.5         31.5          ////        4
35-40          34.5-40.5         37.5          //          2
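A direct count is a useful check on hand tallies. The sketch below counts how many of the 20 observations fall inside each class of the first table. The helper name `grouped_frequencies` is a hypothetical choice; the data and class limits are copied from the example.

```python
data = [11, 29, 6, 33, 14, 31, 22, 27, 19, 20,
        18, 17, 22, 38, 23, 21, 26, 34, 39, 27]
limits = [(6, 11), (12, 17), (18, 23), (24, 29), (30, 35), (36, 41)]


def grouped_frequencies(values, class_limits):
    """Count the values that lie within each (lower, upper) class limit pair."""
    return [sum(lo <= x <= hi for x in values) for lo, hi in class_limits]


freqs = grouped_frequencies(data, limits)
for (lo, hi), f in zip(limits, freqs):
    print(f"{lo}-{hi}: {f}")

# Every observation must land in exactly one class.
assert sum(freqs) == len(data)
```

The same function can be reused with the alternative limits 5-10, 11-16 and so on; in both cases the frequencies should add up to 20.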
Relative and Percentage f.d
A relative f.d is a distribution in which the frequency of each class is
expressed relative to the total frequency.
If the frequency of each class is given as a percentage of the total, then
the f.d is called a percentage f.d.
Cumulative Frequency
Cumulative frequency refers to the number of observations that are
below or above a specific value.
Less-than cumulative frequency refers to the number of items in the distribution
that have a value less than or equal to the upper class limit of the first,
second, third class, and so on.
More-than cumulative frequency refers to the number of items in the
distribution that have a value greater than or equal to the lower class
limit of each class.
E.g. For the following distribution, construct the less-than and more-than
cumulative f.d
Class    Frequency    Less than    More than
40-45     7
46-50     7
51-55    17
56-60    16
61-65    16
66-70     1
Total    64
Solution:
Class    Frequency    Less than    More than
40-45     7            7           64
46-50     7           14           57
51-55    17           31           50
56-60    16           47           33
61-65    16           63           17
66-70     1           64            1
Total    64
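The two cumulative columns are running totals taken from opposite ends of the frequency column. A minimal Python sketch, using the standard-library `itertools.accumulate` and the frequencies from the example (the variable names are illustrative):

```python
from itertools import accumulate

classes = ["40-45", "46-50", "51-55", "56-60", "61-65", "66-70"]
freqs = [7, 7, 17, 16, 16, 1]

# Less-than cumulative frequency: running total from the first class down.
less_than = list(accumulate(freqs))
# More-than cumulative frequency: running total from the last class up.
more_than = list(accumulate(reversed(freqs)))[::-1]

for cls, f, lt, mt in zip(classes, freqs, less_than, more_than):
    print(f"{cls:<8}{f:>4}{lt:>6}{mt:>6}")
```

The last less-than value and the first more-than value both equal the total frequency, which gives a quick consistency check.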
Diagrammatic and Graphic Presentation of Data
One of the most convincing and appealing ways in which statistical
results may be presented is through diagrams and graphs.
Importance:
They have greater attraction.
They facilitate comparison.
They are easily understandable.
The most commonly used diagrammatic presentations for discrete as well
as qualitative data are:
Bar charts
Pie charts
Bar charts: one-dimensional rectangular diagrams used to display
mostly qualitative or discrete data.
Features
Equal spaces are left between successive bars
Each bar has equal width
The height of each bar corresponds to the frequency of the class it
represents
Simple bar chart: vertical or horizontal bars are used to represent
figures. For categorical data, the bars are ranked and drawn in order of length.
E.g. Consider the following data
                  Area of sale
Type        Local    Export    Total
Men’s       150      100       250
Women’s     125      225       350
Children     70      110       180
Total       345      435       780
[Figures: simple bar charts of total sales by type (Women's, Men's, Children), one drawn with horizontal bars and one with vertical bars; value axis from 0 to 350]
2. Component (Stacked) Bar Chart
These are like the ordinary bar chart except that the bars are subdivided into
two or more component parts.
Used to represent totals together with their component items
The components are proportional in size to the component parts of
the total represented by each bar
a. Actual Component Bar Chart: the overall height of the bar
and the length of each component indicate actual figures.
[Figures: actual component bar charts of the sales data, one with a bar per area (Local, Export) subdivided by type and one with a bar per type (Men's, Women's, Children) subdivided by area]
b. Percentage Component Bar Chart: in this chart the length of each
component represents its percentage of the overall total.
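[Figures: percentage component bar charts of the sales data, one with a bar per type (Men's, Women's, Children) subdivided by area and one with a bar per area (Local, Export) subdivided by type; value axis from 0 to 1]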
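The component lengths of a percentage component bar chart are each component's share of its own bar total. The sketch below computes these shares for the sales table, taking one bar per type; the function name `percentage_components` and the dictionary layout are illustrative assumptions.

```python
sales = {
    "Men's": {"Local": 150, "Export": 100},
    "Women's": {"Local": 125, "Export": 225},
    "Children": {"Local": 70, "Export": 110},
}


def percentage_components(table):
    """Express each component as a fraction of its bar's total."""
    shares = {}
    for bar, parts in table.items():
        total = sum(parts.values())          # overall height of the bar
        shares[bar] = {name: value / total for name, value in parts.items()}
    return shares


shares = percentage_components(sales)
for bar, parts in shares.items():
    print(bar, {name: round(share, 3) for name, share in parts.items()})
```

Because every bar is rescaled to 1 (or 100%), the chart shows relative composition only; the differences between the totals 250, 350 and 180 are no longer visible.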
3. Multiple Bar Chart
In this chart the component parts are shown as separate
bars adjoining each other
The height of each bar represents the actual value of the component
figure
[Figures: multiple bar charts of the sales data, one grouped by area (Local, Export) with a bar per type and one grouped by type (Men's, Women's, Children) with a bar per area; value axis from 0 to 200]
When to Use Each Chart
Simple bar chart: when changes in the total are required
Actual component bar chart: when changes in the total and an indication of
the size of each component are required
Percentage component bar chart: when changes in the relative size of the component
parts are required
Multiple bar chart: when only changes in the actual values of the component
parts are required and the overall total is not important
Pie Chart
A circle divided by radial lines into sectors so that the area of each
sector is proportional to the size of the figure represented
- Generally used to depict data classified by attributes
Construction: compute the relative frequency of each class and multiply it by 360° to obtain the angle of its sector
E.g.
Marital status    No.    R.f    Angle
Single            30     0.3    108°
Married           40     0.4    144°
Widowed           10     0.1     36°
Divorced          20     0.2     72°
[Figure: pie chart with sectors Single, Married, Widowed and Divorced]
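The sector angles follow from the relative frequencies: each angle is the relative frequency times 360 degrees. A small Python sketch using the marital-status counts above (the function name `pie_angles` is a hypothetical helper):

```python
counts = {"Single": 30, "Married": 40, "Widowed": 10, "Divorced": 20}


def pie_angles(freqs):
    """Return {class: (relative frequency, sector angle in degrees)}."""
    total = sum(freqs.values())
    return {cls: (f / total, 360 * f / total) for cls, f in freqs.items()}


angles = pie_angles(counts)
for cls, (rf, angle) in angles.items():
    print(f"{cls:<10}{rf:>5.1f}{angle:>7.0f}")
```

The angles should always add up to 360 degrees, which is an easy check before drawing the chart.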
Histogram
A graphical form of a f.d
Consists of a set of adjacent rectangles whose bases are marked off by class
boundaries
There are no gaps between successive bars
The height of each rectangle corresponds to the frequency of the class
The width corresponds to the class interval
It cannot be constructed for open-ended classes
E.g. 1. Consider the following frequency distribution
Weight     # of ra    c.b
80-89       2         79.5-89.5
90-99       4         89.5-99.5
100-109    14         99.5-109.5
110-119    25         109.5-119.5
[Figure: histogram of the weight distribution; horizontal axis: class boundaries from 69.5 to 129.5; vertical axis: frequency from 0 to 25]
Eg 2. Consider the following frequency distribution
Frequency Polygon
A line graph of class frequencies plotted against class marks
Assume two additional classes with zero frequency, one at the beginning
and one at the end
Weight     # of ra    c.b            cm
80-89       2         79.5-89.5       84.5
90-99       4         89.5-99.5       94.5
100-109    14         99.5-109.5     104.5
110-119    25         109.5-119.5    114.5
[Figures: two panels showing the frequency polygon of the weight distribution; horizontal axes: weight, marked at class boundaries; vertical axes: frequency from 0 to 25]
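The points of the frequency polygon can be listed before plotting. The sketch below adds the two zero-frequency classes at either end, assuming uniform class width; the function name `polygon_points` is a hypothetical helper, and the class marks and frequencies are taken from the weight table above.

```python
class_marks = [84.5, 94.5, 104.5, 114.5]
freqs = [2, 4, 14, 25]


def polygon_points(marks, frequencies):
    """(class mark, frequency) pairs, padded with a zero class at each end."""
    w = marks[1] - marks[0]                    # uniform class width assumed
    xs = [marks[0] - w, *marks, marks[-1] + w]
    ys = [0, *frequencies, 0]
    return list(zip(xs, ys))


points = polygon_points(class_marks, freqs)
print(points)
```

Joining these points in order with straight line segments gives the polygon, and the two added points close it on the horizontal axis.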
Note: the frequency polygon can be constructed by joining the midpoints
of the tops of the histogram rectangles with line segments.
The advantage of the frequency polygon over the histogram is that it allows us
to compare two or more frequency distributions directly.
Cumulative Frequency Polygon (Ogive)
These are curves for a cumulative f.d in which cumulative frequencies are
plotted on the vertical axis against class boundaries on the horizontal axis.
The points are then joined smoothly.
We can have a "less than" or a "more than" ogive
[Figures: less-than ogive (LCF) and more-than ogive (MCF) of the weight distribution; horizontal axis: class boundaries; vertical axis: cumulative frequency from 0 to 45]
Graphs: graphs usually take the form of lines or curves on a coordinate
plane (mostly used for continuous data).
Line graph: a graph drawn by joining a series of points that represent
time series data with line segments.
E.g. The following data show the production of ...
Production year    1985    1986    1987    1988    1989
Quantity            9.5    10.2    11.4    12.6    10.6
[Figure: line graph of quantity (vertical axis, 8 to 13) against production year (horizontal axis, 1985 to 1989)]
Exercise
1 Suppose data collected on the heights of 390 cows were tabulated in a
frequency distribution and the following results were obtained: f_i = 6, 25,
48, 72, 116, 60, 38, 23
and cm_1 = 112, cm_2 = 117.
Determine the class width, the class limits, and the less-than and more-than
cumulative f.d.
2 Given the following table
             M     F     Total
Christian    40    25     65
Muslim       15    10     25
Others        5     5     10
Total        60    40    100
a) Which diagrammatic presentation is appropriate to compare
religions without considering sex? Why?
b) If both between and within comparisons of religion and sex are
required, which diagrammatic presentation is appropriate?
1 Classify the following first as qualitative or quantitative, and second
as nominal, ordinal, interval or ratio:
Time for swimmers to complete a 50-meter race
Months of the year (September, October, etc.)
Religion in Ethiopia
2 Suppose information is required on mentally ill persons, who will be
reluctant to give the information. If you must get the information,
which method do you use? Why?
3 Discuss the difference between descriptive and inferential statistics and give
examples.
Biostatistics methods of data organisation and presentation

  • 1. Biostatistics (Biol5001) METHODS OF DATA Organization and PRESENTATION Instructor: Tatek Getachew(PhD) () Biol5001 1 / 29
  • 2. Methods of Data Organization and Presentation Methods of Data Organization and Presentation Introduction Data collected from any source should be assembled in away that is convenient to understand and look attractive. This can be done by classification, tabulation, drawing graphs and diagrams. The first step in the analysis is to classify and tabulate the information collected Objectives To condense a mass of data in such away that similarities and dissimilarities can readily apprehended To facilitate comparisons and drawing inference To eliminate unnecessary details etc. () Biol5001 2 / 29
  • 3. Methods of Data Organization and Presentation Classification and Tabulation Classification:- The first step of tabulation - is grouping of related facts in to groups or classes. Tabulation:- is a systematic arrangement of statistical data in to columns and rows (tables). Frequency Distribution A frequency distn is a special type of tabular representation in which values of a variable are classified in to set of classes with corresponding frequencies of occurrence. Eg. Frequency distribution of age of science students Age No Students 15-19 150 20-24 70 25-29 300 Terms associated with f.d Frequency is the no of occurrence of a certain variable in a data. ungrouped data:- data in its original raw form () Biol5001 3 / 29
  • 4. Methods of Data Organization and Presentation Methods of Data Organization and Presentation The presentation of data is classified in to the following two categories: Tabular presentation Diagrammatic and Graphic presentation. The process of arranging data in to classes or categories according to similarities technically is called classification. Definition Raw Data: recorded information in its original collected form, whether it be counts or measurements. Class: is a description of a group of similar numbers in a data set. Frequency: is the number of times a variable value is repeated. Frequency distribution: is the organization of raw data in table form using classes and frequencies. () Biol5001 4 / 29
  • 5. Methods of Data Organization and Presentation Frequency Distributions There are three basic types of frequency distributions Categorical frequency distribution Ungrouped frequency distribution Grouped frequency distribution Categorical frequency Distribution: Used for data that is qualitative such as nominal, or ordinal. e.g. marital status,blood type Example: Distribution of Blood Types Twenty-five army inductees were given a blood test to determine their blood type. The data set is A B B AB O O O B AB B B B O A O A O O O AB AB A O B A () Biol5001 5 / 29
  • 6. Methods of Data Organization and Presentation Solution Since the data are categorical, There are four blood types: A, B, O, and AB. Step 1. Make a table as shown. A B C D Class Tally Frequency Percentage A B O AB Step 2. Tally the data and place the results in column B. Step 3. Count the tallies and place the results in column C. Step 4. Find the percentage of values in each class by using the formula % = f n × 100% (1) Step 5. Find the totals for columns C (frequency) and D (percent). The completed table is shown. () Biol5001 6 / 29
  • 7. Methods of Data Organization and Presentation A B C D Class Tally Frequency Percentage A 5 20 B 7 28 O 9 36 AB 4 16 25 100 Ungrouped FD A FD of numerical data (quantitative) in which each value of a variable represents a single class (i.e. the values of the variable are not grouped). Example: The following data represent the mark of 20 students. 80 76 90 85 80 70 60 62 70 85 65 60 63 74 75 76 70 70 80 85 Construct a frequency distribution, which is ungrouped. () Biol5001 7 / 29
  • 8. Methods of Data Organization and Presentation Arrange the data in ascending order 60 60 62 63 65 70 70 70 70 74 75 76 76 80 80 80 85 85 85 90 A B C D Mark Tally Frequency Percentage 60 2 10 62 1 5 63 1 5 65 1 5 70 4 20 74 1 5 75 2 10 76 1 5 80 3 15 85 3 15 90 1 5 20 () Biol5001 8 / 29
  • 9. Methods of Data Organization and Presentation Grouped data data presented in the form of f.d Array data data arranged in ascending or descending order unit of measurement (u):- the smallest possible difference between any consecutive values in the recorded data. u=1 if the data are integers u=0.1 if the data are in to one decimal place u=0.01 if the data are in to two decimal place tally a traditional method of counting frequencies Class limit:- The end point of the class - the smallest and largest value of the class - smallest =⇒ lower class limit (Lcl) - largest =⇒ upper class limit (Ucl) Class boundaries are the true mathematical boundary of the class - are the precise points that separate various classes rather than the values included in any one of the class Lcb=Lcl-1 2u (lower class boundaries) Ucb=Ucl+1 2u (upper class boundaries) () Biol5001 9 / 29
  • 10. Methods of Data Organization and Presentation Class Mark is the mid point of the class cm = Lcl + Ucl 2 = Ucb + Lcb 2 Class width (interval) is the length of the class w=Ucb-Lcb =Ucl-Lcl+u If the classes have uniform width w=cmi − cmi−1=Lcli − Lcli−1=Ucli − Ucli−1 Types of f.d Depending on the variable * Discrete * Continuous Depending on the information needed * Absolute * Relative * Commulative f.ds () Biol5001 10 / 29
  • 11. Methods of Data Organization and Presentation Important points each observation should go to one and only one class The smallest and the largest observations fall with in the classification The class should not overlap Whenever possible make class intervals of the same size Whenever possible avoid open ended class For easy computation, reading and use of distribution, it is advisable to use width 5, 10, 15 or multiple of 5 steps In construction f.d i Arrange the data in ascending or descending order ii Determine the unit of measurements (u) iii Determine the range R=xmax − xmin iv Fix the number of classes (k) arbitrarily a the most common number of classes is between 5 and 15 b Alternatively use Sturge’s rule k = 1 + 3.322.logn where n is the number of observations () Biol5001 11 / 29
  • 12. Methods of Data Organization and Presentation v Determine the class width (w) as w=R k vi Determine the lower class limit of the 1st class - arbitrarily, it may be xmin or any number less than xmin, but not greater than xmin vii Determine the upper class limit of the 1st class Ucli = Lcli + w − u Then determine the other classes Example: Construct a frequency distribution for the following data. 11 29 6 33 14 31 22 27 19 20 18 17 22 38 23 21 26 34 39 27 Soln Arrange the data in ascending order 6 11 14 17 18 19 20 21 22 22 23 26 27 27 29 31 33 34 38 39 () Biol5001 12 / 29
  • 13. Methods of Data Organization and Presentation Solutions: 1: Find the highest and the lowest value H=39, L=6 2: U=19-18=1 3: Find the range; R=H-L=39-6=33 4: Select the number of classes desired using Sturge’s formula; =1 + 3.32log(20) = 5.32 = 6(roundingup) 5: Find the class width; w=R/k=33/6=5.5=6 (rounding up) 6: Select the starting point, let it be the minimum observation. 6, 12, 18, 24, 30, 36 are the lower class limits. 7: Find the upper class limit; e.g. the first upper class=12-U=12-1=11 11, 17, 23, 29, 35, 41 are the upper class limits. So combining step 6 and step 7, one can construct the following classes. () Biol5001 13 / 29
  • 14. Methods of Data Organization and Presentation The complete frequency distribution follows: Class limit Class boundary Class Mark Tally Freq. 6-11 5.5-11.5 8.5 2 12-17 11.5-17.5 14.5 2 18-23 17.5-23.5 20.5 6 24-29 23.5-29.5 26.5 4 30-35 29.5-35.5 32.5 3 36-41 35.5-41.5 38.5 2 Or Class limit Class boundary Class Mark Tally Freq. 5-10 4.5-10.5 7.5 1 11-16 10.5-16.5 13.5 2 17-22 16.5-22.5 19.5 7 23-28 22.5-28.5 25.5 4 29-34 28.5-34.5 31.5 4 35-40 34.5-40.5 37.5 2 () Biol5001 14 / 29
  • 15. Methods of Data Organization and Presentation Relative and Percentage f.d A relative f.d is a distribution in which frequency of classes are expressed relative to the total frequency. If the frequency of a class are given as a percentage of the total, then the f.d is called Percentage f.d Commulative Frequency Commulative frequency refers to the number of observation that are below as above a specific value Less than comm.fr refers to the number of items in the distribution that have a value equal or less than the upper class limit of the first, second, third and so on More than comm. freq refers to the number of items in the distribution that have a value equal or greater than the lower class limit () Biol5001 15 / 29
  • 16. Methods of Data Organization and Presentation Eg. For the following distribution construct the less than and more than comm. f.d Class fr lessthan morethan 40-45 7 46-50 7 51-55 17 56-60 16 61-65 16 66-70 1 64 Class fr lessthan morethan 40-45 7 7 64 46-50 7 14 57 51-55 17 31 50 56-60 16 47 33 61-65 16 63 17 66-70 1 64 1 64 () Biol5001 16 / 29
  • 17. Methods of Data Organization and Presentation Diagrammatic and Graphic presentation of data. One of the most convincing and appearing ways of in which statistical results may be presented is through diagrams and graphs. Importance: They have greater attraction. They facilitate comparison. They are easily understandable. -The most commonly used diagrammatic presentation for discrete as well as qualitative data are: Bar charts Pie charts () Biol5001 17 / 29
  • 18. Methods of Data Organization and Presentation Bar charts:- are one dimensional rectangular diagram used to display mostly qualitative or discrete data. Features Equal spaces are left between successive bars Each has equal width The height of the bar corresponds to the frequency of the class it represents. Simple Bar charts:- vertical or horizontal bars are used to represent figures. The bars rankd and drawn by orders of length for categorical data. Eg. Consider the following data Type Area of scale Local Export Total Men’s 150 100 250 Women’s 125 225 350 Children 70 110 180 Total 345 435 780 () Biol5001 18 / 29
  • 19. Methods of Data Organization and Presentation Children Men' s Women' s Horizontal Bars 0 50 100 150 200 250 300 350 Women's Men's Children Vertical Bars 0 50 100 150 200 250 300 350 2. Component (Stacked) Bar Chart These are like the ordinary bar chart except that bars are subdivided in to two or more component parts. Used to represent total figures items of components The components are proportional in size to the component parts of the total being represented by each bar () Biol5001 19 / 29
  • 20. Methods of Data Organization and Presentation a. Actual Component Bar Chart:- where the overall height of the bar and the individual component length indicate actual figure Local Export Children Women's Men's 0 100 200 300 400 Men's Women's Children Export Local 0 50 100 150 200 250 300 350 b. Percentage Component Bar Chart:- In this chart the individual component length the percentage forms of the overall total. Men's Women's Children Local Export 0 0.2 0.4 0.6 0.8 1 Local Export Men's Women's Children 0 0.2 0.4 0.6 0.8 1 () Biol5001 20 / 29
  • 21. Methods of Data Organization and Presentation 3. Multiple Bar Chart This is the chart in which component parts are shown as separate bars adjoining each other The height of each bar represent the actual value of the component figure Local Export Men's Women's Children 0 50 100 150 200 Men's Women's Children Local Export 0 50 100 150 200 () Biol5001 21 / 29
  • 22. When to Use Each Chart
      Simple Bar Chart: when only changes in the totals are required.
      Actual Component Bar Chart: when changes in the totals and an indication of the size of each component are required.
      Percentage Component Bar Chart: when changes in the relative sizes of the component parts are required.
      Multiple Bar Chart: when only changes in the actual values of the component parts are required and the overall total is not important.
    Pie Chart: a circle divided by radial lines into sectors so that the area of each sector is proportional to the size of the figure represented.
      - Generally used to depict data classified by attributes.
      Construction: compute the relative frequency of each class, then multiply it by 360° to get the angle of its sector.
  • 23. Eg.
      Marital Status   No    R.f   Angle
      Single           30    0.3   108°
      Married          40    0.4   144°
      Widowed          10    0.1    36°
      Divorced         20    0.2    72°
    [Figure: pie chart with sectors Single, Married, Widowed and Divorced.]
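The sector angles follow directly from the counts: each angle is the relative frequency times 360°. A minimal Python sketch using the marital status counts (the function name `pie_angles` is illustrative):

```python
# Counts by marital status, from the example table.
counts = {"Single": 30, "Married": 40, "Widowed": 10, "Divorced": 20}

def pie_angles(freqs):
    """Return the sector angle in degrees for each class: (f / total) * 360."""
    total = sum(freqs.values())
    return {name: 360 * f / total for name, f in freqs.items()}

angles = pie_angles(counts)
for name, angle in angles.items():
    print(f"{name:<9} {counts[name]:>3}  {angle:6.1f} degrees")
```

A quick check on any pie chart construction is that the angles add up to 360°.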
  • 24. Histogram: a graphical form of a frequency distribution consisting of a set of adjacent rectangles whose bases are marked off by class boundaries.
      There are no gaps between successive bars.
      The height of each bar corresponds to the frequency of the class.
      The width of each bar corresponds to the class interval.
      It cannot be constructed for open-ended classes.
    Eg 1. Consider the following frequency distribution
      Weight     # of ra   Class boundaries (c.b)
      80-89         2      79.5-89.5
      90-99         4      89.5-99.5
      100-109      14      99.5-109.5
      110-119      25      109.5-119.5
    [Figure: histogram of the distribution above; horizontal axis marked at the class boundaries from 69.5 to 129.5, vertical axis (frequency) from 0 to 25.]
    Eg 2. Consider the following frequency distribution
  • 25. Frequency Polygon: a line graph of class frequencies plotted against class marks.
      Assume two additional classes with zero frequency, one at the beginning and one at the end, so that the polygon closes on the horizontal axis.
      Weight     # of ra   Class boundaries (c.b)   Class mark (cm)
      80-89         2      79.5-89.5                 84.5
      90-99         4      89.5-99.5                 94.5
      100-109      14      99.5-109.5               104.5
      110-119      25      109.5-119.5              114.5
    [Figure: two panels showing the frequency polygon of the distribution above; horizontal axes from 69.5 to 129.5 and from 79.5 to 119.5, vertical axes (frequency) from 0 to 25.]
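The class boundaries, class marks and polygon points for the weight distribution can be derived from the class limits, as in the Python sketch below. It assumes equal class widths and a constant gap between the upper limit of one class and the lower limit of the next. The function names are illustrative.

```python
# Class limits and frequencies of the weight distribution.
limits = [(80, 89), (90, 99), (100, 109), (110, 119)]
freqs = [2, 4, 14, 25]

def boundaries_and_marks(class_limits):
    """Return ((lower boundary, upper boundary), class mark) for each class.

    Assumes the gap between consecutive classes is constant; half of that gap
    is subtracted from each lower limit and added to each upper limit.
    """
    half_gap = (class_limits[1][0] - class_limits[0][1]) / 2  # (90 - 89) / 2
    rows = []
    for lower, upper in class_limits:
        boundaries = (lower - half_gap, upper + half_gap)
        mark = (lower + upper) / 2
        rows.append((boundaries, mark))
    return rows

def polygon_points(class_limits, frequencies):
    """Return (class mark, frequency) points, with a zero-frequency class
    added before the first class and after the last one."""
    marks = [mark for _, mark in boundaries_and_marks(class_limits)]
    width = class_limits[1][0] - class_limits[0][0]  # assumes equal widths
    xs = [marks[0] - width] + marks + [marks[-1] + width]
    ys = [0] + list(frequencies) + [0]
    return list(zip(xs, ys))

rows = boundaries_and_marks(limits)
points = polygon_points(limits, freqs)
print(rows)
print(points)
```

Joining the points in order gives the frequency polygon. The two extra points at 74.5 and 124.5 are the class marks of the added zero-frequency classes.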
  • 26. Note:
      - The frequency polygon can be constructed by joining the midpoints of the tops of the histogram bars with a line.
      - The advantage of the frequency polygon over the histogram is that it allows us to compare two or more frequency distributions directly.
    Cumulative Frequency Polygon (Ogive)
      These are curves for cumulative frequency distributions, where cumulative frequencies are plotted on the vertical axis against class boundaries on the horizontal axis. The points are then joined smoothly.
      We can have a "less than" or a "more than" ogive.
    [Figure: two panels, the less than cumulative frequency (LCF) ogive and the more than cumulative frequency (MCF) ogive; horizontal axis marked at class boundaries from 69.5 to 119.5, vertical axis (cumulative frequency) from 0 to 45.]
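The points of the two ogives can be computed from the class boundaries and frequencies of the weight distribution, as sketched below in Python. The function names are illustrative. The "less than" curve counts observations below each boundary and the "more than" curve counts observations above it.

```python
from itertools import accumulate

# Class boundaries and frequencies of the weight distribution.
boundaries = [79.5, 89.5, 99.5, 109.5, 119.5]
freqs = [2, 4, 14, 25]

def less_than_ogive(bounds, frequencies):
    """Pair each class boundary with the number of observations below it."""
    cumulative = [0] + list(accumulate(frequencies))
    return list(zip(bounds, cumulative))

def more_than_ogive(bounds, frequencies):
    """Pair each class boundary with the number of observations above it."""
    total = sum(frequencies)
    cumulative = [0] + list(accumulate(frequencies))
    return [(b, total - c) for b, c in zip(bounds, cumulative)]

less = less_than_ogive(boundaries, freqs)
more = more_than_ogive(boundaries, freqs)
print(less)
print(more)
```

At every boundary the two cumulative frequencies add up to the total number of observations (45 here), which is a useful check on hand calculations.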
  • 27. Graphs: graphs usually take the form of lines or curves on a coordinate plane (mostly used for continuous data).
    Line Graph: a graph obtained by joining a series of points that represent time series data with line segments.
    Eg. The following data show the production of ...
      Production Year   1985   1986   1987   1988   1989
      Quantity           9.5   10.2   11.4   12.6   10.6
    [Figure: line graph of Quantity (vertical axis, 8 to 13) against Production Year (horizontal axis, 1985 to 1989).]
  • 28. Exercise
    1 Suppose data collected on the heights of 390 cows were tabulated in a frequency distribution and the following results were obtained: fi = 6, 25, 48, 72, 116, 60, 38, 23 and cm1 = 112, cm2 = 117. Determine the class width, the class limits, and the less than and more than cumulative frequency distributions.
    2 Given the following table
                    M    F   Total
      Christian    40   25      65
      Muslim       15   10      25
      Others        5    5      10
      Total        60   40     100
      a) Which diagrammatic presentation is appropriate to compare religions without considering sex? Why?
      b) If comparisons both between and within religion and sex are required, which diagrammatic presentation is appropriate?
  • 29. 1 Classify each of the following first as qualitative or quantitative, and second as nominal, ordinal, interval or ratio:
      Time for swimmers to complete a 50 meter race
      Months of the year (September, October, ... etc.)
      Religion in Ethiopia
    2 Suppose information is required on mentally ill persons, who will be reluctant to give the information. If you must get the information, which method do you use? Why?
    3 Discuss the difference between descriptive and inferential statistics, and give examples.