Chapter 2
Descriptive Statistics: Tabular and Graphical Methods
Summarizing Qualitative Data
Summarizing Quantitative Data
Exploratory Data Analysis
Crosstabulations and Scatter Diagrams

Summarizing Qualitative Data
Frequency Distribution
Relative Frequency
Percent Frequency Distribution
Bar Graph
Pie Chart

Frequency Distribution
A frequency distribution is a tabular summary of
data showing the frequency (or number) of items in
each of several nonoverlapping classes.
The objective is to provide insights about the data
that cannot be quickly obtained by looking only at
the original data.

Example: Marada Inn
Guests staying at Marada Inn were asked to rate the
quality of their accommodations as being excellent,
above average, average, below average, or poor. The
ratings provided by a sample of 20 guests are shown
below.
Below Average    Average          Above Average    Above Average
Above Average    Below Average    Average          Poor
Above Average    Excellent        Average          Above Average
Above Average    Average          Above Average    Above Average
Below Average    Poor             Above Average    Average

Example: Marada Inn
Frequency Distribution

Rating           Frequency
Poor                 2
Below Average        3
Average              5
Above Average        9
Excellent            1
Total               20
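The tally above can be reproduced mechanically. The following is a minimal sketch, assuming the 20 ratings are typed in as plain strings; the variable names (`ratings`, `classes`, `frequency`) are illustrative and not part of the original slides.

```python
from collections import Counter

# The 20 guest ratings from the Marada Inn sample, as listed on the slide.
ratings = [
    "Below Average", "Average", "Above Average", "Above Average",
    "Above Average", "Below Average", "Average", "Poor",
    "Above Average", "Excellent", "Average", "Above Average",
    "Above Average", "Average", "Above Average", "Above Average",
    "Below Average", "Poor", "Above Average", "Average",
]

# Fixed class order so the table reads from the worst rating to the best.
classes = ["Poor", "Below Average", "Average", "Above Average", "Excellent"]

# Count how many ratings fall in each nonoverlapping class.
counts = Counter(ratings)
frequency = {c: counts[c] for c in classes}

for c in classes:
    print(f"{c:<15}{frequency[c]:>3}")
print(f"{'Total':<15}{sum(frequency.values()):>3}")
```

Because every rating belongs to exactly one class, the frequencies must add up to the sample size, which is a quick check on any hand tally.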

Relative Frequency Distribution
The relative frequency of a class is the fraction or
proportion of the total number of data items
belonging to the class.
A relative frequency distribution is a tabular
summary of a set of data showing the relative
frequency for each class.

Percent Frequency Distribution
The percent frequency of a class is the relative
frequency multiplied by 100.
A percent frequency distribution is a tabular
summary of a set of data showing the percent
frequency for each class.
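As a rough illustration of the two definitions, the sketch below starts from the Marada Inn class frequencies and derives the relative and percent frequencies. The dictionary names are assumptions made for this example.

```python
# Class frequencies from the Marada Inn frequency distribution.
frequency = {
    "Poor": 2,
    "Below Average": 3,
    "Average": 5,
    "Above Average": 9,
    "Excellent": 1,
}

n = sum(frequency.values())  # total number of data items (20)

# Relative frequency: the fraction of all items that belong to the class.
relative = {c: f / n for c, f in frequency.items()}

# Percent frequency: the relative frequency multiplied by 100.
percent = {c: 100 * r for c, r in relative.items()}

for c in frequency:
    print(f"{c:<15}{relative[c]:>6.2f}{percent[c]:>6.0f}")
```

The relative frequencies sum to 1 and the percent frequencies sum to 100, up to floating-point rounding.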

Example: Marada Inn
Relative Frequency and Percent Frequency Distributions

                  Relative     Percent
Rating            Frequency    Frequency
Poor                 .10          10
Below Average        .15          15
Average              .25          25
Above Average        .45          45
Excellent            .05           5
Total               1.00         100

Bar Graph
A bar graph is a graphical device for depicting
qualitative data.
On the horizontal axis we specify the labels that are
used for each of the classes.
A frequency, relative frequency, or percent frequency
scale can be used for the vertical axis.
Using a bar of fixed width drawn above each class
label, we extend the height appropriately.
The bars are separated to emphasize the fact that
each class is a separate category.
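A plain-text approximation of a bar graph can help when no plotting tool is at hand. This sketch is an assumption-laden stand-in: it draws the bars horizontally (one row per class) rather than vertically as the slide describes, and `bar_graph` is a hypothetical helper name.

```python
# Class frequencies from the Marada Inn frequency distribution.
frequency = {
    "Poor": 2,
    "Below Average": 3,
    "Average": 5,
    "Above Average": 9,
    "Excellent": 1,
}

def bar_graph(freq, symbol="#"):
    """Return one text row per class: label, a bar of fixed thickness, count."""
    width = max(len(label) for label in freq)
    return [
        f"{label:<{width}} | {symbol * count} {count}"
        for label, count in freq.items()
    ]

# Each class gets its own separate row, mirroring the separated bars.
for line in bar_graph(frequency):
    print(line)
```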

Example: Marada Inn
Bar Graph
[Figure: bar graph of the Marada Inn ratings. Horizontal axis: Rating (Poor, Below Average, Average, Above Average, Excellent); vertical axis: Frequency, with ticks from 1 to 9.]

Pie Chart
The pie chart is a commonly used graphical device
for presenting relative frequency distributions for
qualitative data.
First draw a circle; then use the relative frequencies
to subdivide the circle into sectors that correspond to
the relative frequency for each class.
Since there are 360 degrees in a circle, a class with a
relative frequency of .25 would consume .25(360) =
90 degrees of the circle.
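The sector arithmetic can be sketched in a few lines. The example below applies the rule "relative frequency times 360" to the Marada Inn relative frequencies; the names `relative` and `angles` are illustrative assumptions.

```python
# Relative frequencies from the Marada Inn example.
relative = {
    "Poor": 0.10,
    "Below Average": 0.15,
    "Average": 0.25,
    "Above Average": 0.45,
    "Excellent": 0.05,
}

# Each class takes a share of the 360 degrees equal to its relative frequency.
angles = {c: r * 360 for c, r in relative.items()}

for c, a in angles.items():
    print(f"{c:<15}{a:>7.1f} degrees")
```

The sector angles should add up to 360 degrees, apart from floating-point rounding.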

Example: Marada Inn
Pie Chart
[Figure: pie chart titled "Quality Ratings". Sectors: Poor 10%, Below Average 15%, Average 25%, Above Average 45%, Excellent (Exc.) 5%.]

Example: Marada Inn
Insights Gained from the Preceding Pie Chart
• One-half of the customers surveyed gave Marada
a quality rating of “above average” or “excellent”
(looking at the left side of the pie). This might
please the manager.
• For each customer who gave an “excellent” rating,
there were two customers who gave a “poor”
rating (looking at the top of the pie). This should
displease the manager.

Summarizing Quantitative Data
Frequency Distribution
Relative Frequency and Percent Frequency Distributions
Dot Plot
Histogram
Cumulative Distributions
Ogive

Example: Hudson Auto Repair
The manager of Hudson Auto would like to get a
better picture of the distribution of costs for engine
tune-up parts. A sample of 50 customer invoices has
been taken and the costs of parts, rounded to the
nearest dollar, are listed below.

 91   78   93   57   75   52   99   80   97   62
 71   69   72   89   66   75   79   75   72   76
104   74   62   68   97  105   77   65   80  109
 85   97   88   68   83   68   71   69   67   74
 62   82   98  101   79  105   79   69   62   73

Frequency Distribution
Guidelines for Selecting Number of Classes
• Use between 5 and 20 classes.
• Data sets with a larger number of elements
usually require a larger number of classes.
• Smaller data sets usually require fewer classes.

Frequency Distribution
Guidelines for Selecting Width of Classes
• Use classes of equal width.
• Approximate Class Width =
  (Largest Data Value − Smallest Data Value) / Number of Classes

Example: Hudson Auto Repair
Frequency Distribution
If we choose six classes:
Approximate Class Width = (109 - 52)/6 = 9.5 ≅ 10
Cost ($)     Frequency
50-59            2
60-69           13
70-79           16
80-89            7
90-99            7
100-109          5
Total           50

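The class-width guideline and the resulting frequency distribution can be checked with a short sketch. Two choices here are assumptions rather than rules from the slides: the approximate width of 9.5 is rounded up with `math.ceil`, and the first class is started at 50 to match the table.

```python
import math
from collections import Counter

# The 50 parts costs (in dollars) from the Hudson Auto Repair sample.
costs = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]

num_classes = 6

# Approximate class width = (largest - smallest) / number of classes.
approx_width = (max(costs) - min(costs)) / num_classes  # (109 - 52) / 6
width = math.ceil(approx_width)  # assumption: round up to a whole dollar

# Assumption: the first class starts at 50, as in the slide's table.
first_lower = 50

# Map each cost to its class index, then count the items per class.
counts = Counter((c - first_lower) // width for c in costs)
table = [
    (first_lower + i * width, first_lower + (i + 1) * width - 1, counts[i])
    for i in range(num_classes)
]

for lower, upper, freq in table:
    print(f"{lower}-{upper}: {freq}")
```

Using equal-width classes keeps the mapping from value to class a single integer division, which is one practical reason for the equal-width guideline.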
Example: Hudson Auto Repair
Relative Frequency and Percent Frequency Distributions

             Relative     Percent
Cost ($)     Frequency    Frequency
50-59           .04           4
60-69           .26          26
70-79           .32          32
80-89           .14          14
90-99           .14          14
100-109         .10          10
Total          1.00         100

Example: Hudson Auto Repair
Insights Gained from the Percent Frequency Distribution
• Only 4% of the parts costs are in the $50-59 class.
• 30% of the parts costs are under $70.
• The greatest percentage (32% or almost one-third)
of the parts costs are in the $70-79 class.
• 10% of the parts costs are $100 or more.

Dot Plot
One of the simplest graphical summaries of data is a
dot plot.
A horizontal axis shows the range of data values.
Then each data value is represented by a dot placed
above the axis.
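A dot plot can be approximated in plain text by stacking one dot per occurrence of each value. The sketch below does this for the Hudson Auto Repair parts costs; the names `stack`, `rows`, and the one-character-per-dollar layout are assumptions made for illustration.

```python
from collections import Counter

# The 50 parts costs (in dollars) from the Hudson Auto Repair sample.
costs = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]

stack = Counter(costs)            # how many dots sit above each value
lo, hi = min(costs), max(costs)   # range covered by the horizontal axis
tallest = max(stack.values())     # height of the tallest stack of dots

# Build the plot from the top row down; one character column per dollar.
rows = [
    "".join("." if stack[v] >= level else " " for v in range(lo, hi + 1))
    for level in range(tallest, 0, -1)
]

for row in rows:
    print(row)
print(f"axis runs from {lo} to {hi}")
```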

Example: Hudson Auto Repair
Dot Plot
[Figure: dot plot of the 50 parts costs. Horizontal axis: Cost ($), with ticks from 50 to 110 in steps of 10.]

Histogram
Another common graphical presentation of
quantitative data is a histogram.
The variable of interest is placed on the horizontal
axis.
A rectangle is drawn above each class interval with
its height corresponding to the interval’s frequency,
relative frequency, or percent frequency.
Unlike a bar graph, a histogram has no natural
separation between rectangles of adjacent classes.

Example: Hudson Auto Repair
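Histogram
[Figure: histogram of the parts costs. Horizontal axis: Parts Cost ($), with ticks from 50 to 110 in steps of 10; vertical axis: Frequency, with ticks from 2 to 18 in steps of 2.]
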
Cumulative Distributions
Cumulative frequency distribution -- shows the
number of items with values less than or equal to the
upper limit of each class.
Cumulative relative frequency distribution -- shows
the proportion of items with values less than or equal
to the upper limit of each class.
Cumulative percent frequency distribution -- shows
the percentage of items with values less than or equal
to the upper limit of each class.
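All three cumulative distributions follow from a running total of the class frequencies. The sketch below uses the Hudson Auto Repair class frequencies; the variable names are assumptions for this example.

```python
from itertools import accumulate

# Upper class limits and class frequencies for the parts-cost data.
upper_limits = [59, 69, 79, 89, 99, 109]
frequency = [2, 13, 16, 7, 7, 5]

n = sum(frequency)  # 50 invoices in total

# Running totals: items with values less than or equal to each upper limit.
cum_freq = list(accumulate(frequency))
cum_rel = [cf / n for cf in cum_freq]        # cumulative proportions
cum_pct = [100 * cf / n for cf in cum_freq]  # cumulative percentages

for u, cf, cr, cp in zip(upper_limits, cum_freq, cum_rel, cum_pct):
    print(f"<= {u:<4}{cf:>4}{cr:>7.2f}{cp:>6.0f}")
```

The last entry of each cumulative column equals the total: 50 items, a proportion of 1.00, and 100 percent.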

Example: Hudson Auto Repair
Cumulative Distributions

             Cumulative    Cumulative Relative    Cumulative Percent
Cost ($)     Frequency     Frequency              Frequency
≤ 59             2              .04                     4
≤ 69            15              .30                    30
≤ 79            31              .62                    62
≤ 89            38              .76                    76
≤ 99            45              .90                    90
≤ 109           50             1.00                   100

Ogive
An ogive is a graph of a cumulative distribution.
The data values are shown on the horizontal axis.
Shown on the vertical axis are the:
• cumulative frequencies, or
• cumulative relative frequencies, or
• cumulative percent frequencies
The cumulative frequency (one of the above) for each
class is plotted as a point.
The plotted points are connected by straight lines.

Example: Hudson Auto Repair
Ogive
• Because the class limits for the parts-cost data are
50-59, 60-69, and so on, there appear to be one-unit
gaps from 59 to 60, 69 to 70, and so on.
• These gaps are eliminated by plotting points
halfway between the class limits.
• Thus, 59.5 is used for the 50-59 class, 69.5 is used
for the 60-69 class, and so on.
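The plotting points for the ogive can be listed before drawing anything. This sketch shifts each upper class limit by half a unit, as described above. One extra point is an assumption not stated on the slide: the curve is started at 0 percent at 49.5, the lower boundary of the first class.

```python
# Upper class limits and cumulative percent frequencies for the parts costs.
upper_limits = [59, 69, 79, 89, 99, 109]
cum_pct = [4, 30, 62, 76, 90, 100]

# Plot each class halfway between its upper limit and the next lower limit.
points = [(u + 0.5, p) for u, p in zip(upper_limits, cum_pct)]

# Assumption: begin the ogive at zero at the first class's lower boundary.
points = [(49.5, 0)] + points

for x, y in points:
    print(f"({x}, {y})")
```

Connecting consecutive points with straight lines then gives the ogive.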

Example: Hudson Auto Repair

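Ogive with Cumulative Percent Frequencies
[Figure: ogive for the parts costs. Horizontal axis: Parts Cost ($), with ticks from 50 to 110 in steps of 10; vertical axis: Cumulative Percent Frequency, with ticks from 20 to 100 in steps of 20.]
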
Exploratory Data Analysis
The techniques of exploratory data analysis consist of
simple arithmetic and easy-to-draw pictures that can
be used to summarize data quickly.
One such technique is the stem-and-leaf display.

Stem-and-Leaf Display
A stem-and-leaf display shows both the rank order
and shape of the distribution of the data.
It is similar to a histogram on its side, but it has the
advantage of showing the actual data values.
The first digits of each data item are arranged to the
left of a vertical line.
To the right of the vertical line we record the last
digit for each item in rank order.
Each line in the display is referred to as a stem.
Each digit on a stem is a leaf.
8 | 5 7
9 | 3 6 7 8

Stem-and-Leaf Display
Leaf Units
• A single digit is used to define each leaf.
• In the preceding example, the leaf unit was 1.
• Leaf units may be 100, 10, 1, 0.1, and so on.
• Where the leaf unit is not shown, it is assumed to
equal 1.

Example: Leaf Unit = 0.1
If we have data with values such as
8.6   11.7   9.4   9.1   10.2   11.0   8.8
a stem-and-leaf display of these data will be
Leaf Unit = 0.1
 8 | 6 8
 9 | 1 4
10 | 2
11 | 0 7

Example: Leaf Unit = 10
If we have data with values such as
1806 1717 1974 1791 1682 1910 1838
a stem-and-leaf display of these data will be
Leaf Unit = 10
16 | 8
17 | 1 9
18 | 0 3
19 | 1 7
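Both leaf-unit examples can be reproduced by one small routine. The following is a sketch under stated assumptions: `stem_and_leaf` is a hypothetical helper, values are nonnegative, and digits finer than the leaf unit are dropped rather than rounded (which matches 1806 appearing as stem 18, leaf 0).

```python
from collections import defaultdict

def stem_and_leaf(values, leaf_unit=1):
    """Return 'stem | leaves' lines for nonnegative values."""
    rows = defaultdict(list)
    for v in values:
        # Express the value in leaf units; the inner round() only guards
        # against floating-point noise such as 8.6 / 0.1 = 85.999...
        units = int(round(v / leaf_unit, 6))
        stem, leaf = divmod(units, 10)  # last digit is the leaf
        rows[stem].append(leaf)
    # Stems in increasing order, leaves in rank order on each stem.
    return [
        f"{stem:>3} | {' '.join(str(leaf) for leaf in sorted(rows[stem]))}"
        for stem in sorted(rows)
    ]

print("Leaf Unit = 0.1")
for line in stem_and_leaf([8.6, 11.7, 9.4, 9.1, 10.2, 11.0, 8.8], leaf_unit=0.1):
    print(line)

print("Leaf Unit = 10")
for line in stem_and_leaf([1806, 1717, 1974, 1791, 1682, 1910, 1838], leaf_unit=10):
    print(line)
```

Note that this sketch omits stems that have no leaves; a display meant to show the shape of the distribution might prefer to print such stems as empty rows.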

Example: Hudson Auto Repair
Stem-and-Leaf Display
 5 | 2 7
 6 | 2 2 2 2 5 6 7 8 8 8 9 9 9
 7 | 1 1 2 2 3 4 4 5 5 5 6 7 8 9 9 9
 8 | 0 0 2 3 5 8 9
 9 | 1 3 7 7 7 8 9
10 | 1 4 5 5 9

Stretched Stem-and-Leaf Display
If we believe the original stem-and-leaf display has
condensed the data too much, we can stretch the
display by using two or more stems for each leading
digit(s).
Whenever a stem value is stated twice, the first value
corresponds to leaf values of 0-4, and the second
value corresponds to leaf values of 5-9.
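The stretching rule amounts to splitting each stem into a low half (leaves 0-4) and a high half (leaves 5-9). The sketch below applies it to the Hudson Auto Repair costs, assuming whole-dollar values and a leaf unit of 1; `stretched_stem_and_leaf` is a hypothetical helper name.

```python
# The 50 parts costs (in dollars) from the Hudson Auto Repair sample.
costs = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]

def stretched_stem_and_leaf(values):
    """Two rows per stem: leaves 0-4 first, then leaves 5-9 (leaf unit 1)."""
    rows = {}
    for v in sorted(values):            # sorting puts leaves in rank order
        stem, leaf = divmod(v, 10)
        half = 0 if leaf <= 4 else 1    # 0 = low half, 1 = high half
        rows.setdefault((stem, half), []).append(leaf)
    return [
        f"{stem:>2} | {' '.join(map(str, leaves))}"
        for (stem, half), leaves in sorted(rows.items())
    ]

for line in stretched_stem_and_leaf(costs):
    print(line)
```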

Example: Hudson Auto Repair
Stretched Stem-and-Leaf Display
 5 | 2
 5 | 7
 6 | 2 2 2 2
 6 | 5 6 7 8 8 8 9 9 9
 7 | 1 1 2 2 3 4 4
 7 | 5 5 5 6 7 8 9 9 9
 8 | 0 0 2 3
 8 | 5 8 9
 9 | 1 3
 9 | 7 7 7 8 9
10 | 1 4
10 | 5 5 9

Crosstabulations and Scatter Diagrams
Thus far we have focused on methods that are used
to summarize the data for one variable at a time.
Often a manager is interested in tabular and
graphical methods that will help understand the
relationship between two variables.
Crosstabulation and a scatter diagram are two
methods for summarizing the data for two (or more)
variables simultaneously.

Crosstabulation
Crosstabulation is a tabular method for summarizing
the data for two variables simultaneously.
Crosstabulation can be used when:
• One variable is qualitative and the other is
quantitative
• Both variables are qualitative
• Both variables are quantitative
The left and top margin labels define the classes for
the two variables.

Example: Finger Lakes Homes
Crosstabulation
The number of Finger Lakes homes sold for each
style and price for the past two years is shown below.
                          Home Style
Price Range    Colonial   Ranch   Split   A-Frame   Total
≤ $99,000         18         6      19       12       55
> $99,000         12        14      16        3       45
Total             30        20      35       15      100

Example: Finger Lakes Homes
Insights Gained from the Preceding Crosstabulation
• The greatest number of homes in the sample (19)
are a split-level style and priced at less than or
equal to $99,000.
• Only three homes in the sample are an A-Frame
style and priced at more than $99,000.

Crosstabulation: Row or Column Percentages
Converting the entries in the table into row
percentages or column percentages can provide
additional insight about the relationship between the
two variables.
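The conversion is a matter of dividing each cell by its row total or its column total. The sketch below does both for the Finger Lakes Homes counts; the dictionary layout and names are assumptions for this example, and results are rounded to two decimals.

```python
# Counts of homes sold by price range (rows) and home style (columns).
styles = ["Colonial", "Ranch", "Split", "A-Frame"]
counts = {
    "<= $99,000": [18, 6, 19, 12],
    "> $99,000": [12, 14, 16, 3],
}

# Row percentages: each cell as a percentage of its price-range total.
row_pct = {
    price: [round(100 * c / sum(row), 2) for c in row]
    for price, row in counts.items()
}

# Column percentages: each cell as a percentage of its home-style total.
col_totals = [sum(col) for col in zip(*counts.values())]
col_pct = {
    price: [round(100 * c / t, 2) for c, t in zip(row, col_totals)]
    for price, row in counts.items()
}

print("Row percentages:", row_pct)
print("Column percentages:", col_pct)
```

Row percentages answer "of the homes in this price range, what share is each style?", while column percentages answer "of the homes of this style, what share falls in each price range?".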

Example: Finger Lakes Homes
Row Percentages
                          Home Style
Price Range    Colonial   Ranch   Split   A-Frame   Total
≤ $99,000       32.73     10.91   34.55    21.82     100
> $99,000       26.67     31.11   35.56     6.67     100

Note: row totals are actually 100.01 due to rounding.

Example: Finger Lakes Homes
Column Percentages
                          Home Style
Price Range    Colonial   Ranch    Split   A-Frame
≤ $99,000       60.00     30.00    54.29    80.00
> $99,000       40.00     70.00    45.71    20.00
Total          100       100      100      100

Scatter Diagram
A scatter diagram is a graphical presentation of the
relationship between two quantitative variables.
One variable is shown on the horizontal axis and the
other variable is shown on the vertical axis.
The general pattern of the plotted points suggests the
overall relationship between the variables.

Example: Panthers Football Team
Scatter Diagram
The Panthers football team is interested in
investigating the relationship, if any, between
interceptions made and points scored.
x = Number of      y = Number of
Interceptions      Points Scored
      1                 14
      3                 24
      2                 18
      1                 17
      3                 27

Example: Panthers Football Team

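Scatter Diagram
[Figure: scatter diagram of the five games. Horizontal axis (x): Number of Interceptions, with ticks from 0 to 3; vertical axis (y): Number of Points Scored, with ticks from 0 to 30 in steps of 5.]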

Example: Panthers Football Team
The preceding scatter diagram indicates a positive
relationship between the number of interceptions and
the number of points scored.
Higher points scored are associated with a higher
number of interceptions.
The relationship is not perfect; not all plotted points
in the scatter diagram lie on a straight line.
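One rough numerical check of the pattern, which is not part of the slides, is to average the points scored at each interception count. The sketch below does this for the five Panthers games; the variable names are assumptions.

```python
from collections import defaultdict

# The five (interceptions, points scored) pairs from the Panthers example.
interceptions = [1, 3, 2, 1, 3]
points_scored = [14, 24, 18, 17, 27]

# Group the points scored by the number of interceptions in the game.
by_x = defaultdict(list)
for x, y in zip(interceptions, points_scored):
    by_x[x].append(y)

# Average points scored at each interception count.
mean_points = {x: sum(ys) / len(ys) for x, ys in sorted(by_x.items())}

for x, m in mean_points.items():
    print(f"{x} interception(s): mean points scored = {m}")
```

The averages rise with the number of interceptions, which agrees with the positive relationship seen in the diagram. The two games with one interception (14 and 17 points) and the two with three (24 and 27 points) also show why the points cannot all fall on one straight line.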

Scatter Diagram
A Positive Relationship

[Figure: schematic scatter diagram of y against x illustrating a positive relationship.]

Scatter Diagram
A Negative Relationship

[Figure: schematic scatter diagram of y against x illustrating a negative relationship.]

Scatter Diagram
No Apparent Relationship

[Figure: schematic scatter diagram of y against x illustrating no apparent relationship.]

Tabular and Graphical Procedures
Data

Qualitative Data
Tabular Methods
• Frequency Distribution
• Rel. Freq. Dist.
• % Freq. Dist.
• Crosstabulation
Graphical Methods
• Bar Graph
• Pie Chart

Quantitative Data
Tabular Methods
• Frequency Distribution
• Rel. Freq. Dist.
• Cum. Freq. Dist.
• Cum. Rel. Freq. Distribution
• Stem-and-Leaf Display
• Crosstabulation
Graphical Methods
• Dot Plot
• Histogram
• Ogive
• Scatter Diagram
End of Chapter 2
