Analytic Square 1
Fundamentals of Statistics
Analytic Square
Making the Difference
Analytic Square 2
Outline
• Introduction
• Frequency Distribution
• Measures of Central Tendency
• Measures of Dispersion
Analytic Square 3
Outline-Continued
• Other Measures
• Concept of a Population and Sample
• The Normal Curve
• Tests for Normality
Analytic Square 4
Learning Objectives
When you have completed this chapter you should be able to:
• Know the difference between a variable and an attribute.
• Perform mathematical calculations to the correct number of significant figures.
• Construct histograms for simple and complex data.
Analytic Square 5
Learning Objectives-cont’d.
When you have completed this chapter you should be able to:
• Calculate and effectively use the different measures of central tendency, dispersion, and interrelationship.
• Understand the concept of a universe and a sample.
• Understand the concept of a normal curve and the relationship to the mean and standard deviation.
Analytic Square 6
Learning Objectives-cont’d.
When you have completed this chapter you should be able to:
• Calculate the percent of items below a value, above a value, or between two values for data that are normally distributed.
• Calculate the process center given the percent of items below a value.
• Perform the different tests of normality.
• Construct a scatter diagram and perform the necessary related calculations.
Analytic Square 7
Introduction
Definition of Statistics:
1. A collection of quantitative data pertaining to a subject or group. Examples are blood pressure statistics, etc.
2. The science that deals with the collection, tabulation, analysis, interpretation, and presentation of quantitative data.
Analytic Square 8
Introduction
Two phases of statistics:
• Descriptive Statistics: Describes the characteristics of a product or process using information collected on it.
• Inferential Statistics (Inductive): Draws conclusions on unknown process parameters based on information contained in a sample. Uses probability.
Analytic Square 9
Collection of Data
Types of Data:
Attribute:
Discrete data. Data values can only be integers. Counted data or attribute data. Examples include:
• How many of the products are defective?
• How often are the machines repaired?
• How many people are absent each day?
Analytic Square 10
Collection of Data – Cont’d.
Types of Data:
Attribute:
Discrete data. Data values can only be integers. Counted data or attribute data. Examples include:
• How many days did it rain last month?
• What kind of performance was achieved?
• Number of defects, defectives
Analytic Square 11
Collection of Data
Types of Data:
Variable:
Continuous data. Data values can be any real number. Measured data. Examples include:
• How long is each item?
• How long did it take to complete the task?
• What is the weight of the product?
• Length, volume, time
Analytic Square 12
Collection of Data
• Significant Figures
• Rounding
Analytic Square 13
Significant Figures
• Significant figures apply to measured numbers.
• When you measure something there is always room for a little bit of error.
• How tall are you: 5 ft 9 inches or 5 ft 9.1 inches?
• Counted numbers and defined numbers (12 in. = 1 ft; there are 6 people in my family) are exact, so they do not limit the significant figures of a result.
Analytic Square 14
Significant Figures
• Significant figures are used to indicate the amount of variation which is allowed in a number.
• The last significant digit is believed to be closer to the actual value than any other digit.
• Significant figures:
  • 3.69 has 3 significant digits.
  • 36.900 has 5 significant digits.
Analytic Square 15
Significant Figures – Cont’d.
• Use Scientific Notation
  • 3 x 10^2 (1 significant digit)
  • 3.0 x 10^2 (2 significant digits)
Analytic Square 16
Significant Figures
• Rules for Multiplying and Dividing
  • The result has the same number of significant digits as the number with the fewest significant digits.
  • 6.59 x 2.3 = 15
  • 32.65/24 = 1.4 (where 24 is not a counting number)
  • 32.64/24 = 1.360 (24 is a counting number, i.e. 24.00)
Analytic Square 17
Significant Figures
Rules for Adding and Subtracting
• The result can have no more digits after the decimal point than the number with the fewest digits after the decimal point.
• 38.26 - 6 = 32 (6 is not a counting number)
• 38.2 - 6 = 32.2 (6 is a counting number)
• 38.26 - 6.1 = 32.2 (rounded from 32.16)
• If the first dropped digit is >= 5 then round up, else round down.
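The rounding rules above can be sketched in Python. This is a minimal illustration, not part of the original slides: the helper names `round_sig` and `round_places` are hypothetical, and `Decimal` with `ROUND_HALF_UP` is assumed so that the behaviour matches the "round up at 5" rule.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_sig(value: Decimal, sig: int) -> Decimal:
    """Round value to `sig` significant digits (hypothetical helper)."""
    if value == 0:
        return value
    # adjusted() gives the exponent of the most significant digit
    exponent = value.adjusted() - sig + 1
    return value.quantize(Decimal(1).scaleb(exponent), rounding=ROUND_HALF_UP)

def round_places(value: Decimal, places: int) -> Decimal:
    """Round value to `places` digits after the decimal point (hypothetical helper)."""
    return value.quantize(Decimal(1).scaleb(-places), rounding=ROUND_HALF_UP)

# Multiplying and dividing: keep the fewest significant digits
product = round_sig(Decimal("6.59") * Decimal("2.3"), 2)
quotient = round_sig(Decimal("32.64") / Decimal("24"), 4)

# Adding and subtracting: keep the fewest decimal places
difference = round_places(Decimal("38.26") - Decimal("6.1"), 1)

print(product, quotient, difference)
```

The number of digits to keep is passed in by hand here; deciding it automatically would require tracking which inputs are counted (exact) numbers.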
Analytic Square 18
Precision and Accuracy
Precision
The precision of a measurement is determined by how reproducible that measurement value is. For example, if a sample is weighed by a student to be 42.58 g, and then weighed by another student five different times with the resulting data 42.09 g, 42.15 g, 42.1 g, 42.16 g, 42.12 g, then the original measurement is not very precise, since it cannot be reproduced.
Analytic Square 19
Precision and Accuracy
Accuracy
• The accuracy of a measurement is determined by how close a measured value is to its “true” value.
• For example, suppose a sample is known to weigh 3.182 g and is then weighed five different times by a student, with the resulting data: 3.200 g, 3.180 g, 3.152 g, 3.168 g, 3.189 g.
• The most accurate measurement is 3.180 g, because it is closest to the true weight of the sample.
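The two worked examples can be checked with a few lines of Python. This is only a sketch: using the sample standard deviation as the measure of reproducibility is an assumption made here, not something the slides prescribe.

```python
import statistics

# Accuracy example: known weight and five weighings, in grams
true_weight = 3.182
weighings = [3.200, 3.180, 3.152, 3.168, 3.189]

# The most accurate reading has the smallest absolute error
most_accurate = min(weighings, key=lambda w: abs(w - true_weight))

# Precision example: five repeat weighings versus the original 42.58 g
repeat_readings = [42.09, 42.15, 42.1, 42.16, 42.12]
spread = statistics.stdev(repeat_readings)  # assumed measure of reproducibility
original_gap = abs(42.58 - statistics.mean(repeat_readings))

print(most_accurate, round(spread, 3), round(original_gap, 3))
```

The original reading sits many standard deviations away from the repeat readings, which is the sense in which it "cannot be reproduced".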
Analytic Square 20
Precision and Accuracy
Figure 4-1 Difference between accuracy and precision
Analytic Square 21
Describing Data
• Frequency Distribution
• Measures of Central Tendency
• Measures of Dispersion
Analytic Square 22
Frequency Distribution
• Ungrouped Data
• Grouped Data
Analytic Square 23
Frequency Distribution
There are three types of frequency distributions:
• Categorical frequency distributions
• Ungrouped frequency distributions
• Grouped frequency distributions
Analytic Square 24
Categorical
Categorical frequency distributions
• Can be used for data that can be placed in specific categories, such as nominal- or ordinal-level data.
• Examples: political affiliation, religious affiliation, blood type, etc.
Analytic Square 25
Categorical
Example: Blood Type Frequency Distribution
Class   Frequency   Percent
A       5           20
B       7           28
O       9           36
AB      4           16
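A categorical frequency distribution like the blood type example can be tallied with `collections.Counter`. The raw list below is hypothetical: it is rebuilt from the counts in the table purely so that there is something to count.

```python
from collections import Counter

# Hypothetical raw observations, reconstructed from the table's counts
blood_types = ["A"] * 5 + ["B"] * 7 + ["O"] * 9 + ["AB"] * 4

counts = Counter(blood_types)
total = sum(counts.values())
percents = {cls: 100 * freq / total for cls, freq in counts.items()}

print("Class Frequency Percent")
for cls in ["A", "B", "O", "AB"]:
    print(f"{cls:<6}{counts[cls]:<10}{percents[cls]:.0f}")
```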
Analytic Square 26
Ungrouped
Ungrouped frequency distributions
• Can be used for data that can be enumerated and when the range of values in the data set is not large.
• Examples: number of miles your instructors have to travel from home to campus, number of girls in a 4-child family, etc.
Analytic Square 27
Ungrouped
Example: Number of Miles Traveled
Class   Frequency
5       24
10      16
15      10
Analytic Square 28
Grouped
Grouped frequency distributions
• Can be used when the range of values in the data set is very large. The data must be grouped into classes that are more than one unit in width.
• Examples: the life of boat batteries in hours.
Analytic Square 29
Grouped
Example: Lifetimes of Boat Batteries
Class limits   Class boundaries   Frequency   Cumulative frequency
24 - 37        23.5 - 37.5        4           4
38 - 51        37.5 - 51.5        14          18
52 - 65        51.5 - 65.5        7           25
Analytic Square 30
Frequency Distributions
Table 4-3 Different Frequency Distributions of Data Given in Table 4-1
Number nonconforming   Frequency   Relative frequency   Cumulative frequency   Relative cumulative frequency
0                      15          0.29                 15                     0.29
1                      20          0.38                 35                     0.67
2                      8           0.15                 43                     0.83
3                      5           0.10                 48                     0.92
4                      3           0.06                 51                     0.98
5                      1           0.02                 52                     1.00
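The derived columns of Table 4-3 follow directly from the frequency column. A short sketch, with variable names chosen here for illustration:

```python
from itertools import accumulate

# Frequencies for 0, 1, 2, 3, 4, 5 nonconforming units (from Table 4-3)
frequencies = [15, 20, 8, 5, 3, 1]

n = sum(frequencies)
relative = [f / n for f in frequencies]
cumulative = list(accumulate(frequencies))
relative_cumulative = [c / n for c in cumulative]

for k, row in enumerate(zip(frequencies, relative, cumulative, relative_cumulative)):
    f, rel, cum, rel_cum = row
    print(f"{k}  {f:>3}  {rel:.2f}  {cum:>3}  {rel_cum:.2f}")
```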
Analytic Square 31
Frequency Histogram
[Figure: frequency histogram; x-axis: Number Nonconforming (0 to 5); y-axis: Frequency (0 to 25)]
Analytic Square 32
Relative Frequency Histogram
[Figure: relative frequency histogram; x-axis: Number Nonconforming (0 to 5); y-axis: Relative Frequency (0.00 to 0.45)]
Analytic Square 33
Cumulative Frequency Histogram
[Figure: cumulative frequency histogram; x-axis: Number Nonconforming (0 to 5); y-axis: Cumulative Frequency (0 to 60)]
Analytic Square 34
The Histogram
The histogram is the most important graphical tool for exploring the shape of data distributions.
Check http://quarknet.fnal.gov/toolkits/ati/histograms.html for the construction, analysis, and understanding of histograms.
Analytic Square 35
Constructing a Histogram
The Fast Way
Step 1: Find the range of the distribution: largest value - smallest value.
Step 2: Choose the number of classes, 5 to 20.
Step 3: Determine the width of the classes, using one decimal place more than the data: class width = range / number of classes.
Step 4: Determine the class boundaries.
Step 5: Draw the frequency histogram.
#classes = √n
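The five steps can be sketched as a small Python function. This is an illustration under stated assumptions: `build_histogram` is a hypothetical helper, the default of roughly √n classes clamped to the 5-to-20 guideline is an assumption, the sample values are made up, and the sketch does not add the extra decimal place to the boundaries that Step 3 calls for.

```python
import math

def build_histogram(data, num_classes=None):
    """Return (boundaries, counts) for equal-width classes (hypothetical helper)."""
    n = len(data)
    if num_classes is None:
        # Assumed default: about sqrt(n) classes, kept within 5 to 20 (Step 2)
        num_classes = min(20, max(5, round(math.sqrt(n))))
    smallest, largest = min(data), max(data)
    data_range = largest - smallest              # Step 1
    width = data_range / num_classes             # Step 3
    if width == 0:
        raise ValueError("all values are identical; nothing to group")
    boundaries = [smallest + k * width for k in range(num_classes + 1)]  # Step 4
    counts = [0] * num_classes
    for x in data:
        # The largest value is placed in the last class
        idx = min(int((x - smallest) / width), num_classes - 1)
        counts[idx] += 1
    return boundaries, counts

# Made-up measurements, for illustration only
sample = [2.1, 2.4, 2.2, 2.9, 3.1, 2.6, 2.5, 2.8, 3.0, 2.3,
          2.7, 2.6, 2.5, 2.4, 2.9, 2.6, 2.7, 2.5, 2.8, 2.6]
boundaries, counts = build_histogram(sample)

# Step 5: a text rendering of the frequency histogram
for lower, upper, count in zip(boundaries, boundaries[1:], counts):
    print(f"{lower:.2f} to {upper:.2f}: {'#' * count}")
```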
Analytic Square 36
Constructing a Histogram
Number of groups or cells
• Fewer than 100 observations: 5 to 9 cells
• Between 100 and 500 observations: 8 to 17 cells
• More than 500 observations: 15 to 20 cells
Analytic Square 37
Constructing a Histogram
For a more accurate way of drawing a histogram, see the section on grouped data in your textbook.
Analytic Square 38
Other Types of Frequency Distribution Graphs
• Bar Graph
• Polygon of Data
• Cumulative Frequency Distribution or Ogive
Analytic Square 39
Bar Graph and Polygon of Data
Analytic Square 40
Cumulative Frequency
Analytic Square 41
Characteristics of Frequency Distribution Graphs
Figure 4-6 Characteristics of frequency distributions
Analytic Square 42
Analysis of Histograms
Figure 4-7 Differences due to location, spread, and shape
Analytic Square 43
Analysis of Histograms
Figure 4-8 Histogram of Wash Concentration
Analytic Square 44
Measures of Central Tendency
The three measures in common use are the:
• Average
• Median
• Mode
Analytic Square 45
Average
There are three different techniques available for calculating the average:
• Ungrouped data
• Grouped data
• Weighted average
Analytic Square 46
Average-Ungrouped Data
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$$
Analytic Square 47
Average-Grouped Data
$$\bar{X} = \frac{\sum_{i=1}^{h} f_i X_i}{n} = \frac{f_1 X_1 + f_2 X_2 + \cdots + f_h X_h}{f_1 + f_2 + \cdots + f_h}$$
where h = number of cells, f_i = frequency, X_i = midpoint
Analytic Square 48
Average-Weighted Average
$$\bar{X}_w = \frac{\sum_{i=1}^{n} w_i X_i}{\sum_{i=1}^{n} w_i}$$
Used when a number of averages are combined with different frequencies.
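As a small illustration of the weighted average, the sketch below combines three averages that are based on different numbers of observations. The function name and the numbers are made up for the example.

```python
def weighted_average(values, weights):
    """Sum of w_i * X_i divided by the sum of w_i (hypothetical helper)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)

# Made-up example: three averages based on 1, 2, and 1 observations
averages = [50, 60, 70]
num_observations = [1, 2, 1]
overall = weighted_average(averages, num_observations)

print(overall)
```

With equal weights the result reduces to the ordinary ungrouped average.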
Analytic Square 49
Median-Grouped Data
$$M_d = L_m + \left( \frac{\frac{n}{2} - cf_m}{f_m} \right) i$$
where
L_m = lower boundary of the cell with the median
n = total number of observations
cf_m = cumulative frequency of all cells below L_m
f_m = frequency of the median cell
i = cell interval
Analytic Square 50
Example Problem
Table 4-7 Frequency Distribution of the Life of 320 tires in 1000 km
Boundaries   Midpoint   Frequency   Computation
23.6-26.5    25.0       4           100
26.6-29.5    28.0       36          1008
29.6-32.5    31.0       51          1581
32.6-35.5    34.0       63          2142
35.6-38.5    37.0       58          2146
38.6-41.5    40.0       52          2080
41.6-44.5    43.0       34          1462
44.6-47.5    46.0       16          736
47.6-50.5    49.0       6           294
Total                   320         11549
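The Computation column of Table 4-7 is the product of each midpoint and its frequency, and the grouped average is the column total divided by n. A minimal sketch using the table's values:

```python
# Midpoints and frequencies from Table 4-7 (tire life in 1000 km)
midpoints = [25.0, 28.0, 31.0, 34.0, 37.0, 40.0, 43.0, 46.0, 49.0]
frequencies = [4, 36, 51, 63, 58, 52, 34, 16, 6]

n = sum(frequencies)
computation = [f * x for f, x in zip(frequencies, midpoints)]  # f_i * X_i per cell
sum_fx = sum(computation)
grouped_average = sum_fx / n

print(n, sum_fx, round(grouped_average, 1))
```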
Analytic Square 51
Median-Grouped Data
Using data from Table 4-7:
$$M_d = L_m + \left( \frac{\frac{n}{2} - cf_m}{f_m} \right) i = 35.6 + \left( \frac{\frac{320}{2} - 154}{58} \right) 3 = 35.9$$
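The same calculation can be sketched in Python. The function `grouped_median` is a hypothetical helper; it locates the median cell as the first cell whose cumulative frequency reaches n/2, which is an assumption about how the cell is chosen.

```python
from itertools import accumulate

# Lower cell boundaries and frequencies from Table 4-7
lower_boundaries = [23.6, 26.6, 29.6, 32.6, 35.6, 38.6, 41.6, 44.6, 47.6]
frequencies = [4, 36, 51, 63, 58, 52, 34, 16, 6]
cell_interval = 3

def grouped_median(lower_boundaries, frequencies, cell_interval):
    """Median of grouped data: L_m + ((n/2 - cf_m) / f_m) * i."""
    n = sum(frequencies)
    cumulative = list(accumulate(frequencies))
    # Median cell: first cell whose cumulative frequency reaches n/2
    m = next(k for k, c in enumerate(cumulative) if c >= n / 2)
    cf_below = cumulative[m - 1] if m > 0 else 0
    return lower_boundaries[m] + (n / 2 - cf_below) / frequencies[m] * cell_interval

median = grouped_median(lower_boundaries, frequencies, cell_interval)
print(round(median, 1))
```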
Analytic Square 52
Mode
The mode is the value that occurs with the greatest frequency.
It is possible to have no mode in a series of numbers or to have more than one mode.
Analytic Square 53
Relationship Among the Measures of Central Tendency
Figure 4-9 Relationship among average, median and mode
Analytic Square 54
Measures of Dispersion
• Range
• Standard Deviation
• Variance
Analytic Square 55
Measures of Dispersion-Range
The range is the simplest and easiest to calculate of the measures of dispersion.
Range = R = X_h - X_l = largest value - smallest value in the data set
Analytic Square 56
Measures of Dispersion-Standard Deviation
Sample Standard Deviation:
$$S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$
$$S = \sqrt{\frac{\sum_{i=1}^{n} X_i^2 - \left( \sum_{i=1}^{n} X_i \right)^2 / n}{n - 1}}$$
Analytic Square 57
Standard Deviation
Ungrouped Technique
$$S = \sqrt{\frac{n \sum_{i=1}^{n} X_i^2 - \left( \sum_{i=1}^{n} X_i \right)^2}{n(n - 1)}}$$
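The definitional form and the ungrouped computing form give the same sample standard deviation. The sketch below checks both against `statistics.stdev` from the Python standard library; the function names and the data are made up for illustration.

```python
import math
import statistics

def stdev_definition(xs):
    """S from squared deviations about the mean, divided by n - 1."""
    n = len(xs)
    mean = sum(xs) / n
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))

def stdev_ungrouped(xs):
    """S from the ungrouped technique: n*sum(X^2) - (sum X)^2 over n(n - 1)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_x2 = sum(x * x for x in xs)
    return math.sqrt((n * sum_x2 - sum_x ** 2) / (n * (n - 1)))

data = [3.2, 3.5, 3.1, 3.8, 3.4, 3.6]  # made-up measurements
data_range = max(data) - min(data)     # R = X_h - X_l

print(round(data_range, 1), round(stdev_definition(data), 4),
      round(stdev_ungrouped(data), 4), round(statistics.stdev(data), 4))
```

The computing form avoids a separate pass to find the mean, but it subtracts two large, nearly equal numbers, so the definitional form is usually the safer choice in floating-point arithmetic.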
Analytic Square 58
Standard Deviation
Grouped Technique
$$s = \sqrt{\frac{n \sum_{i=1}^{h} f_i X_i^2 - \left( \sum_{i=1}^{h} f_i X_i \right)^2}{n(n - 1)}}$$
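The grouped technique can be tried on the tire-life data of Table 4-7. This is a sketch: `grouped_stdev` is a hypothetical helper, and the cross-check simply expands each cell into repeated midpoints and hands the result to `statistics.stdev`.

```python
import math
import statistics

# Midpoints and frequencies from Table 4-7
midpoints = [25.0, 28.0, 31.0, 34.0, 37.0, 40.0, 43.0, 46.0, 49.0]
frequencies = [4, 36, 51, 63, 58, 52, 34, 16, 6]

def grouped_stdev(midpoints, frequencies):
    """s from the grouped technique, with n = sum of the frequencies."""
    n = sum(frequencies)
    sum_fx = sum(f * x for f, x in zip(frequencies, midpoints))
    sum_fx2 = sum(f * x * x for f, x in zip(frequencies, midpoints))
    return math.sqrt((n * sum_fx2 - sum_fx ** 2) / (n * (n - 1)))

s = grouped_stdev(midpoints, frequencies)

# Cross-check: treat every observation as sitting at its cell midpoint
expanded = [x for x, f in zip(midpoints, frequencies) for _ in range(f)]
check = statistics.stdev(expanded)

print(round(s, 2), round(check, 2))
```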
Analytic Square 59
Relationship Between the Measures of Dispersion
• As n increases, the accuracy of R decreases.
• Use R when there is a small amount of data or the data are too scattered.
• If n > 10, use the standard deviation.
• A smaller standard deviation means better quality.
Analytic Square 60
Relationship Between the Measures of Dispersion
Figure 4-10 Comparison of two distributions with equal average and range
Analytic Square 61
Other Measures
There are three other measures that are frequently used to analyze a collection of data:
• Skewness
• Kurtosis
• Coefficient of Variation
Analytic Square 62
Skewness
Skewness is the lack of symmetry of the data.
For grouped data:
$$a_3 = \frac{\sum_{i=1}^{h} f_i (X_i - \bar{X})^3 / n}{s^3}$$
Analytic Square 63
Skewness
Figure 4-11 Left (negative) and right (positive) skewness distributions
Analytic Square 64
Kurtosis
Kurtosis provides information regarding the shape of the population distribution (the peakedness or heaviness of the tails of a distribution).
For grouped data:
$$a_4 = \frac{\sum_{i=1}^{h} f_i (X_i - \bar{X})^4 / n}{s^4}$$
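The grouped-data formulas for skewness and kurtosis share the same structure, so one helper can return both. This is a sketch under assumptions: `grouped_shape` is a hypothetical name, s is taken to be the sample standard deviation with the n - 1 divisor, and both data sets are made up.

```python
import math

def grouped_shape(midpoints, frequencies):
    """Return (a3, a4) for grouped data (hypothetical helper)."""
    n = sum(frequencies)
    mean = sum(f * x for f, x in zip(frequencies, midpoints)) / n
    # Assumption: s is the sample standard deviation (n - 1 divisor)
    s = math.sqrt(sum(f * (x - mean) ** 2
                      for f, x in zip(frequencies, midpoints)) / (n - 1))
    third = sum(f * (x - mean) ** 3 for f, x in zip(frequencies, midpoints)) / n
    fourth = sum(f * (x - mean) ** 4 for f, x in zip(frequencies, midpoints)) / n
    return third / s ** 3, fourth / s ** 4

cells = [1, 2, 3, 4, 5]
a3_sym, a4_sym = grouped_shape(cells, [1, 4, 6, 4, 1])    # symmetric, made up
a3_right, a4_right = grouped_shape(cells, [6, 4, 3, 2, 1])  # long right tail, made up

print(a3_sym, round(a3_right, 2))
```

A symmetric distribution gives a3 = 0, and a long tail to the right gives a positive a3.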
Analytic Square 65
Kurtosis
Figure 4-11 Leptokurtic and Platykurtic distributions
Analytic Square 66
Coefficient of Variation
The coefficient of variation (CV) is a measure of how much variation exists in relation to the mean.
$$CV = \frac{s \, (100\%)}{\bar{X}}$$
Analytic Square 67
Population and Sample
• Population
  • Set of all items that possess a characteristic of interest
• Sample
  • Subset of a population
Analytic Square 68
Parameter and Statistic
A parameter is a characteristic of a population; in other words, it describes a population.
• Example: the average weight of the population, e.g. 50,000 cans made in a month.
A statistic is a characteristic of a sample. It is used to make inferences about the population parameters, which are typically unknown, and is called an estimator.
• Example: the average weight of a sample of 500 cans from that month’s output, an estimate of the average weight of the 50,000 cans.
Analytic Square 69
The Normal Curve
Characteristics of the normal curve:
• It is symmetrical: half the cases are to one side of the center; the other half is on the other side.
• The distribution is single peaked, not bimodal or multi-modal.
• Also known as the Gaussian distribution.
Analytic Square 70
The Normal Curve
Characteristics:
Most of the cases fall in the center portion of the curve. As values of the variable become more extreme they become less frequent, with "outliers" at the "tail" of the distribution few in number. It is one of many frequency distributions.
Analytic Square 71
Standard Normal Distribution
The standard normal distribution is a normal distribution with a mean of 0 and a standard deviation of 1. Normal distributions can be transformed to standard normal distributions by the formula:
$$Z = \frac{X_i - \mu}{\sigma}$$
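The Z transformation, and the percent of items below a value that it leads to, can be sketched with `statistics.NormalDist` (Python 3.8 or later). The process mean, standard deviation, and limits below are made up for illustration.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Z = (X - mu) / sigma."""
    return (x - mu) / sigma

mu, sigma = 50.0, 5.0  # made-up process mean and standard deviation

z = z_score(60.0, mu, sigma)
fraction_below = NormalDist().cdf(z)  # area under the standard normal curve below z

# Fraction of items between two values, here mu - sigma and mu + sigma
process = NormalDist(mu, sigma)
fraction_between = process.cdf(55.0) - process.cdf(45.0)

print(z, round(100 * fraction_below, 2), round(100 * fraction_between, 2))
```

The fraction above a value is one minus the fraction below it.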
Analytic Square 72
Relationship between the Mean and Standard Deviation
Analytic Square 73
Mean and Standard Deviation
Same mean but different standard deviation
Analytic Square 74
Mean and Standard Deviation
Same mean but different standard deviation
Analytic Square 75
Normal Distribution
If the distribution is normal:
• The mean is the best measure of central tendency.
• Most scores are “bunched up” in the middle.
• Extreme scores are less frequent, and therefore less probable.
Analytic Square 76
Normal Distribution
Percent of items included between certain values of the std. deviation
Analytic Square 77
Tests for Normality
• Histogram
• Skewness
• Kurtosis
Analytic Square 78
Tests for Normality
Histogram:
• Shape
• Symmetrical
The larger the sample size, the better the judgment of normality. A minimum sample size of 50 is recommended.
Analytic Square 79
Tests for Normality
Skewness (a3) and Kurtosis (a4)
• Skewness: the data may be skewed to the left or to the right (a3 = 0 for a normal distribution).
• Kurtosis: the data are peaked like the normal distribution when a4 = 3.
• The larger the sample size, the better the judgment of normality (a sample size of 100 is recommended).
Analytic Square 80
Tests for Normality
Probability Plots
• Order the data from the smallest to the largest.
• Rank the observations (starting from 1 for the lowest observation).
• Calculate the plotting position:
$$PP = \frac{100 (i - 0.5)}{n}$$
where i = rank, PP = plotting position, n = sample size
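The ordering, ranking, and plotting-position steps can be sketched as follows. The function name and the sample values are made up; plotting the points on normal probability paper and judging the fit by eye is left out.

```python
def plotting_positions(data):
    """Pair each ordered observation with PP = 100 * (i - 0.5) / n."""
    ordered = sorted(data)          # order from smallest to largest
    n = len(ordered)
    # enumerate(..., start=1) supplies the rank i, with 1 for the lowest value
    return [(x, 100 * (i - 0.5) / n) for i, x in enumerate(ordered, start=1)]

sample = [32, 27, 41, 35, 30]  # made-up observations
points = plotting_positions(sample)

for value, pp in points:
    print(f"{value:>4}  {pp:5.1f}")
```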
Analytic Square 81
Probability Plots
Procedure:
• Order the data
• Rank the observations
• Calculate the plotting position
Analytic Square 82
Probability Plots
Procedure cont’d:
• Label the data scale
• Plot the points
• Attempt to fit by eye a “best line”
• Determine normality
Analytic Square 83
Probability Plots
Procedure cont’d:
• Order the data
• Rank the observations
• Calculate the plotting position
• Label the data scale
• Plot the points
• Attempt to fit by eye a “best line”
• Determine normality
Analytic Square 84
Chi-Square Goodness of Fit Test
Chi-Square Test
$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$
where
χ² = chi-squared
O_i = observed value in a cell
E_i = expected value for a cell
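The chi-squared statistic is a single sum over the cells. A minimal sketch with made-up observed and expected counts follows; comparing the result with a tabulated critical value, which the test also requires, is not shown.

```python
def chi_square(observed, expected):
    """Sum of (O_i - E_i)^2 / E_i over all k cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of cells")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Made-up counts for three cells
observed = [10, 20, 30]
expected = [20, 20, 20]
chi2 = chi_square(observed, expected)

print(chi2)
```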
Analytic Square 85
Scatter Diagram
The simplest way to determine if a cause-and-effect relationship exists between two variables
Figure 4-19 Scatter Diagram
Analytic Square 86
Scatter Diagram
• Supplies the data to confirm a hypothesis that two variables are related
• Provides both a visual and statistical means to test the strength of a relationship
• Provides a good follow-up to cause and effect diagrams
Analytic Square 87
Straight Line Fit
$$m = \frac{\sum xy - \left[ (\sum x)(\sum y) / n \right]}{\sum x^2 - \left[ (\sum x)^2 / n \right]}$$
$$a = \sum y / n - m \left( \sum x / n \right)$$
$$y = a + mx$$
where m = slope of the line and a = intercept on the y axis
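The slope and intercept formulas translate directly into code. In the sketch below, `fit_line` is a hypothetical helper and the points are made up so that they lie exactly on y = 1 + 2x.

```python
def fit_line(xs, ys):
    """Return (m, a) for the straight line y = a + m*x fitted to the points."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    m = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
    a = sum_y / n - m * (sum_x / n)
    return m, a

# Made-up points on the line y = 1 + 2x
m, a = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(m, a)
```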

Basic Statistics to start Analytics

  • 1. Analytic Square 1 Fundamentals ofFundamentals of StatisticsStatistics Analytic SquareAnalytic Square Making the DifferenceMaking the Difference
  • 2. Analytic Square 2 Outline  Introduction  Frequency Distribution  Measures of Central Tendency  Measures of Dispersion
  • 3. Analytic Square 3 Outline-Continued  Other Measures  Concept of a Population and Sample  The Normal Curve  Tests for Normality
  • 4. Analytic Square 4 Learning Objectives When you have completed this chapter you should be able to:  Know the difference between a variable and an attribute.  Perform mathematical calculations to the correct number of significant figures.  Construct histograms for simple and complex data.
  • 5. Analytic Square 5 Learning Objectives-cont’d. When you have completed this chapter you should be able to:  Calculate and effectively use the different measures of central tendency, dispersion, and interrelationship.  Understand the concept of a universe and a sample.  Understand the concept of a normal curve and the relationship to the mean and standard deviation.
  • 6. Analytic Square 6 Learning Objectives-cont’d. When you have completed this chapter you should be able to:  Calculate the percent of items below a value, above a value, or between two values for data that are normally distributed.  Calculate the process center given the percent of items below a value  Perform the different tests of normality  Construct a scatter diagram and perform the necessary related calculations.
  • 7. Analytic Square 7 Definition of Statistics: 1. A collection of quantitative data pertaining to a subject or group. Examples are blood pressure statistics etc. 2. The science that deals with the collection, tabulation, analysis, interpretation, and presentation of quantitative data IntroductionIntroduction
  • 8. Analytic Square 8 Two phases of statistics:  Descriptive Statistics: Describes the characteristics of a product or process using information collected on it.  Inferential Statistics (Inductive): Draws conclusions on unknown process parameters based on information contained in a sample. Uses probability IntroductionIntroduction
  • 9. Analytic Square 9 Types of Data: Attribute: Discrete data. Data values can only be integers. Counted data or attribute data. Examples include:  How many of the products are defective?  How often are the machines repaired?  How many people are absent each day? Collection of DataCollection of Data
  • 10. Analytic Square 10 Types of Data: Attribute: Discrete data. Data values can only be integers. Counted data or attribute data. Examples include:  How many days did it rain last month?  What kind of performance was achieved?  Number of defects, defectives Collection of Data – Cont’d.Collection of Data – Cont’d.
  • 11. Analytic Square 11 Types of Data: Variable: Continuous data. Data values can be any real number. Measured data. Examples include:  How long is each item?  How long did it take to complete the task?  What is the weight of the product?  Length, volume, time Collection of DataCollection of Data
  • 12. Analytic Square 12 Collection of DataCollection of Data  Significant Figures  Rounding
  • 13. Analytic Square 13  Significant Figures = Measured numbers  When you measure something there is always room for a little bit of error  How tall are you 5 ft 9 inches or 5 ft 9.1 inches?  Counted numbers and defined numbers ( 12 ins. = 1 ft, there are 6 people in my family) Significant FiguresSignificant Figures
  • 14. Analytic Square 14  Significant figures are used to indicate the amount of variation which is allowed in a number.  It is believed to be closer to the actual value than any other digit.  Significant figures:  3.69 – 3 significant digits.  36.900 – 5 significant digits. Significant FiguresSignificant Figures
  • 15. Analytic Square 15  Use Scientific Notation  3x10^2 (1 significant digit)  3.0x10^2 (2 significant digits) Significant Figures – Cont’d.Significant Figures – Cont’d.
  • 16. Analytic Square 16  Rules for Multiplying and Dividing  Number of sig. = the same as the number with the least number of significant digits.  6.59 x 2.3 = 15  32.65/24 = 1.4 (where 24 is not a counting number)  32.64/24=1.360(24 is a counting number i.e. 24.00) Significant FiguresSignificant Figures
  • 17. Analytic Square 17 Rules for Adding and Subtracting  Result can have no more sig. fig. after the decimal point than the number with the fewest sig. fig. after the decimal point.  38.26 – 6 = 32 (6 is not a counting number)  38.2 -6 = 32.2 (6 is a counting number)  38.26 – 6.1 = 32.2 (rounded from 32.16)  If the last digit >=5 then round up, else round down Significant FiguresSignificant Figures
  • 18. Analytic Square 18 Precision The precision of a measurement is determined by how reproducible that measurement value is. For example if a sample is weighed by a student to be 42.58 g, and then measured by another student five different times with the resulting data: 42.09 g, 42.15 g, 42.1 g, 42.16 g, 42.12 g Then the original measurement is not very precise since it cannot be reproduced. Precision and AccuracyPrecision and Accuracy
  • 19. Analytic Square 19 Accuracy  The accuracy of a measurement is determined by how close a measured value is to its “true” value.  For example, if a sample is known to weigh 3.182 g, then weighed five different times by a student with the resulting data: 3.200 g, 3.180 g, 3.152 g, 3.168 g, 3.189 g  The most accurate measurement would be 3.180 g, because it is closest to the true “weight” of the sample. Precision and AccuracyPrecision and Accuracy
  • 20. Analytic Square 20 Precision and AccuracyPrecision and Accuracy Figure 4-1 Difference between accuracy and precision
  • 21. Analytic Square 21  Frequency Distribution  Measures of Central Tendency  Measures of Dispersion DescribingDescribing DataData
  • 22. Analytic Square 22  Ungrouped Data  Grouped Data Frequency DistributionFrequency Distribution
  • 23. Analytic Square 23 2-72-7 There are three types of frequency distributions  Categorical frequency distributions  Ungrouped frequency distributions  Grouped frequency distributions Frequency DistributionFrequency Distribution
  • 24. Analytic Square 24 2-72-7 Categorical frequency distributions  Can be used for data that can be placed in specific categories, such as nominal- or ordinal-level data.  Examples - political affiliation, religious affiliation, blood type etc. CategoricalCategorical
  • 25. Analytic Square 25 2-82-8 Example :Blood Type Frequency Distribution Class Frequency Percent A 5 20 B 7 28 O 9 36 AB 4 16 CategoricalCategorical
  • 26. Ungrouped frequency distributions
      • Can be used for data that can be enumerated, when the range of values in the data set is not large.
      • Examples: number of miles your instructors have to travel from home to campus, number of girls in a 4-child family, etc.
  • 27. Ungrouped. Example: Number of Miles Traveled
      Class   Frequency
      5       24
      10      16
      15      10
  • 28. Grouped frequency distributions
      • Can be used when the range of values in the data set is very large. The data must be grouped into classes that are more than one unit in width.
      • Example: the life of boat batteries in hours.
  • 29. Grouped. Example: Lifetimes of Boat Batteries
      Class limits   Class boundaries   Frequency   Cumulative frequency
      24 - 37        23.5 - 37.5        4           4
      38 - 51        37.5 - 51.5        14          18
      52 - 65        51.5 - 65.5        7           25
  • 30. Frequency Distributions. Table 4-3: Different Frequency Distributions of Data Given in Table 4-1
      Number nonconforming   Frequency   Relative frequency   Cumulative frequency   Relative cumulative frequency
      0                      15          0.29                 15                     0.29
      1                      20          0.38                 35                     0.67
      2                      8           0.15                 43                     0.83
      3                      5           0.10                 48                     0.92
      4                      3           0.06                 51                     0.98
      5                      1           0.02                 52                     1.00
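The derived columns of Table 4-3 follow directly from the frequency column. The sketch below recomputes them; the list names are invented for illustration.

```python
from itertools import accumulate

# Frequency column of Table 4-3 (number nonconforming 0 through 5).
freq = [15, 20, 8, 5, 3, 1]
n = sum(freq)

rel = [f / n for f in freq]        # relative frequency
cum = list(accumulate(freq))       # cumulative frequency
rel_cum = [c / n for c in cum]     # relative cumulative frequency

for k, row in enumerate(zip(freq, rel, cum, rel_cum)):
    f, r, c, rc = row
    print(f"{k}  {f:2d}  {r:.2f}  {c:2d}  {rc:.2f}")
```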
  • 31. Frequency Histogram. [Figure: frequency (0 to 25) plotted against number nonconforming (0 to 5).]
  • 32. Relative Frequency Histogram. [Figure: relative frequency (0.00 to 0.45) plotted against number nonconforming (0 to 5).]
  • 33. Cumulative Frequency Histogram. [Figure: cumulative frequency (0 to 60) plotted against number nonconforming (0 to 5).]
  • 34. The Histogram. The histogram is the most important graphical tool for exploring the shape of data distributions. See http://quarknet.fnal.gov/toolkits/ati/histograms.html for the construction, analysis, and understanding of histograms.
  • 35. Constructing a Histogram: The Fast Way
      • Step 1: Find the range of the distribution (largest value minus smallest value).
      • Step 2: Choose the number of classes, 5 to 20 (number of classes ≈ $\sqrt{n}$).
      • Step 3: Determine the width of the classes, to one decimal place more than the data: class width = range / number of classes.
      • Step 4: Determine the class boundaries.
      • Step 5: Draw the frequency histogram.
  • 36. Constructing a Histogram: Number of groups or cells
      • Fewer than 100 observations: 5 to 9 cells
      • Between 100 and 500 observations: 8 to 17 cells
      • More than 500 observations: 15 to 20 cells
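The cell-count guideline and the class-width step can be sketched in Python as follows. The function names and the data are hypothetical, and picking the low end of the suggested range is an arbitrary choice made only to keep the example small.

```python
def suggested_cells(n):
    """Return the (low, high) range of cell counts from the guideline above."""
    if n < 100:
        return (5, 9)
    if n <= 500:
        return (8, 17)
    return (15, 20)

def class_width(data, n_classes):
    """Class width = range / number of classes."""
    return (max(data) - min(data)) / n_classes

# Hypothetical data, invented for illustration only.
data = [12, 15, 11, 19, 22, 14, 17, 25, 13, 18, 21, 16]

low, high = suggested_cells(len(data))
width = class_width(data, low)

# Tally each observation into its class; the largest value goes in the last class.
counts = [0] * low
for x in data:
    k = min(int((x - min(data)) / width), low - 1)
    counts[k] += 1

print(f"{low} classes of width {width:.1f}: {counts}")
```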
  • 37. Constructing a Histogram. For a more accurate way of drawing a histogram, see the section on grouped data in your textbook.
  • 38. Other Types of Frequency Distribution Graphs
      • Bar Graph
      • Polygon of Data
      • Cumulative Frequency Distribution or Ogive
  • 39. Bar Graph and Polygon of Data
  • 40. Cumulative Frequency
  • 41. Characteristics of Frequency Distribution Graphs. Figure 4-6: Characteristics of frequency distributions.
  • 42. Analysis of Histograms. Figure 4-7: Differences due to location, spread, and shape.
  • 43. Analysis of Histograms. Figure 4-8: Histogram of Wash Concentration.
  • 44. Measures of Central Tendency. The three measures in common use are the:
      • Average
      • Median
      • Mode
  • 45. Average. There are three different techniques available for calculating the average:
      • Ungrouped data
      • Grouped data
      • Weighted average
  • 46. Average: Ungrouped Data
      $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$
  • 47. Average: Grouped Data
      $\bar{X} = \frac{\sum_{i=1}^{h} f_i X_i}{n} = \frac{f_1 X_1 + f_2 X_2 + \cdots + f_h X_h}{f_1 + f_2 + \cdots + f_h}$
      where $h$ = number of cells, $f_i$ = frequency, and $X_i$ = midpoint.
  • 48. Average: Weighted Average
      $\bar{X}_w = \frac{\sum_{i=1}^{n} w_i X_i}{\sum_{i=1}^{n} w_i}$
      Used when a number of averages are combined with different frequencies.
  • 49. Median: Grouped Data
      $M_d = L_m + \left( \frac{\frac{n}{2} - cf_m}{f_m} \right) i$
      where $L_m$ = lower boundary of the cell with the median, $n$ = total number of observations, $cf_m$ = cumulative frequency of all cells below the median cell, $f_m$ = frequency of the median cell, and $i$ = cell interval.
  • 50. Example Problem. Table 4-7: Frequency Distribution of the Life of 320 Tires in 1000 km
      Boundaries   Midpoint   Frequency   Computation (fi Xi)
      23.6-26.5    25.0       4           100
      26.6-29.5    28.0       36          1008
      29.6-32.5    31.0       51          1581
      32.6-35.5    34.0       63          2142
      35.6-38.5    37.0       58          2146
      38.6-41.5    40.0       52          2080
      41.6-44.5    43.0       34          1462
      44.6-47.5    46.0       16          736
      47.6-50.5    49.0       6           294
      Total                   320         11549
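The totals in Table 4-7 lead straight to the grouped-data average, the sum of frequency times midpoint divided by n. A short sketch with the table values typed in by hand:

```python
# Midpoints and frequencies from Table 4-7 (tire life in 1000 km).
midpoints = [25.0, 28.0, 31.0, 34.0, 37.0, 40.0, 43.0, 46.0, 49.0]
freqs = [4, 36, 51, 63, 58, 52, 34, 16, 6]

n = sum(freqs)                                          # total observations
total_fx = sum(f * x for f, x in zip(freqs, midpoints))  # the Computation column
grouped_mean = total_fx / n

print(f"n = {n}, sum(f*X) = {total_fx:.0f}, average = {grouped_mean:.1f}")
```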
  • 51. Median: Grouped Data. Using data from Table 4-7:
      $M_d = L_m + \left( \frac{\frac{n}{2} - cf_m}{f_m} \right) i = 35.6 + \left( \frac{\frac{320}{2} - 154}{58} \right) 3 = 35.9$
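The same median can be reproduced by walking the cells until the cumulative frequency reaches n/2. The helper `grouped_median` is a hypothetical name introduced for this sketch, and it assumes equal-width cells listed in increasing order.

```python
def grouped_median(lower_bounds, freqs, interval):
    """Median for grouped data: L_m + ((n/2 - cf_m) / f_m) * i."""
    n = sum(freqs)
    cf = 0  # cumulative frequency of the cells below the current one
    for lower, f in zip(lower_bounds, freqs):
        if cf + f >= n / 2:  # this cell contains the median
            return lower + (n / 2 - cf) / f * interval
        cf += f
    raise ValueError("empty frequency table")

# Lower boundaries and frequencies from Table 4-7; the cell interval is 3.
lower_bounds = [23.6, 26.6, 29.6, 32.6, 35.6, 38.6, 41.6, 44.6, 47.6]
freqs = [4, 36, 51, 63, 58, 52, 34, 16, 6]

median = grouped_median(lower_bounds, freqs, 3)
print(f"median = {median:.1f}")
```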
  • 52. Mode. The mode is the value that occurs with the greatest frequency. A series of numbers may have no mode, or it may have more than one mode.
  • 53. Relationship Among the Measures of Central Tendency. Figure 4-9: Relationship among average, median, and mode.
  • 54. Measures of Dispersion
      • Range
      • Standard Deviation
      • Variance
  • 55. Measures of Dispersion: Range. The range is the simplest and easiest to calculate of the measures of dispersion.
      $R = X_h - X_l$ = largest value - smallest value in the data set
  • 56. Measures of Dispersion: Standard Deviation. Sample standard deviation:
      $S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$
      $S = \sqrt{\frac{\sum_{i=1}^{n} X_i^2 - \left( \sum_{i=1}^{n} X_i \right)^2 / n}{n - 1}}$
  • 57. Standard Deviation: Ungrouped Technique
      $S = \sqrt{\frac{n \sum_{i=1}^{n} X_i^2 - \left( \sum_{i=1}^{n} X_i \right)^2}{n(n - 1)}}$
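The definitional and computational forms of the sample standard deviation give the same number. The sketch below checks this on a small hypothetical data set and compares both with `statistics.stdev` from the Python standard library.

```python
import math
import statistics

# Hypothetical readings, invented for illustration only.
x = [3.0, 5.0, 7.0, 9.0]
n = len(x)
mean = sum(x) / n

# Definitional form: squared deviations from the average over n - 1.
s_def = math.sqrt(sum((xi - mean) ** 2 for xi in x) / (n - 1))

# Ungrouped technique: n*sum(X^2) - (sum X)^2 over n(n - 1).
s_comp = math.sqrt((n * sum(xi ** 2 for xi in x) - sum(x) ** 2) / (n * (n - 1)))

print(s_def, s_comp, statistics.stdev(x))
```

The computational form avoids computing the average first, which is why it was popular for hand calculation.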
  • 58. Standard Deviation: Grouped Technique
      $s = \sqrt{\frac{n \sum_{i=1}^{h} f_i X_i^2 - \left( \sum_{i=1}^{h} f_i X_i \right)^2}{n(n - 1)}}$
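As an illustration, the grouped technique can be applied to the tire-life data of Table 4-7. The slides do not work this example, so the result printed here is a computed illustration rather than a figure from the source.

```python
import math

# Midpoints and frequencies from Table 4-7 (tire life in 1000 km).
midpoints = [25.0, 28.0, 31.0, 34.0, 37.0, 40.0, 43.0, 46.0, 49.0]
freqs = [4, 36, 51, 63, 58, 52, 34, 16, 6]

n = sum(freqs)
sum_fx = sum(f * x for f, x in zip(freqs, midpoints))
sum_fx2 = sum(f * x ** 2 for f, x in zip(freqs, midpoints))

# Grouped technique: n*sum(f*X^2) - (sum(f*X))^2 over n(n - 1).
s_grouped = math.sqrt((n * sum_fx2 - sum_fx ** 2) / (n * (n - 1)))

# Cross-check against the definitional form using the grouped average.
mean = sum_fx / n
s_check = math.sqrt(sum(f * (x - mean) ** 2 for f, x in zip(freqs, midpoints)) / (n - 1))

print(f"s = {s_grouped:.2f} (thousand km)")
```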
  • 59. Relationship Between the Measures of Dispersion
      • As n increases, the accuracy of R decreases.
      • Use R when there is a small amount of data or the data are too scattered.
      • If n > 10, use the standard deviation.
      • A smaller standard deviation means better quality.
  • 60. Relationship Between the Measures of Dispersion. Figure 4-10: Comparison of two distributions with equal average and range.
  • 61. Other Measures. There are three other measures that are frequently used to analyze a collection of data:
      • Skewness
      • Kurtosis
      • Coefficient of Variation
  • 62. Skewness. Skewness is the lack of symmetry of the data. For grouped data:
      $a_3 = \frac{\sum_{i=1}^{h} f_i (X_i - \bar{X})^3 / n}{s^3}$
  • 63. Skewness. Figure 4-11: Left (negative) and right (positive) skewness distributions.
  • 64. Kurtosis. Kurtosis provides information regarding the shape of the population distribution (the peakedness or heaviness of the tails of a distribution). For grouped data:
      $a_4 = \frac{\sum_{i=1}^{h} f_i (X_i - \bar{X})^4 / n}{s^4}$
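Both grouped-data measures can be computed in one pass. In this sketch the function name `shape_measures` is hypothetical, and s is assumed to be the sample standard deviation (divisor n - 1) as defined earlier in the deck; other texts use slightly different conventions.

```python
import math

def shape_measures(midpoints, freqs):
    """Return (a3, a4) for grouped data, following the two formulas above."""
    n = sum(freqs)
    mean = sum(f * x for f, x in zip(freqs, midpoints)) / n
    # Sample standard deviation (n - 1 divisor), an assumption of this sketch.
    s = math.sqrt(sum(f * (x - mean) ** 2 for f, x in zip(freqs, midpoints)) / (n - 1))
    a3 = sum(f * (x - mean) ** 3 for f, x in zip(freqs, midpoints)) / n / s ** 3
    a4 = sum(f * (x - mean) ** 4 for f, x in zip(freqs, midpoints)) / n / s ** 4
    return a3, a4

# Tire-life data from Table 4-7.
midpoints = [25.0, 28.0, 31.0, 34.0, 37.0, 40.0, 43.0, 46.0, 49.0]
freqs = [4, 36, 51, 63, 58, 52, 34, 16, 6]
a3, a4 = shape_measures(midpoints, freqs)
print(f"a3 = {a3:.2f}, a4 = {a4:.2f}")

# A perfectly symmetrical (hypothetical) distribution has zero skewness.
sym_a3, sym_a4 = shape_measures([1.0, 2.0, 3.0], [1, 2, 1])
```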
  • 65. Kurtosis. Figure 4-11: Leptokurtic and platykurtic distributions.
  • 66. Coefficient of Variation. The coefficient of variation (CV) is a measure of how much variation exists in relation to the mean.
      $CV = \frac{s}{\bar{X}} (100\%)$
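A quick numerical illustration of the coefficient of variation, using hypothetical data and the standard-library `statistics` module:

```python
import statistics

# Hypothetical measurements, invented for illustration only.
data = [10.0, 12.0, 14.0]

mean = statistics.mean(data)
s = statistics.stdev(data)   # sample standard deviation (n - 1 divisor)
cv = s / mean * 100          # coefficient of variation, in percent

print(f"mean = {mean}, s = {s}, CV = {cv:.1f}%")
```

Because CV is a ratio, it allows the variation of data sets with different units or very different averages to be compared.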
  • 67. Population and Sample
      • Population: the set of all items that possess a characteristic of interest.
      • Sample: a subset of a population.
  • 68. Parameter and Statistic
      • A parameter is a characteristic of a population; in other words, it describes a population. Example: the average weight of a population such as the 50,000 cans made in a month.
      • A statistic is a characteristic of a sample. It is used to make inferences about the population parameters, which are typically unknown, and is called an estimator. Example: the average weight of a sample of 500 cans from that month’s output, which is an estimate of the average weight of the 50,000 cans.
  • 69. The Normal Curve. Characteristics of the normal curve:
      • It is symmetrical: half the cases are on one side of the center, and the other half are on the other side.
      • The distribution is single-peaked, not bimodal or multimodal.
      • It is also known as the Gaussian distribution.
  • 70. The Normal Curve. Characteristics: Most of the cases fall in the center portion of the curve. As values of the variable become more extreme they become less frequent, so the "outliers" in the "tails" of the distribution are few in number. The normal curve is one of many frequency distributions.
  • 71. Standard Normal Distribution. The standard normal distribution is a normal distribution with a mean of 0 and a standard deviation of 1. Normal distributions can be transformed to standard normal distributions by the formula:
      $Z = \frac{X_i - \mu}{\sigma}$
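The transformation, and the percent of items below a value that it makes available, can be sketched as follows. The process mean and standard deviation are hypothetical, and the area under the curve is obtained from `math.erf` rather than from a printed table.

```python
import math

def z_score(x, mu, sigma):
    """Standardize x: Z = (X - mu) / sigma."""
    return (x - mu) / sigma

def percent_below(z):
    """Percent of a normal population lying below the standard value z."""
    return 100 * 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Hypothetical process: mean 50, standard deviation 4.
mu, sigma = 50.0, 4.0

z = z_score(56.0, mu, sigma)
below = percent_below(z)
between = percent_below(z_score(54.0, mu, sigma)) - percent_below(z_score(46.0, mu, sigma))

print(f"Z = {z}, percent below 56 = {below:.2f}")
print(f"percent between 46 and 54 = {between:.2f}")
```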
  • 72. Relationship between the Mean and Standard Deviation
  • 73. Mean and Standard Deviation. Same mean but different standard deviation.
  • 74. Mean and Standard Deviation. Same mean but different standard deviation.
  • 75. Normal Distribution. If the distribution is normal:
      • The mean is the best measure of central tendency.
      • Most scores are “bunched up” in the middle.
      • Extreme scores are less frequent, and therefore less probable.
  • 76. Normal Distribution. Percent of items included between certain values of the standard deviation.
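The figure for this slide is not reproduced in the transcript, but the percentages it refers to can be recomputed. This sketch uses `statistics.NormalDist` (Python 3.8 and later); choosing 1, 2, and 3 standard deviations is an assumption about what the figure showed.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)

# Percent of items within k standard deviations on either side of the mean.
within = {k: 100 * (std_normal.cdf(k) - std_normal.cdf(-k)) for k in (1, 2, 3)}

for k, pct in within.items():
    print(f"mean +/- {k} sigma: {pct:.2f}%")
```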
  • 77. Tests for Normality
      • Histogram
      • Skewness
      • Kurtosis
  • 78. Tests for Normality: Histogram
      • Shape
      • Symmetrical
      • The larger the sample size, the better the judgment of normality. A minimum sample size of 50 is recommended.
  • 79. Tests for Normality: Skewness (a3) and Kurtosis (a4)
      • Skewness shows whether the data are skewed to the left or to the right (a3 = 0 for a normal distribution).
      • Kurtosis shows whether the data are peaked like the normal distribution (a4 = 3 for a normal distribution).
      • The larger the sample size, the better the judgment of normality (a sample size of 100 is recommended).
  • 80. Tests for Normality: Probability Plots
      • Order the data from the smallest to the largest.
      • Rank the observations (starting from 1 for the lowest observation).
      • Calculate the plotting position:
        $PP = \frac{100(i - 0.5)}{n}$
        where $i$ = rank, $PP$ = plotting position, and $n$ = sample size.
  • 81. Probability Plots. Procedure:
      • Order the data
      • Rank the observations
      • Calculate the plotting position
  • 82. Probability Plots. Procedure cont’d:
      • Label the data scale
      • Plot the points
      • Attempt to fit by eye a “best line”
      • Determine normality
  • 83. Probability Plots. Procedure cont’d:
      • Order the data
      • Rank the observations
      • Calculate the plotting position
      • Label the data scale
      • Plot the points
      • Attempt to fit by eye a “best line”
      • Determine normality
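The first three steps of the procedure (order, rank, plotting position) are mechanical and can be sketched in a few lines. The data values are hypothetical; the remaining steps need normal probability paper or a plotting tool and are not shown.

```python
# Hypothetical observations, invented for illustration only.
data = [7.2, 6.8, 7.5, 7.0, 6.9]

ordered = sorted(data)   # step 1: order from smallest to largest
n = len(ordered)

# Steps 2 and 3: rank i starts at 1; plotting position PP = 100(i - 0.5)/n.
rows = [(i, x, 100 * (i - 0.5) / n) for i, x in enumerate(ordered, start=1)]

for rank, value, pp in rows:
    print(f"rank {rank}: value {value}, plotting position {pp:.0f}")
```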
  • 84. Chi-Square Goodness of Fit Test. Chi-square test:
      $\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$
      where $\chi^2$ = chi-squared, $O_i$ = observed value in a cell, and $E_i$ = expected value for a cell.
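The statistic itself is a one-line sum. In this sketch the observed and expected counts are hypothetical, and the comparison with a critical value from a chi-square table (which depends on the degrees of freedom) is left out.

```python
def chi_squared(observed, expected):
    """Sum of (O_i - E_i)^2 / E_i over all k cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical cell counts, invented for illustration only.
observed = [10, 20, 30]
expected = [15, 20, 25]

chi2 = chi_squared(observed, expected)
print(f"chi-squared = {chi2:.3f}")
```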
  • 85. Scatter Diagram. The simplest way to determine if a cause-and-effect relationship exists between two variables. Figure 4-19: Scatter Diagram.
  • 86. Scatter Diagram
      • Supplies the data to confirm a hypothesis that two variables are related.
      • Provides both a visual and statistical means to test the strength of a relationship.
      • Provides a good follow-up to cause-and-effect diagrams.
  • 87. Straight Line Fit
      $m = \frac{\sum xy - [(\sum x)(\sum y) / n]}{\sum x^2 - [(\sum x)^2 / n]}$
      $a = \sum y / n - m (\sum x / n)$
      $y = a + mx$
      where $m$ = slope of the line and $a$ = intercept on the y axis.
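The slope and intercept formulas translate directly into code. The function name `fit_line` and the data points are hypothetical; the points were chosen to lie exactly on y = 1 + 2x so the result is easy to check by eye.

```python
def fit_line(xs, ys):
    """Least-squares slope m and intercept a for y = a + m*x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x ** 2 for x in xs)
    m = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
    a = sum_y / n - m * (sum_x / n)
    return m, a

# Hypothetical points lying exactly on y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

m, a = fit_line(xs, ys)
print(f"y = {a} + {m}x")
```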