Henry R. Kang (1/2010)
General Chemistry
Lecture 5
Statistical Data
Analysis
Henry R. Kang (7/2008)
Outlines
• Fundamental Statistics
• Accuracy and Precision
• Data Rejection
Accuracy & Precision
• Accuracy
 Accuracy is a measure of the closeness of a
measured quantity to the true value.
• Precision
 How close two or more measurements of the
quantity agree with one another.
 Precision is a measure of the agreement of
replicate measurements.
Fundamental Statistics
Errors
• All Measurements Contain Errors.
• Types of Errors
 Systematic errors
 One-sided errors (either positive or negative)
• Usually from a single source
• Resulting data are consistently high or low
 Results may be precise but inaccurate
• Examples: the balance is incorrectly zeroed; an incorrect constant is used in calculations.
 Random errors
 Occur randomly
 Positive and negative deviations occur with equal frequency and size.
• The deviations follow a bell-shaped curve (Gaussian or normal distribution)
 The source of the error is usually not known
Gaussian Distribution
• Gaussian distribution gives the distribution of data points with respect to the
true value. It gives a bell-shaped curve as shown in the figure.
 The closer to the true value, the higher the probability.
[Figure: bell-shaped Gaussian curve. Horizontal axis: Standard Deviation (-3 to 3); vertical axis: Probability (0 to 0.4).]
Measuring Accuracy
• Percent Error
 If the true value is known
$$\%\ \text{error} = \frac{|\,\text{true value} - \text{experimental value}\,|}{|\,\text{true value}\,|} \times 100$$
• Parts Per Thousand (PPT)
$$\text{PPT} = \frac{|\,\text{true value} - \text{experimental value}\,|}{|\,\text{true value}\,|} \times 1000$$
• Parts Per Million (PPM)
$$\text{PPM} = \frac{|\,\text{true value} - \text{experimental value}\,|}{|\,\text{true value}\,|} \times 10^{6}$$
• Unfortunately, the true value is often not known.
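The three accuracy measures differ only in the scale factor, which a short Python sketch can make explicit. The function name `relative_error` and the sample numbers are hypothetical, chosen only for illustration.

```python
def relative_error(true_value, experimental_value, scale=100):
    """Return |true - experimental| / |true| multiplied by `scale`.

    scale=100 gives percent error, 1000 gives ppt, 1e6 gives ppm.
    """
    if true_value == 0:
        raise ValueError("relative error is undefined for a true value of 0")
    return abs(true_value - experimental_value) / abs(true_value) * scale


# Hypothetical numbers, not taken from the lecture.
true_value = 50.00
measured = 49.50

percent_error = relative_error(true_value, measured)            # scale = 100
ppt_error = relative_error(true_value, measured, scale=1000)    # parts per thousand
ppm_error = relative_error(true_value, measured, scale=1e6)     # parts per million
print(percent_error, ppt_error, ppm_error)
```

The same relative difference is reported on three scales, so the choice between percent, ppt, and ppm is a matter of convenience for the size of the error.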
Measuring Precision
• Mean (or Average)
• Deviation and Absolute Deviation
• Absolute Average Deviation
• Relative Deviation
• Relative Average Deviation (RAD)
• Standard Deviation
• Relative Standard Deviation
Mean (Average)
• For multiple measurements of a given quantity, we have numerical values $x_1, x_2, x_3, \ldots, x_n$, where $n$ is the number of measurements.
• Sum is defined as
$$\text{Sum} = x_1 + x_2 + x_3 + \cdots + x_n = \sum x_i$$
• Mean $x_{\text{avg}}$ is defined as
$$x_{\text{avg}} = \frac{\text{Sum}}{n} = \frac{\sum x_i}{n}$$
Deviation & Absolute Deviations
• Deviation is the difference (or variation) of a single measurement, $x_i$, from the mean value, $x_{\text{avg}}$.
 $d_i = x_i - x_{\text{avg}}$, for $i = 1, 2, 3, \ldots, n$
• Absolute deviation is the magnitude of the deviation, so it is never negative.
 $d_i = |x_i - x_{\text{avg}}|$, for $i = 1, 2, 3, \ldots, n$
Absolute Average Deviation
• Absolute average deviation, $d_{\text{avg}}$, is the arithmetic mean of the individual absolute deviations, $d_i$.
$$d_i = |x_i - x_{\text{avg}}|, \quad i = 1, 2, 3, \ldots, n$$
$$d_{\text{avg}} = \frac{\sum d_i}{n}$$
Relative Deviation
• Relative deviation, $D_i$, is the ratio of an individual absolute deviation, $d_i$, to the mean value, $x_{\text{avg}}$.
$$D_i = \frac{d_i}{x_{\text{avg}}} = \frac{|x_i - x_{\text{avg}}|}{x_{\text{avg}}}, \quad i = 1, 2, 3, \ldots, n$$
Relative Average Deviation
• Relative average deviation (RAD) is the absolute average deviation relative to the mean $x_{\text{avg}}$.
$$\text{RAD (ppt)} = \frac{d_{\text{avg}}}{x_{\text{avg}}} \times 1000$$
• A precision of 3 ppt or less is considered very good.
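The chain from mean to absolute deviations to RAD can be sketched in a few lines of Python. The helper names and the replicate values below are hypothetical and serve only to illustrate the definitions; the sketch assumes a nonzero mean.

```python
def mean(values):
    """Arithmetic mean: sum of the values divided by their count."""
    return sum(values) / len(values)


def absolute_deviations(values):
    """Absolute deviation |x_i - x_avg| of each measurement."""
    x_avg = mean(values)
    return [abs(x - x_avg) for x in values]


def relative_average_deviation_ppt(values):
    """RAD in parts per thousand: (d_avg / x_avg) * 1000."""
    x_avg = mean(values)
    d_avg = mean(absolute_deviations(values))
    return d_avg / x_avg * 1000


# Hypothetical replicate measurements, not taken from the lecture.
replicates = [10.0, 10.2, 9.8, 10.0]
rad = relative_average_deviation_ppt(replicates)
print(f"mean = {mean(replicates):.3f}, RAD = {rad:.1f} ppt")
```

For these made-up values the RAD comes out well above 3 ppt, so by the rule of thumb above they would not count as very good precision.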
Standard Deviation
• Standard deviation (σ) describes how the data points are distributed when they follow a Gaussian distribution (a bell-shaped curve).
 (xavg ± σ) incorporates 68.3% of the data points.
 (xavg ± 3σ) incorporates 99.7% of the data points.
 The smaller σ is, the less the data points are spread.
• With deviations $d_i = x_i - x_{\text{avg}}$ for $i = 1, 2, 3, \ldots, n$:
$$\sigma = \sqrt{\frac{\sum d_i^2}{n - 1}} = \sqrt{\frac{d_1^2 + d_2^2 + d_3^2 + \cdots + d_n^2}{n - 1}}$$
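As a minimal sketch, the formula with the n – 1 denominator can be written out by hand and compared with `statistics.stdev` from the Python standard library, which uses the same sample definition. The function name `sample_std` and the data are hypothetical.

```python
import math
import statistics


def sample_std(values):
    """Standard deviation with the (n - 1) denominator."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two measurements are required")
    x_avg = sum(values) / n
    squared_deviations = [(x - x_avg) ** 2 for x in values]
    return math.sqrt(sum(squared_deviations) / (n - 1))


# Hypothetical data chosen so the arithmetic is easy to follow by hand.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
sigma = sample_std(data)
print(sigma, statistics.stdev(data))
```

Here the mean is 5 and the squared deviations sum to 32, so both calls return the square root of 32/7.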
Relative Standard Deviation
• Relative standard deviation ($\sigma_r$) is the standard deviation relative to the mean value.
• With deviations $d_i = x_i - x_{\text{avg}}$ and relative deviations $D_i = d_i / x_{\text{avg}}$, where $n$ is the number of measurements:
$$\sigma_r = \sqrt{\frac{\sum (d_i / x_{\text{avg}})^2}{n - 1}} = \sqrt{\frac{D_1^2 + D_2^2 + D_3^2 + \cdots + D_n^2}{n - 1}}$$
or
$$\sigma_r\ (\text{ppt}) = \frac{\sigma}{x_{\text{avg}}} \times 1000$$
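The two forms of the relative standard deviation are equivalent, because dividing every deviation by the mean is the same as dividing σ by the mean. The Python sketch below checks this numerically; the function names and the data are hypothetical, and a positive mean is assumed.

```python
import math


def relative_std(values):
    """sigma_r from relative deviations: sqrt(sum(D_i^2) / (n - 1))."""
    n = len(values)
    x_avg = sum(values) / n
    return math.sqrt(sum(((x - x_avg) / x_avg) ** 2 for x in values) / (n - 1))


def relative_std_ppt(values):
    """sigma_r in parts per thousand: (sigma / x_avg) * 1000."""
    n = len(values)
    x_avg = sum(values) / n
    sigma = math.sqrt(sum((x - x_avg) ** 2 for x in values) / (n - 1))
    return sigma / x_avg * 1000


# Hypothetical replicate measurements, not taken from the lecture.
replicates = [10.0, 10.2, 9.8, 10.0]
print(relative_std(replicates) * 1000, relative_std_ppt(replicates))
```

Both printed numbers agree to within floating-point rounding.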
Gaussian Distribution
• Gaussian distribution gives the
distribution of data points with
respect to the true value. It gives a
bell-shaped curve as shown in the
figure.
[Figure: bell-shaped Gaussian curve. Horizontal axis: Standard Deviation (-3 to 3); vertical axis: Probability (0 to 0.4).]
• The Gaussian equation is
$$P(x) = \left[(2\pi)^{1/2}\,\sigma\right]^{-1} \exp\left[-\frac{(x - X)^2}{2\sigma^2}\right]$$
where σ is the standard deviation and X is the true value.
 The closer to the true value, the higher the probability.
 The area under the curve (the integral of the Gaussian function) gives the fraction of data points within a given interval:
 (xtrue ± σ) incorporates 68.3% of the data points.
 (xtrue ± 3σ) incorporates 99.7% of the data points.
 (xtrue ± 3.8901σ) incorporates 99.99% of the data points.
 (xtrue ± 4.4172σ) incorporates 99.999% of the data points.
 (xtrue ± 6σ) incorporates nearly 100% of the data points.
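The coverage percentages can be checked numerically. For a Gaussian, the fraction of the area within ±kσ of the centre equals erf(k/√2), which `math.erf` evaluates directly. The sketch below is illustrative only; the names `gaussian_pdf` and `coverage` are assumptions, not part of the lecture.

```python
import math


def gaussian_pdf(x, true_value=0.0, sigma=1.0):
    """P(x) = exp(-(x - X)^2 / (2 sigma^2)) / (sqrt(2 pi) sigma)."""
    exponent = -((x - true_value) ** 2) / (2 * sigma ** 2)
    return math.exp(exponent) / (math.sqrt(2 * math.pi) * sigma)


def coverage(k):
    """Fraction of a Gaussian's area lying within +/- k sigma of its centre."""
    return math.erf(k / math.sqrt(2))


for k in (1, 3, 3.8901, 4.4172, 6):
    print(f"±{k}σ covers {coverage(k) * 100:.7f}% of the data points")
```

The peak height `gaussian_pdf(0.0)` for σ = 1 is about 0.4, which matches the top of the vertical axis in the figure.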
Standard Deviation & Data Distribution
• The smaller σ is, the less the data points are spread.
[Figure: Gaussian curves for σ = 0.5, σ = 1.0, and σ = 2.0. Horizontal axis: Standard Deviation (-4 to 4); vertical axis: Probability (0 to 0.8).]
Approximation of Standard Deviation
• The computational cost of the standard deviation is fairly high; a good approximation exists that requires much less computation.
• š = Ř/√N
 Ř is the range of data points from the lowest value to the
highest value
Ř = xmax – xmin
 N is the number of data points.
• For a small number of measurements the approximation
is accurate enough to replace the formal standard
deviation.
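To see how the range-based estimate compares with the formal standard deviation, both can be computed for the same small data set. This is a sketch under assumed data: the function name `range_estimate_std` and the four replicate values are hypothetical.

```python
import math
import statistics


def range_estimate_std(values):
    """Approximate standard deviation as (x_max - x_min) / sqrt(N)."""
    return (max(values) - min(values)) / math.sqrt(len(values))


# Hypothetical replicate measurements, not taken from the lecture.
replicates = [10.0, 10.2, 9.8, 10.0]

approx = range_estimate_std(replicates)
formal = statistics.stdev(replicates)   # sample standard deviation, n - 1 denominator
print(f"range estimate = {approx:.3f}, formal standard deviation = {formal:.3f}")
```

For this data set the two values are of the same size but not identical, which is what one would expect from an approximation based only on the two extreme values.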
Accuracy and Precision
Accuracy & Precision of Measurements
• Accuracy is a measure of the closeness of a measured quantity to
the true value.
• Precision is a measure of the agreement of replicate
measurements.
• Measurements can be precise but not accurate or accurate but not
precise or neither. The best result is, of course, accurate and
precise.
[Figure: four targets labeled: accurate and precise; precise but not accurate; not accurate and not precise; accurate but not precise.]
Example 1 of Accuracy and Precision
• Measured %S values in H2SO4 are 28.72%, 28.40%, and 28.57%,
where the true value is 32.69%. Determine the accuracy and
precision.
• Answer:
 Mean = (28.72% + 28.40% + 28.57%) / 3 = 28.56%
 Estimated precision by using the approximation: š = Ř / √N
 š = (28.72 – 28.40)% / √3 = 0.32% / 1.732 = 0.18%
 Relative standard deviation: sr = š / xM, where xM is the mean
 sr = 0.18% / 28.56% = 0.0063
 Accuracy = |X − xM| = | 32.69% − 28.56% | = 4.13%
 Relative accuracy = Accuracy / True value = 4.13% / 32.69% = 0.126
• These results indicate that the data are precise but inaccurate.
Example 2 of Accuracy and Precision
• Measured %S values in H2SO4 are 28.89%, 32.56%, and 36.64%,
where the true value is 32.69%. Determine the accuracy and
precision.
• Answer:
 Mean = (28.89% + 32.56% + 36.64%) / 3 = 32.70%
 Estimated precision by using the approximation: š = Ř / √N
 š = (36.64 – 28.89)% / √3 = 7.75% / 1.732 = 4.47%
 Relative standard deviation: sr = š / xM, where xM is the mean
 sr = 4.47% / 32.70% = 0.137
 Accuracy = |X − xM| = | 32.69% − 32.70% | = 0.01%
 Relative accuracy = Accuracy / True value = 0.01% / 32.69% = 0.0003
• These results indicate that the data are imprecise but accurate.
Example 3 of Accuracy and Precision
• Measured %S values in H2SO4 are 25.62%, 33.56%, and 27.93%,
where the true value is 32.69%. Determine the accuracy and
precision.
• Answer:
 Mean = (25.62% + 33.56% + 27.93%) / 3 = 29.04%
 Estimated precision by using the approximation: š = Ř / √N
 š = (33.56 – 25.62)% / √3 = 7.94% / 1.732 = 4.58%
 Relative standard deviation: sr = š / xM, where xM is the mean
 sr = 4.58% / 29.04% = 0.158
 Accuracy = |X − xM| = | 32.69% − 29.04% | = 3.65%
 Relative accuracy = Accuracy / True value = 3.65% / 32.69% = 0.112
• These results indicate that the data are imprecise and inaccurate.
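The arithmetic in the three examples follows one recipe, so it can be recomputed with a single helper. The Python sketch below uses the measured values and the true value quoted in the examples; the function name `summarize` and the dictionary keys are hypothetical.

```python
import math

TRUE_VALUE = 32.69  # true %S in H2SO4, as given in the examples

examples = {
    "Example 1": [28.72, 28.40, 28.57],
    "Example 2": [28.89, 32.56, 36.64],
    "Example 3": [25.62, 33.56, 27.93],
}


def summarize(values, true_value):
    """Mean, range-based precision estimate, and accuracy for one data set."""
    n = len(values)
    x_mean = sum(values) / n
    s_est = (max(values) - min(values)) / math.sqrt(n)   # range / sqrt(N)
    abs_error = abs(true_value - x_mean)
    return {
        "mean": x_mean,
        "s_est": s_est,
        "rel_precision": s_est / x_mean,
        "abs_error": abs_error,
        "rel_error": abs_error / true_value,
    }


summaries = {name: summarize(vals, TRUE_VALUE) for name, vals in examples.items()}
for name, s in summaries.items():
    print(f"{name}: mean = {s['mean']:.2f}%, s = {s['s_est']:.2f}%, "
          f"relative precision = {s['rel_precision']:.4f}, "
          f"accuracy = {s['abs_error']:.2f}%, relative accuracy = {s['rel_error']:.4f}")
```

Because the script keeps full precision until the final formatting, the last digit can differ slightly from hand calculations that round at every step.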
Data Rejection
• Replicate measurements of a given quantity are usually
scattered.
 Some values are closer than others.
• Which values to keep (and which to discard)?
 If a single result differs greatly from the others because of an identifiable error by the experimenter, then this result should be discarded.
 If a result is significantly “off”, but there is no known error in the experiment, then the result, in general, should be kept.
• If in doubt, use the rejection coefficient Q test.
• Do not discard any result just to get “good precision”.
Q Test
• Q test is used to test the extreme values (the highest and lowest
values)
• Procedure
 Calculate the range
 Range = xmax – xmin
 Calculate the difference between each extreme value and its nearest neighbor
 dhi = xmax – xnbor,hi; dlo = | xmin – xnbor,lo |
 Calculate the ratio (Q value) of the difference to the range
 Qhi = dhi / Range; Qlo = dlo / Range
• Compare the resulting Q value with the rejection table at the 90% confidence level (or another selected confidence level)
 If the calculated Q value is greater than the Q value given in the table, then reject the value.
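The procedure above can be sketched as a small Python function that returns the Q values for the highest and lowest data points. The name `q_values` and the readings are hypothetical; the comparison against a table value is left to the caller.

```python
def q_values(data):
    """Return (Q_hi, Q_lo) for the highest and lowest values in `data`."""
    if len(data) < 3:
        raise ValueError("the Q test needs at least three data points")
    xs = sorted(data)
    data_range = xs[-1] - xs[0]
    if data_range == 0:
        raise ValueError("all values are identical; Q is undefined")
    q_hi = (xs[-1] - xs[-2]) / data_range   # gap between max and its neighbor
    q_lo = (xs[1] - xs[0]) / data_range     # gap between min and its neighbor
    return q_hi, q_lo


# Hypothetical readings, not taken from the lecture.
readings = [10.0, 10.1, 10.2, 11.0]
q_hi, q_lo = q_values(readings)
print(f"Q_hi = {q_hi:.2f}, Q_lo = {q_lo:.2f}")
```

Sorting first means the extremes and their nearest neighbors are simply the first two and last two elements.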
Rejection Q Tables
Number of Data   Q90    Q96    Q99
3                0.94   0.98   0.99
4                0.76   0.85   0.93
5                0.64   0.73   0.82
6                0.56   0.64   0.74
7                0.51   0.59   0.68
8                0.47   0.54   0.63
9                0.44   0.51   0.60
10               0.41   0.48   0.57
Q Test - Example
• Data: 35.00, 35.05, 35.10, 35.80
• Calculate the range
 Range = xmax – xmin= 35.80 – 35.00 = 0.80
• Calculate the difference between each extreme value and its nearest neighbor.
 dhi = xmax – xnbor,hi = 35.80 – 35.10 = 0.70
 dlo = | xmin – xnbor,lo | = | 35.00 – 35.05 | = 0.05
• Calculate Q values between the difference and the range.
 Qhi = dhi / Range = 0.70 / 0.80 = 0.88
 Qlo = dlo / Range = 0.05 / 0.80 = 0.063
• Compare the resulting Q value with the rejection table at 90%
confidence level.
 For 4 samples, the Q value in the table is 0.76
 Qhi > 0.76; therefore, the highest value 35.80 can be dropped
 Once the value is dropped, it is no longer in the data set and should not
be used for the calculations of mean and various deviations.
#Data Q90
3 0.94
4 0.76
5 0.64
6 0.56
7 0.51
8 0.47
9 0.44
10 0.41
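The worked example can be reproduced in code. The sketch below stores the Q90 column of the rejection table in a dictionary, applies one pass of the Q test to the example data, and recomputes the mean from the values that remain. The function name `reject_extremes_q90` is hypothetical, and testing each extreme only once is an assumption of this sketch.

```python
# Q90 critical values from the rejection table, keyed by number of data points.
Q90 = {3: 0.94, 4: 0.76, 5: 0.64, 6: 0.56, 7: 0.51, 8: 0.47, 9: 0.44, 10: 0.41}


def reject_extremes_q90(data):
    """Apply one pass of the Q test at 90% confidence; return (kept, rejected)."""
    xs = sorted(data)
    n = len(xs)
    if n not in Q90:
        raise ValueError("the table covers 3 to 10 data points only")
    data_range = xs[-1] - xs[0]
    if data_range == 0:
        return xs, []                      # identical values: nothing to reject
    q_crit = Q90[n]
    rejected = []
    if (xs[-1] - xs[-2]) / data_range > q_crit:
        rejected.append(xs[-1])            # highest value fails the test
    if (xs[1] - xs[0]) / data_range > q_crit:
        rejected.append(xs[0])             # lowest value fails the test
    kept = [x for x in xs if x not in rejected]
    return kept, rejected


data = [35.00, 35.05, 35.10, 35.80]
kept, rejected = reject_extremes_q90(data)
mean_kept = sum(kept) / len(kept)
print("rejected:", rejected)
print("kept:", kept, "mean of kept values:", round(mean_kept, 2))
```

Only the highest value is rejected here, and the mean is then taken over the three remaining values, in line with the last step of the example.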
