JAMES L. PAGLINAWAN, Ph.D.
Professor
CENTRAL MINDANAO UNIVERSITY
Error – What is it?
• No test measures perfectly, and many tests fail to measure as well as we would like them to.
• Tests make “mistakes”. They are always associated with some degree of error.
• Think about the last test you took.
• Did you obtain exactly the score you thought or knew you deserved?
Examples of errors that lowered your obtained score
• When you couldn’t sleep the night before the test
• When the essay test you were taking was so poorly constructed that it was hard to tell what was being tested
• When you were sick but took the test anyway
• When the test had a 45-minute time limit but you were allowed only 38 minutes
• When you took a test that had multiple defensible answers
Examples of errors (or situations) that raised your obtained score
• The time you just happened to see the answers on your neighbor’s paper
• The time you got lucky guessing
• The time you had 52 minutes for a 45-minute test
• The time the test was so full of unintentional clues that you were able to answer several questions based on the information given in other questions
Then how does one go about discovering one’s true score?
Unfortunately, we don’t have an answer. The true score and the error score are both theoretical or hypothetical values.
They are important because they allow us to illustrate some important points about test score reliability and test score accuracy.
Remember:
Obtained score = true score + error score
Table 17.1: Hypothetical values
Student    Obtained Score    True Score    Error Score
Donna      91                88            +3
Jack       72                79            -7
Phyllis    68                70            -2
Gary       85                80            +5
Marsha     90                86            +4
Milton     75                78            -3
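The relationship between the three columns can be checked with a short Python sketch. The variable names are illustrative assumptions, not part of the slides; the numbers are the hypothetical values from the table.

```python
# Hypothetical values from the table: (obtained score, true score) per student.
scores = {
    "Donna": (91, 88),
    "Jack": (72, 79),
    "Phyllis": (68, 70),
    "Gary": (85, 80),
    "Marsha": (90, 86),
    "Milton": (75, 78),
}

# Error score = obtained score - true score,
# which is the same as: obtained score = true score + error score.
error_scores = {
    name: obtained - true for name, (obtained, true) in scores.items()
}

for name, error in error_scores.items():
    print(f"{name}: {error:+d}")
```

In this table the positive and negative error scores sum to zero, which is what a mean error of zero requires.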
The standard error of measurement (Sm) is the standard deviation of the error scores of a test.
We will use the error scores from Table 17.1 (3, -7, -2, 5, 4, -3).
Step 1: Determine the mean.
$$ M = \frac{\sum X}{N} = \frac{0}{6} = 0 $$
Step 2: Subtract the mean from each error score to arrive at the deviation scores. Square each deviation score and sum the squared deviations.
X – M = x        x²
+3 – 0 = 3       9
-7 – 0 = -7      49
-2 – 0 = -2      4
+5 – 0 = 5       25
+4 – 0 = 4       16
-3 – 0 = -3      9
$$ \sum x^2 = 112 $$
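Step 3: Plug the sum of the squared deviations (Σx²) into the formula and solve for the standard deviation.
$$ \text{Error score } SD = \sqrt{\frac{\sum x^2}{N}} = \sqrt{\frac{112}{6}} = \sqrt{18.67} \approx 4.32 $$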
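The three steps can be reproduced with a minimal Python sketch. It assumes the population formula (dividing by N rather than N - 1), which is the choice that yields the 4.32 used in the later figures; the variable names are illustrative.

```python
import math

# Error scores from Table 17.1.
error_scores = [3, -7, -2, 5, 4, -3]
n = len(error_scores)

# Step 1: the mean of the error scores.
mean = sum(error_scores) / n

# Step 2: deviation scores, squared and summed.
squared_deviations = [(x - mean) ** 2 for x in error_scores]
sum_of_squares = sum(squared_deviations)

# Step 3: standard deviation (assumption: divide by N, not N - 1).
error_sd = math.sqrt(sum_of_squares / n)

print(mean, sum_of_squares, round(error_sd, 2))
```

Rounded to two decimals, the result is 4.32.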
Fortunately, a rather simple statistical formula can be used to estimate this standard deviation (Sm) without actually knowing the error scores:
$$ S_m = SD\sqrt{1 - r} $$
where r is the reliability of the test and SD is the test’s standard deviation.
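As a rough sketch, the estimate can be written as a small Python function, assuming the usual textbook form Sm = SD × √(1 − r). The function name and the example inputs (SD = 10, r = 0.91) are hypothetical and chosen only for illustration; they do not come from the slides.

```python
import math


def standard_error_of_measurement(sd: float, r: float) -> float:
    """Estimate Sm from a test's standard deviation (sd) and reliability (r)."""
    if not 0.0 <= r <= 1.0:
        raise ValueError("reliability r must be between 0 and 1")
    return sd * math.sqrt(1.0 - r)


# Hypothetical inputs, for illustration only.
example_sd = 10.0
example_r = 0.91
sm = standard_error_of_measurement(example_sd, example_r)
print(round(sm, 2))
```

The formula behaves as expected at the extremes: a perfectly reliable test (r = 1) gives Sm = 0, and a test with no reliability (r = 0) gives Sm equal to SD.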
Error scores:
1. are normally distributed
2. have a mean of zero
3. have a standard deviation called the standard error of measurement (Sm).
USING THE STANDARD ERROR OF MEASUREMENT
Figure 17.1 The error score distribution (shown alongside Table 17.1)
This figure tells us that the distribution of error scores is a normal distribution.
Figure 17.2 The error score distribution for the test depicted in Table 17.1 (error scores of the ninth-grade math test)
Fig. 17.3 The error score distribution for the test depicted in Table 17.1, with approximate normal curve percentages.
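The approximate normal curve percentages can be recomputed with a short Python sketch. It assumes the error scores follow a normal distribution exactly and uses Sm = 4.32; the function name is illustrative.

```python
import math


def share_within(k: float) -> float:
    """Proportion of a normal distribution lying within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2.0))


sm = 4.32  # standard error of measurement for this test

for k in (1, 2, 3):
    percent = 100 * share_within(k)
    print(f"within ±{k} Sm (±{k * sm:.2f} points): about {percent:.1f}%")
```

These work out to roughly 68%, 95%, and 99.7% for one, two, and three Sm, respectively.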
Let’s use the following number line to represent an individual’s obtained score, which we will simply call X:
Fig. 17.4 The error distribution around an obtained score of 90 for a test with Sm = 4.32
Fig. 17.5 The error distribution around an obtained score of 75 for a test with Sm = 4.32
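The bands implied by these two figures can be sketched in Python. This is a minimal illustration, assuming the band is simply the obtained score plus or minus k times Sm; the function name is hypothetical.

```python
def score_band(obtained: float, sm: float, k: int = 1) -> tuple:
    """Return (low, high): the obtained score minus and plus k standard errors of measurement."""
    return (obtained - k * sm, obtained + k * sm)


sm = 4.32  # standard error of measurement from the figures

for obtained in (90, 75):
    low, high = score_band(obtained, sm)
    print(f"obtained {obtained}: band from {low:.2f} to {high:.2f}")
```

For an obtained score of 90 the one-Sm band runs from 85.68 to 94.32, and for 75 it runs from 70.68 to 79.32. Passing k = 2 widens the band to two Sm on each side.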
Standard Deviation or Standard Error of Measurement?

Standard Deviation (SD)
• Is the variability of raw scores.
• It tells us how spread out the scores are in a distribution of raw scores.
• Is based on a group of scores that actually exist.

Standard Error of Measurement (Sm)
• Is the variability of error scores.
• Is based on a group of scores that is hypothetical.
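To make the contrast concrete, the sketch below computes both statistics from the hypothetical values in Table 17.1. It assumes the population formula (dividing by N) for both; the helper name is illustrative.

```python
import math


def population_sd(values):
    """Standard deviation computed by dividing the sum of squared deviations by N."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


obtained_scores = [91, 72, 68, 85, 90, 75]  # raw scores that actually exist
error_scores = [3, -7, -2, 5, 4, -3]        # hypothetical error scores

sd = population_sd(obtained_scores)
sm = population_sd(error_scores)

print(f"SD of obtained scores: {sd:.2f}")
print(f"Sm (SD of error scores): {sm:.2f}")
```

The same calculation is applied in both cases; only the set of scores it is applied to differs.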
THANK YOU!


Accuracy & Error


Editor's Notes

  • #4: Think about the last test you took. Did you obtain exactly the score you thought or knew you deserved? Was your score higher than you expected? Was it lower than you expected? What about your obtained scores on all the other tests you have taken? Did they truly reflect your skill, knowledge, or ability, or did they sometimes underestimate your knowledge, ability, or skill? Or did they overestimate? If your obtained test scores did not always reflect your true ability, they were associated with some error. Your obtained scores may have been lower or higher than they should have been. In short, an obtained score has a true component (actual level of ability, knowledge) and an error component (which may act to lower or raise the obtained score).
  • #14: The standard deviation of the error score distribution, also known as the standard error of measurement, is 4.32. If we could know what the error scores are for each test we administer, we could compute Sm in this manner. But, of course, we never know these error scores. If you are following so far, your next question should be, “But how in the world do you determine the standard deviation of the error scores if you never know the error scores?”
  • #16: Error scores are assumed to be random. As such, they cancel each other out. That is, obtained scores are inflated by random error to the same extent as they are deflated by error. Another way of saying this is that the mean of the error scores for a test is zero. The distribution of the error scores is also important, since it approximates a normal distribution closely enough for us to use the normal distribution to represent it.
  • #18: (Fig. 17.3 should refresh your memory.) We listed along the baseline the standard deviation of the error score distribution. This is more commonly called the standard error of measurement (Sm) of the test. Thus we can see that 68% of the error scores for the test will be no more than 4.32 points higher or 4.32 points lower than the true scores. That is, if there were 100 obtained scores on this test, 68 of these scores would not be “off” their true scores by more than 4.32 points. The Sm, then, tells us about the distribution of obtained scores around true scores. By knowing an individual’s true score, we can predict what his or her obtained score is likely to be.
  • #20: The careful reader may be thinking, “That’s not very useful information. We can never know what a person’s true score is, only their obtained score.” This is correct. As test users, we work only with obtained scores. However, we can follow our logic in reverse. If 68% of obtained scores fall within 1 Sm of their true scores, then 68% of true scores must fall within 1 Sm of their obtained scores. Strictly speaking, this reverse logic is somewhat inaccurate; it would be true 99% of the time (Gullikson, 1987). Therefore, the Sm is often used to determine how test error is likely to have affected individual obtained scores. That is, X plus or minus 4.32 (±4.32) defines the range or band