Chapter 16 ppt eval & testing 4e formatted 01.10 kg edits

© 2013 Springer Publishing Company, LLC.
Chapter 16
Interpreting Test Scores
Oermann & Gaberson
Evaluation and Testing in Nursing Education
4th edition

Interpreting Test Scores
♦ A test produces a score
– Number with no intrinsic meaning
– Must be compared with something that has
meaning
♦ Interpretations can be norm- or criterion-referenced

Test Score Distributions
♦ Scoring a test produces a collection of raw
scores, recorded by student name or number
– Difficult to interpret characteristics of the scores
♦ Arrange in rank order, highest to lowest
– Reveals range of scores
– Still difficult to judge how a typical student
performed on the test or other characteristics of
the obtained scores

Test Score Distributions
♦ Frequency distribution
– Remove student names or numbers
– List each score once
– Tally number of times each score occurs
– Makes it easier to see how well the group of students performed
on the exam
– Can be represented graphically as a histogram or frequency
polygon
• Displays the most frequently occurring scores, the shape of the
distribution, and the range
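The steps on the last two slides (rank order, then tally) can be sketched in a few lines of Python. This is a minimal illustration only; the ten scores in `raw_scores` are hypothetical and do not come from the chapter.

```python
from collections import Counter

# Hypothetical raw scores for ten students (illustrative data only).
raw_scores = [88, 60, 84, 88, 75, 89, 82, 88, 86, 80]

# Step 1: arrange in rank order, highest to lowest.
ranked = sorted(raw_scores, reverse=True)

# Step 2: list each score once and tally how often it occurs.
frequency = Counter(ranked)

# Step 3: print the frequency distribution with a crude text histogram.
for score, count in sorted(frequency.items(), reverse=True):
    print(f"{score:>3} | {count} | {'*' * count}")
```

In this hypothetical set the row for 88 carries three asterisks, so the most frequent score stands out at a glance, and the first and last rows show the range.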

Characteristics of
Score Distributions
♦ Symmetry
♦ Skewness
♦ Modality
♦ Kurtosis

Symmetry
♦ Symmetric distribution or curve
– Equal halves, mirror images of each other
♦ Nonsymmetric or asymmetric distribution
or curve
– Scores cluster at one end, tail toward other end
– Typical of most nursing test score distributions

Skewness
♦ Skew—direction in which the tail extends
– Positive skew—tail toward the right (in the
direction of positive numbers on a scale)
• Positively skewed distribution—cluster of scores at
low end
– Negative skew—tail toward the left (in the
direction of negative numbers)
• Cluster of scores at the high end
• Typical of most nursing test score distributions
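The direction of skew can also be checked numerically. The sketch below uses one common moment-based skewness coefficient (third central moment divided by the 1.5 power of the second); the slides do not prescribe a formula, so this choice is an assumption, as are the two hypothetical score lists.

```python
from statistics import fmean

def skewness(scores):
    """Moment-based skewness: negative = tail to the left, positive = tail to the right."""
    mean = fmean(scores)
    n = len(scores)
    m2 = sum((x - mean) ** 2 for x in scores) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in scores) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical data: scores cluster at the high end, with a tail toward low scores.
easy_test = [60, 75, 80, 82, 84, 86, 88, 88, 88, 89]
# Hypothetical data: scores cluster at the low end, with a tail toward high scores.
hard_test = [40, 41, 41, 41, 43, 45, 47, 49, 54, 69]

print(skewness(easy_test))  # negative value: negatively skewed
print(skewness(hard_test))  # positive value: positively skewed
```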

Modality
♦ Number of peaks (clusters of scores) in the
distribution
♦ Mode
– Most frequently occurring score in the distribution
♦ Unimodal—one peak
♦ Bimodal—two peaks
♦ Multimodal—many peaks
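As a rough illustration, Python's standard library can list the most frequent score or scores. Note the assumption: `multimode` only reports scores tied for the highest frequency, so it is a crude stand-in for counting peaks, which may differ in height in a real distribution. Both score lists are hypothetical.

```python
from statistics import multimode

# Hypothetical score sets (illustrative data only).
unimodal = [70, 75, 80, 80, 80, 85, 90]
bimodal = [70, 70, 70, 75, 80, 85, 90, 90, 90]

# multimode returns every score tied for the highest frequency.
print(multimode(unimodal))  # [80]
print(multimode(bimodal))   # [70, 90]
```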

Kurtosis
♦ Relative flatness or peakedness of the curve
♦ Platykurtic—relatively flat, gently curved
♦ Mesokurtic—moderately curved
♦ Leptokurtic—sharply peaked
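Kurtosis can be estimated numerically as well. The sketch below assumes the common "excess kurtosis" convention (fourth central moment divided by the squared second moment, minus 3), under which values near 0 suggest a mesokurtic curve, negative values a platykurtic one, and positive values a leptokurtic one. The formula choice and both score lists are assumptions for illustration, not taken from the slides.

```python
from statistics import fmean

def excess_kurtosis(scores):
    """Moment-based excess kurtosis (a normal curve gives about 0)."""
    mean = fmean(scores)
    n = len(scores)
    m2 = sum((x - mean) ** 2 for x in scores) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in scores) / n  # fourth central moment
    return m4 / m2 ** 2 - 3

# Hypothetical data: scores spread evenly, giving a relatively flat shape.
flat = [60, 65, 70, 75, 80, 85, 90, 95, 100]
# Hypothetical data: most scores at 80 with two extremes, giving a sharp peak.
peaked = [80, 80, 80, 80, 80, 80, 80, 60, 100]

print(excess_kurtosis(flat))    # negative: platykurtic
print(excess_kurtosis(peaked))  # positive: leptokurtic
```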

“Curving” Grades
♦ Not appropriate if scores lack characteristics
of a normal curve
– Bell-shaped: symmetric, unimodal, mesokurtic

“Curving” Grades
♦ Most score distributions from teacher-made
tests are not normally distributed
♦ Shape of distribution affected by:
– Test characteristics
• Difficult test → positively skewed curve
– Ability of students
• Nursing content knowledge is not normally distributed
– Students admitted to a nursing program are not
representative of the general population

Measures of Central Tendency
♦ Ways of indicating the score that is most
characteristic or typical of the distribution
♦ “Middle” of a distribution; scores tend to
cluster around it
♦ Three measures
– Mode
– Median
– Mean

Mode
♦ Most frequently occurring score in a distribution
♦ Must be an actual obtained score
♦ Identified from frequency distribution or graphic
display without mathematical calculation
♦ Rough indication of central tendency
♦ Least stable measure of central tendency
– Can fluctuate considerably among samples drawn from
the same population

Median
♦ Point that divides a score distribution into equal halves
♦ 50th percentile—50% of scores are above and 50% are below
♦ Does not have to be an actual obtained score
– Even number of scores—median is halfway between the two
middle scores
– Odd number of scores—median is the middle score
♦ Index of location—not influenced by the value of each score
– Good for skewed distribution
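The odd and even cases, and the median's insensitivity to extreme values, can be shown with a short sketch. The score lists are hypothetical.

```python
from statistics import median

# Odd number of scores: the median is the middle score.
odd_scores = [70, 75, 80, 85, 90]
print(median(odd_scores))   # 80

# Even number of scores: the median is halfway between the two middle scores,
# so it need not be a score that any student actually obtained.
even_scores = [70, 75, 80, 85]
print(median(even_scores))  # 77.5

# Replacing the lowest score with an extreme value leaves the median unchanged,
# because the median depends on position, not on the value of each score.
with_extreme = [10, 75, 80, 85, 90]
print(median(with_extreme))  # 80
```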

Mean
♦ Mathematical average of all scores
– Computed by summing individual scores and dividing by
the total number of scores
– Does not have to be an actual obtained score
♦ Value of the mean is affected by every score in the
distribution
– Influenced by extremely high or low scores
– Not the most accurate measure of central tendency in
highly skewed distributions
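A small sketch shows how a single extreme score pulls the mean much more than the median. Both score lists are hypothetical.

```python
from statistics import fmean, median

# Hypothetical scores for five students.
typical = [78, 80, 82, 84, 86]
# Same group, but the top score is replaced by one extremely low score.
with_low_outlier = [78, 80, 82, 84, 26]

# Mean: sum of the individual scores divided by the number of scores.
print(fmean(typical))            # 82.0
print(fmean(with_low_outlier))   # 70.0

# The median shifts far less than the mean.
print(median(typical))           # 82
print(median(with_low_outlier))  # 80
```

Here the mean drops 12 points while the median drops only 2, which is why the mean can misrepresent the typical score in a highly skewed distribution.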

Selecting a Measure of
Central Tendency
♦ Relationship between shape of a distribution
and locations of measures of central tendency
– Normal distribution
• Mean, median, and mode have the same value
– Positively skewed distribution
• Mean is highest, mode is lowest
– Negatively skewed distribution
• Mode is highest, mean is lowest
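The ordering described above can be checked on two small hypothetical score sets, one negatively skewed and one positively skewed. The data are invented for illustration and chosen so that each set has a single mode.

```python
from statistics import fmean, median, mode

# Hypothetical negatively skewed scores (cluster at the high end).
negatively_skewed = [60, 75, 80, 82, 84, 86, 88, 88, 88, 89]
# Hypothetical positively skewed scores (cluster at the low end).
positively_skewed = [40, 41, 41, 41, 43, 45, 47, 49, 54, 69]

for label, scores in [("negative skew", negatively_skewed),
                      ("positive skew", positively_skewed)]:
    print(label, "mean:", fmean(scores),
          "median:", median(scores), "mode:", mode(scores))
```

For the negatively skewed set the mean (82) is lowest and the mode (88) is highest; for the positively skewed set the mean (47) is highest and the mode (41) is lowest, with the median between them in both cases.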

Measures of Variability
♦ Used to determine how similar or different
the test scores are
♦ Score distributions may have similar measures
of central tendency and different degrees of
variability
♦ Most common measures
– Range
– Standard deviation

Range
♦ Simplest measure of variability
♦ Difference between the highest and lowest scores in
the distribution
– Sometimes expressed as highest and lowest scores, rather
than a difference score (e.g., 42 to 60)
♦ Can be highly unstable—based on only two values
♦ Tends to increase with number of scores
– Wider range of test scores from large group of students
because of likelihood of an extreme score
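A minimal sketch of the range, using the 42-to-60 example from the slide. The individual scores between those endpoints, the added extreme score, and the helper name `score_range` are hypothetical.

```python
def score_range(scores):
    """Range as a difference score: highest score minus lowest score."""
    return max(scores) - min(scores)

# Hypothetical scores with a lowest score of 42 and a highest score of 60.
scores = [42, 47, 51, 55, 60]
print(score_range(scores))  # 18
print(f"{min(scores)} to {max(scores)}")  # alternative form: "42 to 60"

# One extreme score changes the range sharply, because the range
# depends on only two values.
scores_with_extreme = scores + [20]
print(score_range(scores_with_extreme))  # 40
```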

Standard Deviation (SD)
♦ Most common and useful measure of variability
♦ Takes every score in the distribution into
consideration
♦ Based on differences between each score and the
mean
♦ Represents average amount by which scores differ
from the mean
– Smaller if scores cluster tightly around the mean
– Larger if scores widely scattered over large range
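The calculation can be sketched as follows. Strictly, the SD is the square root of the average squared difference from the mean. The sketch assumes the population form (dividing by the number of scores); the slides do not say whether the population or sample form is intended, and the two score lists are hypothetical.

```python
from math import sqrt
from statistics import fmean

def standard_deviation(scores):
    """Population SD: square root of the mean squared deviation from the mean."""
    mean = fmean(scores)
    squared_deviations = [(x - mean) ** 2 for x in scores]
    return sqrt(sum(squared_deviations) / len(scores))

# Two hypothetical distributions with the same mean (82) but different spread.
tight = [78, 80, 82, 84, 86]
wide = [62, 72, 82, 92, 102]

print(round(standard_deviation(tight), 2))  # 2.83
print(round(standard_deviation(wide), 2))   # 14.14
```

Both sets have the same central tendency, yet the SD of the widely scattered set is five times larger, which illustrates why a measure of variability is needed alongside the mean.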

Interpreting an Individual Score
♦ Scores on teacher-made tests
– Norm-referenced interpretations
• Use mean and SD to interpret individual scores
– Criterion-referenced interpretations
• Used in most nursing education settings
• Scores are compared to a preset standard
• Example: percentage-correct score
– Comparison of a student’s score with the maximum
possible score
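One common way to use the mean and SD for a norm-referenced interpretation is to express a score as its distance from the class mean in SD units (often called a z-score). The slides do not name this method, so treat it as an assumption; the class scores and the helper name `z_score` are hypothetical, and the population SD is assumed.

```python
from statistics import fmean, pstdev

def z_score(score, group_scores):
    """Distance of a score from the group mean, in standard deviation units."""
    return (score - fmean(group_scores)) / pstdev(group_scores)

# Hypothetical class scores with a mean of 80 and a population SD of 10.
class_scores = [65, 75, 80, 85, 95]

print(z_score(95, class_scores))  # 1.5 SD above the class mean
print(z_score(75, class_scores))  # 0.5 SD below the class mean
```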

Percentage-Correct Scores
♦ Derived (not raw) score
♦ Often used as a basis for assigning grades
♦ Determined more by test item difficulty than by
quality of performance
– If test is more difficult than expected, teacher may
want to adjust the raw scores before calculating the
percentage correct
♦ Not to be confused with percentile score
– Norm-referenced interpretation
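The difference between a percentage-correct score and a percentile score can be sketched as below. The percentile rank here is assumed to mean the percentage of the group scoring below the student; other definitions exist. The test length, the class scores, and both helper names are hypothetical.

```python
def percentage_correct(raw_score, max_score):
    """Criterion-referenced: the student's score relative to the maximum possible."""
    return 100 * raw_score / max_score

def percentile_rank(score, group_scores):
    """Norm-referenced: percentage of the group scoring below this score
    (one of several definitions in use)."""
    below = sum(1 for s in group_scores if s < score)
    return 100 * below / len(group_scores)

# Hypothetical 50-item test and raw scores for a class of ten students.
max_score = 50
class_scores = [30, 35, 38, 40, 42, 44, 45, 46, 47, 48]

print(percentage_correct(40, max_score))  # 80.0
print(percentile_rank(40, class_scores))  # 30.0
```

The same raw score of 40 is 80% correct yet falls at only the 30th percentile of this hypothetical class, so the two numbers answer different questions.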

♦ Scores on standardized tests
– Usually used to make norm-referenced
interpretations
– More relevant to general than to specific
instructional goals
• Should not be used to determine course grades
– Usually reported in derived scores
• Percentile ranks
• Standard scores
• Norm-group scores
(cont’d)

♦ Scores on standardized tests (cont’d)
– Important to specify an appropriate norm group
for comparison
– User’s manual includes norm tables with
descriptions of each norm group
– Teacher should select the norm group that most
closely matches the group of students
• Examples: type of nursing program, public or private
