Measurement and instrumentation

TEST SCORES
COLLECTED AND PRESENTED BY:
EMAN AWAD EL-SAWY
Ch. 4

What is the meaning of Instrumentation?
It is the process of selecting or
developing measuring devices and
methods appropriate to a given
evaluation problem. (p. 101)

WHAT’S THE DEAL WITH TESTING?
As a society, we like numbers. If something can be
quantified, it is viewed as valid or more scientific.
Machine scoring of a test is fast, efficient, and cheap.
Hand scoring of a test is slow, time-consuming, and
very expensive.

INTERPRETING TEST SCORES:
To interpret a test score, two things must be known:
1. The nature of the score itself (what kind of scoring or scaling system
was used in the calculations?)
2. The basis of comparison underlying the score (what reference
population or norm group does it represent?)

Types of test scores
1. Raw score
2. Percentile ranks (centile ranks) and percentiles (centiles)
3. Stanine scores (standard nine)
4. Standard scores
   Z-scores
   T-scores
   Other standard scores
5. Grade level scores

1. RAW SCORES
The actual score made on a test.
Simply the total number of points an individual gets on a test before it is
converted to any formal or standardized scoring system.
Limitations: A raw score by itself is uninterpretable since there is no way
of knowing how it compares with anything else.

2. PERCENTILE RANKS (CENTILE RANKS) AND PERCENTILES
(CENTILES)
Raw scores begin to have meaning when they are ranked
from high to low.
A convenient solution is to convert the scores into
percentage values.
Two statistics are used for this purpose:
A. The percentile rank (centile rank): a number between 0 and 100
indicating the percent of cases in a norm group falling at or below
a given score.
B. The percentile (centile): a point on a scale of scores at or below
which a given percent of the cases falls.

Strengths:
• They are easily understood by lay people.
• They allow exact interpretation.
• They are more appropriate for markedly skewed data than scores based
on the normal probability curve.
Weaknesses:
• Confusion with a “percentage-right score”.
• Inequality of units: the intervals between units are not equal (the
distance between the 60th and 70th ≠ the distance between the 80th and 90th).
• It is misleading to report results in percentage terms when the
sample size is under 100.
• They only permit statements about rank (greater than, equal to,
less than).
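
The two definitions above (percentile rank and percentile) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the norm-group scores are made up for the example, and the "at or below" convention is taken directly from the definitions on this slide (other textbooks use slightly different conventions).

```python
def percentile_rank(score, norm_scores):
    """Percent of norm-group cases falling at or below `score`."""
    if not norm_scores:
        raise ValueError("norm_scores must not be empty")
    at_or_below = sum(1 for s in norm_scores if s <= score)
    return 100.0 * at_or_below / len(norm_scores)


def percentile_point(percent, norm_scores):
    """Lowest observed score at or below which at least `percent` of cases fall."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    if not norm_scores:
        raise ValueError("norm_scores must not be empty")
    for s in sorted(norm_scores):
        if percentile_rank(s, norm_scores) >= percent:
            return s
    raise RuntimeError("no score found")  # not expected for valid input


# Hypothetical norm group of ten raw scores (illustrative data only).
norm_group = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

rank_of_70 = percentile_rank(70, norm_group)     # 7 of 10 cases are at or below 70
score_at_50 = percentile_point(50, norm_group)   # score at the 50th percentile
```

Note that the percentile rank is a percentage of people, not a percentage of items answered correctly, which is the source of the "percentage-right" confusion listed under Weaknesses.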

3. STANINES
CONTRACTION OF “STANDARD NINE”
• Stanines divide the normal distribution into 9 units, each of which
covers the same length along the base of the normal curve (except
the two units that cover the tails). Stanines have a mean (M) of 5 and
a standard deviation (SD) of 2, and range from 1 (lowest) to 9 (highest).
• They combine the understandability of percentages with the features
of the normal curve of probability.
Stanine scores are useful in comparing a student's performance
across different content areas. For example, a 6 in Mathematics
and an 8 in Reading generally indicate a meaningful difference
in a student's learning for the two respective content areas.
Advantages: Stanine scores are coming into increasing use
because of their simplicity and utility.
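
As a rough sketch of how a stanine can be obtained from a z-score, the following Python snippet uses M = 5 and SD = 2 from the slide above. It assumes the usual convention that each middle stanine is half a standard deviation wide, with stanines 1 and 9 absorbing the tails; the function name is hypothetical.

```python
import math


def stanine_from_z(z):
    """Map a z-score to a stanine (1-9), assuming M = 5, SD = 2.

    Each middle stanine spans half a standard deviation; values that
    fall beyond the scale are folded into the tail stanines 1 and 9.
    """
    stanine = math.floor(2 * z + 5.5)   # round 2z + 5 to the nearest unit
    return max(1, min(9, stanine))      # clamp the tails to 1 and 9


average_student = stanine_from_z(0.0)    # exactly at the mean
strong_student = stanine_from_z(1.0)     # one SD above the mean
extreme_low = stanine_from_z(-3.0)       # far in the lower tail
```

A student exactly at the mean lands in stanine 5, and a student one standard deviation above the mean lands in stanine 7.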

4. STANDARD SCORES
Standard scores indicate a student’s relative
position in a group. They express test
performance in terms of standard deviation
units from the mean.
They are derived from the properties of the
normal probability curve and preserve the
absolute differences between scores.
Disadvantages:
1. They are inappropriate if data are markedly
skewed.
2. They are difficult to explain to a lay audience.
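
A minimal Python sketch of the two standard scores named in the outline (z-scores and T-scores) is given below. It assumes the conventional definitions, z = (raw score - mean) / SD and T = 50 + 10z, and it uses the population standard deviation of the group being scored; the raw scores and function names are invented for illustration.

```python
import statistics


def z_scores(raw_scores):
    """Express each raw score in standard deviation units from the mean."""
    mean = statistics.mean(raw_scores)
    sd = statistics.pstdev(raw_scores)   # assumption: population SD of this group
    if sd == 0:
        raise ValueError("all scores are identical; z-scores are undefined")
    return [(x - mean) / sd for x in raw_scores]


def t_scores(raw_scores):
    """Rescale z-scores to a mean of 50 and a standard deviation of 10."""
    return [50 + 10 * z for z in z_scores(raw_scores)]


# Hypothetical raw scores (mean 5, population SD 2).
raw = [2, 4, 4, 4, 5, 5, 7, 9]
z = z_scores(raw)
t = t_scores(raw)
```

In this made-up group, the raw score of 9 is two standard deviations above the mean (z = 2, T = 70), while a raw score of 5 sits exactly at the mean (z = 0, T = 50).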

5. GRADE LEVEL SCORES:
They are based on the relationship between
scores on a test and the average performance
of children at each of a series of grade levels.
However, developmental characteristics of certain
age levels may be due to maturity rather than
instruction.
They are most relevant in elementary schools
where subject matter tends to be more
continuous.
Beyond the sixth grade they lose meaning.

STANDARDIZED TESTS:
They report scores based on a norm group representing a defined
population.
Until this comparison group is clearly known, a satisfactory interpretation
of the score is not possible.
1. Norm-referenced tests.
2. Criterion-referenced tests.

MEASUREMENT AND EVALUATION:
CRITERION- VERSUS NORM-REFERENCED
TESTING
Many educators and members of the public fail to
grasp the distinctions between criterion-referenced
and norm-referenced testing. It is
common to hear the two types of testing referred
to as if they served the same purposes or shared
the same characteristics. Much confusion can be
eliminated if the basic differences are
understood.
The following is adapted from: Popham, W. J.
(1975). Educational evaluation. Englewood Cliffs,
New Jersey: Prentice-Hall, Inc.

STANDARDIZED TESTS:
Criterion-Referenced Test
Criterion-referenced tests, also called mastery tests,
compare a person's performance to a set of objectives.
Anyone who meets the criterion can get a high score.
Everyone knows what the benchmarks / objectives are and
can attain mastery to meet them.
It is possible for ALL the test takers to achieve 100%
mastery.

TESTING
Dimension: Purpose
Criterion-Referenced Tests:
• To determine whether each student has achieved specific
skills or concepts.
• To find out how much students know before instruction
begins and after it has finished.
Norm-Referenced Tests:
• To rank each student with respect to the achievement of
others in broad areas of knowledge.
• To discriminate between high and low achievers.

TESTING
Dimension: Content
Criterion-Referenced Tests:
• Measures specific skills which make up a designated
curriculum.
• These skills are identified by teachers and curriculum
experts.
• Each skill is expressed as an instructional objective.
Norm-Referenced Tests:
• Measures broad skill areas sampled from a variety of
textbooks, syllabi, and the judgments of curriculum experts.

TESTING
Dimension: Item Characteristics
Criterion-Referenced Tests:
• Each skill is tested by at least four items in order to
obtain an adequate sample of student performance and
to minimize the effect of guessing.
• The items which test any given skill are parallel in
difficulty.
Norm-Referenced Tests:
• Each skill is usually tested by fewer than four items.
• Items vary in difficulty.
• Items are selected that discriminate between high
and low achievers.

TESTING
Dimension: Score Interpretation
Criterion-Referenced Tests:
• Each individual is compared with a preset standard for
acceptable achievement. The performance of other
examinees is irrelevant.
• A student's score is usually expressed as a percentage.
• Student achievement is reported for individual skills.
Norm-Referenced Tests:
• Each individual is compared with other examinees and
assigned a score, usually expressed as a percentile, a
grade equivalent score, or a stanine.
• Student achievement is reported for broad skill areas,
although some norm-referenced tests do report student
achievement for individual skills.
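
The difference in score interpretation can be illustrated with a short Python sketch that reads the same raw scores in both ways. Everything specific here is an assumption made for the example: the 80% mastery cutoff, the 20-item test, the class scores, and the function names are hypothetical and do not come from Popham.

```python
def percent_correct(raw_score, total_items):
    """Criterion-referenced view: the score as a percentage of items correct."""
    return 100.0 * raw_score / total_items


def has_mastery(raw_score, total_items, cutoff_percent=80.0):
    """Compare one student with a preset standard (hypothetical 80% cutoff)."""
    return percent_correct(raw_score, total_items) >= cutoff_percent


def percentile_rank_in_group(raw_score, group_scores):
    """Norm-referenced view: percent of examinees scoring at or below this score."""
    at_or_below = sum(1 for s in group_scores if s <= raw_score)
    return 100.0 * at_or_below / len(group_scores)


# Hypothetical class of five students on a 20-item test.
class_scores = [18, 18, 19, 19, 20]
TOTAL_ITEMS = 20

everyone_masters = all(has_mastery(s, TOTAL_ITEMS) for s in class_scores)
rank_of_18 = percentile_rank_in_group(18, class_scores)
```

In this made-up class every student meets the criterion, yet the norm-referenced reading still places a student with 18 of 20 correct in the lower part of the group, because that reading depends on how the other examinees performed.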
