Computer Generated Items, Within-Template Variation, and the Impact on the Parameters of Response Models

Quinn N Lathrop
University of Notre Dame
January 8, 2015

Item Response Theory
[Figure: Prob Correct (y-axis, 0.0 to 1.0) plotted against Ability/Difficulty (x-axis, -3 to 3)]

Item Response Theory
$i = 1, 2, \ldots, I$ for Items
$p = 1, 2, \ldots, N$ for Persons
$Y_{pi} = 0$ or $1$
$\eta_{pi} = \alpha_i \times (\theta_p - \beta_i)$
$P(Y_{pi} = 1) = \frac{\exp(\eta_{pi})}{1 + \exp(\eta_{pi})}$
$\theta_p$ is Person Ability
$\beta_i$ is Item Difficulty
$\alpha_i$ is Item Discrimination
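
The response model above can be sketched in a few lines of Python. This is only an illustration: the function name `prob_correct` and the example parameter values are assumptions, not part of the talk.

```python
import math

def prob_correct(theta, beta, alpha=1.0):
    """P(Y = 1) under the two-parameter logistic model.

    eta = alpha * (theta - beta), and exp(eta) / (1 + exp(eta))
    is written in the equivalent form 1 / (1 + exp(-eta)).
    """
    eta = alpha * (theta - beta)
    return 1.0 / (1.0 + math.exp(-eta))

# When ability equals difficulty, the probability of a correct response is one half.
print(prob_correct(theta=0.0, beta=0.0))
# Ability above difficulty raises the probability; alpha controls how sharply.
print(prob_correct(theta=1.0, beta=0.0, alpha=1.5))
```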

New Technology Leads to New Psychometrics
Summer Internship with Pearson’s Center for Digital Data,
Analytics & Adaptive Learning
Computer/tablet-based course: lots of data
Technology allows for algorithmically generated items (the generators are called
Templates)

What are templates?
Templates generate items/tasks during computerized assessment.
• Creates secure and inexpensive item bank
• Creates miniature randomized experiments
• Students can repeat and practice templates
Templates contain:
• a question form
• distributions for all variables in the question form

Examples of Templates
Question Form:
“What is X + Y?”
Distributions:
$f_X(x) = 1/5,\ x \in \{1, 2, 3, 4, 5\}$
$f_Y(y) = 1/6,\ y \in \{3, 4, 5, 6, 7, 8\}$
Question Form:
“What is the average of $x_1, x_2, x_3, x_4, x_5$?”
Distributions:
$x_1, \ldots, x_5 \sim \mathrm{Binom}(40, .5)$
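
A template of this kind can be sketched as a question form plus a sampler for each variable. The dictionary layout and the name `generate_item` below are hypothetical choices for illustration, not the representation used in any actual system.

```python
import random

# Hypothetical representation: a question form plus the support of each variable.
template = {
    "form": "What is {X} + {Y}?",
    "variables": {
        "X": [1, 2, 3, 4, 5],      # uniform, f_X(x) = 1/5
        "Y": [3, 4, 5, 6, 7, 8],   # uniform, f_Y(y) = 1/6
    },
}

def generate_item(template, rng):
    """Draw one value per variable and fill in the question form."""
    values = {name: rng.choice(support)
              for name, support in template["variables"].items()}
    return template["form"].format(**values), values

rng = random.Random(0)  # seeded only so the sketch is reproducible
question, values = generate_item(template, rng)
print(question)
print("answer:", values["X"] + values["Y"])
```

Each call draws a fresh item, which is what lets students repeat a template without seeing the same question every time.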

Our Motivating Template
Question Form:
“What is the probability of rolling a X on a Y-sided die?”
Distributions:
$f_Y(y) = 1/5,\ y \in \{6, 8, 10, 12, 20\}$
$f_X(x) = 1/y,\ x \in \{1, 2, \ldots, y\}$
Correct strategy: $1/y$
An incorrect strategy: $x/y$
For a subset ($x = 1$), students can use the wrong strategy and still get the correct answer!
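
To make the subset concrete, the sketch below enumerates every item the die template can produce and finds those where the incorrect strategy happens to give the correct answer. The function and variable names are illustrative assumptions.

```python
from fractions import Fraction

SIDES = [6, 8, 10, 12, 20]

# Every (x, y) pair the template can generate.
items = [(x, y) for y in SIDES for x in range(1, y + 1)]

def correct_strategy(x, y):
    return Fraction(1, y)

def incorrect_strategy(x, y):
    return Fraction(x, y)

# Items where the incorrect strategy still yields the correct answer.
subset = [(x, y) for x, y in items
          if incorrect_strategy(x, y) == correct_strategy(x, y)]

# Chance that a randomly generated item lands in the subset:
# y is uniform over SIDES, then x = 1 has probability 1/y.
p_subset = sum(Fraction(1, len(SIDES)) * Fraction(1, y) for y in SIDES)

print(len(items))
print(subset)
print(p_subset)
```

The subset is exactly the items with x = 1, one per die size, which is the grouping a design variable would need to encode.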

Model within-template differences with multi-level IRT?
• Albers (1995)
• Glas and van der Linden (2003)
• Johnson and Sinharay (2005)
Model within-template differences with covariates?
• Fischer (1973)
• de Boeck and Wilson (2004)
Both?
Neither?

Publications/Presentations
Lathrop, Q. N. & Cheng, Y. (Under Review). Computer Generated
Items, Within-Template Variation, and the Impact on the Parameters of
Response Models.
Lathrop, Q. N. (2014). The Impact of Within-Template Systematic
Variation. Presented at NCME and regional conferences.
Lathrop, Q. N. & Behrens, J. (2014). Psychometric, Computational, and
Interactional Issues in Designing Integrated Assessment and Learning Systems.
Presented at the AERA.
Lathrop, Q. N. & Cheng, Y. (2013). Modeling Tests Using Templates
and Effect of Ignoring Template Structure on Educational Outcomes.
Presented at NCME.

Notation Changes
Each person responds to templates
• $p = 1, 2, \ldots, N$ for persons and $t = 1, 2, \ldots, T$ for templates
Each template has some number of items
• $t_i = 1, 2, \ldots, t_I$
The items within a template may be grouped by a design matrix
• A dummy variable $X_{t_i} = 1$ if $t_i$ is in the “subset” and 0 otherwise
When person $p$ is assigned template $t$, item $t_i$ is randomly drawn from available items
• The response is $Y_{pt_i} \sim \mathrm{Bernoulli}\left(\frac{\exp(\eta_{pt_i})}{1 + \exp(\eta_{pt_i})}\right)$

Four Models
2P-T
$\eta_{pt} = \alpha_t \times (\theta_p - \mu_t)$
• the “neither” option, just a template-level IRT model
2P-TX
$\eta_{pt_i} = \alpha_t \times (\theta_p - \mu_t + \lambda_t X_{t_i})$
• adds a covariate effect, $\lambda_t$, to explain the differences coded in $X_{t_i}$
2P-R
$\eta_{pt_i} = \alpha_t \times (\theta_p - \beta_{t_i})$
$\beta_{t_i} \sim N(\mu_t, \sigma_t)$
• multi-level model
2P-RX
$\eta_{pt_i} = \alpha_t \times (\theta_p - \beta_{t_i} + \lambda_t X_{t_i})$
$\beta_{t_i} \sim N(\mu_t, \sigma_t)$
• the “both” option
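
The four linear predictors differ only in which terms they include, which a short sketch can make explicit. The function names and the example values are assumptions chosen for illustration; in the R models, `beta_item` stands for a single item difficulty that would itself be drawn around the template mean.

```python
import math

def eta_2p_t(theta, alpha, mu):
    """2P-T: template-level difficulty only."""
    return alpha * (theta - mu)

def eta_2p_tx(theta, alpha, mu, lam, x):
    """2P-TX: template difficulty plus a covariate effect for the subset."""
    return alpha * (theta - mu + lam * x)

def eta_2p_r(theta, alpha, beta_item):
    """2P-R: item-level difficulty (random around the template mean)."""
    return alpha * (theta - beta_item)

def eta_2p_rx(theta, alpha, beta_item, lam, x):
    """2P-RX: item-level difficulty plus the covariate effect."""
    return alpha * (theta - beta_item + lam * x)

def inv_logit(eta):
    return 1.0 / (1.0 + math.exp(-eta))

# Illustrative values only: an item in the subset (x = 1) with a positive
# lambda is easier than an item outside it (x = 0).
p_outside = inv_logit(eta_2p_tx(theta=0.0, alpha=1.4, mu=-0.5, lam=2.0, x=0))
p_inside = inv_logit(eta_2p_tx(theta=0.0, alpha=1.4, mu=-0.5, lam=2.0, x=1))
print(p_outside, p_inside)
```

Setting `lam` to zero in the X models, or collapsing every `beta_item` to `mu` in the R models, recovers the simpler models, which is why they nest.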

[Figure: four panels (2P-T, 2P-TX, 2P-R, 2P-RX), each plotting Prob of Correct (y-axis, 0.70 to 1.00) against values 1 to 20 (x-axis)]

Parameter Estimation and Model Evaluation
          2P-T     2P-TX    2P-R     2P-RX
αt        1.382    1.383    1.439    1.412
µt       -0.624   -0.484   -0.758   -0.557
λt        -        2.381    -        2.279
σt        -        -        0.493    0.332
DIC       423.1    405.8    391.6    391.8
cv-AUC    0.653    0.705    0.664    0.698

Model selection may be out of our hands
Person ID   Template ID   Item ID   Response   X
1           1             12        0          0
1           2             37        1          0
...         ...           ...       ...        ...
1           T             2         0          1
2           1             7         1          1
...         ...           ...       ...        ...
N           T             I_t       1          1

Simulation Study
Simulate data (so we know the true values of the parameters)
Fit all four models (2P-T, 2P-TX, 2P-R, and 2P-RX) with MCMC
Compare their results
What happens if we fit the simpler 2P-T? (ignore template variability and ignore systematic variation)

Quick summary of MCMC inference
Bayesian analysis combines the data, our model for the data (likelihood), and any prior information.
The results of MCMC are samples from the posterior distribution of all parameters of interest
          parameters
iterations theta[40]
      [1,] -1.590404
      [2,] -1.625150
      [3,] -1.676880
      [4,] -1.986976
      [5,] -1.808551
      [6,] -1.562125
      [7,] -1.837187
      [8,] -1.518175
- Point estimate is the average across iterations.
- Standard error is the standard deviation across iterations.
- Hypothesis testing can be done with posterior intervals (like confidence intervals).
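
The three summaries above can be computed directly from the draws. The sketch below uses the eight draws of theta[40] shown on the slide; eight draws are far too few for real inference and are used here only because they are the values displayed. The 95% level and the quantile method are assumptions.

```python
import statistics

# The eight posterior draws of theta[40] shown on the slide.
draws = [-1.590404, -1.625150, -1.676880, -1.986976,
         -1.808551, -1.562125, -1.837187, -1.518175]

# Point estimate: average across iterations.
point_estimate = statistics.mean(draws)

# Standard error: standard deviation across iterations.
standard_error = statistics.stdev(draws)

# Posterior interval: empirical 2.5% and 97.5% quantiles (assumed 95% level).
cuts = statistics.quantiles(draws, n=40, method="inclusive")
lower, upper = cuts[0], cuts[-1]

# A simple test of "theta differs from zero": does the interval exclude zero?
excludes_zero = not (lower <= 0.0 <= upper)

print(round(point_estimate, 6), round(standard_error, 6))
print(lower, upper, excludes_zero)
```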

Simulation Study
1000 persons answering 40 templates
Each template has 12 or 100 items
$\sigma_t$ is 0, or is drawn from $|N(0, .3)|$ or $|N(0, .6)|$
For X, a random 25% of items within a template belong to the “subset”
$\lambda_t$ is zero (Type I error) or nonzero
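
One cell of this design can be sketched as a data generator under the 2P-RX model. Several details are not given on the slides and are assumptions here, flagged in the comments: the distributions of ability, discrimination, and template difficulty, the nonzero value of lambda, and the reading of .3 as a standard deviation. The output rows follow the long format of person, template, item, response, and X.

```python
import math
import random

rng = random.Random(2015)  # seeded only for reproducibility

N_PERSONS, N_TEMPLATES, N_ITEMS = 1000, 40, 12
SIGMA_SCALE = 0.3  # the |N(0, .3)| condition; .3 treated as an SD (assumption)
LAMBDA = 1.0       # assumed nonzero effect; use 0.0 for the Type I error condition

theta = [rng.gauss(0, 1) for _ in range(N_PERSONS)]  # assumed N(0, 1) abilities

templates = []
for t in range(N_TEMPLATES):
    alpha = rng.uniform(0.8, 2.0)           # assumed discrimination range
    mu = rng.gauss(0, 1)                    # assumed template difficulty
    sigma = abs(rng.gauss(0, SIGMA_SCALE))  # within-template SD
    beta = [rng.gauss(mu, sigma) for _ in range(N_ITEMS)]
    # A random 25% of the items in each template belong to the subset.
    in_subset = set(rng.sample(range(N_ITEMS), N_ITEMS // 4))
    x = [1 if i in in_subset else 0 for i in range(N_ITEMS)]
    templates.append({"alpha": alpha, "beta": beta, "x": x, "lam": LAMBDA})

rows = []  # (person, template, item, response, X)
for p in range(N_PERSONS):
    for t, tpl in enumerate(templates):
        i = rng.randrange(N_ITEMS)  # item drawn at random within the template
        eta = tpl["alpha"] * (theta[p] - tpl["beta"][i] + tpl["lam"] * tpl["x"][i])
        y = 1 if rng.random() < 1.0 / (1.0 + math.exp(-eta)) else 0
        rows.append((p + 1, t + 1, i + 1, y, tpl["x"][i]))

print(len(rows), rows[0])
```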

How does systematic variation affect α̂t?

How does systematic variation affect σ̂t?

How does systematic variation affect µ̂t?

What about λ̂t?
The covariate performs well in terms of bias.
But the 2P-TX has very high Type I error.
And the 2P-RX properly controls Type I error.
There is a clear benefit to the other parameters in the model.
Specification of $X_{t_i}$ is the limiting factor.

Review
$\eta_{pt_i} = \alpha_t \times (\theta_p - \beta_{t_i} + \lambda_t X_{t_i})$
$\beta_{t_i} \sim N(\mu_t, \sigma_t)$

How does systematic variation affect θ̂p?

Is there any situation that biases θ̂p?
[Figure: Bias of θ̂ (y-axis, -0.02 to 0.03) plotted against Received ∑λX (x-axis, -2 to 2), with one line per model: 2P-T, 2P-TX, 2P-R, 2P-RX]

Implications - what is our inferential focus?
While templates are increasingly used, there is relatively little methodological work
• If within-template variation exists, the 2P-TX, 2P-R, and 2P-RX models can account for it
• Useful for item analysis, item selection, and other item-based inferences
• But it doesn’t meaningfully affect inferences based on θ
The Simple 2P-T Model
• While it cannot uncover the within-template effects, it can still measure θ very well
• And the 2P-T’s discrimination parameter can be used to screen for high within-template variation
Already used in large assessment and learning systems

Data Collection is Key
Many systems have thousands of templates each with potentially thousands of items.
• Is the item index being recorded (needed for R models)?
• How do we organize the items by meaningful dimensions in X (needed for X models)?
If not, the 2P-T is generally the only option.
If we don’t collect the data, we can’t even begin to ask.
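
The dependence of model choice on what was logged can be written down as a small lookup. This is a sketch of the reasoning on the slide only; the function name and its two flags are hypothetical.

```python
def feasible_models(item_id_recorded, x_recorded):
    """List which of the four models the logged data can support."""
    models = ["2P-T"]            # needs only person, template, and response
    if x_recorded:
        models.append("2P-TX")   # needs the design variable X
    if item_id_recorded:
        models.append("2P-R")    # needs the item index within each template
    if item_id_recorded and x_recorded:
        models.append("2P-RX")   # needs both
    return models

print(feasible_models(item_id_recorded=False, x_recorded=False))
print(feasible_models(item_id_recorded=True, x_recorded=True))
```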

Thank you
Quinn N Lathrop
qlathrop@nd.edu
irtND.wikispaces.com