Multivariate And Mixture Distribution Rasch Models Extensions And Applications Matthias Von Davier


Statistics for Social and Behavioral Sciences
Advisors:
S.E. Fienberg W.J. van der Linden

Statistics for Social and Behavioral Sciences
Brennan: Generalizability Theory.
DeBoeck/Wilson: Explanatory Item Response Models: A Generalized Linear and
Nonlinear Approach
Devlin/Fienberg/Resnick/Roeder (Eds.): Intelligence, Genes, and Success: Scientists
Respond to The Bell Curve.
Finkelstein/Levin: Statistics for Lawyers, Second Edition.
Gastwirth (Ed.): Statistical Science in the Courtroom.
Handcock/Morris: Relative Distribution Methods in the Social Sciences.
Johnson/Albert: Ordinal Data Modeling.
Kolen/Brennan: Test Equating, Scaling, and Linking: Methods and Practices, Second
Edition.
Longford: Missing Data and Small-Area Estimation: Modern Analytical Equipment for
the Survey Statistician.
Morton/Rolph: Public Policy and Statistics: Case Studies from RAND.
van der Linden: Linear Models for Optimal Test Design.
von Davier (A.A.)/Holland/Thayer: The Kernel Method of Test Equating.
von Davier (M.)/Carstensen: Multivariate and Mixture Distribution Rasch Models.
Zeisel/Kaye: Prove It with Figures: Empirical Methods in Law and Litigation.

Matthias von Davier
Claus H. Carstensen
Multivariate and
Mixture Distribution
Rasch Models
Extensions and Applications

Library of Congress Control Number: 2006926461
ISBN-10: 0-387-32916-1
ISBN-13: 978-0-387-32916-1
Printed on acid-free paper.
© 2007 Springer Science +Business Media, LLC
All rights reserved. This work may not be translated or copied in whole or in part without the
written permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street,
New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly
analysis. Use in connection with any form of information storage and retrieval, electronic
adaptation, computer software, or by similar or dissimilar methodology now known or hereafter
developed is forbidden.
The use in this publication of trade names, trademarks, service marks, and similar terms, even if
they are not identified as such, is not to be taken as an expression of opinion as to whether or
not they are subject to proprietary rights.
springer.com
Editors:
Matthias von Davier
Educational Testing Service
Rosedale Road
Princeton, NJ 08541
USA
mvondavier@ets.org
Claus H. Carstensen
Leibniz-Institut für die Pädagogik
der Naturwissenschaften (IPN)
University of Kiel, Institute for Science
Education
Kiel 24098
Germany
carstensen@ipn.uni-kiel.de
Advisors:
Stephen E. Fienberg
Department of Statistics
Carnegie Mellon University
Pittsburgh, PA 15213-3890
USA
Wim J. van der Linden
Department of Measurement and Data
Analysis
Faculty of Behavioral Sciences
University of Twente
7500 AE Enschede
The Netherlands

Preface
This volume gathers together a set of extensions of the Rasch model, one of
the most prominent models for measurement in educational research and so-
cial science developed by Danish mathematician Georg Rasch. The idea for
this volume emerged during a meeting of the Psychometric Society in Mon-
terey, CA. At that meeting, friends and colleagues discussed news about the
impending retirement of Dr. Jürgen Rost, an important innovator and men-
tor in this field. To recognize Jürgen’s contributions, we decided to produce
a collection of research on extending the Rasch model as well as embedding
the Rasch model in more complex statistical models, an area that is receiving
broad interest in many fields of social sciences at the current time.
This collection contains 22 chapters by recognized international experts in
the field. The contributions cover topics ranging from general model extensions
to application in fields as diverse as cognition, personality, organizational and
sports psychology, and health sciences and education.
The Rasch model is designed for categorical data, often collected as exam-
inees’ responses to multiple tasks such as cognitive items from psychological
tests or from educational assessments. The Rasch model’s elegant mathemati-
cal form is suitable for extensions that allow for greater flexibility in handling
complex samples of examinees and collections of tasks from different domains.
In these extensions, the Rasch model is enhanced by additional structural el-
ements that account either for differences between diverse populations or for
differences among observed variables.
Research on extending well-known statistical tools such as regression, mix-
ture distribution, and hierarchical linear models has led to the adoption of
Rasch model features to handle categorical observed variables. We maintain
both perspectives in the volume and show how these merged models—Rasch
models with a more complex item or population structure—are derived either
from the Rasch model or from a structural model, how they are estimated,
and where they are applied.
This volume is centered on extensions of the Rasch model to multiple dimensions
and complex samples of examinees and/or item responses. Therefore,
applications of the unidimensional Rasch model for simple random samples
are not specifically mentioned. Such cases can be found in volumes geared
toward applying the Rasch model. More importantly, simple data collection
designs can be treated as special cases of the extensions presented here, so that
data suitable for the ordinary Rasch model can be analyzed with virtually all
the extensions presented in this volume.
Thanks go to our respective families, who helped us a lot with their
encouragement and support: thank you Alina, Barbara, Thomas, and Luis!
We are also deeply grateful to our academic teacher, Jürgen Rost, who in-
troduced us to the fascinating field of extended Rasch models. We would
also like to thank our professional affiliations and colleagues for making this
project possible by providing resources and support. We thank Daniel Eignor
for the excellent help on clarifying and better organizing a lot of our writing,
and thanks go to Kim Fryer for the superb editorial support and to Henning
Voigtländer for helping to convert and typeset many contributed chapters.
Most of all, the diversity and coverage of topics presented in this volume
would not have been possible without the excellent contributors in their roles
as authors and reviewers for this volume.
Matthias von Davier
Princeton, NJ, USA
Claus H. Carstensen
Kiel, Germany
March 2006

List of Contributors
Raymond J. Adams
Assessment Research Centre (ARC)
Faculty of Education
University of Melbourne
Victoria 3010 Australia
r.adams@unimelb.edu.au
Keith A. Boughton
CTB/McGraw-Hill
20 Ryan Ranch Road
Monterey, CA 93940, USA
keith_boughton@ctb.com
Dirk Büsch
Universität Bremen
Fachbereich 9 - Kulturwissenschaften
Studiengang Sport
dbuesch@uni-bremen.de
Claus H. Carstensen
Leibniz Institute for
Science Education
Olshausenstrasse 62
24098 Kiel, Germany
carstensen@ipn.uni-kiel.de
Yuk Fai Cheong
Division of Educational Studies
Emory University
1784 North Decatur Road, Suite 240
Atlanta, GA 30322
ycheong@emory.edu
Karl Bang Christensen
Arbejdsmiljøinstituttet
Lersø Parkallé 105
2100 Copenhagen, Denmark
kbc@ami.dk
Matthias von Davier
Rosedale Road
Princeton, NJ 08541, USA
mvondavier@ets.org
Karen Draney
Graduate School of Education
University of California at Berkeley
4415 Tolman Hall
Berkeley, CA 94720
kdraney@berkeley.edu
Clemens Draxler
Science Education
Olshausenstrasse 62
24098 Kiel, Germany
draxler@ipn.uni-kiel.de

Michael Eid
Institut für Psychologie
Freie Universität Berlin
Habelschwerdter Allee 45
D-14195 Berlin
Germany
michael.eid@pse.unige.ch
Susan E. Embretson
School of Psychology
Georgia Institute of Technology
654 Cherry Street
Atlanta, Georgia USA 30332
susan.embretson@psych.gatech.edu
Anton K. Formann
Fakultät für Psychologie
Universität Wien
Liebiggasse 5
A-1010 Wien, AUSTRIA
anton.formann@univie.ac.at
Cees A. W. Glas
Faculty of Educational Science and
Technology
PO Box 217
7500 AE Enschede, The Netherlands
glas@edte.utwente.nl
Judith Glück
Universität Wien
Liebiggasse 5
judith.glueck@univie.ac.at
Shelby J. Haberman
Rosedale Road
shaberman@ets.org
Chun-Wei Huang
WestEd
730 Harrison Street
San Francisco, CA 94107
chuang@wested.org
Akihito Kamata
Department of Educational
Psychology & Learning Systems
Florida State University
Tallahassee, FL 32306, U.S.A.
kamata@coe.fsu.edu
Henk Kelderman
Vrije Universiteit
Department of Psychology and
Pedagogics
Van der Boechorststraat 1
1081 BT Amsterdam
The Netherlands
h.kelderman@psy.vu.nl
Svend Kreiner
Biostatistisk afdeling Institut for
Folkesundhedsvidenskab
Øster Farimagsgade 5 opg. B
Postboks 2099
1014 Copenhagen, Denmark
S.Kreiner@biostat.ku.dk
Klaus D. Kubinger
Universität Wien
Liebiggasse 5
klaus.kubinger@univie.ac.at
Thorsten Meiser
Institut für Psychologie
Universität Jena
Humboldtstr. 11
D-07743 Jena, Germany
thorsten.meiser@uni-jena.de

Robert J. Mislevy
University of Maryland at
College Park
Department of Measurement,
Statistics and Evaluation
Benjamin 1230-C
College Park, MD 20742
rmislevy@umd.edu
Carl P.M. Rijkes
Bogortuin 145
1019PE Amsterdam
The Netherlands
h.kelderman@psy.vu.nl
Jürgen Rost
Science Education
Olshausenstrasse 62
24098 Kiel, Germany
an@j-rost.de
Bernd Strauss
Westfälische Wilhelms-Universität
Münster
Institut für Sportwissenschaft
Horstmarer Landweg 62b
D-48149 Münster
bstrauss@uni-muenster.de
Christiane Spiel
Universität Wien
Liebiggasse 5
christiane.spiel@univie.ac.at
Gershon Tenenbaum
Department of Educational
Psychology & Learning Systems
Florida State University
Tallahassee, FL 32306, U.S.A.
tenenbau@mail.coe.fsu.edu
Mark Wilson
Graduate School of Education
University of California at Berkeley
4415 Tolman Hall
Berkeley, CA 94720
MarkW@berkeley.edu
Margaret L. Wu
Assessment Research Centre (ARC)
Faculty of Education
University of Melbourne
Victoria 3010 Australia
mlwu@unimelb.edu.au
Kentaro Yamamoto
Rosedale Road
kyamamoto@ets.org
Michael Zickar
Bowling Green State University
Department of Psychology
Bowling Green, OH 43403, USA
mzickar@bgnet.bgsu.edu

Contents
Preface
List of Contributors
1 Introduction: Extending the Rasch Model
Matthias von Davier, Jürgen Rost, Claus H. Carstensen
Part I Multivariate and Mixture Rasch Models
2 Measurement Models as Narrative Structures
Robert Mislevy, Chun-Wei Huang
3 Testing Generalized Rasch Models
Cees A.W. Glas
4 The Mixed-Coefficients Multinomial Logit Model: A
Generalized Form of the Rasch Model
Raymond J. Adams, Margaret L. Wu
5 Loglinear Multivariate and Mixture Rasch Models
Henk Kelderman
6 Mixture-Distribution and HYBRID Rasch Models
Matthias von Davier, Kentaro Yamamoto
Part II Generalized Models—Specific Research Questions
7 Application of the Saltus Model to Stagelike Data: Some
Applications and Current Developments
Karen Draney, Mark Wilson
8 Determination of Diagnostic Cut-Points Using
Stochastically Ordered Mixed Rasch Models
Svend Kreiner
9 A HYBRID Model for Test Speededness
Keith A. Boughton, Kentaro Yamamoto
10 Multidimensional Three-Mode Rasch Models
Claus H. Carstensen, Jürgen Rost
11 (Almost) Equivalence Between Conditional and Mixture
Maximum Likelihood Estimates for Some Models of the
Rasch Type
Anton K. Formann
12 Rasch Models for Longitudinal Data
Thorsten Meiser
13 The Interaction Model
Shelby J. Haberman
14 Multilevel Rasch Models
Akihito Kamata, Yuk Fai Cheong
Part III Applications of Multivariate and Mixed Rasch Models
15 Mixed Rasch Models for Measurement in Cognitive
Psychology
Susan E. Embretson
16 Detecting Response Styles and Faking in Personality and
Organizational Assessments by Mixed Rasch Models
Michael Eid, Michael J. Zickar
17 Application of Multivariate Rasch Models in International
Large-Scale Educational Assessments
Raymond J. Adams, Margaret L. Wu, Claus H. Carstensen
18 Studying Development via Item Response Models: A
Wide Range of Potential Uses
Judith Glück, Christiane Spiel
19 A Comparison of the Rasch Model and Constrained Item
Response Theory Models for Pertinent Psychological Test
Data
Klaus D. Kubinger, Clemens Draxler
20 Latent-Response Rasch Models for Strategy Shifts in
Problem-Solving Processes
Carl P. M. Rijkes, Henk Kelderman
21 Validity and Objectivity in Health-Related Scales:
Analysis by Graphical Loglinear Rasch Models
Svend Kreiner, Karl Bang Christensen
22 Applications of Generalized Rasch Models in the Sport,
Exercise, and the Motor Domains
Gershon Tenenbaum, Bernd Strauss, Dirk Büsch
References
Author Index
Erratum

1
Introduction: Extending the Rasch Model
Matthias von Davier (1), Jürgen Rost (2), and Claus H. Carstensen (2)
(1) Educational Testing Service, Princeton
(2) Leibniz Institute for Science Education, Kiel
1.1 Introduction
The present volume is a collection of chapters on research and development
work on extensions of the Rasch model (RM; Rasch, 1960) that have focused
on relaxing some fundamental constraints of the original RM, while preserv-
ing many of the unique features of the model. More specifically, the volume
presents extensions of the RM in which certain homogeneity assumptions on
the item level and the population level have been relaxed. With these two
types of assumption intact, the original RM decomposes the probability of
item responses into two independent components: an item-specific difficulty
parameter that is constant across all examinees in the population, and one
ability parameter for each examinee that is the same across all items in a given
assessment.
These homogeneity assumptions, however, are the ones not met in many
practical applications of the RM, since either some or all of the items may
function differently in different subpopulations, or the responses of subjects
to these items may depend on more than one latent trait. This turns out to
be an issue, for example, if item types are mixed, if the content of items varies
somewhat, and/or if items are assessed in complex populations of examinees
that come from different backgrounds such as different educational systems.
The volume addresses these issues in two ways, first by presenting chapters
on recent extensions to the RM and second by providing chapters on applica-
tions of these extensions in educational or psychological contexts. The model
extensions presented here have been actively developed and studied by vari-
ous researchers, who have contributed to pioneering theoretical developments
on extending the RM to multiple populations and multidimensional abilities.
These researchers are often long-term advocates of applying these models to
substantial research questions in the social sciences. Many researchers with
backgrounds in other well-established statistical fields likewise took the RM
as a basis for extending “their” models, frequently with a specific substantive
question in mind. Several chapters in this volume are contributed by the
original developers of such model extensions, who took a mathematical model
and made it more flexible to suit applied research questions.
This direction of development—from a theory-driven substantive research
question or a hypothesis to a model extension that reflects this theory—is
guiding the structure of most contributions in this volume. The different
chapters describe this process by referring to exemplary theories or research
questions under investigation, then outline the required features of the model
extension used to investigate these questions, and finally describe the path
taken to extend or choose a model and to plan and carry out the analysis. To
reflect this interplay between substantive theory and model development, the
first part of this volume includes papers presenting work on extending mixed RMs (MRMs)
and multiple group RMs—relaxing the person homogeneity assumptions—as
well as multivariate RMs that relax the item homogeneity assumption to fit
typical questions arising in applied research. The second part of this volume
consists of chapters that present the models developed in the first part in a
variety of applications in empirical educational research and a number of areas
of psychological research.
1.1.1 The Rasch Model
This section introduces a basic set of assumptions and a general framework
for latent variable models for item response data. The conventions introduced
here can be found in most subsequent chapters, except where the extensions
developed in subsequent chapters are more easily derived using a different
notation.
Assume there are N examinees, E1, . . . , EN, drawn randomly from a population,
who respond to a set of I test items. Let xvi ∈ {0, 1, . . . , mi} denote the
integer-coded response of examinee v to item i, that is, the actual behavioral
response is mapped to an element of a set of successive integers starting from
0.
If the responses to item i take on only the two values 0 and 1, we speak
of dichotomous data and refer to the dichotomous RM; if the responses can
take on more than two integer values, say x·i ∈ {0, 1, 2, 3, 4}, the RM has
to be specified for polytomous ordinal data to model responses of this type
appropriately. In this volume, both the dichotomous RM and the RM for poly-
tomous data will be used frequently, and it will often not be explicitly specified
whether item responses are assumed to be dichotomous or polytomous. We
ensure that this will not lead to ambiguities by using a specific method of
introducing the RM in a mathematical form that can be used for both di-
chotomous and polytomous data while meeting certain common foundational
assumptions of the RM.
Given the above definitions, denote the observed item responses of an
examinee v by $x_v = (x_{v1}, \ldots, x_{vI})$, that is, a vector with integer
components in the finite space
\[
\Omega_x = \prod_{i=1}^{I} \{0, \ldots, m_i\}
\]
of possible response patterns for these I test items. The RM is derived by
assuming certain unobserved quantities in
addition to the observed quantities xv1, . . . , xvI for each examinee v and each
item i, and by specifying certain assumptions about the relation of these,
yet to be specified, unobserved quantities to the probability of observing a
response pattern x ∈ Ωx.
The dichotomous RM assumes that there is a real-valued parameter θv
for each examinee, referred to as the person parameter, and a real-valued βi
for each item, subsequently referred to as item difficulty. For the probability
of a response xvi, the RM assumes
\[
P_{vi}(X = x_{vi}) = P(x_{vi} \mid \theta_v, \beta_i)
= \frac{\exp\bigl(x_{vi}(\theta_v - \beta_i)\bigr)}{1 + \exp(\theta_v - \beta_i)}
\tag{1.1}
\]
for all examinees v = 1, . . . , N and all items i = 1, . . . , I.
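Equation 1.1 can be evaluated directly. The following is a minimal sketch in Python; the function name `rasch_prob` and the parameter values are illustrative assumptions, not part of the original text.

```python
import math


def rasch_prob(x, theta, beta):
    """P(X = x | theta, beta) for the dichotomous RM (Equation 1.1)."""
    if x not in (0, 1):
        raise ValueError("x must be 0 or 1 in the dichotomous RM")
    return math.exp(x * (theta - beta)) / (1.0 + math.exp(theta - beta))


# Ability equal to difficulty gives a success probability of one half.
p_equal = rasch_prob(1, theta=0.0, beta=0.0)
# An able examinee on an easy item (hypothetical values).
p_easy = rasch_prob(1, theta=1.0, beta=-1.0)
# The two response probabilities of one item sum to one.
p_sum = rasch_prob(0, 0.3, 1.2) + rasch_prob(1, 0.3, 1.2)
```

Only the difference θv − βi enters the formula, so shifting all person and item parameters by the same constant leaves every probability unchanged.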
This equation can easily be extended to polytomous responses by writing
the model as
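\[
P(x_{vi} \mid \theta_v, \beta_{i\cdot})
= \frac{\exp\bigl(x_{vi}\theta_v - \beta_{i x_{vi}}\bigr)}
       {1 + \sum_{x=1}^{m_i} \exp\bigl(x\theta_v - \beta_{ix}\bigr)}
\tag{1.2}
\]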
with real-valued βix for i = 1, . . . , I and x = 1, . . . , mi, with βi0 = 0,
and θv real-valued as above. The model as defined in Equation 1.2 is suitable
for observed variables xvi ∈ {0, . . . , mi} with an integer mi > 0.
The definition of the RM ensures that the probability of responding with
category x rather than with x − 1 is strictly increasing with increasing person
parameter θ. For the item parameters, strictly decreasing monotonicity holds:
with increasing difficulty threshold γix = βix − βi(x−1), a response in the upper
(x) of two adjacent categories (x, x − 1) decreases in probability. These
monotonicity properties (MO) are among the defining characteristics of the RM.
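The monotonicity property can be checked numerically. The sketch below assumes the parameterization of Equation 1.2 with βi0 = 0; the function names and parameter values are hypothetical. Under that equation, the log-odds of category x against x − 1 equal θ minus the difference βix − βi(x−1), which plays the role of the threshold.

```python
import math


def category_probs(theta, betas):
    """Return [P(X=0), ..., P(X=m)] for one item under Equation 1.2.

    betas holds beta_i1, ..., beta_im; beta_i0 = 0 is implied.
    """
    logits = [0.0] + [x * theta - b for x, b in enumerate(betas, start=1)]
    denom = sum(math.exp(v) for v in logits)
    return [math.exp(v) / denom for v in logits]


def adjacent_log_odds(theta, betas, x):
    """ln P(X = x) - ln P(X = x - 1) for one item."""
    p = category_probs(theta, betas)
    return math.log(p[x] / p[x - 1])


betas = [-0.5, 0.2, 1.4]  # hypothetical parameters for an item with m_i = 3
probs = category_probs(0.0, betas)
low = adjacent_log_odds(-1.0, betas, 2)   # theta = -1
high = adjacent_log_odds(1.0, betas, 2)   # theta = +1
```

Here the threshold between categories 1 and 2 is 0.2 − (−0.5) = 0.7, so the two log-odds are −1 − 0.7 and 1 − 0.7: raising θ raises the odds of the upper category, and raising the threshold would lower them.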
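For the second defining characteristic, it is convenient to write
\[
\alpha(\theta_v, \beta_{i\cdot})
= -\ln\Bigl[1 + \sum_{x=1}^{m_i} \exp\bigl(x\theta_v - \beta_{ix}\bigr)\Bigr]
\]
and to write the RM as
\[
P(X = x_{vi} \mid \theta_v, \beta_{i\cdot})
= \exp\bigl(x_{vi}\theta_v - \beta_{i x_{vi}} + \alpha(\theta_v, \beta_{i\cdot})\bigr).
\tag{1.3}
\]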
In addition to the monotonicity in item and person parameters, the RM
assumes local independence (LI), i.e., it is assumed that, for an examinee
with person parameter θ, the responses x = (x1, . . . , xI) are independently
distributed given θ. That is,
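\[
P(x \mid \theta) = \prod_{i=1}^{I} P(X = x_i \mid \theta, \beta_i)
\]
for all θ. This, with the above definitions, yields after some elementary
transformations
\[
P(x \mid \theta)
= \exp(t_v\theta)\,
  \exp\bigl(\boldsymbol{\alpha}(\theta, \boldsymbol{\beta})\bigr)\,
  \exp\Bigl(-\sum_{i=1}^{I} \beta_{i x_{vi}}\Bigr)
\tag{1.4}
\]
with $t_v = \sum_{i=1}^{I} x_{vi}$ and
$\boldsymbol{\alpha}(\theta_v, \boldsymbol{\beta}) = \sum_{i=1}^{I} \alpha(\theta_v, \beta_{i\cdot})$.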
Note that in Equation 1.4, the probability of a response pattern x in the
RM has been written as a product of three terms. One of the terms,
exp(α(θ, β)), does not depend on the observed data, and another one, exp(tvθ),
is the same for all response patterns that share the same total score t. This
property will be used in the next section, which discusses conditional
inferences in the RM.
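The factorization in Equation 1.4 can be verified numerically by comparing the product of item probabilities under local independence with the three-term product. This is a small sketch with hypothetical item parameters; the helper names are illustrative and βi0 = 0 is assumed throughout.

```python
import math


def alpha(theta, betas):
    """alpha(theta, beta_i.) = -ln(1 + sum_x exp(x*theta - beta_ix))."""
    return -math.log(
        1.0 + sum(math.exp(x * theta - b) for x, b in enumerate(betas, start=1))
    )


def beta_of(x, betas):
    """beta_ix with the convention beta_i0 = 0."""
    return 0.0 if x == 0 else betas[x - 1]


def pattern_prob_product(pattern, theta, items):
    """Product over items of Equation 1.3 (local independence)."""
    p = 1.0
    for x, betas in zip(pattern, items):
        p *= math.exp(x * theta - beta_of(x, betas) + alpha(theta, betas))
    return p


def pattern_prob_factored(pattern, theta, items):
    """Three-term form of Equation 1.4."""
    t = sum(pattern)
    alpha_total = sum(alpha(theta, betas) for betas in items)
    beta_sum = sum(beta_of(x, betas) for x, betas in zip(pattern, items))
    return math.exp(t * theta) * math.exp(alpha_total) * math.exp(-beta_sum)


# Two dichotomous items and one three-category item (hypothetical values).
items = [[-0.4], [0.3], [0.1, 0.9]]
pattern = (1, 0, 2)
lhs = pattern_prob_product(pattern, 0.7, items)
rhs = pattern_prob_factored(pattern, 0.7, items)
```

The two values agree up to floating-point error, which illustrates that the response pattern enters only through the total score and the sum of the selected item parameters.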
To estimate parameters, maximum likelihood methods can be applied. Initial
approaches to the estimation problem have been based on maximizing a
likelihood function for the observed data matrix
$(x_{vi})_{i=1,\ldots,I;\; v=1,\ldots,N}$ jointly for the θv and the βix
parameters. To avoid undesirable properties of the joint estimation
(Neyman & Scott, 1948), later approaches applied modified likelihood equations
that eliminated the person parameter θ and thus allow one to maximize for the
item parameters only. By eliminating the “nuisance” parameters θv, which are
increasing in number with sample size N, the consistency of item parameter
estimates can be ensured. This is done either by assuming a distribution for
the person parameter θ and integrating over this distribution (marginal
maximum likelihood, MML) or by conditioning on some available observed
quantity, a sufficient statistic (Bickel & Doksum, 1977) that allows one to
eliminate the nuisance parameters.
MML estimation is prevalent in more general IRT models since these often
do not have simple sufficient statistics. However, the specific form of the RM
as given in Equations 1.1 and 1.2 ensures that the total score tv is a sufficient
statistic for the person parameter θv, and similarly for the item-category to-
tals. This property of the RM, the sufficiency of total (ST) scores for the
item and person parameters, is the third defining characteristic of RMs. The
impact of this sufficiency is elaborated on in the following subsection on the
conditional (on total score) form used in the conditional maximum likelihood
estimation (CML) of the RM.
1.1.2 Conditional Inferences in the Rasch Model
The sufficiency of the total score (ST) ensures that the RM can be written in a
conditional form, based on the observed distribution of the sufficient statistic.
The conditional form of the RM no longer contains the person parameter
and can be used to draw conditional inferences about model data fit and
to estimate item parameters without assumptions about the distribution of
person parameters in the population by plugging in the observed counts of
the total score.
The derivation of the RM in conditional form is based on Equation 1.4.
For a given θ, the probability of observing a total score t is
\[
P(t \mid \theta) = \sum_{x \mid t} P(x \mid \theta),
\]
which is the sum over all conditional probabilities of response patterns x with
the same total score t. As is easily seen in Equation 1.4, all probabilities in
the above sum share the terms exp(tθ) and exp(α(θ, β)), since these do not
depend on the specific response pattern x, but only on θ and t (and β, which
is of lesser concern at this point).
Conditional inference in the RM uses the specific form of P(x|θ) from
Equation 1.4, which separates terms that depend on the observed data x
from terms that depend only on the total score t or do not at all depend on
the observed data. Then, after some algebra, we may write
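\[
P(x \mid t, \theta) = \frac{P(x \mid \theta)}{P(t \mid \theta)}
= \frac{\exp\bigl(-\sum_{i=1}^{I} \beta_{i x_i}\bigr)}
       {\sum_{x' \mid t} \exp\bigl(-\sum_{i=1}^{I} \beta_{i x'_i}\bigr)}.
\]
The right-hand side of the above expression is independent of θ and contains
only the response vector x and the item parameters β. Integrating over
the person parameter distribution using
$P(x \mid t) = \int_{\theta} P(x \mid t, \theta)\, p(\theta)\, d\theta$ yields
\[
P(x \mid t)
= \frac{\exp\bigl(-\sum_{i=1}^{I} \beta_{i x_i}\bigr)}
       {\sum_{x' \mid t} \exp\bigl(-\sum_{i=1}^{I} \beta_{i x'_i}\bigr)},
\tag{1.5}
\]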
which is the probability of a response vector x in the conditional form of the
RM. This is not to be confused with the integration over the ability distribu-
tion commonly used for more general IRT models in conjunction with MML
estimation methods (Bock & Aitkin, 1981). In contrast to MML estimation,
the integration mentioned above to arrive at the expression in Equation 1.5
does not actually take place during estimation; it is utilized as an algebraic
equivalence to get rid of the θ on the left side of the expression.
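That the conditional probability does not depend on θ can be illustrated by brute-force enumeration for a short test. The sketch below is an illustration under assumed item parameters, not an estimation routine; all names are hypothetical and βi0 = 0 is assumed.

```python
import itertools
import math


def beta_of(x, betas):
    """beta_ix with the convention beta_i0 = 0."""
    return 0.0 if x == 0 else betas[x - 1]


def all_patterns(items):
    """Every response pattern in the space of possible patterns."""
    return itertools.product(*[range(len(b) + 1) for b in items])


def conditional_prob(pattern, items):
    """P(x | t) from Equation 1.5, enumerating patterns with the same score."""
    t = sum(pattern)
    weight = lambda p: math.exp(-sum(beta_of(x, b) for x, b in zip(p, items)))
    denom = sum(weight(p) for p in all_patterns(items) if sum(p) == t)
    return weight(pattern) / denom


def conditional_prob_given_theta(pattern, theta, items):
    """P(x | theta) / P(t | theta), computed from the item model directly."""
    def joint(p):
        prob = 1.0
        for x, betas in zip(p, items):
            denom = 1.0 + sum(
                math.exp(y * theta - b) for y, b in enumerate(betas, start=1)
            )
            prob *= math.exp(x * theta - beta_of(x, betas)) / denom
        return prob

    t = sum(pattern)
    p_t = sum(joint(p) for p in all_patterns(items) if sum(p) == t)
    return joint(pattern) / p_t


items = [[-0.5], [0.0], [0.8]]  # three dichotomous items, hypothetical values
pattern = (1, 1, 0)
p_cond = conditional_prob(pattern, items)
p_low = conditional_prob_given_theta(pattern, -2.0, items)
p_high = conditional_prob_given_theta(pattern, 2.0, items)
```

The three values coincide up to floating-point error, even though the two θ values are far apart. Enumeration grows exponentially with the number of items, so this approach is only meant for checking small cases.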
In this conditional form of the RM, we have an expression for the prob-
ability of a response pattern x, given total score t that is independent of
θ. This eliminates the need either to estimate the ability θ for each exami-
nee or to assume a specific form of ability distribution when estimating item
parameters.
The conditional form of the RM is quite useful when item parameters have
to be estimated from observed data. Conditional estimation of parameters does
not depend on specific assumptions about the ability distribution. This sets
the RM apart from other models for item response data, since most other models
such as the two- and three-parameter item response theory (IRT) models need
additional assumptions about the distribution of person parameters for
estimating item parameters.
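For dichotomous items, the denominator of the conditional form is the elementary symmetric function of order t in the quantities exp(−βi). The text above does not describe how to compute it; the sketch below uses the standard summation recursion as one possible approach, with hypothetical parameter values, and checks it against direct enumeration.

```python
import itertools
import math


def elementary_symmetric(betas):
    """Return [gamma_0, ..., gamma_I] for eps_i = exp(-beta_i).

    gamma_t is the sum, over all dichotomous patterns with total score t,
    of exp(-sum of beta_i over the items answered with 1).
    """
    eps = [math.exp(-b) for b in betas]
    gamma = [1.0] + [0.0] * len(eps)
    for k, e in enumerate(eps, start=1):
        # Go downward so gamma[r - 1] still excludes the current item.
        for r in range(k, 0, -1):
            gamma[r] += e * gamma[r - 1]
    return gamma


betas = [-0.5, 0.0, 0.8, 1.1]  # hypothetical item difficulties
gamma = elementary_symmetric(betas)

# Brute-force check of the order-2 function.
brute = sum(
    math.exp(-sum(b for b, x in zip(betas, pat) if x))
    for pat in itertools.product((0, 1), repeat=len(betas))
    if sum(pat) == 2
)
```

The recursion needs on the order of I² operations instead of enumerating all 2^I patterns, which is why recursions of this kind are commonly used in conditional maximum likelihood software.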
Conditional inferences play an important role in the RM (Fischer & Molenaar,
1995) and in many of the extensions of the RM presented in this volume.
These extensions preserve the defining characteristics of the RM in a way that
enables one to use the RM (or its extensions) in conditional form.

1.1.3 Some Notation for Extended Rasch Models
This section introduces notation that allows one to specify the RM in the
presence of multiple populations and for multiple scales simultaneously. Using
this approach, many extensions presented in this volume can be viewed as
models that assume that the RM holds, with the qualifying condition that
it holds with a different set of parameters in different populations or with a
different ability (person) parameter for each of a set of distinguishable subsets
(scales) of test items.
Assume that there is a many-to-one classification g that maps the person
index v to v → g(v) = c ∈ {1, . . . , C}, so that each examinee v is a member
of exactly one of C populations (classes, groups). In the ordinary RM, C = 1,
and therefore, the population index c is not needed. Also, assume that there
is a real-valued θvk for all v and multiple scales k = 1, . . . , K, and let
$\boldsymbol{\theta}_v = (\theta_{v1}, \ldots, \theta_{vK})$ be the
K-dimensional person parameter.
Let xv = (xv1, . . . , xvI ) be the vector of observed responses for examinee
v ∈ {1, . . . , N}. As above, the categorical responses xvi may be dichotomous
or polytomous ordinal responses, i.e., assume xvi ∈ {0, . . . , mi}. Note that we
keep most of the notation intact; v denotes the examinee index, and N is the
total number of observations. Since there is more than one set of items, the
index k denotes the scale, and the items i = 1, . . . , I are mapped onto the K
scales.
One additional constructive element has to be included. Each item may
belong to exactly one component of ability, say the kth component of
$\boldsymbol{\theta}$, or it may be considered an item that taps into one or
more of the K person parameter components. In the case that the items belong
to more than one ability component k, we speak about within-item
multidimensionality. Otherwise, if each item belongs to exactly one ability
component, we talk about between-item multidimensionality (compare also
Chapter 4 in this volume).
Within-item multidimensionality refers to the assumption that responses
to each item may require multiple ability components (more than one skill or
ability component is required for each item) while between-item multidimen-
sionality refers to the assumption that each item can be solved using only one
skill, but different subsets of items may require different skills.
For the case of within-item multidimensionality, each item i is character-
ized by a vector qi = (qi1, . . . , qiK) that represents the load of each scale on
the ith item. The collection of these vectors into a matrix Q represents the
design of the assessment instrument. The matrix Q determines which items
load on which scales. In the RMs presented here, this design matrix consists
of zeros and ones, predetermined by the researcher. More specifically, the Q-
matrix entries are a hypothesized structure of relationships between required
skills and items, and the matrix entries (loadings) are fixed, not estimated.
Therefore, we may write for the case of within-item multidimensionality
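$$
P_i(x \mid \boldsymbol{\theta}_v, c = g(v)) = \frac{\exp\left[x(\mathbf{q}_i^T \boldsymbol{\theta}_v) - \beta_{ixc}\right]}{1 + \sum_{y=1}^{m_i} \exp\left[y(\mathbf{q}_i^T \boldsymbol{\theta}_v) - \beta_{iyc}\right]}
$$
with $\mathbf{q}_i^T \boldsymbol{\theta} = \sum_k q_{ik}\theta_k$.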
For the case of between-item multidimensionality (each item “loads” on
one scale only), we can define the probability of a response x to item i in scale
k by an examinee v with c = g(v) as
$$
P_i(x \mid \boldsymbol{\theta}_v, c = g(v)) = \frac{\exp(x\theta_{vk} - \beta_{ixc})}{1 + \sum_{y=1}^{m_i} \exp(y\theta_{vk} - \beta_{iyc})}
$$
with real-valued βixc for x = 1, . . . , mi, and βi0c = 0. The two definitions
above are compatible, since the between-item multidimensionality is a special
case of the within-item multidimensionality. If each item loads on only one
scale, the cross product $\mathbf{q}_i^T \boldsymbol{\theta}$ reduces to the one term $\theta_k$ for which $q_{ik} = 1$,
since all other $q_{ik}$ are equal to 0.
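The two response functions above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not code from this volume: the function name `category_probs` and all numeric values are hypothetical, and the Q-matrix row is passed as a plain list of zeros and ones. The sketch computes the category probabilities of one item in one class c and shows that a row with a single 1 reproduces the between-item case.

```python
import math

def category_probs(theta, q_i, beta_ic):
    """Return [P_i(0), ..., P_i(m_i)] for one item in one class c.

    theta   : list of K person parameters (theta_1, ..., theta_K)
    q_i     : list of K zeros/ones, the Q-matrix row of item i
    beta_ic : list of m_i values beta_{ixc} for x = 1, ..., m_i
    """
    beta = [0.0] + list(beta_ic)                   # beta_{i0c} = 0 by convention
    eta = sum(q * t for q, t in zip(q_i, theta))   # q_i^T theta
    logits = [x * eta - b for x, b in enumerate(beta)]
    top = max(logits)                              # guard against overflow
    weights = [math.exp(l - top) for l in logits]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical values: K = 2 scales, one item with m_i = 2.
theta = [0.5, -1.0]
beta = [0.2, 0.9]
p_within = category_probs(theta, [1, 1], beta)    # item loads on both scales
p_between = category_probs(theta, [0, 1], beta)   # item loads on scale 2 only
```

With the row `[0, 1]`, the linear predictor is simply the second component of `theta`, so `p_between` matches the between-item formula evaluated at that component.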
Obviously, if c and k were not present, the above equation would resemble
the ordinary RM from the previous section. Many of the extensions treated
in this volume can be expressed in ways that add a population index (like c),
or a scale index (like k) to the ordinary RM.
In the equations, the probability of the outcome depends on v only through
$\boldsymbol{\theta}_v$ and through c = g(v), so that we may write
$$
P_i(x \mid \boldsymbol{\theta}, c) = \frac{\exp\left[x(\mathbf{q}_i^T \boldsymbol{\theta}) - \beta_{ixc}\right]}{1 + \sum_{y=1}^{m_i} \exp\left[y(\mathbf{q}_i^T \boldsymbol{\theta}) - \beta_{iyc}\right]} \tag{1.6}
$$
by omitting the v in the equation. This holds, since all examinees v, v′ with
identical $\boldsymbol{\theta}_v = \boldsymbol{\theta}_{v'}$ and c = g(v) = g(v′) have the same response probabilities
in the model above.
For a response vector x = (x1, . . . , xI), the probability of this vector is
defined by Equation 1.6 above and the usual assumption of local independence,
that is,
$$
P(\mathbf{x} \mid \boldsymbol{\theta}, c) = \prod_{i=1}^{I} P_i(x_i \mid \boldsymbol{\theta}, c)
$$
with the same definitions as before, i.e., $\boldsymbol{\theta} = (\theta_1, \ldots, \theta_K)$ and c ∈ {1, . . . , C},
and $P_i(x_i \mid \boldsymbol{\theta}, c)$ as defined above.
For between-item multidimensionality, the conditional form of the RM is
easily derived in this framework as well, but it will be obviously dependent on
the scale k and the population c. In that case, the conditional RM becomes
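$$
P(\mathbf{x}_k \mid t_k, c) = \frac{\exp\left(-\sum_{i \mid k(i)=k} \beta_{i x_i c}\right)}{\sum_{\mathbf{x}'_k \mid t_k} \exp\left(-\sum_{i \mid k(i)=k} \beta_{i x'_i c}\right)} \tag{1.7}
$$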
with xk denoting the projection of the response vector that contains only items
of scale k. The total scale score tk is the corresponding sum over only those
items belonging to the kth scale. The conditional RM for scale k in population
c allows one to estimate item parameters for this scale in this population, using
conditional maximum likelihood estimation methods (Fischer & Molenaar,
1995; von Davier & Rost, 1995).
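To make the conditional form concrete, the following sketch evaluates the conditional probability in Equation 1.7 for the items of a single scale in a single class by brute-force enumeration, and compares it with the ratio of the pattern probability to the score probability at a given ability value. The helper names and all parameter values are hypothetical. Enumerating every response pattern is only feasible for a handful of items; it is used here for transparency and is not meant to represent how conditional maximum likelihood software proceeds.

```python
import itertools
import math

def pattern_prob(pattern, theta, item_betas):
    """P(x_k | theta, c) for one scale under local independence."""
    prob = 1.0
    for x, betas in zip(pattern, item_betas):
        beta = [0.0] + list(betas)                 # beta_{i0c} = 0
        weights = [math.exp(y * theta - b) for y, b in enumerate(beta)]
        prob *= weights[x] / sum(weights)
    return prob

def all_patterns(item_betas):
    """Every response pattern; item i has categories 0, ..., m_i."""
    return itertools.product(*(range(len(b) + 1) for b in item_betas))

def conditional_prob(pattern, item_betas):
    """P(x_k | t_k, c) as in Equation 1.7; no ability value is needed."""
    def weight(p):
        return math.exp(-sum(([0.0] + list(b))[x] for x, b in zip(p, item_betas)))
    t = sum(pattern)                               # total scale score t_k
    denom = sum(weight(p) for p in all_patterns(item_betas) if sum(p) == t)
    return weight(tuple(pattern)) / denom

def conditional_via_theta(pattern, theta, item_betas):
    """The same quantity computed as P(x_k | theta) / P(t_k | theta)."""
    t = sum(pattern)
    score_prob = sum(pattern_prob(p, theta, item_betas)
                     for p in all_patterns(item_betas) if sum(p) == t)
    return pattern_prob(pattern, theta, item_betas) / score_prob

# Hypothetical scale: two dichotomous items and one three-category item.
item_betas = [[-0.5], [0.3, 1.1], [0.8]]
pattern = (1, 1, 0)
cond = conditional_prob(pattern, item_betas)
```

Calling `conditional_via_theta(pattern, theta, item_betas)` with any value of `theta` returns the same number as `cond`, up to rounding, which is the property that lets the item parameters be estimated without reference to the person parameter.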

1.1.4 Are These Extensions Still Rasch Models?
Critics of extensions such as the ones presented in this volume may argue
that these models are no longer RMs, since some basic assumptions of the
original model are modified. Even within the group of researchers who use
the original RM, there are arguments as to what is the right way to do so. In
this volume, the majority of extensions of the RM are based on the assumption
that the original RM holds in exhaustive and mutually exclusive subsets of the
item universe and the examinee population. This means that each examinee
belongs to one subpopulation where the RM holds, possibly with a unique set
of item parameters. The same is true for most extensions presented here for
each item; that is, it is assumed that each of the items belongs to one subset
(subscale) for which the original RM holds, but there may be more than
one subscale. A Rasch purist could still analyze these subscales separately,
or analyze subpopulations separately in this case. Such an approach would
retain all the assumptions of the RM by using a more constrained definition
of the target population and/or the item universe. However, if a joint analysis
is desired, an extended model that accommodates differences between items
and subpopulations is required.
The first rule of statistical modeling is that no model ever “really” fits the
data. This is true and can be shown empirically by rigorously testing models
in sufficiently large samples. Still, there is hope in the sense that some models
provide useful summaries of data, so that these summaries are predictive
for some outside variable that was assessed concurrently or even some future
outcome. Model extensions are aimed at improving these capabilities; they are
aimed at improving predictions by including a more complex description of
the observed variables (that is, the item responses), the examinees involved, or
both. This more complex description relates to an increased number of model
parameters that often make either items’ response functions or population
distributions more flexible.
Which of these extensions are legitimate? And for whom? This may often
depend on which group (or subpopulation) the researcher who judges these
extensions belongs in (von Davier, 2006). There are, of course, common sta-
tistical issues that pose problems for any model extension, such as a lack of
identifiability, which all professional groups would agree disqualifies a model
from further consideration. Apart from these, the selection of which exten-
sions are permissible, and which catapult the specific model outside of the
group of “extended” RMs stays somewhat subjective.
As mentioned above, most extensions in this volume maintain basic fea-
tures of the RM such as the conditional sufficiency of raw scores (either in
subpopulations, or as subscores based on subsets of items), the conditional
independence assumption, and the monotonicity assumption. Conditional in-
dependence is given up in only one of the chapters, mainly to account for dif-
ferences in point-biserial correlations among items, which would otherwise be
modeled by allowing a discrimination parameter. Monotone increasing
characteristic functions, in both item easiness (negative difficulty) and in the person
parameter, are the basis for all the models presented in this volume.
Maybe more interesting than the question whether the extensions pre-
sented here may still be called (extended) RMs is whether these models add
value to the statistical analysis of item response data. In many cases, adding
parameters to a model and increasing model-data fit is easy to do, but the
added value of doing so has to be well established in order to justify the
added complexity for the given purpose of the analysis. Molenaar (1997) has
expressed this in very understandable terms that may be paraphrased as “IRT
models are great, even if they never fit the data. But does it matter?” The RM
(and its extensions) set the stage for answering Molenaar’s question. However,
the question whether it matters has to be qualified as, “Does it matter for the
specific purpose one has in mind?”
Applications aimed at variance decomposition using background variables
ask a different question, and therefore may require consideration of a different
type of model extension, than applications aimed at deriving a rank order
of students applying to a higher-education facility. The former purpose is ex-
planatory and tests hypotheses about relationships between variables, whereas
the latter classifies students as admitted versus not admitted. One application
is concerned with the best possible representation of variance components,
whereas the other is concerned with the best possible point estimate for each
student in order to provide the most accurate classification, given data and
model. The chapters in this volume derive extensions of the RM with specific
purposes in mind. The reader is kindly asked to view the chapters with that
in mind, in order to see the scope of applicability of the specific extensions
and to explore the different fields in which the simple and elegant form of the
RM has proven useful as the foundational basis for a more complex statistical
model.
1.2 Overview and Structure of This Volume
Most if not all extensions presented in this volume were created after encoun-
tering the need to model data that are more complex than the RM in its
“pure” form can handle. Some extensions address specific questions and were
driven by some specific research context, whereas other extensions address
more general considerations as to which model assumptions may limit the
applicability of the RM to more complex assessment data.
The chapters within this volume introduce specific extensions or applica-
tions and cross reference to other appropriate chapters. References to work
published outside this volume are also provided to encourage further reading
and to provide a broader view of this area of research as consisting of inter-
connected fields. In this view, it is less important whether a statistical tool
such as hierarchical linear models uses the RM for categorical dependent
variables or whether the RM adopts a more complex population structure that
reminds one of a hierarchical linear model. We hope that it becomes evident
that no matter what prompted a particular development, the merger of the
RM with other statistical methods creates interesting, useful, and rigorously
testable models with applications in a variety of fields. This approach should
provide some guidance for readers and help them to build a cognitive map of
the different extensions of the RM.
This format is applied to the more general chapters as well as to the more
applied chapters, which either contain an overview of relevant applications
or illustrate certain extensions using exemplary studies from various areas of
research.
1.2.1 General Rasch Model Extensions
The first part of this volume covers the ideas guiding these model extensions
and tries to create a framework that helps the reader understand the specific
tools these model extensions provide for researchers. These more conceptual
chapters are an attempt to showcase more generally some ways to think about
deriving model extensions from demands that cannot be fulfilled by a model
that assumes a very strict structure. This part also contains a chapter that
provides some insight into how the expected payoff of extending the RM can
be tested.
The first chapter in this part (Chapter 2) is the most conceptual in the
sense that it lays out what kind of inferences require models that include
strong homogeneity assumptions. Chapter 3 outlines how evidence for the
need for more complex models can be collected and evaluated statistically.
This chapter introduces procedures for testing whether the added complexity
of extended RMs actually helps to describe and understand the data better.
This, in our understanding, is a fundamental requirement of analysis with
complex statistical models, since the added complexity requires more resources
for reporting as well as additional effort for researchers who want to make
sense of the results or who want to use the outcomes in subsequent analysis.
Chapter 4 presents an overview of flexible families of multivariate RMs. These
multivariate RMs are based on the assumption that there is a hypothesis about
the dimensional structure of each observed variable, i.e., each item is related to
one or more of the multiple abilities through a design matrix defined a priori.
This design matrix is often referred to as a Q-matrix in models for student
profiles (Tatsuoka, 1983) and resembles the structural basis for a confirmatory
analysis of a multivariate model. Chapter 5 introduces a very useful way to
specify, estimate, and study extensions of RMs. This chapter shows how RMs
and their extensions can be framed in terms of loglinear models and how these
models can be estimated using software for loglinear models. The final chapter
(Chapter 6) in the first part of this volume describes the family of discrete
mixture distribution RMs (mixed Rasch models, [MRMs]; Rost, 1990; von
Davier & Rost, 1995) and HYBRID RMs. This chapter provides an outline of
the basis for these models as derived from IRT and the RM and as integrated
with latent class analysis (LCA). This unique way of modeling offers tools
to, among other things, handle differential item functioning (DIF) as well as
to test for multidimensionality in the context of discrete mixture distribution
models.
1.2.2 Model Extensions for Specific Purposes
The second part of the volume covers models that were created in response to
a specific problem or research question. Overlap with the first part is inten-
tional, since some of the extensions treated here, even if originally developed
for a specific research question, grew into a broader class of models with ap-
plications in a variety of fields.
The first chapter in this part (Chapter 7) describes a model that allows
one to study developmental processes using repeated measures. This chapter
introduces the saltus model, an extension of the RM that allows one to study
changes in difficulty of tasks over different developmental stages. Chapter 8 in
this part introduces stochastically ordered MRMs for identifying diagnostic
cutscores. Chapter 9 is dedicated to an extension of the HYBRID model that
allows one to study speededness phenomena in detail. This chapter modifies
mixture distribution RMs introduced in the first part of the volume by im-
posing complex equality constraints on them to model the switch between
systematic and random response at a certain point in the response process.
Chapter 10 is a specialization of the multidimensional approach also already
introduced in the first part of the volume. This chapter covers different types
of potential applications of these multidimensional RMs. The fifth chapter in
Part II, Chapter 11, relates the RM and the MRM to discrete latent trait mod-
els, namely to located latent-class models, and compares parameter estimates
from these different latent-variable models.
The following chapter (Chapter 12) introduces MRMs for longitudinal
data. Interestingly, several contributions in this volume use loglinear models,
initially described in Chapter 5, as the common language to describe devel-
opments based on multivariate or mixture-distribution Rasch models. These
loglinear models with unobserved grouping variables are a useful tool that
lends itself nicely to treating this kind of missing-data problem. Chapter 13
extends the RM to allow for differences in discriminations across the range of
items by introducing an interaction rather than a slope parameter. In contrast
to the two-parameter logistic model, the interaction model used in Chapter
13 retains some of the conditional inference features of the RM. The final
chapter in Part II (Chapter 14) is an extension of the RM to complex sam-
ples from hierarchically organized populations that do not lend themselves
easily to drawing simple random samples. This situation is often encountered
in large-scale educational assessments and other survey assessments. Here we
might also assume the development from the other side of the statistical tool-
box, namely that the model basis was a hierarchical linear model that was
extended by a Rasch-type measurement model.

1.2.3 Applications of Extended Rasch Models
The third part of this volume is dedicated to chapters that provide insight
into exemplary applications of extended RMs in various fields of research.
There is a strong link between these chapters and the previous parts, since
the applied work shows how statistical tools that are based on the RM can
help to pose and answer specific questions on data from complex assessments
and/or populations.
The first chapter in this part (Chapter 15) presents a variety of applications
of extended RMs such as mixture distribution RMs in the area of cognitive
psychology. Chapter 16 applies mixture RMs to the task of detecting faking
and response distortions with the aim of identifying candidates who try to
present themselves in a specific way. Chapter 17 talks about applications of
multidimensional RMs in an international educational survey assessment.
Chapter 18 talks about applications of RMs and extensions of RMs to
studying developmental issues. This chapter presents an overview of areas of
application and the limitations of these approaches. Chapter 19 compares an
item response model that uses a parsimonious way to account for guessing by
estimating a constrained three-parameter logistic model with the application
of mixture-distribution RMs to identify and correct for guessing behavior.
Chapter 20 covers extended RMs developed for modeling strategy shifts.
This chapter extends previous work on strategy differences and helps one to
understand how such complex models can be conveniently specified in the
framework of loglinear models. Chapter 21 integrates principles of graphical
models and mixture distribution RMs and presents an application to health
science data. The last chapter in this volume (Chapter 22) presents some
applications of RMs and extensions of RMs to data from sports science and
applied psychology in the motor domains.

Part I
Multivariate and Mixture Rasch Models

2
Measurement Models as Narrative Structures
Robert Mislevy¹ and Chun-Wei Huang²
¹ University of Maryland
² WestEd
2.1 Introduction
An active area in psychometric research has been developing models that
address strategies by which examinees respond to tasks. One purpose of this
chapter is to describe some of these models and the relationships among them.
The other is to lay out a framework for discussing topics of this sort, fore-
grounding the interplay between the technical affordances of the probability-
based psychometric models and the substantive arguments they are meant to
support.
The framework consists of a narrative structure overlaid by a model that
supports probability-based reasoning. The narrative component is a
specialization of Toulmin’s (1958) general argument structure to assessment
arguments (Mislevy, 2003). The component that supports probability-based
reasoning is the measurement model (Mislevy, 1994; Mislevy & Gitomer, 1996).
It is through the narrative structure connecting them to real-world phenomena
that the formal variables and conditional distributions in measurement mod-
els acquire situated meanings. The key feature of the measurement models we
discuss is that the narrative space has been extended beyond the measurement
theme that characterizes trait/differential psychology, to include relationships
among the ways people process information and features of tasks that interact
with their problem-solving—themes that have emerged from the information-
processing research in cognitive psychology in the tradition of Newell and
Simon’s (1972) classic Human Problem Solving.
Section 2.2 reviews Toulmin’s structure for arguments and its application
to educational and psychological testing. Section 2.3 describes the extension
to probability-based models and illustrates with the basic Rasch model (RM)
for dichotomous items (Rasch, 1960). Section 2.4 describes themes that are the
basis of extensions from basic measurement models such as the RM, including
mixtures, differential item functioning (DIF), multiple groups of examinees,
and covariates for tasks based on theories of problem-solving. Section 2.5
describes a number of these models and relates the structures of the probability
models to the structures of the narratives they embody. Section 2.6 concludes
with a comment about contrasting aims of psychometric modeling.
2.2 Assessment Arguments and Measurement Models
Toulmin (1958) proposed a schema for how we use substantive theories and
accumulated experience to reason from particular data to particular claims.
Figure 2.1 outlines the structure of a simple argument. The claim (C) is a
proposition we wish to support with data (D). The arrow represents inference,
which is justified by a warrant (W), a generalization that justifies the inference
from the particular data to the particular claim. Theory and experience—both
personal and formal, such as empirical studies and prior research findings—
provide backing (B) for the warrant. In any particular case we reason back
through the warrant, so we may need to qualify our conclusions because there
may be alternative explanations (A) for the data, which may in turn be
supported or undercut by rebuttal data (R).
[Figure 2.1: diagram with the nodes C, D, W, B, A, and R, linked by the connectors “so”, “since”, “on account of”, “unless”, and “supports”.]
Fig. 2.1. Toulmin’s structure for arguments. Reasoning flows from data (D) to
claim (C) by justification of a warrant (W), which in turn is supported by backing
(B). The inference may need to be qualified by alternative explanations (A), which
may have rebuttal evidence (R) to support them.
The foundation of an educational or psychological assessment argument is
a concept of the nature of proficiency (e.g., knowledge, ability, propensity to
act in certain ways in certain situations). It determines the nature of every
element in the argument structure and the rationale that orchestrates them
as a coherent argument. As Messick (1994) asks, what kinds of things might
one wish to say about persons? What kinds of things does one need to see an
examinee say or do in what kinds of situations? How are they related? The
answers to these questions become claims, data, and warrants respectively in
assessment arguments.
In particular, there are myriad aspects of persons, situations, and per-
sons’ actions within situations to which we might attend, and ways in which
we might characterize them. The conception of proficiency shapes which of
these will constitute data. An assessment argument generally includes three
classes of data: aspects of the circumstances in which the person is acting,
over which an assessment designer generally has principal influence; aspects
of the person’s behavior in the situations, over which the person has principal
influence; and additional knowledge about the person’s history or relationship
to the observational situation that may be further required. These latter fac-
tors are essential to assessment in practice, even though they are often tacit,
embedded in forms and practices.
To illustrate, consider a kind of task often used to assess spatial rotation
(Cooper & Shepard, 1973). A subject is first shown a target figure, in this case
a nonisosceles right triangle with a certain angle, then a second version of the
target, rotated by a specified number of degrees from the target (Figure 2.2).
The subject must indicate whether the stimulus is identical to the target or
a mirror image of it. Lower response latencies are usually taken as evidence
of higher proficiency for tasks like these, but we will use correctness: More-
proficient subjects are posited to be more likely to make correct responses than
less-proficient subjects. The Toulmin diagram for an assessment argument
based on Sue’s correct response to an item of this type is shown in Figure
2.3. Data about the item are each triangle’s acute angle, the rotation from
the target, and whether the stimulus is the same or different. Data about the
subject, such as gender and ethnicity, may not be available to the analyst but
are not relevant in the basic measurement models illustrated here. Reasoning
back through the warrant, the claim is that Sue has a high level of proficiency
in spatial rotation, based on the observation of this item response. We will
address alternative explanations in the next section.
Of course a single item provides meager information, so spatial rotation
tests generally consist of many items of the same kind. As such, the same
warrant applies in each case, as shown in Figure 2.4. Reasoning back through
the same substantive warrant is called for, of course, and it is clear that more
information is available. How is the evidence contained in the data across
tasks to be synthesized, and how might we use this richer body of evidence to
phrase more refined claims? This is where probability-based models come in.
2.3 The Role of Probability Models
2.3.1 From Arguments to Probability Models
Recognizing assessment as an evidentiary argument, one would like a mechanism
to reason back up through the warrant, from data about what examinees
say, do, or make, to claims about their knowledge and proficiencies
more broadly conceived. Probability-based reasoning supports coherent reverse
reasoning, specifically through Bayes’s theorem. We construct a probability
model that approximates key features of the situation in terms of variables
and their interrelationships. Figure 2.5 shows the structure of an IRT
model for the similar-tasks example described earlier. Details appear in the
following section; we first address features illustrated here that are common
to, and characteristic of, psychometric models more generally.

[Figure 2.2: image with panels labeled “Target” and “Stimulus”.]
Fig. 2.2. A spatial rotation item

[Figure 2.3: Toulmin diagram with the following elements, linked by the connectors “so”, “since”, “on account of”, “unless”, and “and”.]
C: Sue has a high value of spatial visualization ability.
W: Students who are high on Spatial Visualization tend to do well on mental rotation tasks.
B: (1) Empirical studies show high correlations between mental rotation tasks and other spatial visualization tasks. (2) These tasks are constructed so that they can be solved using a mental rotation strategy.
A: Sue answered correctly using a strategy that did not require spatial visualization.
D1: Sue answered the triangle item correctly.
D2: Logical structure and contents of this triangle item.
Fig. 2.3. Toulmin diagram for one assessment task. Note that the warrant requires
a conjunction of data about the nature of Sue’s performance and the nature of the
performance situation.

[Figure 2.4: Toulmin diagram with the following elements, linked by the connectors “so”, “since”, “on account of”, “unless”, “supports”, and “and”.]
C: Sue has a high value of spatial visualization.
W: Students who are high on spatial visualization tend to do well on mental rotation items.
B: ...
A: ...
R: ...
D11: Sue's answer to Item 1
D21: structure and contents of Item 1
...
D1n: Sue's answer to Item n
D2n: structure and contents of Item n
Fig. 2.4. Toulmin diagram for several tasks of the same kind. The same general
warrant is employed, as adapted to the particulars of each piece of data as they fit
into the same scheme.

[Figure 2.5: graph in which a node γ, labeled p(γ), is connected to the nodes X1, X2, and X3, labeled p(X1|γ), p(X2|γ), and p(X3|γ).]
Fig. 2.5. Graph for an item response theory (IRT) model
There is an important difference between the variables in a probability
model and the corresponding entities, claims, and data in a Toulmin diagram.
A claim in a Toulmin diagram is a particular proposition that one seeks to
support; a datum is a particular proposition about an aspect of an observation.
A variable addresses not only the particular claim or observation, but also
other claims or observations that could be entertained. As a datum in an
argument, one might say that the response to Item j is correct. As a value of
the item response variable Xj, we would say that the value of Xj is “correct”
or 1 as opposed to “incorrect” or 0. If you know what the value of the variable
is, you also know what it is not.
Whereas a claim in a Toulmin diagram is a particular proposition, a pro-
ficiency variable γ in a psychometric model characterizes ranges or potential
values for selected aspects of proficiency. The possible values of the unobservable,
possibly vector-valued, γ correspond to different states, levels, or configurations
of proficiency. In Figure 2.5, the generic γ takes the particular form
of a real-valued scalar θ that characterizes an examinee’s propensity to make
correct rather than incorrect responses. As formal entities, these variables can
correspond to aspects of proficiency cast in trait, behavioral, information-
processing, developmental, sociocultural, or any psychological perspective;
that same perspective will drive the nature of observations and the relation-
ships between them (Mislevy, 2003)—that is, the view of proficiency in the
space of narratives a given probability model is constructed to support. A
probability distribution over γ indicates knowledge at a given point in time
about what the value of γ might be. The prior probability distribution p(γ)
expresses what is known about a person’s value of γ before responses or values
of covariates are known.
A possibly vector-valued observable variable X characterizes selected as-
pects of a person’s response. X’s are modeled as depending in probability
on the person variables through conditional probabilities p(x|γ). In this for-
mulation the direction of reasoning flows, like the substantive warrant, in a
deductive direction, that is, expectations for what observables might be if
person variables were known.
The support for a substantive claim is expressed in terms of a probability
distribution that represents a degree of belief about corresponding values of γ.
The situated meaning of such a claim arises from the nature of the observations
that it is posited to affect and the substantive grounding of the model. Once
such a model is fit and parameters have been estimated from initial data
(pretest data, or “calibration” data), Bayes’s theorem can be used to update
belief about person variables in light of task performances:
$$
p(\gamma \mid x) = \frac{p(x \mid \gamma)\, p(\gamma)}{p(x)}.
$$
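As a rough numerical illustration of this updating step, the sketch below applies Bayes’s theorem on a discrete grid of γ values. It rests on several assumptions that are not part of the text: the measurement model is taken to be the dichotomous RM with item difficulties treated as known from calibration, the prior is a standard normal shape evaluated on the grid, and the function names and all numbers are hypothetical.

```python
import math

def rasch_prob(gamma, beta):
    """Assumed measurement model: P(X = 1 | gamma) under the dichotomous RM."""
    return 1.0 / (1.0 + math.exp(-(gamma - beta)))

def posterior_on_grid(responses, betas, grid, prior):
    """Return p(gamma | x) at each grid point as p(x | gamma) p(gamma) / p(x)."""
    joint = []
    for g, p_g in zip(grid, prior):
        likelihood = 1.0                           # p(x | gamma)
        for x, b in zip(responses, betas):
            p = rasch_prob(g, b)
            likelihood *= p if x == 1 else 1.0 - p
        joint.append(likelihood * p_g)
    p_x = sum(joint)                               # p(x), the normalizing constant
    return [j / p_x for j in joint]

# Hypothetical grid from -3 to 3 with a normal-shaped prior p(gamma).
grid = [-3.0 + 0.1 * i for i in range(61)]
raw = [math.exp(-0.5 * g * g) for g in grid]
prior = [r / sum(raw) for r in raw]

# Hypothetical calibrated difficulties and one examinee's responses.
betas = [-1.0, 0.0, 0.5, 1.0]
responses = [1, 1, 0, 1]
posterior = posterior_on_grid(responses, betas, grid, prior)
post_mean = sum(g * p for g, p in zip(grid, posterior))
```

Because three of the four hypothetical responses are correct, the posterior mean `post_mean` lies above the prior mean of zero; how far it moves depends on the assumed difficulties and prior.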
The probability model becomes an additional component of a warrant that
permits a quantitative expression of support for claims, and affords the calcu-
lus of probability to synthesize multiple, possibly conflicting, possibly overlap-
ping, pieces of evidence. These advantages do not come for free. Additional
backing is required for the probability-based aspects of the warrant, in the
form of the pretest data. Additional alternative explanations for good or poor
performance are introduced in connection with model misspecification and
data errors.
Shafer (1976) defines a “frame of discernment” as all of the possible subsets
of combinations of values that the variables in an inferential problem at a
given point in time might take. The term “frame” emphasizes how a frame of
discernment circumscribes the universe in which inference will take place. The
term “discernment” emphasizes how a frame of discernment reflects purposive
choices about what is important to recognize in the inferential situation, how
to categorize observations, and from what perspective and at what level of
detail variables should be defined.
Powerful methods are available for reasoning in probability models, for
example, coherent updating of belief about any subset of variables, given new
information about any other subset, clear expression of degree of support for
claims expressed in terms of values of γ’s or X’s, and the capability to express
relationships of considerable subtlety and complexity (Schum, 1994), as might
arise in simulation-based assessments tapping many aspects of knowledge and
producing complex performances with sequential dependencies. These advan-
tages obtain only for inferences that can be expressed in terms of a model’s
frame of discernment, however. The structure of the relationships embodied in
a model may be quite flexible, but they effectively lay out the narrative space
of stories that can be told, in terms of all the possible values that the variables
might take. Questions concerning features of situations or patterns of interac-
tions outside this universe of discourse cannot be asked, let alone answered.
And to the extent that unmodeled patterns do exist in the real-world setting,
they can distort inferences made through the model. We return to this issue
in Section 2.5 in connection with multiple problem-solving strategies.
Model criticism tools do help deal with these problems. Tests of overall
model fit are available. Even more useful are tests for particular suspected
departures, such as adequacy across subsets of the data partitioned by features
not in the model. Some patterns of observables—for example, a given person’s
pattern of responses—may be so improbable under the model as to cast doubt
on using the model for that individual, even if the model fits well in general.
2.3.2 Example: The Rasch Model for Dichotomous Items
The Rasch IRT model for dichotomous items (RM: Rasch, 1960) posits that
the probability of a correct response to Item j given θ takes the following form:
P (Xj = 1|θ, bj) = Ψ (θ − bj) , (2.1)
where Ψ (·) ≡ exp (·)/[1 + exp (·)] is the cumulative logistic probability dis-
tribution, θ is a one-dimensional measure of proficiency, bj is a difficulty
parameter for Item j, and xj is 1 if right and 0 if wrong. Under the usual
IRT assumption of conditional independence, the probability of a vector of
responses to n items is

\[ P(x_1, \ldots, x_n \mid \theta, \beta_1, \ldots, \beta_n) = \prod_{j=1}^{n} P(x_j \mid \theta, \beta_j). \tag{2.2} \]
The RM corresponds to a narrative space in which persons may differ as to
their probability of answering items correctly, specified by θ; items may differ
as to their probabilities of being answered correctly, specified by βj; and the
probability of the outcome when a person with proficiency θ attempts Item
j is given by (2.1). Exactly the same difference in log odds (i.e., ln(p/(1 − p)))
obtains when we compute differences between two given persons for any
item across the collection for which the model is presumed to hold. The only
difference among persons that can be expressed in the model is as to their
overall propensity; all persons with the same evidence about their θ’s (in
the case of a test, all persons with the same total score) are indistinguishable
through the lens of the RM.
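The two properties just described can be checked numerically. The following is a minimal sketch, assuming hypothetical item difficulties and proficiencies chosen only for illustration; the function names are ours, not part of any library. It evaluates (2.1) and (2.2), shows that the log-odds gap between two persons is the same for every item, and shows that the likelihood ratio of two response patterns with the same total score does not depend on θ.

```python
import math

def rasch_prob(theta, b):
    """Eq. 2.1: P(X_j = 1 | theta, b_j) = Psi(theta - b_j)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def pattern_likelihood(responses, theta, difficulties):
    """Eq. 2.2: product over items of P(x_j | theta, beta_j)."""
    likelihood = 1.0
    for x, b in zip(responses, difficulties):
        p = rasch_prob(theta, b)
        likelihood *= p if x == 1 else 1.0 - p
    return likelihood

def log_odds(p):
    return math.log(p / (1.0 - p))

# Hypothetical values, assumed for illustration only.
difficulties = [-1.0, 0.0, 1.5]
theta_a, theta_b = 0.8, -0.3

# The gap in log odds between the two persons, item by item.
log_odds_gaps = [
    log_odds(rasch_prob(theta_a, b)) - log_odds(rasch_prob(theta_b, b))
    for b in difficulties
]

def likelihood_ratio(theta):
    """Ratio of likelihoods for two patterns with the same total score."""
    return (pattern_likelihood([1, 1, 0], theta, difficulties)
            / pattern_likelihood([0, 1, 1], theta, difficulties))

print(log_odds_gaps)
print(likelihood_ratio(-1.0), likelihood_ratio(2.0))
```

Every entry of log_odds_gaps equals theta_a − theta_b, and the likelihood ratio is constant in θ, which is why the total score carries all the evidence about θ under the RM.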
These main-effects patterns for comparing persons and similarly for comparing
items render the RM a probabilistic version of a fundamental measurement
model (Campbell, 1920), specifically, a conjoint measurement model as
described by Luce & Tukey (1964) (see Perline et al., 1979; Fischer, 1968;
Keats, 1971; Michell, 1997, 1999; and Roskam & Jansen, 1984). Although
high-θ persons sometimes answer low-β items incorrectly and vice versa, pat-
terns in which some items are systematically easier for some people than
others lie outside the narrative space of the RM. The narrative theme of fun-
damental measurement accords well with the trait or differential-psychological
perspective.
Model criticism tools such as item fit and person fit indices (e.g., Meijer &
Sijtsma, 2001) allow the analyst to detect situations in which items do not
appear to be equally difficult given overall proficiency for different groups of
persons, or a person’s response patterns are so unlike those of most people with
similar overall proficiencies that the same substantive interpretation is not
supported. In this way the item-level probabilistic framework grounds much
stronger inference than the still-widespread practice of treating all examinees
with the same total score as equivalent, without regard for systematic patterns
within the data that would argue otherwise.
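As an illustration of how a person fit index can separate two examinees with the same total score, the sketch below uses the unweighted mean of squared standardized residuals (often called an outfit mean square). This particular index is our choice for illustration; the text does not single out one statistic. The difficulties and the proficiency value are hypothetical, and θ is treated as known for simplicity.

```python
import math

def rasch_prob(theta, b):
    """Eq. 2.1: probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def outfit_mean_square(responses, theta, difficulties):
    """Mean of squared standardized residuals (x - p)^2 / (p (1 - p))."""
    total = 0.0
    for x, b in zip(responses, difficulties):
        p = rasch_prob(theta, b)
        total += (x - p) ** 2 / (p * (1.0 - p))
    return total / len(responses)

# Hypothetical items ordered from easy to hard, and an assumed theta.
difficulties = [-2.0, -1.0, 0.0, 1.0, 2.0]
theta = 0.5

expected_pattern = [1, 1, 1, 0, 0]   # easy items right, hard items wrong
aberrant_pattern = [0, 0, 1, 1, 1]   # same total score, reversed pattern

fit_expected = outfit_mean_square(expected_pattern, theta, difficulties)
fit_aberrant = outfit_mean_square(aberrant_pattern, theta, difficulties)
print(fit_expected, fit_aberrant)
```

Both patterns yield a total score of 3, yet the second produces a much larger value, flagging a response pattern that is improbable under the model.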
The basic RM does not encompass covariates qj for items or wi for persons.
It is not an explanatory model, in the sense of De Boeck & Wilson (2004):
Considerations of correlates of person proficiency and item difficulty, and thus
substantive explanations of its character and probabilistic tests of conjectures
to this effect, lie outside the model. Tests of whether β’s are invariant across
distinguishable groups of students and examinations of the relationships be-
tween item difficulties and item features are starting points for some of the
extensions we discuss in Section 2.5. Such issues are there incorporated into
measurement models, and the larger universe of inference that can be ad-
dressed in them supports a correspondingly larger narrative space.

2.4 Modeling Solution Processes
The “cognitive revolution” of the 1960s and 1970s, exemplified by Human
Information Processing (Newell & Simon, 1972), called attention to the nature
of knowledge, and how people acquire and use it. How do people represent
the information in a situation? What operations and strategies do they use
to solve problems? What aspects of problems make them difficult, or call for
various knowledge or processes? Strong parallels to computation and artificial
intelligence appear in the use of rules, production systems, task decomposi-
tions, and means–ends analyses. The key insight is modeling problem-solving
in these terms in light of the capabilities and the limitations of human thought
and memory that are revealed by psychological experiments.
Among the tools developed to study cognitive processes is cognitive task
analysis (CTA). CTA is a disciplined process of investigating the knowledge
structures and strategies that individuals at targeted levels of ability use to
solve specific types of tasks, through observable evidence of those structures
and strategies. A CTA seeks to expose (a) essential features of task situations
for eliciting certain behaviors; (b) internal representations of task situations;
(c) the relationship between problem-solving behavior and internal represen-
tation; (d) processes used to solve problems; and (e) task characteristics that
impact problem-solving processes and task difficulty (Newell & Simon, 1972).
In the 1970s, researchers such as Carroll (1976) and Sternberg (1977) stud-
ied test items in these terms as psychological tasks. Others, including Whitely
(1976) and Tatsuoka (Klein et al., 1981), designed aptitude- and achievement-
test items around features motivated by theories of knowledge and perfor-
mance in a given domain. For example, the cognitive model for processing
documents (Mosenthal & Kirsch, 1991) indicates that the difficulty of a task
will be driven by (a) features of the document in question, such as the number
of organizing categories, (b) features of the directive, such as the number
of features that must be matched, and (c) the correspondence between the
two, as determined by the degree to which the document has been designed
to facilitate the inference that must be drawn. In the third edition of the
influential volume Educational Measurement (Linn, 1989), Snow & Lohman
(1989) assert that
Summary test scores, and factors based on them, have often been
thought of as “signs” indicating the presence of underlying, latent
traits. . . . An alternative interpretation of test scores as samples of
cognitive processes and contents, and of correlations as indicating the
similarity or overlap of this sampling, is equally justifiable and could
be theoretically more useful. The evidence from cognitive psychology
suggests that test performances are comprised of complex assemblies
of component information-processing actions that are adapted to task
requirements during performance (p. 317).

Even when considering performances on familiar tasks, the cognitive per-
spective entails a new narrative space, in order to cast claims about persons,
to characterize relevant features of tasks, and to express conjectures about re-
lationships between task features and person performances. A new narrative
space in turn entails a new frame of discernment, to bring into the probabilis-
tic model those features and relationships that are central to the discourse of
cognitive explanation but were irrelevant for the strict purpose of measuring
traits.
We may distinguish five cases for modeling strategy use (extending a list
given by Junker, 1999):
Case 0: No explicit modeling of strategies (basic IRT models)
Case 1: Common strategy presumed across persons.
Case 2: Strategy may differ between persons
Case 3: Strategy may differ between tasks, within persons
Case 4: Strategy may change within task, within persons
The RM discussed in Section 2.3.2 is an example of Case 0. Models below
include Case 1 (the linear logistic test model, or LLTM), Case 2 (mixtures of
RMs, mixtures of LLTMs), and Case 3 (the Andersen/RM).
2.5 A Space of Models
This section describes a number of cognitively motivated extensions of IRT.
Enough research has been done along these lines that a comprehensive review
is beyond the scope of this chapter. We confine attention to tasks
with single right/wrong responses, for example, and to extensions of the RM.
We will illustrate three notable extensions of the narrative space: story lines
that reflect aspects of how persons solve tasks, how features of tasks influence
their difficulty under a given approach, and whether an observer knows
which approach a person is taking for a given task. Interest
lies in how these themes are incorporated into parameters and structures of
conditional probability distributions. The models described below are ordered
approximately in terms of increasing complexity. A strict linear order does not
exist, but cases in which one model can be viewed as an extension of models
discussed previously will be noted.
2.5.1 Differential Item Functioning (DIF)
As noted in Section 2.3.2, it is a common practice in educational and psy-
chological testing to sum over item responses and treat all examinees with
the same total score as interchangeable with respect to whatever the test
is purported to “measure.” It is a matter of some importance that similar
scores for students of different demographic groups based on, say, gender and

race/ethnicity reflect similar performances on the items that comprise a test.
Also as noted in Section 2.3.2, probability-based IRT models, such as the RM,
make it possible to test whether the patterns in a given data set support
this interpretation (Thissen et al., 1993). Differential item functioning (DIF)
means that typical performance on certain items varies substantially across
groups among students with similar overall proficiency. That is, the difficulties
of items vary across known groups of students.
A model that incorporates DIF with respect to manifest groups of students
thus incorporates an observed student covariate w into the probability model:
P (Xj = 1|θ, w, b·j) = Ψ (θ − bwj) , (2.3)
where the item difficulty now depends on group membership, as indicated
by the group index w on group-specific item parameters bwj. It may be the
case that only certain items exhibit DIF across groups. Nevertheless, sub-
stantive interpretations of examinees’ performances, and by extension their
proficiencies, are incomplete without taking their group membership into ac-
count. Equation 2.3 affords no substantive explanation for these differences.
It is compelling to examine the items that differentiate the groups. Is it that
background knowledge differs among different groups of people? Are different
people using different strategies to solve items? Conjectures about patterns
suggested by substantive knowledge about items can be incorporated using
the approach discussed in Section 2.5.3.
It may be found that the RM fits well within the classes determined by
partitioning persons and responses on the basis of w. In these circumstances
one again obtains measurement models in the sense of probabilistic versions
of conjoint measurement.
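A minimal sketch of the manifest-group DIF model (2.3) follows. The group labels and difficulty values are hypothetical; only the second item is given group-specific difficulty, so that two students with the same θ differ in success probability on that item alone.

```python
import math

def psi(z):
    """Logistic function used in Eq. 2.1 and Eq. 2.3."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical group-specific difficulties b_wj; item index 1 shows DIF.
difficulty = {
    "group_a": [-0.5, 0.0, 1.0],
    "group_b": [-0.5, 0.8, 1.0],
}

def dif_prob(theta, w, j):
    """Eq. 2.3: P(X_j = 1 | theta, w) = Psi(theta - b_wj)."""
    return psi(theta - difficulty[w][j])

def dif_items(tolerance=1e-9):
    """Indices of items whose difficulty differs between the two groups."""
    pairs = zip(difficulty["group_a"], difficulty["group_b"])
    return [j for j, (ba, bb) in enumerate(pairs) if abs(ba - bb) > tolerance]

theta = 0.0
for j in range(3):
    print(j, dif_prob(theta, "group_a", j), dif_prob(theta, "group_b", j))
print(dif_items())
```

In practice the group-specific difficulties are estimated and compared statistically; the sketch only shows what the model asserts once they are given.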
2.5.2 Mixtures of Rasch Models
The not-uncommon finding of DIF among manifest groups raises the possi-
bility that this phenomenon may be occurring even when the analyst does
not happen to know persons’ values on the appropriate grouping variable.
Mixture distributions of RMs (e.g., Kelderman & Macready, 1990; Rost, 1990)
incorporate an unobserved student covariate φ into the probability model:
P (Xj = 1|θ, φ, bj) = Ψ (θ − bφj) , (2.4)
where the interpretation is the same as in the DIF model (2.3) except that
now which group a given student belongs to is not known with certainty.
Equation 2.4 can be described in terms of latent trait models within a latent
class model. Given a student’s pattern of observed responses and estimates of
the group-specific item parameters, one uses Bayes’s theorem to compute the
posterior probability that the student belongs to each of the possible latent
groups. A mixture IRT model obtains when patterns of relative difficulty for
certain items appear to differ in a consistent manner in subsets of a data set.

As with DIF models, it is compelling to examine the items that differentiate
the groups discovered in an application of a mixture model.
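The Bayes's theorem step for latent group membership can be sketched as follows. This is a simplified illustration under stated assumptions: the class-specific difficulties and mixing proportions are hypothetical and treated as known, and θ is integrated out over a small discrete grid that we assume as its distribution within each class.

```python
import math

def psi(z):
    return 1.0 / (1.0 + math.exp(-z))

def pattern_likelihood(responses, theta, difficulties):
    """Conditional independence: product of item probabilities under (2.4)."""
    likelihood = 1.0
    for x, b in zip(responses, difficulties):
        p = psi(theta - b)
        likelihood *= p if x == 1 else 1.0 - p
    return likelihood

def marginal_likelihood(responses, difficulties, grid):
    """Average the pattern likelihood over an assumed discrete theta grid."""
    return sum(w * pattern_likelihood(responses, t, difficulties)
               for t, w in grid)

def class_posterior(responses, class_difficulties, class_weights, grid):
    """Posterior probability of each latent class given a response pattern."""
    joint = [pi * marginal_likelihood(responses, b, grid)
             for pi, b in zip(class_weights, class_difficulties)]
    total = sum(joint)
    return [value / total for value in joint]

# Hypothetical setup: (theta value, weight) pairs; weights sum to 1.
grid = [(-2.0, 0.1), (-1.0, 0.2), (0.0, 0.4), (1.0, 0.2), (2.0, 0.1)]
class_difficulties = [
    [-1.0, -1.0, 1.0, 1.0],   # class 0: items 1-2 easy, items 3-4 hard
    [1.0, 1.0, -1.0, -1.0],   # class 1: the reverse ordering
]
class_weights = [0.5, 0.5]

posterior = class_posterior([1, 1, 0, 0], class_difficulties,
                            class_weights, grid)
print(posterior)
```

A pattern with the first two items right and the last two wrong shifts belief toward the class in which the first two items are the easy ones.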
Although the probabilistic version of conjoint measurement would hold
within groups, and this narrative theme could be used in discussing results,
the mixture model of (2.4) is not itself a measurement model in this strict
sense. Whether such models ought to be called measurement models is an
open question.
Glück et al. (2002) provide an interesting example of a mixture RM to
study the effects of strategy training for spatial rotation tasks. Pretest and
posttest subsets of data are distinguished, and within each time point the
proportions of students using a true rotational strategy and a less-effective
pattern-matching strategy are estimated. The efficacy of each strategy as applied to three kinds
of tasks could be predicted, so that when unrestricted-mixture Rasch models
were fit it was possible to identify resulting classes with strategies. They found
that almost all of the students who used the pattern strategy at the pretest
had switched to a spatial strategy at the posttest, after receiving training to
that effect.
2.5.3 The LLTM
In the linear logistic test model (LLTM; Scheiblechner, 1972; Fischer, 1983),
cognitively based features of items and persons’ probabilities of response are
related through a so-called Q-matrix (Tatsuoka, 1983): qjk indicates the degree
to which feature k applies to item j. In simple cases, qjk is 1 if feature k is
present in item j and 0 if not. The LLTM extends the Rasch model by positing
a linear structure for the βj’s:
\[ \beta_j = \sum_{k} q_{jk}\,\eta_k = \mathbf{q}_j^{\prime}\boldsymbol{\eta}, \tag{2.5} \]
where ηk is a contribution to item difficulty entailed by feature k. Features
can refer to a requirement for applying a particular skill, using a particular
piece of information, carrying out a procedure, or some surface feature of an
item, exactly the kinds of elements that Newell & Simon (1972) sought to
uncover in cognitive task analysis as correlates of task difficulty.
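The linear structure in (2.5) amounts to a matrix-vector product. A minimal sketch, assuming a hypothetical Q-matrix with four items and three features and hypothetical feature contributions η:

```python
import math

def lltm_difficulties(q_matrix, eta):
    """Eq. 2.5: beta_j = sum_k q_jk * eta_k for every item j."""
    return [sum(q * e for q, e in zip(row, eta)) for row in q_matrix]

def psi(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical Q-matrix (rows: items, columns: features) and eta values.
q_matrix = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 1],
    [1, 1, 1],
]
eta = [0.5, 0.8, -0.2]

beta = lltm_difficulties(q_matrix, eta)

# A new item is described by its feature row alone, so its difficulty
# can be predicted before any responses to it are collected.
new_item_beta = lltm_difficulties([[0, 0, 1]], eta)[0]

theta = 0.3  # assumed proficiency
print(beta, new_item_beta)
print([psi(theta - b) for b in beta])
```

Because difficulty is carried by the η's rather than by item-specific parameters, the same θ can be interpreted in terms of expected performance on any item whose features are known.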
The LLTM supports probability-based reasoning for a narrative space that
addresses conjectures about the reasons that items are difficult and the na-
ture of proficiency. In particular, any given value of θ can now be interpreted
in terms of expected performance in situations described by their theoreti-
cally relevant features. The LLTM is a measurement model in the sense of
probabilistic conjoint measurement, so it supports the narrative theme of fun-
damental measurement in the comparison of persons.
As an example, Fischer (1983) used the LLTM to model the difficulty
of multistep calculus items, as a function of how many times each of seven
differentiation formulas had to be applied. He used statistical tests to deter-
mine how well the smaller set of features accounted for empirical patterns of

difficulty, and whether repeated applications of a rule contribute additional
increments to difficulty (they did not). A relaxed version of (2.5), the random-weights
LLTM (RW-LLTM; Rijmen & De Boeck, 2002), allows items with
the same features to differ in difficulty, presumably due to nonmodeled item
features. Sheehan & Mislevy (1990) fit the RW-LLTM to a data set using features
based on the Mosenthal & Kirsch (1991) cognitive analysis of document
literacy tasks described in Section 2.4.
One can argue that models such as the LLTM marked a realization of
the call for the synthesis of the “two disciplines” of psychology (Cronbach,
1957), the experimental and the correlational, for they bring substantive the-
ory, task design, and measurement modeling into a unified framework (Em-
bretson, 1985a, 1998). When items are generated in accordance with theory
and patterns among responses can be predicted and tested against that the-
ory, every item provides a new test of the theory (Bejar, 2002). Note that the
reach of the basic LLTM extends only to final responses, not the identities or
the sequences of processes that persons may carry out during the course of
solution.
Behavior at this more detailed level is central to cognitive task analysis,
and inferences at this level are required in many intelligent tutoring systems to
provide feedback or select instruction (e.g., Martin & van Lehn, 1995). This so-called
model tracing lies below the level that can be addressed in the narrative
space supported by the LLTM, but in favorable cases the patterns the LLTM can
address will appear as emergent phenomena in overall performance. Steps in
the direction of model tracing are seen, for example, in the Embretson (1985b)
model for multistep problems: Each step produces a result to be modeled in
terms of RM or LLTM-like structures, and the final product is a stochastic
outcome of step-level outcomes.
2.5.4 Multiple-Group LLTM
By combining elements of the DIF model (2.3) and the LLTM (2.5), one ob-
tains a model that supports narratives about how items with different features
are differentially difficult to members of different manifest groups:
\[ P(X_j = 1 \mid \theta, w, b_j) = \Psi(\theta - b_{wj}), \]
where
\[ \beta_{wj} = \sum_{k} q_{wjk}\,\eta_{wk} = \mathbf{q}_{wj}^{\prime}\boldsymbol{\eta}_w. \tag{2.6} \]
Equation 2.6 shows that both the codings of item covariates, qwjk, and
contributions to item difficulty, ηwk, can differ across groups. That is, both
which features of items are relevant and how they are relevant can differ, pre-
sumably in accordance with a substantive theory that underlies the intended
narrative space. Spada & McGaw (1985), for example, define groups in terms
of educational treatments and pre- and posttest occasions, and item features
in terms of curricular elements.

2.5.5 Multivariate Structured Models
Providing theoretically derived multidimensional characterizations of persons’
knowledge and skills is called cognitive diagnosis (Nichols et al., 1995). Three
features of cognitive diagnostic models are of interest here: First, the model
space, hence the supported narrative space, supports qualified claims about
persons, i.e., claims that have the word “but” in them; for example, “Steven is
familiar with the strategy of space-splitting but he is not sufficiently familiar
with the canopy system to apply it there.” Rather than claims about overall
proficiency in a domain, a multidimensional model accounts for persons’ per-
formance in terms of profiles of knowledge and skill. Second, the tasks used
to provide observations can be complex in that each depends on one or more
of these dimensions of proficiency, and different tasks may impose different
profiles of demand on them. Third, as with task design under the LLTM,
substantive theory guides the construction of both the tasks themselves and
the probability model for analyzing the ensuing performances. The narrative
depends on a conception of how persons with different levels or configura-
tions of proficiency are likely to act in different ways in settings with different
cognitively relevant features.
Most cognitively based multidimensional IRT models posit compensatory
or conjunctive combinations of proficiencies to determine response probabilities.
The reader interested in conjunctive models is referred to Junker &
Sijtsma (2001). In compensatory models, proficiencies combine so that a lack
in one proficiency can be made up with an excess in another proficiency, that
is, $\mathbf{a}_j^{\prime}\boldsymbol{\theta} = a_{j1}\theta_1 + \cdots + a_{jD}\theta_D$, where θ is a D-dimensional vector. The ajd’s
indicate the extent to which proficiency d is required to succeed on item j.
The A-matrix indicating examinee proficiency requirements is analogous to a
Q-matrix specification of task features. A is estimated in some models (Ack-
erman, 1994), but in applications more strongly grounded in cognitive theory
the a’s are treated as known, their values depending on the knowledge and
skill requirements that have been designed into each item. As an example, the
Adams, Wilson, & Wang (1997) multidimensional random coefficients multinomial
logit model (MRCMLM, Chapter 4 in this volume) is a multivariate
generalization of the LLTM. Under the MRCMLM, the probability of a correct
response is modeled as
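\[ \Pr(X_j = 1 \mid \boldsymbol{\theta}, \boldsymbol{\eta}, \mathbf{a}_j, \mathbf{q}_j) = \Psi\left(\mathbf{a}_j^{\prime}\boldsymbol{\theta} + \mathbf{q}_j^{\prime}\boldsymbol{\eta}\right). \tag{2.7} \]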
De Boeck and his colleagues (e.g., Hoskins & De Boeck, 1995; Janssen & De
Boeck, 1997; Rijmen & De Boeck, 2002) have carried out an active program of
research using such models to investigate hypotheses about the psychological
processes underlying item performance. In the extension of the MRCMLM
to polytomous responses, each response category has its own a and q vectors
to indicate which aspects of proficiency are evidenced in that response and
which features of the category context contribute to its occurrence. Different
aspects of proficiency may be involved in different responses to a given item,

and different item features can be associated with different combinations of
proficiencies.
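The compensatory combination in (2.7) can be sketched for the dichotomous case as below. All numerical values are hypothetical, and the a's are treated as known, as in the theory-driven applications described above. The example shows two proficiency profiles that differ dimension by dimension but yield the same response probability because their weighted sums agree.

```python
import math

def psi(z):
    return 1.0 / (1.0 + math.exp(-z))

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def mrcmlm_prob(theta, a_j, q_j, eta):
    """Eq. 2.7 (dichotomous case): Psi(a_j' theta + q_j' eta)."""
    return psi(dot(a_j, theta) + dot(q_j, eta))

# Hypothetical item: draws equally on two proficiencies, has feature 1 only.
a_j = [1.0, 1.0]
q_j = [1, 0]
eta = [-0.5, 0.3]

balanced_profile = [0.5, 0.5]
uneven_profile = [1.5, -0.5]   # a deficit on dimension 2 offset by dimension 1

p_balanced = mrcmlm_prob(balanced_profile, a_j, q_j, eta)
p_uneven = mrcmlm_prob(uneven_profile, a_j, q_j, eta)
print(p_balanced, p_uneven)
```

An item with a different a-vector, for example [1.0, 0.0], would separate the two profiles, which is how tasks with different demand profiles support qualified claims about persons.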
Structured multivariate models such as the MRCMLM provide the means
to solve a thorny problem in task-based language assessment (Mislevy, Steinberg,
& Almond, 2002). Real-world language use draws simultaneously on several
aspects of competence: from phonemic and morphological, through lexical
and syntactic, to pragmatic, substantive, and cultural (Bachman, 1990). Re-
search has shed light on factors that increase challenge in each aspect, such as
the complexity of sentence structures and the familiarity of the content. The
challenge a task presents to a given person depends on both what Robinson
(1984) calls “complexity factors” and what he calls “difficulty factors.” The former he
associates with features, such as syntactic complexity and time pressure, that
by information-processing arguments increase the load for most people. The latter
he associates with features by which a task is rendered differentially hard
or easy for particular persons, such as familiarity with content and prior ex-
perience with the genre.
Tasks differ from one to another with regard to the mix of demands they
offer and the degree to which these demands interact with persons. How might
one make sense of such complex data? The frame of discernment of multivari-
ate structured models can support claims of the desired structure: Within
the probability model, one can (a) characterize task demands with respect
to complexity factors via a Q-matrix, (b) define a multivariate θ in terms of
aspects of proficiency along which persons may differ with respect to planned
variations in tasks tapping Robinson’s “difficulty factors,” and (c) indicate
through an A-matrix which dimensions of θ are involved to what degrees for
each observable variable. A model so constructed embodies generalizations
about how persons with different profiles of language proficiencies are likely
to act in situations with specified features—again, the direction of reasoning
that accords with a warrant. From patterns of performance across tasks with
different profiles of demand, the analyst applies Bayes’ theorem to infer a
person’s profile of proficiency. We have argued elsewhere that the key to ap-
plying such models is defining from the beginning a joint narrative space for
desired inferences, a design space for tasks to support claims so framed, and
a structured probability model for responses that accords with the narrative
and design spaces (Mislevy, Steinberg, Breyer, et al., 2002).
2.5.6 Structured Mixture Models
Cognitive task analyses reveal that different subjects apply different strate-
gies to the same problems (Simon, 1975), including familiar item types from
educational and psychological testing (Kyllonen, Lohman, & Snow, 1984).
Further, comparisons of experts’ and novices’ problem-solving suggest that
the sophistication with which one chooses and monitors strategy use develops
as expertise grows. Strategy use is therefore a potential target for inference

in assessment. This section considers Case 2 models for solution strategies, or
structured mixture models.
The narrative themes embodied in mixed strategy models (e.g., Wilson,
1989; Mislevy & Verhelst, 1990) are these: Different persons may use different
strategies but are presumed to use the same strategy for all items. It is not
known which strategy a person is using. Features of tasks that render them
difficult are posited for each strategy.
Structured mixture models incorporate multiple Q-matrices to differen-
tiate the strategies that may be used to solve the test items. Consider the
case of M strategies; each person applies one of them to all items, and item
difficulty under strategy m depends on features of the task that are relevant
under this strategy in accordance with an LLTM structure. Specifically, the
difficulty of item j under strategy m is $b_{jm} = \sum_k q_{jmk}\eta_{mk}$. Define for each
person the vector parameter φi = (φi1, . . . , φiM ), where φim = 1 if person
i uses strategy m and 0 if not, and denote the proficiency of person i under
strategy m as θim. The probability of a correct response under such a model
takes the form
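\[ \Pr(X_{ij} = 1 \mid \boldsymbol{\theta}_i, \boldsymbol{\phi}_i, \mathbf{q}_j, \boldsymbol{\eta}) = \prod_{m} \left[ \Psi\left( \theta_{im} - \sum_{k} q_{jmk}\,\eta_{mk} \right) \right]^{\phi_{im}}. \tag{2.8} \]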
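Because each φim is 0 or 1, the product in (2.8) simply selects the response probability under the strategy the person is using. A minimal sketch with M = 2 strategies follows; the Q-rows, η's, and proficiencies are hypothetical.

```python
import math

def psi(z):
    return 1.0 / (1.0 + math.exp(-z))

def strategy_difficulty(q_jm, eta_m):
    """b_jm = sum_k q_jmk * eta_mk: difficulty of item j under strategy m."""
    return sum(q * e for q, e in zip(q_jm, eta_m))

def mixture_response_prob(theta_i, phi_i, q_j, eta):
    """Eq. 2.8: product over strategies of Psi(theta_im - b_jm) ** phi_im."""
    prob = 1.0
    for theta_im, phi_im, q_jm, eta_m in zip(theta_i, phi_i, q_j, eta):
        p_m = psi(theta_im - strategy_difficulty(q_jm, eta_m))
        prob *= p_m ** phi_im   # exponent 0 removes strategies not in use
    return prob

# Hypothetical person and item.
theta_i = [0.4, -0.2]            # proficiency under strategy 1 and strategy 2
q_j = [[1, 0], [1, 1]]           # the item's feature rows under each strategy
eta = [[0.3, 0.9], [0.5, 0.7]]   # feature contributions under each strategy

p_if_strategy_1 = mixture_response_prob(theta_i, [1, 0], q_j, eta)
p_if_strategy_2 = mixture_response_prob(theta_i, [0, 1], q_j, eta)
print(p_if_strategy_1, p_if_strategy_2)
```

Since φi is not observed, these strategy-specific probabilities would be weighted by beliefs about strategy use when computing the likelihood of a response pattern.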
As a first example, the Wilson (1989) saltus model (Chapter 7 in this vol-
ume) addresses developing proficiency that occurs in Piagetian stages. Bal-
ance beam tasks are a familiar example (Siegler, 1981). Movement to a new
stage involves the acquisition of a new rule, so that certain classes of tasks
become relatively easier. The saltus model posits that an RM holds across all
items for persons at a given stage, but that these models may differ from one
stage to another by shift parameters that depend on stage membership and
its effect on items in each item class. Figure 2.6 illustrates a saltus model for
three stages, and Table 2.1 shows one way of parameterizing the Q-matrices
and η’s. (The illustrated model is a special case of the saltus model in which
each group of tasks becomes easier by a given amount once a student reaches
a particular stage, and the shifts for that task group are zero before that stage
and constrained to equality across groups thereafter. In an unrestricted saltus
model, the shift parameters for a group of tasks may vary from stage to stage.)
As a second example, consider the finding that subjects may solve putative
mental rotation items such as the one shown as Figure 2.2 either by the anticipated
rotation strategy or by a feature-based analytic strategy (Hochberg &
Gellman, 1977). Mislevy et al. (1991) modeled response times in this situation
in terms of a mixture of the two strategies. The analytic strategy checks
which direction, clockwise or counterclockwise from the right angle, one finds
the sharper angle in the stimulus and target triangles. Difficulty is posited to
increase linearly with degree of rotation under the rotational strategy (Shepard &
Meltzer, 1971), but to depend mainly on the acuteness of the angle
under the analytic strategy. The task in Figure 2.2 would thus be relatively
difficult under the rotational strategy but easy under the analytic strategy.

[Figure: three ability/difficulty scales (θ), one each for Class 1, Class 2, and Class 3, locating the item difficulties b11 to b17, b21 to b27, and b31 to b37, with the shifts η8 and η9 marked.]
Fig. 2.6. Three saltus RMs
As a third and final example, cognitive analysis by Tatsuoka and her col-
leagues (Klein et al., 1981) found that the 530 middle-school students she
studied characteristically solved mixed-number subtraction problems using
one of two strategies:
Method A: Convert mixed numbers to improper fractions, subtract, then
reduce if necessary.
Method B: Separate mixed numbers into whole-number and fractional
parts, subtract as two subproblems, borrowing one from the minuend whole
number if necessary, and then reduce if necessary.
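The two methods can be written out as procedures. The sketch below is our illustration, not Tatsuoka's specification of the subprocesses: a mixed number is represented by the hypothetical tuple (whole, numerator, denominator), and Python's Fraction type performs the final reduction automatically. Method B also reports whether borrowing was needed.

```python
from fractions import Fraction

def method_a(minuend, subtrahend):
    """Convert both mixed numbers to improper fractions, then subtract."""
    (w1, n1, d1), (w2, n2, d2) = minuend, subtrahend
    improper1 = Fraction(w1 * d1 + n1, d1)
    improper2 = Fraction(w2 * d2 + n2, d2)
    return improper1 - improper2   # Fraction keeps the result reduced

def method_b(minuend, subtrahend):
    """Subtract whole and fractional parts separately, borrowing if needed."""
    (w1, n1, d1), (w2, n2, d2) = minuend, subtrahend
    frac1, frac2 = Fraction(n1, d1), Fraction(n2, d2)
    borrowed = frac1 < frac2
    if borrowed:
        w1 -= 1      # borrow one from the minuend's whole number
        frac1 += 1   # and add it to the minuend's fractional part
    return (w1 - w2) + (frac1 - frac2), borrowed

# 7 2/3 - 5 1/3: no borrowing under Method B, larger numerators under Method A.
print(method_a((7, 2, 3), (5, 1, 3)), method_b((7, 2, 3), (5, 1, 3)))

# 2 1/3 - 1 2/3: borrowing is required under Method B.
print(method_a((2, 1, 3), (1, 2, 3)), method_b((2, 1, 3), (1, 2, 3)))
```

Both methods return the same answer for every item; what differs is which subprocesses an item calls on, and that is what makes an item easy under one method and hard under the other.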
Tatsuoka (1983) further detailed the subprocesses required for solution
under each method, and identified the attributes of items that called for the
use of subprocesses under each strategy. An item like 7 2/3 − 5 1/3 is hard under
Method A but easy under Method B; an item like 2 1/3 − 1 2/3 is the opposite.
A response vector with most of the first kind of item right and the second
kind wrong shifts belief toward Method B. The opposite pattern shifts belief
toward the use of Method A. Note that these response patterns constitute
noise, in the form of conflicting evidence, in an overall proficiency model,
but constitute evidence about strategy usage under the mixture model. The
narrative space of how students might be solving problems differently, and
how an observer might see patterns that suggest which, ground a conjecture
that cannot be framed within the overall proficiency model.

Table 2.1. Saltus Q-matrices and η’s for an example with seven items and three
stages

Q-matrix for Class 1
Item  η1  η2  η3  η4  η5  η6  η7  η8  η9
1     1   0   0   0   0   0   0   0   0
2     0   1   0   0   0   0   0   0   0
3     0   0   1   0   0   0   0   0   0
4     0   0   0   1   0   0   0   0   0
5     0   0   0   0   1   0   0   0   0
6     0   0   0   0   0   1   0   0   0
7     0   0   0   0   0   0   1   0   0

Q-matrix for Class 2
Item  η1  η2  η3  η4  η5  η6  η7  η8  η9
1     1   0   0   0   0   0   0   0   0
2     0   1   0   0   0   0   0   0   0
3     0   0   1   0   0   0   0   0   0
4     0   0   0   1   0   0   0   1   0
5     0   0   0   0   1   0   0   1   0
6     0   0   0   0   0   1   0   0   0
7     0   0   0   0   0   0   1   0   0

Q-matrix for Class 3
Item  η1  η2  η3  η4  η5  η6  η7  η8  η9
1     1   0   0   0   0   0   0   0   0
2     0   1   0   0   0   0   0   0   0
3     0   0   1   0   0   0   0   0   0
4     0   0   0   1   0   0   0   1   0
5     0   0   0   0   1   0   0   1   0
6     0   0   0   0   0   1   0   0   1
7     0   0   0   0   0   0   1   0   1

η1: Difficulty parameter for Item 1 in Class 1
η8: Shift for Items 4 and 5 for examinees in Classes 2 and 3
η9: Shift for Items 6 and 7 for examinees in Class 3
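The Q-matrices of Table 2.1 can be generated from the stage number and combined with a common η vector to give class-specific item difficulties. The sketch below follows the coding in the table; the η values are hypothetical, with negative shifts chosen so that the affected items become easier at later stages.

```python
import math

def saltus_q(stage):
    """Q-matrix for one class in Table 2.1 (7 items, columns eta1..eta9)."""
    rows = []
    for item in range(1, 8):
        row = [1 if k == item else 0 for k in range(1, 8)]        # eta1..eta7
        row.append(1 if stage >= 2 and item in (4, 5) else 0)     # eta8
        row.append(1 if stage >= 3 and item in (6, 7) else 0)     # eta9
        rows.append(row)
    return rows

def class_difficulties(stage, eta):
    """Item difficulties for a class: each Q-matrix row times eta."""
    return [sum(q * e for q, e in zip(row, eta)) for row in saltus_q(stage)]

# Hypothetical values: eta1..eta7 are Class 1 difficulties, eta8 and eta9 shifts.
eta = [-1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0, -0.8, -1.2]

for stage in (1, 2, 3):
    print(stage, class_difficulties(stage, eta))
```

Items 1 to 3 keep the same difficulty in every class, Items 4 and 5 shift by η8 from Class 2 onward, and Items 6 and 7 shift by η9 in Class 3 only.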
2.5.7 A Model for Within-Person Mixtures of Strategy Use
The final model we discuss concerns a Case 3 instance of modeling strategy
use. The narrative themes are these: A known fixed set of strategies exists to
solve tasks in a given domain. A person may be using any of the strategies to
solve a given task, although persons differ in their propensities to use different

strategies, and tasks differ, by virtue of their features, to provoke the use of
different strategies. It is observed which strategy a person uses to solve each
task. The inference of interest is, for each person, their propensities to use
each of the strategies.
Examples of domains of tasks in which this narrative space applies can be
found in science, where strategies correspond to conceptions and misconcep-
tions in the domain that have been revealed by cognitive research (McCloskey,
1983). Researchers have developed assessments in which tasks present situa-
tions, and multiple-choice options for predictions or explanations correspond
to particular misconceptions. The Hestenes et al. (1992) force concept inventory
(FCI) is an example. Figure 2.7 gives two examples of the kind of
items found on the FCI, both based on Newton’s third law: “For every action,
there is an equal and opposite reaction.” The first tends to evoke the Newto-
nian response because it is a paradigmatic third-law situation. The second is
equivalent to an expert, but tends to evoke the response based on a common
misconception, namely that the truck exerts more force than the fly because
it has a greater mass.
C.-W. Huang (2003) used an RM studied by Andersen (1973a) to analyze
responses to the FCI, for which responses could all be classified into three
approaches to force and motion problems: Newtonian, impetus theory, and
nonscientific response. The response of Examinee i to Item j is coded as 1,
2, or 3, for the approach used. Each examinee is characterized by three pa-
rameters θik indicating propensities to use each approach, and each item is
characterized by three parameters indicating propensities to evoke each ap-
proach. Strategy choice is modeled as
$$
\Pr(X_{ij} = k \mid \theta_i, \beta_j) = \frac{\exp(\theta_{ik} - \beta_{jk})}{\sum_{m=1}^{3} \exp(\theta_{im} - \beta_{jm})}. \tag{2.9}
$$
We may note that this model assumes that strategy use can be ascertained
as an observable for each task, and that the categorization of strategies is
exhaustive. Note also that the model addresses strategy approach only, not
proficiency in using a given strategy. Were proficiency within strategy also a
target of inference, then data concerning strategy application, such as cor-
rectness, would additionally be required. It would be modeled jointly with
strategy choice, through a model such as the RM where the proficiency for
a given strategy applied only to tasks in which the student was observed to
have used that strategy. Further, if task features related to difficulty under the
various strategies were available, then the within-strategy proficiency models
could feature LLTM-like structures for task difficulties, or MRCMLM-like
structures if the models were multivariate.
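
As a minimal sketch of Equation (2.9), the strategy-choice probabilities for one examinee and one item can be computed as a normalized exponential of the differences θik − βjk. The function name and the numerical values below are hypothetical and chosen only for illustration; they are not taken from Huang (2003).

```python
import numpy as np

def strategy_choice_probs(theta_i, beta_j):
    """Eq. (2.9): probabilities that examinee i uses each approach on item j.

    theta_i: length-3 array of person propensities theta_ik.
    beta_j:  length-3 array of item parameters beta_jk.
    """
    logits = np.asarray(theta_i, dtype=float) - np.asarray(beta_j, dtype=float)
    logits -= logits.max()  # numerical stability only; the ratios are unchanged
    weights = np.exp(logits)
    return weights / weights.sum()

# Hypothetical parameter values (illustration only).
theta_i = np.array([1.0, 0.2, -0.5])  # Newtonian, impetus, nonscientific
beta_j = np.array([0.3, -0.4, 0.0])
probs = strategy_choice_probs(theta_i, beta_j)
```

Under this form, adding the same constant to all three θik leaves the probabilities unchanged, so in practice some identifying constraint on the parameters would be needed.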

[Figure 2.7, item 1: a truck and a car; speeds shown: 20 mph and 20 mph]
What are the forces at the instant of impact?
A. The truck exerts the same amount of force on the car as the car exerts on the truck.
B. The car exerts more force on the truck than the truck exerts on the car.
C. The truck exerts more force on the car than the car exerts on the truck.
D. There's no force because they both stop.

[Figure 2.7, item 2: a truck and a fly; speeds shown: 20 mph and 1 mph]
What are the forces at the instant of impact?
A. The truck exerts the same amount of force on the fly as the fly exerts on the truck.
B. The fly exerts more force on the truck than the truck exerts on the fly.
C. The truck exerts more force on the fly than the fly exerts on the truck.
D. There's no force because they both stop.

Fig. 2.7. Two items testing misconceptions about Newton’s third law
2.6 Closing Comment
Before the advent of item response theory, total scores on putatively similar
tasks were taken a fortiori to be operationally defined measures; of what,
and in what sense of measurement, was to be determined partly through the
thinking that led to the construction of the items and partly through correla-
tions with other scores. Both issues lie outside the scope of the classical test
theoretic model generally used to model uncertainty associated with persons’
scores. Michell (2000) argues that this practice constitutes an abrogation of
responsibility on the part of those who wish to contend that test scores are
measures of quantitative psychological traits, where “measurement” is meant
in the classical sense of the term (Campbell, 1920).
Developments in psychometric models and in cognitive research have
moved the debate forward in ways both anticipated and unanticipated. A
case can be made that the family of RMs does embody the axioms of fun-
damental measure in a falsifiable probabilistic model. Hence the claim of a
quantitative measured trait can be put to the test in any given data set.
The question is becoming not so much whether scores reflect fundamentally
measured attributes as whether the measurement narrative is sufficiently well
approximated to ground applied work. As research provides insight into the
nature of human capabilities, extensions of psychometric models bring hy-
pothesized data patterns into the probabilistic models where they too can be
put to the test. The methodological tools now exist to address the question of
whether psychological attributes are quantitative, and to explore their nature
with experiments and statistical tests. But the same research reveals that much that
is important in the acquisition and use of knowledge is better expressed in
terms other than common measured attributes. Now that tools have at last
been developed to address the fundamental questions of trait psychology, a
future may lie in using those tools for inferences that lie beyond its narrative
space.

3
Testing Generalized Rasch Models
Cees A.W. Glas
3.1 Introduction
Item response theory (IRT) models provide a useful and well-founded frame-
work for measurement in the social sciences. The family of IRT models is still
expanding (see, for instance, De Boeck & Wilson, 2004; Skrondal &
Rabe-Hesketh, 2004), so characterization of the family of IRT models is not easy.
But to provide some demarcation, IRT models can be defined as stochastic
models for multiway data, usually two-way data consisting of responses of
persons to items. An essential feature in this definition of IRT models is pa-
rameter separation, that is, the influences of the various factors, say items
and persons, on the responses are modeled by distinct sets of parameters. (It
must be mentioned here that some authors define IRT more broadly to in-
clude models that are not necessarily based on parameter separation, such as
the distance model by Lazarsfeld (1950b), and the BTL model by Bradley &
Terry (1952), but these models are beyond the scope of this chapter.)
The Rasch model (RM, Rasch, 1960) is just one of many IRT models.
However, the RM has a special place in the family of IRT models because
it represents an approach to measurement in the social sciences that sets it
apart from the rest of IRT. The two approaches can be labeled the measure-
ment approach (for the RM) and the model-fitting approach (for the rest of
IRT). The idea of the model-fitting approach is that the test is a given entity
constructed by experts in some educational, psychological, or sociological do-
main, and the role of the psychometrician is to find a statistical model that
is acceptable for making inferences about the students’ proficiencies and to
attach some measure of reliability to these inferences. The approach is well
documented by Lord (1980). The measurement approach starts with a the-
oretical construct and a set of measurement desiderata, and a measurement
instrument is constructed that fits the measurement model. The measurement
desiderata usually lead to the RM (see, for instance, Fischer, 1995a).
The RM is a quite strict model, so evaluation of model fit has a long
tradition (Andersen, 1973c; Martin-Löf, 1970; Molenaar, 1983; Kelderman,
1984, 1989; Glas, 1988; Glas & Verhelst, 1989, 1995; Klauer, 1989; Ponocny,
2000, 2001). The aim of the present chapter is to present a general framework
for testing the class of models that is the topic of this volume: the class of
generalized RMs. Therefore, we will start with a definition of this class. A
generalized RM is a model in which the likelihood function given the response
patterns xv (v = 1, . . . , N) can be written as
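$$
\begin{aligned}
L(\theta, \beta, \lambda, \phi) &= \prod_{v} p(x_v \mid \theta_v, \beta)\, g(\theta_v \mid \lambda, y_v)\, h(\beta \mid \phi, z) \\
&= \prod_{v} \exp(x_v^{t} A \theta_v - x_v^{t} B \beta)\, c(\theta_v, \beta)\, g(\theta_v \mid \lambda, y_v)\, h(\beta \mid \phi, z),
\end{aligned} \tag{3.1}
$$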
where θv = (θv1, . . . , θvQ) is a vector of the person’s ability parameters, β
is a vector of item parameters, and A and B are matrices of fixed integer
scoring weights. Further, c(θv, β) is a function of the parameters θv, β and
independent of xv, and g(θv|λ, yv) and h(β|φ, z) are the (possibly degener-
ate) densities of the person and item parameters, with parameters λ and φ.
These densities might depend on covariates yv and z. In this chapter, the
factor p(xv|θv, β) will be called the Rasch measurement model and the fac-
tor g(θv|λ, yv)h(β|φ, z) the structural model. It is assumed that persons are
independent, so the complete likelihood is the product over persons. Local
independence between a person’s responses is not assumed, so that the definition
includes some interesting models by Jannarone (1986) that lack this assumption. Further,
no assumption about the format of the responses has been made yet, so the
RM for speed tests (Rasch, 1960) and other models for continuous responses
(Mellenbergh, 1994) are also included in this definition.
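
To make the exponential form in (3.1) concrete, the sketch below checks it for the simplest special case: the dichotomous RM with a single ability dimension and local independence. These restrictions are assumptions of the example, not of the general definition. In this case A can be taken as a column of ones, B as the identity matrix, and c(θ, β) as the product of the terms 1/(1 + exp(θ − βi)). The function names and numerical values are hypothetical.

```python
import numpy as np

def rm_pattern_prob(x, theta, beta):
    """Pattern probability under the dichotomous RM with local independence."""
    p = 1.0 / (1.0 + np.exp(-(theta - beta)))      # P(x_i = 1) for each item
    return float(np.prod(np.where(x == 1, p, 1.0 - p)))

def exp_family_pattern_prob(x, theta, beta):
    """The same probability written as exp(x'A theta - x'B beta) * c(theta, beta)."""
    K = len(beta)
    A = np.ones((K, 1))            # every item scores on the single dimension
    B = np.eye(K)                  # each item has its own parameter
    theta_vec = np.array([theta])
    c = 1.0 / np.prod(1.0 + np.exp(theta - beta))  # free of the responses x
    return float(np.exp(x @ A @ theta_vec - x @ B @ beta) * c)

# Hypothetical values (illustration only).
beta = np.array([-1.0, 0.0, 0.5, 1.2])
x = np.array([1, 0, 1, 1])
direct = rm_pattern_prob(x, 0.3, beta)
factored = exp_family_pattern_prob(x, 0.3, beta)
```

The two computations should agree up to floating-point error, which shows that the response pattern enters the model only through the weighted sums x'A and x'B.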
A general approach to testing this model will be worked out in three esti-
mation settings: the conditional maximum likelihood (CML) framework, the
marginal maximum likelihood (MML) framework, and a Bayesian framework
that will be labeled the MCMC framework (the Markov chain Monte Carlo
framework) for reasons that will become apparent below. The reason for con-
sidering three estimation frameworks is that all three have their specific ranges
of application. In the CML framework, the structural model does not play a
role in the analyses, so this approach is especially suited to evaluate the ap-
propriateness of the measurement model. In MML, the structural model does
play a role, so this approach is suited for testing hypotheses concerning the
measurement and structural model simultaneously. Below, it will be explained
that the feasibility of the MML approach is limited by the dimensionality of
the structural model. If the dimensionality becomes too high, Bayesian meth-
ods based on MCMC become important.
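
The reason the structural model drops out under CML can be illustrated for the dichotomous RM with local independence (an assumption of this sketch): conditional on the sum score, the probability of a response pattern no longer depends on θ. The brute-force enumeration below is only practical for a handful of items, and all names and values are hypothetical.

```python
import itertools
import numpy as np

def rm_pattern_prob(x, theta, beta):
    """Pattern probability under the dichotomous RM with local independence."""
    p = 1.0 / (1.0 + np.exp(-(theta - beta)))
    return float(np.prod(np.where(x == 1, p, 1.0 - p)))

def prob_given_score(x, theta, beta):
    """P(x | sum score, theta, beta), by enumerating all patterns with that score."""
    r = int(x.sum())
    same_score = [np.array(y) for y in itertools.product([0, 1], repeat=len(beta))
                  if sum(y) == r]
    total = sum(rm_pattern_prob(y, theta, beta) for y in same_score)
    return rm_pattern_prob(x, theta, beta) / total

def cml_prob(x, beta):
    """The same conditional probability written without theta."""
    r = int(x.sum())
    # Sum of exp(-y'beta) over all patterns y with sum score r.
    gamma_r = sum(np.exp(-np.dot(y, beta))
                  for y in itertools.product([0, 1], repeat=len(beta))
                  if sum(y) == r)
    return float(np.exp(-np.dot(x, beta)) / gamma_r)

# Hypothetical values (illustration only).
beta = np.array([-1.0, 0.0, 0.5, 1.2])
x = np.array([1, 0, 1, 0])
low_ability = prob_given_score(x, -1.0, beta)
high_ability = prob_given_score(x, 2.0, beta)
theta_free = cml_prob(x, beta)
```

All three values should coincide, so any distribution g(θ | λ, y) placed on the ability parameter cancels from the conditional likelihood.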
Most hypotheses concerning the structural model can be easily tested us-
ing likelihood ratio tests and their Bayesian analogues. These tests are om-
nipresent in this volume, and they will not be treated in this chapter. In this
chapter, the focus is on the fit of the measurement model, with and with-
out the presence of a structural model. Fit of the measurement model can


D.ª Isab.
D. Álv.
Mendoza.
D.ª Isab.
D. Álv.
Mendoza.
D. Álv.
Mendoza.
D. Álv.
Mendoza.
D. Álv.
D.ª Isab.
Mendoza.
D. Álv.
Mendoza.
Á
En grande peligro estoy.
¡Valedme, cielos, valedme!
(Escóndense las dos.)
ESCENA XI.
DON ÁLVARO.—DON JUAN DE MENDOZA; DOÑA
ISABEL, escondida.
Señor Don Juan de Mendoza,
Hablar con vos me conviene
A solas.
Pues solo estoy.
(Ap. al paño.) ¡Qué descolorido viene!
(Ap.) Pues cerraré aquesa puerta.
Cerradla. (Ap. ¡Buen lance es este!)
Ya pues que cerrada está,
Escuchadme atentamente.
En una conversacion
Supe ahora cómo vienen
A buscaros...
Es verdad.
A esta prision...
Y no os mienten.
Quien con el alma y la vida
En aquesta accion me ofende.
(Ap. al paño.) ¿Qué más se ha de declarar?
(Ap.) ¡Cielos! ya no hay quien espere.
Y así, he querido llegar
(Antes que los otros lleguen,
Queriendo efectuar con esto
Amistades indecentes)
En defensa de mi honor.
Eso mi ingenio no entiende.

D. Álv.
D.ª Isab.
D. Álv.
Mendoza.
D. Álv.
Pues yo me declararé.
(Ap. al paño.) Otra vez mi pecho aliente;
Que no soy yo la que busca.
El Corregidor pretende,
Con Don Fernando de Válor,
De Don Juan Malec pariente,
Hacer estas amistades,
Y á mí solo me compete
Estorbarlas. La razon,
Aunque muchas darse pueden,
Yo dárosla á vos no quiero;
Y en fin, sea lo que fuere,
Yo vengo á saber de vos,
Por capricho solamente,
Si es valiente con un jóven
Quien con un viejo es valiente,
Y en efecto, vengo solo
A darme con vos la muerte.
Merced me hubiérades hecho
En decirme brevemente
Lo que pretendeis, porque
Juzgué, confuso mil veces,
Que era otra la ocasion
De más cuidado, porque ese
No es cuidado para mí.
Y puesto que no se debe
Rehusar reñir con cualquiera
Que reñir conmigo quiere;
Antes que esas amistades
Que decís que tratan, lleguen,
Y que os importa estorbarlas
Por la ocasion que quisiereis,
Sacad la espada.
A eso vengo;
Que me importa daros muerte
Más presto que vos pensais.

Mendoza.
D.ª Isab.
Mendoza.
D. Álv.
D.ª Isab.
D. Álv.
D.ª Isab.
D. Álv.
Mendoza.
D. Álv.
Pues campo bien solo es este. (Riñen.)
(Ap. al paño.) De una confusion en otra,
Más desdichas me suceden.
¿Quién á su amante y su hermano
Vió reñir, sin que pudiese
Estorbarlo?
(Ap.) ¡Qué valor!
(Ap.) ¡Qué destreza!
(Ap. al paño.) ¿Qué he de hacerme?
Que veo jugar á dos,
Y deseo entrambas suertes,
Porque van ambos por mí,
Si me ganan ó me pierden...
(Tropezando en una silla, cae Don Álvaro: sale Doña Isabel
tapada y detiene á Don Juan.)
Tropezando en esta silla,
He caido.
¡Don Juan, tente!
(Ap. Pero ¿qué hago? El afecto
Me arrebató desta suerte.) (Retírase.)
Mal hicisteis en callarme
Que estaba aquí dentro gente.
Si á daros la vida estaba,
No os quejeis; que más parece
Que estar conmigo, reñir
Con dos, si á ampararos viene.
Aunque hizo mal, porque yo
De caballero las leyes
Sé tambien; que habiendo visto
Que el caer es accidente,
Os dejara levantar.
Ya tengo que agradecerle
Dos cosas á aquesa dama:
Que á darme la vida llegue,
Y llegue ántes que de vos

Mendoza.
D.ª Isab.
D. Álv.
Mendoza.
D. Álv.
Mendoza.
D.ª Isab.
D. Álv.
Mendoza.
D.ª Isab.
D. Alon.
D.ª Isab.
La reciba, porque quede,
Sin aquesta obligacion,
Capaz mi enojo valiente
Para volver á reñir.
¿Quién, Don Álvaro, os detiene? (Riñen.)
(Ap. al paño.) ¡Oh quién pudiera dar voces!
(Llaman dentro á la puerta.)
A la puerta llama gente.
¿Qué haremos?
Que muera el uno
Y abra luégo el que viviere.
Decís bien.
(Saliendo.) Primero yo
Abriré, porque ellos entren.
No abrais.
No abrais.
(Abre Doña Isabel.)
ESCENA XII.
DON FERNANDO DE VÁLOR, DON ALONSO;
despues, INÉS.—DOÑA ISABEL, tapada; DON
ÁLVARO, DON JUAN DE MENDOZA.
Caballeros.
Los dos que mirais presentes
Se quieren matar.
Teneos,
Porque hallándôs desta suerte
Riñendo á ellos y aquí á vos,
Se dice bien claramente
Que sois la causa.
(Ap.) ¡Ay de mí!

D. Álv.
Mendoza.
D. Alon.
D.ª Isab.
Válor.
Que me he entregado á perderme,
Por donde entendí librarme.
Porque en ningun tiempo llegue
A peligrar una dama
A quien mi vida le debe
El sér, diré la verdad
Y la causa que me mueve
A este duelo. No es de amor,
Sino que como pariente
De Don Juan Malec, así
Pretendí satisfacerle.
Y es verdad, porque esa dama
Acaso ha venido á verme.
Pues que con las amistades
Que ya concertadas tienen,
Todo cesa, mejor es
Que todo acabado quede
Sin sangre, pues vence más
Aquel que sin sangre vence.—(Sale Inés.)
Idos, señoras, con Dios.
(Ap.) Solo esto bien me sucede.
(Vanse las dos.)
ESCENA XIII.
DON ALONSO, DON ÁLVARO, DON JUAN DE
MENDOZA, DON FERNANDO DE VÁLOR.
Señor Don Juan de Mendoza,
A vuestros deudos parece
Y á los nuestros, que este caso
Dentro de puertas se quede
(Como dicen en Castilla),
Y que con deudo se suelde,

Mendoza.
Válor.
Mendoza.
Válor.
Mendoza.
D. Álv.
D. Alon.
Pues dando la mano vos
A Doña Clara, la fénix
De Granada, como parte
Entónces...
La lengua cese,
Señor Don Fernando Válor,
Que hay muchos inconvenientes.
Si es el fénix Doña Clara,
Estarse en Arabia puede;
Que en montañas de Castilla
No hemos menester al fénix,
Y los hombres como yo
No es bien que deudos concierten
Por soldar ajenas honras,
Ni sé que fuera decente
Mezclar Mendozas con sangre
De Malec, pues no convienen
Ni hacen buena consonancia
Los Mendozas y Maleques.
Don Juan de Malec es hombre...
Como vos.
Sí, pues desciende
De los reyes de Granada;
Que todos sus ascendientes
Y los mios reyes fueron.
Pues los mios, sin ser reyes,
fueron más que reyes moros,
Porque fueron montañeses.
Cuanto el señor Don Fernando
En esta parte dijere,
Defenderé yo en campaña.
Aquí de ministro cese
El cargo; que caballero
Sabré ser cuando conviene;
Que soy Zúñiga en Castilla
Antes que Justicia fuese.

Criado.
D. Alon.
Mendoza.
D. Alon.
Mendoza.
D. Alon.
Mendoza.
D. Alon.
Válor.
D. Álv.
Válor.
D. Álv.
Válor.
D. Álv.
Y así, arrimando esta vara,
Adónde y como quisiereis,
Al lado de Don Juan, yo
Haré...
ESCENA XIV.
Un CRIADO.—Dichos.
En casa se entra gente.
Pues todos disimulad;
Que al cargo mi valor vuelve.
Vos, Don Juan, aquí os quedad
Preso.
A todo os obedece
Mi valor.
Los dos os id.
Y si desto os pareciere
Satisfaceros...
A mí
Y á Don Juan, donde eligiereis...
Nos hallaréis con la espada...
Y la capa solamente.
(Vase Don Alonso, y Don Juan de Mendoza va
acompañándole.)
¡Esto consiente mi honor!
¡Esto mi valor consiente!
Porque me volví cristiano,
¿Este baldon me sucede?
Porque su ley recibí,
¿Ya no hay quien de mí se acuerde?
¡Vive Dios, que es cobardía
Que mi venganza no intente!
¡Vive el cielo, que es infamia

Válor.
D. Álv.
Válor.
D. Álv.
Válor.
D. Álv.
Válor.
D. Álv.
Válor.
D. Álv.
Válor.
D. Álv.
Válor.
D. Álv.
Que yo de vengarme deje!
¡El cielo me dé ocasion...
¡Ocasion me dé la suerte...
Que si me la dan los cielos...
Si el hado me la concede...
Yo haré que veais muy presto...
Llorar á España mil veces...
El valor...
El ardimiento
Deste brazo altivo y fuerte...
De los Válores altivos!
De los Tuzanís valientes!
¿Habeis escuchado?
Sí.
Pues de hablar la lengua cese
Y empiecen á hablar las manos.
Pues ¿quién dice que no empiecen?

D. Juan.
JORNADA SEGUNDA.
Sierra de la Alpujarra.—Cercanías de Galera.
ESCENA PRIMERA.
Tocan cajas y trompetas, y salen soldados, DON
JUAN DE MENDOZA y EL SEÑOR DON JUAN DE
AUSTRIA.
Rebelada montaña,
Cuya inculta aspereza, cuya extraña
Altura, cuya fábrica eminente,
Con el peso, la máquina y la frente
Fatiga todo el suelo,
Estrecha el aire y embaraza el cielo:
Infame ladronera,
Que de abortados rayos de tu esfera
Das, preñados de escándalos tus senos,
Aquí la voz y en Africa los truenos.
Hoy es, hoy es el dia
Fatal de tu pasada alevosía,
Porque vienen conmigo
Juntos hoy mi venganza y tu castigo;
Si bien corridos vienen
De ver el poco aplauso que previenen
Los cielos á mi fama;
Que esto matar y no vencer se llama,
Porque no son blasones
A mi honor merecidos

Mendoza.
Postrar una canalla de ladrones
Ni sujetar un bando de bandidos:
Y así, encargue á los tiempos mi memoria
Que la llamo castigo y no vitoria.
Saber deseo el orígen deste ardiente
Fiero motin.
Pues oye atentamente.
Esta, austral águila heroica,
Es el Alpujarra, esta
Es la rústica muralla,
Es la bárbara defensa
De los moriscos, que hoy,
Mal amparados en ella,
Africanos montañeses,
Restaurar á España intentan.
Es por su altura difícil,
Fragosa por su aspereza,
Por su sitio inexpugnable
É invencible por sus fuerzas.
Catorce leguas en torno
Tiene, y en catorce leguas
Más de cincuenta que añade
La distancia de las quiebras,
Porque entre puntas y puntas
Hay valles que la hermosean,
Campos que la fertilizan,
Jardines que la deleitan.
Toda ella está poblada
De villajes y de aldeas;
Tal, que cuando el sol se pone,
A las vislumbres que deja,
Parecen riscos nacidos
Cóncavos entre las breñas,
Que rodaron de la cumbre,
Aunque á la falda no llegan.
De todas las tres mejores

Son Berja, Gavia y Galera,
Plazas de armas de los tres
Que hoy á los demas gobiernan.
Es capaz de treinta mil
Moriscos que están en ella,
Sin las mujeres y niños,
Y tienen donde apacientan
Gran cantidad de ganados;
Si bien los más se sustentan,
Más que de carnes, de frutas
Ya silvestres ó ya secas,
O de plantas que cultivan;
Porque no sólo á la tierra,
Pero á los peñascos hacen
Tributarios de la yerba;
Que en la agricultura tienen
Del estudio, tal destreza,
Que á preñeces de su azada
Hacen fecundas las piedras.
La causa del rebelion,
Por si tuve parte en ella,
Te suplico que en silencio
La permitas á mi lengua.
Aunque mejor es decir
Que fuí la causa primera,
Que no decir que lo fueron
Las pragmáticas severas
Que tanto los apretaron,
Que decir esto me es fuerza:
Si uno ha de tener la culpa,
Más vale que yo la tenga.
En fin, sea aquel desaire
La ocasion, señor, ó sea
Que á Válor al otro dia
Que sucedió mi pendencia,
Llegó el alguacil mayor

Dél, y le quitó á la puerta
Del ayuntamiento una
Daga que traia encubierta;
O sea que ya oprimidos
De ver cuánto los aprietan
Órdenes que cada dia
Aquí de la corte llegan,
Los desesperó de suerte,
Que amotinarse conciertan:
Para cuyo efecto fueron,
Sin que ninguno lo entienda,
Retirando á la Alpujarra
Bastimento, armas y hacienda
Tres años tuvo en silencio
Esta traicion encubierta
Tanto número de gentes:
Cosa que admira y eleva,
Que en más de treinta mil hombres
Convocados para hacerla,
No hubiera uno que jamás
Revelara ni dijera
Secreto de tantos dias.
¡Cuánto ignora, cuánto yerra
El que dice que un secreto
Peligra en tres que le sepan!
Que en treinta mil no peligra,
Como á todos les convenga.
El primer trueno que dió
Este rayo que en la esfera
Desos peñascos forjaban
La traicion y la soberbia,
Fueron hurtos, fueron muertes,
Robos de muchas iglesias,
Insultos y sacrilegios
Y traiciones, de manera
Que Granada, dando al cielo

Bañada en sangre las quejas,
Fué miserable teatro
De desdichas y tragedias.
Preciso acudió al remedio
La justicia; pero apénas
Se vió atropellada, cuando
Toda se puso en defensa:
Trocó la vara en acero,
Trocó el respeto en la fuerza,
Y acabó en civil batalla
Lo que empezó en resistencia.
Al Corregidor mataron:
La ciudad, al daño atenta,
Tocó al arma, convocando
La milicia de la tierra.
No bastó; que siempre estuvo
(Tanto novedades precia)
De su parte la fortuna:
De suerte, que todo era
Desdichas para nosotros.
¡Qué pesadas y qué necias
Son, pues en cuanto porfían,
Nunca ha quedado por ellas!
Creció el cuidado en nosotros,
Creció en ellos la soberbia
Y creció en todos el daño,
Porque se sabe que esperan
Socorro de África, y ya
Se ve si el socorro llega,
Que el defenderle la entrada
Es divertirnos la fuerza:
Además, que si una vez
Pujantes se consideran,
Harán los demas moriscos
Del acaso consecuencia;
Pues los de la Extremadura

Los de Castilla y Valencia,
Para declararse aguardan
Cualquier victoria que tengan.
Y para que veais que son
Gente, aunque osada y resuelta,
De políticos estudios,
Oid cómo se gobiernan;
Que esto lo habemos sabido
De algunas espías presas.
Lo primero que trataron
Fué elegir una cabeza;
Y aunque sobre esta eleccion
Hubo algunas competencias
Entre Don Fernando Válor
Y otro hombre de igual nobleza,
Don Álvaro Tuzaní;
Don Juan Malec los concierta
Con que Don Fernando reine,
Casándose con la bella
Doña Isabel Tuzaní,
Su hermana. (Ap. ¡Oh cuánto me pesa
De traer á la memoria
El Tuzaní, á quien respetan,
Ya que á él no le hicieron rey,
Haciendo á su hermana reina!)
Coronado pues el Válor,
La primer cosa que ordena,
Fué, por oponerse en todo
A las pragmáticas nuestras,
O por tener por las suyas
A su gente más contenta,
Que ninguno se llamara
Nombre cristiano, ni hiciera
Ceremonia de cristiano:
Y porque su ejemplo fuera
El primero, se firmó

D. Juan.
Mendoza.
El nombre de Abenhumeya,
Apellido de los reyes
De Córdoba, á quien hereda.
Que ninguno hablar pudiese,
Sino en arábiga lengua;
Vestir sino traje moro,
Ni guardar sino la secta
De Mahoma: despues desto,
Fué repartiendo las fuerzas.
Galera, que es esa villa
Que estás mirando primera,
Cuyas murallas y fosos
Labró la naturaleza,
Tan singularmente docta,
Que no es posible que pueda
Ganarse sin mucha sangre,
La dió á Malec en tenencia;
A Malec, padre de Clara,
Que ya se llama Maleca.
Al Tuzaní le dió á Gavia
La Alta, y él se quedó en Berja,
Corazon que vivifica
Ese gigante de piedra.
Esa es la disposicion
Que desde aquí se penetra;
Y esa, señor, la Alpujarra,
Cuya bárbara eminencia,
Para postrarse á tus piés,
Parece que se despeña.
Don Juan, vuestras prevenciones
Son de Mendoza y son vuestras,
Que es ser dos veces leales.—
(Tocan dentro.)
Pero ¿qué cajas son estas?
La gente que va llegando,

D. Juan.
Mendoza.
D. Juan.
Mendoza.
D. Juan.
Mendoza.
D. Juan.
Mendoza.
D. Juan.
Mendoza.
D. Juan.
Mendoza.
D. Juan.
Pasando, señor, la muestra.
¿Qué tropa es esa?
Esta es
De Granada, y cuanto riega
El Genil.
¿Y quién la trae?
Tráela el marqués de Mondéjar,
Que es el conde de Tendilla,
De su Alhambra y de su tierra
Perpetuo alcaide.
Su nombre
El moro en África tiembla.—(Tocan.)
¿Cuál es esta?
La de Murcia:
¿Y quién es quien la gobierna?
El gran marqués de los Vélez.
Su fama y sus hechos sean
Corónicas de su nombre. (Tocan.)
Estos son los de Baeza,
Y viene por cabo suyo
Un soldado, á quien debiera
Hacer estatuas la fama,
Como su memoria eterna,
Sancho de Ávila, señor.
Por mucho que se encarezca,
Será poco, si no dice
La voz que alabarle intenta,
Que es discípulo del duque
De Alba, enseñado en su escuela
A vencer, no á ser vencido. (Tocan.)
Aqueste que ahora llega,
El tercio viejo de Flándes
Es, que ha bajado á esta empresa
Desde el Mosa hasta el Genil,
Trocando perlas á perlas.
¿Quién viene con él?

Mendoza.
D. Juan.
Mendoza.
D. Juan.
D. Lope.
D. Juan.
D. Lope.
D. Juan.
P. Lope.
Un monstruo
Del valor y la nobleza,
Don Lope de Figueroa.
Notables cosas me cuentan
De su gran resolucion
Y de su poca paciencia.
Impedido de la gota,
Impacientemente lleva
El no poder acudir
Al servicio de la guerra.
Yo deseo conocerle.
ESCENA II.
DON LOPE DE FIGUEROA.—Dichos.
Voto á Dios, que no me lleva
En aqueso de ventaja
Un átomo vuestra Alteza,
Porque hasta verme á sus piés,
Sólo he sufrido á mis piernas.
¿Cómo llegais?
Como quien,
Señor, á serviros llega
De Flándes á Andalucía;
Y no es mala diligencia,
Pues vos á Flándes no vais,
Que Flándes á vos se venga.
Cúmplame el cielo esa dicha.
¿Traeis buena gente?
Y tan buena,
Que si fuera el Alpujarra
El infierno, y estuviera
Mahoma por alcaide suyo,

Un sold.
Garcés.
D. Juan.
Garcés.
D. Lope.
Garcés.
Alcuzc.
D. Juan.
Entraran, señor, en ella...
Si no es los que tienen gota,
Que no trepan por las peñas,
Porque vienen...
ESCENA III.
Un soldado, GARCÉS, ALCUZCUZ.—Dichos.
(Dentro.) Detenéos.
(Dentro.) Tengo de llegar: afuera.
(Sale Garcés con Alcuzcuz á cuestas.)
¿Qué es esto?
De posta estaba
A la falda desa sierra,
Sentí ruido entre unas ramas,
Páreme hasta ver quién era,
Y ví este galgo que estaba
Acechando detras dellas,
Que sin duda era su espía.
Maniatéle con la cuerda
Del mosquete, y porque ladre
Qué hay allá, le traigo á cuestas.
¡Buen soldado, vive Dios!
¿Esto hay acá?
¡Pues! ¿qué piensa
Vueseñoría que todo
Está en Flándes?
(Ap.) ¡Malo es esta!
Alcuzcuz, á esparto olelde
El nuez del gaznato vuestra.
Ya os conozco: no me cogen
Estas hazañas de nuevas.

Garcés.
D. Juan.
Alcuzc.
D. Juan.
Alcuzc.
D. Juan.
Alcuzc.
D. Juan.
Mendoza.
D. Juan.
Garcés.
¡Oh cómo premian sin costa
Príncipes que honrando premian!
Venid acá.
¿A mé decilde?
Sí.
Ser gran favor tan cerca.
Bien estalde aquí.
¿Quién sois?
(Ap. Aquí importar el cautela.)
Alcuzcuz, un morisquilio,
A quien lievaron por fuerza
Al Ampujarro; que mé
Ser crestiano en me conciencia,
Saber la trina crestiana,
El Credo, la Salve Reina,
El pan nostro, y el catorce
Mandamientos de la Iglesia.
Por decir que ser crestiano,
Darme otros el muerte intentan;
Yo correr, é hoyendo, dalde
En manos de quien me prenda.
Si me dar el vida, yo
Decilde cuanto allá piensan,
Y lievaros donde entreis
Sin alguna resistencia.
(Ap. á Mendoza.) Como presumo que miente,
Tambien puede ser que sea
Verdad.
¿Quién duda que hay muchos
Que ser cristianos profesan?
Yo sé una dama que está
Retirada allá por fuerza.
Pues ni todo lo creamos
Ni dudemos.—Garcés, tenga
Ese morisco por preso...
Yo, yo tendré con él cuenta.

D. Juan.
Mendoza.
Garcés.
Alcuzc.
Garcés.
Alcuzc.
Que en lo que luégo dijere,
Veremos si acierta ó yerra.
Y ahora vamos, Don Lope,
Dando á los cuarteles vuelta,
Y á consultar por qué sitio
Se ha de empezar.
Vuestra Alteza
Lo mire bien, porque aunque
Parece poca la empresa,
Importa mucho; que hay cosas,
Mayormente como estas,
Que no dan honor ganadas,
Y perdidas dan afrenta:
Y así, se debe poner
Mayor atencion en ellas,
No tanto para ganarlas,
Cuanto para no perderlas.
(Vanse Don Juan de Austria, Don Juan de Mendoza, Don
Lope y soldados.)
ESCENA IV.
GARCÉS, ALCUZCUZ.
Vos ¿cómo os llamais?
Arroz;
Que si entre moriscos era
Alcuzcuz, entre crestianos
Seré arroz, porque se entienda
Que menestra mora pasa
A ser crestiana menestra.
Alcuzcuz, ya sois mi esclavo:
Decid verdad.
Norabuena.

Garcés.
Alcuzc.
Garcés.
Alcuzc.
Garcés.
Alcuzc.
Garcés.
Alcuzc.
Garcés.
Alcuzc.
Vos dijisteis al señor
Don Juan de Austria...
¿Que aquél era?
Que le llevariais por donde
Entrada tiene esa sierra.
Sí, mi amo.
Aunque es verdad
Que él á sujetaros venga
Con el marqués de los Vélez,
Con el marqués de Mondéjar,
Sancho de Avila y Don Lope
De Figueroa, quisiera,
Yo que la entrada á estos montes
Solo á mí se me debiera:
Llévame allá, porque quiero
Mirarla y reconocerla.
(Ap. Engañifa á este crestiano
He de hacerle, é dar la vuelta
Al Alpujarra.) Venilde
Conmigo.
Detente, espera;
Que en este cuerpo de guardia
Dejé mi comida puesta
Cuando salí á hacer la posta,
Y quiero volver por ella;
Que en una alforja podré
(Porque el tiempo no se pierda)
Llevarla, para ir comiendo
Por el camino.
Así sea.
Vamos pues.
(Ap.) Santo Mahoma,
Pues tú selde mi profeta,
Lievarme, é á Meca iré,
Aunque ande de ceca en meca.

Válor.
D.ª Isab.
Jardin en Berja.
ESCENA V.
Moriscos y músicos; y detras, DON FERNANDO
VÁLOR y DOÑA ISABEL TUZANÍ.
A la falda lisonjera
Dese risco coronado,
Donde sin duda ha llamado
A córtes la primavera,
Porque entre tantos colores
De su república hermosa
Quede jurada la rosa
Por la reina de las flores,
Puedes, bella esposa mia,
Sentarte. Cantad, á ver
Si la música vencer
Sabe la melancolía.
Abenhumeya valiente,
A cuya altivez bizarra,
No el roble del Alpujarra
Dé corona solamente,
Sino el sagrado laurel,
Arbol ingrato del sol,
Cuando llore el español
Su cautiverio cruel:
No es desprecio de la dicha
Deste amor, desta grandeza,
Mi repetida tristeza,
Sino pension ó desdicha
De la suerte; porque es tal
De la fortuna el desden,
Que apénas nos hace un bien,

Válor.
Cuando le desquita un mal.
No nace de causa alguna
Esta pena, (Ap. ¡A Dios pluguiera!)
Sino sólo desta fiera
Condicion de la fortuna.
Y si ella es tan envidiosa,
¿Cómo puedo yo este miedo
Perder al mal, si no puedo
Dejar de ser tan dichosa?
Si la causa de mirarte
Triste tu dicha ha de ser,
Pésame de no poder,
Mi Lidora, consolarte;
Que habrá tu melancolía
De ser cada dia mayor,
Pues que tu imperio y mi amor
Son mayores cada dia.
Cantad, cantad, su belleza
Celebrad, pues bien halladas,
Siempre traen paces juradas
La música y la tristeza.
(Música.)
No es menester que digais
Cúyas sois, mis alegrías;
Que bien se ve que sois mias
En lo poco que durais.
ESCENA VI.
MALEC, que llega á hablar á DON FERNANDO,
hincada la rodilla; y á los lados, DON ÁLVARO y
DOÑA CLARA, que salen en traje de moros y se
quedan á las puertas; BEATRIZ.—Dichos.

D.ª Clar.
D. Álv.
D.ª Clar.
D. Álv.
D.ª Clar.
D. Álv.
D.ª Clar.
D. Álv.
Ellos.
Malec.
Válor.
Malec.
Válor.
D.ª Clar.
Á
(Ap.) «No es menester que digais
Cúyas sois, mis alegrías...»
(Ap.) «Que bien se ve que sois mias
En lo poco que durais.»
(Siempre suenan los instrumentos, aunque se represente.)
(Ap.) ¡Cuánto siendo haber oido
Ahora aquesta cancion!
(Ap.) ¡Qué notable confusion
La voz en mí ha introducido!
(Ap.) Pues cuando mi casamiento
A tratar mi padre viene...
(Ap.) Pues cuando dichas previene
Amor, á mi amor atento...
(Ap.) Glorias mias, escuchais...
(Ap.) Escuchais mis fantasías...
(Música.)
(Ap.) Que bien se ve que sois mias
En lo poco que durais.
Señor, pues entre el estruendo
De Marte el amor se ve
Tan hallado, bien podré
Decirte cómo pretendo
Dar á Maleca marido.
Quién fué tan feliz, me dí.
Tu cuñado Tuzaní.
Muy cuerda eleccion ha sido,
Pues uno y otro fïel
A preceptos de su estrella,
Él no viviera sin ella,
Y ella muriera sin él.
¿Adónde están?
(Llegan Don Álvaro y Doña Clara.)
A tus piés
Alegre llego.

D. Álv.
Válor.
D. Álv.
D.ª Clar.
Y yo ufano,
Para que nos des tu mano.
Mil brazos tomad, y pues
En nuestro docto alcoran,
Ley que ya todos guardamos,
Más ceremonias no usamos
Que las prendas que se dan
Dos, déle á Maleca divina
Sus arras el Tuzaní.
Todo es poco para tí,
A cuya luz peregrina
Se rinde el mayor farol;
Y así temo, porque arguyo
Que es darle al sol lo que es suyo,
Darle diamantes al sol.
Aqueste un Cupido es,
De sus flechas guarnecido;
Que áun de diamantes Cupido,
Viene á postrarse á tus piés.
Esta una sarta de perlas,
De quien duda quien ignora
Que las llorara el aurora,
Si tú habias de cogerlas.
Esta es un águila bella,
Del color de mi esperanza;
Que sólo un águila alcanza
Ver el sol que mira ella.
Un clavo para el tocado
Es este hermoso rubí,
Que ya no me sirve á mí,
Pues mi fortuna ha parado.
Estas memorias... Mas no
Las tomes; que en tales glorias,
Quiero que tengas memorias
Tú, sin traértelas yo.
Las arras, Tuzaní, aceto,

And, grateful for your love,
To wear them all my life
In your name I promise you.
D.ª Isab.
And I congratulate you
On this immortal bond,
(Aside. Which is to be my misfortune.)
Malec.
Come, then, let your hands give
Glad tidings to the soul.
D. Álv.
Kneeling
At your feet I am.
D.ª Clar.
Let our arms
Form eternal bonds.
Both.
I am happy...
(As they join hands, drums sound offstage.)
All.
But what is this?
Malec.
Those are Spanish drums
That thunder through these crags,
And not Morisco tabors.
D. Álv.
Who ever saw greater confusion?
Válor.
Let the wedding cease, until we see
What novelty has been the cause...
D. Álv.
Do you not already know, my lord?
What greater novelty than my being
Happy? For the sun
Has scarcely looked upon my good fortune
When its pure light is eclipsed
By the arms of the Spaniard.
(The drums sound again.)
SCENE VII.
ALCUZCUZ, with saddlebags over his shoulder. The same.
Alcuzc.
Thanks be to Mahomet and Allah
That I have reached your feet!
D. Álv.
Alcuzcuz, where have you been?
Alcuzc.
Now everyone is here.
Válor.
What has happened to you?
Alcuzc.
I
Was on sentry duty today, and on purpose
I come here, though it was post-haste
That he who seized me from behind
Carried me off with two others
To a Don Juan, who has now arrived;
And, feigning to be a little Christian,
I told him that I believe in God.
He did not put me to death; I became
Captive of a Christian soldier,
Who will not boast of it idly:
No sooner do I tell him
That I know a path by which
One can enter the Alpojarra
Than he wants to see it.
He slips away from his comrades,
And, handing me this saddlebag
In which his food was carried,
By a hidden way
The two of us come walking in.
No sooner do I see him alone
Than, before he can follow,
I flee through the hills, and he is left
Without captive and without food;
For although he tried to follow me,
A troop of Moors that sallied out
Made him flee:
And I come with warning
That I left, now very near,
Don Juan de Andustria in the field,
Who, they say, is accompanied by
The great Marquis of Mondejo
With the Marquis of Luzbel,
He who tames the phlegmatics,
Don Lope Figura-roma,
And Sancho Débil with him:
All of them come today to the Alpojarra
Against you.
Válor.
Say no
More, for you rouse to anger
My ever-gallant pride.
D.ª Isab.
Already from that lofty summit
Where the sun, stumbling,
Fears either to mar its rosy glow
Or to quench its flame,
One can make out, neither well nor ill,
Amid shifting confusion,
The armed squadrons
That tread upon our borders.
D.ª Clar.
A great host has Granada
Brought to this enterprise.
Válor.
Many worlds are few
If they have come to conquer me,
Even were he who subdues
That beautiful labyrinth,
As he is the son of Charles the Fifth,
The son of the fifth planet;
For though they cover these horizons
With martial banners,
These crags shall be their pyre,
These mountains shall be their tomb.
And since the hour
Is now drawing near, let them find us
Forewarned, no longer unprepared,
But awaiting
All their might; and so,
Let each man take his post.
