Chapter 12
THEMATIC APPERCEPTION TEST
Like the Rorschach Inkblot Method (RIM) discussed in the
preceding chapter, the Thematic
Apperception Test (TAT) is a performance-based measure of
personality. This means that
TAT data consist of how people respond to a task they are given
to do, not what they may
say about themselves. In further contrast to self-report
measures, the TAT resembles the
RIM in providing an indirect rather than a direct assessment of
personality characteristics,
which makes it particularly helpful in identifying characteristics
that people do not fully
recognize in themselves or are reluctant to disclose.
The TAT is a storytelling technique in which examinees are
shown pictures of people or
scenes and asked to make up a story about them. The TAT
differs from the RIM in three
key respects. First, being real pictures rather than blots of ink,
the TAT stimuli are more
structured and less ambiguous than the Rorschach cards.
Second, the TAT instructions are
more open-ended and less structured than those used in
administering the RIM. Rorschach
examinees are questioned specifically about where they saw
their percepts and what made
them look as they did. On the TAT, as elaborated in the present
chapter, people are asked
only in general terms to expand on the stories they tell (e.g.,
"What is the person thinking?"). Third, the TAT requires people to exercise their
imagination, whereas the RIM is a
measure of perception and association. Rorschach examinees
who ask whether to use their
imagination should be told, "No, this is not a test of
imagination; just say what the blots
look like and what you see in them." By contrast, TAT takers
who say "I'm not sure what
the people in the picture are doing," or "I don't know what the
outcome will be," can be
told, "This is a test of imagination; make something up."
This distinction between the RIM and the TAT accounts for the
TAT having been called
an apperceptive test. As elaborated in Chapter 11, the RIM was
originally designed as a test
of perception that focused on what people see in the test
stimuli, where they see it, and why
it looks as it does. The TAT was intended to focus instead on
how people interpret what
they see and the meaning they attach to their interpretations,
and the term "apperception"
was chosen to designate this process. The development of the
TAT is discussed further
following the description of the test.
NATURE OF THE THEMATIC APPERCEPTION TEST
The Thematic Apperception Test (TAT) consists of 31
achromatic cards measuring 9¼ x
11 inches. Fourteen of the cards show a picture of a single
person, 11 cards depict two or
more people engaged in some kind of relationship, three are
group pictures of three or four
people, two portray nature scenes, and one is totally blank. The
cards are numbered from
426 Performance-Based Measures
1 to 20, and nine of the cards are additionally designated by
letters intended to indicate
their appropriateness for boys (B) and girls (G) aged 4 to 14,
males (M) and females (F)
aged 15 or older, or some combination of these characteristics
(as in 3BM, 6GF, 12BG, and
13MF). Twenty cards are designated for each age and gender
group.
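The letter designations lend themselves to a small worked illustration. The following Python sketch, which is not part of the TAT manual, splits a designation such as 3BM into its card number and letter codes and checks whether a card is designated for a given group. The function names are hypothetical, and the rule that unlettered cards are suitable for every group is an assumption made here for illustration.

```python
import re

# Letter codes as described in the text:
# B = boys and G = girls (aged 4 to 14), M = males and F = females (15 or older).
GROUPS = {"B", "G", "M", "F"}


def parse_card(designation):
    """Split a designation such as '3BM' into (card number, set of letter codes)."""
    match = re.fullmatch(r"(\d{1,2})([BGMF]{0,2})", designation)
    if match is None:
        raise ValueError(f"not a recognizable card designation: {designation!r}")
    return int(match.group(1)), set(match.group(2))


def suits(designation, group):
    """Return True if the card is designated for the given group.

    Assumption for this sketch: a card with no letters suits every group.
    """
    if group not in GROUPS:
        raise ValueError(f"unknown group code: {group!r}")
    _, letters = parse_card(designation)
    return not letters or group in letters
```

Under these assumptions, suits("13MF", "F") is true, suits("12BG", "M") is false, and an unlettered card such as Card 1 suits all four groups.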
People are asked to tell a story about each of the cards they are
shown. They are told
that their stories should have a beginning, a middle, and an end
and should include what
is happening in the picture, what led up to this situation, what
the people in the picture
are thinking and feeling, and what the outcome of the situation
will be. When people have
finished telling their story about a picture, they are asked to add
story elements they have
omitted to mention (e.g., "How did this situation come about?"
"What is on this person's
mind?" "How is she feeling right now?" "What is likely to
happen next?"). In common
with a Rorschach administration, these TAT procedures
generate structural, thematic, and
behavioral data that provide a basis for drawing inferences
about an individual's personality
characteristics.
Structural Data
All TAT stories have a structural component that is defined by
certain objective features
of the test protocol. The length of the stories people tell can
provide information about
whether they are approaching this task (and perhaps other
situations in their lives as
well) in a relatively open and revealing fashion (long stories) or
in a relatively guarded
manner that conceals more than it reveals (short stories). Story
length can also provide clues
to a person's energy level, perhaps thereby identifying
depressive lethargy in one person
(short stories) and hypomanic expansiveness in another person
(long stories), and clues to
whether the individual is by nature a person of few or many
words. Shifts in the length of
stories from one card to the next, or in the reaction time before
the storytelling begins, may
identify positive or negative reactions to the typical themes
suggested by the cards, which
are described later in the chapter.
The amount of detail in TAT stories provides another
informative structural element of
a test protocol. Aside from their length in words, TAT stories
can vary in detail from a
precisely specified account of who is doing what to whom and
why (which might reflect
obsessive-compulsive personality characteristics), to a vague
and superficial description of
people and events that suggests a shallow style of dealing with
affective and interpersonal
experience. A related structural variable consists of the number
and type of stimulus
details that are noted in the stories. Most of the TAT pictures
contain (a) some prominent
elements that are almost always included in the stories people
tell; (b) some minor figures
or objects that are also included from time to time; and (c)
many peripheral details that are
rarely noted or mentioned. Card 3BM, for example, depicts a
person sitting on the floor
(almost always mentioned), a small object on the floor by the
person's feet (frequently but
not always mentioned), and a piece of furniture on which the
person is leaning (seldom
mentioned). Divergence from these common expectations can
have implications for how
people generally pay attention to their surroundings,
particularly with respect to whether
they tend to be inattentive to what is obvious and important, or
whether instead they are
likely to become preoccupied with what is obscure or of little
relevance.
Also of potential interpretive significance is the extent to which
TAT stories revolve
around original themes or common themes. A preponderance of
original themes may reflect
creativity and openness on the part of the examinee, whereas
consistently common themes
often indicate conventionality or guardedness. As additional
structural variables, the
coherence and rationality of stories can provide clues to
whether people are thinking
clearly and logically, and the quality of the vocabulary usage
and grammatical construction
in people's stories usually says something about their
intellectual level and verbal
facility.
Thematic Data
All TAT stories have a thematic as well as a structural
component. Like the thematic
imagery that often emerges in Rorschach responses, the content
of TAT stories provides
clues to a person's underlying needs, attitudes, conflicts, and
concerns. Because they depict
real scenes, the TAT cards provide more numerous and more
direct opportunities than the
Rorschach inkblots for examinees to attribute characteristics to
human figures in various
circumstances. Typical TAT stories are consequently rich with
information about the depicted
characters' aspirations, intentions, and expectations that
will likely reveal aspects of
how people feel about themselves, about other people, and
about their future prospects.
These kinds of information typically derive from four
interpretively significant aspects
of the imagery in TAT stories:
1. How the people in a story are identified and described (e.g.,
"young woman," "president of a bank," "good gymnast") and whether examinees
appear to be identifying
with these people or seeing them as representing certain other
people in their lives
(e.g., parent, spouse).
2. How the people in a story are interacting; for example,
whether they are helping or
hurting each other in some way.
3. The emotional tone of the story, as indicated by the specific
affect attributed to the
depicted characters (e.g., happy, sad, angry, sorry, enthused,
indifferent).
4. The plot of the story, with particular respect to outcomes
involving success or failure,
gratification or disappointment, love gained or lost, and the
like.
Behavioral Data
As when they are responding to the RIM and other performance-based
measures of personality, the way people behave and relate to the examiner
during a TAT administration
provides clues to how they typically approach task-oriented and
interpersonal situations.
Whether they appear self-assured or tentative, friendly or surly,
assertive or deferential, and
detached or engaged can characterize individuals while they are
telling their TAT stories,
and these test behaviors are likely to reflect general traits of a
similar kind.
Unlike the situation in Rorschach assessment, the structural,
thematic, and behavioral
sources of data in TAT assessment are not potentially
equivalent in their interpretive significance.
As discussed in Chapter 11, either the structural, the
thematic, or the behavioral
features of Rorschach responses may turn out to be the most
revealing and reliable source
of information about an individual's personality functioning,
and it cannot be determined
in advance which one it will be. On the TAT, by contrast, the
thematic imagery in the
stories almost always provides more extensive and more useful
data than the structural and
behavioral features of a test protocol.
Moreover, because the TAT pictures portray real-life situations,
and because test takers
are encouraged to embellish their responses, TAT stories are
likely to generate a greater
number of specific hypotheses than the thematic imagery in
Rorschach responses concerning
an individual's underlying needs, attitudes, conflicts, and
concerns. TAT stories tend
to help identify particular persons and situations with whom
various motives, intentions,
and expectations are associated. With respect both to the inner
life of people and the nature
of their social relationships, then, the TAT frequently provides
more information than the
RIM.
HISTORY
As befits a storytelling technique, the TAT emerged as the
outcome of an interesting and
in some respects unlikely story. Like the history of Rorschach
assessment, the TAT story
dates back to the first part of the twentieth century, but it is an
American rather than a
European story. The tale begins with Morton Prince, a Boston-born
and Harvard-educated
neurologist who lectured at Tufts Medical College and
distinguished himself as a specialist
in abnormal psychology. Along with accomplishments as a
practitioner, teacher, and author
of the original work on multiple personality disorder (Prince,
1906), Prince founded the
Journal of Abnormal Psychology in 1906 and served for many
years as its editor. By the
mid-1920s, he had come to believe that a university setting
would be more conducive to
advances in psychopathology research than the traditional locus
of such research in medical
schools, where patient care responsibilities often take
precedence over scholarly pursuits.
In 1926, Prince offered an endowment to Harvard University to
support an academic center
for research in psychopathology. The university accepted his
offer and established for this
purpose the Harvard Psychological Clinic, with Prince as its
first director.
On assuming the directorship of the Harvard Psychological
Clinic, Prince looked to hire a
research associate who would plan and implement the programs
of the new facility. Acting
on the recommendation of an acquaintance, but apparently
without benefit of a search
committee or consultation with the Harvard psychology faculty,
he hired an ostensibly
unqualified person for the job: a surgically trained physician and
PhD biochemist named
Henry Murray. Two years later, Prince retired, and in 1928
Murray succeeded him as
Director of the Clinic, a position for which, according to his
biographer, Murray "was the
first person to admit that he was unqualified ... though he had
done a good bit of reading"
(Robinson, 1992, p. 142).
Henry Murray was to become one of the best-known and most highly
respected personality
theorists in the history of psychology. He remains recognized
today for his pioneering
emphasis on individual differences rather than group
tendencies, which as noted in Chapter
1 (see pp. 12-13) became identified in technical terms as an
idiographic approach to the
study of persons (as distinguished from a nomothetic approach
emphasizing characteristics
that differentiate groups of people). The main thrust of what
Murray called "personology"
was attention to each person's unique integration of
psychological characteristics, rather
than to the general nature of these characteristics. For Murray,
then, the study of personality
consisted of exploring individual experience and the kinds
of lives that people lead,
rather than exploring the origins, development, and
manifestations of specific personality
characteristics like dependency, assertiveness, sociability, and
rigidity (see Barenbaum &
Winter, 2003; Hall, Lindzey, & Campbell, 1998, chap. 5). Most
of all, however, Murray is
known for having originated the Thematic Apperception Test.
When Murray ascended to the directorship of the Harvard
Psychological Clinic in 1928,
there was little basis for anticipating his subsequent
contributions to psychology. Born and
reared in New York City, he had studied history as an
undergraduate at Harvard, received his
medical degree from Columbia in 1919, done a 2-year surgical
internship, and then devoted
himself to laboratory research that resulted in 21 published
articles and a 1927 doctorate in
biochemistry (Anderson, 1988, 1999; Stein & Gieser, 1999). As
counterpoint to Murray's
limited preparation for taking on his Harvard Clinic
responsibilities, two personal events
in the mid-1920s had attracted him to making this career
change. One of these events
was reading Melville's Moby Dick and becoming fascinated
with the complexity of the
characters in the story, particularly the underlying motivations
that influenced them to act
as they did.
The second event was meeting and beginning a lifelong
friendship with Christiana Morgan,
an artist who was enamored of the psychoanalytic
conceptions of Carl Jung. Morgan
encouraged Murray to visit Jung in Switzerland, which he did in
1925. He later stated
that, in 2 days of conversation with Jung, "enough affective
stuff erupted to invalid a pure
scientist" (Murray, 1940, p. 153). These events and his
subsequent extensive reading in the
psychological and psychoanalytic literature, combined with his
background in patient care
and laboratory research, made him far better prepared to head
up the Harvard Psychological
Clinic than his formal credentials would have suggested. He
later furthered his own
education by entering a training program in psychoanalysis,
which he completed in 1935.
During his tenure as Director from 1928 to 1943, Murray staffed
the Harvard Psychological
Clinic with a highly talented group of young scholars
and clinicians, many of whom
went on to distinguished careers of their own. Under his
direction, the clinic gained worldwide
esteem for its theoretical and research contributions to the
literature in personality and
psychopathology. As his first major project, Murray
orchestrated an intensive psychological
study of 50 male Harvard students, each of whom was
assessed individually with over
20 different procedures. Included among these procedures was a
picture-story measure in
which Murray had become interested in the early 1930s. A
conviction had formed in his
mind that stories told by people can reveal many aspects of
what they think and how they
feel, and that carefully chosen pictures provide a useful
stimulus for eliciting stories that
are rich in personal meaning. In collaboration with Morgan, he
experimented with different
pictures and eventually selected 20 that seemed particularly
likely to suggest a critical
situation or at least one person with whom an examinee would
identify. These 20 pictures
constituted the original version of the TAT, first described in
print by C. D. Morgan and
Murray (1935) as "a method for investigating fantasies."
The results of Murray's Harvard study were published in a
classic book, Explorations
in Personality, which is best known for presenting his
idiographic approach to studying
people and his model of personality functioning (Murray, 1938).
In Murray's model, each
individual's personality is an interactive function of "needs,"
which are the particular
motivational forces emerging from within a person, and
"presses," which are environmental
forces and situations that affect how a person expresses these
needs. Less well-known or
recalled is that the 1938 book was subtitled A Clinical and
Experimental Study of Fifty
Men of College Age. After elaborating his personality theory in
terms of 29 different needs
and 20 different presses in the first half of the book, Murray
devoted the second half to
presenting the methods and results of the 50-man Harvard study.
The discussion of this
research included some historically significant case studies that
illustrated for the first time
how the TAT could be used in concert with other assessment
methods to gain insight into
the internal pressures and external forces that shape each
individual's personality.
The original TAT used in the Harvard Clinic Study was
followed by three later versions
of the test, as C. D. Morgan and Murray continued to examine
the stimulus potential of
different kinds of paintings, photographs, and original
drawings. The nature and origins of
the pictures used in four versions of the test are reviewed by W.
G. Morgan (1995, 2002,
2003). The final 31-card version of the test was published in
1943 (Murray, 1943/1971) and
remains the version in use today. Ever the curious scientist,
Murray might have continued
trying out new cards, according to Anderson (1999), had he not
left Harvard for Washington,
D.C., in 1943 to contribute to the World War II effort. Murray
was asked to organize an
assessment program in the Office of Strategic Services (OSS),
the forerunner of the CIA,
for selecting men and women who could function effectively as
spies and saboteurs behind
enemy lines. A fascinating account of how Murray and his
colleagues went about this task
and the effectiveness of the selection procedures they devised
was published after the war
by the OSS staff (Office of Strategic Services, 1948), and
Handler (2001) has more recently
prepared a summary of this account.
As Rorschach had done with his inkblots, Murray developed a
scheme for coding
stories told to the TAT pictures. Also in common with
Rorschach's efforts, but for different
reasons, Murray's coding scheme opened the door for
modification in the hands of others.
Rorschach's system was still sketchy at the time of his death
and left considerable room for
additions and revisions by subsequent systematizers (see
Chapter 11). By contrast, Murray
(1943/1971) presented in his manual a detailed procedure for
rating each of 28 needs
and 24 presses on a 5-point scale for their intensity, duration,
frequency, and importance
whenever they occur in a story. This complex scoring scheme
proved too cumbersome
to gain much acceptance among researchers and practitioners
who took up the TAT after
its 1943 publication made it widely available. Consequently, as
elaborated by Murstein
(1963), many other systems for interpreting the TAT emerged
over the next 15 to 20 years;
some of them followed Murray in emphasizing content themes, and
others attended as well to
structural and thematic features of stories.
Several of these new systems were proposed by psychologists
who had worked with
Murray at the Harvard Psychological Clinic, notably Leopold
Bellak (1947), William
Henry (1956), Edwin Shneidman (1951), Morris Stein (1948),
and Silvan Tomkins (1947).
Shneidman (1965) later wrote that the TAT had quickly become
"everybody's favorite
adopted baby to change and raise as he wished" (p. 507). Of
these and other TAT systems
that were devised in the 1940s and 1950s, only variations of an
"inspection technique"
proposed by Bellak became widely used. Currently in its sixth
edition, Bellak's text recommends
an approach to TAT interpretation in which an
individual's stories are examined for
repetitive themes and recurring elements that appear to fall
together in meaningful ways
(Bellak & Abrams, 1997). This inspection technique is
described further in the coding and
interpretation sections of the present chapter.
Aside from proposing different systems for interpreting TAT
stories, assessment psychologists
have at times suggested four reasons for modifying
the TAT picture set that
Murray published in 1943. The first of these reasons concerns
whether the standard TAT
pictures are suitable for use with young children or the elderly.
Young children may identify
more easily with animals than with people, some said, and the
situations portrayed in the
standard picture set do not adequately capture the life
experiences of older persons. In
light of these possibilities, Bellak developed two alternative
sets of pictures: the Children's
Apperception Test (CAT), intended for use with children aged 3
to 10 and portraying animal
rather than human characters, and the Senior Apperception
Test (SAT), which depicts
primarily elderly people in circumstances they are likely to
encounter (Bellak, 1954, 1975;
Bellak & Abrams, 1997). Little has been written about the
utility of the SAT, however,
and the development of the CAT appears to have been
unnecessary. Research reviewed by
Teglasi (2001, chap. 8) has indicated that children tell equally
or even more meaningful
stories to human cards than they do to animal cards.
A second reason for questioning the appropriateness of the
standard TAT set is that all
the figures in it are Caucasian. Efforts to enhance
multicultural sensitivity in picture-
story assessment, particularly in the evaluation of children and
adolescents, led to the
development of the Tell-Me-A-Story test (TEMAS; Costantino,
Malgady, & Rogler, 1988).
The TEMAS is a TAT-type measure for use with young people
aged 5 to 18 in which the
stimulus cards portray conflict situations involving African
American and Latino characters.
Research with the TEMAS pictures has confirmed that they are
likely to elicit fuller
and more revealing stories from minority individuals than the
all-Caucasian TAT pictures
(Costantino & Malgady, 1999; Costantino, Malgady, Rogler, &
Tsui, 1988), and there are
also indications that the TEMAS has cross-cultural applicability
in Europe as well as within
the United States (see Dana, 2006).
As a third concern, there has been little standardization of
which of the 20 TAT cards are
administered and in what order to a person of a particular age
and gender, which has made
it difficult to assess the reliability and validity of the
instrument. Considerations in card
selection and the psychometric foundations of the TAT are
discussed later in the chapter.
However, dissatisfaction with widespread variation in these
aspects of TAT administration
influenced the development of two new TAT-type measures.
One of these newer measures, the Roberts Apperception Test for
Children (RATC), was
designed for use with young people aged 5 to 16 and portrays
children and adolescents
engaged in everyday interactions (McArthur & Roberts, 1990).
There are 27 RATC cards,
11 of which are alternate versions for males or females, and
each youngster taking the test is
administered a standard set of 16 cards in a set sequence, using
male or female versions as
appropriate. A revision of the RATC, called the Roberts-2
(Roberts, 2006), extends the age
range for the test to 18 and includes three parallel sets of cards
for use with White, Black,
and Hispanic children and adolescents. The second alternative
standard set of cards, which
also includes multiethnic pictures, is the Apperceptive
Personality Test (APT; Holmstrom,
Silber, & Karp, 1990; Karp, Holmstrom, & Silber, 1989). The
APT consists of just eight
stimulus pictures, each of which is always administered and in a
fixed sequence.
Fourth and finally, some users of the TAT have found fault with
the generally dark,
gloomy, achromatic nature of the pictures and with the old-
fashioned appearance of the
people and scenes portrayed in them. It may be that these
features of the cards make it
difficult for people to identify with the figures in them or to tell
lively stories about them.
The TEMAS, by contrast, features brightly colored pictures and
contemporary situations.
Colored photographs have also been used to develop an
alternative picture set for use with
adults, called the Picture Projective Test (PPT), and some
research has suggested that the
relatively bright PPT cards may generate more active and more
emotionally toned stories
than the relatively dark TAT cards (Ritzler, Sharkey, & Chudy,
1980; Sharkey & Ritzler,
1985).
As alternative picture sets for use with young and elderly
individuals, the CAT, SAT,
TEMAS, and RATC have enjoyed some popularity in applied
practice. Each of these
measures also remains visible as the focus of occasional
research studies published in
the literature. However, none of them appears to have detracted
very much from clinical
applications and research studies of the original 1943 version of
the TAT. With respect to
alternative picture sets for adults, neither the APT, the PPT, nor
any other proposed revision
in the TAT picture set has attracted much attention from
practitioners or researchers, despite
their apparent virtues with respect to standardization and
stimulus enhancement.
ADMINISTRATION
As spelled out in his 1943 Manual, Murray intended that
persons taking the TAT would
be asked to tell stories to all 20 of the pictures appropriate to
their age (child/adult) and
gender (male/female). The 20 pictures were to be shown in two
50-minute sessions, with a
1-day interval between sessions, and people would be instructed
to devote about 5 minutes
to each story. In actual practice over the years, TAT examiners
have typically administered
8 to 12 selected cards in a single session. Most commonly, cards
are selected on the basis
of whether they are expected to elicit stories that are rich in
meaning and relevant to
specific concerns of the person being assessed. With respect to
eliciting interpretively rich
stories, the most productive cards are usually those that portray
a person in thought or
depict emotional states or interpersonal relationships. The
selection of cards specifically
relevant in the individual case involves matching the content
themes commonly pulled by
the various cards with what is known or suspected about a
person's central issues, such
as aggressive or depressive concerns, problematic family
relationships, or heterosexual or
homosexual anxieties.
In selecting which cards to use, then, examiners need to
consider the content themes
typically associated with each of them. A description of the
TAT cards and the story
lines they usually pull follows in the interpretation section of
the chapter. With respect to
common practice in card selection, Teglasi (2001, p. 38) has
reported a consensus among
TAT clinicians that the most useful TAT cards are 1, 2, 3BM,
6BM, 7GF, 8BM, 9GF, 10,
and 13MF. According to Teglasi's report, each of these 9 cards
appears to work equally well
across ages and genders, despite their male, female, boy, or girl
designation. Bellak (1999)
recommends using a standard 10-card sequence consisting of
these 9 cards plus Card 4,
with the possible addition of other cards that pull for particular
themes. In the individual
case, then, the selected set should comprise all or most of these
9- or 10-card sets, with
replacement or additional cards chosen on the basis of specific
issues that are evaluated.
Two research findings relevant to TAT card selection should
also be noted. In an analysis
by Keiser and Prather (1990) of 26 TAT studies, the 10 cards
used most frequently were 1,
2, 3BM, 4, 6BM, 7BM, 8BM, 10, 13MF, and 16. In the other
study, Avila-Espada (2000)
used several variables, including the number of themes in the
stories each card elicited, to
calculate a stimulus value for each of them. On this basis, he
chose two 12-card sets that
he considered equivalent in stimulus value to the full 20-card
TAT set: one set for males
(1, 2, 3BM, 4, 6BM, 7BM, 8BM, 10, 13MF, 14, 15, and 18BM)
and one set for females
(1, 2, 3GF, 4, 6GF, 7GF, 8GF, 9GF, 10, 13MF, 17GF, and
18GF).
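For readers who want to compare these lists directly, the overlap among them can be tallied with a few lines of Python. This sketch is an illustration only; it uses the card lists exactly as given above, and the dictionary labels and function names are hypothetical.

```python
import re

# Card lists transcribed from the text; the labels are informal and hypothetical.
teglasi = {"1", "2", "3BM", "6BM", "7GF", "8BM", "9GF", "10", "13MF"}
card_sets = {
    "Teglasi (2001) consensus": teglasi,
    "Bellak (1999) sequence": teglasi | {"4"},  # the same 9 cards plus Card 4
    "Keiser and Prather (1990)": {
        "1", "2", "3BM", "4", "6BM", "7BM", "8BM", "10", "13MF", "16"},
    "Avila-Espada (2000), males": {
        "1", "2", "3BM", "4", "6BM", "7BM", "8BM", "10", "13MF",
        "14", "15", "18BM"},
    "Avila-Espada (2000), females": {
        "1", "2", "3GF", "4", "6GF", "7GF", "8GF", "9GF", "10", "13MF",
        "17GF", "18GF"},
}


def card_number(card):
    """Numeric part of a designation, used only to sort cards in card order."""
    return int(re.match(r"\d+", card).group())


def in_every_list(sets):
    """Cards that appear in all of the lists, sorted by card number."""
    return sorted(set.intersection(*sets.values()), key=card_number)


def count_lists(sets, card):
    """Number of lists that include the given card."""
    return sum(card in members for members in sets.values())


if __name__ == "__main__":
    print("In every list:", in_every_list(card_sets))
    print("Lists including Card 4:", count_lists(card_sets, "4"))
```

With the lists as transcribed here, only Cards 1, 2, 10, and 13MF appear in all five sets, and Card 4 appears in every set except the Teglasi consensus list. The tally describes overlap among the published lists and says nothing about the relative merit of the cards.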
Turning now to the actual administration of the test, many of
the general considerations
discussed in Chapter 11 with respect to administering the RIM
apply to the TAT as well.
Test takers should have had an opportunity to discuss with the
examiner (a) the purpose of
their being tested (e.g., "The reason for this examination is to
help in planning what kind of
treatment would be best for you"); (b) the types of information
the test will provide (e.g.,
"This is a measure of personality functioning that will give us a
clearer understanding of
what you're like as an individual, the kinds of concerns you
have, and what might be helpful
to you at this point"); and (c) how the results will be used (e.g.,
"When the test results are
ready, I will be reviewing them with you in a feedback session
and then sending a written
report to your therapist").
In preparation for giving the formal TAT instructions, the cards
that have been selected
should be piled face down on the table or desk, with Card 1 on
the top and the rest of the
selected cards beneath it in the order in which they are to be
presented. To minimize inadvertent
influence of the examiner's facial expressions or bodily
movements, it is advisable for
the examiner to sit beside or at an angle from the person taking
the test, rather than directly
in front of the person. Once the test begins, whatever the
examinee says should be recorded
verbatim. Examiners can word-process the protocol with a
computer instead of writing it
longhand, should they prefer to do so, and a person's stories can
also be tape-recorded and
transcribed later on. There is no evidence to indicate that the
examiner's writing out the
record, using a computer, or tape-recording the protocol makes
any difference in the stories
that are obtained.
Examiners should begin the TAT administration by informing
people of the nature of
their task. The following instructions, based on Murray's
(1943/1971) original procedures
and modifications suggested by Bellak and Abrams (1997), will
serve this purpose well
with adolescents and adults of at least average intelligence:
I am going to show you some pictures, one at a time, and your
task will be to make up as
dramatic a story as you can for each. Tell what has led up to the
event shown in the picture,
describe what is happening at the moment, what the characters
are feeling and thinking, and
then give the outcome. Speak your thoughts as they come to
your mind. Do you understand?
When the TAT is being administered to adolescents and adults
of limited intelligence,
to children, or to seriously disturbed persons, the following
simplified version of the
instructions is recommended:
This is a storytelling test. I have some pictures here that I am
going to show you, and for each
picture I want you to make up a story. Tell what has happened
before and what is happening
now. Say what the people are feeling and thinking and how it
will come out. You can make up
any kind of story you please. Do you understand?
Following whichever set of instructions is given, the
examiner should say, "Here is
the first picture" and then hand Card 1 to the examinee. Each of
the subsequent cards can be
presented by saying, "Here is the next one" or merely handing it
to the person without further
comment. The story told to each picture should be recorded
silently, without interruption,
until the person has finished with it. Immediately following the
completion of each story,
the examiner should inquire about any of the requested story
elements that are missing.
Depending on the content of the story, this inquiry could
include questions about what is
happening, what led up to this situation, what the people are
thinking and feeling, or what
the outcome will be.
If a story as first told is missing most of these elements, a
gentle reminder of the test
instructions and a request to tell the story again may be
preferable to asking each of the
individual questions concerning what has been omitted. If only
some of the requested
story elements are missing and individual inquiries about them
are answered with "Don't
know" or "Can't say," examinees as previously indicated should
be encouraged to "Use
your imagination and make something up." Should this
encouragement fail to generate
any further elaboration of the story element being inquired about, the
examiner should desist
without pressing the person further. Putting excessive pressure
on test takers rarely generates
sufficient additional information to justify the distress it may
cause them, and doing so can
also generate negative attitudes that limit cooperation with the
testing procedures that
follow. To the contrary, because adequately informative TAT
protocols are so dependent
on individuals being willing to fantasize and share the products
of their fantasy, it can be
helpful to encourage them with occasional praise (e.g., "That's
an interesting story"). As
Murray (1943/1971, p. 4) said about a little praise from time to
time, "There is no better
stimulant to the imagination."
The examiner's inquiry questions should be limited to requests
for information about
missing story elements and should not include any other kinds
of discussion or questions.
For example, direct questions about the character's motives
(e.g., "Why are they doing
this?") should be avoided. Motivations that emerge in response
to such leading questions
lack the interpretive significance of motivations that people
report spontaneously, and
leading questions that go beyond the basic instructions may
encourage examinees to report
motivations and other kinds of information on subsequent cards
when they would not
otherwise have done so. Similarly, people should not be asked
to talk about any person or
object in a picture that they omitted from their story. This kind
of question can influence the
thoroughness with which individuals attend to subsequent
pictures and thereby dilute the
potential information value of total or selective attention to
certain parts of certain pictures.
Certain kinds of responses may at times call for the examiner to
interrupt an examinee
during the spontaneous phase of a TAT administration. Should
the person be telling a
rambling, extremely detailed story that contains all the requisite
story elements but seems
endless, the examiner should break in with something on the
order of, "That's fine; I think
I have the gist of that story; let's go on to the next picture." If a
rambling and detailed
story covers all the requisite elements except an outcome, the
interruption can be modified
to, "That's fine; just tell me how the story ends, and we'll go on
to the next one." Long
stories rarely provide more information than a briefer version
that covers all the required
story elements, and endless stories are seldom worth the time
and energy they consume in
a testing session.
A second kind of response that calls for interruption is a
drawn-out description of what
a person sees in the picture with little or no attention to
developing a story line with a
plot. In this circumstance, the appropriate intervention is to
remind the individual of the
instructions: "That's fine so far, but let me remind you that what
we need for this test is
for you to tell a story about each picture, with a beginning and
an end, and to say what the
people are thinking and feeling." A third problematic
circumstance arises when people say
that they can think of two or three different possibilities in a
picture and set out to relate
more than one story. Once more, to minimize any dilution of the
interpretive significance
of the data, examinees should not be allowed to tell alternative
stories. If they indicate that
such is their intent, they should be interrupted with words to
this effect: "For each of these
pictures I want you to tell just one story; if you have more than
one idea about a picture,
choose the one that you think is the best story for it."
Finally, the nature of the test makes it suitable for group as well
as individual assessments.
In group administration, the selected cards are shown on a
screen, the instructions are given
in written form as well as orally by the person conducting the
administration, and people are
asked to write out their stories for each picture. Although group
administration sacrifices
the opportunity for examiners to inquire about missing story
elements, this shortcoming
can be circumvented in large part by mentioning the story
requirements in the instructions.
Based on recommendations by Atkinson (1958, Appendix III),
who was a leading figure
in developing procedures for large sample research with the
TAT, the following written
instructions can be used for group administration:
You are going to see a series of pictures, and your task is to tell
a story that is suggested to you
by each picture. Try to imagine what is going on in each
picture. Then tell what the situation
is, what led up to the situation, what the people are thinking and
feeling, and what they will do.
In other words, write a complete story, with a plot and
characters. You will have four minutes
to write your story about each picture, and you will be told
when it is time to finish your story
and get ready for the next picture. There are no right or wrong
stories or kinds of stories, so
you may feel free to write whatever story is suggested to you
when you look at a picture.
Together with these written instructions, group test takers can
be given a sheet of paper
for each picture they will be shown, with the following four sets
of questions printed on
each sheet and followed by space for writing in an answer:
1. What is happening? Who are the persons?
2. What led up to this situation? What has happened in the past?
3. What is being thought and felt? What do the persons want?
4. What will happen? What will be done?
CODING
As noted, the cumbersome detail of the TAT coding scheme
originally proposed by Murray
(1943/1971) discouraged its widespread adoption in either
clinical practice or research.
The only comprehensive procedure for coding TAT stories that
has enjoyed even mild
popularity is an "Analysis Sheet" developed by Bellak (Bellak
& Abrams, 1997, chap. 4)
for use with his inspection method. Bellak's Analysis Sheet
calls for examiners to describe
briefly several features of each story, including its main theme,
the needs and intentions of
its characters, the kinds of affects that are being experienced,
and the nature of any conflicts
Turning to their thinking, people whose cognitive integrity is
intact typically produce
coherent TAT stories that are easy to follow and exemplify
logical reasoning. Disjointed
stories that do not flow smoothly, and confusing stories that
lack a sensible sequence,
give reason for concern that a person's thought processes may
be similarly scattered and
incoherent. Narratives characterized by strained and
circumstantial reasoning also raise
questions about the clarity of an individual's thought processes.
Illogical reasoning consists
of drawing definite conclusions on the basis of minimal or
irrelevant evidence and expressing
these conclusions with absolute certainty when alternative
inferences would be equally
or more likely. The following examples illustrate what people
who are thinking illogically
might say in telling their TAT stories.
To Card 9BM (men lying on the ground): "These men are
probably a barbershop quartet,
because there are four of them, and the little guy would be the
tenor, because he's the smallest"
[being four in number is a highly circumstantial and far from
compelling basis for inferring
that the men are a vocal group, and there is no necessary or
exclusive relationship between
small stature and tenor voice].
To Card 12M (young man lying on couch with older man
leaning over him): "The boy has a tie
on, which means that he's a college student" [this is possible,
but far from being a necessary
meaning; perhaps the young man is wearing a tie because his
mother made him wear it, or
because he is going to get his picture taken today].
To Card 13MF (man standing in front of a woman lying in bed):
"I think she must be dead,
because she's lying down" [seeing the woman in this picture as
dead is not unusual, but
inferring certain death from lying down overlooks the
possibility that she might be sleeping or
resting].
APPLICATIONS
In common with the other assessment measures presented in this
Handbook, the TAT
derives its applications from the information it provides about
an individual's personality
characteristics. The TAT was described in the introduction to
this chapter as a performance-based
measure that, like the RIM, generates structural, thematic,
and behavioral sources of
data. As also noted, however, these data sources do not carry
the potentially equivalent significance
in TAT interpretation that they carry in Rorschach interpretation.
Instead, the TAT, with few
exceptions, is most useful by virtue of what can be learned from
the thematic imagery about
a person's inner life.
Because the TAT functions best as a measure of underlying
needs, attitudes, conflicts,
and concerns, its primary application is in clinical work, mostly
in planning psychotherapy
and monitoring treatment progress. TAT findings may at times
provide some secondary
assistance in differential diagnosis, as illustrated in some of the
examples presented in
discussing story interpretation. Nevertheless, TAT stories are
more helpful in understanding
the possible sources and implications of adjustment difficulties
than in distinguishing
among categories of psychological disorder. For this reason,
forensic and organizational
applications of TAT assessment have also been limited,
although attention is paid in the
discussion that follows to the general acceptance of the TAT in
the professional community
and its potential utility in personnel selection. Other aspects of
TAT assessment that enhance
its utility are its suitability for group administration, its value in
cross-cultural research,
and its resistance to impression management.
Treatment Planning and Monitoring
The interpretive implications of TAT stories often prove helpful
in planning, conducting, and
evaluating the impact of psychological treatment. Especially in
evaluating people who are
seeking mental health care but are unable to recognize or
disinclined to reveal very much
about themselves, TAT findings typically go well beyond
interview data in illuminating
issues that should be addressed in psychotherapy. Inferences
based on TAT stories are
particularly likely to assist in answering the following four
central questions in treatment
planning:
1. What types of conflicts need to be resolved and what
concerns need to be eased for
the person to feel better and function more effectively?
2. What sorts of underlying attitudes does the person have
toward key figures in his or her
life, toward certain kinds of people in general, and toward
interpersonal relatedness?
3. What situations or events are likely to be distressing or
gratifying to the person, and
how does this person tend to cope with distress and respond to
gratification?
4. Which of these unresolved conflicts, underlying attitudes, or
distressing experiences
appears to be a root cause of the emotional or adjustment
problems that brought the
person into treatment?
By providing such information, TAT findings can help
therapists plan their treatment
strategies, anticipate obstacles to progress, and identify
adroit interventions. Having
such knowledge in advance about elements of a person's inner
life gives therapists a head
start in conducting psychotherapy. This advantage can be
especially valuable in short-term
or emergency therapy, when the time spent obtaining an
in-depth personality assessment
is more than compensated by the time saved with early
identification of the issues and
concerns that need attention.
Three research studies with the SCORS and DMM scales have
demonstrated the potential
utility of TAT stories for anticipating the course of
psychotherapy and monitoring
treatment progress. In one of these studies, S. J. Ackerman,
Hilsenroth, Clemence,
Weatherill, and Fowler (2000) found significant relationships
between the pretherapy
SCORS levels for affective quality of representations and
emotional investment in relationships
and the continuation in treatment of 63 patients with a
personality disorder, as
measured by the number of sessions they attended.
Also working with the SCORS, Fowler et al. (2004) followed 77
seriously disturbed
patients receiving intensive psychotherapy in a residential
setting who were administered
the TAT prior to beginning treatment and a second time
approximately 16 months later.
Behavioral ratings indicated substantial improvement in the
condition of these patients,
and four of the SCORS scales showed corresponding significant
changes for the better
(Complexity of Representations, Understanding Social
Causality, Self-esteem, and Identity
and Coherence of the Self).
Cramer and Blatt (1990) were similarly successful in
demonstrating the utility of the
DMM in monitoring treatment change. In the Cramer and Blatt
study, 90 seriously disturbed
adults in residential treatment were tested on admission and
retested after an average of 15
months of therapy. Reduction of psychiatric symptoms in these
patients was accompanied
by significant decline in total use of defenses, as measured with
the DMM.
Diagnostic Evaluations
Contemporary practice in differential diagnosis distinguishes
among categories or dimensions
of disorder primarily on the basis of a person's manifest
symptomatology or behavior,
rather than the person's underlying attitudes and concerns (see
American Psychiatric Association, 2000).
For this reason, what the TAT does best
(generate hypotheses about a
person's inner life) rarely plays a prominent role in clinical
diagnostic evaluations. Nevertheless,
certain thematic, structural, and behavioral features of
a TAT protocol may be
consistent with and reinforce diagnostic impressions based on
other sources of information.
Examples of this diagnostic relevance include suspicion-laden
story plots that suggest
paranoia, disjointed narratives that indicate disordered thinking,
and a slow rate of speech
that points to depressive lethargy. 2
In addition, research with the SCORS and DMM scales has
demonstrated that objectified
TAT findings can identify personality differences among
persons with different types of
problem. Patients with borderline personality disorder differ
significantly on some SCORS
variables from patients with major depressive disorder (Westen
et al., 1990), and SCORS
variables have been found to distinguish among patients with
borderline, narcissistic, and
antisocial personality disorders (S. J. Ackerman et al., 1999).
Young people who have been
physically or sexually abused display quite different
interpersonal attitudes and expectations
on the SCORS scales from the attitudes and expectations of
children and adolescents who
have not experienced abuse (Freedenfeld, Ornduff, & Kelsey,
1995; Kelly, 1999; Ornduff,
Freedenfeld, Kelsey, & Critelli, 1994; Ornduff & Kelsey, 1996).
Sandstrom and Cramer (2003) found that elementary
schoolchildren whose DMM scores
indicate use of identification are better adjusted
psychologically, as measured by parent and
self-report questionnaires, than children who rely on denial. In
particular, the children in
this study who showed identification reported less social
anxiety and depression than those
who showed denial, were less often described by their parents
as having behavior problems,
and were more likely to perceive themselves as socially
and academically competent.
Adolescents with conduct disorder show less mature defenses on
the DMM than adolescents
with adjustment disorder, with the conduct disorder group
being more likely to use
denial than the adjustment disorder group, and less likely to use
identification (Cramer &
Kelly, 2004). Frequency of resorting to violence for resolution
of interpersonal conflicts, as
self-reported by a sample of college student men, has shown a
significant negative correlation
with DMM use of identification and a significant positive
correlation with use of
projection (Porcerelli, Cogan, Kamoo, & Letman, 2004).
2 Note should be taken of the recent publication of the
Psychodynamic Diagnostic Manual (PDM Task Force, 2006),
which is intended to supplement the Diagnostic and Statistical
Manual (DSM; American Psychiatric Association,
2000) as a guideline for differential diagnosis. The diagnostic
framework formulated in the PDM encourages
attention to each person's profile of mental functioning, which
includes "patterns of relating, comprehending, and
expressing feelings, coping with stress and anxiety, observing
one's own emotions and behaviors, and forming
moral judgments" (p. 2). Should such considerations come to
play a more formal part in differential diagnosis
than has traditionally been the case, TAT findings may become
increasingly relevant in determining diagnostic
classifications.
These and similar TAT findings can help clinicians understand
psychological disturbances
and appreciate the needs and concerns of people with
adjustment problems. However,
these findings do not warrant using the SCORS, the DMM,
or any other TAT scale as
a sole or primary basis for diagnosing personality disorders or
identifying victims of abuse.
Differential diagnosis should always be an integrative process
drawing on information from
diverse sources, and for reasons already mentioned, the
information gleaned from TAT stories
usually plays a minor role in this process. Moreover,
neither the TAT nor any other
performance-based measure of personality provides sufficient
basis for inferring whether a
person has been abused or had any other particular type of past
experience. The following
caution in this regard should always be kept in mind:
"Psychological assessment data are
considerably more dependable for describing what people are
like than for predicting how
they are likely to behave or postdicting what they are likely to
have done or experienced"
(Weiner, 2003, p. 335).
Forensic and Organizational Applications
Like the imagery in Rorschach responses, stories told to TAT
pictures are better suited
for generating hypotheses to be pursued than for establishing
the reasonable certainties
expected in the courtroom. On occasion, thematic
preoccupations may carry some weight
in documenting a state of mind relevant to a legal question, as
in a personal injury case in
which the TAT stories of a plaintiff seeking damages because of
a claimed posttraumatic
stress disorder reflect pervasive fears of being harmed or
damaged. By and large, however,
the psycholegal issues contended in the courtroom seldom hinge
on suppositions about a
litigant's or defendant's inner life. In terms of the criteria for
admissibility into evidence
discussed in the previous chapter, then, TAT testimony has
limited likelihood of being
helpful to judges and juries. As discussed in the final section of
this chapter, moreover,
TAT interpretation does not rest on a solid scientific basis,
except for conclusions based on
objectified scales for measuring specific personality
characteristics.
Nevertheless, forensic psychologists report using the TAT in
their practice, and TAT
assessment easily meets the general acceptance criterion for
admissibility into evidence.
Among forensic psychologists responding to surveys, over
one-third report using the TAT or
CAT in evaluations of children involved in custody disputes,
and 24% to 29% in evaluating
adults in these cases, with smaller numbers using the TAT in
evaluations of personal injury
(9%), criminal responsibility (8%), and competency to stand
trial (5%; M. J. Ackerman &
Ackerman, 1997; Boccaccini & Brodsky, 1999; Borum &
Grisso, 1995; Quinnell & Bow,
2001). In a more recent survey of forensic psychologists by
Archer, Buffington-Vollum,
Stredny, and Handel (2006), 29% reported using the TAT for
various purposes in their case
evaluations. In clinical settings, the TAT has consistently been
among the four or five most
frequently used tests, and it has been the third most frequently
used personality assessment
method, following the MMPI and RIM with adults and the RIM
and sentence completion
tests with adolescents (Archer & Newsom, 2000; Camara,
Nathan, & Puente, 2000; Hogan,
2005; Moretti & Rossini, 2004).
A majority (62%) of internship training directors report a
preference for their incoming
trainees to have had prior TAT coursework or at least a good
working knowledge of the
instrument (Clemence & Handler, 2001). Over the years, the
TAT has been surpassed only
by the MMPI and the RIM in the volume of published
personality assessment research it has
generated (Butcher & Rouse, 1996). As judged from its
widespread use, its endorsement
as a method that clinicians should learn, and the extensive body
of literature devoted to it,
TAT assessment appears clearly to have achieved general
acceptance in the professional
community.
With respect to potential applications of the TAT in personnel
selection, two meta-analytic
studies have identified substantial relationships
between McClelland's n-Ach scale
and achievement-related outcomes. In one of these
meta-analyses, Spangler (1992) found a
statistically significant average effect size for n-Ach in
predicting such outcomes as income
earned, occupational success, sales success, job performance,
and participation in and leadership
of community organizations. This TAT measure of
achievement motivation showed
higher correlations with outcome criteria in these studies than
self-report questionnaire
measures of motivation to achieve.
In the other meta-analysis, Collins, Hanges, and Locke (2004)
examined 41 studies
of need for achievement among persons described as
entrepreneurs. Entrepreneurship in
these studies consisted of being a manager responsible for
making decisions in the business
world or a founder of a business with responsibility for
undertaking a new venture. The
n-Ach scale in these studies was significantly correlated with
choosing an entrepreneurial
career and performing well in it, and Collins et al. concluded,
"Achievement motivation
may be particularly potent at differentiating between successful
and unsuccessful groups
of entrepreneurs" (p. 111). Hence there is reason to expect that
TAT assessment may be
helpful in identifying individuals who are likely to be adept at
recognizing and exploiting
entrepreneurial opportunities in the marketplace.
Group Administration, Cross-Cultural Relevance, and Resistance to Impression Management
As mentioned, three other aspects of the TAT are likely to
enhance its applications for
various purposes. First, the suitability of the TAT for group
administration facilitates large-scale
data collection for research purposes and creates
possibilities for using the instrument
as a screening device in applied settings.
Second, since early in its history, the TAT has been used as a
clinical and research
instrument in many different countries and has proved
particularly valuable in studying
cultural change and cross-cultural differences in personality
characteristics. Contributions
by Dana (1999) and Ephraim (2000) provide overviews of these
international applications
of the TAT, and the particular sensitivity of TAT stories to
cultural influences is elaborated
by Ritzler (2004) and by Hofer and Chasiotis (2004).
Third, as a performance-based measure, the TAT is somewhat
resistant to impression
management. People who choose to conceal their inner life by
telling brief and unelaborated
stories can easily defeat the purpose of the examination. In so
doing, however, they make
it obvious that they are delivering a guarded protocol that
reveals very little about them,
other than the fact of their concealment. For examinees who are
being reasonably open
and cooperative, the ambiguity of the task and their limited
awareness of what their stories
might signify make it difficult for them to convey any
intentionally misleading impression
of their attitudes and concerns.
Nevertheless, telling stories is a more reality-based enterprise
than saying what inkblots
might be, and for this reason, the TAT is probably not as
resistant as the RIM to impression
management. Moreover, research reported in the 1960s and
1970s showed that college
students could modify the TAT stories they told after being
instructed to respond in certain
ways (e.g., as an aggressive and hostile person). Schretlen
(1997) has concluded from these
early studies that they "clearly demonstrate the fakability of the
TAT" (p. 281).
To take issue with Schretlen's conclusion, however, the ability
of volunteer research
participants to shape their TAT stories according to certain
instructions may have little
bearing on whether people being examined for clinical purposes
can successfully manage
the impression they give on this measure. Moreover, it is
reasonable to hypothesize that
experienced examiners, working with the benefit of case history
information and data from
other tests as well, would have little difficulty identifying in
TAT stories the inconsistencies
and exaggerations that assist in detecting malingering.
However, the sensitivity of clinicians
to attempted impression management in real-world TAT
assessment has not yet been put to
adequate empirical test.
PSYCHOMETRIC FOUNDATIONS
The nature of the TAT and the ways in which it has most
commonly been used have
made it difficult to determine its psychometric properties. Aside
from a widely used and
fairly standard set of instructions based on Murray's original
guidelines for administration,
research and practice with the TAT have been largely
unsystematic. Certain sets of cards have
been recommended by various authorities on the test, but there
has been little consistency
with respect to which cards are used and in what sequence they
are shown (Keiser & Prather,
1990). Moreover, the primarily qualitative approach that
typifies TAT interpretation in
clinical practice does not yield the quantitative data that
facilitate estimating the reliability
of an assessment instrument, determining its validity for various
purposes, and developing
numerical reference norms.
This lack of systematization and the resultant shortfall in
traditional psychometric verification
have fueled a long history of controversy between
critics who have questioned the
propriety of using the TAT in clinical practice and proponents
who have endorsed the value
of the instrument and refuted criticisms of its use.
Commentaries by Conklin and Westen
(2001), Cramer (1999), Garb (1998), Hibbard (2003), Karon
(2000), and Lilienfeld, Wood,
and Garb (2000) provide contemporary summaries of these
opposing views. Without rehashing
this debate, and with the psychometric shortcomings of
traditional TAT assessment
having already been noted, the following discussion calls
attention to four considerations
bearing on how and why this instrument can be used effectively
for certain purposes.
First, criticisms of the validity of the TAT have frequently been
based on low correlations
between impressions gleaned from TAT stories and either
clinical diagnosis or self-report
data. However, correlations with clinical diagnoses and
self-report measures are conceptually
irrelevant to the validity of TAT for its intended purposes,
and criticisms based on such
correlations accordingly lack solid basis. The TAT was
designed to explore the personal
experience and underlying motives of people, not to facilitate a
differential diagnosis based
primarily on manifest symptomatology (which is the basis of
psychiatric classification in
the Diagnostic and Statistical Manual [DSM-IV-TR]; American
Psychiatric Association,
2000). Should some TAT scales show an association with
particular psychological disorders,
as they do in the SCORS and DMM research, the test may
help identify personality
characteristics associated with these disorders. Failure to
accomplish differential diagnosis,
although important to recognize as a limitation of TAT
applications, does not invalidate use
of the instrument for its primary intended purposes.
As for correlations with self-report measures, there is little to
gain from attempting
to validate performance-based personality tests against self-report
questionnaires, or vice
versa for that matter. These are two types of test that are
constructed differently, ask for
different kinds of responses, provide different amounts of
structure, and tap different levels
of self-awareness, as discussed in concluding Chapter 1. Hence
they may at times yield
different results when measuring similar constructs, and in such
instances they are more
likely to complement than to contradict each other (see pp. 24-26;
see also Weiner, 2005).
Meyer et al. (2001) drew the following conclusions in this
regard from a detailed review of
evidence and issues in psychological testing:
Distinct assessment methods provide unique information....
Any single assessment method
provides a partial or incomplete representation of the
characteristics it intends to measure....
Cross-method correlations cannot reveal ... how good a
test is in any specific
sense.... Psychologists should anticipate disagreements when
similarly named scales are compared
across diverse assessment methods. (p. 145)
Because both self-report and performance-based personality
tests are inferential measures,
furthermore, substantial correlations between them
usually have only modest implications
for their criterion validity. Two tests that correlate
perfectly with each other can be
equally invalid, with no significant relationship to any
meaningful criterion. Compelling
evidence of criterion validity emerges when personality test
scores correlate not with each
other, but with external (nontest) variables consisting of what
people are like and how they
are observed to behave.
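The point that two tests can agree perfectly with each other and still
tell us nothing about a criterion can be illustrated with a small numerical
sketch. The scores below are invented solely for illustration and do not
come from any TAT study; the names test_a, test_b, and criterion are
hypothetical, and the correlation is computed with an ordinary Pearson
coefficient written out by hand.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dev_x = [v - mean_x for v in x]
    dev_y = [v - mean_y for v in y]
    cross = sum(a * b for a, b in zip(dev_x, dev_y))
    return cross / sqrt(sum(a * a for a in dev_x) * sum(b * b for b in dev_y))


# Hypothetical scores for five people (invented for illustration only).
test_a = [1, 2, 3, 4, 5]                # scores on one personality test
test_b = [2 * s + 3 for s in test_a]    # a second test, a linear rescaling of the first
criterion = [2, 1, 4, 1, 2]             # an external (nontest) rating of behavior

r_tests = pearson_r(test_a, test_b)            # agreement between the two tests
r_a_criterion = pearson_r(test_a, criterion)   # test A against the criterion
r_b_criterion = pearson_r(test_b, criterion)   # test B against the criterion

print("test A with test B:    ", round(r_tests, 3))
print("test A with criterion: ", round(r_a_criterion, 3))
print("test B with criterion: ", round(r_b_criterion, 3))
```

In this constructed example the two tests correlate perfectly with each
other, yet neither has any linear relationship to the external rating,
which is why cross-test agreement by itself says little about criterion
validity.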
Second, the traditionally qualitative TAT methods have been
supplemented with quantitative
scales that are readily accessible to psychometric
verification. The previously mentioned
research with the SCORS, DMM, and n-Ach scoring
demonstrates that TAT assessment
can be objectified to yield valid and reliable scales for
measuring dimensions of personality
functioning. Additional research has demonstrated the
internal consistency of SCORS
and its validity in identifying developmental differences in the
interpersonal capacities of
children (e.g., Hibbard, Mitchell, & Porcerelli, 2001; Niec &
Russ, 2002).
The DMM has been validated as a measure of maturity level in
children and adolescents,
of developmental level of maturity in college students, and of
long-term personality change
and stability in adults (Cramer, 2003; Hibbard & Porcerelli,
1998; Porcerelli, Thomas,
Hibbard, & Cogan, 1998). Support for the validity of these
scales is acknowledged by
critics as well as proponents of the TAT, although in the former
case with the qualification
that these "promising TAT scoring systems ... are not yet
appropriate for routine clinical
use" (Lilienfeld et al., 2000, p. 46). Even if this qualification is
warranted, the point has
been made that TAT assessment has the potential to generate
valid and reliable findings.
Research with other picture-story measures, notably the RATC and the TEMAS, has
provided additional evidence of the potential psychometric soundness of assessing personality
with this method. As reviewed by Weiner and Kuehnle (1998), quantitative scores
generated by both measures have valid and meaningful correlates and have shown adequate
levels of interscorer agreement and either internal consistency or retest stability.
Third, not having systematically gathered quantitative
normative data to guide TAT
interpretations does not mean that the instrument lacks
reference points. As reviewed in
the section of this chapter on card pull, cumulative clinical
experience has established
expectations concerning the types of stories commonly elicited
by each of the TAT cards.
Hence examiners are not in the position of inventing a new test
each time they use the TAT.
Instead, similarities and differences between a person's stories
and common expectations
can and should play a prominent role in the interpretive process,
as they did in many of the
examples presented in this chapter.
The fourth consideration pertains to the primary purpose of
TAT assessment, which
is to explore an individual's personal experience and generate
hypotheses concerning the
individual's underlying needs, attitudes, conflicts, and concerns.
The value of the TAT
resides in generating hypotheses that expand understanding of a
person's inner life. If
a TAT story suggests three alternative self-perceptions or
sources of anxiety, and only
one of these alternatives finds confirmation when other data
sources are examined, then
the test has done its job in useful fashion. It is not invalidated
because two-thirds of the
suggested alternatives in this instance proved incorrect. This is
the nature of working
with a primarily qualitative assessment instrument, which shows
its worth, not through
quantitative psychometric verification, but by clinicians finding
it helpful in understanding
and treating people who seek their services. Psychologists who
may be concerned that this
qualitative perspective detracts from the scientific status of
assessment psychology should
keep in mind that generating hypotheses is just as much a part
of science as confirming
hypotheses.
REFERENCES
Ackerman, M. J., & Ackerman, M. C. (1997). Custody
evaluations in practice: A survey of experienced
professionals (revisited). Professional Psychology, 28, 137-145.
Ackerman, S. J., Clemence, A. J., Weatherill, R., & Hilsenroth, M. J. (1999). Use of the TAT in the
assessment of DSM-IV Cluster B personality disorders. Journal of Personality Assessment, 73,
422-448.
Ackerman, S. J., Hilsenroth, M. J., Clemence, A. J., Weatherill,
R., & Fowler, J. C. (2000). The
effect of social cognition and object representation on
psychotherapy continuation. Bulletin of
the Menninger Clinic, 64, 386-408.
American Psychiatric Association. (2000). Diagnostic and
statistical manual of mental disorders
(4th ed., text rev.). Washington, DC: Author.
Anderson, J. W. (1988). Henry Murray's early career: A
psychobiographical exploration. Journal of
Personality, 56, 139-171.
Anderson, J. W. (1999). Henry A. Murray and the creation of the Thematic Apperception Test. In
L. Gieser & M. I. Stein (Eds.), Evocative images: The Thematic Apperception Test and the art
of projection (pp. 23-38). Washington, DC: American Psychological Association.
Archer, R. P., Buffington-Vollum, J. K., Stredny, R. V., & Handel, R. W. (2006). A survey of
psychological test use patterns among forensic psychologists. Journal of Personality Assessment,
87, 84-94.
Archer, R. P., & Newsom, C. R. (2000). Psychological test
usage with adolescent clients: Survey
update. Assessment, 7, 227-235.
Atkinson, J. W. (Ed.). (1958). Motives in fantasy, action, and
society. Princeton, NJ: Van Nostrand.
Avila-Espada, A. (2000). Objective scoring for the TAT. In R.
H. Dana (Ed.), Handbook of cross-
cultural and multicultural personality assessment (pp. 465-480).
Mahwah, NJ: Erlbaum.
Barenbaum, N. R., & Winter, D. G. (2003). Personality. In I. B.
Weiner (Editor-in-Chief) & D. K.
Freedheim (Vol. Ed.), Handbook of psychology: Vol. 1. History
of psychology (pp. 177-302).
Hoboken, NJ: Wiley.
Bellak, L. (1947). A guide to the interpretation of the Thematic
Apperception Test. New York:
Psychological Corporation.
Bellak, L. (1954). The Thematic Apperception Test and the
Children's Apperception Test in clinical
use. New York: Grune & Stratton.
Bellak, L. (1975). The TAT, CAT, and SAT in clinical use (3rd ed.). New York: Grune & Stratton.
Bellak, L. (1999). My perceptions of the Thematic Apperception Test in psychodiagnosis and
psychotherapy. In L. Gieser & M. I. Stein (Eds.), Evocative images: The Thematic Apperception Test
and the art of projection (pp. 133-141). Washington, DC: American Psychological Association.
Bellak, L., & Abrams, D. M. (1997). The TAT, CAT, and SAT in clinical use (6th ed.). Boston: Allyn
& Bacon.
Blankenship, V., Vega, C. M., Ramos, E., Romero, K., Warren, K., Keenan, K., et al. (2006). Using the
multifaceted Rasch model to improve the TAT/PSE measure of need for achievement. Journal
of Personality Assessment, 86, 100-114.
Boccaccini, M. T., & Brodsky, S. L. (1999). Diagnostic test
usage by forensic psychologists in
emotional injury cases. Professional Psychology, 30, 253-259.
Borum, R., & Grisso, T. (1995). Psychological test use in
criminal forensic evaluations. Professional
Psychology, 26, 465-473.
Busch, F. (1995). The ego at the center of clinical technique.
Northvale, NJ: Aronson.
Butcher, J. N., & Rouse, S. V. (1996). Personality: Individual differences and clinical assessment.
Annual Review of Psychology, 47, 87-111.
Camara, W., Nathan, J., & Puente, A. (2000). Psychological test
usage: Implications in professional
use. Professional Psychology, 31, 141-154.
Clemence, A. J., & Handler, L. (2001). Psychological assessment on internship: A survey of training
directors and their expectations for students. Journal of Personality Assessment, 76, 18-47.
Collins, C. J., Hanges, P. J., & Locke, E. A. (2004). The relationship of achievement motivation to
entrepreneurial behavior: A meta-analysis. Human Performance, 17, 95-117.
Conklin, A., & Westen, D. (2001). Thematic apperception test.
In W. I. Dorfman & M. Hersen (Eds.),
Understanding psychological assessment (pp. 107-133).
Dordrecht, The Netherlands: Kluwer
Academic.
Costantino, G., & Malgady, R. G. (1999). The Tell-Me-A-Story Test: A multicultural offspring
of the Thematic Apperception Test. In L. Gieser & M. I. Stein (Eds.), Evocative images: The
Thematic Apperception Test and the art of projection (pp. 177-190). Washington, DC: American
Psychological Association.
Costantino, G., Malgady, R. G., & Rogler, L. H. (1998).
Technical manual: TEMAS Thematic
Apperception Test. Los Angeles: Western Psychological
Services.
Costantino, G., Malgady, R. G., Rogler, L. H., & Tosi, E. C. (1998). Discriminant analysis of clinical
outpatients and public school children by TEMAS: A thematic apperception test for Hispanics
and Blacks. Journal of Personality Assessment, 52, 670-678.
Cramer, P. (1991). The development of defense mechanisms:
Theory, research and assessment. New
York: Springer-Verlag.
Cramer, P. (1996). Storytelling, narrative, and the Thematic
Apperception Test. New York: Guilford
Press.
Cramer, P. (1999). Future directions for the Thematic
Apperception Test. Journal of Personality
Assessment, 72, 74-92.
Cramer, P. (2003). Personality change in later adulthood is predicted by defense mechanism use in
early adulthood. Journal of Research in Personality, 37, 76-104.
Cramer, P. (2006). Protecting the self: Defense mechanisms in action. New York: Guilford
Press.
Cramer, P., & Blatt, S. J. (1990). Use of the TAT to measure change in defense mechanisms following
intensive psychotherapy. Journal of Personality Assessment, 54, 236-251.
Cramer, P., & Kelly, F. D. (2004). Adolescent conduct disorder
and adjustment reaction. Journal of
Nervous and Mental Diseases, 192, 139-145.
Dana, R. H. (1999). Cross-cultural-multicultural use of the Thematic Apperception Test. In L.
Gieser & M. I. Stein (Eds.), Evocative images: The Thematic Apperception Test and the art of
projection (pp. 177-190). Washington, DC: American Psychological Association.
Dana, R. H. (2006). TEMAS among the Europeans: Different,
complementary, and provocative.
South African Rorschach Journal, 3, 17-28.
Ephraim, D. (2000). A psychocultural approach to TAT scoring and interpretation. In R. H. Dana (Ed.),
Handbook of cross-cultural and multicultural personality assessment (pp. 427-446). Mahwah,
NJ: Erlbaum.
Eron, L. D. (1950). A normative study of the Thematic Apperception Test. Psychological Monographs,
64(Whole No. 315).
Eron, L. D. (1953). Responses of women to the Thematic
Apperception Test. Journal of Consulting
Psychology, 17, 269-282.
Fowler, J. C., Ackerman, S. J., Speanburg, S., Bailey, A., Blagys, M., & Conklin, A. C. (2004).
Personality and symptom change in treatment refractory inpatients: Evaluation of the phase
model of change using Rorschach, TAT, and DSM-IV Axis V. Journal of Personality Assessment,
83, 306-322.
Freedenfeld, R. N., Ornduff, S. R., & Kelsey, R. M. (1995). Object relations and physical abuse: A
TAT analysis. Journal of Personality Assessment, 64, 552-568.
Freud, S. (1957). "Wild" psychoanalysis. In J. Strachey (Ed. &
Trans.), The standard edition of
the works of Sigmund Freud (Vol. 11, pp. 221-227). London:
Hogarth Press. (Original work
published 1910)
Garb, H. N. (1998). Recommendations for training in the use of
the Thematic Apperception Test
(TAT). Professional Psychology, 29, 621-622.
Hall, C. S., Lindzey, G., & Campbell, J. B. (1998). Theories of personality (4th ed.). New York:
Wiley.
Handler, L. (2001). Assessment of men: Personality assessment goes to war by the Office of Strategic
Services Assessment staff. Journal of Personality Assessment, 76, 558-578.
Henry, W. E. (1956). The analysis of fantasy: The thematic apperception technique in the study of
personality. New York: Wiley.
Hibbard, S. (2003). A critique of Lilienfeld et al.'s (2000) "The scientific status of projective
techniques." Journal of Personality Assessment, 80, 260-271.
Hibbard, S., Mitchell, D., & Porcerelli, J. (2001). Internal consistency of the Object Relations and
Social Cognition scales for the Thematic Apperception Test. Journal of Personality Assessment,
77, 408-419.
Hibbard, S., & Porcerelli, J. (1998). Further validation for the Cramer Defense Mechanisms manual.
Journal of Personality Assessment, 70, 460-483.
Hofer, J., & Chasiotis, A. (2004). Methodological
considerations of applying a TAT-type picture-story
test in cross-cultural research. Journal of Cross-Cultural
Psychology, 35, 224-241.
Hogan, T. P. (2005). 50 widely used psychological tests. In G. P. Koocher, J. C. Norcross, & S. S. Hill
III (Eds.), Psychologists' desk reference (2nd ed., pp. 101-104). New York: Oxford University
Press.
Holmstrom, R. W., Silber, D. E., & Karp, S. A. (1990). Development of the Apperceptive Personality
Test. Journal of Personality Assessment, 54, 252-264.
Huprich, S. K., & Greenberg, R. P. (2003). Advances in the assessment of object relations in the
1990s. Clinical Psychology Review, 23, 665-698.
Jenkins, S. R. (in press). Handbook of clinical scoring systems for Thematic Apperception techniques.
Mahwah, NJ: Erlbaum.
Karon, B. P. (2000). The clinical interpretation of the Thematic
Apperception Test, Rorschach,
and other clinical data: A reexamination of statistical versus
clinical prediction. Professional
Psychology, 31, 230-233.
Karp, S. A., Holmstrom, R. W., & Silber, D. E. (1989). Manual for the Apperceptive Personality Test
(APT). Orland Park, IL: International Diagnostic Services.
Keiser, R. E., & Prather, E. N. (1990). What is the TAT? A
review of ten years of research. Journal
of Personality Assessment, 55, 800-803.
Kelly, F. D. (1999). The psychological assessment of abused and traumatized children. Mahwah, NJ:
Erlbaum.
Kelly, F. D. (2007). The clinical application of the Social Cognition and Object Relations scale with
children and adolescents. In S. R. Smith & L. Handler (Eds.), The clinical assessment of children
and adolescents (pp. 169-182). Mahwah, NJ: Erlbaum.
Langan-Fox, J., & Grant, S. (2006). The Thematic Apperception Test: Toward a standard measure
of the big three motives. Journal of Personality Assessment, 87, 277-291.
Lilienfeld, S. O., Wood, J. M., & Garb, H. N. (2000). The scientific status of projective techniques.
Psychological Science in the Public Interest, 1, 27-66.
McArthur, D. S., & Roberts, G. E. (1990). Roberts
Apperception Test for Children manual. Los
Angeles: Western Psychological Services.
McClelland, D. C. (1999). How the test lives on: Extensions of the Thematic Apperception Test
approach. In L. Gieser & M. I. Stein (Eds.), Evocative images: The Thematic Apperception Test
and the art of projection (pp. 163-175). Washington, DC: American Psychological Association.
McClelland, D. C., Atkinson, J. W., Clark, R. A., & Lowell, E.
L. (1953). The achievement motive.
New York: Appleton-Century-Crofts.
McClelland, D. C., Clark, R. A., Roby, T. B., & Atkinson, J. W.
(1958). The effect of the need for
achievement on thematic apperception. In J. W. Atkinson (Ed.),
Motives in fantasy, action, and
society (pp. 64-82). Princeton, NJ: Van Nostrand.
Meyer, G. J. (2004). The reliability and validity of the Rorschach and Thematic Apperception
Test (TAT) compared to other psychological and medical procedures: An analysis of
systematically gathered evidence. In M. Hersen (Editor-in-Chief), M. Hilsenroth, & D. Segal (Vol.
Eds.), Comprehensive handbook of psychological assessment: Vol. 2. Personality assessment
(pp. 315-342). Hoboken, NJ: Wiley.
Meyer, G. J., Finn, S. E., Eyde, L. D., Kay, G. G., Moreland, K. L., Dies, R. R., et al. (2001).
Psychological testing and psychological assessment: A review of evidence and issues. American
Psychologist, 56, 128-165.
Moretti, R. J., & Rossini, E. D. (2004). The Thematic Apperception Test (TAT). In M. Hersen
(Editor-in-Chief), M. J. Hilsenroth, & D. L. Segal (Vol. Eds.), Comprehensive handbook of psychological
assessment: Vol. 2. Personality assessment (pp. 356-371). Hoboken, NJ: Wiley.
Morgan, C. D., & Murray, H. A. (1935). A method for investigating fantasies: The Thematic
Apperception Test. Archives of Neurology and Psychiatry, 34, 289-306.
Morgan, W. G. (1995). Origin and history of Thematic Apperception Test images. Journal of
Personality Assessment, 65, 237-254.
Morgan, W. G. (2002). Origin and history of the earliest Thematic Apperception Test pictures. Journal
of Personality Assessment, 79, 422-445.
Morgan, W. G. (2003). Origin and history of the "Series B" and "Series C" TAT pictures. Journal of
Personality Assessment, 81, 133-148.
Murray, H. A. (1938). Explorations in personality: A clinical and experimental study of fifty men of
college age. New York: Oxford University Press.
Murray, H. A. (1940). What should psychologists do about psychoanalysis? Journal of Abnormal
and Social Psychology, 35, 150-175.
Murray, H. A. (1971). Thematic Apperception Test: Manual.
Cambridge, MA: Harvard University
Press. (Original work published 1943)
Murstein, B. I. (1963). Theory and research in projective
techniques (Emphasizing the TAT). New
York: Wiley.
Niec, L. N., & Russ, S. W. (2002). Children's internal
representations, empathy, and fantasy play: A
validity study of the SCORS-Q. Psychological Assessment, 14,
331-338.
Office of Strategic Services Assessment Staff. (1948).
Assessment of men. New York: Rinehart.
Ornduff, S. R., Freedenfeld, R. N., Kelsey, R. M., & Critelli, J. W. (1994). Object relations of sexually
abused female subjects: A TAT analysis. Journal of Personality Assessment, 63, 223-238.
Ornduff, S. R., & Kelsey, R. M. (1996). Object relations of sexually and physically abused female
children: A TAT analysis. Journal of Personality Assessment, 66, 91-105.
Pang, J. S., & Schultheiss, O. C. (2005). Assessing implicit motives in U.S. college students: Effects
of picture type and position, gender, and ethnicity, and cross-cultural comparisons. Journal of
Personality Assessment, 85, 280-294.
PDM Task Force. (2006). Psychodynamic diagnostic manual. Silver Spring, MD: Alliance of
Psychoanalytic Organizations.
Peters, E. J., Hilsenroth, M. J., Eudell-Simmons, E. M., Blagys, M. D., & Handler, L. (2006).
Reliability and validity of the Social Cognition and Object Relations scale in clinical use.
Psychotherapy Research, 16, 617-626.
Porcerelli, J. H., Cogan, R., Kamoo, R., & Leitman, W. (2004). Defense mechanisms and self-reported
violence toward partners and strangers. Journal of Personality Assessment, 82, 317-320.
Porcerelli, J. H., & Hibbard, S. (2004). Projective assessment of defense mechanisms. In M.
Hersen (Editor-in-Chief), M. J. Hilsenroth, & D. L. Segal (Vol. Eds.), Comprehensive
handbook of psychological assessment: Vol. 2. Personality assessment (pp. 466-475). Hoboken, NJ:
Wiley.
Porcerelli, J. H., Thomas, S., Hibbard, S., & Cogan, R. (1998).
Defense mechanism development in
children, adolescents, and late adolescents. Journal of
Personality Assessment, 71, 411-420.
Prince, M. (1906). The dissociation of a personality: A
biographical study in abnormal psychology.
New York: Longmans.
Quinnell, F. A., & Bow, J. N. (2001). Psychological tests used
in child custody evaluations. Behavioral
Sciences and the Law, 19, 491-501.
Ritzler, B. A. (2004). Cultural applications of the Rorschach, Apperception Tests, and figure drawings.
In M. Hersen (Editor-in-Chief), M. J. Hilsenroth, & D. L. Segal (Vol. Eds.), Comprehensive
handbook of psychological assessment: Vol. 2. Personality assessment (pp. 573-585). Hoboken,
NJ: Wiley.
Ritzler, B. A., Sharkey, K. J., & Chudy, J. F. (1980). A comprehensive projective alternative to the
TAT. Journal of Personality Assessment, 44, 358-362.
Roberts, G. E. (2006). Roberts-2 manual. Los Angeles: Western
Psychological Services.
Robinson, F. G. (1992). Love's story told: A life of Henry A.
Murray. Cambridge, MA: Harvard
University Press.
Sandstrom, M. J., & Cramer, P. (2003). Defense mechanisms and psychological adjustment in
childhood. Journal of Nervous and Mental Diseases, 191, 487-495.
Schretlen, D. J. (1997). Dissimulation on the Rorschach and
other projective measures. In R. Rogers
(Ed.), Clinical assessment of malingering and deception (2nd
ed., pp. 208-222). New York:
Guilford Press.
Shakespeare, W. (1947). The tragedy of Hamlet, Prince of Denmark. New Haven, CT: Yale University
Press. (Original work published 1604)
Sharkey, K. J., & Ritzler, B. A. (1985). Comparing diagnostic validity of the TAT and a new Picture
Projective Test. Journal of Personality Assessment, 49, 406-412.
Shneidman, E. S. (1951). Thematic test analysis. New York:
Grune & Stratton.
Shneidman, E. S. (1965). Projective techniques. In B. B. Wolman (Ed.), Handbook of clinical
psychology (pp. 498-521). New York: McGraw-Hill.
Smith, C. P. (Ed.). (1992). Motivation and personality: Handbook of thematic content analysis. New
York: Cambridge University Press.
Spangler, W. D. (1992). Validity of questionnaire and TAT measures of need for achievement: Two
meta-analyses. Psychological Bulletin, 112, 140-154.
Stein, M. I. (1948). The Thematic Apperception Test. Reading,
MA: Addison-Wesley.
Stein, M. I., & Gieser, L. (1999). The zeitgeists and events surrounding the birth of the Thematic
Apperception Test. In L. Gieser & M. I. Stein (Eds.), Evocative images: The Thematic
Apperception Test and the art of projection (pp. 15-22). Washington, DC: American Psychological
Association.
Stricker, G., & Gooen-Piels, J. (2004). Projective assessment of
object relations. In M. Hersen
(Editor-in-Chief), M. J. Hilsenroth, & D. L. Segal (Vol. Eds.),
Comprehensive handbook
of psychological assessment: Vol. 2. Personality assessment
(pp. 449-465). Hoboken, NJ:
Wiley.
Teglasi, H. (2001). Essentials of TAT and other storytelling techniques assessment. New York: Wiley.
Tomkins, S. S. (1947). The Thematic Apperception Test: The
theory and technique of interpretation.
New York: Grune & Stratton.
Vaillant, G. E. (1977). Adaptation to life. Boston: Little,
Brown.
Vaillant, G. E. (1994). Ego mechanisms of defense and personality psychopathology. Journal of
Abnormal Psychology, 105, 44-50.
Vane, J. R. (1981). The Thematic Apperception Test: A review. Clinical Psychology Review, 1,
319-336.
Weiner, I. B. (2003). Prediction and postdiction in clinical decision making. Clinical Psychology:
Science and Practice, 10, 335-338.
Weiner, I. B. (2005). Integrative personality assessment with
self-report and performance-based
measures. In S. Strack (Ed.), Handbook of personology and
psychopathology (pp. 317-331).
Hoboken, NJ: Wiley.
Weiner, I. B., & Kuehnle, K. (1998). Projective assessment of
children and adolescents. In A.
S. Bellack & M. Hersen (Eds.), Comprehensive clinical
psychology: Vol. 4. Assessment
(pp. 432-458). New York: Pergamon Press.
Westen, D. (1991). Social cognition and object relations.
Psychological Bulletin, 109, 429-455.
Westen, D. (1995). Social Cognition and Object Relations
Scale: Q-Sort for Projective Stories
(SCORS-Q). Unpublished manuscript, Harvard Medical School,
Cambridge, MA.
Westen, D., Lohr, N. E., Silk, K., Gold, L., & Kerber, K. (1990). Object relations and social
cognition in borderlines, major depressives, and normals: A Thematic Apperception Test analysis.
Psychological Assessment, 2, 355-364.
Westen, D., Lohr, N. E., Silk, K., Kerber, K., & Goodrich, S.
(1989). Object relations and social
cognition TAT scoring manual (4th ed.). Unpublished
manuscript, University of Michigan, Ann
Arbor.
Winter, D. G. (1998). Toward a science of personality psychology: David McClelland's development
of empirically derived TAT measures. History of Psychology, 1, 130-153.
Winter, D. G. (1999). Linking personality and "scientific" psychology: The development of
empirically derived Thematic Apperception Test measures. In L. Gieser & M. I. Stein (Eds.), Evocative
images: The Thematic Apperception Test and the art of projection (pp. 106-124). Washington,
DC: American Psychological Association.
Zubin, J., Eron, L. D., & Schumer, F. (1965). An experimental
approach to projective techniques.
New York: Wiley.
Chapter 11
RORSCHACH INKBLOT METHOD
The preceding five chapters have presented the most commonly
used self-report inventories
for assessing personality functioning. As noted in Chapter 1,
inventories of this kind differ
in several respects from performance-based personality
measures. Self-report inventories
provide direct assessments of personality characteristics in
which people are asked to
describe themselves by indicating whether certain statements
apply to them. Performance-
based measures are an indirect approach in which personality
characteristics are inferred
from the way people respond to various standardized tasks.
Self-report and performance-
based methods both bring advantages and limitations to the
assessment process, as discussed
in Chapters 1 and 2, and there are many reasons personality
assessments should ordinarily
be conducted with a multifaceted test battery that includes both
kinds of measures (see
pp. 13-15 and 22-26).
This and the following three chapters address the most widely
used performance-based
measures of personality functioning: the Rorschach Inkblot
Method (RIM), the Thematic
Apperception Test (TAT), figure drawing methods, and sentence
completion methods. These
and other performance-based personality measures have
traditionally been referred to as
projective tests and are still commonly labeled this way. As pointed out in concluding
Chapter 1, however, "projective" is not an apt categorization of these measures, and
contemporary assessment psychologists prefer more accurate descriptive labels for them such
as performance-based.
NATURE OF THE RORSCHACH INKBLOT METHOD
The Rorschach Inkblot Method (RIM) consists of 10 inkblots
printed individually on 6 %"
by 9 %" cards. Five of these blots are printed in shades of gray
and black (Cards I and
IV-VII); two of the blots are in shades of red, gray, and black
(Cards II and III); and the
remaining three blots are in shades of various pastel colors
(Cards VIII-X). In what is
called the Response Phase of a Rorschach examination, people
are shown the cards one
at a time and asked to say what they see in them. In the
subsequent Inquiry Phase of the
examination, persons being examined are asked to indicate
where in the blots they saw each
of the percepts they reported and what made those percepts look
the way they did.
These procedures yield three sources of data. First, the manner
in which people structure
their responses identifies how they are likely to structure other
situations in their lives.
People who base most of their responses on the overall
appearance of the inkblots and pay
little attention to separate parts of them are likely to be
individuals who tend to form global
impressions of situations and ignore or overlook details of these
situations. Conversely,
people who base most of their responses on parts of the blots
and seldom make use of
an entire blot are often people who become preoccupied with
the details of situations
and fail to grasp their overall significance, as in "not being able
to see the forest for the
trees." As another example of response structure, people who
report seeing objects that are
shaped similarly to the part of the blot where they are seeing
them are likely in general
to perceive people and events accurately, and hence to show
adequate reality testing. By
contrast, people who give numerous perceptually inaccurate
responses that do not resemble
the shapes of the blots are prone in general to form distorted
impressions of what they see,
and hence to show impaired reality testing.
As a second source of data, Rorschach responses frequently
contain content themes
that provide clues to a person's underlying needs, attitudes, and
concerns. People who
consistently describe human figures they see in the inkblots as
being angry, carrying
weapons, or fighting with each other may harbor concerns that
other people are potentially
dangerous to them, or they may view interpersonal relationships
as typified by competition
and strife. Conversely, a thematic emphasis on people described
as friendly, as carrying
a peace offering, or as helping each other in a shared endeavor
probably reveals a sense
of safety in interpersonal relationships and an expectation that
people will interact in
collaborative ways. In similar fashion, recurrent descriptions of
people, animals, or objects
seen in the blots as being damaged or dysfunctional (e.g., "a
decrepit old person"; "a
wounded bug"; "a piece of machinery that's rusting away") may
reflect personal concerns
about being injured or defective in some way, or about being
vulnerable to becoming injured
or defective.
The third source of data in a Rorschach examination consists of the manner in which
individuals conduct themselves and relate to the examiner, which provides behavioral
indications of how they are likely to deal with task-oriented and interpersonal situations. Some
of the behavioral data that emerge during a Rorschach examination resemble observations
that clinicians can make whenever they are conducting interview or test assessments.
Whether people being assessed seem deferential or antagonistic toward the examiner may
say something about their attitudes toward authority. Whether they appear relaxed or nervous
may say something about how self-confident and self-assured they are and about how
they generally respond to being evaluated.
The RIM also provides some test-specific behavioral data in the
form of how people
handle the cards and how they frame their responses. Do they
carefully hand each card back
to the examiner when they are finished responding to it, or do
they carelessly toss the card
on the desk? Do they give definite responses and take
responsibility for them (as in "This
one looks to me like a bat"), or do they disavow responsibility
and avoid commitment (as
in "It really doesn't look like anything to me, but if I have to
say something, I'd say it might
look something like a bat")?
To summarize this instrument, then, the RIM involves each of
the following three tasks:
1. A perceptual task yielding structural information that helps to
identify personality
states and traits
2. An associational task generating content themes that contain
clues to a person's
underlying needs and attitudes
3. A behavioral task that provides a representative sample of an
individual's orientation
to problem-solving and interpersonal situations
In parallel to these three test characteristics, Rorschach
assessment measures personality
functioning because the way people go about seeing things in
the inkblots reflects how they
look at their world and how they customarily make decisions
and deal with events. What
they see in the inkblots provides a window into their inner life
and the contents of their
mind, and how they conduct themselves during the examination
provides information about
how they usually respond to people and to external demands.
By integrating these structural, thematic, and behavioral
features of the data, Rorschach
clinicians can generate comprehensive personality descriptions
of the people they examine.
These descriptions typically address adaptive strengths and
weaknesses in how people
manage stress, how they attend to and perceive their
surroundings, how they form concepts
and ideas, how they experience and express feelings, how they
view themselves, and how
they relate to other people. Later sections of this chapter
elaborate the codification, scoring,
and interpretation of Rorschach responses and delineate how
Rorschach-based descriptions
of personality characteristics facilitate numerous applications of
the instrument. As a further
introduction to these topics and to the psychometric features of
the RIM, the next two
sections of the chapter review the history of Rorschach
assessment and standard procedures
for administering the instrument.
HISTORY
Of the personality assessment instruments discussed in this text,
the Rorschach Inkblot
Method has the longest and most interesting history because it
was shaped by diverse
personal experiences and life events. The inkblot method first
took systematic form in the
mind of Hermann Rorschach, a Swiss psychiatrist who lived
only 37 years, from 1885
to 1922. As a youth, Rorschach had been exposed to inkblots in
the form of a popular
parlor game in turn-of-the-century Europe called
Klecksographie. Klecks is the German
word for "blot," and the Klecksographie game translates loosely
into English as "Blotto."
The game was played by dropping ink in the middle of a piece
of paper, folding the paper
in half to make a more or less symmetrical blot, and then
competing to see who among
the players could generate the most numerous or interesting
descriptions of the blots or
suggest associations to what they resembled. According to
available reports, Rorschach's
enthusiasm for this game, which appealed to adolescents as well
as adults, and his creativity
in playing it led to his being nicknamed "Klex" by his high
school classmates (Exner, 2003,
chap. 1).
From 1917 to 1919, while serving as Associate Director of the
Krombach Mental
Hospital in Herisau, Switzerland, Rorschach pursued a notion he
had formed earlier in
his career that patients with different types of mental disorders
would respond to inkblots
differently from each other and from psychologically healthy
people. To test this notion, he
constructed and experimented with a large number of blots, but
these were not the accidental
ink splotches of the parlor games. Rorschach was a skilled
amateur artist who left behind
an impressive portfolio of drawings that can be viewed in the
Rorschach Archives and
Museum in Bern, Switzerland. The blots with which he
experimented were carefully drawn
by him, and over time he selected a small set that seemed
particularly effective in eliciting
responses and reflecting individual differences.
Rorschach then administered his selected set of blots to samples
of 288 mental hospital
patients and 117 nonpatients, using a standard instruction,
"What might this be?"
Rorschach published his findings from this research in a 1921
monograph titled Psychodiagnostics
(Rorschach, 1921/1942). The materials and methods
described by Rorschach
in Psychodiagnostics provided the basic foundation for the
manner in which Rorschach
assessment has been most commonly practiced since that time,
and the standard Rorschach
plates used today are the same 10 inkblots that were published
with Rorschach's original
monograph.
Rorschach's monograph was nevertheless a preliminary work,
and he was just beginning
to explore potential refinements and applications of the inkblot
method when he succumbed
a year after its publication to peritonitis, following a ruptured
appendix. The monograph
itself did not attract much attention initially, and the method
might have succumbed along
with its creator were it not for the efforts of a few close friends
and colleagues of Rorschach
who were devoted to keeping the method alive. Their efforts
were facilitated by the fact that
Switzerland in the 1920s was a Mecca for medical scientists and
researchers, who visited
from many parts of the world to study with famous physicians at
Swiss hospitals and
medical schools. Some of these visiting scholars and
practitioners heard about Rorschach's
method while they were in Switzerland and took copies of the
inkblots home with them.
As a result, articles on the Rorschach were published during the
1920s in such diverse
countries as Russia, Peru, and Japan.
Turning to how the Rorschach came to the United States, an
American psychiatrist named
David Levy went to Zurich in the mid-1920s to study for a year
with Emil Oberholzer, a
prominent psychoanalyst who had been one of Rorschach's good
friends and supporters.
Levy returned to the United States with several copies of the
inkblots, and that is how the
Rorschach came to America. Levy's interests lay elsewhere, and
the Rorschach materials
languished for a time in his desk at the New York Institute of
Guidance. Then, in 1929,
Samuel Beck, a graduate student at Columbia University who
was doing a fellowship at the
Institute, mentioned to Levy that he was looking for a
dissertation topic. Levy told Beck
about the Rorschach materials he had brought back from
Switzerland and suggested that
Beck might do a research project with them. Acting on this
suggestion, Beck earned his
doctorate with a Rorschach standardization study of children.
While collecting his data,
Beck published the first two English language articles on the
method in 1930 (Beck, 1930a,
1930b). He followed these articles 7 years later with
Introduction to the Rorschach Method,
which was the first English language monograph on the
Rorschach, and in 1944 with the
first edition of his basic text, Rorschach's Test: I. Basic
Processes (Beck, 1937, 1944).
Throughout a long, productive career, Beck remained an
influential figure in Rorschach
assessment, and his contributions became internationally known
and respected.
In 1934, Beck went to Switzerland for a year's study with
Oberholzer, and his departure
coincided with the arrival from Zurich of another Rorschach
pioneer, Bruno Klopfer.
Klopfer had received a doctorate in educational psychology in
1922 and by 1933 had
advanced to a senior staff position at the Berlin Information
Center for Child Guidance. He
also had become interested in Jungian psychoanalytic theory
and was in the final phases
of completing training as a Jungian analyst. However, the
restrictions being placed on
Jews in Adolf Hitler's Germany at that time led Klopfer to an
advisedly dim view of his
future professional prospects in Berlin, and he decided to move
to Zurich. Without a job
in Zurich, he was helped by Carl Jung to obtain a position as a
technician at the Zurich
Psychotechnic Institute. Klopfer's responsibilities at the
Institute included psychological
testing of applicants for various jobs, and the Rorschach was
among the tests he was
required to use for this purpose. He had no previous interest or
experience in testing, but
he soon became intrigued with the ways in which Rorschach
responses could reveal the
underlying thoughts and feelings of the people he was testing.
Klopfer was dissatisfied with his low status as a technician and
soon began looking for
other opportunities. His search resulted in his being appointed
as a research associate in the
Department of Anthropology at Columbia University, where he
began working in 1934.
Having learned of his arrival on campus, a group of psychology
graduate students asked their
department to arrange for Klopfer to give them some Rorschach
training. Unimpressed with
Klopfer's credentials, the department declined to hire him for
this purpose. The students
were not deterred, however, and they approached Klopfer
privately about offering some
evening seminars for them in his home, which he agreed to do.
Giving these seminars for this and subsequent groups of
students and professionals
produced a network of Klopfer-trained psychologists who were
eager to keep in touch
with each other and continue exchanging ideas about the
Rorschach. In response to this
interest, Klopfer in 1936 founded the Rorschach Research
Exchange, which has been
published regularly since that time and evolved into the
contemporary Journal of Personality
Assessment. In 1938, Klopfer founded the Rorschach Institute, a
scientific and professional
organization that continues to function actively today, and more
broadly than Klopfer
envisioned, as the Society for Personality Assessment. Klopfer's
first Rorschach book, The
Rorschach Technique, appeared in 1942, but it was not until
1954 that he published his
definitive basic text, Developments in the Rorschach Technique:
Volume 1. Technique and
Theory (Klopfer, Ainsworth, Klopfer, & Holt, 1954; Klopfer &
Kelley, 1942).
Because one of them needed a dissertation topic and the other
needed a job, then,
these two Rorschach pioneers were drawn into a lifetime
engagement with the inkblot
method. Like Beck, Klopfer gained international acclaim for his
teaching and writing
about Rorschach assessment. Regrettably for the development of
the instrument, Beck and
Klopfer approached their work from very different perspectives.
Having been educated in
an experimentally oriented department of psychology, Beck was
interested in describing
personality characteristics and was firmly committed to
advancing knowledge through
controlled research designs and empirical data collection. He
stuck closely to Rorschach's
original procedures for administration and coding, and he
favored a primarily quantitative
approach to Rorschach interpretation. With respect to the
distinction between nomothetic
and idiographic approaches in personality assessment discussed
in Chapters 1 (pp. 12-13)
and 2 (p. 34), Beck was very much in the nomothetic camp.
Klopfer, on the other hand, was a Jungian analyst at heart and
an enthusiast for idiography.
He had a strong interest in symbolic meanings and in
unraveling the phenomenology of
each person's human experience. He employed qualitative
approaches to interpretation that
Beck considered inappropriate, and he added many new
response codes and summary scores
on the basis of imaginative ideas rather than research data,
which Beck found unacceptable.
These differences in perspective led Beck and Klopfer to
formulate and promulgate
distinctive Rorschach systems that involved dissimilar
approaches to administering, scoring,
and interpreting the test. Divergence in method did not stop
with these two pioneers,
however. In the early 1930s, Beck talked about his Rorschach
research with Marguerite
Hertz, the wife of an old friend of his, who was working on her
doctorate in psychology
at Western Reserve University in Cleveland. Hertz became an
ardent enthusiast for the
value of Rorschach assessment, especially in working with
children. She developed some
distinctive variations of her own in Rorschach administration,
scoring, and interpretation,
and, in the course of a long and productive life as a university
professor, she taught her
approach to many generations of graduate students and
workshop participants.
Klopfer's first seminar group included several psychology
graduate students and a friend
of one of these students who had encouraged him to sit in. This
friend was Zygmunt
Piotrowski, who at the time was a postdoctoral fellow at the
Neuropsychiatric Institute in
New York. Piotrowski had received a doctorate in experimental
psychology in Poland
in 1927 and was in the United States for advanced study in
neuropsychology. Aside
from curiosity, he had little interest in Rorschach assessment
when he joined Klopfer's
seminar group. However, he soon began to contemplate the
possibility that persons with
various kinds of neurological disorders might respond to the
inkblots in ways that would
help identify their condition. Piotrowski subsequently pioneered
in conducting Rorschach
research with brain-injured patients, and he developed many
creative ideas about how the
inkblot method should be conceived, coded, and interpreted.
These new ideas coalesced
into a Rorschach system that Piotrowski called Perceptanalysis
(Piotrowski, 1957). Like
Beck, Klopfer, and Hertz, Piotrowski worked productively
throughout a long life during
which his courses, publications, and lectures introduced a loyal
following to his particular
Rorschach system.
This early history of the Rorschach in America came to a close
with the arrival in
the United States of another refugee from Europe, David
Rapaport, a psychoanalytically
oriented doctoral-level psychologist who fled his native
Hungary in 1938. In 1940, Rapaport
joined the staff of the Menninger Foundation in Topeka,
Kansas, where 2 years later he
became head of the psychology department. His responsibilities
at the Foundation included
mounting a research project to evaluate the utility of a battery
of psychological tests for
describing people and facilitating differential diagnosis. The
Rorschach was part of this
test battery, and Rapaport's collaborators in the project included
Roy Schafer, who was an
undergraduate psychology student at the time and completed his
doctoral studies several
years later at Clark University, after moving from the
Menninger Foundation to the Austen
Riggs Center in Massachusetts (see Schafer, 2006).
Rapaport's psychoanalytic perspectives and many original ideas
that he and Schafer
formed about how to elicit and interpret Rorschach responses
resulted in their using a
modified inkblot method that differed substantially from any of
the previous methods.
Publication of a 2-volume treatise based on the Menninger
research project and subsequent
influential books by Schafer established the Rapaport/Schafer
system as another alternative
for practitioners and researchers to consider in their work with
the Rorschach (Rapaport,
Gill, & Schafer, 1946/1968; Schafer, 1948, 1954).
By 1950, then, there were five different Rorschach systems in
the United States, each
with its own adherents. Moreover, even though the Beck and
Klopfer systems had become
well-known abroad, the Rorschach landscape also included
distinctive systems developed
in other countries and popular among psychologists in Europe,
South America, and Japan.
This diversity of method made it difficult for Rorschach
practitioners to communicate with
each other and almost impossible for researchers to cumulate
systematic data concerning
the reliability of Rorschach findings and their validity for
particular purposes. This problem
persisted until the early 1970s, when John Exner undertook to
resolve it by standardizing
the Rorschach method in a conceptually reasonable and
psychometrically sound manner.
Having conducted a detailed comparative analysis of the five
American systems (Exner,
1969), Exner instituted a research program to measure the
impact of the different methods
of administration used in the systems and to identify which of
their response codes could be
explained clearly and coded reliably. Drawing on what appeared
to be the best features of
each of the five American systems, Exner combined them into a
Rorschach Comprehensive
System (CS) that he published in 1974 (Exner, 1974).
The Rorschach CS provides specific and detailed instructions
for administration and
coding that are to be followed in exactly the same way in every
instance. Now in its
fourth edition (Exner, 2003), the CS has become by far the most
frequently used Rorschach
system in the United States as well as in many other countries
of the world. Widespread
adoption of the CS standardization has made possible the
development of large sample
normative standards and international collaboration in
examining cross-cultural similarities
and differences in Rorschach responses. The cross-cultural
applicability of Rorschach
assessment has provided a unique large-scale opportunity to
compare and understand
different cultures from all over the world (see Shaffer, Erdberg,
& Meyer, 2007).
Standard Rorschach procedures have also fostered systematic
collection and comparison
of data concerning intercoder agreement, retest reliability, and
criterion, construct, and
incremental validity, both in the United States and abroad,
which are reviewed later in the
chapter. The advent of the CS has additionally allowed
clinicians who use it to exchange
information about Rorschach findings with confidence that
these findings are based on the
same method of obtaining and codifying the data. The next two
sections of the chapter
provide an overview of the CS administration and coding
procedures.
ADMINISTRATION
To preserve standardization for the reasons just mentioned,
Rorschach examiners should
follow as closely as possible the administration and coding
procedures delineated for the
CS by Exner (2003). Prior to beginning the testing, as discussed
in Chapter 2, the examiner
should have discussed with the person being evaluated such
matters as the purposes of the
assessment and how and to whom the results will be
communicated. People are entitled
to information about these matters, and even a brief discussion
of them can be helpful in
establishing rapport, reducing concerns the person may have
about being examined, and
clarifying misconceptions about the testing process. Typically,
the RIM is part of a test
battery that can be introduced in general terms such as the
following: "As for the tests
we're going to do, I'll be asking you questions about various
matters and giving you some
tasks to do; let's get started, and I'll show you what each of
these tests is like as we do
them."
In preparing to administer the RIM, the examiner should have
the cards face down in a
single pile where they can be seen but not easily reached by the
examinee. The examiner
should also sit alongside the person or at an angle that is at
least slightly behind the examinee
and out of the person's direct line of vision. This arrangement
makes it easy for people to
show the examiner where on the blots they are seeing their
percepts. Avoiding face-to-face
administration also minimizes the possible influence on test
responses of an examiner's
facial expressions or other bodily movements. The Rorschach
administration should begin
with the following type of explanation:
The next test we're going to do is one you may have heard of.
It's often referred to as the
inkblot test, and it's called that because it consists of a series of
cards with blots of ink on
them. The blots aren't anything in particular, but when people
look at them, they see different
things in them. There are 10 of these cards, and I'm going to
show them to you one at a time
and ask you what kinds of things you see in them and what they
look like to you.
No further explanation should routinely be given of Rorschach
procedures or of what can
be learned from Rorschach responses. Should examinees ask,
"How does this test work?"
they can be told the following: "The way people look at things
says something about what
they are like as a person, and this test will give us information
about your personality
that should be helpful in ... [some reference to the purpose of
the examination]." Should
examinees say something on the order of "So this will be a test
of my imagination" or "You
want me to tell you what they remind me of?" the perceptual
elements of the Rorschach
task should be emphasized by indicating otherwise: "No, this is
a test of what you see
in the blots, and I want you to tell me what they look like to
you." If there are no such
questions or comments that examiners must answer first, they
should proceed directly after
their explanation by handing the person Card I and saying,
"What might this be?"
People will usually take Card I when it is handed to them and
should be asked to do so if
necessary. Having people hold the cards promotes their
engagement in the Rorschach task,
and, as mentioned, the manner in which they handle the cards
can be a source of useful
behavioral data. In other respects, the individual's task during
the Response Phase of the
administration should be left as unstructured as possible. In
response to questions ("How
many responses should I give?" "Can I turn the card?" "Do I use
the whole thing or parts
of it as well?"), examiners should provide noncommittal replies
("It's up to you"; "Any
way you wish"). Should the person begin by saying "It's an
inkblot," the examiner should
restate the basic instruction: "Yes, that's right, but what you
need to do is tell me what it
looks like to you, what kinds of things you see in it."
Occasionally, some additional procedures may be necessary to
obtain a record of sufficient
but manageable length. A minimum of 14 responses is
required to ensure the validity
of a Rorschach protocol. Records with fewer than 14 responses
are too brief to be entirely
reliable and rarely support valid interpretations. To decrease the
risk of ending up with
a record of insufficient length, persons who give only one
response to Card I should be
prompted by saying, "If you look at it some more, you'll see
other things as well." If
the person still does not produce more than one response, the
single response should be
accepted and the card taken back. However, individuals who
have given just one or two
responses to Card I, and then handed back or put down Cards II,
III, or IV after only a
single response, can be offered the following indirect
encouragement, should they seem
disengaged from their task and on their way to producing a brief
record with fewer than 14
responses: "Wait, don't hurry through these; we're in no hurry,
take your time." Should the
Response Phase for all 10 cards yield fewer than 14 responses,
despite such prompting and
encouragement, the examiner should implement the following
instructions:
Now you know how it's done. But there's a problem. You didn't
give enough answers for us
to learn very much from the test. So let's go through them again,
and this time I'd like you to
give me more responses. You can include the same ones you've
already given, if you like, but
give me more answers this time through.
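The safeguards just described against ending up with a brief record can be summarized in a short decision sketch. The Python below only restates the counts given in the text: a prompt after a single response to Card I, and a repeat of the Response Phase when all 10 cards yield fewer than 14 responses. The function and constant names are hypothetical and are not part of the Comprehensive System. The sketch also leaves out the examiner's judgment about when the indirect encouragement on Cards II through IV is appropriate.

```python
from typing import Sequence

# Values taken from the text; the names themselves are illustrative assumptions.
MIN_RESPONSES = 14   # minimum record length for an interpretable protocol
NUM_CARDS = 10       # number of cards in the standard set


def needs_card_one_prompt(card_one_responses: int, already_prompted: bool) -> bool:
    """Return True when the examinee has given only one response to Card I
    and the examiner has not yet used the prompt to look some more."""
    return card_one_responses == 1 and not already_prompted


def record_is_too_brief(responses_per_card: Sequence[int]) -> bool:
    """Return True when a completed Response Phase (all 10 cards) has fewer
    than 14 responses, so the cards should be gone through again."""
    if len(responses_per_card) != NUM_CARDS:
        raise ValueError("expected one response count for each of the 10 cards")
    return sum(responses_per_card) < MIN_RESPONSES
```

For example, an examinee who gives exactly one response to each card has a total of 10 responses, so record_is_too_brief would return True and the examiner would read the instructions quoted above.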
There is also a standard procedure for not taking more
responses than are necessary for
interpretive purposes. If a person has given five responses to
Card I and appears about to
give more, the examiner should take the card back while saying,
"Okay, that's fine, let's
go on to the next one." This procedure can be repeated on each
subsequent card, should
the person continue to give five responses and appear ready to
give more. However, if
on any card the person gives fewer than five responses, the
limiting procedure should be
discontinued and not resumed, even if the person later on gives
more than five responses
to some card. Exner (2003, pp. 52-56) identifies some unusual
circumstances that might
warrant departing from these standard guidelines for increasing
or curtailing response total,
but the procedures presented here suffice with few exceptions to
direct the Response Phase
of the administration.
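The limiting procedure described above depends on the order of events: once the person gives fewer than five responses to any card, limiting stops and is not resumed. The Python below is a minimal sketch of that rule under a simplifying assumption, namely that the number of responses the person would give to each card if never interrupted is known in advance. The function name and its input are hypothetical and serve only to illustrate the rule. The sketch does not cover the unusual circumstances that Exner (2003) discusses.

```python
from typing import List, Sequence

LIMIT = 5  # responses accepted per card while the limiting procedure is in force


def recorded_counts(intended_per_card: Sequence[int]) -> List[int]:
    """Given the number of responses an examinee would give to each card if
    never interrupted (a hypothetical input), return the number actually
    recorded under the limiting rule described in the text."""
    limiting_active = True  # limiting can apply from Card I onward
    recorded = []
    for intended in intended_per_card:
        if limiting_active and intended > LIMIT:
            # Card is taken back after the fifth response.
            recorded.append(LIMIT)
        else:
            recorded.append(intended)
            if intended < LIMIT:
                # Fewer than five responses: discontinue and do not resume.
                limiting_active = False
    return recorded
```

With intended counts of 7, 6, 3, and 8 on Cards I through IV, the sketch records 5, 5, 3, and 8: the first two cards are limited, Card III ends the limiting, and Card IV is then left uncurtailed even though it exceeds five responses.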
Of additional importance in conducting both the Response Phase
and the subsequent
Inquiry Phase is verbatim recording of whatever the examiner
and the examinee say.
Accurate coding and thorough interpretation depend on having a
complete account of
exactly how people expressed themselves and precisely what
they were told or asked by the
examiner. Most examiners rely on a system of abbreviations to
simplify the task of recording
a verbatim protocol; for example, using "ll a bfly" for "Looks
like a butterfly" or "enc" to
indicate when they have used the encouragement prompt after
getting only one response on
Card I. Some examiners tape-record Rorschach administrations
to ensure preservation of
the verbatim record. Whatever means is used, adequate
Rorschach administration demands
maintaining the integrity of the raw data. To this end, examiners
should write down how
examinees behave during the administration as well as what
they say (e.g., "laughed," "big
sigh," "detached, looking at ceiling") to provide the behavioral
data that enrich Rorschach
interpretation.
Following completion of the Response Phase, the examiner
should introduce the Inquiry
Phase of the administration as follows:
Now I want to take a moment to go through these cards with
you again, so that I can see the
things you saw. I'll read back each of the things you said, and
for each one I'd like you to tell
me where you saw it and what made it look like that to you.
The examiner should then hand the cards to the person one at a
time, say for each
response something on the order of "On this one you saw ..." or
"Then you said ... "
or "Next there was ... ," and then complete this statement with a
verbatim reading of the
person's exact words. Nondirective prompts should then be used
as necessary to help people
comply with the inquiry instructions by clarifying what they
have seen, where on the blot
they saw it, and why it looked as it did to them. With respect to
what the person has seen,
appropriate prompts would include such statements and
questions as "I'm not sure what it
is you're seeing there," "Is it the whole person or just part of the
person?" or "You said it
could be a butterfly or a moth; which does it look more like to
you?"
To inquire about where the person has seen a percept, the
examiner might ask, "How
much of the blot is included in it?" or say, "You mentioned a
head and a tail, and I'm not
clear which part of the blot is which." Should the response to
such questions or statements
leave unclear where a percept has been seen, examinees should
be asked to outline with
their finger the area of the blot they were using for it. Inquiry
about what made a percept
look as it did can take the form of such questions as "What
made it look like that to you?"
"What helped you see it that way?" or "What about the blot
suggested that to you?" In
each of these aspects of the Inquiry Phase, examiners should
strive as much as possible
to eliminate ambiguity concerning the what, where, and why of
a response, because such
ambiguities in responses are the main source of uncertainty in
deciding how to code
them.
As these nondirective questions and statements illustrate, a
paramount principle of conducting
a Rorschach inquiry is to avoid leading the examinee or
providing clues to what
may be expected or desired. For example, "Are the people doing
anything?" and "Did the
color help you see it that way?" are inappropriate questions,
because they can convey that
movement and color are important for the person to note. Such
messages can influence
individuals to articulate more movement or color determinants
during the course of an
inquiry than they would have otherwise. As a similar precaution
against conveying unintended
messages, examiners should avoid the question
"Anything else?" Asking "Anything
else?" can suggest that more is expected from the person, or that
something has been left
out, either of which can lead individuals to say more than they
would have otherwise and
thereby detract from the standardization of the administration.
A second guiding principle in conducting the Inquiry Phase
concerns its basic purpose,
which is to enable accurate coding of the response. With this
principle in mind, examiners
should stop inquiring about a response once they have obtained
enough information to
code it. For example, "Two people standing there" is clearly a
human movement response
that, as indicated in the next section, warrants coding an M. It is
neither necessary nor
appropriate to ask, "What makes it look like two people
standing there?" The additional
question in this instance would not generate any information
necessary to code an M.
Asking such unnecessary questions violates CS standardization
and may have the unwanted
consequence of eliciting response elaborations that, however
interesting, would not have
occurred if standard procedures had been followed.
Should a person report, "Two funny-looking people picking up a
basket," there is no
need to inquire about the human movement, but two other
inquiry questions would be called
for: "What suggests that the people are funny-looking?" and
"What helped you see this part
as a basket?" The first question illustrates the importance of
inquiring about key words in an
individual's responses, particularly nouns, adjectives, verbs, and
adverbs that give responses
a potentially distinctive flavor. Consider the following
examples, in which each bracketed inquiry targets the key
words: "Two witches dancing" [Inquiry: What suggests they
are witches?]; "Two old
people dancing" [Inquiry: What makes them look like old
people?]; "Two people arguing or
fighting" [Inquiry: What helps you see them as arguing or
fighting?]; "Two people walking
along slowly" [Inquiry: What gave you the idea that they're
walking slowly?]. The second
question illustrates the importance of inquiring about each part
of a complex response. Thus
"Animals climbing a tree" requires clarifying the where and the
why for both the animals
and the tree, "A jet plane with exhaust coming out the back"
must be inquired sufficiently
to code both the plane and the exhaust, and so on.
CODING AND SCORING
The scoring of a Rorschach protocol is a two-step process. The
first step consists of
assigning each response a set of codes that identify various
features of how the response
has been formulated and expressed. The second step consists of
combining these response
Rorschach Inkblot Method 395
This guideline does not preclude person-specific features of
card pull that may influence
a person's behavior or responses on Card IX. The popular
human figures may in some
instances pull an impression that they are fighting, in which
case Card IX could arouse
some concerns about aggression. Similarly, the resemblance of
the lower middle red detail
of Card IX to female genitals could evoke some sexual concerns
that affect a person's
manner and responses while looking at this card. Neither of
these possible Card IX pulls is
as strong or common as the other card pulls identified in this
section.
Card X
The broken appearance of Card X and its array of loosely
connected but rather sharply
defined and colored details give it a close structural
resemblance to Card VIII. At the same
time, the sheer number of variegated shapes and colors on Card
X imbues it with the same
type of uncertainty and complexity posed by Card IX. Although
Card X is usually seen
as a pleasant stimulus and offers examinees many alternative
possibilities for easily seen
percepts, the challenge of organizing it effectively makes it the
second most difficult card
to manage, after Card IX. Particularly for people who feel
overwhelmed or overburdened
by having to deal with many things at once, responding to Card
X, despite its pleasant
appearance and bright colors, may be a disconcerting experience
that they dislike and are
happy to complete.
Finally of note is the position of Card X as the final card. Just
as the initial response in a
record may be a way for people to sign in and introduce what
they feel is important about
themselves, the last response may serve as an opportunity to
sign out by indicating, in effect,
"When all is said and done, this is where things stand for me
and what I want you to know
about me." As a parallel to the example given earlier of a sign-
in response, consider the
contrasting implications of the following responses for the
present status of two depressed
persons. The first one concluded Card X by saying, "And it
looks like everything is falling
apart"; the second one concluded, "And it's brightly colored,
like the sun is coming up."
APPLICATIONS
In common with the self-report inventories presented in
Chapters 6 through 10, the RIM is an
omnibus personality assessment instrument, in the sense that it
provides information about
a broad range of personality characteristics. As elaborated in
discussing the interpretive
significance of Rorschach findings, these data shed light on the
adequacy of a person's
adaptive capacities in several key respects, on the types of
psychological states and traits
that define what the person is like, and on the underlying needs,
attitudes, conflicts, and
concerns that may be influencing the person's behavior. Such
information about personality
functioning serves practical purposes by helping to identify (a)
the presence and nature of
psychological disorder, (b) whether a person needs and is likely
to benefit from various
kinds of treatment, and (c) the probability of a person's
functioning effectively in certain
kinds of situations.
By serving these purposes, the RIM frequently facilitates
making decisions that are
based in part on personality characteristics. Such personality-
based decisions commonly
characterize the practice of clinical, forensic, and
organizational psychology, the three
contexts in which Rorschach assessment finds its most frequent
applications.
Clinical Practice
Rorschach assessment contributes to clinical practice by
assisting in differential diagnosis,
treatment planning, and outcome evaluation. With
respect to differential diagnosis,
many states and traits identified by Rorschach variables are
associated with particular forms
of psychopathology. Schizophrenia is usually defined to include
disordered thinking and
poor reality testing, and Rorschach evidence of these cognitive
impairments (low XA%
and WDA%, an elevated WSum6) accordingly indicates the
likelihood of a schizophrenia
spectrum disorder. Similarly, because paranoia involves being
hypervigilant and interper-
sonally aversive, a positive HVI suggests the presence of
paranoid features in how people
look at their world. Depressive disorder is suggested by
Rorschach indices of dysphoria
(elevated C', Col-Shd Blds) and negative self-attitudes
(elevated V, low 3r+(2)/R), obsessive-compulsive
personality disorder is suggested by indices of
pedantry and perfectionism
(positive OBS), and so on. To learn more about these and other
applications of Rorschach
findings in differential diagnosis, readers are referred to articles
and books by Hartmann,
Nørbech, and Grønnerød (2006), Huprich (2006), Kleiger
(1999), and Weiner (2003b).
The applications to which the RIM contributes by measuring
personality characteristics
identify its limitations as well. In assessing psychopathology,
Rorschach data are of little use
in determining the particular symptoms a person is manifesting.
Someone with Rorschach
indications of an obsessive-compulsive personality style may be
a compulsive hand washer,
an obsessive prognosticator, or neither. Someone with
depressive preoccupations may be
having crying spells, disturbed sleep, or neither. There is no
isomorphic relationship between
the personality characteristics of disturbed people and their
specific symptoms. Accordingly,
the nature of these symptoms is better determined from
observing or asking directly about
them than by speculating about their presence on the basis of
Rorschach data.
Likewise, Rorschach data do not provide dependable indications
concerning whether a
person has had certain life experiences (e.g., been sexually
abused) or behaved in certain
ways (e.g., abused alcohol or drugs). Only when there is a
substantial known correlation
between specific personality characteristics and the likelihood
of certain experiences or be-
havior having occurred can Rorschach findings provide reliable
postdictions, as mentioned.
The predictive validity of Rorschach findings is similarly
limited by the extent to which
personality factors determine whatever is to be identified or
predicted.
As for treatment planning, Rorschach findings measure
personality characteristics that
have a bearing on numerous decisions that must be made prior
to and during an intervention
process. The degree of disturbance or coping incapacity
reflected in Rorschach responses
assists in determining whether a person requires inpatient care
or is functioning sufficiently
well to be treated as an outpatient. Considered together with the
person's preferences, the
personality style and severity of distress or disorganization
revealed by Rorschach findings
help indicate whether treatment needs will best be met by a
supportive approach oriented
to relieving distress, a cognitive-behavioral approach designed
to modify symptoms or
behavior, or an exploratory approach intended to enhance self-
understanding. Whichever
treatment approach is implemented, the maladaptive personality
traits and the underlying
concerns identified by the Rorschach data can help therapists
determine, in consultation
with their patients, what the goals for the treatment should be
and in what order these
treatment targets should be addressed (see Weiner, 2005b).
Some predictive utility derives from the fact that certain
personality characteristics mea-
sured by Rorschach variables are typically associated with
ability to participate in and
benefit from psychological treatment. These personality
characteristics include being open
to experience (Lambda not elevated), cognitively flexible
(balanced a:p), emotionally re-
sponsive (adequate WSumC and Afr), interpersonally receptive
(presence of T, adequate
SumH), and personally introspective (presence of FD), each of
which facilitates engage-
ment and progress in psychotherapy. By contrast, having an
avoidant or guarded approach
to experience, being set in one's ways, having difficulty
recognizing and expressing one's
feelings, being interpersonally aversive or withdrawn, and
lacking psychological minded-
ness are often obstacles to progress in psychotherapy (Clarkin &
Levy, 2004; Weiner, 1998,
chap. 2).
In a research project relevant to the utility of the RIM in
guiding therapist activity
once treatment is underway, Blatt and Ford (1994) used
Rorschach variables to assist in
categorizing patients as having problems primarily with forming
satisfying interpersonal
relationships (called anaclitic) or primarily with maintaining
their own sense of identity,
autonomy, and self-worth (called introjective). In the course of
their subsequent psychother-
apy, the anaclitic patients studied by Blatt and Ford were
initially more involved in and
responsive to relational aspects of the treatment than the
introjective patients, who were
more attuned to and influenced by their therapist's
interpretations than by attention to the
treatment relationship.
By helping to identify treatment goals and targets, Rorschach
assessment can also
be helpful in monitoring treatment progress and evaluating
treatment outcome. Suppose
that a RIM is administered prior to beginning therapy and
certain treatment targets can
be identified in Rorschach terms (e.g., reducing subjectively felt
distress, as in changing
D < 0 to D = 0; increasing receptivity to emotional arousal, as
in bringing up a low Afr;
promoting more careful problem solving, as in reducing a Zd < -3.5).
Retesting after
some period of time can then provide quantitative indications of
how much progress has
been made toward achieving these goals and how much work
remains to be done on them.
Rorschach evidence concerning the extent to which the goals of
the treatment have been
achieved can guide therapists in deciding if and when
termination is indicated. Similarly,
comparing Rorschach findings at the point of termination or in a
later follow-up evaluation
with those obtained in a pretreatment evaluation will provide a
useful objective measure of
the effects of the treatment, for better or worse.
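The retest comparison just described can be sketched in a few lines of Python. This is an illustrative sketch only, not a scoring procedure from any Rorschach system: the function name `fraction_of_goal_reached`, the sample scores, and the idea of expressing progress as a fraction of the distance to the goal are all assumptions introduced here. The goals follow the examples in the text (D brought to 0, and a Zd below -3.5 brought up to -3.5).

```python
def fraction_of_goal_reached(pre: float, post: float, goal: float) -> float:
    """Share of the pre-treatment distance to the goal that has been closed.

    Assumes the pre-treatment score had not yet met the goal. Returns 0.0 for
    no change, 1.0 once the goal is met or passed, and a negative number if
    the score moved away from the goal.
    """
    gap = goal - pre
    if gap == 0:
        return 1.0  # already at the goal before treatment began
    return min((post - pre) / gap, 1.0)


# Hypothetical pre- and post-treatment scores for one patient.
targets = {
    "D": {"pre": -2, "post": -1, "goal": 0},
    "Zd": {"pre": -5.5, "post": -3.5, "goal": -3.5},
}

report = {
    name: fraction_of_goal_reached(t["pre"], t["post"], t["goal"])
    for name, t in targets.items()
}
print(report)
```

A value near 1.0 indicates that a target has been reached; a value near 0.0 indicates that most of the work on that target remains to be done. Such a summary would supplement, not replace, the examiner's interpretation of the full retest protocol.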
Both research findings and case reports have demonstrated how
Rorschach assessment
can be applied in treatment outcome evaluation. In studies
reported by Weiner and Exner
(1991) and Exner and Andronikof-Sanglade (1992), patients in
long-term, short-term, and
brief psychotherapy were examined at several points during and
after their treatment.
The data analysis focused on 27 structural variables considered
to have implications for
a person's overall level of adjustment. The results of both
studies showed significant
positive changes in these Rorschach variables over the course of
therapy, consistent with
expectation, and the amount of improvement was associated
with the length of the therapy.
These findings were considered to demonstrate both the
effectiveness of psychotherapy
in promoting positive personality change and the validity of the
RIM in measuring such
change.
In a study with similar implications, Fowler et al. (2004)
monitored the progress of
a group of previously treatment-refractory patients who entered
a residential treatment
center and were engaged in psychodynamically oriented
psychotherapy. After a treatment
duration averaging 16 months, these patients showed significant
improvement in their
average behavior ratings on scales related to social and
occupational functioning, and
these improvements were matched by significant changes for the
better in their average
scores on three Rorschach scales based on response content.
With its thematic imagery
as well as its structural variables, then, Rorschach assessment
has been shown to provide
valid measurement of treatment progress, while helping to
demonstrate the effectiveness of
the treatment. Readers are referred to Weiner (2004a, 2005a) for
additional discussion of
Rorschach monitoring of psychotherapy and a detailed case
study that illustrates positive
Rorschach changes accompanying successful psychotherapy.
Forensic Practice
In the clinical applications just discussed, diagnostic inferences
derive from linkages be-
tween personality characteristics that typify certain disorders
and Rorschach variables that
measure these characteristics. In similar fashion, forensic
applications of Rorschach as-
sessment in criminal, civil, and family law cases derive from a
translation of legal concepts
into psychological terms.
In criminal law, the two questions most commonly addressed to
consulting psychologists
concern whether an accused person is competent to proceed to
trial and whether the person
can or should be held responsible for the alleged criminal
behavior. Being competent in
this context consists of having a rational and factual
understanding of the legal proceedings
one is facing and being able to participate effectively in one's
own defense. These principal
components of competency are commonly translated into
specific questions such as (a)
whether defendants appreciate the nature of the charges and
possible penalties they are
facing, (b) whether they understand the adversarial process and
the roles of the key people
in it, (c) whether they can disclose pertinent facts in their case
to their attorney, and
(d) whether they are capable of behaving appropriately in the
courtroom and testifying
relevantly in their own behalf (Stafford, 2003; Zapf & Roesch,
2006).
With respect to dimensions of personality functioning, these
aspects of competence are
most closely related to being able to think logically and
coherently and to perceive people
and events accurately. Disordered thinking and impaired reality
testing, in combination with
the poor judgment and inappropriate behavior typically
associated with them, can interfere
with a person's ability to demonstrate competence. Accordingly,
the same Rorschach indices
of disordered thinking and impaired reality testing just
mentioned in connection with
differential diagnosis (low XA%, low WDA%, elevated
WSum6), although not sufficient
evidence of incompetence, serve two purposes in this regard.
They alert the examiner to
a distinct likelihood that the defendant will have difficulty
satisfying customary criteria
for competency to stand trial, and if a defendant appears
incompetent with respect to the
applicable criteria, these Rorschach findings help the examiner
explain to the court why the
person is having this difficulty.
Criminal responsibility refers in legal terms to whether an
accused person was legally
sane at the time of committing an alleged offense. In some
jurisdictions, insanity is defined as
a cognitive incapacity that prevented the accused person from
recognizing the criminality
of his or her actions or appreciating the wrongfulness of this
conduct. Insanity in other
jurisdictions is defined either as this type of cognitive
incapacity or as a loss of behavioral
control, such that the person was unable to alter or refrain from
the alleged criminal conduct
at the time (Goldstein, Morse, & Shapiro, 2003; Zapf, Golding,
& Roesch, 2006).
With respect to personality functioning, cognitive incapacity is
measured on the RIM by
the previously mentioned indices of disordered thinking and
poor reality testing. Behavioral
dyscontrol is suggested by Rorschach indices of acute and
chronic stress overload (minus
D-score, minus AdjD-score), which are commonly associated
with limited frustration toler-
ance, intemperate outbursts of affect, and episodes of impulsive
behavior. However, because
legal sanity is defined by the person's state of mind at the time
of an alleged offense, and not
at the time of a present examination, Rorschach findings
suggesting cognitive impairment
or susceptibility to loss of control must be supplemented by
other types of information
(e.g., observations of defendants' behavior by witnesses to their
alleged offense and by the
law enforcement officers who arrested them) to serve
adequately as a basis for drawing
conclusions about criminal responsibility.
In civil law cases involving allegations of personal injury,
personality assessment helps to
determine the extent to which a person has become emotionally
distressed or incapacitated
as a consequence of irresponsible behavior on the part of
another person or some entity. As
prescribed by tort law, this circumstance exists when the
potentially liable person or entity
has, by omission or commission of certain actions, been derelict
in a duty or obligation to
the complainant, thereby causing the aggrieved person to
experience psychological injury
that would otherwise not have occurred (see Greenberg, 2003).
Emotional distress caused by the irresponsible actions of others
is often likely to be
reflected in Rorschach responses, most commonly in indications
of generalized anxiety,
stress disorder, depressive affect and cognitions, and psychotic
loss of touch with reality.
Persons with Posttraumatic Stress Disorder tend to produce one
of two types of
Rorschach protocols. Those whose disorder is manifest
primarily in the reexperiencing
of distressing events and mental and physical hyperarousal tend
to produce a flooded pro-
tocol that is notable for the incursions of anxiety on
comfortable and effective functioning.
The implications of the minus D-score and minus Adj D-score
for stress overload can
be particularly helpful in identifying such incursions, as can a
high frequency of content
codes suggesting concerns about bodily harm (e.g., AG, An, Bl,
MOR, Sx; see Armstrong &
Kaser-Boyd, 2004; Kelly, 1999; Luxenberg & Levin, 2004).
Those anxious or traumatized
persons whose disorder is manifest primarily in efforts to avoid
or withdraw from thoughts,
feelings, or situations that might precipitate psychological
distress tend to produce a con-
stricted Rorschach protocol that is notably guarded or evasive.
Such hallmarks of a guarded
record as a low R, high Lambda, low WSumC, and D = 0 tend to
increase the likelihood that
a person who has been exposed to a potentially traumatizing
experience is experiencing a
stress disorder characterized by defensive avoidance.
However, neither flooded nor constricted Rorschach protocols
are specific to anxiety
and stress disorder, nor do they provide conclusive evidence
that such a disorder is present.
Given historical and other clinical or test data to suggest such a
disorder, they merely
increase its likelihood. Moreover, as in the case of evaluating
sanity, the results of a present
personal injury examination are useful only if they can be
interpreted in the context of
past events. Personal injury cases require examiners to
determine whether any currently
observed distress predated the alleged misconduct by the
defendant and whether this distress
constitutes a decline in functioning capacity from some
previously higher level prior to when
the misconduct occurred.
Similar considerations apply in the assessment of depressive or
psychotic features in
plaintiffs seeking personal injury damages. As noted, the DEPI
and its several components
are helpful in identifying the presence of dysphoric affect and
negative cognitions, but they
do not provide a dependable basis for ruling out these features
of depression. A psychotic
impairment of reality testing is indicated by a low XA% and
low WDA%, and psychosis
can usually be ruled out if these variables fall within a normal
range. Lack of evidence
of psychosis would counter a plaintiff's claim to have suffered
psychological injury, but
present indications of psychosis would give little support to
such a claim unless other
reliable data (e.g., previous testing, historical indications of
sound mental health) gave
good reason to believe that this person was not psychotic prior
to the alleged harmful
conduct by the defendant.
Personality assessment also enters into family law cases, in the
context of disputed child
custody and visitation rights. In determining how a child's time
and supervision should be
divided between separated or divorced parents, judges
frequently make their determination
partly on the basis of information about the personality
characteristics of the child and the
parents. Similarly, in deciding whether persons should have
their parental rights terminated,
courts often seek information about their personality strengths
and weaknesses as identified
by a psychological examination. There are no infallible
guidelines concerning which of two
persons would be the better parent for a particular child, nor is
there any perfect measure of
suitability to parent. However, certain personality
characteristics as measured by the RIM
are likely to enhance or detract from parents' abilities to meet
the needs of their children.
These characteristics pertain to the presence or absence of
serious psychological distur-
bance, the adequacy of the person's coping skills, and the
person's degree of interpersonal
accessibility.
Although having a psychological disorder does not necessarily
prevent a person from
being a good parent, being seriously disturbed or
psychologically incapacitated is likely to
interfere with a person's having sufficient judgment, impulse
control, energy, and peace of
mind to function effectively in a parental capacity. As indicated
in presenting interpretive
guidelines for the RIM and as previously mentioned in this
section on applications, several
Rorschach variables help identify such serious disturbance.
These include indices of signif-
icant thinking disorder and substantially impaired reality testing
(elevated PTI), pervasive
dysphoria and negative cognitions (elevated DEPI),
overwhelming anxiety (a large minus
D-score), and marked suicide potential (elevated S-CON).
As for coping skills, good parenting is facilitated by capacities
for good judgment,
careful decision making, a flexible approach to solving
problems, and effective stress
management. Conversely, poor judgment, careless decision
making, inflexible problem
solving, and inability to manage stress without becoming unduly
upset are likely to interfere
with effective parenting. Rorschach findings often cast light on
the adequacy of a person's
skills in each of these respects, as noted in discussing
interpretive guidelines: XA% with
respect to judgment; Zd with respect to decision making; a:p
with respect to problem-
solving approach; and D-score with respect to stress
management. This is by no means a
definitive or exhaustive list of coping skills relevant to quality
of parenting or of Rorschach
variables that might prove helpful in evaluating parental
suitability. The list nevertheless
illustrates important respects in which Rorschach assessment
can be applied in family law
consultation.
Finally, with respect to interpersonal accessibility, the quality
of child care that par-
ents can provide is usually enhanced by their being a person
who is interested in people
and comfortable being around them, a person who is nurturing
and caring in his or her
relationships with others, and a person who is sufficiently
empathic to understand what
other people are like and recognize their needs and concerns.
Conversely, interpersonal
disinterest and discomfort are likely to detract from parental
effectiveness, as is being a
detached, self-absorbed, or insensitive person. In Rorschach
terms, then, the likelihood of
a person's being a good parent is measured in part by the
interpersonal cluster of variables
discussed earlier, which means that good parenting is often,
though not always, associated
with the following seven Rorschach findings:
1. SumH > 3
2. H > Hd + (H) + (Hd)
3. ISOL < .25
4. p < a + 2
5. T > 0
6. COP > 1
7. Accurate M > 2 and M- < 2
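The seven findings listed above can be expressed as a simple checklist. The following Python sketch is illustrative only: the class `InterpersonalScores`, its field names, and the sample values are hypothetical, and the tally it produces is a bookkeeping aid rather than a validated index of parenting capacity.

```python
from dataclasses import dataclass


@dataclass
class InterpersonalScores:
    """Hypothetical container for the variables named in the list above."""
    sum_h: int       # SumH
    h: int           # H
    hd: int          # Hd
    h_paren: int     # (H)
    hd_paren: int    # (Hd)
    isol: float      # ISOL, written ".25" style in the list (0.25 here)
    a: int           # a, active movement
    p: int           # p, passive movement
    t: int           # T
    cop: int         # COP
    accurate_m: int  # number of accurate M responses
    m_minus: int     # number of M- responses


def interpersonal_findings(s: InterpersonalScores) -> dict[str, bool]:
    """Report, finding by finding, which of the seven criteria are met."""
    return {
        "SumH > 3": s.sum_h > 3,
        "H > Hd + (H) + (Hd)": s.h > s.hd + s.h_paren + s.hd_paren,
        "ISOL < .25": s.isol < 0.25,
        "p < a + 2": s.p < s.a + 2,
        "T > 0": s.t > 0,
        "COP > 1": s.cop > 1,
        "Accurate M > 2 and M- < 2": s.accurate_m > 2 and s.m_minus < 2,
    }


# Hypothetical protocol, invented for illustration.
example = InterpersonalScores(
    sum_h=6, h=4, hd=1, h_paren=1, hd_paren=0, isol=0.15,
    a=5, p=3, t=0, cop=2, accurate_m=4, m_minus=0,
)
findings = interpersonal_findings(example)
met = sum(findings.values())
print(f"{met} of 7 findings met")
for label, ok in findings.items():
    print(f"  {'yes' if ok else 'no ':3} {label}")
```

Reporting each finding separately, rather than only a total, keeps attention on which specific aspects of interpersonal accessibility warrant follow-up in behavioral observations and reports.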
In drawing these inferences about interpersonal accessibility,
examiners must always
keep in mind that such Rorschach findings may suggest how
parents are likely to in-
teract with their children, but they are never conclusive. The
test data identify probable
parental strengths or limitations in interpersonal accessibility
that should be considered as
evaluators proceed to observe and obtain reports of how parents
are functioning. Integra-
tion of Rorschach indications of adjustment level and coping
skills with these behavioral
observations and reports should always precede coming to
conclusions about a person's
effectiveness as a parent. Further elaboration of these and other
substantive guidelines in
forensic Rorschach assessment is provided by Erard (2005),
Gacono, Evans, Kaser-Boyd,
and Gacono (in press), Johnston, Walters, and Olesen (2005), and
Weiner (2005a, 2006,
2007, in press).
Whatever the nature of a forensic case, attention must be paid
not only to the substantive
interpretation of Rorschach findings, but also to whether
testimony based on these findings
is admissible into evidence in courtroom proceedings.
Applicable criteria for admissibility
vary, depending on the particular federal or state jurisdiction in
which a case is being
tried, and judges have considerable discretion in determining
what types of testimony are
allowed. As established by published guidelines and case law,
the criteria used in individual
cases involve some combination of the following
considerations: whether the testimony is
relevant to the issues in the case and will help the judge or jury
arrive at their decision
(Federal Rules of Evidence); whether the testimony is based on
generally accepted methods
and procedures in the expert's field (Frye standard); and
whether the testimony is derived
from scientifically sound methods and procedures (Daubert
standard; see Ewing, 2003;
Hess, 2006).
The RIM satisfies criteria for admissibility in all three of these
respects. The usefulness of
Rorschach-based testimony in facilitating legal decisions is
demonstrated by the frequency
with which this testimony is in fact welcomed in the courtroom.
In a survey of almost 8,000
cases in which forensic psychologists offered the court
Rorschach-based testimony, the
appropriateness of the instrument was challenged in only six
instances, and in only one of
these cases was the testimony ruled inadmissible (Weiner,
Exner, & Sciara, 1996). Among
the full set of 247 cases in which Rorschach evidence was
presented to a federal, state, or
military court of appeals during the half-century from 1945 to
1995, the admissibility and
weight of the Rorschach data were questioned in only 10.5% of
the hearings. The relevance
and utility of Rorschach assessment were challenged in only two
of these appellate cases,
and the remaining criticisms of the Rorschach testimony were
directed at the interpretation
of the data, not the method itself (Meloy, Hansen, & Weiner,
1997).
More recently Meloy (in press) has examined the full set of 150
published cases in
which Rorschach findings were cited in federal, state, and
military appellate court pro-
ceedings during the 10-year period from 1996 to 2005. These
150 cases over a 10-year
period indicate an average of 15 Rorschach citations per year in
appellate cases, which
is three times the annual rate of citation found by Meloy et al.
(1997) for the preceding
50 years. Along with this greatly increased use of the RIM in
appellate courts, the percentage
of cases in which these courts recorded criticisms of Rorschach
testimony decreased from
10.5% during 1945 to 1995 to just 2% during 1996 to 2005. In
not one of these 1996 to 2005
appellate cases was the Rorschach method ridiculed or
disparaged by opposing counsel.
The general acceptance of the Rorschach method is reflected in
data concerning how
frequently it is used, taught, and studied. Surveys over the past
40 years have consistently
shown substantial endorsement of Rorschach testing as a
valuable skill to teach, learn, and
practice. Among clinical psychologists, the RIM has been the
fourth most widely used test,
exceeded in frequency of use only by the Wechsler Adult
Intelligence Scale (WAIS), the
Minnesota Multiphasic Personality Inventory (MMPI), and the
Wechsler Intelligence Scale
for Children (WISC), in that order (Hogan, 2005). Surveys also
indicate that over 80% of
clinical psychologists engaged in providing assessment services
use the RIM in their work
and believe that clinical students should be competent in
Rorschach assessment; that over
80% of graduate programs teach the RIM; and that students
usually find this training helpful
in improving their assessment skills and their understanding of
the patients and clients with
whom they work (see Camara, Nathan, & Puente, 2000;
Viglione & Hilsenroth, 2001).
With respect to assessment of young people, 162 child and
adolescent practitioners
surveyed by Cashel (2002) reported that the RIM was their third
most frequently used
personality assessment measure, following sentence completion
and figure drawing meth-
ods. Among 346 psychologists working with adolescents in
clinical and academic settings,
Archer and Newsom (2000) found the RIM to be their most
frequently used personality
test and second among all tests only to the Wechsler scales.
Surveys of training directors in predoctoral internship sites have
also identified
widespread endorsement of the value of Rorschach testing.
Training directors report that
the RIM is one of the three measures most frequently used in
their test batteries (along
with the WAIS/WISC and the MMPI-2/MMPI-A), and they
commonly express the hope or
expectation that their incoming interns will have had a
Rorschach course or at least arrive
with a good working knowledge of the instrument (Clemence &
Handler, 2001; Stedman,
Hatch, & Schoenfeld, 2000).
Survey findings confirm that Rorschach assessment has gained
an established place in
forensic as well as clinical practice. Data collected from
forensic psychologists by Ack-
erman and Ackerman (1997), Boccaccini and Brodsky (1999),
Borum and Grisso (1995),
and Quinnell and Bow (2001) showed 30% using the RIM in
evaluations of competency
to stand trial, 32% in evaluations of criminal responsibility, 41%
in evaluations of personal
injury, 44% to 48% in evaluations of adults involved in custody
disputes, and 23% in eval-
uations of children in custody cases. Consistent with these
earlier surveys, a more recent
report by Archer, Buffington-Vollum, Stredny, and Handel
(2006) indicated Rorschach us-
age for all purposes combined by 36% of the forensic
psychologists participating in their
survey.
As for study of the instrument, the scientific status of the RIM
has been attested over
many years by a steady and substantial volume of published
research concerning its nature
and utility. Buros's (1974) Tests in Print II identified 4,580
Rorschach references through
1971, with an average yearly rate of 92 publications. In the
1990s, Butcher and Rouse (1996)
found an almost identical trend continuing from 1974 to 1994.
An average of 96 Rorschach
research articles appeared annually during this 20-year period in
journals published in the
United States, and the RIM was second only to the MMPI
among personality assessment
instruments in the volume of research it generated. For the 3-year
period 2004 to 2006,
PsycINFO lists 350 scientific articles, books, book chapters,
and dissertations worldwide
concerning Rorschach assessment.
There is in fact a large international community of Rorschach
scholars and practitioners
whose research published abroad has for many years made
important contributions to
the literature (see Weiner, 1999). The international presence of
Rorschach assessment is
reflected in a survey of test use in Spain, Portugal, and Latin
American countries by Muniz,
Prieto, Almeida, and Bartram (1999) in which the RIM emerged
as the third most widely
used psychological assessment instrument, following the
Wechsler scales and versions of
the MMPI. The results of surveys in Japan, as reported by
Ogawa (2004), indicate that about
60% of Japanese clinical psychologists use the RIM in their
daily practice. An International
Rorschach Society was founded in 1952, and triennial
congresses sponsored by this society
typically attract participants from over 30 countries and all
parts of the world.
With respect to the scientific soundness of Rorschach
assessment, the final section of
this chapter reviews extensive research findings that document
the adequate intercoder
agreement and retest reliability of the instrument, its validity
when used appropriately for
its intended purposes, and the availability of normative
reference data for representative
samples of children and adults. Significantly in this regard,
Meloy (2007) reported in his
previously mentioned review, "There has been no Daubert
challenge to the scientific status
of the Rorschach in any state, federal, or military court of
appeal since the U.S. Supreme
Court decision in 1993 set the federal standard for admissibility
of scientific evidence"
(p. 85).
Despite widespread dissemination of this information, some
authors have contended
that Rorschach assessment does not satisfy contemporary
criteria for admissibility into
evidence and have discouraged forensic examiners from using
the RIM, even to the point
of calling for a moratorium on its use in forensic settings (Garb,
1999; Grove & Barden,
1999). These Rorschach critics have not presented any data to
refute previous surveys in
this regard or to support their contention that the RIM is
unwelcome in the courtroom. The
ways in which Rorschach assessment has been demonstrated to
assist in forensic decision
making are amplified further in contributions by McCann (1998,
2004), McCann and Evans
(in press), Ritzler, Erard, and Pettigrew (2002), and Hilsenroth
and Stricker (2004).
Organizational Practice
Rorschach assessment in organizational practice is concerned
primarily with the selection
and evaluation of personnel. Personnel selection typically
consists of determining whether
a person applying for a position in an organization is suitable to
fill it, or whether a person
already in the organization is qualified for promotion to a
position of increased respon-
sibility. Standard psychological procedure in making such
selection decisions consists of
first identifying the personality requirements for success in the
position being applied or
aspired to, and then determining the extent to which a candidate
shows these personality
characteristics.
A leadership position requiring initiative and rapid decision
making would probably not
be filled well by a person who is behaviorally passive and given
to painstaking care in
coming to conclusions, as would be suggested by Rorschach
findings of p > a + 1 and
Zd > +3.0. A position in sales or public relations that calls for
extensive and persuasive
interaction with people is unlikely to be a good fit for a person
who is emotionally withdrawn
and socially uncomfortable, as would be suggested by a low Afr
and H < Hd + (H) + (Hd).
Among persons being considered for hire as an air traffic
controller or a nuclear power
plant supervisor, it would support their candidacy to find
evidence on personality testing
of good coping capacities and the ability to remain calm and
exercise good judgment even
in highly stressful situations (in Rorschach terms, a person with
a high EA, D ≥ 0, and
XA% in the normal range).
Personnel evaluations may also involve assessing the current
fitness for duty of persons
whose ability to function has become impaired by psychological
disorder. Most common
in this regard is the onset of an anxiety or depressive disorder
that prevents people from
continuing to perform their job or practice their profession as
competently as they had
previously. Impaired professionals seen for psychological
evaluation may also have had
difficulties related to abuse of alcohol, drugs, or prescription
medicine. Because Rorschach
data can help identify the extent to which people are anxious or
depressed and whether
they are struggling with more stress than they can manage, the
RIM can often contribute to
determining fitness for duty and assessing progress toward
recovery in persons participating
in a treatment or rehabilitation program.
Violence in the workplace has also given rise in recent years to
frequent referrals for
fitness-for-duty evaluations, usually in the wake of an
employee's making verbal threats
or acting aggressively on the job. Estimation of violence potential is a complex process
that requires careful consideration of an individual's personality characteristics,
interpersonal and sociocultural context, and previous history of violent behavior
(Monahan, 2003).
Personality characteristics do not by themselves provide
sufficient basis for concluding that
someone poses a danger to the safety and welfare of others.
However, there is reason to
believe that certain personality characteristics increase the
likelihood of violent behavior
in persons who have behaved violently in the past and are
currently confronting annoying
or threatening situations that on previous occasions were likely to provoke aggressive
reactions on their part. Following is a list of such personality characteristics, along
with Rorschach findings identified earlier in the chapter that help identify them (see also
Gacono, 2000; Gacono & Meloy, 1994).
1. Being a selfish and self-centered person with a callous disregard for the rights and
feelings of other people and a sense of entitlement to do and have whatever one wants
(e.g., Fr + rF > 0 and 3r + (2)/R elevated).
2. Being a psychologically distant person who is generally mistrustful of others, avoids
intimate relationships, and either ignores people or exploits them to one's own ends
(e.g., HVI, T = 0, low SumH, COP = 0 with AG > 2).
3. Being an angry and action-oriented person inclined to express this anger directly
(e.g., S > 3, a > p, extratensive EB).
4. Being an impulsive person with little tolerance for frustration, or a psychologically
disturbed person with impaired reality testing and poor judgment (e.g., D < 0, AdjD < 0,
XA% and WDA% low).
Neither these personality characteristics nor the Rorschach
variables associated with
them are specific to persons who show violent behavior. Even
among people who exhibit all
these characteristics and Rorschach findings, moreover, many or
most may never consider
physically assaulting another person. However, in persons with
a history of violent behavior
who are exposed to violence-provoking circumstances, each of these characteristics and
findings increases the risk of violence. The more numerous these characteristics and
findings, and the more pronounced they are, the greater is the violence risk they suggest.
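The conditional structure of this reasoning can be sketched in a few lines of Python. The sketch rests on several assumptions that are not part of the text: the dictionary keys and function names are hypothetical, the indicators are simply tallied without weights (the text does not say how they should be combined or how "pronounced" should be quantified), and only findings with an explicit cutoff in the list above are included. Findings given without a numerical cutoff (HVI, low SumH, elevated 3r + (2)/R, extratensive EB, low XA% and WDA%) are omitted.

```python
def explicit_indicator_count(v: dict) -> int:
    """Tally the listed findings that have an explicit cutoff in the text."""
    checks = [
        v["Fr+rF"] > 0,                    # item 1
        v["T"] == 0,                       # item 2
        v["COP"] == 0 and v["AG"] > 2,     # item 2 (COP = 0 with AG > 2)
        v["S"] > 3,                        # item 3
        v["a"] > v["p"],                   # item 3
        v["D"] < 0,                        # item 4
        v["AdjD"] < 0,                     # item 4
    ]
    return sum(checks)


def indicators_bearing_on_risk(v: dict, prior_violence: bool, provoking_context: bool):
    """Return the tally only when both contextual conditions named in the text hold.

    Personality findings alone are not treated as a basis for any conclusion,
    so the function returns None when either condition is absent.
    """
    if not (prior_violence and provoking_context):
        return None
    return explicit_indicator_count(v)


# Invented values for illustration; not drawn from any real protocol.
values = {"Fr+rF": 1, "T": 0, "COP": 0, "AG": 3, "S": 4, "a": 5, "p": 2, "D": -1, "AdjD": 0}
with_context = indicators_bearing_on_risk(values, prior_violence=True, provoking_context=True)
without_history = indicators_bearing_on_risk(values, prior_violence=False, provoking_context=True)
```

Returning `None` in the absence of a violent history or provoking circumstances mirrors the point that these characteristics and findings are not specific to persons who behave violently.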
PSYCHOMETRIC FOUNDATIONS
As mentioned in discussing the history of Rorschach assessment, the blossoming of various
Rorschach systems in the United States and abroad enriched the instrument for clinical
purposes, but at a cost to its scientific development. The many Rorschach variations
created by gifted and respected clinicians limited cumulative research on the psychometric
properties of the instrument prior to Exner's 1974 standardization of coding and
administration procedures in the Comprehensive System (CS). Subsequent widespread use of the
CS in research and practice has fostered substantial advances in knowledge concerning the
psychometric soundness of the RIM, particularly with respect to its intercoder agreement,
retest reliability, validity, and normative reference base.
Intercoder Agreement
In constructing the Rorschach CS, Exner included only
variables on which his coders could
achieve at least 80% agreement, and subsequent research
confirmed that the CS variables
can be reliably coded with at least this level of agreement.
However, measuring intercoder
reliability by percentage of agreement is a questionable
procedure, because this method
does not take account of agreement occurring by chance. With
this consideration in mind,
Rorschach researchers began in the late 1990s to assess
intercoder reliability with two
statistics that correct for chance agreements, kappa and
intraclass correlation coefficients
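
The difference between percentage agreement and a chance-corrected statistic can be illustrated with a short computation. The sketch below implements Cohen's kappa for two coders assigning categorical codes, which is one common form of kappa; the function names and the ten codes are invented for illustration and are not taken from any Rorschach data set. Kappa is computed as (p_o - p_e) / (1 - p_e), where p_o is the observed proportion of agreement and p_e is the agreement expected by chance from each coder's base rates.

```python
from collections import Counter


def percent_agreement(codes_a, codes_b):
    """Proportion of responses on which the two coders assign the same code."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must code the same non-empty set of responses")
    matches = sum(x == y for x, y in zip(codes_a, codes_b))
    return matches / len(codes_a)


def cohen_kappa(codes_a, codes_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(codes_a)
    p_o = percent_agreement(codes_a, codes_b)
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    # Chance agreement: product of the two coders' base rates, summed over codes.
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when both coders use a single identical code")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical presence (1) / absence (0) codes for one variable on ten responses.
coder_a = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
coder_b = [1, 1, 1, 1, 1, 1, 1, 0, 1, 0]

agreement = percent_agreement(coder_a, coder_b)   # 0.8
kappa = cohen_kappa(coder_a, coder_b)             # approximately 0.375
```

In this invented example the coders reach the 80% agreement criterion, yet because both assign the code to most responses, much of that agreement would be expected by chance, and kappa is only about .375.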
Major Rorschach indices of psychological disturbance include the X-% (an index of
impaired reality testing) and the WSum6 (an index of disordered thinking). If X-% and WSum6
are valid measures of disturbance, they should increase in linear fashion across these four
reference groups, and they do, as shown by the Exner (2001, chap. 11) reference data.
A second example of construct validity demonstrated by the normative reference data
concerns developmental changes in young people. The previously noted increasing stability
of Rorschach structural variables from childhood into adolescence, consistent with the
gradual consolidation of personality characteristics, is a case in point. Among specific
changes occurring with maturation, young people are known to become less self-centered
(less egocentric) and increasingly capable of moderating their affect (less emotionally
intense). The RIM Egocentricity Index is conceptualized as a measure of self-centeredness,
and the balance between presumed indices of relatively mature emotionality (FC)
and relatively immature emotionality (CF) is conceptualized as an indication of affect
moderation.
If these variables are valid measures of what they are posited to measure, their average
values should change in the expected direction among children and adolescents at different
ages, and they do. In the CS reference data, the mean Egocentricity Index of .67 at age
6 decreases in almost linear fashion to .43 at age 16, which is just slightly higher than the
adult mean of .40. The mean for FC increases steadily over time from 1.11 at age 6 to 3.43
at age 16 (compared with an adult mean of 3.56), while the mean for CF decreases from
3.51 to 2.78 between ages 6 and 16 (the adult mean is 2.41).
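The direction-of-change argument can be checked arithmetically against the means just cited. In the sketch below, the numbers are the CS reference means quoted in the preceding paragraph; the dictionary layout and function names are hypothetical conveniences, and the check covers only the three time points reported here (age 6, age 16, adult), not the intermediate ages.

```python
# Means quoted in the text from the CS reference data.
CS_MEANS = {
    "Egocentricity Index": {"age_6": 0.67, "age_16": 0.43, "adult": 0.40},
    "FC": {"age_6": 1.11, "age_16": 3.43, "adult": 3.56},
    "CF": {"age_6": 3.51, "age_16": 2.78, "adult": 2.41},
}

# Direction predicted by the developmental hypothesis stated in the text.
EXPECTED = {"Egocentricity Index": "decrease", "FC": "increase", "CF": "decrease"}


def direction(means: dict) -> str:
    """Direction of change in the mean from age 6 to age 16."""
    return "increase" if means["age_16"] > means["age_6"] else "decrease"


def moves_toward_adult(means: dict) -> bool:
    """True if the age-16 mean lies between the age-6 mean and the adult mean."""
    low, high = sorted((means["age_6"], means["adult"]))
    return low <= means["age_16"] <= high


summary = {
    name: (direction(m) == EXPECTED[name], moves_toward_adult(m))
    for name, m in CS_MEANS.items()
}
```

For all three variables the change from age 6 to age 16 is in the predicted direction, and the age-16 mean falls between the age-6 mean and the adult mean.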
The present chapter has been concerned mainly with the
Rorschach assessment of adults
and older adolescents. In closing the chapter, it is important to
note that the RIM can also
be used to good effect in evaluating children and early
adolescents. Assessors working with
young people will profit from consulting Erdberg (2007), Exner and Weiner (1995), and
Leichtman (1996) in this regard.
REFERENCES
Ackerman, M. J., & Ackerman, M. C. (1997). Custody
evaluations in practice: A survey of experienced
professionals (revisited). Professional Psychology, 28, 137-145.
Acklin, M. W., McDowell, C. J., Verschell, M. S., & Chan, D.
(2000). Interobserver agreement,
intraobserver agreement, and the Rorschach Comprehensive
System. Journal of Personality
Assessment, 74, 15-57.
Allard, G., & Faust, D. (2000). Errors in scoring objective
personality tests. Assessment, 7, 119-129.
Allen, J., & Dana, R. H. (2004). Methodological issues in cross-cultural and multicultural
Rorschach research. Journal of Personality Assessment, 82, 189-206.
Archer, R. P., Buffington-Vollum, J. K., Stredny, R. V., & Handel, R. W. (2006). A survey of
psychological test use patterns among forensic psychologists. Journal of Personality
Assessment, 87, 84-94.
Archer, R. P., & Newsom, C. R. (2000). Psychological test
usage with adolescent clients: Survey
update. Assessment, 7, 227-235.
Armstrong, J., & Kaser-Boyd, N. (2004). Projective assessment of psychological trauma. In M. J.
Hilsenroth & D. L. Segal (Eds.), Comprehensive handbook of psychological assessment: Vol. 2.
Personality assessment (pp. 500-512). Hoboken, NJ: Wiley.
Aronow, E., & Reznikoff, M. (1976). Rorschach content
interpretation. New York: Grune & Stratton.
Aronow, E., Reznikoff, M., & Moreland, K. L. (1994). The
Rorschach technique. Boston: Allyn &
Bacon.
Auslander, L. A., Perry, W., & Jeste, D. V. (2002). Assessing disturbed thinking and cognition
using the Ego Impairment Index in older schizophrenic patients: Paranoid vs. nonparanoid
distinction. Schizophrenia Research, 53, 199-207.
Beck, S. J. (1930a). Personality diagnosis by means of the
Rorschach test. American Journal of
Orthopsychiatry, 1, 81-88.
Beck, S. J. (1930b). The Rorschach test and personality
diagnosis. American Journal of Psychiatry,
10, 19-52.
Beck, S. J. (1937). Introduction to the Rorschach method:
American Orthopsychiatric Association
Monograph I. New York: American Orthopsychiatric
Association.
Beck, S. J. (1944). Rorschach's test: Vol. 1. Basic processes. New York: Grune & Stratton.
Blais, M. A., Hilsenroth, M. J., Castlebury, F., Fowler, J. C., & Baity, M. R. (2001). Predicting
DSM-IV Cluster B personality disorder criteria from MMPI-2 and Rorschach data: A test of
incremental validity. Journal of Personality Assessment, 76, 150-168.
Blatt, S. J., & Ford, R. Q. (1994). Therapeutic change. New
York: Plenum Press.
Boccaccini, M. T., & Brodsky, S. L. (1999). Diagnostic test
usage by forensic psychologists in
emotional injury cases. Professional Psychology, 30, 253-259.
Bornstein, R. F. (1999). Criterion validity of objective and
projective dependency tests: A meta-
analytic assessment of behavioral prediction. Psychological
Assessment, 11, 48-57.
Bornstein, R. F., & Masling, J. M. (2005). The Rorschach Oral Dependency scale. In R. F.
Bornstein & J. M. Masling (Eds.), Scoring the Rorschach: Seven validated systems
(pp. 135-158). Mahwah, NJ: Erlbaum.
Borum, R., & Grisso, T. (1995). Psychological test use in
criminal forensic evaluations. Professional
Psychology, 26, 465-473.
Buros, O. K. (Ed.). (1974). Tests in print II. Highland Park, NJ: Gryphon.
Butcher, J. N., & Rouse, S. V. (1996). Personality: Individual differences and clinical
assessment. Annual Review of Psychology, 47, 87-111.
Camara, W., Nathan, J., & Puente, A. (2000). Psychological test
usage: Implications in professional
psychology. Professional Psychology, 31, 141-154.
Cashel, M. L. (2002). Child and adolescent psychological
assessment: Current clinical practices and
the impact of managed care. Professional Psychology: Research
and Practice, 33, 446-453.
Clarkin, J. F., & Levy, K. N. (2004). The influence of client variables on psychotherapy. In M. J.
Lambert (Ed.), Bergin and Garfield's handbook of psychotherapy and behavior change (5th ed.,
pp. 194-226). Hoboken, NJ: Wiley.
Clemence, A., & Handler, L. (2001). Psychological assessment on internship: A survey of training
directors and their expectations for students. Journal of Personality Assessment, 76, 18-47.
Dao, T. K., & Prevatt, F. (2006). A psychometric evaluation of the Rorschach Comprehensive
System's Perceptual Thinking Index. Journal of Personality Assessment, 86, 180-189.
Elfhag, K., Barkeling, B., Carlsson, A. M., & Rössner, S. (2003). Microstructure of eating behavior
associated with Rorschach characteristics in obesity. Journal of Personality Assessment, 81,
40-50.
Elfhag, K., Rössner, S., Lindgren, T., Andersson, I., & Carlsson, A. M. (2004). Rorschach
personality predictors of weight loss with behavior modification in obesity treatment. Journal
of Personality Assessment, 83, 293-305.
Ephraim, D. (2000). Culturally relevant research and practice with the Rorschach Comprehensive
System. In R. H. Dana (Ed.), Handbook of cross-cultural and multicultural personality
assessment (pp. 303-328). Mahwah, NJ: Erlbaum.
Erard, R. E. (2005). What the Rorschach can contribute to child custody and parenting time
evaluations. Journal of Child Custody, 2, 119-142.
Erdberg, P. (2007). Using the Rorschach with children. In S. R.
Smith & L. Handler (Eds.), The
clinical assessment of children and adolescents (pp. 139-147).
Mahwah, NJ: Erlbaum.
Erdberg, P., & Shaffer, T. W. (2001, March). International
Symposium on Rorschach nonpatient data:
Worldwide findings. Symposium conducted at the annual
meeting of the Society for Personality
Assessment, Philadelphia.
Ewing, C. P. (2003). Expert testimony: Law and practice. In I. B. Weiner (Editor-in-Chief) & A.
M. Goldstein (Vol. Ed.), Handbook of psychology: Vol. 11. Forensic psychology (pp. 55-66).
Hoboken, NJ: Wiley.
Exner, J. E., Jr. (1969). The Rorschach systems. New York: Grune & Stratton.
Exner, J. E., Jr. (1974). The Rorschach: A comprehensive system. New York: Wiley.
Exner, J. E., Jr. (2001). A Rorschach workbook for the comprehensive system (5th ed.).
Asheville, NC: Rorschach Workshops.
Exner, J. E., Jr. (2003). The Rorschach: A comprehensive system: Vol. 1. Basic foundations and
principles of interpretation (4th ed.). Hoboken, NJ: Wiley.
Exner, J. E., Jr., & Andronikof-Sanglade, A. (1992). Rorschach changes following brief and
short-term therapy. Journal of Personality Assessment, 59, 59-71.
Exner, J. E., Jr., Armbruster, G. L., & Viglione, D. (2001). The temporal stability of some
Rorschach features. Journal of Personality Assessment, 42, 474-482.
Exner, J. E., Jr., & Erdberg, P. (2005). The Rorschach: A comprehensive system: Vol. 2. Advanced
interpretation (3rd ed.). Hoboken, NJ: Wiley.
Exner, J. E., Jr., Thomas, E. A., & Mason, B. (1985). Children's Rorschachs: Description and
prediction. Journal of Personality Assessment, 49, 13-20.
Exner, J. E., Jr., & Weiner, I. B. (1995). The Rorschach: A comprehensive system: Vol. 3.
Assessment of children and adolescents (2nd ed.). New York: Wiley.
Exner, J. E., Jr., & Weiner, I. B. (2003). Rorschach interpretation assistance program, Version
5 (RIAP5). Lutz, FL: Psychological Assessment Resources.
Fowler, J. C., Ackerman, S. J., Speanburg, S., Bailey, A., Blagys, M., & Conklin, A. C. (2004).
Personality and symptom change in treatment-refractory inpatients: Evaluation of the phase
model of change using Rorschach, TAT, and DSM-IV Axis V. Journal of Personality Assessment,
83, 306-322.
Fowler, J. C., Brunnschweiler, B., Swales, S., & Brock, J. (2005). Assessment of Rorschach
dependency measures in female inpatients diagnosed with borderline disorder. Journal of
Personality Assessment, 85, 146-153.
Fowler, J. C., & Erdberg, P. (2005). The Mutuality of Autonomy scale: An implicit measure of
object relations for the Rorschach Inkblot Method. South African Rorschach Journal, 2, 3-10.
Fowler, J. C., Piers, C., Hilsenroth, M. J., Holdwick, D. J., Jr., & Padawer, J. R. (2001). The
Rorschach Suicide Constellation: Assessing various degrees of lethality. Journal of
Personality Assessment, 76, 333-351.
Gacono, C. B. (Ed.). (2000). The clinical and forensic
assessment of psychopathy. Mahwah, NJ:
Erlbaum.
Gacono, C. B., Evans, F. B., Kaser-Boyd, N., & Gacono, L. (Eds.). (in press). Handbook of
forensic Rorschach psychology. Mahwah, NJ: Erlbaum.
Gacono, C. B., & Meloy, J. R. (1994). Rorschach assessment of aggressive and psychopathic
personalities. Hillsdale, NJ: Erlbaum.
Ganellen, R. J. (2005). Rorschach contributions to assessment of suicide risk. In R. I. Yufit &
D. Lester (Eds.), Assessment, treatment, and prevention of suicidal behavior (pp. 93-119).
Hoboken, NJ: Wiley.
Garb, H. N. (1999). Call for a moratorium on the use of the
Rorschach Inkblot Test in clinical and
forensic settings. Assessment, 6, 311-318.
Garb, H. N., Wood, J. M., Nezworski, M. T., Grove, W. M., &
Stejskal, W. J. (2001). Toward a
resolution of the Rorschach controversy. Psychological
Assessment, 13, 433-448.
Goldstein, A. M., Morse, S. J., & Shapiro, D. L. (2003).
Evaluation of criminal responsibility. In I.
B. Weiner (Editor-in-Chief) & A. M. Goldstein (Vol. Ed.),
Handbook of psychology: Vol. 11.
Forensic psychology (pp. 381-406). Hoboken, NJ: Wiley.
Greenberg, S. A. (2003). Personal injury examinations in torts
for emotional distress. In I. B. Weiner
(Editor-in-Chief) & A. M. Goldstein (Vol. Ed.), Handbook of
psychology: Vol. 11. Forensic
psychology (pp. 233-257). Hoboken, NJ: Wiley.
Greenway, P., & Milne, L. C. (2001). Rorschach tolerance and control of stress measures D and
AdjD: Beliefs about how well subjective states and reactions can be controlled. European
Journal of Psychological Assessment, 17, 137-144.
Grønnerød, C. (2003). Temporal stability in the Rorschach method: A meta-analytic review.
Journal of Personality Assessment, 80, 272-293.
Grønnerød, C. (2006). Reanalysis of the Grønnerød (2003) Rorschach temporal stability
meta-analysis set. Journal of Personality Assessment, 86, 222-225.
Grove, W. M., & Barden, R. C. (1999). Protecting the integrity of the legal system: The
admissibility of testimony from mental health experts under Daubert/Kumho analysis.
Psychology, Public Policy, and Law, 5, 224-242.
Guarnaccia, V., Dill, C. A., Sabatino, S., & Southwick, S. (2001). Scoring accuracy using the
Comprehensive System for the Rorschach. Journal of Personality Assessment, 77, 464-474.
Hamel, M., Shaffer, T. W., & Erdberg, P. (2000). A study of nonpatient preadolescent Rorschach
protocols. Journal of Personality Assessment, 75, 280-294.
Handler, L., & Clemence, A. J. (2005). The Rorschach Prognostic Rating scale. In R. F. Bornstein &
J. M. Masling (Eds.), Scoring the Rorschach: Seven validated systems (pp. 1-24). Mahwah, NJ:
Erlbaum.
Hartmann, E., Norbech, P. B., & Grønnerød, C. (2006). Psychopathic and nonpsychopathic violent
offenders on the Rorschach: Discriminative features and comparisons with schizophrenic
inpatient and university student samples. Journal of Personality Assessment, 86, 291-305.
Hartmann, E., Sunde, T., Kristensen, W., & Martinussen, M. (2003). Psychological measures as
predictors of military training performance. Journal of Personality Assessment, 80, 87-98.
Hess, A. K. (2006). Serving as an expert witness. In I. B.
Weiner & A. K. Hess (Eds.), Handbook of
forensic psychology (3rd ed., pp. 652-700). Hoboken, NJ:
Wiley.
Hiller, J. B., Rosenthal, R., Bornstein, R. F., Berry, D. T. R., & Brunell-Neuleib, S. (1999). A
comparative meta-analysis of Rorschach validity. Psychological Assessment, 11, 278-296.
Hilsenroth, M. J., & Stricker, G. (2004). A consideration of attacks upon psychological
assessment instruments used in forensic settings: Rorschach as exemplar. Journal of
Personality Assessment, 83, 141-152.
Hogan, T. P. (2005). 50 widely used psychological tests. In G. P. Koocher, J. C. Norcross, & S. S.
Hill III (Eds.), Psychologists' desk reference (2nd ed., pp. 101-104). New York: Oxford
University Press.
Holt, R. R. (2005). The Pripro scoring system. In R. F. Bornstein & J. M. Masling (Eds.), Scoring
the Rorschach: Seven validated systems (pp. 191-236). Mahwah, NJ: Erlbaum.
Holzman, P. S., Levy, D. L., & Johnston, M. H. (2005). The use
of the Rorschach technique for
assessing formal thought disorder. In R. F. Bornstein & J. M.
Masling (Eds.), Scoring the
Rorschach: Seven validated systems (pp. 55-96). Mahwah, NJ:
Erlbaum.
Hunsley, J., & Bailey, J. M. (1999). The clinical utility of the Rorschach: Unfulfilled promises
and an uncertain future. Psychological Assessment, 11, 266-277.
Huprich, S. K. (Ed.). (2006). Rorschach assessment of the
personality disorders. Mahwah, NJ:
Erlbaum.
Ilonen, T., Taiminen, T., Karlsson, H., Lauerma, H., Leinonen,
K.-M., Wallenius, E., et al. (1999).
Diagnostic efficiency of the Rorschach schizophrenia and
depression indices in identifying
first-episode schizophrenia and severe depression. Psychiatry
Research, 87, 183-193.
Janson, H., & Stattin, H. (2003). Predictions of adolescent and adult delinquency from childhood
Rorschach ratings. Journal of Personality Assessment, 81, 51-63.
Johnston, J. R., Walters, M. G., & Olesen, N. W. (2005).
Clinical ratings of parenting capacity
and Rorschach protocols of custody-disputing parents: An
exploratory study. Journal of Child
Custody, 2, 159-178.
Kelly, F. D. (1999). The psychological assessment of abused and traumatized children. Mahwah,
NJ: Erlbaum.
Kleiger, J. H. (1999). Disordered thinking and the Rorschach.
Hillsdale, NJ: Analytic Press.
Klopfer, B., Ainsworth, M. D., Klopfer, W. G., & Holt, R. R. (1954). Developments in the
Rorschach technique: Vol. 1. Technique and theory. Yonkers-on-Hudson, NY: World Books.
Klopfer, B., & Kelley, D. M. (1942). The Rorschach technique. Yonkers-on-Hudson, NY: World
Books.
Klopfer, B., Kirkner, F., Wisham, W., & Baker, G. (1951). Rorschach Prognostic Rating scale.
Journal of Projective Techniques and Personality Assessment, 15, 425-428.
Leichtman, M. (1996). The Rorschach: A developmental
perspective. Hillsdale, NJ: Analytic
Press.
Lerner, P. M. (1998). Psychoanalytic perspective on the
Rorschach. Hillsdale, NJ: Analytic Press.
Lerner, P. M. (2005). Defense and its assessment: The Lerner
Defense scale. In R. F. Bornstein & J.
M. Masling (Eds.), Scoring the Rorschach: Seven validated
systems (pp. 237-270). Mahwah,
NJ: Erlbaum.
Lilienfeld, S. O., Wood, J. M., & Garb, H. N. (2000). The scientific status of projective
techniques. Psychological Science in the Public Interest, 1, 27-66.
Luxenberg, T., & Levin, P. (2004). The role of the Rorschach in the assessment and treatment of
trauma. In J. P. Wilson & T. M. Keane (Eds.), Assessing psychological trauma and PTSD (2nd
ed., pp. 190-225). New York: Guilford Press.
McCann, J. T. (1998). Defending the Rorschach in court: An analysis of admissibility using legal
and professional standards. Journal of Personality Assessment, 70, 125-144.
McCann, J. T. (2004). Projective assessment of personality in forensic settings. In M. Hersen
(Editor-in-Chief), M. J. Hilsenroth, & D. L. Segal (Vol. Eds.), Comprehensive handbook of
psychological assessment: Vol. 2. Personality assessment (pp. 562-572). Hoboken, NJ: Wiley.
McCann, J. T., & Evans, F. B. (in press). Admissibility of the Rorschach. In C. B. Gacono, F. B.
Evans, N. Kaser-Boyd, & L. Gacono (Eds.), Handbook of forensic Rorschach psychology. Mahwah,
NJ: Erlbaum.
McCrae, R. R., & Terracciano, A. (2006). National character and personality. Current Directions
in Psychological Science, 15, 156-161.
McGrath, R. E. (2003). Enhancing accuracy in observational test scoring: The Comprehensive System
as a case example. Journal of Personality Assessment, 81, 104-110.
McGrath, R. E., Pogge, D. L., Stokes, J. M., Cragnolino, A., Zaccaria, M., Hayman, J., et al.
(2005). Field reliability of Comprehensive System scoring in an adolescent inpatient sample.
Assessment, 12, 199-209.
Meloy, J. R. (2007). The authority of the Rorschach: An update.
In C. B. Gacono, F. B. Evans, N.
Kaser-Boyd, & L. Gacono (Eds.), Handbook of forensic
Rorschach psychology (pp. 79-87).
Mahwah, NJ: Erlbaum.
Meloy, J. R., Hansen, T., & Weiner, I. B. (1997). Authority of the Rorschach: Legal citations in
the past 50 years. Journal of Personality Assessment, 69, 53-62.
Meyer, G. J. (1997a). Assessing reliability: Critical corrections
for a critical examination of the
Rorschach Comprehensive System. Psychological Assessment,
9, 480-489.
Meyer, G. J. (1997b). Thinking clearly about reliability: More critical corrections regarding the
Rorschach Comprehensive System. Psychological Assessment, 9, 495-498.
Meyer, G. J. (2000). The incremental validity of the Rorschach Prognostic Rating scale over the
MMPI Ego Strength scale and IQ. Journal of Personality Assessment, 74, 356-370.
Meyer, G. J. (2001). Evidence to correct misperceptions about
Rorschach norms. Clinical Psychology:
Science and Practice, 8, 389-396.
Meyer, G. J. (2002). Exploring possible ethnic differences and bias in the Rorschach
Comprehensive System. Journal of Personality Assessment, 78, 104-129.
Meyer, G. J. (2004). The reliability and validity of the
Rorschach and Thematic Apperception Test
(TAT) compared to other psychological and medical procedures:
An analysis of systematically
gathered evidence. In M. Hersen (Editor-in-Chief), M.
Hilsenroth, & D. Segal (Vol. Eds.),
Comprehensive handbook of psychological assessment: Vol. 2.
Personality assessment (pp.
315-342). Hoboken, NJ: Wiley.
Meyer, G. J., & Archer, R. P. (2001). The hard science of Rorschach research: What do we know and
where do we go? Psychological Assessment, 13, 486-502.
Meyer, G. J., & Handler, L. (1997). The ability of the Rorschach to predict subsequent outcome:
Meta-analysis of the Rorschach Prognostic Rating scale. Journal of Personality Assessment, 69,
1-38.
Meyer, G. J., Hilsenroth, M. J., Baxter, D., Exner, J. E., Jr., Fowler, J. C., Piers, C. C., et al.
(2002). An examination of interrater reliability for scoring the Rorschach Comprehensive
System in eight data sets. Journal of Personality Assessment, 78, 219-274.
Meyer, G. J., Mihura, J. L., & Smith, B. L. (2005). The interclinician reliability of Rorschach
interpretation in four data sets. Journal of Personality Assessment, 84, 296-314.
Meyer, G. J., & Viglione, D. J. (in press). Scientific status of the Rorschach. In C. B. Gacono,
F. B. Evans, N. Kaser-Boyd, & L. Gacono (Eds.), Handbook of forensic Rorschach psychology.
Mahwah, NJ: Erlbaum.
Monahan, J. (2003). Violence risk assessment. In I. B. Weiner (Editor-in-Chief) & A. M. Goldstein
(Vol. Ed.), Handbook of psychology: Vol. 11. Forensic psychology (pp. 527-540). Hoboken, NJ:
Wiley.
Muniz, J., Prieto, G., Almeida, L., & Bartram, D. (1999). Test use in Spain, Portugal, and Latin
American countries. European Journal of Psychological Assessment, 15, 151-157.
Ogawa, T. (2004). Developments of the Rorschach in Japan: A brief introduction. South African
Rorschach Journal, 1, 40-45.
Perry, W. (2001). Incremental validity of the Ego Impairment
Index: A reexamination of Dawes
(1999). Psychological Assessment, 13, 403-407.
Phillips, L., & Smith, J. G. (1953). Rorschach interpretation:
Advanced technique. New York: Grune
& Stratton.
Piotrowski, Z. A. (1957). Perceptanalysis. New York:
Macmillan.
Presley, G., Smith, C., Hilsenroth, M., & Exner, J. (2001). Clinical utility of the Rorschach with
African Americans. Journal of Personality Assessment, 78, 104-129.
Quinnell, F. A., & Bow, J. N. (2001). Psychological tests used in child custody evaluations.
Behavioral Sciences and the Law, 19, 491-501.
Rapaport, D., Gill, M., & Schafer, R. (1968). Diagnostic
psychological testing (Rev. ed.). New York:
International Universities Press. (Original work published 1946)
Ritsher, J. B. (2004). Association of Rorschach and MMPI psychosis indicators and
schizophrenia-spectrum diagnoses in a Russian clinical sample. Journal of Personality
Assessment, 38, 46-63.
Ritzler, B. (2001). Multicultural usage of the Rorschach. In L. Suzuki, J. Ponterotto, & P. Meller
(Eds.), Handbook of multicultural assessment (pp. 237-252). San Francisco: Jossey-Bass.
Ritzler, B. (2004). Cultural applications of the Rorschach, apperception tests, and figure
drawings. In M. Hersen (Editor-in-Chief), M. J. Hilsenroth, & D. L. Segal (Vol. Eds.),
Comprehensive handbook of psychological assessment: Vol. 2. Personality assessment
(pp. 573-585). Hoboken, NJ: Wiley.
Ritzler, B., Erard, R., & Pettigrew, T. (2002). Protecting the integrity of Rorschach expert
witnesses: A reply to Grove and Barden (1999) re: The admissibility of testimony under
Daubert/Kumho analysis. Psychology, Public Policy, and Law, 8(2), 201-215.
Rorschach, H. (1942). Psychodiagnostics: A diagnostic test
based on perception. Bern, Switzerland:
Hans Huber. (Original work published 1921)
Rosenthal, R., Hiller, J. B., Bornstein, R. F., Berry, D. T. R., & Brunell-Neuleib, S. (2001).
Meta-analytic methods, the Rorschach, and the MMPI. Psychological Assessment, 13, 449-451.
Schafer, R. (1948). Clinical application of psychological tests. New York: International
Universities Press.
Schafer, R. (1954). Psychoanalytic interpretation in Rorschach testing. New York: Grune & Stratton.
Schafer, R. (2006). My life in testing. Journal of Personality Assessment, 86, 235-241.
Shaffer, T. W., Erdberg, P., & Haroian, J. (1999). Current nonpatient data for the Rorschach, WAIS,
and MMPI-2. Journal of Personality Assessment, 73, 305-316.
Shaffer, T. W., Erdberg, P., & Meyer, G. J. (Eds.). (2007). International reference sample for
the Rorschach comprehensive system [Special issue]. Journal of Personality Assessment, 89
(Suppl. 1).
Sloane, P., Arsenault, L., & Hilsenroth, M. (2002). Use of the Rorschach in the assessment of
war-related stress in military personnel. Rorschachiana, 25, 86-122.
Smith, S., Baity, M. R., Knowles, E. S., & Hilsenroth, M. J. (2001). Assessment of disordered
thinking in children and adolescents: The Rorschach Perceptual-Thinking Index. Journal of
Personality Assessment, 77, 447-463.
Society for Personality Assessment. (2005). The status of the Rorschach in clinical and forensic
practice: An official statement by the Board of Trustees of the Society for Personality
Assessment. Journal of Personality Assessment, 85, 219-237.
Stafford, K. P. (2003). Assessment of competence to stand trial. In I. B. Weiner (Editor-in-Chief)
& A. M. Goldstein (Vol. Ed.), Handbook of psychology: Vol. 11. Forensic psychology
(pp. 359-380). Hoboken, NJ: Wiley.
Stedman, J., Hatch, J., & Schoenfeld, L. (2000). Preinternship
preparation in psychological testing
and psychotherapy: What internship directors say they expect.
Professional Psychology, 31,
321-326.
Stricker, G., & Gooen-Piels, J. (2004). Projective assessment of object relations. In M. Hersen
(Editor-in-Chief), M. J. Hilsenroth, & D. L. Segal (Vol. Eds.), Comprehensive handbook of
psychological assessment: Vol. 2. Personality assessment (pp. 449-465). Hoboken, NJ: Wiley.
Urist, J. (1977). The Rorschach test and the assessment of
object relations. Journal of Personality
Assessment, 41, 3-9.
Viglione, D. J. (1999). A review of recent research addressing the utility of the Rorschach.
Psychological Assessment, 11, 251-265.
Viglione, D. J. (2002). Rorschach coding solutions: A reference
guide for the comprehensive system.
San Diego, CA: Author.
Viglione, D. J., & Hilsenroth, M. J. (2001). The Rorschach: Facts, fictions, and future.
Psychological Assessment, 11, 251-265.
Viglione, D. J., Perry, W., & Meyer, G. (2003). Refinements in the Rorschach Ego Impairment
Index incorporating the human representational variable. Journal of Personality Assessment,
81, 149-156.
Viglione, D. J., & Taylor, N. (2003). Empirical support for interrater reliability of Rorschach
comprehensive system coding. Journal of Clinical Psychology, 59, 111-121.
Weiner, I. B. (1996). Some observations on the validity of the Rorschach Inkblot Method.
Psychological Assessment, 8, 206-213.
Weiner, I. B. (1998). Principles of psychotherapy (2nd ed.). New York: Wiley.
Weiner, I. B. (1999). Contemporary perspectives on Rorschach
assessment. European Journal of
Psychological Assessment, 15, 78-86.
Weiner, I. B. (2001a). Advancing the science of psychological
assessment: The Rorschach Inkblot
Method as exemplar. Psychological Assessment, 13, 423-432.
Weiner, I. B. (2001b). Considerations in collecting Rorschach reference data. Journal of
Personality Assessment, 77, 122-127.
Weiner, I. B. (2003a). Prediction and postdiction in clinical
decision making. Clinical Psychology,
10, 335-338.
Weiner, I. B. (2003b). Principles of Rorschach interpretation (2nd ed.). Mahwah, NJ: Erlbaum.
Weiner, I. B. (2004a). Monitoring psychotherapy with
performance-based measures of personality
functioning. Journal of Personality Assessment, 83, 323-331.
Weiner, I. B. (2004b). Rorschach assessment: Current status. In
M. Hersen (Editor-in-Chief), M. J.
Hilsenroth, & D. L. Segal (Vol. Eds.), Comprehensive handbook
of psychological assessment:
Vol. 2. Personality assessment (pp. 343-355). Hoboken, NJ:
Wiley.
Weiner, I. B. (2005a). Rorschach assessment in child custody
cases. Journal of Child Custody, 2,
99-120.
Weiner, I. B. (2005b). Rorschach Inkblot Method. In M.
Maruish (Ed.), The use of psychological
testing in treatment planning and outcome evaluation (3rd ed.,
Vol. 3, pp. 553-588). Mahwah,
NJ: Erlbaum.
Weiner, I. B. (2006). The Rorschach Inkblot Method. In R. P.
Archer (Ed.), Forensic uses of clinical
assessment instruments (pp. 181-207). Mahwah, NJ: Erlbaum.
Weiner, I. B. (2007). Rorschach assessment in forensic cases. In A. M. Goldstein (Ed.), Forensic
psychology: Emerging topics and expanding roles (pp. 127-153). Hoboken, NJ: Wiley.
Weiner, I. B. (in press). Presenting and defending Rorschach testimony. In C. B. Gacono, F. B.
Evans, N. Kaser-Boyd, & L. Gacono (Eds.), Handbook of forensic Rorschach psychology. Mahwah,
NJ: Erlbaum.
Weiner, I. B., & Exner, J. E., Jr. (1991). Rorschach changes in long-term and short-term
psychotherapy. Journal of Personality Assessment, 56, 453-465.
Weiner, I. B., Exner, J. E., Jr., & Sciara, A. (1996). Is the Rorschach welcome in the courtroom?
Journal of Personality Assessment, 67, 422-424.
Wood, J. M., & Lilienfeld, S. O. (1999). The Rorschach Inkblot Tests: A case of overstatement?
Assessment, 6, 341-349.
Wood, J. M., Nezworski, M. T., Garb, H. N., & Lilienfeld, S. O. (2001). The misperception of
psychopathology: Problems with the norms of the comprehensive system. Clinical Psychology:
Science and Practice, 8, 360-373.
Zapf, P. A., Golding, S. L., & Roesch, R. (2006). Criminal responsibility and the insanity defense.
In I. B. Weiner & A. K. Hess (Eds.), Handbook of forensic psychology (3rd ed., pp. 332-364).
Hoboken, NJ: Wiley.
Zapf, P. A., & Roesch, R. (2006). Competency to stand trial. In I. B. Weiner & A. K. Hess (Eds.),
Handbook of forensic psychology (3rd ed., pp. 305-331). Hoboken, NJ: Wiley.
a-345-354a-395-405a-416-423
Chapter 10
REVISED NEO PERSONALITY INVENTORY
The NEO Personality Inventory (NEO PI; Costa & McCrae, 1985) and the Revised NEO
Personality Inventory (NEO PI-R; Costa & McCrae, 1992) measure five broad domains
or dimensions of personality in normal adults. Three of these domain scales, measuring
Neuroticism (N), Extraversion (E), and Openness to Experience (O), have been
researched for years and serve as the basis of the name for the original inventory (NEO).
The NEO PI also includes two additional domains, Agreeableness (A) and
Conscientiousness (C). These five domains allow for a comprehensive description of
personality in normal adults. The NEO PI-R consists of five global domains and six
facets for each domain (see Table 10.1).
Table 10.2 provides the general information on the NEO PI-R.
HISTORY
A long line of research on five-factor models of personality serves as the basis for the
NEO PI-R, most of which is beyond the scope of this Handbook (cf. Wiggins, 1996). The
rather common finding in the 1980s of five factors in personality research served as the
major impetus for a multitude of studies based on a lexical analysis of words, personality
traits, interpersonal theory, or ratings of schoolchildren's behavior. Despite critiques that
five-factor models were atheoretical, they have persisted and gained widespread acceptance
in the field of personality research. A significant impetus for this widespread acceptance
of five-factor models is the prolific work of Costa and McCrae and their publication of the
NEO PI (Costa & McCrae, 1985) and NEO PI-R (Costa & McCrae, 1992). A bibliography
(Costa & McCrae, 2003) available on the website for Psychological Assessment Resources
(www.parinc.com), the publisher of the NEO PI-R, runs to nearly 60 pages.
Both the NEO PI (Costa & McCrae, 1985) and the NEO PI-R
(Costa & McCrae, 1992)
have two forms: Form R (Rater) and Form S (Self). Form R is to
be completed by a
knowledgeable other who is well acquainted with the person and
Form S is to be completed
by the person being evaluated. Virtually all the research on the
NEO PI and NEO PI-R
has been conducted with Form S and it is the main form that
will be discussed here. More
frequent use of Form R in conjunction with Form S seems well
warranted because of the
important perspective it can provide on the person being
evaluated. At a minimum, the
reader needs to be aware of the existence of Form R so as to
consider the possibility of its
use.
316 Self-Report Inventories
Table 10.1 Revised NEO Personality Inventory (NEO PI-R) domain and facet scales
Domain                   Facets
N (Neuroticism)          N1 Anxiety
                         N2 Angry Hostility
                         N3 Depression
                         N4 Self-Consciousness
                         N5 Impulsiveness
                         N6 Vulnerability
E (Extraversion)         E1 Warmth
                         E2 Gregariousness
                         E3 Assertiveness
                         E4 Activity
                         E5 Excitement-Seeking
                         E6 Positive Emotions
O (Openness)             O1 Fantasy
                         O2 Aesthetics
                         O3 Feelings
                         O4 Actions
                         O5 Ideas
                         O6 Values
A (Agreeableness)        A1 Trust
                         A2 Straightforwardness
                         A3 Altruism
                         A4 Compliance
                         A5 Modesty
                         A6 Tender-Mindedness
C (Conscientiousness)    C1 Competence
                         C2 Order
                         C3 Dutifulness
                         C4 Achievement Striving
                         C5 Self-Discipline
                         C6 Deliberation
NEO PI (First Edition)
The NEO PI (Costa & McCrae, 1985) consisted of five domains: Neuroticism (N);
Extraversion (E); Openness (O); Agreeableness (A); and Conscientiousness (C). The name
of the inventory, NEO, was formed from the initial letter of the first three names in a
concession to an early version of the inventory that contained only those three domains.
These five domains measure the broad dimensions of personality in normal adults. The first
three domains (Neuroticism [N], Extraversion [E], Openness [O]) also had six facets or
subscales for each domain.
Table 10.2 Revised NEO Personality Inventory (NEO PI-R)
Authors:                  Costa & McCrae
Published:                1992
Edition:                  Revised
Publisher:                Psychological Assessment Resources
Website:                  www.parinc.com
Age range:                18+
Reading level:            6th grade
Administration formats:   Paper/pencil, computer, CD, cassette
Languages:                9 published and 25 validated translations
Number of items:          240
Response format:          5-point Likert scale
Administration time:      20-30 minutes
Primary scales:           5 Domains and 30 Facets
Additional scales:        None
Hand scoring:             2-part carbonless Answer Sheet (self-scoring)
General texts:            None
Computer interpretation:  Psychological Assessment Resources (Costa & McCrae)
NEO PI-R (Revised Edition)
The NEO PI-R (Costa & McCrae, 1992) consists of the same five domains as in the NEO
PI. There are only two minor differences between the NEO PI-R and the NEO PI. First,
the facet scales for Agreeableness (A) and Conscientiousness (C) were added; they had not
been available on the NEO PI. Second, 10 (4.2%) items were replaced to allow for more
accurate measurement of several facets.
Although the NEO PI-R is the focus of this chapter, two other
forms of the NEO need to
be mentioned: NEO Five-Factor Inventory (NEO-FFI; Costa &
McCrae, 1992); and NEO
PI-3 (McCrae, Costa, & Martin, 2005). Each of these other
forms of the NEO PI-R is
described in turn. This description can be very brief for both of
them because they retain
the essential features of the NEO PI-R.
NEO Five-Factor Inventory
The NEO-FFI (Costa & McCrae, 1992) is essentially an authorized short form of the
NEO PI-R. It consists of 60 items from the NEO PI-R that are used only to score the
five domains: Neuroticism (N); Extraversion (E); Openness (O); Agreeableness (A); and
Conscientiousness (C). It does not contain the items for assessing the facets within each
domain. The NEO-FFI is designed for use in circumstances in which time is too limited
to present the entire NEO PI-R or only scores on the five domains are required. All the
information provided on the domains for the NEO PI-R will apply to the NEO-FFI, so it
does not need to be discussed explicitly.
NEO PI-3
McCrae et al. (2002) identified 30 items on the NEO PI-R
(Costa & McCrae, 1992)
that were not endorsed by at least 2% of nearly 2,000
adolescents. A number of these
30 items contained words that adolescents, and even some
adults, might not understand.
An additional 18 items were identified that had item-total correlations on the facet scales
less than .30. Alternative items were developed for these 48 items, and McCrae et al. (2005)
found acceptable replacements for 37 of them. The original version of the other 11 items
was retained on the NEO PI-3. The items on the NEO PI-3 are easier to read than those on
the NEO PI-R, and the NEO PI-3 can be used for adolescents 12 years of age and older.
Further research currently is being conducted to determine whether the NEO PI-3 can be
considered as a replacement for the NEO PI-R at all ages.
The entire December 2000 issue of the journal Assessment was
devoted to the NEO
PI-R. Anyone who is using the NEO PI-R should review this
issue to get a better idea of
the broad extent and wide nature of its usage.
ADMINISTRATION
The first issue in the administration of the NEO PI-R is
ensuring that the individual is
invested in the process. Taking a few extra minutes to answer
any questions the individual
has about why the NEO PI-R is being administered and how the
results will be used will pay
excellent dividends. The examiner should work diligently to
make the assessment process
a collaborative activity with the individual to obtain the desired
information. This issue of
therapeutic assessment (Finn, 1996; Fischer, 1994) was covered in more depth in Chapter 2
(pp. 43-44). The transparent nature of the items on the NEO PI-R and the lack of extensive
means for assessing the validity of item endorsement (see later section in this chapter) make
the task of getting the individual appropriately engaged in completing the NEO PI-R all the
more important.
Reading level is not a crucial factor in determining whether a person can complete the
NEO PI-R. First, the reading level of the NEO PI-R is the sixth grade. Second, the
examiner may read the items to individuals whose reading abilities are limited and record
the responses (Costa & McCrae, 1992, p. 5). The NEO PI-R is the only self-report inventory
discussed in this Handbook that allows the examiner to read the items to the individuals.
All other self-report inventories explicitly discourage or forbid this procedure (see
Chapter 5).
SCORING
Scoring the NEO PI-R is relatively straightforward either by
hand or computer. If the NEO
PI-R is administered by computer, the computer automatically
scores it. If the individual's
responses to the items have been placed on an answer sheet,
these responses can be entered
into the computer by the clinician for scoring or they can be
hand scored. If the clinician
enters the item responses into the computer for scoring, they
should be double entered so
that any data entry errors can be identified.
One of the advantages of computer scoring is that the factor
score for each domain is
computed directly. The factor scores can be calculated for the
domains using the formulas
presented in the Manual (Costa & McCrae, 1992, p. 8), and it is
recommended that
researchers use the factor scores. "In most cases, the domain
scale scores are a good
approximation to factor scores, and it is probably not worth the
effort to apply these
formulas by hand to individual cases" (Costa & McCrae, 1992,
p. 7).
The NEO PI-R (Costa & McCrae, 1992) and the Personality Assessment Inventory (PAI;
Morey, 1991) are the only self-report inventories reviewed in this Handbook that do not
use "true/false" items. Both of these inventories have the same publisher (Psychological
Assessment Resources), and that may account for not using "true/false" items. The NEO
PI-R uses a five-point Likert scale with the options SD (Strongly Disagree), D (Disagree),
N (Neutral), A (Agree), and SA (Strongly Agree). These response options always
are presented in this same order on the answer sheet. When SD (Strongly Disagree) is the
scored direction for a specific item, the response options are scored as 4, 3, 2, 1, or 0. When
SA (Strongly Agree) is the scored direction, the preceding five response options are scored
as 0, 1, 2, 3, or 4. Thus, the total raw score on each eight-item facet scale can range from
0 to 32. The total score on a domain, each of which consists of six facet scales, can range
from 0 to 192, but the norm tables for adults are truncated at 25 and 172 (Costa & McCrae,
1992, Appendix C, p. 79).
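The item-scoring rule just described can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function names are hypothetical, and the keyed direction of any actual NEO PI-R item must come from the published scoring key, which is not reproduced here.

```python
# Response options in the fixed order printed on the answer sheet.
OPTIONS = ("SD", "D", "N", "A", "SA")


def score_item(response, keyed_direction):
    """Return the 0-4 score for one item.

    keyed_direction is "SA" when Strongly Agree is the scored direction
    and "SD" when Strongly Disagree is the scored direction.
    (Hypothetical helper; real item keys come from the published key.)
    """
    position = OPTIONS.index(response)  # 0 for SD, ..., 4 for SA
    if keyed_direction == "SA":
        return position          # SD, D, N, A, SA -> 0, 1, 2, 3, 4
    if keyed_direction == "SD":
        return 4 - position      # SD, D, N, A, SA -> 4, 3, 2, 1, 0
    raise ValueError("keyed_direction must be 'SA' or 'SD'")


def facet_raw_score(responses, keyed_directions):
    """Sum the eight item scores that make up one facet scale (range 0-32)."""
    if len(responses) != 8 or len(keyed_directions) != 8:
        raise ValueError("a facet scale has exactly eight items")
    return sum(score_item(r, k) for r, k in zip(responses, keyed_directions))
```

Because each item contributes 0 to 4 points, eight items give the 0 to 32 facet range, and six facets give the 0 to 192 domain range stated above.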
The first step in hand scoring is to examine the answer sheet
carefully and indicate
omitted items and double-marked items by drawing a line
through all five responses to
these items with brightly colored ink. Also, cleaning up the
answer sheet is helpful and
facilitates scoring. Responses that were changed need to be
erased completely if possible,
or clearly marked with an "X" so that the clinician is aware that
this response has not been
endorsed by the client.
The answer sheet for the NEO PI-R is self-scoring; that is, no templates or overlays are
required for scoring. Instead, the top page of the answer sheet is removed, and each row of
items corresponds to one of the facets for each of the domains. The facets are in numerical
order within each domain and the domains are in the order: Neuroticism (N); Extraversion
(E); Openness (O); Agreeableness (A); and Conscientiousness (C). The raw score for each
facet is the sum of the circled numbers on its row. The sum of the marked scores for the
first row is facet N1, the sum of the second row is facet E1, and so on. Once the six facet
scores have been calculated for each domain, they are summed to create the raw score for
each domain. Thus, the sum of facets N1, N2, N3, N4, N5, and N6 becomes the raw score
for domain N. These raw scores for each domain are entered into the corresponding box at
the bottom of the answer sheet.
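For readers who enter the row sums into a spreadsheet or script, the hand-scoring arithmetic can be sketched as follows. The sketch assumes that the 30 rows cycle through the domains in the order N, E, O, A, C (row 1 is N1, row 2 is E1, row 6 is N2, and so on), which is how the description above reads; the function names are hypothetical and the row layout should be checked against the actual answer sheet.

```python
DOMAIN_ORDER = "NEOAC"  # order of the domains on the answer sheet


def facet_label(row):
    """Map a 1-based answer-sheet row (1-30) to a facet label such as 'N1'.

    Assumes rows cycle N, E, O, A, C, so rows 1-5 hold facet 1 of each
    domain, rows 6-10 hold facet 2, and so on.
    """
    if not 1 <= row <= 30:
        raise ValueError("row must be between 1 and 30")
    domain = DOMAIN_ORDER[(row - 1) % 5]
    number = (row - 1) // 5 + 1
    return f"{domain}{number}"


def domain_raw_scores(row_sums):
    """Add the six facet raw scores of each domain into a domain raw score."""
    if len(row_sums) != 30:
        raise ValueError("expected 30 row sums, one per facet")
    totals = {domain: 0 for domain in DOMAIN_ORDER}
    for row, facet_sum in enumerate(row_sums, start=1):
        if not 0 <= facet_sum <= 32:
            raise ValueError(f"{facet_label(row)} is outside the 0-32 range")
        totals[facet_label(row)[0]] += facet_sum
    return totals
```

The range check is a simple guard against transcription errors, since no eight-item facet can exceed 32.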
Plotting the profile is the next step in the scoring process. There
are two profile forms
that can be used with Form S: adults (21 years of age and older)
and college (17 to 20).
Profiles are plotted separately for men and women with each of
these forms and are on
opposite sides of the same page. The college-age profile form is
used for all individuals
aged 17 to 20 no matter whether they are in college. To remove
the ambiguity, it would be
more accurate to say that the "young adult" form should be used
for all individuals between
the ages of 17 and 20 and not call it a "college" profile form.
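The selection rule for the Form S profile form is simple enough to state as a short Python function. The function name and return values are hypothetical labels used only for this sketch; the age bands are the ones given above.

```python
def choose_profile_form(age, gender):
    """Return (form, side) for Form S, following the age bands in the text.

    'college' is the publisher's label for the 17-to-20 form; it applies
    whether or not the person is actually in college.
    """
    if gender not in ("men", "women"):
        raise ValueError("gender must be 'men' or 'women'")
    if age >= 21:
        form = "adult"
    elif 17 <= age <= 20:
        form = "college"
    else:
        raise ValueError("no Form S profile form is described for ages under 17")
    return (form, gender)
```

A 19-year-old woman who is employed full time would therefore still be plotted on the women's side of the college-age form.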
Once the correct profile form has been selected for the person's
age and gender, all the
raw scores from the answer sheet are transferred to the
appropriate column of the profile
sheet (see Figure 10.1). The first five columns on the profile
sheet are the five domains (N, E, O, A, and C) and then the six facets for each domain are
presented in order. The raw
score on each domain and facet is indicated by either circling
the number or marking it
with an "x." Once the individual's scores on the five domains
have been plotted, a solid
line is drawn to connect them. A similar procedure is followed
for each of the six facets.
[Figure 10.1 NEO PI-R profile form for Domain and Facet scales. Horizontal axis: NEO domain and facet scales; vertical axis: T scores.]
The scores for the domains are not connected to the facet
scores, and the sets of facets are
plotted separately; that is, there will be seven separate lines or
profiles on the form.
ASSESSING VALIDITY
One of the few areas of contention with the NEO PI-R is whether validity scales are
necessary at all. This contention revolves around three issues: (1) whether
responses to the NEO PI-R can be distorted and thus should be assessed; (2) the prevalence
of such distortions within various groups of individuals; and (3) whether the use of validity
scales to remove questionable profiles actually improves correlations with external criteria.
Each of these issues is examined in turn.
A variety of studies have demonstrated that the NEO PI-R, like
all self-report instruments,
can be distorted by students in simulation designs either in a
positive (Ballenger, Caldwell-
Andrews, & Baer, 2001; Griffin, Hesketh, & Grayson, 2004) or
negative direction (Berry
et al., 2001).
It seems natural enough that distortions of responses occur less
frequently in normal
adults, where the NEO PI-R is used most often, because there is
little motivation for
doing so. The frequency of such distortions of responses also
should decrease when the
NEO PI-R is filled out anonymously, which typically happens in
research. Again, finding
that validity scales are not useful in normal adults and research
settings would seem to
reflect the nature of the participants and settings rather than the
usefulness of the validity
scales.
However, in clinical and personnel screening settings, it seems
probable that individuals
may distort their responses in some manner and the preceding
research demonstrates that
scores on the NEO PI-R can be distorted. In both clinical and
personnel selection settings,
the examiner is concerned with assessing potential distortions to
the domain and facet scales
in this specific individual, because it will affect the
interpretation of the scores. Thus, the
finding that validity scales may be more useful in clinical and
personnel selection settings
would seem to reflect the nature of the setting.
Several studies found that using the validity scales to remove NEO PI-R profiles with
excessively distorted responses did not increase the relationship with external correlates
(Piedmont, McCrae, Riemann, & Angleitner, 2000; Yang, Bagby, & Ryder, 2000). These
findings typically occur when large groups of participants are assessed and the
prevalence of such invalid profiles is relatively low.
Several studies also have found that using the validity scales to remove NEO PI-R
profiles with excessively distorted responses increased the relationship with external
correlates (Caldwell-Andrews, Baer, & Berry, 2000; Young & Schinka, 2001). These findings
typically occurred in clinical samples that would be more prone to distort their responses
and in most cases were instructed to do so.
Another way of framing this contention is whether response distortion is substance (a
characteristic of the individual, such as some form of psychopathology or a personality
trait) or style (an effortful alteration of responses that may be conscious or may reflect
lack of insight). In true diplomatic fashion, Morey et al. (2002) concluded that both
substantive and stylistic variance may be involved in determining responses to the NEO
PI-R in clinical patients.
Third, high scores on the Openness (O) domain are not
equivalent to intelligence,
but rather to divergent thinking and creativity. They also do not
imply that persons are
unprincipled or without values. They are willing to entertain
new ideas and can apply
these ideas conscientiously. In a similar manner, low scores on
the O domain do not mean
that persons are closed, defensive, or authoritarian, but rather
that they have a narrower
scope and intensity of interest. "Openness may sound healthier
or more mature to many
psychologists, but the value of openness or closedness depends
on the requirement of
the situation, and both open and closed individuals perform
useful functions in society"
(Costa & McCrae, 1992, p. 15).
Fourth, high scores on the Agreeableness (A) domain may seem to be more socially
preferable and psychologically healthier, and such persons are generally easier to
interact with. However, some situations require that the person be independent and skeptical
of what is happening, and being too agreeable can actually be a detriment. Dependent
Personality Disorder, which would be characterized by a high score on the Agreeableness
domain, illustrates that a high score is not necessarily psychologically healthy.
Finally, high scores on the Conscientiousness (C) domain
reflect that the person is
more active in planning and organized in carrying out their
activities. These qualities
may be expressed in academic and occupational achievement or
in annoying, fastidious
behaviors. Low scores on the Conscientiousness domain do not
reflect that individuals are
without principles to govern their behavior, but rather they are
more lackadaisical in working
toward their goals.
The six facet scores for each domain are intended to flesh out
the general qualities that
have been described by the parent domain scale. Important
differences can be identified
between individuals who have similar scores on the parent
domain and a different pattern
of scores on the facet scales for that domain. Two individuals
with similar scores on
the Extraversion (E) domain, one of whom has primary elevations on Activity (E4) and
Excitement-Seeking (E5), while the other has primary elevations on Assertiveness (E3) and
Positive Emotions (E6), are very different persons.
The interpretation of the facet scales, in addition to the domain
scales on the NEO PI-R,
is recommended in most cases, and particularly in clinical,
educational, and occupational
assessments. It is conceivable in research applications that only
the domain scales are
relevant to the issue under study, and consequently, there is no
reason to score and interpret
the facet scales. It is very important to consider computer
scoring the NEO PI-R when all
the domain and facet scales are to be interpreted, because of the
high probability of some
scoring error in making that many calculations. Computer
scoring also allows for the factor
score for each domain to be computed directly rather than using
the formulas presented in
the Manual (Costa & McCrae, 1992, p. 8) to estimate them.
APPLICATIONS
As a self-report inventory, the NEO PI-R is easily administered
in a wide variety of settings
and for a variety of purposes. It is the most widely used self-
report measure of personality
in countries around the world. Costa and McCrae (2003) reported that there are 9 published
translations, 25 validated translations, 8 research translations, and 3 more translations in
progress. Their 60-page, single-spaced bibliography illustrates the variety of issues and
research on the NEO PI-R and NEO-FFI. Any comprehensive review of this literature is
beyond the scope of this Handbook.
There are numerous settings in which the NEO PI-R is appropriate for use: clinical,
educational, medical, organizational, and research. The NEO PI-R is primarily used in
educational, organizational, and research settings. The NEO PI-R is probably underutilized
in clinical and medical settings and would seem worthy of wider usage in these settings.
The NEO PI-R comes out of a long line of research on the five-factor model of personality
that was described earlier (p. 315) and will not be reiterated. The use of the NEO PI-R is
discussed for each of the other four settings in turn.
In clinical settings, the NEO PI-R can serve at least six useful
purposes. First, it can
provide a positive or nonpathological description of the person
that can compensate for
the heavy focus on psychopathology in most assessment tools
and techniques. Most of
the self-report inventories discussed in this Handbook have few,
if any, positive statements
to make about the person. Second, the focus on the more
positive aspects of the person
can help establish rapport and build the therapeutic alliance,
and serves as an easy means
of starting the feedback of the results of the assessment process
before getting into the
psychopathological issues. Third, there is a fairly extensive
literature on the use of the
NEO PI-R in the treatment of personality disorders (cf. Costa &
Widiger, 2002). Fourth,
the assessment of validity as described should be carried out
routinely in clinical settings
because of the higher probability of some type of response
distortion. Fifth, knowledgeable
others' ratings of the person using Form R can make an important contribution to
understanding him or her, particularly when there is some reason to suspect that there may
be some type of response distortion. Finally, the NEO PI-R is neither a
diagnostic instrument nor a
measure of psychopathology and cannot be used as the sole
assessment tool or technique
in a clinical setting.
In educational settings, the NEO PI-R can be used in advising
students about personality
characteristics that will facilitate or impede their academic
progress. There are areas of
study, such as chemistry or accounting, where careful attention
to detail is mandatory for
success, and other areas, such as philosophy or literature, where
the focus is on more abstract
or larger conceptual issues, and careful attention to detail is
much less necessary. Persons
with high scores on the Conscientiousness (C) domain are more
likely to be successful in
chemistry or accounting, while persons with high scores on the
Openness (O) domain are more likely to be successful in philosophy or literature. In
neither example is academic success foreclosed in the other area, but these individuals may
have to work harder to recognize how their natural personality style affects their
academic performance, and they may need to find methods for coping with it to increase the
probability of success. The
NEO PI-R also can be used in counseling students in academic
settings, which would be
considered a clinical setting and was discussed earlier.
In medical settings, the NEO PI-R can be used to identify
personality characteristics
that might facilitate or impede treatment. The NEO PI-R will be
better accepted by medical
patients than other self-report inventories that have a heavy
focus on psychopathology.
Medical patients, particularly pain patients, are frequently upset
at the thought of psychological assessment because they think that it implies that the
problem "is all in their
head."
High scores on the Neuroticism (N) domain in medical patients can alert the examiner
to review their background and history for the potential impact of psychopathology on the
medical treatment. Medical patients with high scores on the Conscientiousness (C) domain
would be expected to be more likely to follow through on the
recommended steps for
treatment, particularly as the treatment process becomes more
complex or long-term. An
interesting line of research has used the NEO PI-R in predicting
risk for coronary heart
disease (cf. Costa, McCrae, & Dembroski, 1989), and Vollrath
and Torgersen (2002) used
the NEO PI-R to predict risky health behavior in college
students. Costa and McCrae (2003)
have listed the multiple areas in behavioral medicine in which
the NEO PI-R is being used.
In occupational settings, the NEO PI-R can be used to identify
personality characteristics
that might facilitate or impede success in a specific occupation.
As with educational settings,
certain personality dimensions are more important in some
occupations than others. These
personality dimensions can be used in selecting candidates for
specific occupations or in advising individuals on what occupations might be better suited
for them. When the NEO PI-R
is used to select potential candidates for specific occupations,
the examiner must be aware
that because examinees may simulate their scores on the
appropriate domains, evaluating
the validity of the NEO PI-R will be important (cf. Griffin et
al., 2004).
When an occupation requires a significant amount of interpersonal interaction,
individuals with higher scores on the Extraversion (E) and Agreeableness (A) domains will be
more likely to be successful than individuals with lower scores on these same domains.
Conversely, when an occupation requires a significant amount of time by oneself,
individuals with lower scores on the Extraversion and Agreeableness domains will be more
likely to be successful than individuals with higher scores on
these same domains. Again,
the examiner is reminded that when individuals do not have the
optimal scores on the
personality dimensions for a specific occupation, their success
is not precluded, but they
need to be aware of the potential impact these personality
dimensions may have on their
performance.
PSYCHOMETRIC FOUNDATIONS
Demographic Variables
Age
Specific norms are not provided by age for adults on the NEO PI-R. There are some
differences in young adults (17 to 20), and a separate profile form and norms are used for
them. The items on the NEO PI-3 are easier to read than those on the NEO PI-R, and the NEO
PI-3 can be used for adolescents 12 years of age and older. Further research currently is
being conducted to determine whether the NEO PI-3 can be considered as a replacement
for the NEO PI-R at all ages.
Terracciano, McCrae, Brant, and Costa (2005) examined age
trends on the NEO PI-R
in a sample of nearly two thousand adults in the Baltimore
Longitudinal Study on Aging.
There was a gradual curvilinear decline of slightly over one-half
of a standard deviation in
the Neuroticism (N) and Extraversion (E) domains from age 30
to age 90. There was a linear
decline in the Openness (O) domain and linear increase in the
Agreeableness (A) domain.
There was a parabolic change in the Conscientiousness (C)
domain with scores increasing
until about age 70 and then slightly declining thereafter. All
these changes in adulthood
across the five domains were about one T score point per decade
or slightly more than one-
half of a standard deviation across the entire 60-year age span.
A cross-sectional analysis
of these data produced results that are similar to the
longitudinal analysis. Terracciano et al.
also provide similar information on all 30 of the facet scales on
the NEO PI-R.
Gender
Gender does not create any general issues in NEO PI-R interpretation because separate
norms (profile forms) are used for men and women. Any gender differences in how
individuals responded to the items on each scale are removed when the
raw scores are converted
to T scores. Consequently, men and women with a T score of 60
(84th percentile) on
Agreeableness (A) are one standard deviation above the mean,
although women have a
slightly higher raw score (~142) than men (~136; Costa &
McCrae, 1992, Appendix C,
p. 79). Costa, Terracciano, and McCrae (2001) analyzed gender
differences in 26 cultures
and found that these gender differences were typically less than
one-half of a standard
deviation (5 T points), and most were closer to one-quarter of a
standard deviation, relative
to variations within gender.
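The link between raw scores, T scores, and percentiles can be illustrated with a brief Python sketch. It assumes the usual linear T-score definition (mean 50, standard deviation 10 within the norm group) and a normal distribution for the percentile conversion; the function names and the norm values used in the usage note are hypothetical placeholders, not values from the Manual.

```python
import math


def t_score(raw, norm_mean, norm_sd):
    """Linear T score: 50 at the norm-group mean, 10 points per SD."""
    return 50.0 + 10.0 * (raw - norm_mean) / norm_sd


def t_to_percentile(t):
    """Percentile rank of a T score, assuming a normal distribution."""
    z = (t - 50.0) / 10.0
    return 100.0 * 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

With placeholder norms of mean 100 and standard deviation 20, a raw score of 120 gives a T score of 60, and t_to_percentile(60) is approximately 84, matching the 84th percentile cited above. Because each gender is scored against its own norm group, different raw scores for men and women can map to the same T score.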
Education
The potential effects of education have not been investigated in
any systematic manner on
the NEO PI-R. It is not apparent that such research would yield any significant findings
any significant findings
given the ease with which the NEO PI-R is read and the similar
findings in factor structure
across multiple cultures.
Ethnicity
The effects of ethnicity per se on NEO PI-R performance have
not been studied, if ethnicity is construed as being different from culture. However, the
prolific literature on the cross-
cultural use of the NEO PI-R is discussed briefly in the next
section.
Cross-Cultural Implementation
Costa and McCrae (2003) reported that there are 9 published
translations, 25 validated
translations, 8 research translations, and 3 more translations in
progress of the NEO PI-R.
The breadth of the use of the NEO PI-R across various cultures
can be seen by the fact
that there are 79 contributing members to the Personality
Profiles of Cultures Project who
represent 51 cultures from six continents (McCrae, Terracciano,
et al., 2005). This project is
looking at the aggregate personality profiles of different
cultures to assess whether they can
provide insight into cultural differences and the stereotypes of
national character (McCrae &
Terracciano, 2006). The robustness of the factor structure of the
NEO PI-R across these
various cultures not only speaks to the usefulness of the NEO
PI-R cross-culturally, but
it allows for comparisons to be made into the actual differences
in aggregate personality
profiles. As would be expected, stereotypes of national
character are erroneous (McCrae &
Terracciano, 2006), similar to the erroneous conceptualization
that all patients within a
specific diagnostic category are alike (pp. 60-61). There are
small differences in these
aggregate personality profiles across the different cultures, but
much larger variability
within cultures. These variations in aggregate personality
profiles appear to reflect real
differences that warrant further investigation.
In summary, it appears that demographic variables have
minimal impact on the
NEO PI-R profile in most individuals. The fact that the NEO PI-R
can be read to individuals
and is available in many different languages makes its
applicability even broader.
Reliability
The NEO PI-R Manual (Costa & McCrae, 1992, table 5, p. 44)
reports the reliability
(coefficient alpha) data for 1,539 individuals for Form S.
Coefficient alpha ranged from
.56 to .81 for the facet scales and .86 to .92 for the domain
scales. The reliability data are
quite good for the domain scales that contain 48 items each. As
expected, the reliability
data are somewhat lower, though still very respectable, for the
facet scales that only have
eight items each.
A subset of the college students (N = 208) in the normative
sample for the NEO PI-R
were retested after an average of nearly 3 months with the NEO-
FFI, which allowed
determination of the reliability of the five domain scores. The
test-retest correlations ranged
from .75 to .83 across the five scales and averaged .79. The
standard error of measurement
is about 4 T points for the domain scales; that is, the
individual's true score on the domain
scales will be within ±4 T points two-thirds of the time.
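The link between a reliability coefficient and the standard error of measurement can be sketched in a few lines. The sketch assumes the conventional formula SEM = SD x sqrt(1 - r) and a T-score standard deviation of 10; neither is stated in the text, and the function name is purely illustrative.

```python
import math

def standard_error_of_measurement(reliability: float, sd: float = 10.0) -> float:
    """SEM = SD * sqrt(1 - reliability); sd defaults to 10 (assumed T-score metric)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)

# Reliability values quoted in the text: the average retest correlation (.79)
# and the range of coefficient alpha for the domain scales (.86 to .92).
for r in (0.79, 0.86, 0.92):
    sem = standard_error_of_measurement(r)
    print(f"r = {r:.2f} -> SEM = {sem:.1f} T points")
```

Under these assumptions the three values fall between roughly 3 and 5 T points, which is in line with the figure of about 4 T points given above.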
Stability
There is impressive research on the long-term stability of NEO
PI-R scores. Costa and
McCrae (1988) reported that the stability coefficients over a 6-
year period in a large sample
of adults for the domains of N (Neuroticism), E (Extraversion),
and O (Openness) were .83,
.82, and .83, respectively. The stability coefficients over a 3-
year period for the domains of
A (Agreeableness) and C (Conscientiousness) were .63 and .79,
respectively. These stability
coefficients are higher, and cover a longer time period, than those for
any of the other self-report
inventories reviewed in this Handbook.
CONCLUDING COMMENTS
The voluminous literature on the five-factor model of
personality provides solid underpin-
nings for the NEO PI-R (Costa & McCrae, 1992). The
Personality Profiles of Cultures
Project that represents 51 cultures from six continents (McCrae,
Terracciano, et al., 2005)
shows how well regarded the NEO PI-R is internationally. More
widespread use of the
NEO PI-R in clinical and medical settings to provide a positive
perspective on the person
is warranted given the heavy bias toward psychopathology in
virtually all other assessment
tests and techniques. The existence of a parallel form for rating
of the person by a knowledgeable
other (Form R) provides an invaluable source of information
any time there is reason
to suspect any type of response distortion, and it seems
particularly helpful in clinical and
medical settings.
REFERENCES
Bagby, R. M., Rector, N. A., Bindseil, K., Dickens, S. F.,
Levitan, R. D., & Kennedy, S. H. (1998).
Self-reports and informant ratings of personalities of depressed
outpatients. American Journal
of Psychiatry, 155, 437-438.
Ballenger, J. F., Caldwell-Andrews, A., & Baer, R. A. (2001).
Effects of positive impression management
on the NEO PI-R in a clinical population. Psychological
Assessment, 13, 254-260.
Berry, D. T. R., Bagby, R. M., Smerz, J., Rinaldo, J. C.,
Caldwell-Andrews, A., & Baer, R. A. (2001).
Effectiveness of NEO PI-R research validity scales for
discriminating analog malingering and
genuine psychopathology. Journal of Personality Assessment,
76, 496-516.
Caldwell-Andrews, A. (2001). Relationships between MMPI-2
validity scales and NEO PI-R exper-
imental validity scales in police candidates. Unpublished
doctoral dissertation, University of
Kentucky.
Caldwell-Andrews, A., Baer, R. A., & Berry, D. T. R. (2000).
Effects of response sets on NEO
PI-R scores and their relation to external criteria. Journal of
Personality Assessment, 74, 472-
488.
Costa, P. T., Jr., & McCrae, R. R. (1985). The NEO Personality
Inventory manual. Odessa, FL:
Psychological Assessment Resources.
Costa, P. T., Jr., & McCrae, R. R. (1988). Personality in
adulthood: A six-year longitudinal study of
self-reports and spouse ratings on the NEO PI. Journal of
Personality and Social Psychology,
54, 853-863.
Costa, P. T., Jr., & McCrae, R. R. (1992). Revised NEO
Personality Inventory (NEO PI-R) and NEO
Five-Factor Inventory (NEO-FFI) professional manual. Odessa,
FL: Psychological Assessment
Resources.
Costa, P. T., Jr., & McCrae, R. R. (2003). Bibliography for the
NEO PI-R and NEO-FFI. Lutz, FL: Psychological
Assessment Resources. Available at
www3.parinc.com/uploads/pdfs/NEO_bib.pdf.
Costa, P. T., Jr., McCrae, R. R., & Dembroski, T. M. (1989).
Agreeableness vs. antagonism: Explica-
tion of a potential risk factor for CHD. In A. Siegman & T. M.
Dembroski (Eds.), In search of
coronary-prone behavior: Beyond Type A (pp. 41-63). Hillsdale,
NJ: Erlbaum.
Costa, P. T., Jr., Terracciano, A., & McCrae, R. R. (2001).
Gender differences in personality traits
across cultures: Robust and surprising findings. Journal of
Personality and Social Psychology,
81, 322-331.
Costa, P. T., Jr., & Widiger, T. A. (2002). Personality disorders
and the five-factor model of personality
(2nd ed.). Washington, DC: American Psychological
Association.
Fiedler, E. R., Oltmanns, T. F., & Turkheimer, E. (2004). Traits
associated with personality disorders
and adjustment to military life: Predictive validity of self and
peer reports. Military Medicine,
169, 32-40.
Finn, S. (1996). Using the MMPI-2 as a therapeutic
intervention. Minneapolis: University of Min-
nesota Press.
Fischer, C. T. (1994). Individualizing psychological assessment.
Hillsdale, NJ: Erlbaum.
Griffin, B., Hesketh, B., & Grayson, D. (2004). Applicants
faking good: Evidence of bias in the NEO
PI-R. Personality and Individual Differences, 36, 1545-1558.
McCrae, R. R., Costa, P. T., Jr., & Martin, T. A. (2005). The
NEO PI-3: A more readable revised
NEO Personality Inventory. Journal of Personality Assessment,
84, 260-270.
McCrae, R. R., Costa, P. T., Jr., Parker, W. D., Mills, C. J.,
Terracciano, A., De Fruyt, F., et al.
(2002). Personality trait development from age 12 to age 18:
Longitudinal, cross-sectional, and
cross-cultural analyses. Journal of Personality and Social
Psychology, 83, 1456-1468.
McCrae, R. R., & Terracciano, A. (2006). National character and
personality. Current Directions in
Psychological Science, 15, 156-161.
McCrae, R. R., Terracciano, A., & 79 Members of the
Personality Profiles of Cultures Project. (2005).
Personality profiles of cultures: Aggregate personality traits.
Journal of Personality and Social
Psychology, 89, 407-425.
Morey, L. C. (1991). Personality Assessment Inventory:
Professional manual. Odessa, FL: Psycho-
logical Assessment Resources.
Morey, L. C., Quigley, B. D., Sanislow, C. A., Skodol, A. E.,
McGlashan, T. H., Shea, M. T., et al.
(2002). Substance or style? An investigation of the NEO PI-R
validity scales. Journal of Per-
sonality Assessment, 79, 583-599.
Pauls, C. A., & Crost, N. W. (2005). Effects of different
instructional sets on the construct validity of
the NEO PI-R. Personality and Individual Differences, 39, 297-
308.
Piedmont, R. L., McCrae, R. R., Riemann, R., & Angleitner, A.
(2000). On the invalidity of va-
lidity scales: Evidence from self-report and observer ratings in
volunteer samples. Journal of
Personality and Social Psychology, 78, 582-593.
Schinka, J. A., Kinder, B. N., & Kremer, T. (1997). Research
validity scales for the NEO PI-R:
Development and initial validation. Journal of Personality
Assessment, 68, 127-138.
Terracciano, A., McCrae, R. R., Brant, L. J., & Costa, P. T., Jr.
(2005). Hierarchical linear modeling
analyses of the NEO PI-R scales in the Baltimore Longitudinal
Study of Aging. Psychology and
Aging, 20, 493-506.
Vollrath, M., & Torgersen, S. (2002). Who takes health risks? A
probe into eight personality types.
Personality and Individual Differences, 32, 1185-1198.
Wiggins, J. S. (Ed.). (1996). The five-factor model
of personality. New York: Guilford Press.
Yang, J., Bagby, R. M., & Ryder, A. G. (2000). Response style
and the NEO PI-R: Validity scales
and spousal ratings in a Chinese psychiatric sample.
Assessment, 7, 389-402.
Young, M. S., & Schinka, J. A. (2001). Research validity scales
for the NEO PI-R: Additional
evidence for reliability and validity. Journal of Personality
Assessment, 76, 412-420.
Chapter 9
PERSONALITY ASSESSMENT
INVENTORY
The Personality Assessment Inventory (PAI: Morey, 1991) is a
broadband measure of
the major dimensions of psychopathology found in Axis I
disorders and some Axis II
disorders of the DSM-IV-TR (American Psychiatric
Association, 2000). The PAI consists
of 4 validity, 11 clinical, 5 treatment consideration, and 2
interpersonal scales (see Table
9.1). There also are three or four subscales for 9 of the 11
clinical scales and for one treatment
consideration scale. Finally, a PAI Structural Summary provides
the tables for scoring and
profiles for plotting supplemental indices. Table 9.2 provides
the general information on
the PAI.
HISTORY
The PAI (Morey, 1991) was developed following a sequential,
construct-validation strategy.
The underlying construct for most of the clinical syndrome
scales based on the extant
research is multidimensional, and so the scale to measure each
clinical syndrome was to
be composed of several subscales. Once these component
subscales were identified, items
were written so that the content was directly relevant for each
one. Each item in the original
item pool of over 2,200 items then was rated by four individuals
for its appropriateness for
the specific subscale. Then four experts were asked to assign
items to the appropriate scale,
and items that did not reach 75% agreement either were dropped
or rewritten. These items
then were reviewed by a bias-review panel as to whether they
could be perceived as being
offensive on the basis of gender, race, religion, or ethnic-group
membership. Any item that
was perceived as being offensive or could inappropriately
identify a normal behavior as
psychopathology was deleted.
Expert judges, who were nationally recognized within the
content area of each scale,
then were used to sort the remaining items to ensure that each
item was related to its actual
construct for each scale on the PAI. The overall agreement was
94.3% among these judges
for the 776 items that were retained for the alpha version of the
PAI.
Groups of college students then completed the alpha-version of
the PAI in one of three
conditions: (1) standard, in which students were asked to
respond frankly and honestly; (2)
positive-impression management, in which the students were
asked to respond as if they
were trying to impress a potential employer; and (3)
malingering, in which the students
were asked to simulate the responses of a person with a mental
disorder. Items for the beta
Table 9.1 Personality Assessment Inventory (PAI) scales
Validity Scales
ICN      Inconsistency
INF      Infrequency
NIM      Negative Impression Management
PIM      Positive Impression Management
Clinical Scales
SOM      Somatic Complaints
SOM-C    Conversion
SOM-S    Somatization
SOM-H    Health Concerns
ANX      Anxiety
ANX-C    Cognitive
ANX-A    Affective
ANX-P    Physiological
ARD      Anxiety-Related Disorders
ARD-O    Obsessive-Compulsive
ARD-P    Phobias
ARD-T    Traumatic Stress
DEP      Depression
DEP-C    Cognitive
DEP-A    Affective
DEP-P    Physiological
MAN      Mania
MAN-A    Activity Level
MAN-G    Grandiosity
MAN-I    Irritability
PAR      Paranoia
PAR-R    Resentment
PAR-H    Hypervigilance
PAR-P    Persecution
SCZ      Schizophrenia
SCZ-P    Psychotic Experience
SCZ-S    Social Detachment
SCZ-T    Thought Disorder
BOR      Borderline Features
BOR-A    Affective Instability
BOR-I    Identity Problems
BOR-N    Negative Relationships
BOR-S    Self-Harm
ANT      Antisocial Features
ANT-A    Antisocial Behaviors
ANT-E    Egocentricity
ANT-S    Stimulus-Seeking
ALC      Alcohol Problems
DRG      Drug Problems
Treatment Consideration Scales
AGG      Aggression
AGG-A    Aggressive Attitude
AGG-V    Verbal Aggression
AGG-P    Physical Aggression
SUI      Suicidal Ideation
STR      Stress
NON      Nonsupport
RXR      Treatment Rejection
Interpersonal Scales
DOM      Dominance
WRM      Warmth
version of the PAI were selected on six bases: (1) reasonable
variability across the construct,
essentially an item-difficulty parameter; (2) a positive,
corrected part-whole correlation of
the item with the total score of the other items on the scale; (3)
the corrected part-whole
correlation was higher than the correlation with measures of
social desirability and positive
and negative impression management; (4) a higher correlation
with their own scale than
other scales; (5) less face valid or "transparent" measures of the
construct embodied in the
Table 9.2 Personality Assessment Inventory (PAI)
Authors:                  Morey
Published:                1991
Edition:                  1st
Publisher:                Psychological Assessment Resources
Website:                  www.parinc.com
Age range:                18+
Reading level:            4th grade
Administration formats:   paper/pencil, computer, CD, cassette
Additional languages:     Arabic, French Canadian, Korean, Norwegian, Serbian, Slovene, and Swedish
Number of items:          344
Response format:          False/Not at all True, Slightly True, Mainly True, Very True
Administration time:      40-50 minutes
Primary scales:           4 Validity, 11 Clinical, 5 Treatment Considerations, 2 Interpersonal
Additional scales:        Subscales for 9 clinical scales and 1 Treatment Consideration scale
Hand scoring:             Self-scoring answer sheet
General texts:            Morey (2003), Morey (2007a)
Computer interpretation:  Psychological Assessment Resources (Clinical: Morey; Corrections: Morey & Edens), www.parinc.com
scale; and (6) absence of gender differences. Using these
criteria, a total of 597 items were
retained for the beta-version of the PAI.
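Criterion (2), the corrected part-whole correlation, can be illustrated with a short sketch. It correlates one item with the sum of the remaining items on the same scale, so that the item does not inflate its own correlation. The response matrix is invented for illustration and is not PAI data, the helper names are hypothetical, and every item is assumed to show some variability.

```python
import math

def pearson(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def corrected_part_whole(responses, item_index):
    """Correlate one item with the total of the OTHER items on the scale."""
    item = [row[item_index] for row in responses]
    rest = [sum(row) - row[item_index] for row in responses]
    return pearson(item, rest)

# Invented data: four respondents (rows) by three items (columns), scored 0-3.
responses = [
    [0, 1, 0],
    [1, 1, 2],
    [2, 3, 2],
    [3, 3, 3],
]
for j in range(3):
    print(f"item {j + 1}: r = {corrected_part_whole(responses, j):.2f}")
```

An item that fails to correlate positively with the rest of its scale under this computation would be a candidate for removal under criterion (2).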
The beta-version of the PAI was administered to three groups of
individuals: (1) com-
munity adults; (2) clinical patients; and (3) college students
with either positive impression
or malingering instructions. Similar item characteristics were
assessed for the beta ver-
sion of the PAI as were assessed with the alpha version. The
final 344 items on the PAI
represented the best balance of all these item characteristics,
including the requirement
that no item could be scored on more than one scale; there are
no overlapping items on
the PAI.
Normative data for the PAI were collected from three groups:
(1) 1,462 community-
dwelling adults from which a subsample of 1,000 were selected
who were census-matched;
(2) 1,265 clinical patients from 69 clinical sites; and (3) 1,051
college students. The norms
for the PAI are based on 1,000 individuals from the census-
matched sample. The skyline
profile on the standard profile form demarcates two standard
deviations above the mean in
the clinical sample allowing the clinician to compare the
individual simultaneously with
both the census-matched and clinical samples (see Figure 9.1).
Figure 9.1 PAI profile form.
[Side A of the profile form plots T scores for the validity scales (ICN, INF, NIM, PIM), clinical scales 1 to 11 (SOM, ANX, ARD, DEP, MAN, PAR, SCZ, BOR, ANT, ALC, DRG), treatment consideration scales A to E (AGG, SUI, STR, NON, RXR), and interpersonal scales Y and Z (DOM, WRM), with a row for raw scores.]
Short Form of the PAI
The first 160 items of the PAI can be used to provide a
reasonable estimate of 20 of the
22 scales, that is, all scales but Inconsistency (ICN) and
Stress (STR). These estimates
are possible because the items with the largest item-scale
correlations were located at the
beginning of the test when the final version of the PAI was
developed. Table 11.1 in the
Personality Assessment Inventory Professional Manual (Morey,
1991, p. 142) provides
the descriptive characteristics for these 160 items. The short
form only should be used in
the most unusual circumstances, and the estimated scores must
be considered as generating
only the most tentative interpretive hypotheses. Frazier, Naugle,
and Haggerty (2006) found
that agreement between the short- and full-form of the PAI was
affected adversely when the
validity scales were elevated. They also noted that individuals
with lower levels of ability
were more likely to leave items missing and produce invalid
protocols. These individuals
are the very ones for whom the short form was designed. The
hope was that it would provide
information about the presence of psychopathology that
otherwise might not be available
from a self-report inventory.
PAI-A (Adolescent)
As a result of interest by professionals in using the PAI with
adolescents in clinical settings,
work was begun in 1999 on piloting an adolescent version of the
inventory (Morey, 2007b).
The intent of this work was to explore the applicability of an
adolescent version that
would closely parallel the adult version of the PAI. It would
retain the structure and, as
much as possible, the items of the adult version rather than be
an entirely new version
targeted specifically at an adolescent population. The
development of the PAI-A involved
an adaptation of the items of the adult PAI so that the content
was meaningful when applied
to adolescents. The approach taken was a conservative one: the
question was not whether
the item was optimized to capture the experience of an
adolescent, but rather whether the
item would retain its original meaning when read by the
adolescent. This conservative
approach was merited in that the items on the adult PAI had
been selected on the basis of
numerous criteria, and the rewording or replacement of items
could have significant and
unanticipated effects on the final properties of the adolescent
version and its interpretability
as parallel to the adult version. Thus, these revisions included
rewordings of relatively few
items and involved close equivalents of the original wording.
The next stage in development involved collecting a diverse and
representative sample
of adolescent patients, and determining the psychometric
comparability of items on the
adolescent and adult versions. A relatively small number of
items were identified that
appeared to have different characteristics in adolescent patients
than in adult patients, and
the decision was made to explore the impact of elimination of
these items. On the basis of
these analyses, items were removed in an effort to eliminate the
most problematic items and
yield an item distribution pattern that would closely parallel the
adult instrument. On the
basis of this strategy, the final PAI-A included 264 items. The
PAI-A was then standardized
using a census-matched normative sample of 707 adolescents
aged 12 to 18, as well as
a diverse clinical sample of 1,160 patients in the same age
range. The average internal
consistency for the 22 scales was .79 in the community
sample and .80 in the
clinical sample, while the average test-retest reliability for
these scales was .78 over an
interval of approximately 18 days.
ADMINISTRATION
The first issue in the administration of the PAI is ensuring that
the individual is invested
in the process. Taking a few extra minutes to answer any
questions the individual may
have about why the PAI is being administered and how the
results will be used will pay
excellent dividends. The clinician should work diligently to
make the assessment process
a collaborative activity with the individual to obtain the desired
information. This issue
of therapeutic assessment (Finn, 1996; Fischer, 1994) was
covered in depth in Chapter 2
(pp. 43-44).
Reading level is a crucial factor in determining whether a
person can complete the PAI;
inadequate reading ability (to be discussed) is a major cause of
inconsistent patterns of
item endorsement. Morey (1991) suggests that most individuals
who can read at the fourth-
grade level can take the PAI with little or no difficulty because
the items are written on an
fourth-grade level or less. The PAI has the easiest reading level
of any of the self-report
inventories reviewed in this Handbook. As such, one reason for
selecting the PAI is the
larger number of clients who can complete it successfully
compared with the MMPI-2
(Butcher, Dahlstrom, Graham, Tellegen, & Kaemmer, 1989) and
the MCMI-III (Millon,
Davis, & Millon, 1997), both of which are written at the eighth-
grade level.
SCORING
Scoring the PAI is relatively straightforward either by hand or
computer. A different answer
sheet is used for hand scoring (Form HS Answer Sheet) and
optical scanning (Form SS
Answer Sheet), so the proper answer sheet must be selected for
the method of scoring. If the
PAI is administered by computer, the computer automatically
scores it. If the individual's
responses to the items have been placed on an answer sheet,
these responses can be entered
into the computer by the clinician for scoring or they can be
hand scored. If the clinician
enters the item responses into the computer for scoring, they
should be double entered so
that any data entry errors can be identified.
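Double entry amounts to keying the answer sheet twice and comparing the two passes item by item. A minimal sketch, assuming each pass is stored as a list of the 0 to 3 response codes in item order (the storage format and function name are assumptions, not part of any scoring software):

```python
def find_entry_mismatches(first_pass, second_pass):
    """Return the 1-based item numbers at which two keyed-in passes disagree."""
    if len(first_pass) != len(second_pass):
        raise ValueError("both passes must contain the same number of items")
    return [
        item_number
        for item_number, (a, b) in enumerate(zip(first_pass, second_pass), start=1)
        if a != b
    ]

# Invented example: five items keyed twice; the third item was keyed differently.
first = [0, 3, 2, 1, 0]
second = [0, 3, 1, 1, 0]
print(find_entry_mismatches(first, second))  # [3]
```

Any item number returned would be checked against the paper answer sheet before scoring proceeds.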
The first step in hand scoring is to examine the answer sheet
carefully and indicate
omitted items and double-marked items by drawing a line
through all four responses to
these items with brightly colored ink. Also, cleaning up the
answer sheet is helpful and
facilitates scoring. Responses that were changed need to be
erased completely if possible,
or clearly marked with an "X" so that the clinician is aware that
this response has not been
endorsed by the client.
The PAI (Morey, 1991) and the NEO PI-R (Costa & McCrae,
1992) are the only self-
report inventories reviewed in this Handbook that do not use
"true/false" items. Both of
these inventories have the same publisher (Psychological
Assessment Resources), which
may account for not using "true/false" items. The PAI uses a
four-point Likert scale ranging
from "false, not at all true," "slightly true," "mainly true," to
"very true." These potential
response options always are presented in this same order on the
answer sheet. When "very
true" is the scored direction for a specific item, the response
options are scored as 0, 1, 2,
or 3 ("very true"). When "false, not at all true" is the scored
direction, the preceding four
response options are scored as 3 ("false, not at all true"), 2, 1,
or 0. Thus, the total raw
score on an eight-item scale, which is the characteristic number
of items on each subscale
of the clinical scales, can range from 0 to 24. It is imperative
that the clinician realize that
the total score is the sum of the response options for each scale,
not the total number of
items endorsed on the scale, which is the method for scoring the
MCMI-III, MMPI-2, and
MMPI-A.
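The scoring rule just described can be sketched as follows. The keying pattern in the example is hypothetical (the actual scored direction of each PAI item is given on the answer sheet and is not reproduced here), and the function names are illustrative only.

```python
# Response options in the fixed order in which they appear on the answer sheet.
OPTIONS = ("false, not at all true", "slightly true", "mainly true", "very true")

def item_score(response: str, keyed_very_true: bool) -> int:
    """Score one item 0-3, reversing the weights when 'false' is the scored direction."""
    position = OPTIONS.index(response)  # 0..3 in answer-sheet order
    return position if keyed_very_true else 3 - position

def raw_scale_score(responses, keying) -> int:
    """Sum the weighted item scores; `keying` holds one boolean per item."""
    return sum(item_score(r, k) for r, k in zip(responses, keying))

# Hypothetical eight-item subscale: six items keyed "very true", two keyed "false".
keying = [True] * 6 + [False] * 2
responses = ["very true"] * 6 + ["false, not at all true"] * 2
print(raw_scale_score(responses, keying))  # 24, the maximum for eight items
```

Note that the result is a sum of weights rather than a count of endorsed items, which is the distinction the paragraph above emphasizes.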
The PAI is easier to score than other self-report inventories
because no templates are
required. The answer sheet, on which the person records his or
her responses, is self-scoring.
The items on each scale are designated by ruled and shaded
boxes that are identified by scale
abbreviations. The total raw score for each scale or subscale is
entered in the corresponding
box with the same abbreviation on Side B of the profile form.
The subscales for the various
scales on the PAI are plotted on Side B of the profile form. The
total scores, which are the
sum of the scores on the subscales, for all scales are entered on
Side A of the profile form.
Although this process of hand scoring may sound somewhat
complex, it is straightfor-
ward and can be carried out in 10 to 15 minutes. It is advisable
to have another person
double-check all the scoring and transferring of numbers to
catch any scoring or transcrip-
tion errors before the interpretive process begins.
ASSESSING VALIDITY
Figure 9.2 provides the flowchart for assessing the validity of
this specific administration
of the PAI and the criteria for using this flowchart are provided
in Table 9.3. The clinician is
reminded that the criteria provided in Table 9.3 are continuous,
yet ultimately the decisions
that must be made in implementation of the flowchart in Figure
9.2 are dichotomous.
General guidelines will be provided for translating these
continuous data into dichotomous
decisions on the PAI, but these guidelines need to be considered
within the constraints of
this specific client and the circumstances for the evaluation.
Item Omissions
Morey (1991) recommends that more than 95% of the items
should be endorsed if the PAI
is to be interpreted; that is, no more than 17 (.05 × 344 = 17.2, rounded down) items
should be omitted. Table 9.3
shows that omitting 17 items is somewhere between the 93rd
and 98th percentile in both
normal and clinical samples. Morey also recommends that more
than 80% of the items
should be endorsed for any individual scale to be interpreted.
The subscales of the clinical
scales all have six to eight items, so the omission of two items
from one of these subscales
(6/8 = 75%) would mean that subscale should not be interpreted,
although the entire scale
could be interpreted if more than 80% of its items were
endorsed.
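The two omission rules can be checked mechanically. The sketch below reads "more than 95%" and "more than 80%" as strict inequalities and uses integer arithmetic to avoid rounding surprises; the function name is an assumption made for illustration.

```python
def is_interpretable(n_items: int, n_omitted: int, min_percent: int) -> bool:
    """True when strictly more than `min_percent` of the items were answered."""
    answered = n_items - n_omitted
    return answered * 100 > min_percent * n_items

# Whole protocol (344 items, 95% rule).
print(is_interpretable(344, 17, 95))  # True: 327 of 344 answered
print(is_interpretable(344, 18, 95))  # False: 326 of 344 answered

# Eight-item subscale (80% rule).
print(is_interpretable(8, 1, 80))     # True: 7 of 8 answered
print(is_interpretable(8, 2, 80))     # False: 6 of 8 is only 75%
```

The same function can be applied to a full scale by passing the total number of items on that scale, which is how a scale can remain interpretable even when one of its subscales is not.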
Consistency of Item Endorsement
Consistency of item endorsement on the PAI is assessed by the
Inconsistency Scale (ICN)
and Infrequency Scale (INF). The Inconsistency Scale (ICN)
consists of 10 pairs of
310 Self-Report Inventories
shown on this form also allows the clinician to compare the
individual's scores on each
scale with the clinical sample.
APPLICATIONS
As a self-report inventory, the PAI is easily administered in a
wide variety of settings and
for a variety of purposes. Although the PAI was developed as a
broadband measure of
psychopathology in clinical settings, its use has gradually been
extended to forensic and
criminal settings, neuropsychological settings, and medical
settings. One of the primary
reasons for its rising popularity in these settings is that it is
shorter and easier to read than
the other self-report inventories.
Somewhat different issues must be considered in the
administration of the PAI in per-
sonnel selection and forensic settings compared with the more
usual clinical setting. These
general issues were covered in Chapter 6 with the MMPI-2 (pp.
197-198) and will not be
repeated here, but they should be consulted by anyone who is
using the PAI in personnel
selection or forensic settings for the first time.
One of the considerations in the use of any assessment test or
technique in forensic
settings is whether it will meet the legal standards for
admissibility. These considerations
were raised in Chapter 8 with the MCMI-III (pp. 276-277)
because various authors have
opined that the MCMI-III does or does not meet these legal
standards. Morey, Warner, and
Hopwood (2007) have described how the PAI meets the legal
standards for admissibility.
In a survey of forensic psychologists, Lally (2003) reported that
the PAI was rated as being
acceptable for the evaluation of mental status at the time of the
offense, risk for violence,
risk for sexual violence, competency to stand trial, and
malingering.
The PAI is increasingly being used in correctional settings
because it is shorter and
easier to read than other self-report inventories. Edens, Cruise,
and Buffington-Vollum
(2001) have provided a general overview of the issues involved
in using the PAI in forensic
and correctional settings. Edens and Ruiz (2006) reported that
elevated scores on the
Positive Impression Management (PIM > T56) scale in
conjunction with elevated scores
on the Antisocial Features (ANT > T59) scale predicted
institutional misconduct among
male inmates. Caperton, Edens, and Johnson (2004) found that
elevated scores on the
Antisocial Features (ANT > T69) scale identified sex offenders
who were more likely to
be management risks while in prison. In addition, Kucharski,
Duncan, Egan, and Falkenbach
(2006) found that three levels of psychopathy as measured by
the PCL-R were not related
to scores on the Negative Impression Management (NIM) scale, the
Malingering Index (MAL),
or Rogers' discriminant function (RDF), and that the criminal
defendants with higher levels
of psychopathy were not more likely to malinger as measured by
the PAI scales.
Finally, the PAI is being used in neuropsychological settings to
evaluate whether
the effects of brain injury have produced any psychological
sequelae. Demakis et al.
(2007) found that 34.7% of their sample of 95 individuals who
had suffered a traumatic
brain injury did not elevate any clinical scale on the PAI above
a T score of 69. This
number of unelevated profiles in individuals with brain injury is
commonly found (cf.
Warriner, Rourke, Velikonja, & Metham, 2003). The most
common two-point codetypes
were SCZ/BOR (Schizophrenia/Borderline Features), 18.9%;
DEP/SCZ (Depression/Schizophrenia), 12.6%;
and SOM/ANX (Somatic Complaints/Anxiety), 10.5%.
PSYCHOMETRIC FOUNDATIONS
Demographic Variables
Age
Morey (1996a) reported that age has minimal impact on the PAI
scale scores. Individuals who
were 18 to 29 years of age elevated the Paranoia (PAR) scale 5
T points, the Borderline
Features (BOR) scale 6 T points, the Antisocial Features (ANT)
scale 7 T points, the
Aggression (AGG) scale 5 T points, and the Stress (STR) scale
4 T points higher than
other age groups. The primary subscales affected by these
elevations were Paranoia-
Persecution (PAR-P), Borderline Features-Identity Problems
(BOR-I), Antisocial Features-
Stimulus-Seeking (ANT-S), and Aggression-Verbal Aggression
(AGG-V). There are no
subscales for Stress (STR). Individuals who were 60+ years of
age lowered these same five
scales by 4 T points. The primary subscales affected by this
lowering were Paranoia-Resentment
(PAR-R), Borderline Features-Identity Problems (BOR-I),
Antisocial Features-Antisocial
Behaviors (ANT-A), and Aggression-Physical Aggression (AGG-P).
Gender
Gender does not create any general issues in PAI interpretation
because the items were
selected to eliminate gender bias. Men elevated the Antisocial
Features (ANT) scale by 3 T
points more than women (Morey, 1996a). This elevation
primarily impacted the Antisocial
Features-Antisocial Behavior (ANT-A) subscale.
Education
The potential effects of education have not been investigated in
any systematic manner on
the PAI, although such research clearly is needed.
Ethnicity
The effects of ethnicity on PAI performance also have not been
investigated in any systematic
manner. Morey (1996a) reported that non-White individuals
elevated the Paranoia (PAR)
scale 6 T points compared with White individuals. This
elevation primarily affected the
Paranoia-Hypervigilance (PAR-H) subscale.
Reliability
The PAI Professional Manual (Morey, 1991, Appendix E)
reported the reliability data
for 75 community-dwelling adults who were retested after an
average of 24 days and 80
undergraduate students who were retested at 28 days. The test-
retest correlations ranged
from .85 to .94 in the adult sample and ranged from .66 to .90 in
the student sample across
the 11 clinical scales. The standard error of measurement ranges
from 2.8 to 4.6 T points
for these 11 clinical scales; that is, the individual's "true" score
on the clinical scales will
be within ±3 to 5 T points two-thirds of the time.
Codetype Stability
There are limited empirical data that indicate how consistently
individuals will obtain the
same two highest clinical scales on two successive
administrations of the PAL Codetype
stability was examined in all 155 individuals who were part of
the examination of retest
reliability just described. When only the single highest scale
was examined across the two
administrations, 57.4% had the same high-point scale. When
this analysis was limited only
to those individuals with significant elevations (20/155), 76.9%
had the same high-point
scale. These data should only be considered to be an estimate of
the actual codetype stability
of the PAI. Because only a single high-point scale was
considered, there has to be a lower
rate of stability when the two highest scales are required to be
the same. On the other hand,
clinical samples would produce higher elevations on the PAI
clinical scales than these
normal individuals and the preceding data suggest that
concordance rates would be higher
for more elevated profiles.
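The high-point concordance described above can be computed along the following lines. This is only a sketch: the scale labels and T scores are invented for illustration, the function names are hypothetical, and ties for the highest scale are resolved by taking the first scale listed, which is an assumption rather than a rule stated in the text.

```python
def high_point(profile):
    """Return the label of the highest scale in a {scale: T score} profile.
    Ties go to the first scale listed (an assumption of this sketch)."""
    return max(profile, key=profile.get)


def high_point_concordance(first, second, min_t=None):
    """Percentage of individuals whose high-point scale is the same on two
    administrations. If min_t is given, only individuals whose highest
    first-administration T score reaches min_t are counted."""
    pairs = list(zip(first, second))
    if min_t is not None:
        pairs = [(a, b) for a, b in pairs if max(a.values()) >= min_t]
    if not pairs:
        raise ValueError("no profiles left to compare")
    same = sum(high_point(a) == high_point(b) for a, b in pairs)
    return 100.0 * same / len(pairs)


# Invented data for four individuals tested twice on two scales.
time1 = [{"DEP": 70, "ANX": 60}, {"DEP": 50, "ANX": 55},
         {"DEP": 45, "ANX": 40}, {"DEP": 52, "ANX": 48}]
time2 = [{"DEP": 68, "ANX": 62}, {"DEP": 56, "ANX": 50},
         {"DEP": 47, "ANX": 41}, {"DEP": 49, "ANX": 51}]

print(high_point_concordance(time1, time2))            # all individuals
print(high_point_concordance(time1, time2, min_t=70))  # elevated profiles only
```

Restricting the comparison to elevated profiles mirrors the second analysis in the text, in which only individuals with significant elevations were considered.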
CONCLUDING COMMENTS
The PAI (Morey, 1991) is the newest of the self-report
inventories reviewed in this Hand-
book. The PAI is gradually gaining a wide base of usage
because it is shorter than all other
self-report inventories except the MCMI-III and it has the
lowest reading level of any of
them. There has been a substantial increase in research with the
PAI in each ensuing year
that continues to validate its use in a number of different
settings.
REFERENCES
American Psychiatric Association. (2000). Diagnostic and
statistical manual of mental disorders
(4th ed., text rev.). Washington, DC: Author.
Bagby, R. M., Nicholson, R. A., Bacchiochi, J. R., Ryder, A. G.,
& Bury, A. S. (2002). The predictive
capacity of the MMPI-2 and PAI validity scales and indexes to
detect coached and uncoached
feigning. Journal of Personality Assessment, 78, 69-86.
Baity, M. R., Siefert, C. J., Chambers, A., & Blais, M. A.
(2007). Deceptiveness with the PAI: A
study of naïve faking with psychiatric inpatients. Journal
of Personality Assessment, 88, 16-24.
Butcher, J. N., Dahlstrom, W. G., Graham, J. R., Tellegen, A.
M., & Kaemmer, B. (1989). MMPI-2:
Manual for administration and scoring. Minneapolis: University
of Minnesota Press.
Caperton, J. D., Edens, J. F., & Johnson, J. K. (2004).
Predicting sex offender institutional adjustment
and treatment compliance using the PAI. Psychological
Assessment, 16, 187-191.
Cashel, M. L., Rogers, R., & Sewell, K. (1995). The PAI and
the detection of defensiveness. Assess-
ment, 2, 333-342.
Clark, M. E., Gironda, R. J., & Young, R. W. (2003). Detection
of back random responding: Effec-
tiveness of MMPI-2 and PAI validity indices. Psychological
Assessment, 15, 223-234.
Costa, P. T., Jr., & McCrae, R. R. (1992). Revised NEO
Personality Inventory (NEO PI-R) and NEO
Five-Factor Inventory (NEO-FFI) professional manual. Odessa,
FL: Psychological Assessment
Resources.
Demakis, G. J., Hammond, F., Knotts, A., Cooper, D. B.,
Clement, P., Kennedy, J., et al. (2007).
The PAI in individuals with traumatic brain injury. Archives of
Clinical Neuropsychology, 22,
123-130.
Edens, J. F., Cruise, K. R., & Buffington-Vollum, J. K. (2001).
Forensic and correctional applications
of the PAI. Behavioral Sciences and the Law, 19, 519-543.
Edens, J. F., Poythress, N. G., & Watkins-Clay, M. M. (2007).
Detection of malingering in psychiatric
unit and general population prison inmates: A comparison of the
PAI, SIMS, and SIRS. Journal
of Personality Assessment, 88, 33-42.
Edens, J. F., & Ruiz, M. A. (2006). On the validity of validity
scales: The importance of defensive
responding in the prediction of institutional misconduct.
Psychological Assessment, 18, 220-224.
Finn, S. (1996). Using the MMPI-2 as a therapeutic
intervention. Minneapolis: University of Min-
nesota Press.
Fischer, C. T. (1994). Individualizing psychological assessment.
Hillsdale, NJ: Erlbaum.
Frazier, T. W., Naugle, R. I., & Haggerty, K. A. (2006).
Psychometric adequacy and comparability
of the short and full forms of the PAI. Psychological
Assessment, 18, 324-333.
Hopwood, C. J., Morey, L. C., Rogers, R., & Sewell, K. (2007).
Malingering on the PAI: Identification
of specific feigned disorders. Journal of Personality
Assessment, 88, 43-48.
Kucharski, L. T., Duncan, S., Egan, S. S., & Falkenbach, D. M.
(2006). Psychopathy and malingering
of psychiatric disorder in criminal defendants. Behavioral
Sciences and the Law, 24, 633-644.
Kucharski, L. T., Toomey, J. P., Fila, K., & Duncan, S. (2007).
Detection of malingering of psy-
chiatric disorder with the PAI: An investigation of criminal
defendants. Journal of Personality
Assessment, 88, 25-32.
Lally, S. J. (2003). What tests are acceptable for use in forensic
evaluations? A survey of experts.
Professional Psychology: Research and Practice, 34, 491-498.
Millon, T., Davis, R., & Millon, C. (1997). MCMI-III manual
(2nd ed.). Minneapolis, MN: National
Computer Systems.
Morey, L. C. (1991). Personality Assessment Inventory
professional manual. Odessa, FL: Psycho-
logical Assessment Resources.
Morey, L. C. (1996a). An interpretive guide to the PAI. Odessa,
FL: Psychological Assessment
Resources.
Morey, L. C. (1996b). PAI structural summary. Odessa, FL:
Psychological Assessment Resources.
Morey, L. C. (1999). PAI interpretive explorer module manual.
Odessa, FL: Psychological Assessment
Resources.
Morey, L. C. (2003). Essentials of PAI assessment. Hoboken,
NJ: Wiley.
Morey, L. C. (2007a). An interpretive guide to the PAI. Odessa,
FL: Psychological Assessment
Resources.
Morey, L. C. (2007b). Personality Assessment Inventory-Adolescent
professional manual. Odessa,
FL: Psychological Assessment Resources.
Morey, L. C., & Hopwood, C. J. (2004). Efficiency of a strategy
for detecting back random responding
on the PAI. Psychological Assessment, 16, 197-200.
Morey, L. C., Warner, M. B., & Hopwood, C. J. (2007).
Personality Assessment Inventory: Issues in
legal and forensic settings. In A. M. Goldstein (Ed.), Forensic
psychology: Emerging topics and
expanding roles (pp. 97-126). Hoboken, NJ: Wiley.
Peebles, J., & Moore, R. J. (1998). Detecting socially desirable
responding with the PAI: The Positive
Impression Management scale and the Defensiveness Index.
Journal of Clinical Psychology, 54,
621-628.
Rogers, R., Sewell, K. W., Morey, L. C., & Ustad, K. L. (1996).
Detection of feigned mental disor-
ders on the Personality Assessment Inventory: A discriminant
analysis. Journal of Personality
Assessment, 67, 629-640.
Warriner, E. M., Rourke, B. P., Velikonja, D., & Metham, L.
(2003). Subtypes of emotional and
behavioral sequelae in patients with traumatic brain injury.
Journal of Clinical and Experimental
Neuropsychology, 25, 904-917.
Chapter 6
MINNESOTA MULTIPHASIC
PERSONALITY INVENTORY-2
The Minnesota Multiphasic Personality Inventory-2 (MMPI-2:
Butcher, Dahlstrom, Graham,
Tellegen, & Kaemmer, 1989; Butcher et al., 2001) is a
broadband measure of the
major dimensions of psychopathology found in Axis I disorders
and some Axis II disorders
of the DSM-IV-TR (American Psychiatric Association,
2000). The MMPI-2 consists
of 9 validity and 10 clinical scales in the basic profile, along
with 15 content scales, 9
restructured clinical scales, and 20 supplementary scales (see
Table 6.1).
There also are subscales for most of the clinical and content
scales with easily over
120 scales that can be scored and interpreted on the MMPI-2.
Table 6.2 provides general
information on the MMPI-2.
HISTORY
Hathaway and McKinley (1940) sought to develop a
multifaceted or multiphasic personality
inventory, now known as the Minnesota Multiphasic
Personality Inventory (MMPI),
that would surmount the shortcomings of the previous
personality inventories. These shortcomings
included (a) relying on how the researcher thought
individuals should respond to
the content of items rather than validating how they actually
responded to the items; (b)
using only face-valid items whose purpose or intent was easily
understood; and (c) failing
to assess whether individuals were trying to distort their
responses to the items in some
manner. Instead of using independent sets of tests, each with a
specific purpose, Hathaway
and McKinley included in a single inventory a wide sampling of
behavior of significance to
psychologists. They wanted to create a large pool of items from
which various scales could
be constructed, in the hope of evolving a greater variety of valid
personality descriptions
than was currently available.
MMPI (Original Version)
To this end, Hathaway and McKinley (1940) assembled more
than 1,000 items from
psychiatric textbooks, other personality inventories, and clinical
experience. The items
were written as declarative statements in the first-person
singular, and most were phrased in
the affirmative. Using a subset of 504 items, Hathaway and
McKinley constructed a series
of quantitative scales that could be used to assess various
categories of psychopathology.
The items had to be answered differently by the criterion group
(e.g., hypochondriacal

Table 6.1 Minnesota Multiphasic Personality Inventory-2 (MMPI-2) scales

Validity Scales
  ?         Cannot Say
  VRIN      Variable Response Consistency
  TRIN      True Response Consistency
  F         Infrequency
  FB        Back Infrequency
  Fp        Infrequency Psychopathology
  L         Lie
  K         Correction
  S         Superlative

Clinical Scales
  1 (Hs)    Hypochondriasis
  2 (D)     Depression
  3 (Hy)    Hysteria
  4 (Pd)    Psychopathic Deviate
  5 (Mf)    Masculinity-Femininity
  6 (Pa)    Paranoia
  7 (Pt)    Psychasthenia
  8 (Sc)    Schizophrenia
  9 (Ma)    Hypomania
  0 (Si)    Social Introversion

Restructured Clinical Scales
  RCd       Demoralization
  RC1som    Somatization
  RC2lpe    Low Positive Emotionality
  RC3cyn    Cynicism
  RC4asb    Antisocial Behavior
  RC6per    Persecutory Ideas
  RC7dne    Dysfunctional Negative Emotions
  RC8abx    Aberrant Experiences
  RC9hpm    Hypomanic Activation

Content Scales
  ANX       Anxiety
  FRS       Fears
  OBS       Obsessions
  DEP       Depression
  HEA       Health Concerns
  BIZ       Bizarre Mentation
  ANG       Anger
  CYN       Cynicism
  ASP       Antisocial Practices
  TPA       Type A
  LSE       Low Self-Esteem
  SOD       Social Discomfort
  FAM       Family Problems
  WRK       Work Interference
  TRT       Negative Treatment Indicators

PSY-5 Scales
  AGGR      Aggression
  PSYC      Psychoticism
  DISC      Disconstraint
  NEGE      Negative Emotionality
  INTR      Introversion/Low Positive Emotionality

Supplementary Scales
  Broad Personality Characteristics
    A       Anxiety
    R       Repression
    Es      Ego Strength
    Do      Dominance
    Re      Social Responsibility
  Generalized Emotional Distress
    Mt      College Maladjustment
    PK      PTSD-Keane
    MDS     Marital Distress
  Behavioral Dyscontrol
    Ho      Hostility
    O-H     Overcontrolled Hostility
    MAC-R   MacAndrew Alcoholism-Revised
    AAS     Addiction Admission
    APS     Addiction Potential
  Gender Role
    GF      Gender Role-Feminine
    GM      Gender Role-Masculine

patients) as compared with normal groups. Since their approach
was strictly empirical and
no theoretical rationale was posited as the basis for accepting or
rejecting items on a specific
scale, it is not always possible to discern why a particular item
distinguishes the criterion
group from the normal group. Rather, items were selected solely
because the criterion
group answered them differently than other groups. For each of
the criterion groups and the
normative group, the frequency of "True" and "False" responses
was calculated for each
item. An item was tentatively selected for a scale if the
difference in frequency of response
between the criterion group and the normative group was at
least twice the standard error of
the proportions of true/false responses of the two groups being
compared. Having selected
items according to this procedure, Hathaway and McKinley then
eliminated some of them
for various reasons. First, the frequency of the criterion group's
response was required
to be greater than 10% for nearly all items; those items that
yielded infrequent deviant
response rates from the criterion group were excluded even if
they were highly significant
statistically because they represented so few criterion cases.
Additionally, items whose
responses appeared to reflect biases on variables such as marital
status or socioeconomic

Table 6.2 Minnesota Multiphasic Personality Inventory-2 (MMPI-2)

  Authors:                  Butcher, Dahlstrom, Graham, Tellegen, and Kaemmer
  Published:                1989
  Edition:                  2nd
  Publisher:                Pearson Assessments
  Website:                  www.PearsonAssessments.com
  Age Range:                18+
  Reading Level:            6th-8th grade
  Administration Formats:   paper/pencil, computer, CD, cassette
  Additional Languages:     Spanish, Hmong, and French for Canada
  Number of Items:          567
  Response Format:          True/False
  Administration Time:      60-90 minutes
  Primary Scales:           9 Validity, 10 Clinical, 15 Content
  Additional Scales:        5 PSY-5, 9 Restructured Clinical, 20 Supplementary
  Hand Scoring:             Templates
  General Texts:            Friedman et al. (2001); Graham (2006); Greene (2000);
                            Nichols (2001)
  Computer Interpretation:  Caldwell Report (Caldwell); Pearson Assessments (Butcher);
                            Psychological Assessment Resources (Greene)

status were excluded. Evaluation of several methods of
weighting individual items showed
no advantage over using unweighted items. Therefore, each item
simply received a weight
of "one" in deriving a total score. In other words, a person's
score on any MMPI scale is
equal to the total number of items that the individual answers in
the same manner as the
criterion group.
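The item-selection rule described above can be illustrated with a short sketch. It assumes that the "standard error of the proportions" is the usual unpooled standard error of a difference between two independent proportions; the text does not spell out the exact formula, and the function names and sample figures below are hypothetical.

```python
import math


def se_of_difference(p1: float, n1: int, p2: float, n2: int) -> float:
    """Unpooled standard error of the difference between two independent
    proportions (an assumed reading of the criterion in the text)."""
    return math.sqrt(p1 * (1.0 - p1) / n1 + p2 * (1.0 - p2) / n2)


def tentatively_selected(p_criterion: float, n_criterion: int,
                         p_normative: float, n_normative: int,
                         min_criterion_rate: float = 0.10) -> bool:
    """Apply the two screens described in the text to one item.

    p_criterion and p_normative are the proportions of each group giving the
    deviant response. The item passes if the difference is at least twice its
    standard error and the criterion group's rate exceeds 10%."""
    difference = abs(p_criterion - p_normative)
    se = se_of_difference(p_criterion, n_criterion, p_normative, n_normative)
    return difference >= 2.0 * se and p_criterion > min_criterion_rate


# Hypothetical item: 40% of 50 criterion cases versus 15% of 700 normals.
print(tentatively_selected(0.40, 50, 0.15, 700))
# Hypothetical item that is statistically distinct but too rarely endorsed.
print(tentatively_selected(0.09, 200, 0.01, 700))
```

The second call shows why the 10% screen matters: an item can differ reliably between groups and still describe too few criterion cases to be useful.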
The empirical approach to item selection used by Hathaway and
McKinley, in fact, freed
them of any concerns about how any individual interprets
specific items because it assumes
that the individual's self-report is just that and makes no a priori
assumptions about the
relationships between the individual's self-report and the
individual's behavior. Items are
selected for inclusion in a specific scale only because the
criterion group answered the items
differently than the normative group irrespective of whether the
item content is actually an
accurate description of the criterion group. Any relationship
between individuals' responses
on a given scale and their behavior must be demonstrated
empirically.
MMPI-2 (Restandardized Version)
The MMPI-2 (Butcher et al., 1989, 2001) represents the
restandardization of the MMPI that
was needed to provide current norms for the inventory, develop
a nationally representative
and larger normative sample, provide appropriate representation
of ethnic minorities, and
update item content where needed. Continuity between the
MMPI and the MMPI-2 was
maintained because new criterion groups and item derivation
procedures were not used
on the standard validity and clinical scales. Thus, the items on
the validity and clinical
scales of the MMPI are essentially unchanged on the MMPI-2
except for the elimination of
13 items based on item content and the rewording of 68 items.
In the development of the MMPI-2, the Restandardization
Committee (Butcher et al.,
1989) started with the 550 items on the original MMPI; that is,
they first deleted the 16
repeated items. They reworded 141 of these 550 items to
eliminate outdated and sexist
language and to make these items more easily understood.
Rewording these items did
not change the correlations of the items with the total scale
score in most cases (Ben-
Porath & Butcher, 1989). Many of these items were omitted on
the original MMPI because
individuals did not understand them. Greene (1991, p. 57)
provides examples of these items
such as playing drop the handkerchief. The Restandardization
Committee then added 154
provisional items that resulted in the 704 items on Form AX,
which was used to collect the
normative data for the MMPI-2.
When finalizing the items to be included on the MMPI-2, the
Restandardization Com-
mittee deleted 77 items from the original MMPI in addition to
the 13 items deleted from the
standard validity and clinical scales and the 16 repeated items.
Consequently, most special
and research scales that have been developed on the MMPI are
still capable of being scored
unless the scale has an emphasis on religious content or the
items are drawn predominantly
from the last 150 items on the original MMPI.
The Restandardization Committee included 68 of the 141 items
that had been rewritten,
and they incorporated 107 of the provisional items to assess
major content areas that were
not covered in the original MMPI item pool. The rationale for
including and dropping items
from Form AX that resulted in the 567 items on the MMPI-2 has
not been made explicit.
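The item counts in the preceding paragraphs can be checked with simple arithmetic. The sketch below uses only figures quoted in the text; the variable names are illustrative, and it assumes that the 77 deleted items and the 13 deleted items are separate subsets of the 550 unique original items, which is the reading that reproduces the 567-item total.

```python
# Figures quoted in the text.
unique_original_items = 550      # original MMPI items after removing the 16 repeats
provisional_items_added = 154    # new items written for Form AX
deleted_from_standard_scales = 13
other_original_items_deleted = 77
provisional_items_kept = 107

# Form AX, the experimental booklet used to collect the normative data.
form_ax_items = unique_original_items + provisional_items_added

# Original items carried forward to the MMPI-2 (assumed: the two deleted
# sets do not overlap).
original_items_retained = (unique_original_items
                           - deleted_from_standard_scales
                           - other_original_items_deleted)

# Final MMPI-2 booklet.
mmpi2_items = original_items_retained + provisional_items_kept

print(form_ax_items, original_items_retained, mmpi2_items)
```

Under this reading, 460 original items were retained, 68 of which had been reworded.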
The MMPI-2 was standardized on a sample of 2,600 individuals
who resided in seven
different states (California, Minnesota, North Carolina, Ohio,
Pennsylvania, Virginia, and
Washington) to reflect national census parameters on age,
marital status, ethnicity, educa-
tion, and occupational status. The normative sample for the
MMPI-2 varies significantly
from the original normative sample for the MMPI in several
areas: years of education, rep-
resentation of ethnic minorities, and occupational status. The
individuals in the normative
sample for the MMPI-2 also are more representative of the
United States as a whole because
national census parameters were used in their collection.
However, they still varied from
the census parameters on years of education and occupational
status. The potential im-
pact of this higher level of education and occupation in the
MMPI-2 normative sample on
codetype and scale interpretation has been a focus of concern
(Caldwell, 1997c; Helmes
& Reddon, 1993). However, Schinka and LaLone (1997)
compared a census-matched sub-
sample created within the MMPI-2 restandardization sample and
found only one difference
that exceeded 3 T score points between these two samples on
the standard validity and
clinical scales, content scales, and supplementary scales.
The extant literature that has examined the empirical correlates
of MMPI-2 scales and
codetypes has been consistent with the correlates reported for
their MMPI counterparts
(Archer, Griffin, & Aiduk, 1995; Graham, Ben-Porath, &
McNulty, 1999). It appears safe
to assume that the correlates of well-defined MMPI-2 codetypes
(the two highest clinical
scales composing the codetype should be at least five T points
higher than the next highest
clinical scale) and the individual validity and clinical scales
will be very similar to those
for the MMPI. The data are less clear for MMPI-2 codetypes
that are not well-defined,
although it still will be safe to interpret the individual validity
and clinical scales in these
codetypes using MMPI correlates given the minimal change at
the scale level.
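The definition of a well-defined codetype given above lends itself to a brief sketch. The function name and the T scores are hypothetical, and the code simply ranks whatever clinical scales it is given; which scales are eligible to enter a codetype is left to the clinician.

```python
def two_point_codetype(clinical_t, min_gap=5.0):
    """Return the two-point codetype and whether it is well-defined.

    clinical_t maps scale labels (e.g., "2", "7") to T scores. The codetype
    is well-defined when the second-highest scale is at least min_gap T
    points above the third-highest, so that both scales composing the
    codetype exceed the next highest scale by that margin."""
    if len(clinical_t) < 3:
        raise ValueError("at least three scales are needed")
    ranked = sorted(clinical_t.items(), key=lambda kv: kv[1], reverse=True)
    (first, _), (second, second_t), (_, third_t) = ranked[:3]
    well_defined = (second_t - third_t) >= min_gap
    return f"{first}-{second}", well_defined


# Invented profiles for illustration.
print(two_point_codetype({"2": 80, "7": 76, "4": 65, "8": 60}))
print(two_point_codetype({"2": 80, "7": 76, "4": 73, "8": 60}))
```

In the second profile the third scale is only 3 T points below the second, so the same 2-7 codetype would not count as well-defined.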
New sets of scales have been developed with the MMPI-2 item
pool: content scales
(Butcher, Graham, Williams, & Ben-Porath, 1990); content
component scales (Ben-Porath
& Sherwood, 1993); personality psychopathology five scales
(PSY-5: Harkness, McNulty,
Ben-Porath, & Graham, 2002); and restructured clinical scales
(Tellegen et al., 2003).
Several major reviews of the MMPI-2 (Butcher, Graham, &
Ben-Porath, 1995; Butcher
& Rouse, 1996; Caldwell, 1997c; Greene, Gwin, & Staal, 1997;
Helmes & Reddon, 1993)
provide summaries from a variety of perspectives on this
venerable instrument. These
reviews provide the interested reader with an excellent starting
point for looking at the
current status of the MMPI-2. Butcher et al. (1995) and Greene
et al. (1997) also outline
the general steps that researchers need to follow and issues that
need to be addressed in
conducting research with the MMPI-2. It is to be hoped that
researchers will heed the advice
dispensed in these reviews to enhance the quality of the data
that are being collected.
Unlike the MMPI which was used with all ages, the MMPI-2 is
to be used only with
adults 18 years of age and older. Adolescents are to be tested
with the MMPI-A (Butcher
et al., 1992), which is designed specifically for them (see
Chapter 7).
ADMINISTRATION
The first requirement in the administration of the MMPI-2 is
ensuring that the individual is
invested in the process. It will pay excellent dividends to spend
a few extra minutes answer-
ing any questions the individual may have about why the
MMPI-2 is being administered and
how the results will be used. The clinician should work
diligently to make the assessment
process a collaborative activity with the individual to obtain the
desired information. This
issue of therapeutic assessment (Finn, 1996; Fischer, 1994) was
covered in more depth in
Chapter 2 (pp. 43-44).
Reading level is a crucial factor in determining whether a
person can complete the
MMPI-2; inadequate reading ability is a major cause of
inconsistent patterns of item
endorsement to be discussed later. Butcher et al. (1989) suggest
that most clients who
have had at least 8 years of formal education can take the
MMPI-2 with little or no
difficulty because the items are written on an eighth-grade level
or less. A number of
authors (Dahlstrom, Archer, Hopkins, Jackson, & Dahlstrom,
1994; Paolo, Ryan, & Smith,
1992; Schinka & Borum, 1993) have studied the readability of
MMPI-2 and MMPI-A
items. There was general concurrence that the average
readability of the MMPI-2 and
MMPI-A is in the range of fifth to sixth grade. The scales
requiring the highest reading
levels were 9 (Ma: Hypomania); the three content scales of
Antisocial Practices (ASP),
Cynicism (CYN), and Type A (TPA); and several of the Harris and
Lingoes (1955) subscales:
Hy2 (Need for Affection), Pa3 (Naivete), Sc5 (Lack of Ego
Mastery, Defective Inhibition),
Ma1 (Amorality), Ma2 (Psychomotor Acceleration), Ma3
(Imperturbability), and Ma4 (Ego
Inflation). On most of these scales, at least 25% of their items
required more than an eighth-
required more than an eighth-
grade reading level. These estimates of the required grade level
are conservative because
they are based on assessing the readability of individual MMPI-
2 items or groups of items.
They are not based on the difficulty of understanding what is
meant by saying either "true"
or "false" to a specific item. The reader can assess this problem
directly by trying to
understand exactly what is meant by saying "false" to an MMPI-
2 item that is worded in
the negative. What do individuals actually mean when they say
"false" to an item such as
"I do not always have pain in my back"? Schinka and Borum did
suggest that individuals
be asked to read MMPI-2 items 114, 226, and 445 if they have
completed less than a 10th
grade education to determine whether their reading skills are
adequate. Dahlstrom et al.
(1994) also noted that the instructions for the MMPI-2 actually
were more difficult than the
items on the test so clinicians should be sure the individual
fully understands them.
SCORING
Scoring the MMPI-2 is relatively straightforward either by hand
or computer. If the
MMPI-2 is administered by computer, the computer
automatically scores it. If the in-
dividual's responses to the items have been placed on an answer
sheet, these responses can
be entered into the computer by the clinician for scoring or they
can be hand-scored. If the
clinician enters the item responses into the computer for
scoring, they should be double
entered so that any data entry errors can be identified.
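Double entry can be checked mechanically by comparing the two keyed-in response strings item by item. The following is a minimal sketch with a hypothetical function name; it assumes each entry is a list with one response per item ("T", "F", or None for an omitted item), in booklet order.

```python
def entry_discrepancies(first_entry, second_entry):
    """Return the 1-based item numbers on which two data entries of the same
    answer sheet disagree, so they can be checked against the sheet."""
    if len(first_entry) != len(second_entry):
        raise ValueError("the two entries must cover the same number of items")
    return [item
            for item, (a, b) in enumerate(zip(first_entry, second_entry), start=1)
            if a != b]


# Invented four-item example: the entries disagree on item 2.
print(entry_discrepancies(["T", "F", "T", None], ["T", "T", "T", None]))
```

Any item number returned is resolved by going back to the answer sheet, not by trusting either entry.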
The first step in hand-scoring is to examine the answer sheet
carefully and indicate
omitted items and double-marked items by drawing a line with
brightly colored ink through
both the "true" and "false" responses to these items. Also,
cleaning up the answer sheet
helps facilitate scoring. Responses that were changed need to be
erased completely if
possible, or clearly marked with an "X" so that the clinician is
aware that this response has
not been endorsed by the client.
There is one scale that must always be scored without a
template. The Cannot Say (?)
scale score is the total number of items not marked and double
marked. All the other scales
are scored by placing a plastic template over the answer sheet
with a small box drawn at
the scored (deviant) response, either "true" or "false," for each
item on the scale. The
total number of such items marked equals the client's raw score
for that scale; this score
is recorded in the proper space on the answer sheet. One scale,
Scale 5 (Mf: Masculinity-Femininity),
is scored differently for men and women, and
unusually high or low scores
on this scale might indicate that the wrong template was used.
Among women, a raw score
less than 30 is unusual, and such raw scores should at least
arouse a suspicion that the
wrong template was used in scoring the scale. All scoring
templates are made of plastic
and must be kept away from heat.
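The logic of template scoring can be expressed in a few lines. This sketch uses an invented four-item answer sheet and an invented scoring key; actual MMPI-2 keys are published with the test materials and are not reproduced here. Omitted and double-marked items are both represented as None, and the function names are hypothetical.

```python
def cannot_say(responses):
    """Cannot Say (?) raw score: the number of omitted or double-marked items,
    both represented as None in this sketch."""
    return sum(1 for response in responses.values() if response is None)


def raw_score(responses, scoring_key):
    """Raw score for one scale: the number of keyed items answered in the
    scored (deviant) direction, each counted with a unit weight."""
    return sum(1 for item, keyed in scoring_key.items()
               if responses.get(item) == keyed)


# Invented answer sheet (item number -> response) and invented key.
answers = {1: "T", 2: "F", 3: None, 4: "T"}
hypothetical_key = {1: "T", 2: "T", 4: "T"}

print(cannot_say(answers))                   # one unscorable item
print(raw_score(answers, hypothetical_key))  # items 1 and 4 match the key
```

The key plays the role of the plastic template: it marks, for each item on the scale, the single response that earns a point.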
Plotting the profile is the next step in the scoring process. In
essence, the clinician
transfers all the raw scores from the answer sheet to the
appropriate column of the profile
sheet (see Figure 6.1). Some precautions must be taken and data
calculations performed.
First, separate profile sheets are used for men and women as
with the scoring templates for
Scale 5; an unusually high or low score plotted for Scale 5
should alert the clinician to the
possibility that the wrong profile sheet was selected. Second,
each column on the profile
sheet is used to represent the raw scores for a specific scale.
Each dash represents a raw
score of 1 with the larger dashes marking increments of 5. Thus,
the clinician notes the
individual's raw score on the scale being plotted and makes a
point or dot at the appropriate
dash. Once the clinician has plotted the individual's scores on
the eight validity scales, a
solid line is drawn to connect them. The raw score on the
Cannot Say (?) scale is merely
recorded in the proper space in the lower left-hand corner of the
profile sheet.
A similar procedure is followed to plot the 10 clinical scales
except that five of the clinical
scales (1 [Hs: Hypochondriasis], 4 [Pd: Psychopathic Deviate],
7 [Pt: Psychasthenia],
8 [Sc: Schizophrenia], and 9 [Ma: Hypomania]) are K-corrected;
that is, a fraction of K is
added to the raw score before the individual's score is plotted.
For these five scales that
Spike 3 codetypes. A client with a T score of 60 on the F scale
is almost 15 points higher
than the mean for Spike 3 codetypes, and nearly 40 points lower
than the mean for 6-8/8-6
codetypes. A T score of 60 is unusual in both of these
codetypes; in the former it is higher
than expected and in the latter it is much lower than expected.
Similar variations can be
seen in the T scores for Scales 2 (D: Depression) and 8 (Sc:
Schizophrenia).
A codetype analysis can be further refined by considering
additional clinical scales to
create three- and four-point codetypes. A number of two-point
codetypes have frequent
three-point variants that should be considered in the
interpretation of the MMPI-2, such as
variants of 2-4/4-2 (2-4/4-2-(3), 2-4/4-2-(7), 2-4/4-2-(8)) and
2-7/7-2 (2-7/7-2-(1), 2-7/7-2-(3),
2-7/7-2-(8), 2-7/7-2-(0)) codetypes. Again, the
interpretation of a client's score on a
given scale will change as the prototypic score changes in the
three-point codetypes within
a particular group.
The final "group" with which the MMPI-2 can be compared in
the interpretive process
is the individual, or idiographic, interpretation. In this
comparison, the relative elevations
of the scales become important because they indicate which
content domains are more
or less important for this particular individual. An individual
who has T scores of 75
and 60 on the content scales of Depression (DEP) and Anxiety
(ANX), respectively, is
saying that symptoms of depression are more of a problem than
symptoms of anxiety. The
MMPI-2 content (Butcher et al., 1990) and content component
(Ben-Porath & Sherwood,
1993) scales are an excellent means of developing such an
idiographic interpretation of an
individual's MMPI-2 profile, because the various content
domains can be juxtaposed so
that the clinician can compare them directly.
APPLICATIONS
As a self-report inventory, the MMPI-2 is easily administered in
a wide variety of settings
and for a variety of purposes. Although the MMPI was
developed originally in a clinical
setting with a primary focus on establishing a diagnosis for the
person (Hathaway &
McKinley, 1940), its uses quickly broadened to include more
general descriptions of the
behavior and symptoms of most forms of psychopathology (cf.
Dahlstrom et al., 1972).
This use was followed by extensions into the screening of
applicants in personnel selection
settings and a multitude of uses in forensic settings.
Somewhat different issues must be considered in the
administration of the MMPI-2 in
personnel selection and forensic settings compared with the
more usual clinical setting.
First, not only is the administration not going to be therapeutic,
the MMPI-2 results have
the potential to cause a fairly negative impact on the individual.
The individual may not be
selected in a personnel-screening setting or be less likely to be
considered for custody of
children because of the acknowledgment of significant
psychopathology.
Second, the assessment of validity is particularly important
because different forensic
settings can have a significant impact on the data that are
obtained from an individual. Items
particularly sensitive to this impact are likely to be those items
about which an individual
is not sure or ambivalent in responding. In civil forensic
settings such as personal injury,
workers' compensation, and insurance disability claims, this
impact is likely to be in the
opposite direction from that in parenting examinations or
personnel selection. Portraying
oneself as being more impaired in cases for civil damages is
likely to benefit an individual's
claim; portraying oneself as being less impaired and more
psychologically healthy is likely
to benefit an individual's chances of being selected, or at least
not screened out, in a
personnel-screening setting. Consequently, it behooves the
forensic psychologist to know
what types of MMPI-2 scores and profiles are to be expected in
every forensic setting.
There also are different expectations of whether to report
problematic behaviors and
symptoms in criminal cases. Individuals who are being
evaluated for competency to stand
trial, or for the introduction of mitigating circumstances during
the sentencing phase after
a conviction for murder, are likely to have different expectations
than individuals being evaluated for probation or parole
about which problematic behaviors and symptoms of psychopathology
are, or are not, to be
reported. Individuals in the former forensic contexts would be
expected to report any and all
problematic behaviors or symptoms that might be in any way
relevant to their circumstances,
while individuals in the latter would not be expected to report
any problematic behaviors
or symptoms.
Third, in a forensic setting it must be kept in mind that the
MMPI-2 is being used to
address a specific psycholegal issue rather than as a general
screen for psychopathology.
Thus, the interpretations provided of the MMPI-2 must be
relevant to this psycholegal
issue. For example, the mere presence of psychopathology as
indicated by elevation of
several clinical scales on the MMPI-2 may not be directly
relevant to the psycholegal issue
of quality of parenting skills in a child-custody examination or
the ability to understand
legal proceedings in a competency hearing.
Finally, whether it is the prosecution (plaintiff) or the defense
(defendant) that has
retained the forensic psychologist also may impact the
problematic behaviors and symptoms
reported by an individual, but there are minimal empirical data
on this point. Hasemann
(1997) provided data on workers' compensation claimants who
were evaluated by forensic
psychologists for both the defense and the plaintiff. The
claimant reported more symptoms
and distress to the forensic psychologist retained by the defense
attorney. Consequently,
some of the differences in examinations performed by forensic
psychologists on the same
individual may reflect that he actually describes problematic
behaviors and symptoms
differently depending on whether he believes that the forensic
psychologist is likely to be
sensitive or insensitive to his self-report. The underlying
heuristic of an individual is likely
to be that the opposing forensic psychologist will require more
proof to be able or willing
to perceive and report an individual as being impaired. These
results suggest that being
examined by the plaintiff's expert and then by the defense's
expert over the same psycholegal
issue should be considered as different forensic contexts rather
than as the same one.
PSYCHOMETRIC FOUNDATIONS
Demographic Variables
Age
Specific norms are not provided by age on the MMPI-2, even
though it is well known that
there are substantial effects of age below the age of 20. These
age effects are reflected in
the development of separate sets of adolescent norms for the
original MMPI (Marks &
Briggs, 1972), and the restandardization of a different form of
the MMPI for adolescents
(MMPI-A: Butcher et al., 1992). Colligan and his colleagues
(Colligan, Osborne, Swenson,
& Offord, 1983, 1989) found substantial effects of age on
MMPI performance in their
contemporary normative sample with differences of 10 or more
T points between 18- and
19-year-olds and 70-year-olds on Scales L (Lie) and 9 (Ma:
Hypomania). Several MMPI-2
scales demonstrate differences of nearly 5 T points between 20-year-olds
and 60-year-olds
(Butcher et al., 1989, 2001; Caldwell, 1997b, 1997c; Greene &
Schinka, 1995) with scores
on Scales L (Lie: women only), 1 (Hs: Hypochondriasis), and 3
(Hy: Hysteria) increasing
and Scales 4 (Pd: Psychopathic Deviate) and 9 (Ma:
Hypomania) decreasing with age.
Given that these age comparisons involve different cohorts, it is
not possible to know
whether these effects actually reflect the influence of age or
simply differences between the
cohorts. Butcher et al. (1991) found few effects of age in older
(>60) men and they saw no
reason for age-related norms in these men.
Gender
Gender does not create any general issues in MMPI-2
interpretation because separate norms
(profile forms) are used for men and women. Any gender
differences in how individuals
responded to the items on each scale are removed when the raw
scores are converted to T
scores. Consequently, men and women with a T score of 70 on
Scale 2 (D: Depression) are
two standard deviations above the mean, although women have
endorsed more items (30)
than men (28). When the MMPI-2 is computer scored by
Pearson Assessment, unigender
norms also are provided for each scale. Even a cursory perusal
of these unigender norms
will show that men and women have very similar scores on all
MMPI-2 scales except
for those three scales specifically related to gender (Scale 5
[Mf: Masculinity-Femininity];
Gender-Role Feminine [GF]; Gender-Role Masculine [GM]).
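The way separate norms absorb gender differences in raw scores can be sketched in a few lines. This is a minimal illustration that assumes a simple linear T-score transformation (mean 50, standard deviation 10); the actual conversion relies on the publisher's norm tables, which are not reproduced here. The function name and the means and standard deviations below are hypothetical, chosen only so that the raw scores quoted above (30 for women, 28 for men) both convert to a T score of 70.

```python
def linear_t_score(raw: float, norm_mean: float, norm_sd: float) -> float:
    """Return a linear T score (mean 50, SD 10) for a raw score."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    return 50.0 + 10.0 * (raw - norm_mean) / norm_sd


# Hypothetical Scale 2 norms as (mean, SD) for each gender. These values are
# invented for illustration; they are not the published MMPI-2 norms.
HYPOTHETICAL_SCALE_2_NORMS = {
    "women": (20.0, 5.0),
    "men": (18.0, 5.0),
}

# Different raw scores, same standing relative to each gender's own norm group.
women_t = linear_t_score(30, *HYPOTHETICAL_SCALE_2_NORMS["women"])
men_t = linear_t_score(28, *HYPOTHETICAL_SCALE_2_NORMS["men"])
print(women_t, men_t)
```

Because each raw score is referenced to its own norm group, the two-item difference in endorsement disappears once the scores are expressed as T scores.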
Education
The potential effects of education have not been investigated in
any systematic manner
either on the MMPI or the MMPI-2, although such research is
needed. When the men
and women in the MMPI-2 normative group with less than a
high school education were
contrasted with men and women with postgraduate education
(Dahlstrom & Tellegen, 1993,
pp. 58-59), the differences on the following scales exceeded 5 T
points: L (Lie: women
only), F (Infrequency), K (Correction), 5 (Mf: Masculinity-Femininity), and 0 (Si: Social
Introversion). Men and women with less than a high school education had higher scores in
all these comparisons except for Scales K (Correction) and 5 (Mf: Masculinity-Femininity).
When psychiatric patients with 8 years or less of education were
contrasted with patients
with 16 or more years of education (Caldwell, 1997b), the
differences ranged from 4 to
8 T points on all the scales except 3 (Hy: Hysteria). The
individuals with less education
had higher scores in all these comparisons except for Scales K
(Correction) and 5 (Mf:
Masculinity-Femininity).
Occupation
There do not appear to be any systematic effects for occupation
or income within the
MMPI-2 normative group (Dahlstrom & Tellegen, 1993; Long,
Graham, & Timbrook,
1994). There have been no studies of the effects of these two
factors in psychiatric patients.
Ethnicity
The effects of ethnicity on MMPI performance have been
reviewed by Dahlstrom, Lachar,
and Dahlstrom (1986) and Greene (1987), and they concluded
that there is not any consistent
pattern of scale differences between any two ethnic groups. A
similar conclusion has been
offered in several other reviews of the effect of ethnicity on
MMPI-2 performance (Greene,
1991, 2000; Hall, Bansal, & Lopez, 1999).
Multivariate regressions of age, education, gender, ethnicity,
and occupation on the
standard validity and clinical scales in the MMPI-2 normative
group (Dahlstrom & Tellegen,
1993) and psychiatric patients (Caldwell, 1997 [age, education,
and gender only]; Schinka,
LaLone, & Greene, 1998) have shown that the percentage of
variance accounted for by
these factors does not exceed 10%. Such small percentages of
variance are unlikely to
impact the interpretation of most MMPI-2 profiles. The one
exception is Scale 5 (Mf:
Masculinity-Femininity) in which slightly over 50% of the
variance is accounted for by
gender.
In summary, demographic variables appear to have minimal
impact on the MMPI-2
profile in most individuals. It may be important to monitor the
validity of the MMPI-2
profile more closely in persons with limited education and
lower-status occupations. A major
reason that demographic effects are seen in these persons may
simply be that the
reading level of the MMPI-2 is approximately the eighth grade
(Butcher et al., 1989, 2001;
Greene, 2000).
Reliability
The MMPI-2 Manual (Butcher et al., 1989, 2001, Appendix E)
reports the reliability data
for 82 men and 111 women who were retested after an average
of 8.58 days. The test-
retest correlations ranged from .54 to .93 across the 10 clinical
scales and averaged .74.
The standard error of measurement is about 5 T points for the
clinical scales, that is, the
individual's true score on the clinical scales will be within ±5 T
points two-thirds of the
time.
The test-retest correlations for the 15 content scales range from
.77 to .91 and averaged
.85. The standard error of measurement is about 4 T points for
the content scales, that is,
the individual's true score on the content scales will be within
±4 T points two-thirds of the
time.
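The standard errors quoted above can be approximated from the reliabilities. The sketch below assumes the classical test theory formula SEM = SD × sqrt(1 − r) and a T-score standard deviation of 10; the text reports the resulting values but does not state the formula, so treat the derivation as an assumption. The function names are illustrative.

```python
import math
from typing import Tuple


def standard_error_of_measurement(reliability: float, sd: float = 10.0) -> float:
    """Classical test theory SEM: SD * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


def one_sem_band(observed_t: float, reliability: float, sd: float = 10.0) -> Tuple[float, float]:
    """Band of plus or minus one SEM around an observed T score."""
    sem = standard_error_of_measurement(reliability, sd)
    return (observed_t - sem, observed_t + sem)


# Average test-retest correlations reported in the text.
clinical_sem = standard_error_of_measurement(0.74)  # about 5 T points
content_sem = standard_error_of_measurement(0.85)   # about 4 T points
print(round(clinical_sem, 1), round(content_sem, 1))
```

Under these assumptions, an observed T score of 65 on a clinical scale corresponds to a band of roughly 60 to 70, which is the "two-thirds of the time" interval described above.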
Codetype Stability
There is little empirical data indicating how consistently clients
will obtain the same
codetype on two successive administrations of the MMPI or the
MMPI-2. The research on
the stability of the MMPI has historically focused on the
reliability of individual scales, as
discussed above, which leaves unanswered whether clients' codetypes
remain unchanged across administrations.
There would be at least some cause for concern if a client
obtained a 4-9/9-4 codetype on
one occasion and on a second administration of the MMPI-2 a
few months later in another
setting obtained a 2-7/7-2 codetype.
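To make the notion of codetype agreement concrete, the following sketch derives the high-point scale and the two-point codetype from a set of T scores and compares two administrations. It assumes that a two-point codetype is simply the two highest clinical scales regardless of order (so 4-9 and 9-4 are treated as the same codetype); it applies no elevation or definition criteria, and ties are broken by the order in which scales are listed. The profiles are hypothetical, and Scales 5 and 0 are left out only for brevity.

```python
from typing import Dict, FrozenSet


def high_point(profile: Dict[str, int]) -> str:
    """Return the scale with the highest T score."""
    return max(profile, key=lambda scale: profile[scale])


def two_point_codetype(profile: Dict[str, int]) -> FrozenSet[str]:
    """Return the two highest scales as an unordered pair."""
    ranked = sorted(profile, key=lambda scale: profile[scale], reverse=True)
    return frozenset(ranked[:2])


# Hypothetical T scores for one client on two occasions (not real data).
first_admin = {"1": 55, "2": 60, "3": 52, "4": 78, "6": 58, "7": 62, "8": 64, "9": 74}
second_admin = {"1": 58, "2": 80, "3": 60, "4": 61, "6": 55, "7": 76, "8": 63, "9": 50}

same_high_point = high_point(first_admin) == high_point(second_admin)
same_codetype = two_point_codetype(first_admin) == two_point_codetype(second_admin)
print(same_high_point, same_codetype)
```

Applying the same comparison to every client in a retest sample and taking the proportion of matches gives a percentage agreement figure of the kind reported in codetype stability studies.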
Graham, Smith, and Schwartz (1986) have provided the only
empirical data on the
stability of MMPI codetypes for a large sample (N = 405) of
psychiatric inpatients. They
reported 42.7%, 44.0%, and 27.7% agreement across an average
interval of approximately
3 months for high-point, low-point, and two-point codetypes,
respectively. If the patients
were classified into the categories of neurotic, psychotic, and
characterological, 58.1%
remained in the same category when retested.
Greene, Davis, and Morse (1993, August) reported the stability
of the MMPI in 454
alcoholic inpatients who had been retested after an interval of
approximately 6 months.
Approximately 40% of the men and 32% of the women had the
same single high-point
scale on the two successive administrations of the MMPI.
However, they had the same
two-point codetype only 12% and 13% of the time, respectively.
Almost 30% of these men
and women had two totally different high-point scales when
they took the MMPI on their
second admission.
These data on codetype stability, or more accurately the lack
thereof, suggest several
important conclusions. First, clinicians should be cautious
about making long-term
predictions from a single administration of the MMPI-2. Rather,
an MMPI-2 profile
should be interpreted as reflecting the individual's current
status. Second, it is not clear
whether the shifts that do occur in codetypes across time reflect
meaningful changes in the
clients' behaviors, psychometric instability of the MMPI-2, or
some combination of both
factors.
CONCLUDING COMMENTS
The MMPI-2 (Butcher et al., 1989, 2001) is the oldest and the
most widely used of the
self-report inventories. The numerous validity scales have
served it well in assessing the
many forms of response distortion that are encountered in the
various settings in which the
MMPI-2 is administered. The MMPI-2 is the prototype of an
empirically derived test in
which the correlates of individual scales and codetypes are
determined through research.
There is an extensive research base on most of the major issues
in the assessment of
psychopathology reflecting its long history of use.
REFERENCES
American Psychiatric Association. (2000). Diagnostic and
statistical manual of mental disorders (4th
ed., text rev.). Washington, DC: Author.
Arbisi, P. A., & Ben-Porath, Y. S. (1995). An MMPI-2
infrequent response scale for use with
psychopathological populations: The Infrequency
Psychopathology scale: F(p). Psychological
Assessment, 7, 424-431.
Archer, R. P., Griffin, R., & Aiduk, R. (1995). MMPI-2 clinical
correlates for ten common codes.
Journal of Personality Assessment, 65, 391-407.
Bathurst, K., Gottfried, A. W., & Gottfried, A. E. (1997).
Normative data for the MMPI-2 in child
custody litigation. Psychological Assessment, 9, 205-211.
Ben-Porath, Y. S., & Butcher, J. N. (1989). Psychometric
stability of rewritten MMPI items. Journal
of Personality Assessment, 53, 645-653.
Ben-Porath, Y. S., & Sherwood, N. E. (1993). The MMPI-2
content component scales: Development,
psychometric characteristics, and clinical application.
Minneapolis: University of Minnesota
Press.
Butcher, J. N., Aldwin, C. M., Levenson, M. R., Ben-Porath, Y.
S., Spiro, A., & Bosse, R. (1991).
Personality and aging: A study of the MMPI-2 among older
men. Psychology and Aging, 6,
361-370.
Butcher, J. N., Dahlstrom, W. G., Graham, J. R., Tellegen, A.
M., & Kaemmer, B. (1989). MMPI-2:
Manual for administration and scoring. Minneapolis: University
of Minnesota Press.
Butcher, J. N., Graham, J. R., & Ben-Porath, Y. S. (1995).
Methodological problems and issues in
MMPI, MMPI-2, and MMPI-A research. Psychological
Assessment, 7, 320-329.
Butcher, J. N., Graham, J. R., Ben-Porath, Y. S., Tellegen, A.
M., Dahlstrom, W. G., & Kaemmer, B.
(2001). MMPI-2: Manual for administration and scoring (Rev.
ed.). Minneapolis: University of
Minnesota Press.
Butcher, J. N., Graham, J. R., Williams, C. L., & Ben-Porath, Y.
S. (1990). Development and use of
the MMPI-2 content scales. Minneapolis: University of
Minnesota Press.
Butcher, J. N., & Han, K. (1995). Development of an MMPI-2
scale to assess the presentation of self
in a superlative manner: The S scale. In J. N. Butcher & C. D.
Spielberger (Eds.), Advances in
personality assessment (Vol. 10, pp. 25-50). Hillsdale, NJ:
Erlbaum.
Butcher, J. N., & Rouse, S. V. (1996). Personality: Individual
differences and clinical assessment.
Annual Review of Psychology, 47, 87-111.
Butcher, J. N., Williams, C. L., Graham, J. R., Archer, R. P.,
Tellegen, A., Ben-Porath, Y. S.,
et al. (1992). MMPI-A (Minnesota Multiphasic Personality
Inventory-Adolescent): Manual for
administration, scoring, and interpretation. Minneapolis:
University of Minnesota Press.
Caldwell, A. B. (1997a). [MMPI-2 data research file for clinical
patients]. Unpublished raw data.
Caldwell, A. B. (1997b). [MMPI-2 data research file for
personnel applicants]. Unpublished raw data.
Caldwell, A. B. (1997c). Whither goest our redoubtable mentor
the MMPI/MMPI-2? Journal of
Personality Assessment, 68, 47-66.
Caldwell, A. B. (1998). [MMPI-2 data research file for pain
patients]. Unpublished raw data.
Caldwell, A. B. (2006). Maximal measurement or meaningful
measurement: The interpretive challenges
of the MMPI-2 Restructured Clinical (RC) scales.
Journal of Personality Assessment, 87,
193-201.
Colligan, R. C., Osborne, D., Swenson, W. M., & Offord, K. P.
(1983). The MMPI: A contemporary
normative study. New York: Praeger.
Colligan, R. C., Osborne, D., Swenson, W. M., & Offord, K. P.
(1989). The MMPI: A contemporary
normative study (2nd ed.). Odessa, FL: Psychological
Assessment Resources.
Cord, E. L. J., Sajwaj, T. E., Tolliver, D. K., & Ford, T. W.
(1997, June). Normative update on MMPI-2
data for a large federal power utility. Paper presented at the
32nd annual Symposium on Recent
Developments in the use of the MMPI-2 and MMPI-A,
Minneapolis, MN.
Dahlstrom, W. G., Archer, R. P., Hopkins, D. G., Jackson, E.,
& Dahlstrom, L. E. (1994). Assessing the
readability of the Minnesota Multiphasic Inventory Instruments:
The MMPI, MMPI-2, MMPI-A.
Minneapolis: University of Minnesota Press.
Dahlstrom, W. G., Lachar, D., & Dahlstrom, L. E. (1986).
MMPI patterns of American minorities.
Minneapolis: University of Minnesota Press.
Dahlstrom, W. G., & Tellegen, A. (1993). Socioeconomic status
and the MMPI-2: The relation
of MMPI-2 patterns to levels of education and occupation.
Minneapolis: University of
Minnesota Press.
Dahlstrom, W. G., Welsh, G. S., & Dahlstrom, L. E. (1972). An
MMPI handbook: Vol. I. Clinical
interpretation (Rev. ed.). Minneapolis: University of Minnesota
Press.
Finn, S. (1996). Using the MMPI-2 as a therapeutic
intervention. Minneapolis: University of
Minnesota Press.
Fischer, C. T. (1994). Individualizing psychological assessment.
Hillsdale, NJ: Erlbaum.
Fowler, R. A., Butcher, J. N., & Williams, C. L. (2000).
Essentials of MMPI-2 and MMPI-A interpretation
(2nd ed.). Minneapolis: University of Minnesota
Press.
Friedman, A. F., Lewak, R., Nichols, D. S., & Webb, J. T.
(2001). Psychological assessment with the
MMPI-2 (2nd ed.). Hillsdale, NJ: Erlbaum.
Gough, H. G. (1950). The F minus K dissimulation index for the
MMPI. Journal of Consulting
Psychology, 14, 408-413.
Gough, H. G. (1954). Some common misconceptions about
neuroticism. Journal of Consulting
Psychology, 18, 287-292.
Graham, J. R. (2006). MMPI-2: Assessing personality and
psychopathology (4th ed.). New York:
Oxford University Press.
Graham, J. R., Ben-Porath, Y. S., & McNulty, J. L. (1999).
MMPI-2 correlates for outpatient
community mental health settings. Minneapolis: University of
Minnesota Press.
Graham, J. R., Smith, R. L., & Schwartz, G. F. (1986). Stability
of MMPI configurations for psychiatric
inpatients. Journal of Consulting and Clinical Psychology, 54,
375-380.
Greene, R. L. (1987). Ethnicity and MMPI performance: A
review. Journal of Consulting and Clinical
Psychology, 55, 497-512.
Greene, R. L. (1991). The MMPI-2/MMPI: An interpretive
manual. Boston: Allyn & Bacon.
Greene, R. L. (2000). The MMPI-2: An interpretive manual.
Boston: Allyn & Bacon.
Greene, R. L., & Brown, R. C. (2006). MMPI-2 adult
interpretive system (3rd ed.). Lutz, FL:
Psychological Assessment Resources.
Greene, R. L., Davis, L. J., Jr., & Morse, R. M. (1993, August).
Stability of MMPI codetypes in
alcoholic inpatients. Paper presented at the annual meeting of
the American Psychological
Association, San Francisco.
Greene, R. L., Gwin, R., & Staal, M. (1997). Current status of
MMPI-2 research: A methodological
overview. Journal of Personality Assessment, 68, 20-36.
Greene, R. L., & Schinka, J. A. (1995). [MMPI-2 data research
file for psychiatric inpatients and
outpatients]. Unpublished raw data.
Hall, G. C. N., Bansal, A., & Lopez, I. R. (1999). Ethnicity and
psychopathology: A meta-analytic
review of 31 years of comparative MMPI/MMPI-2 research.
Psychological Assessment, 11,
186-197.
Harkness, A. R., & McNulty, J. L. (1994). The Personality
Psychopathology Five (PSY-5): Issue
from the pages of a diagnostic manual instead of a dictionary.
In S. Strack & M. Lorr (Eds.),
Differentiating normal and abnormal personality (pp. 291-315).
New York: Springer.
Harkness, A. R., McNulty, J. L., Ben-Porath, Y. S., & Graham,
J. R. (2002). MMPI-2 Personality
Psychopathology Five (PSY-5) scales: Gaining an overview for
case conceptualization and
treatment planning. Minneapolis: University of Minnesota
Press.
Harris, R. E., & Lingoes, J. C. (1955). Subscales for the MMPI:
An aid to profile interpretation.
Unpublished manuscript, University of California.
Hasemann, D. M. (1997). Practices and findings of mental
health professionals conducting workers'
compensation evaluations. Unpublished doctoral dissertation,
University of Kentucky.
Hathaway, S. R., & McKinley, J. C. (1940). A multiphasic
personality schedule (Minnesota): Pt. I.
Construction of the schedule. Journal of Psychology, 10,
249-254.
Helmes, E., & Reddon, J. R. (1993). A perspective on
developments in assessing psychopathology:
A critical review of the MMPI and MMPI-2. Psychological
Bulletin, 113, 453-471.
Keller, L. S., & Butcher, J. N. (1991). Assessment of chronic
pain patients with the MMPI-2.
Minneapolis: University of Minnesota Press.
Koss, M. P., & Butcher, J. N. (1973). A comparison of
psychiatric patients' self-report with other
sources of clinical information. Journal of Research in
Personality, 7, 225-236.
Lachar, D., & Wrobel, T. A. (1974). Validating clinicians'
hunches: Construction of a new MMPI
critical item set. Journal of Consulting and Clinical Psychology,
47, 277-284.
Lees-Haley, P. R. (1997). MMPI-2 base rates for 492 personal
injury plaintiffs: Implications and
challenges for forensic assessment. Journal of Clinical
Psychology, 53, 745-755.
Long, K. A., Graham, J. R., & Timbrook, R. E. (1994).
Socioeconomic status and MMPI-2 interpretation.
Measurement and Evaluation in Counseling and
Development, 27, 158-177.
MacAndrew, C. (1965). The differentiation of male alcoholic
outpatients from nonalcoholic psychiatric
outpatients by means of the MMPI. Quarterly Journal of
Studies on Alcohol, 26, 238-246.
Marks, P. A., & Briggs, P. F. (1972). Adolescent norm tables for
the MMPI. In W. G. Dahlstrom,
G. S. Welsh, & L. E. Dahlstrom (Eds.), An MMPI handbook:
Vol. I. Clinical interpretation (Rev.
ed., pp. 388-399). Minneapolis: University of Minnesota Press.
Meehl, P. E. (1957). When should we use our heads instead of
the formula? Journal of Counseling
Psychology, 4, 268-273.
Megargee, E. I., Mercer, S. J., & Carbonell, J. L. (1999).
MMPI-2 with male and female state and
federal prison inmates. Psychological Assessment, 11, 177-185.
Nichols, D. S. (2001). Essentials of MMPI-2 assessment. New
York: Wiley.
Paolo, A. M., Ryan, J. J., & Smith, A. J. (1992). Reading
difficulty of MMPI-2 subscales. Journal of
Clinical Psychology, 47, 529-532.
Paulhus, D. L. (1984). Two-component models of socially
desirable responding. Journal of Personality
and Social Psychology, 46, 598-609.
Paulhus, D. L. (1986). Self-deception and impression
management in test responses. In A. Angleitner
& J. S. Wiggins (Eds.), Personality assessment via
questionnaires: Current issues in theory and
measurement (pp. 143-165). Berlin, Germany: Springer-Verlag.
Schinka, J. A., & Borum, R. (1993). Readability of adult
psychopathology inventories. Psychological
Assessment, 5, 384-386.
Schinka, J. A., & LaLone, L. (1997). MMPI-2 norms:
Comparisons with a census-matched subsample.
Psychological Assessment, 9, 307-311.
Schinka, J. A., LaLone, L., & Greene, R. L. (1998). Effects of
psychopathology and demographic
characteristics on MMPI-2 scale scores. Journal of Personality
Assessment, 70, 197-211.
Tellegen, A., Ben-Porath, Y. S., McNulty, J. L., Arbisi, P. A.,
Graham, J. R., & Kaemmer, B.
(2003). The MMPI-2 Restructured Clinical Scales:
Development, validation, and interpretation.
Minneapolis: University of Minnesota Press.
Weed, N. C., Butcher, J. N., & Ben-Porath, Y. S. (1995).
MMPI-2 measures of substance abuse.
In J. N. Butcher & C. D. Spielberger (Eds.), Advances in
personality assessment (Vol. 10, pp.
121-145). Hillsdale, NJ: Erlbaum.
Weed, N. C., Butcher, J. N., McKenna, T., & Ben-Porath, Y. S.
(1992). New measures for assessing
alcohol and drug abuse with the MMPI-2: The APS and AAS.
Journal of Personality Assessment,
58, 389-404.
Welsh, G. S. (1956). Factor dimensions A and R. In G. S. Welsh
& W. G. Dahlstrom (Eds.), Basic
readings on the MMPI in psychology and medicine (pp. 264-281).
Minneapolis: University of
Minnesota Press.
Chapter 8
MILLON CLINICAL MULTIAXIAL
INVENTORY-III
The Millon Clinical Multiaxial Inventory-III (MCMI-III:
Millon, Davis, & Millon, 1994,
1997) is a broadband measure of the major dimensions of
psychopathology found in Axis II
disorders and some Axis I disorders of the DSM-IV-TR
(American Psychiatric Association,
2000). The MCMI-III consists of 4 validity (modifier) scales, 11
personality style scales, 3
severe personality style scales, 7 clinical syndrome scales, and
3 severe clinical syndrome
scales (see Table 8.1). Table 8.2 provides the general
information on the MCMI-III. In
contrast to the MMPI-2 (Butcher, Dahlstrom, Graham, Tellegen,
& Kaemmer, 1989) that
has 120+ additional scales, the MCMI-III does not have any
subscales for these basic
sets of scales or separate content scales so there are only 28
total scales on the MCMI-III.
Consequently, learning to interpret the MCMI-III is more
straightforward than learning to interpret the MMPI-2.
Recently Grossman and del Rio (2005) described the
development of 35 facet scales for
the 14 personality style scales that represent the first such
attempt to create subscales for
any of the versions of the MCMI. These facet scales are very
new so there is little research
on them or clinical information on their use. They are described
briefly later in this chapter.
HISTORY
Millon (1983; Millon & Davis, 1996) conceptualized an evolutionary framework for
personality in which the interface of three polarities (pleasure-pain; active-passive; self-other)
determines an individual's specific personality style as an adaptation to the environment.
The pleasure-pain polarity involves either seeking pleasure as a way of enhancing life
or avoiding pain as a way of constricting life. The active-passive polarity involves either
working to change unfavorable aspects of the environment or accepting unfavorable aspects
that cannot be changed.
Table 8.3 presents the functional processes and structural
domains for each of the 14
personality disorder styles in the MCMI-III. Millon et al.
(1997) believe that each cell of this
matrix contains the diagnostic attribute or criterion that best
captures the personality style
within that specific functional process or structural domain.
Reading down each column
provides an overview of how each personality style differs on
each functional process or
structural domain. Reading across each row provides an
overview of how each personality
style can be described.
Millon's conceptual system for personality disorders does not map directly onto the
DSM-IV-TR (American Psychiatric Association, 2000) Axis II personality disorders. The
latter is an atheoretical categorical system that describes the behaviors and symptoms needed
to make a specific personality disorder diagnosis. Millon also includes personality disorders
such as Sadistic (Aggressive) and Depressive on the MCMI-III that are not included in the
DSM-IV-TR.

Table 8.1 Millon Clinical Multiaxial Inventory-III (MCMI-III)

Modifying Indices (Validity Scales)
  V    Validity Index
  X    Disclosure Index
  Y    Desirability Index
  Z    Debasement Index
Personality Styles
  1    Schizoid
  2A   Avoidant
  2B   Depressive
  3    Dependent
  4    Histrionic
  5    Narcissistic
  6A   Antisocial
  6B   Sadistic (Aggressive)
  7    Compulsive
  8A   Negativistic (Passive-Aggressive)
  8B   Masochistic
Severe Personality Styles
  S    Schizotypal
  C    Borderline
  P    Paranoid
Clinical Syndromes
  A    Anxiety Disorder
  H    Somatoform Disorder
  N    Bipolar Disorder: Manic
  D    Dysthymic Disorder
  B    Alcohol Dependence
  T    Drug Dependence
  R    Posttraumatic Stress Disorder
Severe Clinical Syndromes
  SS   Thought Disorder
  CC   Major Depression
  PP   Delusional Disorder
MCMI (First Edition)
The original MCMI (Millon, 1977) had five major distinguishing features when compared
with the MMPI (Hathaway & McKinley, 1951), which was the primary self-report inventory
in use at the time. First, the MCMI was developed following Millon's comprehensive
clinical theory described earlier, in contrast to the atheoretical or empirical development
of the original MMPI (see Chapter 6). Second, the MCMI contained specific scales to
assess personality disorders, the more enduring personality characteristics of patients,
which would be incorporated into Axis II of the forthcoming diagnostic system at the time,
that is, DSM-III (American Psychiatric Association, 1980). Third, the comparison group
consisted of a representative sample of psychiatric patients instead of normal individuals,
which would facilitate differential diagnosis among patients. Fourth, scores on the scales
were transformed into actuarial base rates. These base rates reflected the actual frequency
with which various forms of psychopathology occurred rather than traditional standard
scores, which measure how far the person deviates from the mean of normal individuals.
Finally, the MCMI was designed to use as few items as possible to achieve these goals. At
175 items, the MCMI was and remains the shortest self-report inventory that is a broadband
measure of the major dimensions of psychopathology.

Table 8.2 Millon Clinical Multiaxial Inventory-III (MCMI-III)

Authors:                  Millon, Davis, Millon
Published:                1994
Edition:                  3rd
Publisher:                Pearson Assessments
Website:                  www.PearsonAssessments.com/tests/MCML3
Age range:                18+
Reading level:            8th grade
Administration formats:   Paper/pencil, computer, CD, cassette
Languages:                Spanish
Number of items:          175
Response format:          True/False
Administration time:      25-30 minutes
Primary scales:           4 Validity, 11 Personality Styles, 3 Severe Personality Styles,
                          7 Clinical Syndromes, 3 Severe Clinical Syndromes
Additional scales:        35 (42) Facet
Hand scoring:             Templates
General texts:            Choca (2004), Craig (2005), Jankowski (2002), Millon et al. (1997),
                          Retzlaff (1995), Strack (2002)
Computer interpretation:  Pearson Assessments (Millon); Psychological Assessment Resources
                          (Craig)
The original MCMI had four items that evaluated whether the
person had read the items.
These four items became the Validity (V) scale, which assesses the
consistency of item endorsement, on the
ensuing editions of the MCMI.
Table 8.3 Expression of personality disorders across the functional and structural domains
of personality

Functional Processes
Disorder          Expressive Acts  Interpersonal Conduct  Cognitive Style  Regulatory Mechanisms
1   Schizoid      Impassive        Unengaged              Impoverished     Intellectualization
2A  Avoidant      Fretful          Aversive               Distracted       Fantasy
2B  Depressive    Disconsolate     Defenseless            Pessimistic      Asceticism
3   Dependent     Incompetent      Submissive             Naive            Introjection
4   Histrionic    Dramatic         Attention-Seeking      Flighty          Dissociation
5   Narcissistic  Haughty          Exploitive             Expansive        Rationalization
6A  Antisocial    Impulsive        Irresponsible          Deviant          Acting Out
6B  Sadistic      Precipitate      Abrasive               Dogmatic         Isolation
7   Compulsive    Disciplined      Respectful             Constricted      Reaction Formation
8A  Negativistic  Resentful        Contrary               Skeptical        Displacement
8B  Masochistic   Abstinent        Deferential            Diffident        Exaggeration
S   Schizotypal   Eccentric        Secretive              Autistic         Undoing
C   Borderline    Spasmodic        Paradoxical            Capricious       Regression
P   Paranoid      Defensive        Provocative            Suspicious       Projection

Structural Attributes
Disorder          Self-Image     Object Representation  Morphologic Organization  Mood/Temperament
1   Schizoid      Complacent     Meager                 Undifferentiated          Apathetic
2A  Avoidant      Alienated      Vexatious              Fragile                   Anguished
2B  Depressive    Worthless      Forsaken               Depleted                  Melancholic
3   Dependent     Inept          Immature               Inchoate                  Pacific
4   Histrionic    Gregarious     Shallow                Disjointed                Fickle
5   Narcissistic  Admirable      Contrived              Spurious                  Insouciant
6A  Antisocial    Autonomous     Debased                Unruly                    Callous
6B  Sadistic      Combative      Pernicious             Eruptive                  Hostile
7   Compulsive    Conscientious  Concealed              Compartmentalized         Solemn
8A  Negativistic  Discontented   Vacillating            Divergent                 Irritable
8B  Masochistic   Undeserving    Discredited            Inverted                  Dysphoric
S   Schizotypal   Estranged      Chaotic                Fragmented                Distraught or Insensitive
C   Borderline    Uncertain      Incompatible           Split                     Labile
P   Paranoid      Inviolable     Unalterable            Inelastic                 Irascible

Note: Self-Other are reversed in Compulsive and Negativistic.
Source: MCMI-III Manual, second edition (p. 27), by T. Millon, R. Davis, and C. Millon, 1997,
Minneapolis, MN: National Computer Systems. Reprinted with permission from table 2.2.

The original MCMI did not have explicit validity scales to assess the accuracy of item
endorsement. Instead a weight factor was developed based on the variation of the person's
score from the midpoint of the total raw score for the eight basic personality scales. When
this total raw score was below 110, the person was thought to be too cautious in reporting
problematic behaviors and symptoms of psychopathology so their scores would need to
be adjusted upward. Conversely, when the total raw score was above 130, the person was
thought to be too open or self-revealing so their scores would need to be adjusted downward.
This weight factor became an explicit validity (modifier) scale (Disclosure [X]) on the
ensuing forms of the MCMI.
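The logic of this weight factor can be summarized as a simple decision rule. The sketch below encodes only the direction of the adjustment, using the cutoffs of 110 and 130 given above; the size of the adjustment is not stated in the text and is therefore not modeled. The function name and return labels are illustrative assumptions, not part of the MCMI scoring materials.

```python
def weight_factor_direction(total_raw: int, lower: int = 110, upper: int = 130) -> str:
    """Classify the direction of the original MCMI weight-factor adjustment.

    total_raw is the summed raw score on the eight basic personality scales.
    """
    if total_raw < lower:
        # Too cautious in reporting problems: scores are adjusted upward.
        return "adjust upward"
    if total_raw > upper:
        # Too open or self-revealing: scores are adjusted downward.
        return "adjust downward"
    return "no adjustment"


for total in (95, 120, 142):
    print(total, weight_factor_direction(total))
```

Totals that fall exactly on a cutoff (110 or 130) are treated here as needing no adjustment, because the text describes adjustments only for scores below 110 or above 130.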
MCMI-II (Second Edition)
The second edition of the MCMI (MCMI-II: Millon, 1987) appeared in 1987 to enhance
several features of the original MCMI. Two new personality disorder scales
(Aggressive/Sadistic and Self-Defeating [Masochistic]) and three validity (modifier) scales
(Disclosure [X], Desirability [Y], and Debasement [Z]) were added to the profile form.
Forty-five new items (45/175 = 25.7%) were added to replace
45 extant items that did
not add sufficient discriminating power to their scales.
Modifications also were made in
the MCMI-II items to bring the scales into closer coordination
with DSM-III-R (American
Psychiatric Association, 1987). An item-weighting procedure
was added wherein items
with greater prototypicality for a given scale were given higher
weights of 2 or 3. If an
item was endorsed in the nonscored direction, it was assigned a
weight of 0. If an item was
endorsed in the scored direction, it was assigned a weight of 1,
2, or 3 depending on how
prototypical the item was for that scale with the most
prototypical items assigned a weight
of 3.
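The item-weighting procedure just described amounts to a weighted sum over the items keyed to a scale. The following sketch shows that arithmetic under stated assumptions: the item numbers, keyed directions, and weights are hypothetical and do not come from any actual MCMI-II scoring key, and the function name is illustrative.

```python
from typing import Dict, Tuple

# Maps item number -> (scored response, prototypicality weight of 1, 2, or 3).
ScaleKey = Dict[int, Tuple[bool, int]]


def weighted_raw_score(responses: Dict[int, bool], scale_key: ScaleKey) -> int:
    """Sum the weights of items endorsed in the scored direction.

    Items answered in the nonscored direction (or left unanswered) add 0.
    """
    total = 0
    for item, (scored_response, weight) in scale_key.items():
        if weight not in (1, 2, 3):
            raise ValueError(f"item {item}: weight must be 1, 2, or 3")
        if responses.get(item) == scored_response:
            total += weight
    return total


# Hypothetical four-item key and one person's True/False responses.
HYPOTHETICAL_KEY: ScaleKey = {3: (True, 3), 17: (True, 2), 42: (False, 1), 58: (True, 1)}
responses = {3: True, 17: False, 42: False, 58: True}

score = weighted_raw_score(responses, HYPOTHETICAL_KEY)
print(score)
```

In this example item 17 is answered in the nonscored direction and contributes 0, so the raw score is 3 + 1 + 1 = 5. Because a single highly prototypical item can add as much as three ordinary items, distorted responses to even a few items can shift a scale noticeably.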
The replacement of one-quarter of the items from the original MCMI limits the
generalizability of its results to the MCMI-II. Even though the scales still have the same
names, the actual items composing a scale may have changed substantially. The introduction
of the increased weighting of prototypical items on each MCMI-II scale also alters the
relationship among the items within the scale and with other scales.
MCMI-III (Third Edition)
The third edition of the MCMI (Millon et al., 1994, 1997)
appeared in 1994 with four major
changes. First, 95 (95/175 = 54.3%) new items were introduced
to parallel the substantive
nature of the then forthcoming DSM-IV criteria (American
Psychiatric Association, 1994).
Second, two new scales were added: one personality style
(Depressive) and one clinical
syndrome scale (Posttraumatic Stress Disorder). Third, a small
set of items was added to
strengthen the Noteworthy responses in the areas of child abuse,
anorexia, and bulimia.
Finally, the weighting of items was reduced to only two levels
with the more prototypical
items for a specific scale adding two points to the raw score.
Research results from the MCMI-II should be generalized
to the MCMI-III cautiously
because over one-half of the items were
changed. The emphasis in
these new items also tended to be on DSM-IV criteria. It
appears that the emphasis in the
MCMI-III is on the DSM-IV criteria for personality
disorders, whereas the emphasis
in the MCMI-II was on Millon's theory.
ADMINISTRATION
The first issue in the administration of the MCMI-III is ensuring
that the individual is
invested in the process. Taking a few extra minutes to answer
any questions the individual
may have about why the MCMI-III is being administered and
how the results will be used
will pay excellent dividends. This issue may be even more
important with the MCMI-III
than with other self-report inventories because of the relatively
limited number of items
on each scale and the extensive item overlap that quickly
compounds the effect of the
individual distorting responses to even a few items. The
clinician should work diligently
to make the assessment process a collaborative activity with the
individual to obtain the
desired information. This issue of therapeutic assessment (Finn,
1996; Fischer, 1994) was
covered in more depth in Chapter 2 (pp. 43-44).
Reading level is a crucial factor in determining whether a
person can complete the
MCMI-III; inadequate reading ability is a major cause of
inconsistent patterns of item
endorsement. Millon et al. (1997) suggest that most clients who
have had at least 8 years
of formal education can take the MCMI-III with little or no
difficulty because the items
are written on an eighth-grade level or less. If there is some
concern about the person's
reading level, he or she can be asked to read a few items out
loud to obtain a quick estimate
of whether reading is a problem. In those individuals for whom
reading is difficult, the
MCMI-III can be presented by CD or audiocassette tape.
SCORING
Scoring the MCMI-III by hand is a complex process that
commonly results in scoring errors
(Millon et al., 1997, p. 112). If computer scoring is not
available, each MCMI-III should
be hand scored and profiled independently by two different
individuals and their scores
verified to catch such errors. If the MCMI-III is administered by
computer, the computer
automatically scores it. If the individual's responses to the items
have been placed on an
answer sheet, these responses can be entered into the computer
by the clinician for scoring
or they can be hand scored. If the clinician enters the item
responses into the computer for
scoring, they should be double entered to identify any data entry
errors.
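The double-entry check described above can be illustrated with a short Python sketch. The function name and the data layout (a dictionary from item number to response) are assumptions made for this example and do not describe any scoring software supplied with the test.

```python
def entry_mismatches(first_entry, second_entry):
    """Return the item numbers on which two independent entries disagree.

    Each entry maps item number -> True, False, or None (omitted).
    """
    all_items = sorted(set(first_entry) | set(second_entry))
    return [
        item for item in all_items
        if first_entry.get(item) != second_entry.get(item)
    ]


# Two hypothetical keyings of the same five-item stretch of an answer sheet.
first = {1: True, 2: False, 3: True, 4: None, 5: False}
second = {1: True, 2: True, 3: True, 4: None, 5: False}
print(entry_mismatches(first, second))  # [2]
```

Any item returned by such a check would be looked up on the original answer sheet before scoring proceeds. The same comparison applies when two individuals hand score a protocol independently.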
The first step in hand scoring is to examine the answer sheet
carefully and indicate
omitted items and double-marked items by drawing a line
through both the "true" and
"false" responses to these latter items in brightly colored ink.
Also, cleaning up the answer
sheet is helpful and facilitates scoring. Responses that were
changed need to be erased
completely if possible, or clearly marked with an "X" so that
the clinician is aware that this
response has not been endorsed by the client.
The next step is to determine whether any of the three Validity
(V) scale items (65, 110,
157) have been endorsed as being "True." If two or more of
these items have been endorsed
as being "True," scoring is unwarranted and should stop; it is
probably unwarranted even
if only one of them has been endorsed as "True."
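As a rough sketch, the Validity (V) scale rule can be written as a small Python check. The three item numbers come from the text. The function name, the data layout, and the wording of the returned labels are illustrative assumptions.

```python
VALIDITY_ITEMS = (65, 110, 157)  # item numbers given in the text


def validity_check(responses):
    """Count Validity (V) items endorsed "True" and suggest how to proceed.

    `responses` maps item number -> True/False (None or missing if omitted).
    The returned labels are illustrative wording, not terms from the manual.
    """
    endorsed = sum(1 for item in VALIDITY_ITEMS if responses.get(item) is True)
    if endorsed >= 2:
        return endorsed, "stop scoring"
    if endorsed == 1:
        return endorsed, "scoring probably unwarranted"
    return endorsed, "continue scoring"


print(validity_check({65: False, 110: True, 157: False}))
```

With one of the three items endorsed, as in this example, the check returns the intermediate label, which leaves the decision to the clinician.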
The number of omitted items, which is the total number of items
left unmarked or double
marked, is scored without a template. There is no standard place
on the profile form on
which the number of omitted items is reported, so the clinician
should state explicitly whether, and
how many, items have been omitted. All the
other scales except for Scale
X (Disclosure) are scored by placing a plastic template over the
answer sheet with a small
box drawn at the scored (deviant) response (either "true" or
"false") for each item on the
scale. The responses on the MCMI-III are weighted either "1"
or "2," with the responses
weighted "2" being prototypic for that scale. The sum of these
weighted responses equals
the client's raw score for that scale; this raw score is recorded
in the proper space on the
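The omitted-item count and the template scoring described above can be sketched together in Python. The five-item answer sheet and the template below are invented for illustration and are not the real MCMI-III key. The data layout, in which each item maps to the set of options marked, is likewise an assumption.

```python
def count_omitted(marks, n_items):
    """Count items left blank or double marked.

    `marks` maps item number -> set of marked options, e.g. {"T"}, {"F"},
    {"T", "F"} (double marked), or an empty set / missing key (omitted).
    """
    return sum(1 for item in range(1, n_items + 1)
               if len(marks.get(item, set())) != 1)


def scale_raw_score(marks, template):
    """Apply a scoring template: item -> (scored option, weight of 1 or 2)."""
    score = 0
    for item, (scored_option, weight) in template.items():
        if marks.get(item, set()) == {scored_option}:
            score += weight
    return score


# Hypothetical five-item sheet and template; not the real MCMI-III key.
marks = {1: {"T"}, 2: {"T", "F"}, 3: {"F"}, 5: {"T"}}
template = {1: ("T", 2), 3: ("F", 1), 5: ("F", 1)}
print(count_omitted(marks, n_items=5))   # items 2 and 4 -> 2
print(scale_raw_score(marks, template))  # 2 + 1 + 0 = 3
```

Double-marked items are treated here the same way as blank items, which matches the definition of omitted items given above.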
Critical Items (Noteworthy Responses)
Critical items on the MCMI-III are identified as Noteworthy
Responses (Millon et al.,
1997, Appendix E). These Noteworthy Responses are divided
into six categories: (1)
Health Preoccupations; (2) Interpersonal Alienation; (3)
Emotional Dyscontrol; (4) Self-
Destructive Potential; (5) Childhood Abuse; and (6) Eating
Disorders. The deviant response
to all these items is "True." These items are intended to alert the
clinician to specific items
that warrant close review. All the items except one within
Health Preoccupations are found
on Scale H (Somatoform). The Eating Disorder items are not
scored on any extant MCMI-
III scale and must be reviewed directly. Items 154 and 171
reflect suicide attempts and
suicidal ideation that need to be reviewed any time they are
endorsed or omitted.
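The review rule for the two suicide-related items can be expressed as a simple check. This is a minimal sketch: the item numbers are those given in the text, while the function name and data layout are assumptions made for the example.

```python
SUICIDE_ITEMS = (154, 171)  # item numbers given in the text


def suicide_items_to_review(responses):
    """Return the suicide-related items that are endorsed "True" or omitted.

    `responses` maps item number -> True/False; None or a missing key
    means the item was omitted.
    """
    return [item for item in SUICIDE_ITEMS
            if responses.get(item) is not False]


print(suicide_items_to_review({154: False}))  # item 171 is omitted -> [171]
```

Only an explicit "False" response removes an item from the review list, so an omission is flagged in the same way as an endorsement.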
APPLICATIONS
As a self-report inventory, the MCMI-III is used routinely in
clinical settings as well as
correctional and substance abuse settings. However, the MCMI-
III is not to be used "with
normal populations or for purposes other than establishing a
diagnostic screening and
clinical assessment. ... To administer the MCMI-III to a wider
range of problems or class
of subjects, such as those found in business or industry, or to
identify neurologic lesions, or
to use it for the assessment of general personality traits among
college students is to apply
the instrument to settings and samples for which it is neither
intended nor appropriate"
(Millon et al., 1997, p. 6). Choca (2004) has suggested that
there is nothing wrong with
giving the MCMI-III to normal individuals or other samples on
which the MCMI-III was
not standardized, as long as the clinician keeps in mind the
standardization group to which
the person is being compared.
The MCMI-III also is used in forensic settings, and several
authors have provided
guidelines for its use (McCann, 2002; Schutte, 2001). There
has been substantial debate
about whether the MCMI-III meets the federal standards for evidence
in legal settings, with
advocates pro (Craig, 2006; Dyer, 2005) and con (Lally, 2003;
Rogers, Salekin, & Sewell,
1999). Review of these issues is beyond the scope of this text.
The forensic psychologist
does need to be well informed about all these issues before
using the MCMI-III.
Somewhat different issues must be considered in the
administration of the MCMI-III in
forensic settings compared with the more usual clinical setting.
These issues were reviewed
in Chapter 6 on the MMPI-2 (pp. 197-198) and will not be
reiterated here. These issues need
to be considered carefully because the validity (modifier) scales
on the MCMI-III appear
to be relatively insensitive to response distortions (Morgan,
Schoenberg, Dorr, & Burke,
2002; Schoenberg, Dorr, & Morgan, 2003), although
Schoenberg, Dorr, and Morgan (2006)
developed a discriminant function that looked promising in
identifying college students who
were simulating psychopathology.
Millon et al. (1997) have stated that in child-custody settings
when "custody battles
reach the point of requiring psychological evaluation, they
constitute such a degree of
interpersonal difficulty that the evaluation becomes a clinical
matter" (p. 144). McCann,
Flens, and Campagna (2001) have reported normative data for
259 child-custody examinees.
The mean MCMI-III profile for these examinees showed an
elevation on Scale Y (Social
Desirability) and subclinical elevations on Scales 4 (Histrionic),
5 (Narcissistic), and 7
(Compulsive). Lampel (1999) reported elevations on the same
four MCMI-III scales in 50
divorcing couples. Halon (2001) has questioned whether
elevations on these four scales in
child-custody samples reflect personality difficulties or normal
personality characteristics.
PSYCHOMETRIC FOUNDATIONS
Demographic Variables
Age
There are minimal effects of age on any of the MCMI-III scales
(Haddy et al., 2005).
Raw scores tend to decrease slightly
past the age of 50 except on
Scales 4 (Histrionic), 5 (Narcissistic), and 7 (Compulsive). Raw
scores increased slightly
in individuals over 50 on these three scales. Dean and Choca
(2001) reported similar results
when male psychiatric patients were classified as younger (18 to
40) or older (60+). The
older patients had lower scores on all MCMI-III scales except
Scales 4 (Histrionic), 5
(Narcissistic), and 7 (Compulsive).
Gender
Gender does not create any general issues in MCMI-III
interpretation because separate base
rate (BR) scores are used for men and women. Any gender
differences in how individuals
responded to the items on each scale are removed when the raw
scores are converted to
BR scores. Lindsay, Sankis, and Widiger (2000) reported that
women were more likely to
endorse the items on Scale 4 (Histrionic).
Education
There is no research that has looked at the effects of education
on MCMI-III scales.
Ethnicity
About 15% of the development and cross-validation samples for the
MCMI-III were nonwhite.
Millon et al. (1997) reported that some differences were found
for the demographic variables
(unspecified), but these differences appear to reflect
known differences in prevalence
of the disorder. Some ethnic differences were noted on the
MCMI-I and MCMI-II, but
no published research has looked at the effects of ethnicity on
the MCMI-III. There have
been several dissertations that examined ethnic differences on
the MCMI-III. The absence
of published research on the MCMI-III is remarkable because
such research is so common with the
MMPI/MMPI-2. Until such research is published on the MCMI-III,
the MCMI-III should
be used cautiously with nonwhite individuals.
Reliability
The MCMI-III Manual (Millon et al., 1997, Table 3.3, p. 58)
reports the reliability data for
87 individuals who were retested after an interval of 5 to 14
days. The test-retest correlations
ranged from .82 to .96 across the scales with a median of .91,
which is very stable. Measures
of the internal consistency of each scale (Cronbach's Alpha)
also were quite good with only
six scales (5 [Narcissistic], .67; 6A [Antisocial], .77; 6B
[Sadistic/Aggressive], .79; 7
[Compulsive], .66; N [Bipolar: Manic], .71; PP [Delusional
Disorder], .79) below .80.

Table 8.7 Standard error of measurement for MCMI-III scales
in male psychiatric patients (a)

                                 Raw Scores             SEM in BR Units at Base Rate
Scale                            M      SD     SEM  Alpha*      60      75      85
Personality Styles
1 (Schizoid)                  9.83    5.52    4.47     .81    3.35    2.23    5.14
2A (Avoidant)                 8.94    6.64    5.91     .89    3.56    1.35    3.72
2B (Depressive)               9.58    6.77    6.02     .89    3.32    1.66    4.98
3 (Dependent)                 8.55    5.86    4.98     .85    4.01    2.81    5.02
4 (Histrionic)               11.80    5.47    4.43     .81      NA      NA      NA
5 (Narcissistic)             13.06    4.75    3.18     .67    6.28    5.34    4.71
6A (Antisocial)              10.78    6.02    4.64     .77    3.45    2.59    2.16
6B (Sadistic)                 9.67    6.06    4.79     .79    1.04    1.46    5.43
7 (Compulsive)               14.12    5.34    3.52     .66    3.69      NA      NA
8A (Negativistic)            10.39    6.51    5.41     .83    4.07    1.48    4.44
8B (Masochistic)              7.32    5.69    4.95     .87    1.62    1.01    5.86
Severe Personality Styles
S (Schizotypal)               8.01    6.65    5.66     .85    1.77    1.77    4.60
C (Borderline)               10.02    6.67    5.67     .85    2.64    3.17    3.53
P (Paranoid)                  8.96    6.55    5.50     .84    1.64    4.00    5.45
Clinical Syndromes
A (Anxiety)                   8.25    5.71    4.91     .86    5.09    2.65    2.65
H (Somatoform)                7.23    4.76    4.09     .86    1.95    7.33    7.33
N (Bipolar: Manic)            6.99    4.39    3.12     .71    2.57    4.81    6.41
D (Dysthymia)                 9.55    6.03    5.31     .88    3.39    1.32    5.65
B (Alcohol Dependence)        8.93    6.00    4.92     .82    3.86    2.03    3.46
T (Drug Dependence)           8.86    6.29    5.22     .83    1.92    5.56      NA
R (PTSD)                      8.92    6.47    5.76     .89    1.74    3.47      NA
Severe Clinical Syndromes
SS (Thought Disorder)         8.77    6.15    5.35     .87    1.50    4.68      NA
CC (Major Depression)         9.54    6.61    5.95     .90    1.34    4.20    5.04
PP (Delusional Disorder)      3.79    3.83    3.03     .79    2.64    5.61    7.26
Validity Scales (Modifier Scales)
X (Disclosure)              119.85   34.43      NA      NA
Y (Desirability)             11.92    4.74    4.07     .86    6.14    4.91      NA
Z (Debasement)               14.46    8.84    8.40     .95    1.55    1.79      NA

*N = 1,924.
(a) Haddy et al. (2005).

The standard error of measurement for all MCMI-III scales is
provided in Table 8.7 at BR
scores of 60, 75, and 85 for male psychiatric patients (Haddy et
al., 2005). (There were not
a sufficient number of women in this sample to compute
standard errors of measurement
for them. The standard errors of measurement for raw scores in
men and women were
generally similar, suggesting that the standard errors of
measurement for men could be used
for women, too.) The standard error of measurement was
calculated in raw score units for
each scale and then converted into BR units at these three
points. For example, the standard
error of measurement for Scale 1 (Schizoid) is 3.35, 2.23, and
5.14 at BR scores of 60, 75,
and 85, respectively. These values change because the
distribution is not uniform around
these numbers. When the SEM is about 3 BR points for one of
these scales, the individual's
true score will be within ±3 BR points of the obtained score about two-thirds of the time.
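The two-thirds statement can be made concrete with a short Python sketch that computes a band of plus or minus one SEM around an obtained BR score. The SEM values are those listed for Scale 1 (Schizoid) in Table 8.7. The function name is an assumption, and the two-thirds coverage rests on the usual assumption that measurement error is approximately normally distributed.

```python
def br_band(obtained_br, sem_br, n_sem=1.0):
    """Return (low, high) bounds of obtained BR score +/- n_sem * SEM.

    With n_sem=1 the band corresponds to roughly two-thirds coverage,
    assuming approximately normally distributed measurement error.
    """
    half_width = n_sem * sem_br
    return obtained_br - half_width, obtained_br + half_width


# Scale 1 (Schizoid) SEM values in BR units from Table 8.7.
SEM_SCALE_1 = {60: 3.35, 75: 2.23, 85: 5.14}

for br, sem in SEM_SCALE_1.items():
    low, high = br_band(br, sem)
    print(f"BR {br}: about two-thirds band = {low:.2f} to {high:.2f}")
```

For this scale the band is narrowest at a BR score of 75 and roughly twice as wide at 85.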
The standard error of measurement for BR scores around 75
tends to be small, which
means that BR scores above that cutting score are very likely to
remain elevated despite
any error of measurement. On the other hand, the standard error
of measurement for BR
scores around 85 tends to be about twice as large as at 75,
which means that BR scores
above cutting scores of 85 are more likely to change.
The maximum BR score on Scales 4 (Histrionic) and 7
(Compulsive) in men is 84 and
83, respectively. Thus, it is not possible for a man to have a BR
score of 85 or above on these scales,
and the standard error of measurement could not be calculated.
The maximum BR on these
same two scales in women is 92 and 91, respectively.
CONCLUDING COMMENTS
The MCMI-III is the self-report inventory most widely used to
assess personality disorders.
The MCMI-III should be considered any time the presence of a
personality disorder is
suspected in an individual; personality disorders are a frequently overlooked set of
diagnoses given the more
dramatic symptoms in most Axis I disorders. Computer scoring
is almost mandatory for
the MCMI-III given its complexity and time-consuming nature.
Clinicians must understand
the derivation and use of BR scores for the accurate
interpretation of the scale scores.
REFERENCES
American Psychiatric Association. (1980). Diagnostic and
statistical manual of mental disorders
(3rd ed.). Washington, DC: Author.
American Psychiatric Association. (1987). Diagnostic and
statistical manual of mental disorders
(3rd ed., rev.). Washington, DC: Author.
American Psychiatric Association. (1994). Diagnostic and
statistical manual of mental disorders (4th
ed.). Washington, DC: Author.
American Psychiatric Association. (2000). Diagnostic and
statistical manual of mental disorders (4th
ed., text rev.). Washington, DC: Author.
Butcher, J. N., Dahlstrom, W. G., Graham, J. R., Tellegen, A.
M., & Kaemmer, B. (1989). MMPI-2:
Manual for administration and scoring. Minneapolis: University
of Minnesota Press.
Charter, R. A., & Lopez, M. N. (2002). MCMI-III: The inability
of the validity conditions to detect
random responders. Journal of Clinical Psychology, 58,
1615-1617.
Choca, J. P. (2004). Interpretive guide to the Millon Clinical
Multiaxial Inventory (3rd ed.).
Washington, DC: American Psychological Association.
Craig, R. J. (Ed.). (2005). New directions in interpreting the
MCMI-III: Essays on current issues.
Hoboken, NJ: Wiley.
Craig, R. J. (2006). The MCMI-III. In R. P. Archer (Ed.),
Forensic uses of clinical assessment
instruments (pp. 121-145). Mahwah, NJ: Erlbaum.
Dean, K. J., & Choca, J. (2001, August). Psychological changes
of emotionally disturbed men with
age. Paper presented at the annual meeting of the American
Psychological Association, San
Francisco.
Dyer, F. J. (2005). Forensic applications of the MCMI-III in
light of recent controversies. In R.
J. Craig (Ed.), New directions in interpreting the MCMI-III (pp.
201-226). Hoboken, NJ:
Wiley.
Finn, S. (1996). Using the MMPI-2 as a therapeutic
intervention. Minneapolis: University of
Minnesota Press.
Fischer, C. T. (1994). Individualizing psychological assessment.
Hillsdale, NJ: Erlbaum.
Grossman, S. D., & del Rio, C. (2005). The MCMI-III facet
subscales. In R. J. Craig (Ed.), New
directions in interpreting the MCMI-III (pp. 3-31). Hoboken,
NJ: Wiley.
Haddy, C., Strack, S., & Choca, J. P. (2005). Linking
personality disorders and clinical syndromes
on the MCMI-III. Journal of Personality Assessment, 84,
193-204.
Halon, R. L. (2001). The MCMI-III: The normal quartet in child
custody cases. American Journal of
Forensic Psychology, 19, 57-75.
Hathaway, S. R., & McKinley, J. C. (1951). MMPI manual. New
York: Psychological Corporation.
Jankowski, D. (2002). A beginner's guide to the MCMI-III.
Washington, DC: American Psychological
Association.
Lally, S. J. (2003). What tests are acceptable for use in forensic
evaluations?: A survey of experts.
Professional Psychology: Research and Practice, 34, 491-498.
Lampel, A. K. (1999). Use of the MCMI-III in evaluating child
custody litigants. American Journal
of Forensic Psychology, 17, 19-31.
Lindsay, K. A., Sankis, L. M., & Widiger, T. A. (2000). Sex and
gender bias in self-report personality
disorder inventories. Journal of Personality Disorders, 14,
218-232.
Mandell, D. (1997). An investigation of the effects of item
omissions on the Millon Clinical
Multiaxial Inventory-II (MCMI-II). Unpublished doctoral dissertation,
Fairleigh Dickinson University,
Teaneck, NJ.
McCann, J. T. (2002). Guidelines for the forensic applications
of the MCMI-III. Journal of Forensic
Psychology Practice, 2, 55-70.
McCann, J. T., Flens, J. T., & Campagna, V. (2001). The
MCMI-III in child custody evaluations: A
normative study. Journal of Forensic Psychology Practice, 1,
27-44.
Millon, T. (1977). MCMI manual. Minneapolis, MN:
Interpretive Scoring Systems.
Millon, T. (1983). Modern psychopathology: A biosocial
approach to maladaptive learning and
functioning. Prospect Heights, IL: Waveland Press.
Millon, T. (1987). Manual for the MCMI-II (2nd ed.).
Minneapolis, MN: National Computer Systems.
Millon, T., & Davis, R. D. (1996). Disorders of personality:
DSM-IV and beyond (Rev. ed.). New
York: Wiley.
Millon, T., Davis, R., & Millon, C. (1994). MCMI-III manual.
Minneapolis, MN: National Computer
Systems.
Millon, T., Davis, R., & Millon, C. (1997). MCMI-III manual
(2nd ed.). Minneapolis, MN: National
Computer Systems.
Morgan, C. D., Schoenberg, M. R., Dorr, D., & Burke, M. J.
(2002). Overreport on the MCMI-III:
Concurrent validation with the MMPI-2 using a psychiatric
inpatient sample. Journal of
Personality Assessment, 78, 288-300.
Paulhus, D. L. (1984). Two-component models of socially
desirable responding. Journal of Personality
and Social Psychology, 46, 598-609.
Retzlaff, P. D. (1995). Tactical psychotherapy of the personality
disorders: An MCMI-III-based
approach. Needham Heights, MA: Allyn & Bacon.
Retzlaff, P. D., Ofman, P., Hyer, L., & Matheson, S. (1994).
MCMI-II high-point codes: Severe
personality disorder and clinical syndrome extensions. Journal
of Clinical Psychology, 50,
228-234.
Retzlaff, P. D., Stoner, J., & Kleinsasser, D. (2002). The use of
the MCMI-III in the screening and
triage of offenders. International Journal of Offender Therapy
and Comparative Criminology,
46, 319-332.
Rogers, R., Salekin, R. T., & Sewell, K. W. (1999). Validation
of the MCMI for Axis II disorders:
Does it meet the Daubert standard? Law and Human Behavior,
23, 425-443.
Schoenberg, M. R., Dorr, D., & Morgan, C. D. (2003). The
ability of the MCMI-III to detect
malingering. Psychological Assessment, 15, 198-204.
Schoenberg, M. R., Dorr, D., & Morgan, C. D. (2006).
Development of discriminant functions to
detect dissimulation for the MCMI-III. Journal of Forensic
Psychiatry and Psychology, 17,
405-416.
Schutte, J. W. (2001). Using the MCMI-III in forensic
evaluations. American Journal of Forensic
Psychology, 19, 5-20.
Strack, S. (2002). Essentials of Millon inventories assessment.
Hoboken, NJ: Wiley.
  • 1. Chapter 12 THEMATIC APPERCEPTION TEST Like the Rorschach Inkblot Method (RIM) discussed in the preceding chapter, the Thematic Apperception Test (TAT) is a performance-based measure of personality. This means that TAT data consist of how people respond to a task they are given to do, not what they may say about themselves. In further contrast to self-report measures, the TAT resembles the RIM in providing an indirect rather than a direct assessment of personality characteristics, which makes it particularly helpful in identifying characteristics that people do not fully recognize in themselves or are reluctant to disclose. The TAT is a storytelling technique in which examinees are shown pictures of people or scenes and asked to make up a story about them. The TAT differs from the RIM in three key respects. First, being real pictures rather than blots of ink, the TAT stimuli are more structured and less ambiguous than the Rorschach cards. Second, the TAT instructions are more open-ended and less structured than those used in administering the RIM. Rorschach examinees are questioned specifically about where they saw their percepts and what made them look as they did. On the TAT, as elaborated in the present chapter, people are asked
  • 2. only in general terms to expand on the stories they tell (e.g., "What is the person think- ing?"). Third, the TAT requires people to exercise their imagination, whereas the RIM is a measure of perception and association. Rorschach examinees who ask whether to use their imagination should be told, "No, this is not a test of imagination; just say what the blots look like and what you see in them." By contrast, TAT takers who say "I'm not sure what the people in the picture are doing," or "I don't know what the outcome will be," can be told, "This is a test of imagination; make something up." This distinction between the RIM and the TAT accounts for the TAT having been called an apperceptive test. As elaborated in Chapter 11, the RIM was originally designed as a test of perception that focused on what people see in the test stimuli, where they see it, and why it looks as it does. The TAT was intended to focus instead on how people interpret what they see and the meaning they attach to their interpretations, and the term "apperception" was chosen to designate this process. The development of the TAT is discussed further following the description of the test. NATURE OF THE THEMATIC APPERCEPTION TEST The Thematic Apperception Test (TAT) consists of 31 achromatic cards measuring 9¼ x 11 inches. Fourteen of the cards show a picture of a single person, 11 cards depict two or more people engaged in some kind of relationship, three are group pictures of three or four
  • 3. people, two portray nature scenes, and one is totally blank. The cards are numbered from 425 426 Performance-Based Measures 1 to 20, and nine of the cards are additionally designated by letters intended to indicate their appropriateness for boys (B) and girls (G) aged 4 to 14, males (M) and females (F) aged 15 or older, or some combination of these characteristics (as in 3BM, 6GF, 12BG, and l 3MF). Twenty cards are designated for each age and gender group. People are asked to tell a story about each of the cards they are shown. They are told that their stories should have a beginning, a middle, and an end and should include what is happening in the picture, what led up to this situation, what the people in the picture are thinking and feeling, and what the outcome of the situation will be. When people have finished telling their story about a picture, they are asked to add story elements they have omitted to mention (e.g., "How did this situation come about?" "What is on this person's mind?" "How is she feeling right now?" "What is likely to happen next?"). In common with a Rorschach administration, these TAT procedures generate structural, thematic, and behavioral data that provide a basis for drawing inferences about an individual's personality
  • 4. characteristics. Structural Data All TAT stories have a structural component that is defined by certain objective features of the test protocol. The length of the stories people tell can provide information about whether they are approaching this task-and perhaps other situations in their lives as well-in a relatively open and revealing fashion (long stories) or in a relatively guarded manner that conceals more than it reveals (short stories). Story length can also provide clues to a person's energy level, perhaps thereby identifying depressive lethargy in one person (short stories) and hypomanic expansiveness in another person (long stories), and clues to whether the individual is by nature a person of few or many words. Shifts in the length of stories from one card to the next, or in the reaction time before the storytelling begins, may identify positive or negative reactions to the typical themes suggested by the cards, which are described later in the chapter. The amount of detail in TAT stories provides another informative structural element of a test protocol. Aside from their length in words, TAT stories can vary in detail from a precisely specified account of who is doing what to whom and why (which might reflect obsessive-compulsive personality characteristics), to a vague and superficial description of people and events that suggests a shallow style of dealing with affective and interpersonal
experience. A related structural variable consists of the number and type of stimulus details that are noted in the stories. Most of the TAT pictures contain (a) some prominent elements that are almost always included in the stories people tell; (b) some minor figures or objects that are also included from time to time; and (c) many peripheral details that are rarely noted or mentioned. Card 3BM, for example, depicts a person sitting on the floor (almost always mentioned), a small object on the floor by the person's feet (frequently but not always mentioned), and a piece of furniture on which the person is leaning (seldom mentioned). Divergence from these common expectations can have implications for how people generally pay attention to their surroundings, particularly with respect to whether they tend to be inattentive to what is obvious and important, or whether instead they are likely to become preoccupied with what is obscure or of little relevance.
Also of potential interpretive significance is the extent to which TAT stories revolve around original themes or common themes. A preponderance of original themes may reflect creativity and openness on the part of the examinee, whereas consistently common themes often indicate conventionality or guardedness. As additional structural variables, the
coherence and rationality of stories can provide clues to whether people are thinking clearly and logically, and the quality of the vocabulary usage and grammatical construction in people's stories usually says something about their intellectual level and verbal facility.
Thematic Data
All TAT stories have a thematic as well as a structural component. Like the thematic imagery that often emerges in Rorschach responses, the content of TAT stories provides clues to a person's underlying needs, attitudes, conflicts, and concerns. Because they depict real scenes, the TAT cards provide more numerous and more direct opportunities than the Rorschach inkblots for examinees to attribute characteristics to human figures in various circumstances. Typical TAT stories are consequently rich with information about the depicted characters' aspirations, intentions, and expectations that will likely reveal aspects of how people feel about themselves, about other people, and about their future prospects. These kinds of information typically derive from four interpretively significant aspects of the imagery in TAT stories:
1. How the people in a story are identified and described (e.g., "young woman," "president of a bank," "good gymnast") and whether examinees appear to be identifying with these people or seeing them as representing certain other
people in their lives (e.g., parent, spouse).
2. How the people in a story are interacting; for example, whether they are helping or hurting each other in some way.
3. The emotional tone of the story, as indicated by the specific affect attributed to the depicted characters (e.g., happy, sad, angry, sorry, enthused, indifferent).
4. The plot of the story, with particular respect to outcomes involving success or failure, gratification or disappointment, love gained or lost, and the like.
Behavioral Data
As when they are responding to the RIM and other performance-based measures of personality, the way people behave and relate to the examiner during a TAT administration provides clues to how they typically approach task-oriented and interpersonal situations. Whether they appear self-assured or tentative, friendly or surly, assertive or deferential, and detached or engaged can characterize individuals while they are telling their TAT stories, and these test behaviors are likely to reflect general traits of a similar kind.
Unlike the situation in Rorschach assessment, the structural, thematic, and behavioral sources of data in TAT assessment are not potentially equivalent in their interpretive significance. As discussed in Chapter 11, either the structural, the thematic, or the behavioral features of Rorschach responses may turn out to be the most revealing and reliable source of information about an individual's personality functioning, and it cannot be determined in advance which one it will be. On the TAT, by contrast, the thematic imagery in the stories almost always provides more extensive and more useful data than the structural and behavioral features of a test protocol.
Moreover, because the TAT pictures portray real-life situations, and because test takers are encouraged to embellish their responses, TAT stories are likely to generate a greater number of specific hypotheses than the thematic imagery in Rorschach responses concerning an individual's underlying needs, attitudes, conflicts, and concerns. TAT stories tend to help identify particular persons and situations with whom various motives, intentions, and expectations are associated. With respect both to the inner life of people and the nature of their social relationships, then, the TAT frequently provides more information than the RIM.
HISTORY
As befits a storytelling technique, the TAT emerged as the
outcome of an interesting and in some respects unlikely story. Like the history of Rorschach assessment, the TAT story dates back to the first part of the twentieth century, but it is an American rather than a European story. The tale begins with Morton Prince, a Boston-born and Harvard-educated neurologist who lectured at Tufts Medical College and distinguished himself as a specialist in abnormal psychology. Along with accomplishments as a practitioner, teacher, and author of the original work on multiple personality disorder (Prince, 1906), Prince founded the Journal of Abnormal Psychology in 1906 and served for many years as its editor. By the mid-1920s, he had come to believe that a university setting would be more conducive to advances in psychopathology research than the traditional locus of such research in medical schools, where patient care responsibilities often take precedence over scholarly pursuits. In 1926, Prince offered an endowment to Harvard University to support an academic center for research in psychopathology. The university accepted his offer and established for this purpose the Harvard Psychological Clinic, with Prince as its first director.
On assuming the directorship of the Harvard Psychological Clinic, Prince looked to hire a research associate who would plan and implement the programs of the new facility. Acting on the recommendation of an acquaintance, but apparently without benefit of a search committee or consultation with the Harvard psychology faculty, he hired an ostensibly
unqualified person for the job: a surgically trained physician and PhD biochemist named Henry Murray. Two years later, Prince retired, and in 1928 Murray succeeded him as Director of the Clinic, a position for which, according to his biographer, Murray "was the first person to admit that he was unqualified ... though he had done a good bit of reading" (Robinson, 1992, p. 142).
Henry Murray was to become one of the best-known and most highly respected personality theorists in the history of psychology. He remains recognized today for his pioneering emphasis on individual differences rather than group tendencies, which, as noted in Chapter 1 (see pp. 12-13), became identified in technical terms as an idiographic approach to the study of persons (as distinguished from a nomothetic approach emphasizing characteristics that differentiate groups of people). The main thrust of what Murray called "personology" was attention to each person's unique integration of psychological characteristics, rather than to the general nature of these characteristics. For Murray, then, the study of personality consisted of exploring individual experience and the kinds of lives that people lead, rather than exploring the origins, development, and manifestations of specific personality characteristics like dependency, assertiveness, sociability, and
rigidity (see Barenbaum & Winter, 2003; Hall, Lindzey, & Campbell, 1998, chap. 5). Most of all, however, Murray is known for having originated the Thematic Apperception Test.
When Murray ascended to the directorship of the Harvard Psychological Clinic in 1928, there was little basis for anticipating his subsequent contributions to psychology. Born and reared in New York City, he had studied history as an undergraduate at Harvard, received his medical degree from Columbia in 1919, done a 2-year surgical internship, and then devoted himself to laboratory research that resulted in 21 published articles and a 1927 doctorate in biochemistry (Anderson, 1988, 1999; Stein & Gieser, 1999). As counterpoint to Murray's limited preparation for taking on his Harvard Clinic responsibilities, two personal events in the mid-1920s had attracted him to making this career change. One of these events was reading Melville's Moby Dick and becoming fascinated with the complexity of the characters in the story, particularly the underlying motivations that influenced them to act as they did.
The second event was meeting and beginning a lifelong friendship with Christiana Morgan, an artist who was enamored of the psychoanalytic conceptions of Carl Jung. Morgan encouraged Murray to visit Jung in Switzerland, which he did in 1925. He later stated that, in 2 days of conversation with Jung, "enough affective stuff erupted to invalid a pure scientist" (Murray, 1940, p. 153). These events and his
subsequent extensive reading in the psychological and psychoanalytic literature, combined with his background in patient care and laboratory research, made him far better prepared to head up the Harvard Psychological Clinic than his formal credentials would have suggested. He later furthered his own education by entering a training program in psychoanalysis, which he completed in 1935.
During his tenure as Director from 1928 to 1943, Murray staffed the Harvard Psychological Clinic with a highly talented group of young scholars and clinicians, many of whom went on to distinguished careers of their own. Under his direction, the clinic gained worldwide esteem for its theoretical and research contributions to the literature in personality and psychopathology. As his first major project, Murray orchestrated an intensive psychological study of 50 male Harvard students, each of whom was assessed individually with over 20 different procedures.
Included among these procedures was a picture-story measure in which Murray had become interested in the early 1930s. A conviction had formed in his mind that stories told by people can reveal many aspects of what they think and how they feel, and that carefully chosen pictures provide a useful stimulus for eliciting stories that are rich in personal meaning. In collaboration with Morgan, he experimented with different pictures and eventually selected 20 that seemed particularly likely to suggest a critical situation or at least one person with whom an examinee would identify. These 20 pictures
constituted the original version of the TAT, first described in print by C. D. Morgan and Murray (1935) as "a method for investigating fantasies."
The results of Murray's Harvard study were published in a classic book, Explorations in Personality, which is best known for presenting his idiographic approach to studying people and his model of personality functioning (Murray, 1938). In Murray's model, each individual's personality is an interactive function of "needs," which are the particular motivational forces emerging from within a person, and "presses," which are environmental forces and situations that affect how a person expresses these needs. Less well-known or recalled is that the 1938 book was subtitled A Clinical and Experimental Study of Fifty Men of College Age. After elaborating his personality theory in terms of 29 different needs and 20 different presses in the first half of the book, Murray devoted the second half to presenting the methods and results of the 50-man Harvard study. The discussion of this research included some historically significant case studies that illustrated for the first time how the TAT could be used in concert with other assessment methods to gain insight into the internal pressures and external forces that shape each individual's personality.
The original TAT used in the Harvard Clinic Study was followed by three later versions of the test, as C. D. Morgan and Murray continued to examine the stimulus potential of different kinds of paintings, photographs, and original drawings. The nature and origins of the pictures used in four versions of the test are reviewed by W. G. Morgan (1995, 2002, 2003). The final 31-card version of the test was published in 1943 (Murray, 1943/1971) and remains the version in use today.
Ever the curious scientist, Murray might have continued trying out new cards, according to Anderson (1999), had he not left Harvard for Washington, D.C., in 1943 to contribute to the World War II effort. Murray was asked to organize an assessment program in the Office of Strategic Services (OSS), the forerunner of the CIA, for selecting men and women who could function effectively as spies and saboteurs behind enemy lines. A fascinating account of how Murray and his colleagues went about this task and the effectiveness of the selection procedures they devised was published after the war by the OSS staff (Office of Strategic Services, 1948), and Handler (2001) has more recently prepared a summary of this account.
As Rorschach had done with his inkblots, Murray developed a scheme for coding stories told to the TAT pictures. Also in common with Rorschach's efforts, but for different
reasons, Murray's coding scheme opened the door for modification in the hands of others. Rorschach's system was still sketchy at the time of his death and left considerable room for additions and revisions by subsequent systematizers (see Chapter 11). By contrast, Murray (1943/1971) presented in his manual a detailed procedure for rating each of 28 needs and 24 presses on a 5-point scale for their intensity, duration, frequency, and importance whenever they occur in a story.
This complex scoring scheme proved too cumbersome to gain much acceptance among researchers and practitioners who took up the TAT after its 1943 publication made it widely available. Consequently, as elaborated by Murstein (1963), many other systems for interpreting the TAT emerged over the next 15 to 20 years; some of them followed Murray in emphasizing content themes, and others attended as well to structural and thematic features of stories. Several of these new systems were proposed by psychologists who had worked with Murray at the Harvard Psychological Clinic, notably Leopold Bellak (1947), William Henry (1956), Edwin Shneidman (1951), Morris Stein (1948), and Silvan Tomkins (1947). Shneidman (1965) later wrote that the TAT had quickly become "everybody's favorite adopted baby to change and raise as he wished" (p. 507).
Of these and other TAT systems that were devised in the 1940s and 1950s, only variations of an "inspection technique" proposed by Bellak became widely used. Currently in its sixth
edition, Bellak's text recommends an approach to TAT interpretation in which an individual's stories are examined for repetitive themes and recurring elements that appear to fall together in meaningful ways (Bellak & Abrams, 1997). This inspection technique is described further in the coding and interpretation sections of the present chapter.
Aside from proposing different systems for interpreting TAT stories, assessment psychologists have at times suggested four reasons for modifying the TAT picture set that Murray published in 1943. The first of these reasons concerns whether the standard TAT pictures are suitable for use with young children or the elderly. Young children may identify more easily with animals than with people, some said, and the situations portrayed in the standard picture set do not adequately capture the life experiences of older persons. In light of these possibilities, Bellak developed two alternative sets of pictures: the Children's Apperception Test (CAT), intended for use with children aged 3 to 10 and portraying animal rather than human characters, and the Senior Apperception Test (SAT), which depicts primarily elderly people in circumstances they are likely to encounter (Bellak, 1954, 1975; Bellak & Abrams, 1997). Little has been written about the utility of the SAT, however,
and the development of the CAT appears to have been unnecessary. Research reviewed by Teglasi (2001, chap. 8) has indicated that children tell equally or even more meaningful stories to human cards than they do to animal cards.
A second reason for questioning the appropriateness of the standard TAT set is that all the figures in them are Caucasian. Efforts to enhance multicultural sensitivity in picture-story assessment, particularly in the evaluation of children and adolescents, led to the development of the Tell-Me-A-Story test (TEMAS; Costantino, Malgady, & Rogler, 1988). The TEMAS is a TAT-type measure for use with young people aged 5 to 18 in which the stimulus cards portray conflict situations involving African American and Latino characters. Research with the TEMAS pictures has confirmed that they are likely to elicit fuller and more revealing stories from minority individuals than the all-Caucasian TAT pictures (Costantino & Malgady, 1999; Costantino, Malgady, Rogler, & Tsui, 1988), and there are also indications that the TEMAS has cross-cultural applicability in Europe as well as within the United States (see Dana, 2006).
As a third concern, there has been little standardization of which of the 20 TAT cards are administered and in what order to a person of a particular age and gender, which has made it difficult to assess the reliability and validity of the instrument. Considerations in card selection and the psychometric foundations of the TAT are discussed later in the chapter.
However, dissatisfaction with widespread variation in these aspects of TAT administration influenced the development of two new TAT-type measures. One of these newer measures, the Roberts Apperception Test for Children (RATC), was designed for use with young people aged 5 to 16 and portrays children and adolescents engaged in everyday interactions (McArthur & Roberts, 1990). There are 27 RATC cards, 11 of which are alternate versions for males or females, and each youngster taking the test is administered a standard set of 16 cards in a set sequence, using male or female versions as appropriate. A revision of the RATC, called the Roberts-2 (Roberts, 2006), extends the age range for the test to 18 and includes three parallel sets of cards for use with White, Black, and Hispanic children and adolescents. The second alternative standard set of cards, which also includes multiethnic pictures, is the Apperceptive Personality Test (APT; Holmstrom, Silber, & Karp, 1990; Karp, Holmstrom, & Silber, 1989). The APT consists of just eight stimulus pictures, each of which is always administered and in a fixed sequence.
Fourth and finally, some users of the TAT have found fault with the generally dark, gloomy, achromatic nature of the pictures and with the old-fashioned appearance of the people and scenes portrayed in them. It may be that these features of the cards make it difficult for people to identify with the figures in them or to tell lively stories about them. The TEMAS, by contrast, features brightly colored pictures and
contemporary situations. Colored photographs have also been used to develop an alternative picture set for use with adults, called the Picture Projective Test (PPT), and some research has suggested that the relatively bright PPT cards may generate more active and more emotionally toned stories than the relatively dark TAT cards (Ritzler, Sharkey, & Chudy, 1980; Sharkey & Ritzler, 1985).
As alternative picture sets for use with young and elderly individuals, the CAT, SAT, TEMAS, and RATC have enjoyed some popularity in applied practice. Each of these measures also remains visible as the focus of occasional research studies published in the literature. However, none of them appears to have detracted very much from clinical applications and research studies of the original 1943 version of the TAT. With respect to alternative picture sets for adults, neither the APT, the PPT, nor any other proposed revision in the TAT picture set has attracted much attention from practitioners or researchers, despite their apparent virtues with respect to standardization and stimulus enhancement.
ADMINISTRATION
As spelled out in his 1943 Manual, Murray intended that
persons taking the TAT would be asked to tell stories to all 20 of the pictures appropriate to their age (child/adult) and gender (male/female). The 20 pictures were to be shown in two 50-minute sessions, with a 1-day interval between sessions, and people would be instructed to devote about 5 minutes to each story. In actual practice over the years, TAT examiners have typically administered 8 to 12 selected cards in a single session. Most commonly, cards are selected on the basis of whether they are expected to elicit stories that are rich in meaning and relevant to specific concerns of the person being assessed.
With respect to eliciting interpretively rich stories, the most productive cards are usually those that portray a person in thought or depict emotional states or interpersonal relationships. The selection of cards specifically relevant in the individual case involves matching the content themes commonly pulled by the various cards with what is known or suspected about a person's central issues, such as aggressive or depressive concerns, problematic family relationships, or heterosexual or homosexual anxieties. In selecting which cards to use, then, examiners need to consider the content themes typically associated with each of them. A description of the TAT cards and the story lines they usually pull follows in the interpretation section of the chapter.
With respect to common practice in card selection, Teglasi (2001, p. 38) has reported a consensus among TAT clinicians that the most useful TAT cards are 1, 2, 3BM,
6BM, 7GF, 8BM, 9GF, 10, and 13MF. According to Teglasi's report, each of these 9 cards appears to work equally well across ages and genders, despite their male, female, boy, or girl designation. Bellak (1999) recommends using a standard 10-card sequence consisting of these 9 cards plus Card 4, with the possible addition of other cards that pull for particular themes. In the individual case, then, the selected set should comprise all or most of these 9- or 10-card sets, with replacement or additional cards chosen on the basis of specific issues that are evaluated.
Two research findings relevant to TAT card selection should also be noted. In an analysis by Keiser and Prather (1990) of 26 TAT studies, the 10 cards used most frequently were 1, 2, 3BM, 4, 6BM, 7BM, 8BM, 10, 13MF, and 16. In the other study, Avila-Espada (2000) used several variables, including the number of themes in the stories each card elicited, to calculate a stimulus value for each of them. On this basis, he chose two 12-card sets that he considered equivalent in stimulus value to the full 20-card TAT set: one set for males (1, 2, 3BM, 4, 6BM, 7BM, 8BM, 10, 13MF, 14, 15, and 18BM) and one set for females (1, 2, 3GF, 4, 6GF, 7GF, 8GF, 9GF, 10, 13MF, 17GF, and 18GF).
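The overlap among these recommended card sets can be checked with a short script. The following Python sketch is illustrative only and is not part of any published TAT procedure. The card lists are transcribed from the recommendations cited above, whereas the dictionary labels and function names are hypothetical and were introduced solely for this example. Bellak's sequence is entered as the nine consensus cards plus Card 4, with no claim made about the order in which the cards are presented.

```python
# Card lists transcribed from the recommendations summarized in the text.
# The dictionary labels are hypothetical names chosen for this sketch.
CARD_SETS = {
    "teglasi_consensus": ["1", "2", "3BM", "6BM", "7GF", "8BM", "9GF", "10", "13MF"],
    "keiser_prather_top10": ["1", "2", "3BM", "4", "6BM", "7BM", "8BM", "10", "13MF", "16"],
    "avila_espada_males": ["1", "2", "3BM", "4", "6BM", "7BM", "8BM", "10", "13MF",
                           "14", "15", "18BM"],
    "avila_espada_females": ["1", "2", "3GF", "4", "6GF", "7GF", "8GF", "9GF", "10",
                             "13MF", "17GF", "18GF"],
}
# Bellak's sequence: the nine consensus cards plus Card 4.
# Appending Card 4 at the end is an assumption about storage, not presentation order.
CARD_SETS["bellak_sequence"] = CARD_SETS["teglasi_consensus"] + ["4"]


def shared_cards(*names):
    """Return the set of cards that appear in every named card set."""
    return set.intersection(*(set(CARD_SETS[name]) for name in names))


def card_number(card):
    """Numeric part of a designation such as '13MF' (used only for sorting)."""
    return int("".join(ch for ch in card if ch.isdigit()))


# Cards common to all five recommendations, listed in numerical order.
core = sorted(shared_cards(*CARD_SETS), key=card_number)
print(core)
```

Run as written, the sketch reports that only Cards 1, 2, 10, and 13MF appear in all five lists, which is consistent with Teglasi's observation that some cards work well regardless of their gender designation. Calling shared_cards with any two labels gives the pairwise overlap.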
Turning now to the actual administration of the test, many of the general considerations discussed in Chapter 11 with respect to administering the RIM apply to the TAT as well. Test takers should have had an opportunity to discuss with the examiner (a) the purpose of their being tested (e.g., "The reason for this examination is to help in planning what kind of treatment would be best for you"); (b) the types of information the test will provide (e.g., "This is a measure of personality functioning that will give us a clearer understanding of what you're like as an individual, the kinds of concerns you have, and what might be helpful to you at this point"); and (c) how the results will be used (e.g., "When the test results are ready, I will be reviewing them with you in a feedback session and then sending a written report to your therapist").
In preparation for giving the formal TAT instructions, the cards that have been selected should be piled face down on the table or desk, with Card 1 on the top and the rest of the selected cards beneath it in the order in which they are to be presented. To minimize inadvertent influence of the examiner's facial expressions or bodily movements, it is advisable for the examiner to sit beside or at an angle from the person taking the test, rather than directly in front of the person. Once the test begins, whatever the examinee says should be recorded verbatim. Examiners can word-process the protocol with a computer instead of writing it longhand, should they prefer to do so, and a person's stories can also be tape-recorded and
transcribed later on. There is no evidence to indicate that the examiner's writing out the record, using a computer, or tape-recording the protocol makes any difference in the stories that are obtained.
Examiners should begin the TAT administration by informing people of the nature of their task. The following instructions, based on Murray's (1943/1971) original procedures and modifications suggested by Bellak and Abrams (1997), will serve this purpose well with adolescents and adults of at least average intelligence:
I am going to show you some pictures, one at a time, and your task will be to make up as dramatic a story as you can for each. Tell what has led up to the event shown in the picture, describe what is happening at the moment, what the characters are feeling and thinking, and then give the outcome. Speak your thoughts as they come to your mind. Do you understand?
When the TAT is being administered to adolescents and adults of limited intelligence, to children, or to seriously disturbed persons, the following simplified version of the instructions is recommended:
This is a storytelling test. I have some pictures here that I am going to show you, and for each picture I want you to make up a story. Tell what has happened before and what is happening now. Say what the people are feeling and thinking and how it will come out. You can make up any kind of story you please. Do you understand?
Following whichever set of instructions is given, the examiner should say, "Here is the first picture" and then hand Card 1 to the examinee. Each of the subsequent cards can be presented by saying, "Here is the next one" or merely handing it to the person without further comment. The story told to each picture should be recorded silently, without interruption, until the person has finished with it. Immediately following the completion of each story, the examiner should inquire about any of the requested story elements that are missing. Depending on the content of the story, this inquiry could include questions about what is happening, what led up to this situation, what the people are thinking and feeling, or what the outcome will be.
If a story as first told is missing most of these elements, a gentle reminder of the test instructions and a request to tell the story again may be preferable to asking each of the individual questions concerning what has been omitted. If only some of the requested story elements are missing and individual inquiries about them are answered with "Don't know" or "Can't say," examinees, as previously indicated, should be encouraged to "Use your imagination and make something up." Should this encouragement fail to generate
any further elaboration of the story element being inquired about, the examiner should desist without pressing the person further. Putting excessive pressure on test takers rarely generates sufficient additional information to justify the distress it may cause them, and doing so can also generate negative attitudes that limit cooperation with the testing procedures that follow. On the contrary, because adequately informative TAT protocols are so dependent on individuals being willing to fantasize and share the products of their fantasy, it can be helpful to encourage them with occasional praise (e.g., "That's an interesting story"). As Murray (1943/1971, p. 4) said about a little praise from time to time, "There is no better stimulant to the imagination."
The examiner's inquiry questions should be limited to requests for information about missing story elements and should not include any other kinds of discussion or questions. For example, direct questions about the characters' motives (e.g., "Why are they doing this?") should be avoided. Motivations that emerge in response to such leading questions lack the interpretive significance of motivations that people report spontaneously, and leading questions that go beyond the basic instructions may encourage examinees to report motivations and other kinds of information on subsequent cards when they would not otherwise have done so. Similarly, people should not be asked to talk about any person or object in a picture that they omitted from their story. This kind of question can influence the
thoroughness with which individuals attend to subsequent pictures and thereby dilute the potential information value of total or selective attention to certain parts of certain pictures.
Certain kinds of responses may at times call for the examiner to interrupt an examinee during the spontaneous phase of a TAT administration. Should the person be telling a rambling, extremely detailed story that contains all the requisite story elements but seems endless, the examiner should break in with something on the order of, "That's fine; I think I have the gist of that story; let's go on to the next picture." If a rambling and detailed story covers all the requisite elements except an outcome, the interruption can be modified to, "That's fine; just tell me how the story ends, and we'll go on to the next one." Long stories rarely provide more information than a briefer version that covers all the required story elements, and endless stories are seldom worth the time and energy they consume in a testing session.
A second kind of response that calls for interruption is a drawn-out description of what a person sees in the picture with little or no attention to developing a story line with a plot. In this circumstance, the appropriate intervention is to remind the individual of the instructions: "That's fine so far, but let me remind you that what we need for this test is
for you to tell a story about each picture, with a beginning and an end, and to say what the people are thinking and feeling."
A third problematic circumstance arises when people say that they can think of two or three different possibilities in a picture and set out to relate more than one story. Once more, to minimize any dilution of the interpretive significance of the data, examinees should not be allowed to tell alternative stories. If they indicate that such is their intent, they should be interrupted with words to this effect: "For each of these pictures I want you to tell just one story; if you have more than one idea about a picture, choose the one that you think is the best story for it."
Finally, the nature of the test makes it suitable for group as well as individual assessments. In group administration, the selected cards are shown on a screen, the instructions are given in written form as well as orally by the person conducting the administration, and people are asked to write out their stories for each picture. Although group administration sacrifices the opportunity for examiners to inquire about missing story elements, this shortcoming
can be circumvented in large part by mentioning the story requirements in the instructions. Based on recommendations by Atkinson (1958, Appendix III), who was a leading figure in developing procedures for large sample research with the TAT, the following written instructions can be used for group administration:
You are going to see a series of pictures, and your task is to tell a story that is suggested to you by each picture. Try to imagine what is going on in each picture. Then tell what the situation is, what led up to the situation, what the people are thinking and feeling, and what they will do. In other words, write a complete story, with a plot and characters. You will have four minutes to write your story about each picture, and you will be told when it is time to finish your story and get ready for the next picture. There are no right or wrong stories or kinds of stories, so you may feel free to write whatever story is suggested to you when you look at a picture.
Together with these written instructions, group test takers can be given a sheet of paper for each picture they will be shown, with the following four sets of questions printed on each sheet and followed by space for writing in an answer:
1. What is happening? Who are the persons?
2. What led up to this situation? What has happened in the past?
3. What is being thought and felt? What do the persons want?
4. What will happen? What will be done?
CODING
As noted, the cumbersome detail of the TAT coding scheme originally proposed by Murray (1943/1971) discouraged its widespread adoption in either clinical practice or research. The only comprehensive procedure for coding TAT stories that has enjoyed even mild popularity is an "Analysis Sheet" developed by Bellak (Bellak & Abrams, 1997, chap. 4) for use with his inspection method. Bellak's Analysis Sheet calls for examiners to describe briefly several features of each story, including its main theme, the needs and intentions of its characters, the kinds of affects that are being experienced, and the nature of any conflicts
Turning to their thinking, people whose cognitive integrity is intact typically produce coherent TAT stories that are easy to follow and exemplify logical reasoning. Disjointed stories that do not flow smoothly, and confusing stories that lack a sensible sequence, give reason for concern that a person's thought processes may be similarly scattered and incoherent. Narratives characterized by strained and
circumstantial reasoning also raise questions about the clarity of an individual's thought processes. Illogical reasoning consists of drawing definite conclusions on the basis of minimal or irrelevant evidence and expressing these conclusions with absolute certainty when alternative inferences would be equally or more likely. The following examples illustrate what people who are thinking illogically might say in telling their TAT stories.
To Card 9BM (men lying on the ground): "These men are probably a barbershop quartet, because there are four of them, and the little guy would be the tenor, because he's the smallest" [being four in number is a highly circumstantial and far from compelling basis for inferring that the men are a vocal group, and there is no necessary or exclusive relationship between small stature and tenor voice].
To Card 12M (young man lying on couch with older man leaning over him): "The boy has a tie on, which means that he's a college student" [this is possible, but far from being a necessary meaning; perhaps the young man is wearing a tie because his mother made him wear it, or because he is going to get his picture taken today].
To Card 13MF (man standing in front of a woman lying in bed): "I think she must be dead, because she's lying down" [seeing the woman in this picture as dead is not unusual, but inferring certain death from lying down overlooks the
possibility that she might be sleeping or resting].
APPLICATIONS
In common with the other assessment measures presented in this Handbook, the TAT derives its applications from the information it provides about an individual's personality characteristics. The TAT was described in the introduction to this chapter as a performance-based measure that, like the RIM, generates structural, thematic, and behavioral sources of data. As also noted, however, these data sources are not of potentially equivalent significance in TAT interpretation as they are in Rorschach interpretation. Instead, the TAT, with few exceptions, is most useful by virtue of what can be learned from the thematic imagery about a person's inner life.
Because the TAT functions best as a measure of underlying needs, attitudes, conflicts, and concerns, its primary application is in clinical work, mostly in planning psychotherapy and monitoring treatment progress. TAT findings may at times provide some secondary assistance in differential diagnosis, as illustrated in some of the examples presented in discussing story interpretation. Nevertheless, TAT stories are more helpful in understanding the possible sources and implications of adjustment difficulties than in distinguishing among categories of psychological disorder. For this reason, forensic and organizational
applications of TAT assessment have also been limited, although attention is paid in the discussion that follows to the general acceptance of the TAT in the professional community and its potential utility in personnel selection. Other aspects of TAT assessment that enhance its utility are its suitability for group administration, its value in cross-cultural research, and its resistance to impression management.
Treatment Planning and Monitoring
The interpretive implications of TAT stories often prove helpful in planning, conducting, and evaluating the impact of psychological treatment. Especially in evaluating people who are seeking mental health care but are unable to recognize or disinclined to reveal very much about themselves, TAT findings typically go well beyond interview data in illuminating issues that should be addressed in psychotherapy. Inferences based on TAT stories are particularly likely to assist in answering the following four central questions in treatment planning:
1. What types of conflicts need to be resolved and what concerns need to be eased for the person to feel better and function more effectively?
2. What sorts of underlying attitudes does the person have
toward key figures in his or her life, toward certain kinds of people in general, and toward interpersonal relatedness?
3. What situations or events are likely to be distressing or gratifying to the person, and how does this person tend to cope with distress and respond to gratification?
4. Which of these unresolved conflicts, underlying attitudes, or distressing experiences appears to be a root cause of the emotional or adjustment problems that brought the person into treatment?
By providing such information, TAT findings can help therapists plan their treatment strategies, anticipate obstacles to progress, and identify adroit interventions. Having such knowledge in advance about elements of a person's inner life gives therapists a head start in conducting psychotherapy. This advantage can be especially valuable in short-term or emergency therapy, when the time spent obtaining an in-depth personality assessment is more than compensated by the time saved with early identification of the issues and concerns that need attention.
Three research studies with the SCORS and DMM scales have demonstrated the potential utility of TAT stories for anticipating the course of psychotherapy and monitoring treatment progress. In one of these studies, S. J. Ackerman, Hilsenroth, Clemence, Weatherill, and Fowler (2000) found significant relationships
between the pretherapy SCORS levels for affective quality of representations and emotional investment in relationships and the continuation in treatment of 63 patients with a personality disorder, as measured by the number of sessions they attended. Also working with the SCORS, Fowler et al. (2004) followed 77 seriously disturbed patients receiving intensive psychotherapy in a residential setting who were administered the TAT prior to beginning treatment and a second time approximately 16 months later. Behavioral ratings indicated substantial improvement in the condition of these patients, and four of the SCORS scales showed corresponding significant changes for the better (Complexity of Representations, Understanding Social Causality, Self-esteem, and Identity and Coherence of the Self).
Cramer and Blatt (1990) were similarly successful in demonstrating the utility of the DMM in monitoring treatment change. In the Cramer and Blatt study, 90 seriously disturbed adults in residential treatment were tested on admission and retested after an average of 15 months of therapy. Reduction of psychiatric symptoms in these patients was accompanied by significant decline in total use of defenses, as measured with the DMM.
Diagnostic Evaluations
Contemporary practice in differential diagnosis distinguishes among categories or dimensions of disorder primarily on the basis of a person's manifest symptomatology or behavior, rather than the person's underlying attitudes and concerns (see American Psychiatric Association, 2000). For this reason, what the TAT does best, which is to generate hypotheses about a person's inner life, rarely plays a prominent role in clinical diagnostic evaluations. Nevertheless, certain thematic, structural, and behavioral features of a TAT protocol may be consistent with and reinforce diagnostic impressions based on other sources of information. Examples of this diagnostic relevance include suspicion-laden story plots that suggest paranoia, disjointed narratives that indicate disordered thinking, and a slow rate of speech that points to depressive lethargy.²
² Note should be taken of the recent publication of the Psychodynamic Diagnostic Manual (PDM Task Force, 2006), which is intended to supplement the Diagnostic and Statistical Manual (DSM; American Psychiatric Association, 2000) as a guideline for differential diagnosis. The diagnostic framework formulated in the PDM encourages attention to each person's profile of mental functioning, which includes "patterns of relating, comprehending, and expressing feelings, coping with stress and anxiety, observing one's own emotions and behaviors, and forming moral judgments" (p. 2). Should such considerations come to play a more formal part in differential diagnosis than has traditionally been the case, TAT findings may become increasingly relevant in determining diagnostic classifications.
In addition, research with the SCORS and DMM scales has demonstrated that objectified TAT findings can identify personality differences among persons with different types of problem. Patients with borderline personality disorder differ significantly on some SCORS variables from patients with major depressive disorder (Westen et al., 1990), and SCORS variables have been found to distinguish among patients with borderline, narcissistic, and antisocial personality disorders (S. J. Ackerman et al., 1999). Young people who have been physically or sexually abused display quite different interpersonal attitudes and expectations on the SCORS scales from the attitudes and expectations of children and adolescents who have not experienced abuse (Freedenfeld, Ornduff, & Kelsey, 1995; Kelly, 1999; Ornduff, Freedenfeld, Kelsey, & Critelli, 1994; Ornduff & Kelsey, 1996).
Sandstrom and Cramer (2003) found that elementary schoolchildren whose DMM scores indicate use of identification are better adjusted psychologically, as measured by parent and self-report questionnaires, than children who rely on denial. In particular, the children in this study who showed identification reported less social anxiety and depression than those who showed denial, were less often described by their parents as having behavior problems, and were more likely to perceive themselves as socially and academically competent. Adolescents with conduct disorder show less mature defenses on the DMM than adolescents with adjustment disorder, with the conduct disorder group being more likely to use denial than the adjustment disorder group, and less likely to use identification (Cramer & Kelly, 2004). Frequency of resorting to violence for resolution of interpersonal conflicts, as self-reported by a sample of college student men, has shown a significant negative correlation with DMM use of identification and a significant positive correlation with use of projection (Porcerelli, Cogan, Kamoo, & Letman, 2004).
These and similar TAT findings can help clinicians understand psychological disturbances and appreciate the needs and concerns of people with adjustment problems. However, these findings do not warrant using the SCORS, the DMM, or any other TAT scale as a sole or primary basis for diagnosing personality disorders or identifying victims of abuse. Differential diagnosis should always be an integrative process drawing on information from diverse sources, and for reasons already mentioned, the information gleaned from TAT stories usually plays a minor role in this process. Moreover, neither the TAT nor any other performance-based measure of personality provides sufficient basis for inferring whether a person has been abused or had any other particular type of past experience. The following caution in this regard should always be kept in mind:
"Psychological assessment data are considerably more dependable for describing what people are like than for predicting how they are likely to behave or postdicting what they are likely to have done or experienced" (Weiner, 2003, p. 335).
Forensic and Organizational Applications
Like the imagery in Rorschach responses, stories told to TAT pictures are better suited for generating hypotheses to be pursued than for establishing the reasonable certainties expected in the courtroom. On occasion, thematic preoccupations may carry some weight in documenting a state of mind relevant to a legal question, as in a personal injury case in which the TAT stories of a plaintiff seeking damages because of a claimed posttraumatic stress disorder reflect pervasive fears of being harmed or damaged. By and large, however, the psycholegal issues contended in the courtroom seldom hinge on suppositions about a litigant's or defendant's inner life. In terms of the criteria for admissibility into evidence discussed in the previous chapter, then, TAT testimony has limited likelihood of being helpful to judges and juries. As discussed in the final section of this chapter, moreover, TAT interpretation does not rest on a solid scientific basis, except for conclusions based on objectified scales for measuring specific personality characteristics.
Nevertheless, forensic psychologists report using the TAT in their practice, and TAT
assessment easily meets the general acceptance criterion for admissibility into evidence. Among forensic psychologists responding to surveys, over one-third report using the TAT or CAT in evaluations of children involved in custody disputes, and 24% to 29% in evaluating adults in these cases, with smaller numbers using the TAT in evaluations of personal injury (9%), criminal responsibility (8%), and competency to stand trial (5%; M. J. Ackerman & Ackerman, 1997; Boccaccini & Brodsky, 1999; Borum & Grisso, 1995; Quinnell & Bow, 2001). In a more recent survey of forensic psychologists by Archer, Buffington-Vollum, Stredny, and Handel (2006), 29% reported using the TAT for various purposes in their case evaluations.
In clinical settings, the TAT has consistently been among the four or five most frequently used tests, and it has been the third most frequently used personality assessment method, following the MMPI and RIM with adults and the RIM and sentence completion tests with adolescents (Archer & Newsom, 2000; Camara, Nathan, & Puente, 2000; Hogan, 2005; Moretti & Rossini, 2004). A majority (62%) of internship training directors report a preference for their incoming trainees to have had prior TAT coursework or at least a good working knowledge of the instrument (Clemence & Handler, 2001). Over the years, the TAT has been surpassed only
by the MMPI and the RIM in the volume of published personality assessment research it has generated (Butcher & Rouse, 1996). As judged from its widespread use, its endorsement as a method that clinicians should learn, and the extensive body of literature devoted to it, TAT assessment appears clearly to have achieved general acceptance in the professional community.
With respect to potential applications of the TAT in personnel selection, two meta-analytic studies have identified substantial relationships between McClelland's n-Ach scale and achievement-related outcomes. In one of these meta-analyses, Spangler (1992) found a statistically significant average effect size for n-Ach in predicting such outcomes as income earned, occupational success, sales success, job performance, and participation in and leadership of community organizations. This TAT measure of achievement motivation showed higher correlations with outcome criteria in these studies than self-report questionnaire measures of motivation to achieve. In the other meta-analysis, Collins, Hanges, and Locke (2004) examined 41 studies of need for achievement among persons described as entrepreneurs. Entrepreneurship in these studies consisted of being a manager responsible for making decisions in the business world or a founder of a business with responsibility for undertaking a new venture. The n-Ach scale in these studies was significantly correlated with
choosing an entrepreneurial career and performing well in it, and Collins et al. concluded, "Achievement motivation may be particularly potent at differentiating between successful and unsuccessful groups of entrepreneurs" (p. 111). Hence there is reason to expect that TAT assessment may be helpful in identifying individuals who are likely to be adept at recognizing and exploiting entrepreneurial opportunities in the marketplace.
Group Administration, Cross-Cultural Relevance, and Resistance to Impression Management
As mentioned, three other aspects of the TAT are likely to enhance its applications for various purposes. First, the suitability of the TAT for group administration facilitates large-scale data collection for research purposes and creates possibilities for using the instrument as a screening device in applied settings.
Second, since early in its history, the TAT has been used as a clinical and research instrument in many different countries and has proved particularly valuable in studying cultural change and cross-cultural differences in personality characteristics. Contributions by Dana (1999) and Ephraim (2000) provide overviews of these international applications of the TAT, and the particular sensitivity of TAT stories to cultural influences is elaborated by Ritzler (2004) and by Hofer and Chasiotis (2004).
Third, as a performance-based measure, the TAT is somewhat resistant to impression management. People who choose to conceal their inner life by telling brief and unelaborated stories can easily defeat the purpose of the examination. In so doing, however, they make it obvious that they are delivering a guarded protocol that reveals very little about them, other than the fact of their concealment. For examinees who are being reasonably open and cooperative, the ambiguity of the task and their limited awareness of what their stories might signify make it difficult for them to convey any intentionally misleading impression of their attitudes and concerns.
Nevertheless, telling stories is a more reality-based enterprise than saying what inkblots might be, and for this reason, the TAT is probably not as resistant as the RIM to impression management. Moreover, research reported in the 1960s and 1970s showed that college students could modify the TAT stories they told after being instructed to respond in certain ways (e.g., as an aggressive and hostile person). Schretlen (1997) has concluded from these early studies that they "clearly demonstrate the fakability of the TAT" (p. 281).
To take issue with Schretlen's conclusion, however, the ability of volunteer research participants to shape their TAT stories according to certain instructions may have little bearing on whether people being examined for clinical purposes can successfully manage the impression they give on this measure. Moreover, it is reasonable to hypothesize that experienced examiners, working with the benefit of case history information and data from other tests as well, would have little difficulty identifying in TAT stories the inconsistencies and exaggerations that assist in detecting malingering. However, the sensitivity of clinicians to attempted impression management in real-world TAT assessment has not yet been put to adequate empirical test.
PSYCHOMETRIC FOUNDATIONS
The nature of the TAT and the ways in which it has most commonly been used have made it difficult to determine its psychometric properties. Aside from a widely used and fairly standard set of instructions based on Murray's original guidelines for administration, research and practice with the TAT has been largely unsystematic. Certain sets of cards have been recommended by various authorities on the test, but there has been little consistency with respect to which cards are used and in what sequence they are shown (Keiser & Prather, 1990). Moreover, the primarily qualitative approach that typifies TAT interpretation in clinical practice does not yield the quantitative data that facilitate estimating the reliability
of an assessment instrument, determining its validity for various purposes, and developing numerical reference norms.
This lack of systematization and the resultant shortfall in traditional psychometric verification have fueled a long history of controversy between critics who have questioned the propriety of using the TAT in clinical practice and proponents who have endorsed the value of the instrument and refuted criticisms of its use. Commentaries by Conklin and Westen (2001), Cramer (1999), Garb (1998), Hibbard (2003), Karon (2000), and Lilienfeld, Wood, and Garb (2000) provide contemporary summaries of these opposing views. Without rehashing this debate, and with the psychometric shortcomings of traditional TAT assessment having already been noted, the following discussion calls attention to four considerations bearing on how and why this instrument can be used effectively for certain purposes.
First, criticisms of the validity of the TAT have frequently been based on low correlations between impressions gleaned from TAT stories and either clinical diagnosis or self-report data. However, correlations with clinical diagnoses and self-report measures are conceptually irrelevant to the validity of the TAT for its intended purposes, and criticisms based on such correlations accordingly lack solid basis. The TAT was designed to explore the personal
experience and underlying motives of people, not to facilitate a differential diagnosis based primarily on manifest symptomatology (which is the basis of psychiatric classification in the Diagnostic and Statistical Manual [DSM-IV-TR]; American Psychiatric Association, 2000). Should some TAT scales show an association with particular psychological disorders, as they do in the SCORS and DMM research, the test may help identify personality characteristics associated with these disorders. Failure to accomplish differential diagnosis, although important to recognize as a limitation of TAT applications, does not invalidate use of the instrument for its primary intended purposes.
As for correlations with self-report measures, there is little to gain from attempting to validate performance-based personality tests against self-report questionnaires, or vice versa for that matter. These are two types of test that are constructed differently, ask for different kinds of responses, provide different amounts of structure, and tap different levels of self-awareness, as discussed in concluding Chapter 1. Hence they may at times yield different results when measuring similar constructs, and in such instances they are more likely to complement than to contradict each other (see pp. 24-26; see also Weiner, 2005). Meyer et al. (2001) drew the following conclusions in this
regard from a detailed review of evidence and issues in psychological testing:
Distinct assessment methods provide unique information.... Any single assessment method provides a partial or incomplete representation of the characteristics it intends to measure.... Cross-method correlations cannot reveal ... how good a test is in any specific sense.... Psychologists should anticipate disagreements when similarly named scales are compared across diverse assessment methods. (p. 145)
Because both self-report and performance-based personality tests are inferential measures, furthermore, substantial correlations between them usually have only modest implications for their criterion validity. Two tests that correlate perfectly with each other can be equally invalid, with no significant relationship to any meaningful criterion. Compelling evidence of criterion validity emerges when personality test scores correlate not with each other, but with external (nontest) variables consisting of what people are like and how they are observed to behave.
Second, the traditionally qualitative TAT methods have been supplemented with quantitative scales that are readily accessible to psychometric verification. The previously mentioned research with the SCORS, DMM, and n-Ach scoring demonstrates that TAT assessment can be objectified to yield valid and reliable scales for measuring dimensions of personality functioning. Additional research has demonstrated the
internal consistency of SCORS and its validity in identifying developmental differences in the interpersonal capacities of children (e.g., Hibbard, Mitchell, & Porcerelli, 2001; Niec & Russ, 2002). The DMM has been validated as a measure of maturity level in children and adolescents, of developmental level of maturity in college students, and of long-term personality change and stability in adults (Cramer, 2003; Hibbard & Porcerelli, 1998; Porcerelli, Thomas, Hibbard, & Cogan, 1998). Support for the validity of these scales is acknowledged by critics as well as proponents of the TAT, although in the former case with the qualification that these "promising TAT scoring systems ... are not yet appropriate for routine clinical use" (Lilienfeld et al., 2000, p. 46). Even if this qualification is warranted, the point has been made that TAT assessment has the potential to generate valid and reliable findings.
Research with other picture-story measures, notably the RATC and the TEMAS, has provided additional evidence of the potential psychometric soundness of assessing personality with this method. As reviewed by Weiner and Kuehnle (1998), quantitative scores generated by both measures have valid and meaningful correlates and have shown adequate levels of interscorer agreement and either internal consistency
or retest stability.
Third, not having systematically gathered quantitative normative data to guide TAT interpretations does not mean that the instrument lacks reference points. As reviewed in the section of this chapter on card pull, cumulative clinical experience has established expectations concerning the types of stories commonly elicited by each of the TAT cards. Hence examiners are not in the position of inventing a new test each time they use the TAT. Instead, similarities and differences between a person's stories and common expectations can and should play a prominent role in the interpretive process, as they did in many of the examples presented in this chapter.
The fourth consideration pertains to the primary purpose of TAT assessment, which is to explore an individual's personal experience and generate hypotheses concerning the individual's underlying needs, attitudes, conflicts, and concerns. The value of the TAT resides in generating hypotheses that expand understanding of a person's inner life. If a TAT story suggests three alternative self-perceptions or sources of anxiety, and only one of these alternatives finds confirmation when other data sources are examined, then the test has done its job in useful fashion. It is not invalidated because two-thirds of the suggested alternatives in this instance proved incorrect. This is the nature of working with a primarily qualitative assessment instrument, which shows its worth, not through
quantitative psychometric verification, but by clinicians finding it helpful in understanding and treating people who seek their services. Psychologists who may be concerned that this qualitative perspective detracts from the scientific status of assessment psychology should keep in mind that generating hypotheses is just as much a part of science as confirming hypotheses.
REFERENCES
Ackerman, M. J., & Ackerman, M. C. (1997). Custody evaluations in practice: A survey of experienced professionals (revisited). Professional Psychology, 28, 137-145.
Ackerman, S. J., Clemence, A. J., Weatherill, R., & Hilsenroth, M. J. (1999). Use of the TAT in the assessment of DSM-IV Cluster B personality disorders. Journal of Personality Assessment, 73, 422-448.
Ackerman, S. J., Hilsenroth, M. J., Clemence, A. J., Weatherill, R., & Fowler, J. C. (2000). The effect of social cognition and object representation on psychotherapy continuation. Bulletin of the Menninger Clinic, 64, 386-408.
American Psychiatric Association. (2000). Diagnostic and statistical manual of mental disorders (4th ed., text rev.). Washington, DC: Author.
Anderson, J. W. (1988). Henry Murray's early career: A psychobiographical exploration. Journal of Personality, 56, 139-171.
Anderson, J. W. (1999). Henry A. Murray and the creation of the Thematic Apperception Test. In L. Gieser & M. I. Stein (Eds.), Evocative images: The Thematic Apperception Test and the art of projection (pp. 23-38). Washington, DC: American Psychological Association.
Archer, R. P., Buffington-Vollum, J. K., Stredny, R. V., & Handel, R. W. (2006). A survey of psychological test use patterns among forensic psychologists. Journal of Personality Assessment, 87, 84-94.
Archer, R. P., & Newsom, C. R. (2000). Psychological test usage with adolescent clients: Survey update. Assessment, 7, 227-235.
Atkinson, J. W. (Ed.). (1958). Motives in fantasy, action, and society. Princeton, NJ: Van Nostrand.
Avila-Espada, A. (2000). Objective scoring for the TAT. In R. H. Dana (Ed.), Handbook of cross-cultural and multicultural personality assessment (pp. 465-480). Mahwah, NJ: Erlbaum.
Barenbaum, N. R., & Winter, D. G. (2003). Personality. In I. B. Weiner (Editor-in-Chief) & D. K. Freedheim (Vol. Ed.), Handbook of psychology: Vol. 1. History of psychology (pp. 177-302). Hoboken, NJ: Wiley.
Bellak, L. (1947). A guide to the interpretation of the Thematic
Apperception Test. New York: Psychological Corporation.
Bellak, L. (1954). The Thematic Apperception Test and the Children's Apperception Test in clinical use. New York: Grune & Stratton.
Bellak, L. (1975). The TAT, CAT, and SAT in clinical use (3rd ed.). New York: Grune & Stratton.
Bellak, L. (1999). My perceptions of the Thematic Apperception Test in psychodiagnosis and psychotherapy. In L. Gieser & M. I. Stein (Eds.), Evocative images: The Thematic Apperception Test and the art of projection (pp. 133-141). Washington, DC: American Psychological Association.
Bellak, L., & Abrams, D. M. (1997). The TAT, CAT, and SAT in clinical use (6th ed.). Boston: Allyn & Bacon.
Blankenship, V., Vega, C. M., Ramos, E., Romero, K., Warren, K., Keenan, K., et al. (2006). Using the multifaceted Rasch model to improve the TAT/PSE measure of need for achievement. Journal of Personality Assessment, 86, 100-114.
Boccaccini, M. T., & Brodsky, S. L. (1999). Diagnostic test usage by forensic psychologists in emotional injury cases. Professional Psychology, 30, 253-259.
Borum, R., & Grisso, T. (1995). Psychological test use in criminal forensic evaluations. Professional Psychology, 26, 465-473.
Busch, F. (1995). The ego at the center of clinical technique.
  • 52. Northvale, NJ: Aronson. Butcher, J. N., & Rouse, S. V. (1996). Personality: Individual differences and clinical assessment. Annual Review ofPsychology, 47, 87-111. Camara, W., Nathan, J., & Puente, A. (2000). Psychological test usage: Implications in professional use. Professional Psychology, 31, 141-154. Clemence, A. J., & Handler, L. (2001). Psychological assessment on internship: A survey of training directors and their expectations for students. Journal ofPersonality Assessment, 76, 18-47. Collins, C. J., Ranges, P. J., & Locke, E. A. (2004). The relationship of achievement motivation to entrepreneurial behavior: A meta-analysis. Human Performance, 17, 95-117. Conklin, A., & Westen, D. (2001). Thematic apperception test. In W. I. Dorfman & M. Hersen (Eds.), Understanding psychological assessment (pp. 107-133). Dordrecht, The Netherlands: Kluwer Academic. Costantino, G., & Malgady, R. G. (1999). The Tell-Me-A-Story Test: A multicultural offspring of the Thematic Apperception Test. In L. Gieser & M. I. Stein (Eds.), Evocative images: The Thematic Apperception Test and the art ofprojection (pp. 177- 190). Washington, DC: American Psychological Association. Costantino, G., Malgady, R. G., & Rogler, L. H. (1998). Technical manual: TEMAS Thematic
  • 53. Apperception Test. Los Angeles: Western Psychological Services. Costantino, G., Malgady, R. G., Rogler, L. H., & Tosi, E. C. (1998). Discriminant analysis of clinical outpatients and public school children by TEMAS: A thematic apperception test for Hispanics and Blacks. Journal ofPersonality Assessment, 52, 670-678. 476 Performance-Based Measures Cramer, P. (1991). The development of defense mechanisms: Theory, research and assessment. New York: Springer-Verlag. Cramer, P. (1996). Storytelling, narrative, and the Thematic Apperception Test. New York: Guilford Press. Cramer, P. (1999). Future directions for the Thematic Apperception Test. Journal of Personality Assessment, 72, 74-92. Cramer, P. (2003). Personality change in later adulthood is predicted by defense mechanism use in early adulthood. Journal ofResearch in Personality, 37, 76-104. Cramer, P. (2006). Protecting the self: Defense mechanisms in action. New York: Guilford Press. Cramer, P., & Blatt, S. J. (1990). Use of the TAT to measure change in defense mechanisms following intensive psychotherapy. Journal ofPersonality Assessment, 54,
  • 54. 236-251. Cramer, P., & Kelly, F. D. (2004). Adolescent conduct disorder and adjustment reaction. Journal of Nervous and Mental Diseases, 192, 139-145. Dana, R. H. (1999). Cross-cultural-multicultural use of the Thematic Apperception Test. In L. Gieger & M. I. Stein (Eds.), Evocative images: The Thematic Apperception Test and the art of projection (pp. 177-190). Washington, DC: American Psychological Association. Dana, R. H. (2006). TEMAS among the Europeans: Different, complementary, and provocative. South African Rorschach Journal, 3, 17-28. Ephraim, D. (2000). A psychocultural approach to TAT scoring and interpretation. In R.H. Dana (Ed.), Handbook of cross-cultural and multicultural personality assessment (pp. 427-446). Mahwah, NJ: Erlbaum. Eron,L. D. (1950). A normative study of the Thematic Apperception Test.Psychological Monographs, 64( Whole No. 315). Eron, L. D. (1953). Responses of women to the Thematic Apperception Test. Journal of Consulting Psychology, 17, 269-282. Fowler, J. C., Ackerman, S. J., Speanburg, S., Bailey, A., Blagys, M., & Conklin, A. C. (2004). Personality and symptom change in treatment refractory inpatients: Evaluation of the phase model of change using Rorschach TAT and DSM-IV Axis V.
  • 55. Journal ofPersonality Assessment, 83, 306-322. Freedenfeld, R. N., Orndoff, S. R., & Kelsey, R. M. (1995). Object relations and physical abuse: A TAT analysis. Journal ofPersonality Assessment, 64, 552-568. Freud, S. (1957). "Wild" psychoanalysis. In J. Strachey (Ed. & Trans.), The standard edition of the works of Sigmund Freud (Vol. 11, pp. 221-227). London: Hogarth Press. (Original work published 1910) Garb, H. N. (1998). Recommendations for training in the use of the Thematic Apperception Test (TAT). Professional Psychology, 29, 621-622. Hall, C. S., Lindzey, G., & Campbell, J. B. (1998). Theori es of personality (4th ed.). New York: Wiley. Handler, L. (2001). Assessment of men: Personality assessment goes to war by the Office of Strategic Services Assessment staff. Journal ofPersonality Assessment, 76, 558-578. Henry, W. E. (1956). The analysis offantasy: The thematic apperception technique in the study of personality. New York: Wiley. Hibbard, S. (2003). A critique of Lilienfeld et al. 's (2000) "The scientific status of projective tech- niques." Journal ofPersonality Assessment, 80, 260-271. Hibbard, S., Mitchell, D., & Porcerelli, J. (2001). Internal consistency of the Object Relations and
  • 56. Social Cognition scales for the Thematic Apperception Test. Journal ofPersonality Assessment, 77, 408-419. Hibbard, S., & Porcerelli, J. (1998). Further validation for the Cramer Defense Mechanisms manual. Journal ofPersonality Assessment, 70, 460-483. Thematic Apperception Test 477 Hofer, J., & Chasiotis, A. (2004). Methodological considerations of applying a TAT-type picture-story test in cross-cultural research. Journal of Cross-Cultural Psychology, 35, 224-241. Hogan, T. P. (2005). 50 widely used psychological tests. In G. P. Koocher, J.C. Norcross, & S. S. Hill III (Eds.), Psychologists' desk reference ( 2nd ed., pp. 101-104). New York: Oxford University Press. Holmstrom, R. W., Silber, D. E., & Karp, S. A. (1990). Development of the Apperceptive Personality Test. Journal ofPersonality Assessment, 54, 252-264. Huprich, S. K., & Greenberg, R. P. (2003). Advances in the assessment of object relations in the 1990s. Clinical Psychology Review, 23, 665-698. Jenkins, S. R. (in press). Handbook ofclinical scoring systems for Thematic Apperception techniques. Mahwah, NJ: Erlbaum. Karon, B. P. (2000). The clinical interpretation of the Thematic
  • 57. Apperception Test, Rorschach, and other clinical data: A reexamination of statistical versus clinical prediction. Professional Psychology, 31, 230-233. Karp, S. A., Holstrom, R. W., & Silber, D. E. (1989). Manual for the Apperceptive Personality Test (APT). Orland Park, IL: International Diagnostic Services. Keiser, R. E., & Prather, E. N. (1990). What is the TAT? A review of ten years of research. Journal of Personality Assessment, 55, 800-803. Kelly, F. D. (1999). The psychological assessment ofabused and traumatized children. Mahwah, NJ: Erlbaum. Kelly, F. D. (2007). The clinical application of the Social Cognition and Object Relations scale with children and adolescents. In S. R. Smith & L. Handler (Eds.), The clinical assessment ofchildren and adolescents (pp. 169-182). Mahwah, NJ: Erlbaum. Lanagan-Fox, J., & Grant, S. (2006). The Thematic Apperception Test: Toward a standard measure of the big three motives. Journal ofPersonality Assessment, 87, 277-291. Lilienfeld, S. 0., Wood, J. M., & Garb, H. N. (2000). The scientific status of projective techniques. Psychological Science in the Public Interest, 1, 27-66. McArthur, D. S., & Roberts, G. E. (1990). Roberts Apperception Test for Children manual. Los Angeles: Western Psychological Services.
  • 58. McClelland, D. C. (1999). How the test lives on: Extensions of the Thematic Apperception Test approach. In L. Gieser & M. I. Stein (Eds.), Evocative images: The Thematic Apperception Test and the art ofprojection (pp. 163-175). Washington, DC: American Psychological Association. McClelland, D. C., Atkinson, J. W., Clark, R. A., & Lowell, E. L. (1953). The achievement motive. New York: Appleton-Century-Crofts. McClelland, D. C., Clark, R. A., Roby, T. B., & Atkinson, J. W. (1958). The effect of the need for achievement on thematic apperception. In J. W. Atkinson (Ed.), Motives in fantasy, action, and society (pp. 64-82). Princeton, NJ: Van Nostrand. Meyer, J. G. (2004). The reliability and validity of the Rorschach and Thematic Apperception Test (TAT) compared to other psychological and medical procedures: An analysis of system- atically gathered evidence. In M. Hersen (Editor-in-Chief), M. Hilsenroth, & D. Segal (Vol. Eds.), Comprehensive handbook of psychological assessment: Vol. 2. Personality assessment (pp. 315-342). Hoboken, NJ: Wiley. Meyer, J. G., Finn, S. E., Eyde, L. D., Kay, G. G., Moreland, K. L., Dies, R. R., et al. (2001). Psychological testing and psychological assessment: A review of evidence and issues. American Psychologist, 56, 128-165. Moretti, R. J., & Rossini, E. D. (2004). The Thematic Apperception Test (TAT). In M. Hersen (Editor- in-Chief), M. J. Hilsenroth, & D. L. Segal (Vol. Eds.),
  • 59. Comprehensive handbook ofpsychological assessment: Vol. 2. Personality assessment (pp. 356-371). Hoboken, NJ: Wiley. Morgan, C. D., & Murray, H. A. (1935). A method for investigating fantasies: The Thematic Apper- ception Test. Archives ofNeurology and Psychiatry, 34, 289- 306. 478 Performance-Based Measures Morgan, W. G. (1995). Origin and history of Thematic Apperception Test images. Journal ofPerson- ality Assessment, 65, 237-254. Morgan, W. G. (2002). Origin and history of the earliest Thematic Apperception Test pictures. Journal ofPersonality Assessment, 79, 422-445. Morgan, W. G. (2003). Origin and history of the "Series B" and "Series C" TAT pictures. Journal of Personality Assessment, 81, 133-148. Murray, H. A. ( 1938). Explorations in personality: A clinical and experimental study offifty men of college age. New York: Oxford University Press. Murray, H. A. (1940). What should psychologists do about psychoanalysis? Journal of Abnormal and Social Psychology, 35, 150--175. Murray, H. A. (1971). Thematic Apperception Test: Manual. Cambridge, MA: Harvard University Press. (Original work published 1943)
  • 60. Murstein, B. I. (1963). Theory and research in projective techniques (Emphasizing the TAT). New York: Wiley. Niec, L. N., & Russ, S. W. (2002). Children's internal representations, empathy, and fantasy play: A validity study of the SCORS-Q. Psychological Assessment, 14, 331-338. Office of Strategic Services Assessment Staff. (1948). Assessment of men. New York: Rinehart. Ornduff, S. R., Freedendeld, R. N., Kelsey, R. M., & Critelli , J. W. ( 1994 ). Object relations of sexually abused female subjects: A TAT analysis. Journal of Personality Assessment, 63, 223-238. Ornduff, S. R., & Kelsey, R. M. (1996). Object relations of sexually and physically abused female children: A TAT analysis. Journal ofPersonality Assessment, 66, 91-105. Pang, J. S., & Schultheiss, 0. C. (2005). Assessing implicit motives in U.S. college students effects of picture type and position, gender, and ethnicity, and cross - cultural comparisons. Journal of Personality Assessment, 85, 280--294. PDM Task Force. (2006). Psychodynamic diagnostic manual. Silver Spring, MD: Alliance of Psy- choanalytic Organizations. Peters, E. J., Hilsenroth, M. J., Eudell-Simmons, E. M., Blagys, M. D., & Handler, L. (2006). Reliability and validity of the Social Cognition and Object
  • 61. Relations scale in clinical use. Psychotherapy Research, 16, 617--616. Porcerelli, J. H., Cogan, R., Kamoo, R., & Leitman, W. (2004). Defense mechanisms and self-reported violence toward partners and strangers. Journal ofPersonality Assessment, 82, 317-320. Porcerelli, J. H., & Hibbard, S. (2004). Projective assessment of defense mechanisms. In M. Hersen (Editor-in-Chief), M. J. Hilsenroth, & D. L. Segal (Vol. Eds.), Comprehensive hand- book ofpsychological assessment: Vol. 2. Personality assessment (pp. 466-475). Hoboken, NJ: Wiley. Porcerelli, J. H., Thomas, S., Hibbard, S., & Cogan, R. (1998). Defense mechanism development in children, adolescents, and late adolescents. Journal of Personality Assessment, 71, 411-420. Prince, M. (1906). The dissociation of a personality: A biographical study in abnormal psychology. New York: Longmans. Quinnell, F. A., & Bow, J. N. (2001). Psychological tests used in child custody evaluations. Behavioral Sciences and the Law, 19, 491-501. Ritzier, B. A. (2004). Cultural applications of the Rorschach, Apperception Tests, and figure drawings. In M. Hersen (Editor-in-Chief), M. J. Hilsenroth, & D. L. Segal (Vol. Eds.), Comprehensive handbook ofpsychological assessment: Vol. 2. Personality assessment (pp. 573-585). Hoboken, NJ: Wiley.
  • 62. Ritzier, B. A., Sharkey, K. J., & Chudy, J. F. (1980). A comprehensive projective alternative to the TAT. Journal ofPersonality Assessment, 44, 358-362. Roberts, G. E. (2006). Roberts-2 manual. Los Angeles: Western Psychological Services. Robinson, F. G. (1992). Love's story told: A life of Henry A. Murray. Cambridge, MA: Harvard University Press. Thematic Apperception Test 479 Sandstrom, M. J., & Cramer, P. (2003). Defense mechanisms and psychological adjustment in child- hood. Journal ofNervous and Mental Diseases, 191, 487-495. Schretlen, D. J. (1997). Dissimulation on the Rorschach and other projective measures. In R. Rogers (Ed.), Clinical assessment of malingering and deception (2nd ed., pp. 208-222). New York: Guilford Press. Shakespeare, W. (194 7). The tragedy ofHamlet, Prince ofDenmark. New Haven, CT: Yale University Press. (Original work published 1604) Sharkey, K. J., & Ritzler, B. A. (1985). Comparing diagnostic validity of the TAT and a new Picture Projective Test. Journal ofPersonality Assessment, 49, 406-412. Shneidman, E. S. (1951). Thematic test analysis. New York: Grune & Stratton.
  • 63. Shneidman, E. S. (1965). Projective techniques. In B. B. Wolman (Ed.), Handbook of clinical psy- chology (pp. 498-521). New York: McGraw-Hill. Smith, C. P. (Ed.). (1992). Motivation and personality: Handbook of thematic content analysis. New York: Cambridge University Press. Spangler, W. D. (1992). Validity of questionnaire and TAT measures of need for achievement: Two meta-analyses. Psychological Bulletin, 112, 140---154. Stein, M. I. (1948). The Thematic Apperception Test. Reading, MA: Addison-Wesley. Stein, M. I., & Gieser, L. (1999). The zeitgeists and events surrounding the birth of the Thematic Apperception Test. In L. Gieser & M. I. Stein (Eds.), Evocative images: The Thematic Apper- ception Test and the art of projection (pp. 15-22). Washington, DC: American Psychological Association. Stricker, G., & Gooen-Piels, J. (2004). Projective assessment of object relations. In M. Hersen (Editor-in-Chief), M. J. Hilsenroth, & D. L. Segal (Vol. Eds.), Comprehensive handbook of psychological assessment: Vol. 2. Personality assessment (pp. 449-465). Hoboken, NJ: Wiley. Teglasi, H. (2001). Essentials ofTAT and other storytelling techniques assessment. New York: Wiley. Tomkins, S. S. (1947). The Thematic Apperception Test: The
  • 64. theory and technique of interpretation. New York: Grune & Stratton. Vaillant, G. E. (1977). Adaptation to life. Boston: Little, Brown. Vaillant, G. E. (1994). Ego mechanisms of defense and personality psychopathology. Journal of Abnormal Psychology, 105, 44--50. Vane, J. R. (1981). The Thematic Apperception Test: A review. Clinical Psychology Review, 1, 319--336. Weiner, I. B. (2003). Prediction and postdiction in clinical decision making. Clinical Psychology: Science and Practice, JO, 335-338. Weiner, I. B. (2005). Integrative personality assessment with self-report and performance-based measures. In S. Strack (Ed.), Handbook of personology and psychopathology (pp. 317-331). Hoboken, NJ: Wiley. Weiner, I. B., & Kuehnle, K. (1998). Projective assessment of children and adolescents. In A. S. Bellack & M. Hersen (Eds.), Comprehensive clinical psychology: Vol. 4. Assessment (pp. 432-458). New York: Pergamon Press. Westen, D. (1991). Social cognition and object relations. Psychological Bulletin, 109, 429-455. Westen, D. (1995). Social Cognition and Object Relations Scale: Q-Sort for Projective Stories (SCORS-Q). Unpublished manuscript, Harvard Medical School,
  • 65. Cambridge, MA. Westen, D., Lohr, N. E., Silk, K., Gold, L., & Kerber, K. (1990). Object relations and social cog- nition in borderlines, major depressives, and normals: A Thematic Apperception Test analysis. Psychological Assessment, 2, 355-364. Westen, D., Lohr, N. E., Silk, K., Kerber, K., & Goodrich, S. (1989). Object relations and social cognition TAT scoring manual (4th ed.). Unpublished manuscript, University of Michigan, Ann Arbor. 480 Performance-Based Measures Winter, D. G. (1998). Toward a science of personality psychology: David McClelland's development of empirically derived TAT measures. History ofPsychology, 1, 130--153. Winter, D. G. (1999). Linking personality and "scientific" psychology: The development of empiri- cally derived Thematic Apperception Test measures. In L. Gieser & M. I. Stein (Eds.), Evocative images: The Thematic Apperception Test and the art ofprojection (pp. 106-124). Washington, DC: American Psychological Association. Zubin, J., Eron, L. D., & Schumer, F. (1965). An experimental approach to projective techniques. New York: Wiley. a-425-435a-467-480
Chapter 11
RORSCHACH INKBLOT METHOD
The preceding five chapters have presented the most commonly used self-report inventories for assessing personality functioning. As noted in Chapter 1, inventories of this kind differ in several respects from performance-based personality measures. Self-report inventories provide direct assessments of personality characteristics in which people are asked to describe themselves by indicating whether certain statements apply to them. Performance-based measures are an indirect approach in which personality characteristics are inferred from the way people respond to various standardized tasks. Self-report and performance-based methods both bring advantages and limitations to the assessment process, as discussed in Chapters 1 and 2, and there are many reasons personality assessments should ordinarily be conducted with a multifaceted test battery that includes both kinds of measures (see pp. 13-15 and 22-26).
This and the following three chapters address the most widely used performance-based measures of personality functioning: the Rorschach Inkblot Method (RIM), the Thematic Apperception Test (TAT), figure drawing methods, and sentence completion methods. These and other performance-based personality measures have
traditionally been referred to as projective tests and are still commonly labeled this way. As pointed out in concluding Chapter 1, however, "projective" is not an apt categorization of these measures, and contemporary assessment psychologists prefer more accurate descriptive labels for them such as performance-based.
NATURE OF THE RORSCHACH INKBLOT METHOD
The Rorschach Inkblot Method (RIM) consists of 10 inkblots printed individually on 6⅝" by 9⅝" cards. Five of these blots are printed in shades of gray and black (Cards I and IV-VII); two of the blots are in shades of red, gray, and black (Cards II and III); and the remaining three blots are in shades of various pastel colors (Cards VIII-X). In what is called the Response Phase of a Rorschach examination, people are shown the cards one at a time and asked to say what they see in them. In the subsequent Inquiry Phase of the examination, persons being examined are asked to indicate where in the blots they saw each of the percepts they reported and what made those percepts look the way they did.
These procedures yield three sources of data. First, the manner in which people structure their responses identifies how they are likely to structure other situations in their lives. People who base most of their responses on the overall appearance of the inkblots and pay little attention to separate parts of them are likely to be individuals who tend to form global
impressions of situations and ignore or overlook details of these situations. Conversely, people who base most of their responses on parts of the blots and seldom make use of an entire blot are often people who become preoccupied with the details of situations and fail to grasp their overall significance, as in "not being able to see the forest for the trees." As another example of response structure, people who report seeing objects that are shaped similarly to the part of the blot where they are seeing them are likely in general to perceive people and events accurately, and hence to show adequate reality testing. By contrast, people who give numerous perceptually inaccurate responses that do not resemble the shapes of the blots are prone in general to form distorted impressions of what they see, and hence to show impaired reality testing.
As a second source of data, Rorschach responses frequently contain content themes that provide clues to a person's underlying needs, attitudes, and concerns. People who consistently describe human figures they see in the inkblots as being angry, carrying weapons, or fighting with each other may harbor concerns that other people are potentially dangerous to them, or they may view interpersonal relationships
as typified by competition and strife. Conversely, a thematic emphasis on people described as friendly, as carrying a peace offering, or as helping each other in a shared endeavor probably reveals a sense of safety in interpersonal relationships and an expectation that people will interact in collaborative ways. In similar fashion, recurrent descriptions of people, animals, or objects seen in the blots as being damaged or dysfunctional (e.g., "a decrepit old person"; "a wounded bug"; "a piece of machinery that's rusting away") may reflect personal concerns about being injured or defective in some way, or about being vulnerable to becoming injured or defective.
The third source of data in a Rorschach examination consists of the manner in which individuals conduct themselves and relate to the examiner, which provides behavioral indications of how they are likely to deal with task-oriented and interpersonal situations. Some of the behavioral data that emerge during a Rorschach examination resemble observations that clinicians can make whenever they are conducting interview or test assessments. Whether people being assessed seem deferential or antagonistic toward the examiner may say something about their attitudes toward authority. Whether they appear relaxed or nervous may say something about how self-confident and self-assured they are and about how they generally respond to being evaluated. The RIM also provides some test-specific behavioral data in the
form of how people handle the cards and how they frame their responses. Do they carefully hand each card back to the examiner when they are finished responding to it, or do they carelessly toss the card on the desk? Do they give definite responses and take responsibility for them (as in "This one looks to me like a bat"), or do they disavow responsibility and avoid commitment (as in "It really doesn't look like anything to me, but if I have to say something, I'd say it might look something like a bat")?
To summarize this instrument, then, the RIM involves each of the following three tasks:
1. A perceptual task yielding structural information that helps to identify personality states and traits
2. An associational task generating content themes that contain clues to a person's underlying needs and attitudes
3. A behavioral task that provides a representative sample of an individual's orientation to problem-solving and interpersonal situations
In parallel to these three test characteristics, Rorschach assessment measures personality functioning because the way people go about seeing things in the inkblots reflects how they
look at their world and how they customarily make decisions and deal with events. What they see in the inkblots provides a window into their inner life and the contents of their mind, and how they conduct themselves during the examination provides information about how they usually respond to people and to external demands.
By integrating these structural, thematic, and behavioral features of the data, Rorschach clinicians can generate comprehensive personality descriptions of the people they examine. These descriptions typically address adaptive strengths and weaknesses in how people manage stress, how they attend to and perceive their surroundings, how they form concepts and ideas, how they experience and express feelings, how they view themselves, and how they relate to other people.
Later sections of this chapter elaborate the codification, scoring, and interpretation of Rorschach responses and delineate how Rorschach-based descriptions of personality characteristics facilitate numerous applications of the instrument. As a further introduction to these topics and to the psychometric features of the RIM, the next two sections of the chapter review the history of Rorschach assessment and standard procedures for administering the instrument.
HISTORY
Of the personality assessment instruments discussed in this text, the Rorschach Inkblot Method has the longest and most interesting history because it
was shaped by diverse personal experiences and life events. The inkblot method first took systematic form in the mind of Hermann Rorschach, a Swiss psychiatrist who lived only 37 years, from 1885 to 1922. As a youth, Rorschach had been exposed to inkblots in the form of a popular parlor game in turn-of-the-century Europe called Klecksographie. Klecks is the German word for "blot," and the Klecksographie game translates loosely into English as "Blotto." The game was played by dropping ink in the middle of a piece of paper, folding the paper in half to make a more or less symmetrical blot, and then competing to see who among the players could generate the most numerous or interesting descriptions of the blots or suggest associations to what they resembled. According to available reports, Rorschach's enthusiasm for this game, which appealed to adolescents as well as adults, and his creativity in playing it led to his being nicknamed "Klex" by his high school classmates (Exner, 2003, chap. 1).
From 1917 to 1919, while serving as Associate Director of the Krombach Mental Hospital in Herisau, Switzerland, Rorschach pursued a notion he had formed earlier in his career that patients with different types of mental disorders would respond to inkblots differently from each other and from psychologically healthy people. To test this notion, he constructed and experimented with a large number of blots, but these were not the accidental ink splotches of the parlor games. Rorschach was a skilled
amateur artist who left behind an impressive portfolio of drawings that can be viewed in the Rorschach Archives and Museum in Bern, Switzerland. The blots with which he experimented were carefully drawn by him, and over time he selected a small set that seemed particularly effective in eliciting responses and reflecting individual differences.
Rorschach then administered his selected set of blots to samples of 288 mental hospital patients and 117 nonpatients, using a standard instruction, "What might this be?" Rorschach published his findings from this research in a 1921 monograph titled Psychodiagnostics (Rorschach, 1921/1942). The materials and methods described by Rorschach in Psychodiagnostics provided the basic foundation for the manner in which Rorschach assessment has been most commonly practiced since that time, and the standard Rorschach plates used today are the same 10 inkblots that were published with Rorschach's original monograph.
Rorschach's monograph was nevertheless a preliminary work, and he was just beginning to explore potential refinements and applications of the inkblot method when he succumbed a year after its publication to peritonitis, following a ruptured appendix. The monograph itself did not attract much attention initially, and the method
might have succumbed along with its creator were it not for the efforts of a few close friends and colleagues of Rorschach who were devoted to keeping the method alive. Their efforts were facilitated by the fact that Switzerland in the 1920s was a Mecca for medical scientists and researchers, who visited from many parts of the world to study with famous physicians at Swiss hospitals and medical schools. Some of these visiting scholars and practitioners heard about Rorschach's method while they were in Switzerland and took copies of the inkblots home with them. As a result, articles on the Rorschach were published during the 1920s in such diverse countries as Russia, Peru, and Japan.
The Rorschach came to the United States by way of an American psychiatrist named David Levy, who went to Zurich in the mid-1920s to study for a year with Emil Oberholzer, a prominent psychoanalyst who had been one of Rorschach's good friends and supporters. Levy returned to the United States with several copies of the inkblots. His interests lay elsewhere, however, and the Rorschach materials languished for a time in his desk at the New York Institute of Guidance. Then, in 1929, Samuel Beck, a graduate student at Columbia University who was doing a fellowship at the Institute, mentioned to Levy that he was looking for a dissertation topic. Levy told Beck about the Rorschach materials he had brought back from Switzerland and suggested that Beck might do a research project with them. Acting on this
suggestion, Beck earned his doctorate with a Rorschach standardization study of children. While collecting his data, Beck published the first two English language articles on the method in 1930 (Beck, 1930a, 1930b). He followed these articles 7 years later with Introduction to the Rorschach Method, which was the first English language monograph on the Rorschach, and in 1944 with the first edition of his basic text, Rorschach's Test: I. Basic Processes (Beck, 1937, 1944). Throughout a long, productive career, Beck remained an influential figure in Rorschach assessment, and his contributions became internationally known and respected.
In 1934, Beck went to Switzerland for a year's study with Oberholzer, and his departure coincided with the arrival from Zurich of another Rorschach pioneer, Bruno Klopfer. Klopfer had received a doctorate in educational psychology in 1922 and by 1933 had advanced to a senior staff position at the Berlin Information Center for Child Guidance. He also had become interested in Jungian psychoanalytic theory and was in the final phases of completing training as a Jungian analyst. However, the restrictions being placed on Jews in Adolf Hitler's Germany at that time led Klopfer to an advisedly dim view of his future professional prospects in Berlin, and he decided to move to Zurich. Without a job in Zurich, he was helped by Carl Jung to obtain a position as a technician at the Zurich
Psychotechnic Institute. Klopfer's responsibilities at the Institute included psychological testing of applicants for various jobs, and the Rorschach was among the tests he was required to use for this purpose. He had no previous interest or experience in testing, but he soon became intrigued with the ways in which Rorschach responses could reveal the underlying thoughts and feelings of the people he was testing.
Klopfer was dissatisfied with his low status as a technician and soon began looking for other opportunities. His search resulted in his being appointed as a research associate in the Department of Anthropology at Columbia University, where he began working in 1934. Having learned of his arrival on campus, a group of psychology graduate students asked their department to arrange for Klopfer to give them some Rorschach training. Unimpressed with Klopfer's credentials, the department declined to hire him for this purpose. The students were not deterred, however, and they approached Klopfer privately about offering some evening seminars for them in his home, which he agreed to do.
Giving these seminars for this and subsequent groups of students and professionals produced a network of Klopfer-trained psychologists who were eager to keep in touch with each other and continue exchanging ideas about the Rorschach. In response to this
interest, Klopfer in 1936 founded the Rorschach Research Exchange, which has been published regularly since that time and evolved into the contemporary Journal of Personality Assessment. In 1938, Klopfer founded the Rorschach Institute, a scientific and professional organization that continues to function actively today, and more broadly than Klopfer envisioned, as the Society for Personality Assessment. Klopfer's first Rorschach book, The Rorschach Technique, appeared in 1942, but it was not until 1954 that he published his definitive basic text, Developments in the Rorschach Technique: Volume 1. Technique and Theory (Klopfer, Ainsworth, Klopfer, & Holt, 1954; Klopfer & Kelley, 1942). Because one of them needed a dissertation topic and the other needed a job, then, these two Rorschach pioneers were drawn into a lifetime engagement with the inkblot method. Like Beck, Klopfer gained international acclaim for his teaching and writing about Rorschach assessment. Regrettably for the development of the instrument, Beck and Klopfer approached their work from very different perspectives. Having been educated in an experimentally oriented department of psychology, Beck was interested in describing personality characteristics and was firmly committed to advancing knowledge through controlled research designs and empirical data collection. He stuck closely to Rorschach's original procedures for administration and coding, and he favored a primarily quantitative approach to Rorschach interpretation. With respect to the
distinction between nomothetic and idiographic approaches in personality assessment discussed in Chapters 1 (pp. 12-13) and 2 (p. 34), Beck was very much in the nomothetic camp. Klopfer, on the other hand, was a Jungian analyst at heart and an enthusiast for idiography. He had a strong interest in symbolic meanings and in unraveling the phenomenology of each person's human experience. He employed qualitative approaches to interpretation that Beck considered inappropriate, and he added many new response codes and summary scores on the basis of imaginative ideas rather than research data, which Beck found unacceptable. These differences in perspective led Beck and Klopfer to formulate and promulgate distinctive Rorschach systems that involved dissimilar approaches to administering, scoring, and interpreting the test. Divergence in method did not stop with these two pioneers, however. In the early 1930s, Beck talked about his Rorschach research with Marguerite Hertz, the wife of an old friend of his, who was working on her doctorate in psychology at Western Reserve University in Cleveland. Hertz became an ardent enthusiast for the value of Rorschach assessment, especially in working with children. She developed some distinctive variations of her own in Rorschach administration,
scoring, and interpretation, and, in the course of a long and productive life as a university professor, she taught her approach to many generations of graduate students and workshop participants. Klopfer's first seminar group included several psychology graduate students and a friend of one of these students who had encouraged him to sit in. This friend was Zygmunt Piotrowski, who at the time was a postdoctoral fellow at the Neuropsychiatric Institute in New York. Piotrowski had received a doctorate in experimental psychology in Poland in 1927 and was in the United States for advanced study in neuropsychology. Aside from curiosity, he had little interest in Rorschach assessment when he joined Klopfer's seminar group. However, he soon began to contemplate the possibility that persons with various kinds of neurological disorders might respond to the inkblots in ways that would help identify their condition. Piotrowski subsequently pioneered in conducting Rorschach research with brain-injured patients, and he developed many creative ideas about how the inkblot method should be conceived, coded, and interpreted. These new ideas coalesced into a Rorschach system that Piotrowski called Perceptanalysis (Piotrowski, 1957). Like Beck, Klopfer, and Hertz, Piotrowski worked productively throughout a long life during which his courses, publications, and lectures introduced a loyal following to his particular Rorschach system.
This early history of the Rorschach in America came to a close with the arrival in the United States of another refugee from Europe, David Rapaport, a psychoanalytically oriented doctoral-level psychologist who fled his native Hungary in 1938. In 1940, Rapaport joined the staff of the Menninger Foundation in Topeka, Kansas, where 2 years later he became head of the psychology department. His responsibilities at the Foundation included mounting a research project to evaluate the utility of a battery of psychological tests for describing people and facilitating differential diagnosis. The Rorschach was part of this test battery, and Rapaport's collaborators in the project included Roy Schafer, who was an undergraduate psychology student at the time and completed his doctoral studies several years later at Clark University, after moving from the Menninger Foundation to the Austen Riggs Center in Massachusetts (see Schafer, 2006). Rapaport's psychoanalytic perspectives and many original ideas that he and Schafer formed about how to elicit and interpret Rorschach responses resulted in their using a modified inkblot method that differed substantially from any of the previous methods. Publication of a 2-volume treatise based on the Menninger research project and subsequent influential books by Schafer established the Rapaport/Schafer system as another alternative for practitioners and researchers to consider in their work with the Rorschach (Rapaport, Gill, & Schafer, 1946/1968; Schafer, 1948, 1954).
By 1950, then, there were five different Rorschach systems in the United States, each with its own adherents. Moreover, even though the Beck and Klopfer systems had become well-known abroad, the Rorschach landscape also included distinctive systems developed in other countries and popular among psychologists in Europe, South America, and Japan. This diversity of method made it difficult for Rorschach practitioners to communicate with each other and almost impossible for researchers to cumulate systematic data concerning the reliability of Rorschach findings and their validity for particular purposes. This problem persisted until the early 1970s, when John Exner undertook to resolve it by standardizing the Rorschach method in a conceptually reasonable and psychometrically sound manner. Having conducted a detailed comparative analysis of the five American systems (Exner, 1969), Exner instituted a research program to measure the impact of the different methods of administration used in the systems and to identify which of their response codes could be explained clearly and coded reliably. Drawing on what appeared to be the best features of each of the five American systems, Exner combined them into a Rorschach Comprehensive System (CS) that he published in 1974 (Exner, 1974). The Rorschach CS provides specific and detailed instructions
for administration and coding that are to be followed in exactly the same way in every instance. Now in its fourth edition (Exner, 2003), the CS has become by far the most frequently used Rorschach system in the United States as well as in many other countries of the world. Widespread adoption of the CS standardization has made possible the development of large-sample normative standards and international collaboration in examining cross-cultural similarities and differences in Rorschach responses. The cross-cultural applicability of Rorschach assessment has provided a unique large-scale opportunity to compare and understand different cultures from all over the world (see Shaffer, Erdberg, & Meyer, 2007). Standard Rorschach procedures have also fostered systematic collection and comparison of data concerning intercoder agreement, retest reliability, and criterion, construct, and incremental validity, both in the United States and abroad, which are reviewed later in the chapter. The advent of the CS has additionally allowed clinicians who use it to exchange information about Rorschach findings with confidence that these findings are based on the same method of obtaining and codifying the data. The next two sections of the chapter provide an overview of the CS administration and coding procedures.

ADMINISTRATION

To preserve standardization for the reasons just mentioned,
Rorschach examiners should follow as closely as possible the administration and coding procedures delineated for the CS by Exner (2003). Prior to beginning the testing, as discussed in Chapter 2, the examiner should have discussed with the person being evaluated such matters as the purposes of the assessment and how and to whom the results will be communicated. People are entitled to information about these matters, and even a brief discussion of them can be helpful in establishing rapport, reducing concerns the person may have about being examined, and clarifying misconceptions about the testing process. Typically, the RIM is part of a test battery that can be introduced in general terms such as the following: "As for the tests we're going to do, I'll be asking you questions about various matters and giving you some tasks to do; let's get started, and I'll show you what each of these tests is like as we do them." In preparing to administer the RIM, the examiner should have the cards face down in a single pile where they can be seen but not easily reached by the examinee. The examiner should also sit alongside the person or at an angle that is at least slightly behind the examinee and out of the person's direct line of vision. This arrangement makes it easy for people to show the examiner where on the blots they are seeing their percepts. Avoiding face-to-face administration also minimizes the possible influence on test responses of an examiner's facial expressions or other bodily movements. The Rorschach
administration should begin with the following type of explanation:

The next test we're going to do is one you may have heard of. It's often referred to as the inkblot test, and it's called that because it consists of a series of cards with blots of ink on them. The blots aren't anything in particular, but when people look at them, they see different things in them. There are 10 of these cards, and I'm going to show them to you one at a time and ask you what kinds of things you see in them and what they look like to you.

No further explanation should routinely be given of Rorschach procedures or of what can be learned from Rorschach responses. Should examinees ask, "How does this test work?" they can be told the following: "The way people look at things says something about what they are like as a person, and this test will give us information about your personality that should be helpful in ... [some reference to the purpose of the examination]." Should examinees say something on the order of "So this will be a test of my imagination" or "You want me to tell you what they remind me of?" the perceptual elements of the Rorschach task should be emphasized by indicating otherwise: "No, this is a test of what you see
in the blots, and I want you to tell me what they look like to you." If there are no such questions or comments that examiners must answer first, they should proceed directly after their explanation by handing the person Card I and saying, "What might this be?" People will usually take Card I when it is handed to them and should be asked to do so if necessary. Having people hold the cards promotes their engagement in the Rorschach task, and, as mentioned, the manner in which they handle the cards can be a source of useful behavioral data. In other respects, the individual's task during the Response Phase of the administration should be left as unstructured as possible. In response to questions ("How many responses should I give?" "Can I turn the card?" "Do I use the whole thing or parts of it as well?"), examiners should provide noncommittal replies ("It's up to you"; "Any way you wish"). Should the person begin by saying "It's an inkblot," the examiner should restate the basic instruction: "Yes, that's right, but what you need to do is tell me what it looks like to you, what kinds of things you see in it." Occasionally, some additional procedures may be necessary to obtain a record of sufficient but manageable length. A minimum of 14 responses is required to ensure the validity of a Rorschach protocol. Records with fewer than 14 responses are too brief to be entirely reliable and rarely support valid interpretations. To decrease the risk of ending up with a record of insufficient length, persons who give only one
response to Card I should be prompted by saying, "If you look at it some more, you'll see other things as well." If the person still does not produce more than one response, the single response should be accepted and the card taken back. However, individuals who have given just one or two responses to Card I, and then handed back or put down Cards II, III, or IV after only a single response, can be offered the following indirect encouragement, should they seem disengaged from their task and on their way to producing a brief record with fewer than 14 responses: "Wait, don't hurry through these; we're in no hurry, take your time." Should the Response Phase for all 10 cards yield fewer than 14 responses, despite such prompting and encouragement, the examiner should implement the following instructions:

Now you know how it's done. But there's a problem. You didn't give enough answers for us to learn very much from the test. So let's go through them again, and this time I'd like you to give me more responses. You can include the same ones you've already given, if you like, but give me more answers this time through.

There is also a standard procedure for not taking more responses than are necessary for interpretive purposes. If a person has given five responses to Card I and appears about to
give more, the examiner should take the card back while saying, "Okay, that's fine, let's go on to the next one." This procedure can be repeated on each subsequent card, should the person continue to give five responses and appear ready to give more. However, if on any card the person gives fewer than five responses, the limiting procedure should be discontinued and not resumed, even if the person later on gives more than five responses to some card. Exner (2003, pp. 52-56) identifies some unusual circumstances that might warrant departing from these standard guidelines for increasing or curtailing response total, but the procedures presented here suffice with few exceptions to direct the Response Phase of the administration. Of additional importance in conducting both the Response Phase and the subsequent Inquiry Phase is verbatim recording of whatever the examiner and the examinee say. Accurate coding and thorough interpretation depend on having a complete account of exactly how people expressed themselves and precisely what they were told or asked by the examiner. Most examiners rely on a system of abbreviations to simplify the task of recording a verbatim protocol; for example, using "ll a bfly" for "Looks like a butterfly" or "enc" to indicate when they have used the encouragement prompt after getting only one response on Card I. Some examiners tape-record Rorschach administrations to ensure preservation of the verbatim record. Whatever means is used, adequate Rorschach administration demands
maintaining the integrity of the raw data. To this end, examiners should write down how examinees behave during the administration as well as what they say (e.g., "laughed," "big sigh," "detached, looking at ceiling") to provide the behavioral data that enrich Rorschach interpretation. Following completion of the Response Phase, the examiner should introduce the Inquiry Phase of the administration as follows:

Now I want to take a moment to go through these cards with you again, so that I can see the things you saw. I'll read back each of the things you said, and for each one I'd like you to tell me where you saw it and what made it look like that to you.

The examiner should then hand the cards to the person one at a time, say for each response something on the order of "On this one you saw ..." or "Then you said ..." or "Next there was ...," and then complete this statement with a verbatim reading of the person's exact words. Nondirective prompts should then be used as necessary to help people comply with the inquiry instructions by clarifying what they have seen, where on the blot they saw it, and why it looked as it did to them. With respect to what the person has seen, appropriate prompts would include such statements and questions as "I'm not sure what it is you're seeing there," "Is it the whole person or just part of the person?" or "You said it could be a butterfly or a moth; which does it look more like to you?"
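As an aside on the Response Phase, the record-length rules given earlier in this section (a 14-response minimum, and a five-response limit per card that lapses permanently once any card draws fewer than five) can be expressed as a short decision sketch. The function and constant names below are hypothetical and are not part of the CS; the sketch also assumes the examiner has already judged that the person "appears about to give more," which the code cannot determine.

```python
MIN_RESPONSES = 14  # minimum record length stated in the text
CARD_LIMIT = 5      # responses accepted per card while limiting is in effect


def protocol_too_short(responses_per_card):
    """True if the full record falls below the 14-response minimum."""
    return sum(responses_per_card) < MIN_RESPONSES


def limiting_in_effect(previous_counts):
    """Limiting starts on Card I and ends for good once any
    earlier card drew fewer than five responses."""
    return all(count >= CARD_LIMIT for count in previous_counts)


def should_take_card_back(previous_counts, current_count):
    """True when the card would be retrieved after the fifth response
    (assumes the person appears ready to give more)."""
    return limiting_in_effect(previous_counts) and current_count >= CARD_LIMIT
```

For instance, a record of one response per card is too short and would call for the re-administration instructions, whereas a person who gave five responses to Card I and four to Card II would no longer be limited on Card III.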
To inquire about where the person has seen a percept, the examiner might ask, "How much of the blot is included in it?" or say, "You mentioned a head and a tail, and I'm not clear which part of the blot is which." Should the response to such questions or statements leave unclear where a percept has been seen, examinees should be asked to outline with their finger the area of the blot they were using for it. Inquiry about what made a percept look as it did can take the form of such questions as "What made it look like that to you?" "What helped you see it that way?" or "What about the blot suggested that to you?" In each of these aspects of the Inquiry Phase, examiners should strive as much as possible to eliminate ambiguity concerning the what, where, and why of a response, because such ambiguities in responses are the main source of uncertainty in deciding how to code them. As these nondirective questions and statements illustrate, a paramount principle of conducting a Rorschach inquiry is to avoid leading the examinee or providing clues to what may be expected or desired. For example, "Are the people doing anything?" and "Did the color help you see it that way?" are inappropriate questions,
because they can convey that movement and color are important for the person to note. Such messages can influence individuals to articulate more movement or color determinants during the course of an inquiry than they would have otherwise. As a similar precaution against conveying unintended messages, examiners should avoid the question "Anything else?" Asking "Anything else?" can suggest that more is expected from the person, or that something has been left out, either of which can lead individuals to say more than they would have otherwise and thereby detract from the standardization of the administration. A second guiding principle in conducting the Inquiry Phase concerns its basic purpose, which is to enable accurate coding of the response. With this principle in mind, examiners should stop inquiring about a response once they have obtained enough information to code it. For example, "Two people standing there" is clearly a human movement response that, as indicated in the next section, warrants coding an M. It is neither necessary nor appropriate to ask, "What makes it look like two people standing there?" The additional question in this instance would not generate any information necessary to code an M. Asking such unnecessary questions violates CS standardization and may have the unwanted consequence of eliciting response elaborations that, however interesting, would not have occurred if standard procedures had been followed. Should a person report, "Two funny-looking people picking up a
basket," there is no need to inquire about the human movement, but two other inquiry questions would be called for: "What suggests that the people are funny-looking?" and "What helped you see this part as a basket?" The first question illustrates the importance of inquiring about key words in an individual's responses, particularly nouns, adjectives, verbs, and adverbs that give responses a potentially distinctive flavor. Consider the following examples, with the key words shown in italics: "Two witches dancing" [Inquiry: What suggests they are witches?]; "Two old people dancing" [Inquiry: What makes them look like old people?]; "Two people arguing or fighting" [Inquiry: What helps you see them as arguing or fighting?]; "Two people walking along slowly" [Inquiry: What gave you the idea that they're walking slowly?]. The second question illustrates the importance of inquiring about each part of a complex response. Thus "Animals climbing a tree" requires clarifying the where and the why for both the animals and the tree, "A jet plane with exhaust coming out the back" must be inquired sufficiently to code both the plane and the exhaust, and so on.

CODING AND SCORING

The scoring of a Rorschach protocol is a two-step process. The first step consists of assigning each response a set of codes that identify various features of how the response has been formulated and expressed. The second step consists of combining these response
This guideline does not preclude person-specific features of card pull that may influence a person's behavior or responses on Card IX. The popular human figures may in some instances pull an impression that they are fighting, in which case Card IX could arouse some concerns about aggression. Similarly, the resemblance of the lower middle red detail of Card IX to female genitals could evoke some sexual concerns that affect a person's manner and responses while looking at this card. Neither of these possible Card IX pulls is as strong or common as the other card pulls identified in this section.

Card X

The broken appearance of Card X and its array of loosely connected but rather sharply defined and colored details give it a close structural resemblance to Card VIII. At the same time, the sheer number of variegated shapes and colors on Card X imbues it with the same type of uncertainty and complexity posed by Card IX. Although Card X is usually seen as a pleasant stimulus and offers examinees many alternative possibilities for easily seen percepts, the challenge of organizing it effectively makes it the second most difficult card to manage, after Card IX. Particularly for people who feel overwhelmed or overburdened by having to deal with many things at once, responding to Card
X, despite its pleasant appearance and bright colors, may be a disconcerting experience that they dislike and are happy to complete. Finally of note is the position of Card X as the final card. Just as the initial response in a record may be a way for people to sign in and introduce what they feel is important about themselves, the last response may serve as an opportunity to sign out by indicating, in effect, "When all is said and done, this is where things stand for me and what I want you to know about me." As a parallel to the example given earlier of a sign-in response, consider the contrasting implications of the following responses for the present status of two depressed persons. The first one concluded Card X by saying, "And it looks like everything is falling apart"; the second one concluded, "And it's brightly colored, like the sun is coming up."

APPLICATIONS

In common with the self-report inventories presented in Chapters 6 through 10, the RIM is an omnibus personality assessment instrument, in the sense that it provides information about a broad range of personality characteristics. As elaborated in discussing the interpretive significance of Rorschach findings, these data shed light on the adequacy of a person's adaptive capacities in several key respects, on the types of psychological states and traits that define what the person is like, and on the underlying needs, attitudes, conflicts, and
concerns that may be influencing the person's behavior. Such information about personality functioning serves practical purposes by helping to identify (a) the presence and nature of psychological disorder, (b) whether a person needs and is likely to benefit from various kinds of treatment, and (c) the probability of a person's functioning effectively in certain kinds of situations. By serving these purposes, the RIM frequently facilitates making decisions that are based in part on personality characteristics. Such personality-based decisions commonly characterize the practice of clinical, forensic, and organizational psychology, the three contexts in which Rorschach assessment finds its most frequent applications.

Clinical Practice

Rorschach assessment contributes to clinical practice by assisting in differential diagnosis, treatment planning, and outcome evaluation. With respect to differential diagnosis, many states and traits identified by Rorschach variables are associated with particular forms of psychopathology. Schizophrenia is usually defined to include disordered thinking and poor reality testing, and Rorschach evidence of these cognitive impairments (low XA%
and WDA%, an elevated WSum6) accordingly indicates the likelihood of a schizophrenia spectrum disorder. Similarly, because paranoia involves being hypervigilant and interpersonally aversive, a positive HVI suggests the presence of paranoid features in how people look at their world. Depressive disorder is suggested by Rorschach indices of dysphoria (elevated C', Col-Shd Blds) and negative self-attitudes (elevated V, low 3r+(2)/R), obsessive-compulsive personality disorder is suggested by indices of pedantry and perfectionism (positive OBS), and so on. To learn more about these and other applications of Rorschach findings in differential diagnosis, readers are referred to articles and books by Hartmann, Nørbech, and Grønnerød (2006), Huprich (2006), Kleiger (1999), and Weiner (2003b). The applications to which the RIM contributes by measuring personality characteristics identify its limitations as well. In assessing psychopathology, Rorschach data are of little use in determining the particular symptoms a person is manifesting. Someone with Rorschach indications of an obsessive-compulsive personality style may be a compulsive hand washer, an obsessive prognosticator, or neither. Someone with depressive preoccupations may be having crying spells, disturbed sleep, or neither. There is no isomorphic relationship between the personality characteristics of disturbed people and their specific symptoms. Accordingly, the nature of these symptoms is better determined from observing or asking directly about them than by speculating about their presence on the basis of
Rorschach data. Likewise, Rorschach data do not provide dependable indications concerning whether a person has had certain life experiences (e.g., been sexually abused) or behaved in certain ways (e.g., abused alcohol or drugs). Only when there is a substantial known correlation between specific personality characteristics and the likelihood of certain experiences or behavior having occurred can Rorschach findings provide reliable postdictions, as mentioned. The predictive validity of Rorschach findings is similarly limited by the extent to which personality factors determine whatever is to be identified or predicted. As for treatment planning, Rorschach findings measure personality characteristics that have a bearing on numerous decisions that must be made prior to and during an intervention process. The degree of disturbance or coping incapacity reflected in Rorschach responses assists in determining whether a person requires inpatient care or is functioning sufficiently well to be treated as an outpatient. Considered together with the person's preferences, the personality style and severity of distress or disorganization revealed by Rorschach findings help indicate whether treatment needs will best be met by a supportive approach oriented to relieving distress, a cognitive-behavioral approach designed to modify symptoms or behavior, or an exploratory approach intended to enhance self-understanding. Whichever treatment approach is implemented, the maladaptive personality
traits and the underlying concerns identified by the Rorschach data can help therapists determine, in consultation with their patients, what the goals for the treatment should be and in what order these treatment targets should be addressed (see Weiner, 2005b). Some predictive utility derives from the fact that certain personality characteristics measured by Rorschach variables are typically associated with ability to participate in and benefit from psychological treatment. These personality characteristics include being open to experience (Lambda not elevated), cognitively flexible (balanced a:p), emotionally responsive (adequate WSumC and Afr), interpersonally receptive (presence of T, adequate SumH), and personally introspective (presence of FD), each of which facilitates engagement and progress in psychotherapy. By contrast, having an avoidant or guarded approach to experience, being set in one's ways, having difficulty recognizing and expressing one's feelings, being interpersonally aversive or withdrawn, and lacking psychological mindedness are often obstacles to progress in psychotherapy (Clarkin & Levy, 2004; Weiner, 1998, chap. 2).
In a research project relevant to the utility of the RIM in guiding therapist activity once treatment is underway, Blatt and Ford (1994) used Rorschach variables to assist in categorizing patients as having problems primarily with forming satisfying interpersonal relationships (called anaclitic) or primarily with maintaining their own sense of identity, autonomy, and self-worth (called introjective). In the course of their subsequent psychotherapy, the anaclitic patients studied by Blatt and Ford were initially more involved in and responsive to relational aspects of the treatment than the introjective patients, who were more attuned to and influenced by their therapist's interpretations than by attention to the treatment relationship. By helping to identify treatment goals and targets, Rorschach assessment can also be helpful in monitoring treatment progress and evaluating treatment outcome. Suppose that a RIM is administered prior to beginning therapy and certain treatment targets can be identified in Rorschach terms (e.g., reducing subjectively felt distress, as in changing D < 0 to D = 0; increasing receptivity to emotional arousal, as in bringing up a low Afr; promoting more careful problem solving, as in reducing a Zd < -3.5). Retesting after some period of time can then provide quantitative indications of how much progress has been made toward achieving these goals and how much work remains to be done on them. Rorschach evidence concerning the extent to which the goals of
the treatment have been achieved can guide therapists in deciding if and when termination is indicated. Similarly, comparing Rorschach findings at the point of termination or in a later follow-up evaluation with those obtained in a pretreatment evaluation will provide a useful objective measure of the effects of the treatment, for better or worse. Both research findings and case reports have demonstrated how Rorschach assessment can be applied in treatment outcome evaluation. In studies reported by Weiner and Exner (1991) and Exner and Andronikof-Sanglade (1992), patients in long-term, short-term, and brief psychotherapy were examined at several points during and after their treatment. The data analysis focused on 27 structural variables considered to have implications for a person's overall level of adjustment. The results of both studies showed significant positive changes in these Rorschach variables over the course of therapy, consistent with expectation, and the amount of improvement was associated with the length of the therapy. These findings were considered to demonstrate both the effectiveness of psychotherapy in promoting positive personality change and the validity of the RIM in measuring such change.
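The pretreatment-versus-retest comparison described above amounts to simple bookkeeping, which the following sketch illustrates. It is not a CS procedure: the class and function names are hypothetical, the two cutoffs are taken from the examples in the text (D < 0 and Zd < -3.5), and treating "D at or above 0" and "Zd at or above -3.5" as the target conditions is an assumption made here for illustration. No cutoff for a "low" Afr is given in the text, so that variable is omitted rather than guessed.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class TreatmentTarget:
    variable: str                    # Rorschach variable name, e.g. "D"
    met: Callable[[float], bool]     # assumed condition for "target reached"


# Cutoffs come from the text's examples; the direction of each
# comparison is an illustrative assumption.
TARGETS = [
    TreatmentTarget("D", lambda value: value >= 0),      # D < 0: felt distress
    TreatmentTarget("Zd", lambda value: value >= -3.5),  # Zd < -3.5: hasty scanning
]


def progress_report(pre, post, targets=TARGETS):
    """Compare pretreatment and retest scores for each target variable."""
    report = {}
    for target in targets:
        before, after = pre[target.variable], post[target.variable]
        report[target.variable] = {
            "change": after - before,
            "met_before": target.met(before),
            "met_after": target.met(after),
        }
    return report
```

Given hypothetical scores such as `pre = {"D": -2, "Zd": -5.0}` and `post = {"D": 0, "Zd": -4.0}`, the report would show the D target reached and the Zd score improved but still short of its target, which is the kind of "how much work remains" indication the passage describes.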
  • 100. In a study with similar implications, Fowler et al. (2004) monitored the progress of a group of previously treatment-refractory patients who entered a residential treatment center and were engaged in psychodynamically oriented psychotherapy. After a treatment duration averaging 16 months, these patients showed significant improvement in their average behavior ratings on scales related to social and occupational functioning, and these improvements were matched by significant changes for the better in their average scores on three Rorschach scales based on response content. With its thematic imagery as well as its structural variables, then, Rorschach assessment has been shown to provide valid measurement of treatment progress, while helping to demonstrate the effectiveness of the treatment. Readers are referred to Weiner (2004a, 2005a) for additional discussion of Rorschach monitoring of psychotherapy and a detailed case study that illustrates positive Rorschach changes accompanying successful psychotherapy.
Forensic Practice
In the clinical applications just discussed, diagnostic inferences derive from linkages between personality characteristics that typify certain disorders and Rorschach variables that measure these characteristics. In similar fashion, forensic applications of Rorschach assessment
  • 101. in criminal, civil, and family law cases derive from a translation of legal concepts into psychological terms. In criminal law, the two questions most commonly addressed to consulting psychologists concern whether an accused person is competent to proceed to trial and whether the person can or should be held responsible for the alleged criminal behavior. Being competent in this context consists of having a rational and factual understanding of the legal proceedings one is facing and being able to participate effectively in one's own defense. These principal components of competency are commonly translated into specific questions such as (a) whether defendants appreciate the nature of the charges and possible penalties they are facing, (b) whether they understand the adversarial process and the roles of the key people in it, (c) whether they can disclose pertinent facts in their case to their attorney, and (d) whether they are capable of behaving appropriately in the courtroom and testifying relevantly in their own behalf (Stafford, 2003; Zapf & Roesch, 2006). With respect to dimensions of personality functioning, these aspects of competence are most closely related to being able to think logically and coherently and to perceive people and events accurately. Disordered thinking and impaired reality testing, in combination with the poor judgment and inappropriate behavior typically associated with them, can interfere
  • 102. with a person's ability to demonstrate competence. Accordingly, the same Rorschach indices of disordered thinking and impaired reality testing just mentioned in connection with differential diagnosis (low XA%, low WDA%, elevated WSum6), although not sufficient evidence of incompetence, serve two purposes in this regard. They alert the examiner to a distinct likelihood that the defendant will have difficulty satisfying customary criteria for competency to stand trial, and if a defendant appears incompetent with respect to the applicable criteria, these Rorschach findings help the examiner explain to the court why the person is having this difficulty. Criminal responsibility refers in legal terms to whether an accused person was legally sane at the time of committing an alleged offense. In some jurisdictions, insanity is defined as a cognitive incapacity that prevented the accused person from recognizing the criminality of his or her actions or appreciating the wrongfulness of this conduct. Insanity in other jurisdictions is defined either as this type of cognitive incapacity or as a loss of behavioral control, such that the person was unable to alter or refrain from the alleged criminal conduct at the time (Goldstein, Morse, & Shapiro, 2003; Zapf, Golding, & Roesch, 2006).
  • 103. With respect to personality functioning, cognitive incapacity is measured on the RIM by the previously mentioned indices of disordered thinking and poor reality testing. Behavioral dyscontrol is suggested by Rorschach indices of acute and chronic stress overload (minus D-score, minus AdjD-score), which are commonly associated with limited frustration tolerance, intemperate outbursts of affect, and episodes of impulsive behavior. However, because legal sanity is defined by the person's state of mind at the time of an alleged offense, and not at the time of a present examination, Rorschach findings suggesting cognitive impairment or susceptibility to loss of control must be supplemented by other types of information (e.g., observations of defendants' behavior by witnesses to their alleged offense and by the law enforcement officers who arrested them) to serve adequately as a basis for drawing conclusions about criminal responsibility. In civil law cases involving allegations of personal injury, personality assessment helps to determine the extent to which a person has become emotionally distressed or incapacitated as a consequence of irresponsible behavior on the part of another person or some entity. As prescribed by tort law, this circumstance exists when the potentially liable person or entity has, by omission or commission of certain actions, been derelict in a duty or obligation to the complainant, thereby causing the aggrieved person to experience psychological injury that would otherwise not have occurred (see Greenberg, 2003).
  • 104. Emotional distress caused by the irresponsible actions of others is often likely to be reflected in Rorschach responses, most commonly in indications of generalized anxiety, stress disorder, depressive affect and cognitions, and psychotic loss of touch with reality. Persons with Posttraumatic Stress Disorder tend to produce one of two types of Rorschach protocols. Those whose disorder is manifest primarily in the reexperiencing of distressing events and mental and physical hyperarousal tend to produce a flooded protocol that is notable for the incursions of anxiety on comfortable and effective functioning. The implications of the minus D-score and minus AdjD-score for stress overload can be particularly helpful in identifying such incursions, as can a high frequency of content codes suggesting concerns about bodily harm (e.g., AG, An, Bl, MOR, Sx; see Armstrong & Kaser-Boyd, 2004; Kelly, 1999; Luxenberg & Levin, 2004). Those anxious or traumatized persons whose disorder is manifest primarily in efforts to avoid or withdraw from thoughts, feelings, or situations that might precipitate psychological distress tend to produce a constricted Rorschach protocol that is notably guarded or evasive. Such hallmarks of a guarded record as a low R, high Lambda, low WSumC, and D = 0 tend to increase the likelihood that a person who has been exposed to a potentially traumatizing experience is experiencing a stress disorder characterized by defensive avoidance.
  • 105. However, neither flooded nor constricted Rorschach protocols are specific to anxiety and stress disorder, nor do they provide conclusive evidence that such a disorder is present. Given historical and other clinical or test data to suggest such a disorder, they merely increase its likelihood. Moreover, as in the case of evaluating sanity, the results of a present personal injury examination are useful only if they can be interpreted in the context of past events. Personal injury cases require examiners to determine whether any currently observed distress predated the alleged misconduct by the defendant and whether this distress constitutes a decline in functioning capacity from some previously higher level prior to when the misconduct occurred. Similar considerations apply in the assessment of depressive or psychotic features in plaintiffs seeking personal injury damages. As noted, the DEPI and its several components are helpful in identifying the presence of dysphoric affect and negative cognitions, but they do not provide a dependable basis for ruling out these features of depression. A psychotic impairment of reality testing is indicated by a low XA% and low WDA%, and psychosis can usually be ruled out if these variables fall within a normal range. Lack of evidence of psychosis would counter a plaintiff's claim to have suffered
  • 106. psychological injury, but present indications of psychosis would give little support to such a claim unless other reliable data (e.g., previous testing, historical indications of sound mental health) gave good reason to believe that this person was not psychotic prior to the alleged harmful conduct by the defendant. Personality assessment also enters into family law cases, in the context of disputed child custody and visitation rights. In determining how a child's time and supervision should be divided between separated or divorced parents, judges frequently make their determination partly on the basis of information about the personality characteristics of the child and the parents. Similarly, in deciding whether persons should have their parental rights terminated, courts often seek information about their personality strengths and weaknesses as identified by a psychological examination. There are no infallible guidelines concerning which of two persons would be the better parent for a particular child, nor is there any perfect measure of suitability to parent. However, certain personality characteristics as measured by the RIM are likely to enhance or detract from parents' abilities to meet the needs of their children. These characteristics pertain to the presence or absence of serious psychological disturbance, the adequacy of the person's coping skills, and the person's degree of interpersonal accessibility. Although having a psychological disorder does not necessarily
  • 107. prevent a person from being a good parent, being seriously disturbed or psychologically incapacitated is likely to interfere with a person's having sufficient judgment, impulse control, energy, and peace of mind to function effectively in a parental capacity. As indicated in presenting interpretive guidelines for the RIM and as previously mentioned in this section on applications, several Rorschach variables help identify such serious disturbance. These include indices of significant thinking disorder and substantially impaired reality testing (elevated PTI), pervasive dysphoria and negative cognitions (elevated DEPI), overwhelming anxiety (a large minus D-score), and marked suicide potential (elevated S-CON). As for coping skills, good parenting is facilitated by capacities for good judgment, careful decision making, a flexible approach to solving problems, and effective stress management. Conversely, poor judgment, careless decision making, inflexible problem solving, and inability to manage stress without becoming unduly upset are likely to interfere with effective parenting. Rorschach findings often cast light on the adequacy of a person's skills in each of these respects, as noted in discussing interpretive guidelines: XA% with respect to judgment; Zd with respect to decision making; a:p with respect to problem-solving approach; and D-score with respect to stress management. This is by no means a definitive or exhaustive list of coping skills relevant to quality of parenting or of Rorschach variables that might prove helpful in evaluating parental
  • 108. suitability. The list nevertheless illustrates important respects in which Rorschach assessment can be applied in family law consultation. Finally, with respect to interpersonal accessibility, the quality of child care that parents can provide is usually enhanced by their being a person who is interested in people and comfortable being around them, a person who is nurturing and caring in his or her relationships with others, and a person who is sufficiently empathic to understand what other people are like and recognize their needs and concerns. Conversely, interpersonal disinterest and discomfort are likely to detract from parental effectiveness, as is being a detached, self-absorbed, or insensitive person. In Rorschach terms, then, the likelihood of a person's being a good parent is measured in part by the interpersonal cluster of variables discussed earlier, which means that good parenting is often, though not always, associated with the following seven Rorschach findings:
1. SumH > 3
2. H > Hd + (H) + (Hd)
3. ISOL < .25
4. p < a + 2
5. T > 0
6. COP > 1
7. Accurate M > 2 and M- < 2
  • 109. In drawing these inferences about interpersonal accessibility, examiners must always keep in mind that such Rorschach findings may suggest how parents are likely to interact with their children, but they are never conclusive. The test data identify probable parental strengths or limitations in interpersonal accessibility that should be considered as evaluators proceed to observe and obtain reports of how parents are functioning. Integration of Rorschach indications of adjustment level and coping skills with these behavioral observations and reports should always precede coming to conclusions about a person's effectiveness as a parent. Further elaboration of these and other substantive guidelines in forensic Rorschach assessment is provided by Erard (2005), Gacono, Evans, Kaser-Boyd, and Gacono (in press), Johnston, Walters, and Olesen (2005), and Weiner (2005a, 2006, 2007, in press). Whatever the nature of a forensic case, attention must be paid not only to the substantive interpretation of Rorschach findings, but also to whether testimony based on these findings is admissible into evidence in courtroom proceedings. Applicable criteria for admissibility vary, depending on the particular federal or state jurisdiction in
  • 110. which a case is being tried, and judges have considerable discretion in determining what types of testimony are allowed. As established by published guidelines and case law, the criteria used in individual cases involve some combination of the following considerations: whether the testimony is relevant to the issues in the case and will help the judge or jury arrive at their decision (Federal Rules of Evidence); whether the testimony is based on generally accepted methods and procedures in the expert's field (Frye standard); and whether the testimony is derived from scientifically sound methods and procedures (Daubert standard; see Ewing, 2003; Hess, 2006). The RIM satisfies criteria for admissibility in all three of these respects. The usefulness of Rorschach-based testimony in facilitating legal decisions is demonstrated by the frequency with which this testimony is in fact welcomed in the courtroom. In a survey of almost 8,000 cases in which forensic psychologists offered the court Rorschach-based testimony, the appropriateness of the instrument was challenged in only six instances, and in only one of these cases was the testimony ruled inadmissible (Weiner, Exner, & Sciara, 1996). Among the full set of 247 cases in which Rorschach evidence was presented to a federal, state, or military court of appeals during the half-century from 1945 to
  • 111. 1995, the admissibility and weight of the Rorschach data were questioned in only 10.5% of the hearings. The relevance and utility of Rorschach assessment was challenged in only two of these appellate cases, and the remaining criticisms of the Rorschach testimony were directed at the interpretation of the data, not the method itself (Meloy, Hansen, & Weiner, 1997). More recently Meloy (in press) has examined the full set of 150 published cases in which Rorschach findings were cited in federal, state, and military appellate court proceedings during the 10-year period from 1996 to 2005. These 150 cases over a 10-year period indicate an average of 15 Rorschach citations per year in appellate cases, which is three times the annual rate of citation found by Meloy et al. (1997) for the preceding 50 years. Along with this greatly increased use of the RIM in appellate courts, the percentage of cases in which these courts recorded criticisms of Rorschach testimony decreased from 10.5% during 1945 to 1995 to just 2% during 1996 to 2005. In not one of these 1996 to 2005 appellate cases was the Rorschach method ridiculed or disparaged by opposing counsel. The general acceptance of the Rorschach method is reflected in data concerning how frequently it is used, taught, and studied. Surveys over the past 40 years have consistently shown substantial endorsement of Rorschach testing as a valuable skill to teach, learn, and
  • 112. practice. Among clinical psychologists, the RIM has been the fourth most widely used test, exceeded in frequency of use only by the Wechsler Adult Intelligence Scale (WAIS), the Minnesota Multiphasic Personality Inventory (MMPI), and the Wechsler Intelligence Scale for Children (WISC), in that order (Hogan, 2005). Surveys also indicate that over 80% of clinical psychologists engaged in providing assessment services use the RIM in their work and believe that clinical students should be competent in Rorschach assessment; that over 80% of graduate programs teach the RIM; and that students usually find this training helpful in improving their assessment skills and their understanding of the patients and clients with whom they work (see Camara, Nathan, & Puente, 2000; Viglione & Hilsenroth, 2001). With respect to assessment of young people, 162 child and adolescent practitioners surveyed by Cashel (2002) reported that the RIM was their third most frequently used personality assessment measure, following sentence completion and figure drawing methods. Among 346 psychologists working with adolescents in clinical and academic settings, Archer and Newsom (2000) found the RIM to be their most frequently used personality test and second among all tests only to the Wechsler scales. Surveys of training directors in predoctoral internship sites have also identified widespread endorsement of the value of Rorschach testing. Training directors report that the RIM is one of the three measures most frequently used in
  • 113. their test batteries (along with the WAIS/WISC and the MMPI-2/MMPI-A), and they commonly express the hope or expectation that their incoming interns will have had a Rorschach course or at least arrive with a good working knowledge of the instrument (Clemence & Handler, 2001; Stedman, Hatch, & Schoenfeld, 2000). Survey findings confirm that Rorschach assessment has gained an established place in forensic as well as clinical practice. Data collected from forensic psychologists by Ackerman and Ackerman (1997), Boccaccini and Brodsky (1999), Borum and Grisso (1995), and Quinnell and Bow (2001) showed 30% using the RIM in evaluations of competency to stand trial, 32% in evaluations of criminal responsibility, 41% in evaluations of personal injury, 44% to 48% in evaluations of adults involved in custody disputes, and 23% in evaluations of children in custody cases. Consistent with these earlier surveys, a more recent report by Archer, Buffington-Vollum, Stredny, and Handel (2006) indicated Rorschach usage for all purposes combined by 36% of the forensic psychologists participating in their survey. As for study of the instrument, the scientific status of the RIM
  • 114. has been attested over many years by a steady and substantial volume of published research concerning its nature and utility. Buros (1974) Tests in Print II identified 4,580 Rorschach references through 1971, with an average yearly rate of 92 publications. In the 1990s, Butcher and Rouse (1996) found an almost identical trend continuing from 1974 to 1994. An average of 96 Rorschach research articles appeared annually during this 20-year period in journals published in the United States, and the RIM was second only to the MMPI among personality assessment instruments in the volume of research it generated. For the 3-year period 2004 to 2006, PsycINFO lists 350 scientific articles, books, book chapters, and dissertations worldwide concerning Rorschach assessment. There is in fact a large international community of Rorschach scholars and practitioners whose research published abroad has for many years made important contributions to the literature (see Weiner, 1999). The international presence of Rorschach assessment is reflected in a survey of test use in Spain, Portugal, and Latin American countries by Muniz, Prieto, Almeida, and Bartram (1999) in which the RIM emerged as the third most widely used psychological assessment instrument, following the Wechsler scales and versions of the MMPI. The results of surveys in Japan, as reported by Ogawa (2004), indicate that about 60% of Japanese clinical psychologists use the RIM in their daily practice. An International Rorschach Society was founded in 1952, and triennial
  • 115. congresses sponsored by this society typically attract participants from over 30 countries and all parts of the world. With respect to the scientific soundness of Rorschach assessment, the final section of this chapter reviews extensive research findings that document the adequate intercoder agreement and retest reliability of the instrument, its validity when used appropriately for its intended purposes, and the availability of normative reference data for representative samples of children and adults. Significantly in this regard, Meloy (2007) reported in his previously mentioned review, "There has been no Daubert challenge to the scientific status of the Rorschach in any state, federal, or military court of appeal since the U.S. Supreme Court decision in 1993 set the federal standard for admissibility of scientific evidence" (p. 85). Despite widespread dissemination of this information, some authors have contended that Rorschach assessment does not satisfy contemporary criteria for admissibility into evidence and have discouraged forensic examiners from using the RIM, even to the point of calling for a moratorium on its use in forensic settings (Garb, 1999; Grove & Barden, 1999). These Rorschach critics have not presented any data to refute previous surveys in this regard or to support their contention that the RIM is unwelcome in the courtroom. The ways in which Rorschach assessment has been demonstrated to assist in forensic decision
  • 116. making are amplified further in contributions by McCann (1998, 2004), McCann and Evans (in press), Ritzler, Erard, and Pettigrew (2002), and Hilsenroth and Stricker (2004).
Organizational Practice
Rorschach assessment in organizational practice is concerned primarily with the selection and evaluation of personnel. Personnel selection typically consists of determining whether a person applying for a position in an organization is suitable to fill it, or whether a person already in the organization is qualified for promotion to a position of increased responsibility. Standard psychological procedure in making such selection decisions consists of first identifying the personality requirements for success in the position being applied or aspired to, and then determining the extent to which a candidate shows these personality characteristics. A leadership position requiring initiative and rapid decision making would probably not be filled well by a person who is behaviorally passive and given to painstaking care in coming to conclusions, as would be suggested by Rorschach findings of p > a + 1 and Zd > +3.0. A position in sales or public relations that calls for
  • 117. extensive and persuasive interaction with people is unlikely to be a good fit for a person who is emotionally withdrawn and socially uncomfortable, as would be suggested by a low Afr and H < Hd + (H) + (Hd). Among persons being considered for hire as an air traffic controller or a nuclear power plant supervisor, it would support their candidacy to find evidence on personality testing of good coping capacities and the ability to remain calm and exercise good judgment even in highly stressful situations (in Rorschach terms, a person with a high EA, D ≥ 0, and XA% in the normal range). Personnel evaluations may also involve assessing the current fitness for duty of persons whose ability to function has become impaired by psychological disorder. Most common in this regard is the onset of an anxiety or depressive disorder that prevents people from continuing to perform their job or practice their profession as competently as they had previously. Impaired professionals seen for psychological evaluation may also have had difficulties related to abuse of alcohol, drugs, or prescription medicine. Because Rorschach data can help identify the extent to which people are anxious or depressed and whether they are struggling with more stress than they can manage, the RIM can often contribute to determining fitness for duty and assessing progress toward recovery in persons participating in a treatment or rehabilitation program. Violence in the workplace has also given rise in recent years to
  • 118. frequent referrals for fitness-for-duty evaluations, usually in the wake of an employee's making verbal threats or acting aggressively on the job. Estimation of violence potential is a complex process that requires careful consideration of an individual's personality characteristics, interpersonal and sociocultural context, and previous history of violent behavior (Monahan, 2003). Personality characteristics do not by themselves provide sufficient basis for concluding that someone poses a danger to the safety and welfare of others. However, there is reason to believe that certain personality characteristics increase the likelihood of violent behavior in persons who have behaved violently in the past and are currently confronting annoying or threatening situations that on previous occasions were likely to provoke aggressive reactions on their part. Following is a list of personality characteristics and Rorschach findings identified earlier in the chapter that help identify them (see also Gacono, 2000; Gacono & Meloy, 1994).
1. Being a selfish and self-centered person with a callous disregard for the rights and feelings of other people and a sense of entitlement to do and have whatever one wants (e.g., Fr + rF > 0 and 3r+(2)/R elevated).
2. Being a psychologically distant person who is generally mistrustful of others, avoids intimate relationships, and either ignores people or exploits them to one's own ends (e.g., HVI, T = 0, low SumH, COP = 0 with AG > 2).
3. Being an angry and action-oriented person inclined to express this anger directly (e.g., S > 3, a > p, extratensive EB).
4. Being an impulsive person with little tolerance for frustration, or a psychologically disturbed person with impaired reality testing and poor judgment (e.g., D < 0, AdjD < 0, XA% and WDA% low).
  • 119. Neither these personality characteristics nor the Rorschach variables associated with them are specific to persons who show violent behavior. Even among people who exhibit all these characteristics and Rorschach findings, moreover, many or most may never consider physically assaulting another person. However, in persons with a history of violent behavior who are exposed to violence-provoking circumstances, each of these characteristics and findings increases the risk of violence. The more numerous these characteristics and findings, and the more pronounced they are, the greater is the violence risk they suggest.
PSYCHOMETRIC FOUNDATIONS
As mentioned in discussing the history of Rorschach assessment, the blossoming of various Rorschach systems in the United States and abroad enriched
  • 120. the instrument for clinical purposes, but at a cost to its scientific development. The many Rorschach variations created by gifted and respected clinicians limited cumulative research on the psychometric properties of the instrument prior to Exner's 1974 standardization of coding and administration procedures in the Comprehensive System (CS). Subsequent widespread use of the CS in research and practice has fostered substantial advances in knowledge concerning the psychometric soundness of the RIM, particularly with respect to its intercoder agreement, retest reliability, validity, and normative reference base.
Intercoder Agreement
In constructing the Rorschach CS, Exner included only variables on which his coders could achieve at least 80% agreement, and subsequent research confirmed that the CS variables can be reliably coded with at least this level of agreement. However, measuring intercoder reliability by percentage of agreement is a questionable procedure, because this method does not take account of agreement occurring by chance. With this consideration in mind, Rorschach researchers began in the late 1990s to assess intercoder reliability with two statistics that correct for chance agreements, kappa and intraclass correlation coefficients
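The difference between raw percentage agreement and a chance-corrected statistic can be illustrated with Cohen's kappa, computed as (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance from each coder's category frequencies. The following is a minimal sketch, assuming two coders who each assign one categorical code per response; the function names and the ten example codes are invented for illustration and do not come from any Rorschach data set.

```python
from collections import Counter
from typing import Sequence


def percent_agreement(codes_a: Sequence[str], codes_b: Sequence[str]) -> float:
    """Proportion of responses on which the two coders assigned the same code."""
    if len(codes_a) != len(codes_b) or len(codes_a) == 0:
        raise ValueError("Both coders must code the same non-empty set of responses.")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


def cohens_kappa(codes_a: Sequence[str], codes_b: Sequence[str]) -> float:
    """Cohen's kappa: observed agreement corrected for agreement expected by chance."""
    n = len(codes_a)
    p_observed = percent_agreement(codes_a, codes_b)
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    # Chance agreement: product of the two coders' marginal proportions, summed over codes.
    categories = set(freq_a) | set(freq_b)
    p_chance = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)
    if p_chance == 1.0:
        raise ValueError("Kappa is undefined when both coders use a single identical code.")
    return (p_observed - p_chance) / (1 - p_chance)


if __name__ == "__main__":
    # Invented example: whether a code is present or absent on each of 10 responses.
    coder_a = ["absent"] * 8 + ["present"] * 2
    coder_b = ["absent"] * 7 + ["present", "absent", "present"]
    print(f"percent agreement: {percent_agreement(coder_a, coder_b):.3f}")
    print(f"Cohen's kappa:     {cohens_kappa(coder_a, coder_b):.3f}")
```

In this invented example the coders agree on 8 of 10 responses (80%), yet because both use "absent" most of the time, chance alone would produce 68% agreement, and kappa falls to about .375. This is the sense in which percentage agreement can overstate reliability for infrequently coded variables.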
  • 121. Major Rorschach indices of psychological disturbance include the X-% (an index of impaired reality testing) and the WSum6 (an index of disordered thinking). If X-% and WSum6 are valid measures of disturbance, they should increase in linear fashion across these four reference groups, and they do, as shown by the Exner (2001, chap. 11) reference data. A second example of construct validity demonstrated by the normative reference data concerns developmental changes in young people. The previously noted increasing stability of Rorschach structural variables from childhood into adolescence, consistent with the gradual consolidation of personality characteristics, is a case in point. Among specific changes occurring with maturation, young people are known to become less self-centered (less egocentric) and increasingly capable of moderating their affect (less emotionally intense). The RIM Egocentricity Index is conceptualized as a measure of self-centeredness, and the balance between presumed indices of relatively mature emotionality (FC) and relatively immature emotionality (CF) is conceptualized as an indication of affect moderation. If these variables are valid measures of what they are posited to measure, their average values should change in the expected direction among children and adolescents at different ages, and they do. In the CS reference data, the mean Egocentricity Index of .67 at age 6 decreases in almost linear fashion to .43 at age 16, which is
  • 122. just slightly higher than the adult mean of .40. The mean for FC increases steadily over time from 1.11 at age 6 to 3.43 at age 16 (compared with an adult mean of 3.56), while the mean for CF decreases from 3.51 to 2.78 between age 6 and 16 (the adult mean is 2.41). The present chapter has been concerned mainly with the Rorschach assessment of adults and older adolescents. In closing the chapter, it is important to note that the RIM can also be used to good effect in evaluating children and early adolescents. Assessors working with young people will profit from consulting Erdberg (2007), Exner and Weiner (1995), and Leichtman (1996) in this regard.
REFERENCES
Ackerman, M. J., & Ackerman, M. C. (1997). Custody evaluations in practice: A survey of experienced professionals (revisited). Professional Psychology, 28, 137-145.
Acklin, M. W., McDowell, C. J., Verschell, M. S., & Chan, D. (2000). Interobserver agreement, intraobserver agreement, and the Rorschach Comprehensive System. Journal of Personality Assessment, 74, 15-57.
Allard, G., & Faust, D. (2000). Errors in scoring objective personality tests. Assessment, 7, 119-129.
Allen, J., & Dana, R. H. (2004). Methodological issues in cross-cultural and multicultural Rorschach research. Journal of Personality Assessment, 82, 189-206.
  • 123. Archer, R. P., Buffington-Vollum, J. K., Stredny, R. V., & Handel, R. W. (2006). A survey of psychological test use patterns among forensic psychologists. Journal of Personality Assessment, 87, 84-94. Archer, R. P., & Newsom, C. R. (2000). Psychological test usage with adolescent clients: Survey update. Assessment, 7, 227-235. Armstrong, J., & Kaser-Boyd, N. (2004). Projective assessment of psychological trauma. In M. J. Hilsenroth & D. L. Segal (Eds.), Comprehensive handbook of psychological assessment: Vol. 2. Personality assessment (pp. 500-512). Hoboken, NJ: Wiley. Aronow, E., & Reznikoff, M. (1976). Rorschach content interpretation. New York: Grune & Stratton. Aronow, E., Reznikoff, M., & Moreland, K. L. (1994). The Rorschach technique. Boston: Allyn & Bacon. Auslander, L. A., Perry, W., & Jeste, D. V. (2002). Assessing disturbed thinking and cognition using the Ego Impairment Index in older schizophrenic patients: Paranoid vs. nonparanoid distinction. Schizophrenia Research, 53, 199-207. Beck, S. J. (1930a). Personality diagnosis by means of the Rorschach test. American Journal of Orthopsychiatry, 1, 81-88.
  • 124. Beck, S. J. (1930b). The Rorschach test and personality diagnosis. American Journal of Psychiatry, 10, 19-52. Beck, S. J. (1937). Introduction to the Rorschach method: American Orthopsychiatric Association Monograph I. New York: American Orthopsychiatric Association. Beck, S. J. (1944). Rorschach's test: Vol. I. Basic processes. New York: Grune & Stratton. Blais, M. A., Hilsenroth, M. J., Castlebury, F., Fowler, J. C., & Baity, M. R. (2001). Predicting DSM-IV Cluster B personality disorder criteria from MMPI-2 and Rorschach data: A test of incremental validity. Journal of Personality Assessment, 76, 150-168. Blatt, S. J., & Ford, R. Q. (1994). Therapeutic change. New York: Plenum Press. Boccaccini, M. T., & Brodsky, S. L. (1999). Diagnostic test usage by forensic psychologists in emotional injury cases. Professional Psychology, 30, 253-259. Bornstein, R. F. (1999). Criterion validity of objective and projective dependency tests: A meta-analytic assessment of behavioral prediction. Psychological Assessment, 11, 48-57. Bornstein, R. F., & Masling, J. M. (2005). The Rorschach Oral Dependency scale. In R. F. Bornstein & J. M. Masling (Eds.), Scoring the Rorschach: Seven validated systems (pp. 135-158). Mahwah,
  • 125. NJ: Erlbaum. Borum, R., & Grisso, T. (1995). Psychological test use in criminal forensic evaluations. Professional Psychology, 26, 465-473. Buros, O. K. (Ed.). (1974). Tests in print II. Highland Park, NJ: Gryphon. Butcher, J. N., & Rouse, S. V. (1996). Personality: Individual differences and clinical assessment. Annual Review of Psychology, 47, 87-111. Camara, W., Nathan, J., & Puente, A. (2000). Psychological test usage: Implications in professional psychology. Professional Psychology, 31, 141-154. Cashel, M. L. (2002). Child and adolescent psychological assessment: Current clinical practices and the impact of managed care. Professional Psychology: Research and Practice, 33, 446-453. Clarkin, J. F., & Levy, K. N. (2004). The influence of client variables on psychotherapy. In M. J. Lambert (Ed.), Bergin and Garfield's handbook of psychotherapy and behavior change (5th ed., pp. 194-226). Hoboken, NJ: Wiley. Clemence, A., & Handler, L. (2001). Psychological assessment on internship: A survey of training directors and their expectations for students. Journal of Personality Assessment, 76, 18-47. Dao, T. K., & Prevatt, F. (2006). A psychometric evaluation of the Rorschach Comprehensive System's Perceptual Thinking Index. Journal of Personality Assessment,
  • 126. 86, 180-189. Elfhag, K., Barkeling, B., Carlsson, A. M., & Rössner, S. (2003). Microstructure of eating behavior associated with Rorschach characteristics in obesity. Journal of Personality Assessment, 81, 40-50. Elfhag, K., Rössner, S., Lindgren, T., Andersson, I., & Carlsson, A. M. (2004). Rorschach personality predictors of weight loss with behavior modification in obesity treatment. Journal of Personality Assessment, 83, 293-305. Ephraim, D. (2000). Culturally relevant research and practice with the Rorschach Comprehensive System. In R. H. Dana (Ed.), Handbook of cross-cultural and multicultural personality assessment (pp. 303-328). Mahwah, NJ: Erlbaum. Erard, R. E. (2005). What the Rorschach can contribute to child custody and parenting time evaluations. Journal of Child Custody, 2, 119-142. Erdberg, P. (2007). Using the Rorschach with children. In S. R. Smith & L. Handler (Eds.), The clinical assessment of children and adolescents (pp. 139-147). Mahwah, NJ: Erlbaum. Erdberg, P., & Shaffer, T. W. (2001, March). International Symposium on Rorschach nonpatient data: Worldwide findings. Symposium conducted at the annual
  • 127. meeting of the Society for Personality Assessment, Philadelphia. Ewing, C. P. (2003). Expert testimony: Law and practice. In I. B. Weiner (Editor-in-Chief) & A. M. Goldstein (Vol. Ed.), Handbook of psychology: Vol. 11. Forensic psychology (pp. 55-66). Hoboken, NJ: Wiley. Exner, J. E., Jr. (1969). The Rorschach systems. New York: Grune & Stratton. Exner, J. E., Jr. (1974). The Rorschach: A comprehensive system. New York: Wiley. Exner, J. E., Jr. (2001). A Rorschach workbook for the comprehensive system (5th ed.). Asheville, NC: Rorschach Workshops. Exner, J. E., Jr. (2003). The Rorschach: A comprehensive system: Vol. 1. Basic foundations and principles of interpretation (4th ed.). Hoboken, NJ: Wiley. Exner, J. E., Jr., & Andronikof-Sanglade, A. (1992). Rorschach changes following brief and short-term therapy. Journal of Personality Assessment, 59, 59-71. Exner, J. E., Jr., Armbruster, G. L., & Viglione, D. (2001). The temporal stability of some Rorschach features. Journal of Personality Assessment, 42, 474-482. Exner, J. E., Jr., & Erdberg, P. (2005). The Rorschach: A comprehensive system: Vol. 2. Advanced interpretation (3rd ed.). Hoboken, NJ: Wiley. Exner, J. E., Jr., Thomas, E. A., & Mason, B. (1985). Children's
  • 128. Rorschachs: Description and prediction. Journal of Personality Assessment, 49, 13-20. Exner, J. E., Jr., & Weiner, I. B. (1995). The Rorschach: A comprehensive system: Vol. 3. Assessment of children and adolescents (2nd ed.). New York: Wiley. Exner, J. E., Jr., & Weiner, I. B. (2003). Rorschach interpretation assistance program, Version 5 (RIAP5). Lutz, FL: Psychological Assessment Resources. Fowler, J. C., Ackerman, S. J., Speanburg, S., Bailey, A., Blagys, M., & Conklin, A. C. (2004). Personality and symptom change in treatment-refractory inpatients: Evaluation of the phase model of change using Rorschach, TAT, and DSM-IV Axis V. Journal of Personality Assessment, 83, 306-322. Fowler, J. C., Brunnschweiler, B., Swales, S., & Brock, J. (2005). Assessment of Rorschach dependency measures in female inpatients diagnosed with borderline disorder. Journal of Personality Assessment, 85, 146-153. Fowler, J. C., & Erdberg, P. (2005). The Mutuality of Autonomy scale: An implicit measure of object relations for the Rorschach Inkblot Method. South African Rorschach Journal, 2, 3-10. Fowler, J. C., Piers, C., Hilsenroth, M. J., Holdwick, D. J., Jr., & Padawer, J. R. (2001). The Rorschach Suicide Constellation: Assessing various degrees of lethality. Journal of Personality Assessment, 76, 333-351.
  • 129. Gacono, C. B. (Ed.). (2000). The clinical and forensic assessment of psychopathy. Mahwah, NJ: Erlbaum. Gacono, C. B., Evans, F. B., Kaser-Boyd, N., & Gacono, L. (Eds.). (in press). Handbook of forensic Rorschach psychology. Mahwah, NJ: Erlbaum. Gacono, C. B., & Meloy, J. R. (1994). Rorschach assessment of aggressive and psychopathic personalities. Hillsdale, NJ: Erlbaum. Ganellen, R. J. (2005). Rorschach contributions to assessment of suicide risk. In R. I. Yufit & D. Lester (Eds.), Assessment, treatment, and prevention of suicidal behavior (pp. 93-119). Hoboken, NJ: Wiley. Garb, H. N. (1999). Call for a moratorium on the use of the Rorschach Inkblot Test in clinical and forensic settings. Assessment, 6, 311-318. Garb, H. N., Wood, J. M., Nezworski, M. T., Grove, W. M., & Stejskal, W. J. (2001). Toward a resolution of the Rorschach controversy. Psychological Assessment, 13, 433-448. Goldstein, A. M., Morse, S. J., & Shapiro, D. L. (2003). Evaluation of criminal responsibility. In I. B. Weiner (Editor-in-Chief) & A. M. Goldstein (Vol. Ed.), Handbook of psychology: Vol. 11. Forensic psychology (pp. 381-406). Hoboken, NJ: Wiley.
  • 130. Greenberg, S. A. (2003). Personal injury examinations in torts for emotional distress. In I. B. Weiner (Editor-in-Chief) & A. M. Goldstein (Vol. Ed.), Handbook of psychology: Vol. 11. Forensic psychology (pp. 233-257). Hoboken, NJ: Wiley. Greenway, P., & Milne, L. C. (2001). Rorschach tolerance and control of stress measures D and AdjD: Beliefs about how well subjective states and reactions can be controlled. European Journal of Psychological Assessment, 17, 137-144. Grønnerød, C. (2003). Temporal stability in the Rorschach method: A meta-analytic review. Journal of Personality Assessment, 80, 272-293. Grønnerød, C. (2006). Reanalysis of the Grønnerød (2003) Rorschach temporal stability meta-analysis set. Journal of Personality Assessment, 86, 222-225. Grove, W. M., & Barden, R. C. (1999). Protecting the integrity of the legal system: The admissibility of testimony from mental health experts under Daubert/Kumho analysis. Psychology, Public Policy, and Law, 5, 224-242. Guarnaccia, V., Dill, C. A., Sabatino, S., & Southwick, S. (2001). Scoring accuracy using the Comprehensive System for the Rorschach. Journal of Personality Assessment, 77, 464-474. Hamel, M., Shaffer, T. W., & Erdberg, P. (2000). A study of nonpatient preadolescent Rorschach protocols. Journal of Personality Assessment, 75, 280-294.
  • 131. Handler, L., & Clemence, A. J. (2005). The Rorschach Prognostic Rating scale. In R. F. Bornstein & J. M. Masling (Eds.), Scoring the Rorschach: Seven validated systems (pp. 1-24). Mahwah, NJ: Erlbaum. Hartmann, E., Nørbech, P. B., & Grønnerød, C. (2006). Psychopathic and nonpsychopathic violent offenders on the Rorschach: Discriminative features and comparisons with schizophrenic inpatient and university student samples. Journal of Personality Assessment, 86, 291-305. Hartmann, E., Sunde, T., Kristensen, W., & Martinussen, M. (2003). Psychological measures as predictors of military training performance. Journal of Personality Assessment, 80, 87-98. Hess, A. K. (2006). Serving as an expert witness. In I. B. Weiner & A. K. Hess (Eds.), Handbook of forensic psychology (3rd ed., pp. 652-700). Hoboken, NJ: Wiley. Hiller, J. B., Rosenthal, R., Bornstein, R. F., Berry, D. T. R., & Brunell-Neuleib, S. (1999). A comparative meta-analysis of Rorschach validity. Psychological Assessment, 11, 278-296. Hilsenroth, M. J., & Stricker, G. (2004). A consideration of attacks upon psychological assessment instruments used in forensic settings: Rorschach as exemplar. Journal of Personality Assessment, 83, 141-152. Hogan, T. P. (2005). 50 widely used psychological tests. In G. P. Koocher, J. C. Norcross, & S. S. Hill
  • 132. III (Eds.), Psychologists' desk reference (2nd ed., pp. 101-104). New York: Oxford University Press. Holt, R. R. (2005). The Pripro scoring system. In R. F. Bornstein & J. M. Masling (Eds.), Scoring the Rorschach: Seven validated systems (pp. 191-236). Mahwah, NJ: Erlbaum. Holzman, P. S., Levy, D. L., & Johnston, M. H. (2005). The use of the Rorschach technique for assessing formal thought disorder. In R. F. Bornstein & J. M. Masling (Eds.), Scoring the Rorschach: Seven validated systems (pp. 55-96). Mahwah, NJ: Erlbaum. Hunsley, J., & Bailey, J. M. (1999). The clinical utility of the Rorschach: Unfulfilled promises and an uncertain future. Psychological Assessment, 11, 266-277. Huprich, S. K. (Ed.). (2006). Rorschach assessment of the personality disorders. Mahwah, NJ: Erlbaum. Ilonen, T., Taiminen, T., Karlsson, H., Lauerma, H., Leinonen, K.-M., Wallenius, E., et al. (1999). Diagnostic efficiency of the Rorschach schizophrenia and depression indices in identifying first-episode schizophrenia and severe depression. Psychiatry Research, 87, 183-193. Janson, H., & Stattin, H. (2003). Predictions of adolescent and
  • 133. adult delinquency from childhood Rorschach ratings. Journal of Personality Assessment, 81, 51-63. Johnston, J. R., Walters, M. G., & Olesen, N. W. (2005). Clinical ratings of parenting capacity and Rorschach protocols of custody-disputing parents: An exploratory study. Journal of Child Custody, 2, 159-178. Kelly, F. D. (1999). The psychological assessment of abused and traumatized children. Mahwah, NJ: Erlbaum. Kleiger, J. H. (1999). Disordered thinking and the Rorschach. Hillsdale, NJ: Analytic Press. Klopfer, B., Ainsworth, M. D., Klopfer, W. G., & Holt, R. R. (1954). Developments in the Rorschach technique: Vol. I. Technique and theory. Yonkers-on-Hudson, NY: World Books. Klopfer, B., & Kelley, D. M. (1942). The Rorschach technique. Yonkers-on-Hudson, NY: World Books. Klopfer, B., Kirkner, F., Wisham, W., & Baker, G. (1951). Rorschach Prognostic Rating scale. Journal of Projective Techniques and Personality Assessment, 15, 425-428. Leichtman, M. (1996). The Rorschach: A developmental perspective. Hillsdale, NJ: Analytic Press. Lerner, P. M. (1998). Psychoanalytic perspective on the Rorschach. Hillsdale, NJ: Analytic Press.
  • 134. Lerner, P. M. (2005). Defense and its assessment: The Lerner Defense scale. In R. F. Bornstein & J. M. Masling (Eds.), Scoring the Rorschach: Seven validated systems (pp. 237-270). Mahwah, NJ: Erlbaum. Lilienfeld, S. O., Wood, J. M., & Garb, H. N. (2000). The scientific status of projective techniques. Psychological Science in the Public Interest, 1, 27-66. Luxenberg, T., & Levin, P. (2004). The role of the Rorschach in the assessment and treatment of trauma. In J. P. Wilson & T. M. Keane (Eds.), Assessing psychological trauma and PTSD (2nd ed., pp. 190-225). New York: Guilford Press. McCann, J. T. (1998). Defending the Rorschach in court: An analysis of admissibility using legal and professional standards. Journal of Personality Assessment, 70, 125-144. McCann, J. T. (2004). Projective assessment of personality in forensic settings. In M. Hersen (Editor-in-Chief), M. J. Hilsenroth, & D. L. Segal (Vol. Eds.), Comprehensive handbook of psychological assessment: Vol. 2. Personality assessment (pp. 562-572). Hoboken, NJ: Wiley. McCann, J. T., & Evans, F. B. (in press). Admissibility of the Rorschach. In C. B. Gacono, F. B. Evans, N. Kaser-Boyd, & L. Gacono (Eds.), Handbook of forensic Rorschach psychology. Mahwah, NJ: Erlbaum. McCrae, R. R., & Terracciano, A. (2006). National character
  • 135. and personality. Current Directions in Psychological Science, 15, 156-161. McGrath, R. E. (2003). Enhancing accuracy in observational test scoring: The Comprehensive System as a case example. Journal of Personality Assessment, 81, 104-110. McGrath, R. E., Pogge, D. L., Stokes, J. M., Cragnolino, A., Zaccaria, M., Hayman, J., et al. (2005). Field reliability of Comprehensive System scoring in an adolescent inpatient sample. Assessment, 12, 199-209. Meloy, J. R. (2007). The authority of the Rorschach: An update. In C. B. Gacono, F. B. Evans, N. Kaser-Boyd, & L. Gacono (Eds.), Handbook of forensic Rorschach psychology (pp. 79-87). Mahwah, NJ: Erlbaum. Meloy, J. R., Hansen, T., & Weiner, I. B. (1997). Authority of the Rorschach: Legal citations in the past 50 years. Journal of Personality Assessment, 69, 53-62. Meyer, G. J. (1997a). Assessing reliability: Critical corrections for a critical examination of the Rorschach Comprehensive System. Psychological Assessment, 9, 480-489. Meyer, G. J. (1997b). Thinking clearly about reliability: More critical corrections regarding the Rorschach Comprehensive System. Psychological Assessment,
  • 136. 9, 495-498. Meyer, G. J. (2000). The incremental validity of the Rorschach Prognostic Rating scale over the MMPI Ego Strength scale and IQ. Journal of Personality Assessment, 74, 356-370. Meyer, G. J. (2001). Evidence to correct misperceptions about Rorschach norms. Clinical Psychology: Science and Practice, 8, 389-396. Meyer, G. J. (2002). Exploring possible ethnic differences and bias in the Rorschach Comprehensive System. Journal of Personality Assessment, 78, 104-129. Meyer, G. J. (2004). The reliability and validity of the Rorschach and Thematic Apperception Test (TAT) compared to other psychological and medical procedures: An analysis of systematically gathered evidence. In M. Hersen (Editor-in-Chief), M. Hilsenroth, & D. Segal (Vol. Eds.), Comprehensive handbook of psychological assessment: Vol. 2. Personality assessment (pp. 315-342). Hoboken, NJ: Wiley. Meyer, G. J., & Archer, R. P. (2001). The hard science of Rorschach research: What do we know and where do we go? Psychological Assessment, 13, 486-502. Meyer, G. J., & Handler, L. (1997). The ability of the Rorschach to predict subsequent outcome: Meta-analysis of the Rorschach Prognostic Rating scale. Journal of Personality Assessment, 69, 1-38. Meyer, G. J., Hilsenroth, M. J., Baxter, D., Exner, J. E., Jr.,
  • 137. Fowler, J. C., Piers, C. C., et al. (2002). An examination of interrater reliability for scoring the Rorschach Comprehensive System in eight data sets. Journal of Personality Assessment, 78, 219-274. Meyer, G. J., Mihura, J. L., & Smith, B. L. (2005). The interclinician reliability of Rorschach interpretation in four data sets. Journal of Personality Assessment, 84, 296-314. Meyer, G. J., & Viglione, D. J. (in press). Scientific status of the Rorschach. In C. B. Gacono, F. B. Evans, N. Kaser-Boyd, & L. Gacono (Eds.), Handbook of forensic Rorschach psychology. Mahwah, NJ: Erlbaum. Monahan, J. (2003). Violence risk assessment. In I. B. Weiner (Editor-in-Chief) & A. M. Goldstein (Vol. Ed.), Handbook of psychology: Vol. 11. Forensic psychology (pp. 527-540). Hoboken, NJ: Wiley. Muniz, J., Prieto, G., Almeida, L., & Bartram, D. (1999). Test use in Spain, Portugal, and Latin American countries. European Journal of Psychological Assessment, 15, 151-157. Ogawa, T. (2004). Developments of the Rorschach in Japan: A brief introduction. South African Rorschach Journal, 1, 40-45. Perry, W. (2001). Incremental validity of the Ego Impairment Index: A reexamination of Dawes (1999). Psychological Assessment, 13, 403-407. Phillips, L., & Smith, J. G. (1953). Rorschach interpretation:
  • 138. Advanced technique. New York: Grune & Stratton. Piotrowski, Z. A. (1957). Perceptanalysis. New York: Macmillan. Presley, G., Smith, C., Hilsenroth, M., & Exner, J. (2001). Clinical utility of the Rorschach with African Americans. Journal of Personality Assessment, 78, 104-129. Quinnell, F. A., & Bow, J. N. (2001). Psychological tests used in child custody evaluations. Behavioral Sciences and the Law, 19, 491-501. Rapaport, D., Gill, M., & Schafer, R. (1968). Diagnostic psychological testing (Rev. ed.). New York: International Universities Press. (Original work published 1946) Ritsher, J. B. (2004). Association of Rorschach and MMPI psychosis indicators and schizophrenia-spectrum diagnoses in a Russian clinical sample. Journal of Personality Assessment, 83, 46-63. Ritzler, B. (2001). Multicultural usage of the Rorschach. In L. Suzuki, J. Ponterotto, & P. Meller (Eds.), Handbook of multicultural assessment (pp. 237-252). San Francisco: Jossey-Bass. Ritzler, B. (2004). Cultural applications of the Rorschach, apperception tests, and figure drawings. In M. Hersen (Editor-in-Chief), M. J. Hilsenroth, & D. L. Segal (Vol. Eds.), Comprehensive
  • 139. handbook of psychological assessment: Vol. 2. Personality assessment (pp. 573-585). Hoboken, NJ: Wiley. Ritzler, B., Erard, R., & Pettigrew, T. (2002). Protecting the integrity of Rorschach expert witnesses: A reply to Grove and Barden (1999) re: The admissibility of testimony under Daubert/Kumho analysis. Psychology, Public Policy, and the Law, 8(2), 201-215. Rorschach, H. (1942). Psychodiagnostics: A diagnostic test based on perception. Bern, Switzerland: Hans Huber. (Original work published 1921) Rosenthal, R., Hiller, J. B., Bornstein, R. F., Berry, D. T. R., & Brunell-Neuleib, S. (2001). Meta-analytic methods, the Rorschach, and the MMPI. Psychological Assessment, 13, 449-451. Schafer, R. (1948). Clinical application of psychological tests. New York: International Universities Press. Schafer, R. (1954). Psychoanalytic interpretation in Rorschach testing. New York: Grune & Stratton. Schafer, R. (2006). My life in testing. Journal of Personality Assessment, 86, 235-241. Shaffer, T. W., Erdberg, P., & Haroian, J. (1999). Current nonpatient data for the Rorschach, WAIS, and MMPI-2. Journal of Personality Assessment, 73, 305-316.
  • 140. Shaffer, T. W., Erdberg, P., & Meyer, G. J. (Eds.). (2007). International reference sample for the Rorschach comprehensive system [Special issue]. Journal of Personality Assessment, 89 (Suppl. 1). Sloane, P., Arsenault, L., & Hilsenroth, M. (2002). Use of the Rorschach in the assessment of war-related stress in military personnel. Rorschachiana, 25, 86-122. Smith, S., Baity, M. R., Knowles, E. S., & Hilsenroth, M. J. (2001). Assessment of disordered thinking in children and adolescents: The Rorschach Perceptual-Thinking Index. Journal of Personality Assessment, 77, 447-463. Society for Personality Assessment. (2005). The status of the Rorschach in clinical and forensic practice: An official statement by the Board of Trustees of the Society for Personality Assessment. Journal of Personality Assessment, 85, 219-237. Stafford, K. P. (2003). Assessment of competence to stand trial. In I. B. Weiner (Editor-in-Chief) & A. M. Goldstein (Vol. Ed.), Handbook of psychology: Vol. 11. Forensic psychology (pp. 359-380). Hoboken, NJ: Wiley. Stedman, J., Hatch, J., & Schoenfeld, L. (2000). Preinternship preparation in psychological testing and psychotherapy: What internship directors say they expect. Professional Psychology, 31, 321-326. Stricker, G., & Gooen-Piels, J. (2004). Projective assessment of
  • 141. object relations. In M. Hersen (Editor-in-Chief), M. J. Hilsenroth, & D. L. Segal (Vol. Eds.), Comprehensive handbook of psychological assessment: Vol. 2. Personality assessment (pp. 449-465). Hoboken, NJ: Wiley. Urist, J. (1977). The Rorschach test and the assessment of object relations. Journal of Personality Assessment, 41, 3-9. Viglione, D. J. (1999). A review of recent research addressing the utility of the Rorschach. Psychological Assessment, 11, 251-265. Viglione, D. J. (2002). Rorschach coding solutions: A reference guide for the comprehensive system. San Diego, CA: Author. Viglione, D. J., & Hilsenroth, M. J. (2001). The Rorschach: Facts, fictions, and future. Psychological Assessment, 11, 251-265. Viglione, D. J., Perry, W., & Meyer, G. (2003). Refinements in the Rorschach Ego Impairment Index incorporating the human representational variable. Journal of Personality Assessment, 81, 149-156. Viglione, D. J., & Taylor, N. (2003). Empirical support for interrater reliability of Rorschach comprehensive system coding. Journal of Clinical Psychology, 59, 111-121. Weiner, I. B. (1996). Some observations on the validity of the Rorschach Inkblot Method. Psychological Assessment, 8, 206-213.
  • 142. Weiner, I. B. (1998). Principles of psychotherapy (2nd ed.). New York: Wiley. Weiner, I. B. (1999). Contemporary perspectives on Rorschach assessment. European Journal of Psychological Assessment, 15, 78-86. Weiner, I. B. (2001a). Advancing the science of psychological assessment: The Rorschach Inkblot Method as exemplar. Psychological Assessment, 13, 423-432. Weiner, I. B. (2001b). Considerations in collecting Rorschach reference data. Journal of Personality Assessment, 77, 122-127. Weiner, I. B. (2003a). Prediction and postdiction in clinical decision making. Clinical Psychology, 10, 335-338. Weiner, I. B. (2003b). Principles of Rorschach interpretation (2nd ed.). Mahwah, NJ: Erlbaum. Weiner, I. B. (2004a). Monitoring psychotherapy with performance-based measures of personality functioning. Journal of Personality Assessment, 83, 323-331. Weiner, I. B. (2004b). Rorschach assessment: Current status. In M. Hersen (Editor-in-Chief), M. J. Hilsenroth, & D. L. Segal (Vol. Eds.), Comprehensive handbook of psychological assessment: Vol. 2. Personality assessment (pp. 343-355). Hoboken, NJ:
  • 143. Wiley. Weiner, I. B. (2005a). Rorschach assessment in child custody cases. Journal of Child Custody, 2, 99-120. Weiner, I. B. (2005b). Rorschach Inkblot Method. In M. Maruish (Ed.), The use of psychological testing in treatment planning and outcome evaluation (3rd ed., Vol. 3, pp. 553-588). Mahwah, NJ: Erlbaum. Weiner, I. B. (2006). The Rorschach Inkblot Method. In R. P. Archer (Ed.), Forensic uses of clinical assessment instruments (pp. 181-207). Mahwah, NJ: Erlbaum. Weiner, I. B. (2007). Rorschach assessment in forensic cases. In A. M. Goldstein (Ed.), Forensic psychology: Emerging topics and expanding roles (pp. 127-153). Hoboken, NJ: Wiley. Weiner, I. B. (in press). Presenting and defending Rorschach testimony. In C. B. Gacono, F. B. Evans, N. Kaser-Boyd, & L. Gacono (Eds.), Handbook of forensic Rorschach psychology. Mahwah, NJ: Erlbaum. Weiner, I. B., & Exner, J. E., Jr. (1991). Rorschach changes in long-term and short-term psychotherapy. Journal of Personality Assessment, 56, 453-465. Weiner, I. B., Exner, J. E., Jr., & Sciara, A. (1996). Is the Rorschach welcome in the courtroom? Journal of Personality Assessment, 67, 422-424. Wood, J. M., & Lilienfeld, S. O. (1999). The Rorschach Inkblot
  • 144. Tests: A case of overstatement? Assessment, 6, 341-349. Wood, J. M., Nezworski, M. T., Garb, H. N., & Lilienfeld, S. O. (2001). The misperception of psychopathology: Problems with the norms of the comprehensive system. Clinical Psychology: Science and Practice, 8, 360-373. Zapf, P. A., Golding, S. L., & Roesch, R. (2006). Criminal responsibility and the insanity defense. In I. B. Weiner & A. K. Hess (Eds.), Handbook of forensic psychology (3rd ed., pp. 332-364). Hoboken, NJ: Wiley. Zapf, P. A., & Roesch, R. (2006). Competency to stand trial. In I. B. Weiner & A. K. Hess (Eds.), Handbook of forensic psychology (3rd ed., pp. 305-331). Hoboken, NJ: Wiley.
Chapter 10
REVISED NEO PERSONALITY INVENTORY
The NEO Personality Inventory (NEO PI; Costa & McCrae, 1985) and the Revised NEO Personality Inventory (NEO PI-R; Costa & McCrae, 1992) measure five broad domains or dimensions of personality in normal adults. Three of these domain scales, measuring Neuroticism (N), Extraversion (E), and Openness to Experience (O), have been researched for years and serve as the basis of the name for the original Inventory (NEO).
  • 145. The NEO PI also includes two additional domains, Agreeableness (A) and Conscientiousness (C). These five domains allow for a comprehensive description of personality in normal adults. The NEO PI-R consists of five global domains and six facets for each domain (see Table 10.1). Table 10.2 provides the general information on the NEO PI-R.
HISTORY
A long line of research on five-factor models of personality serves as the basis for the NEO PI-R, most of which is beyond the scope of this Handbook (cf. Wiggins, 1996). The rather common finding in the 1980s of five factors in personality research served as the major impetus for a multitude of studies based on a lexical analysis of words, personality traits, interpersonal theory, or ratings of schoolchildren's behavior. Despite critiques that five-factor models were atheoretical, they have persisted and gained widespread acceptance in the field of personality research. A significant impetus for this widespread acceptance of five-factor models is the prolific work of Costa and McCrae and their publication of the NEO PI (Costa & McCrae, 1985) and NEO PI-R (Costa & McCrae, 1992). A bibliography (Costa & McCrae, 2003) available on the website for Psychological Assessment Resources (www.parinc.com), the publisher of the NEO PI-R, is nearly 60 pages long.
  • 146. Both the NEO PI (Costa & McCrae, 1985) and the NEO PI-R (Costa & McCrae, 1992) have two forms: Form R (Rater) and Form S (Self). Form R is to be completed by a knowledgeable other who is well acquainted with the person, and Form S is to be completed by the person being evaluated. Virtually all the research on the NEO PI and NEO PI-R has been conducted with Form S, and it is the main form that will be discussed here. More frequent use of Form R in conjunction with Form S seems well warranted because of the important perspective it can provide on the person being evaluated. At a minimum, the reader needs to be aware of the existence of Form R so as to consider the possibility of its use.
Table 10.1 Revised NEO Personality Inventory (NEO PI-R) domain and facet scales
Domain: Facets
N (Neuroticism): N1 Anxiety; N2 Angry Hostility; N3 Depression; N4 Self-Consciousness; N5 Impulsiveness; N6 Vulnerability
E (Extraversion): E1 Warmth; E2 Gregariousness; E3 Assertiveness; E4 Activity; E5 Excitement-Seeking; E6 Positive Emotions
O (Openness): O1 Fantasy; O2 Aesthetics; O3 Feelings; O4 Actions; O5 Ideas; O6 Values
A (Agreeableness): A1 Trust; A2 Straightforwardness; A3 Altruism; A4 Compliance; A5 Modesty; A6 Tender-Mindedness
C (Conscientiousness): C1 Competence; C2 Order; C3 Dutifulness; C4 Achievement Striving; C5 Self-Discipline; C6 Deliberation
  • 147. NEO PI (First Edition)
The NEO PI (Costa & McCrae, 1985) consisted of five domains: Neuroticism (N); Extraversion (E); Openness (O); Agreeableness (A); and Conscientiousness (C). The name
  • 148. of the inventory, NEO, was formed from the initial letters of the first three names in a concession to an early version of the inventory that contained only those three domains. These five domains measure the broad dimensions of personality in normal adults. The first three domains (Neuroticism [N], Extraversion [E], Openness [O]) also had six facets or subscales for each domain.
Table 10.2 Revised NEO Personality Inventory (NEO PI-R)
Authors: Costa & McCrae
Published: 1992
Edition: Revised
Publisher: Psychological Assessment Resources
Website: www.parinc.com
Age range: 18+
Reading level: 6th grade
Administration formats: Paper/pencil, computer, CD, cassette
Languages: 9 published and 25 validated translations
Number of items: 240
Response format: 5-point Likert scale
Administration time: 20-30 minutes
Primary scales: 5 Domains and 30 Facets
Additional scales: None
Hand scoring: 2-part carbonless Answer Sheet (self-scoring)
General texts: None
Computer interpretation: Psychological Assessment Resources (Costa & McCrae)
  • 149. NEO PI-R (Revised Edition)
The NEO PI-R (Costa & McCrae, 1992) consists of the same five domains as in the NEO PI. There are only two minor differences between the NEO PI-R and the NEO PI. First, the facet scales for Agreeableness (A) and Conscientiousness (C) were added; they had not been available on the NEO PI. Second, 10 of the 240 items (4.2%) were replaced to allow for more accurate measurement of several facets. Although the NEO PI-R is the focus of this chapter, two other forms of the NEO need to be mentioned: NEO Five-Factor Inventory (NEO-FFI; Costa & McCrae, 1992); and NEO PI-3 (McCrae, Costa, & Martin, 2005). Each of these other forms of the NEO PI-R is described in turn. This description can be very brief for both of them because they retain
  • 150. the essential features of the NEO PI-R.
NEO Five-Factor Inventory
The NEO-FFI (Costa & McCrae, 1992) is essentially an authorized short form of the NEO PI-R. It consists of 60 items from the NEO PI-R that are used only to score the five domains: Neuroticism (N); Extraversion (E); Openness (O); Agreeableness (A); and Conscientiousness (C). It does not contain the items for assessing the facets within each domain. The NEO-FFI is designed for use in circumstances in which time is too limited to present the entire NEO PI-R or only scores on the five domains are required. All the information provided on the domains for the NEO PI-R will apply to the NEO-FFI, so it does not need to be discussed explicitly.
NEO PI-3
McCrae et al. (2002) identified 30 items on the NEO PI-R (Costa & McCrae, 1992) that were not endorsed by at least 2% of nearly 2,000 adolescents. A number of these 30 items contained words that adolescents, and even some adults, might not understand. An additional 18 items were identified that had item-total correlations
  • 151. on the facet scales less than .30. Alternative items were developed for these 48 items and McCrae et al. (2005) found acceptable replacements for 37 of them. The original version of the other 11 items was retained on the NEO PI-3. The items on the NEO PI-3 are easier to read than those on the NEO PI-R and the NEO PI-3 can be used for adolescents 12 years of age and older. Further research currently is being conducted to determine whether the NEO PI-3 can be considered as a replacement for the NEO PI-R at all ages. The entire December 2000 issue of the journal Assessment was devoted to the NEO PI-R. Anyone who is using the NEO PI-R should review this issue to get a better idea of the broad extent and wide nature of its usage. ADMINISTRATION The first issue in the administration of the NEO PI-R is ensuring that the individual is invested in the process. Taking a few extra minutes to answer any questions the individual has about why the NEO PI-R is being administered and how the results will be used will pay excellent dividends. The examiner should work diligently to make the assessment process a collaborative activity with the individual to obtain the desired information. This issue of therapeutic assessment (Finn, 1996; Fischer, 1994) was covered in more depth in Chapter 2 (pp. 43-44). The transparent nature of the items on the NEO PI-R and the lack of extensive
  • 152. means for assessing the validity of item endorsement (see later section in this chapter) make the task of getting the individual appropriately engaged in completing the NEO PI-R all the more important. Reading level is not a crucial factor in determining whether a person can complete the NEO PI-R. First, the reading level of the NEO PI-R is the sixth grade. Second, the examiner may read the items to individuals whose reading abilities are limited and record the responses (Costa & McCrae, 1992, p. 5). The NEO PI-R is the only self-report inventory discussed in this Handbook that allows the examiner to read the items to the individuals. All other self-report inventories explicitly discourage or forbid this procedure (see Chapter 5). SCORING Scoring the NEO PI-R is relatively straightforward either by hand or computer. If the NEO PI-R is administered by computer, the computer automatically scores it. If the individual's responses to the items have been placed on an answer sheet, these responses can be entered into the computer by the clinician for scoring or they can be hand scored. If the clinician enters the item responses into the computer for scoring, they should be double entered so that any data entry errors can be identified.
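The double-entry check recommended above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the publisher's scoring software: the helper name `find_entry_mismatches` is hypothetical, and responses are assumed to be keyed as the answer-sheet labels SD, D, N, A, and SA.

```python
from typing import List, Tuple

# Response labels as they appear on the answer sheet.
VALID_RESPONSES = {"SD", "D", "N", "A", "SA"}


def find_entry_mismatches(
    first_pass: List[str], second_pass: List[str]
) -> List[Tuple[int, str, str]]:
    """Compare two independent keyings of the same answer sheet.

    Returns (item_number, first_value, second_value) for every item on
    which the two passes disagree. Item numbers are 1-based.
    Hypothetical helper written for this example.
    """
    if len(first_pass) != len(second_pass):
        raise ValueError("Both passes must contain the same number of items")
    for value in first_pass + second_pass:
        if value not in VALID_RESPONSES:
            raise ValueError(f"Unrecognized response label: {value!r}")
    return [
        (item_number, a, b)
        for item_number, (a, b) in enumerate(zip(first_pass, second_pass), start=1)
        if a != b
    ]


if __name__ == "__main__":
    first = ["SD", "A", "N", "SA"]
    second = ["SD", "D", "N", "SA"]
    for item, a, b in find_entry_mismatches(first, second):
        print(f"Item {item}: first entry {a}, second entry {b}; recheck the answer sheet")
```

Any item flagged by the comparison would be resolved by going back to the paper answer sheet. Omitted or double-marked items are not handled in this sketch and would need their own code.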
  • 153. One of the advantages of computer scoring is that the factor score for each domain is computed directly. The factor scores can be calculated for the domains using the formulas presented in the Manual (Costa & McCrae, 1992, p. 8), and it is recommended that researchers use the factor scores. "In most cases, the domain scale scores are a good approximation to factor scores, and it is probably not worth the effort to apply these formulas by hand to individual cases" (Costa & McCrae, 1992, p. 7). The NEO PI-R (Costa & McCrae, 1992) and the Personality Assessment Inventory (PAI: Morey, 1991) are the only self-report inventories reviewed in this Handbook that do not use "true/false" items. Both of these inventories have the same publisher (Psychological Assessment Resources), and that may account for not using "true/false" items. The NEO PI-R uses a five-point Likert scale ranging from SD (Strongly Disagree), D (Disagree), N (Neutral), A (Agree), to SA (Strongly Agree). These potential response options always are presented in this same order on the answer sheet. When SD (Strongly Disagree) is the scored direction for a specific item, the response options are scored as 4, 3, 2, 1, or 0. When SA (Strongly Agree) is the scored direction, the preceding five
  • 154. response options are scored as 0, 1, 2, 3, or 4. Thus, the total raw score on each eight-item facet scale can range from 0 to 32. The total score on a domain, each of which consists of six facet scales, can range from 0 to 192, but the norm tables for adults are truncated at 25 and 172 (Costa & McCrae, 1992, Appendix C, p. 79). The first step in hand scoring is to examine the answer sheet carefully and indicate omitted items and double-marked items by drawing a line through all five responses to these items with brightly colored ink. Also, cleaning up the answer sheet is helpful and facilitates scoring. Responses that were changed need to be erased completely if possible, or clearly marked with an "X" so that the clinician is aware that this response has not been endorsed by the client. The answer sheet for the NEO PI-R is self-scoring, that is, no templates or overlays are required for scoring. Instead the top page of the answer sheet is removed and each row of items corresponds to one of the facets for each of the domains. The facets are in numerical order within each domain and the domains are in the order: Neuroticism (N); Extraversion (E); Openness (O); Agreeableness (A); and Conscientiousness (C). The raw score for each facet is the sum of the circled numbers on its row. The sum of the marked scores for the first row is facet N1, the sum of the second row is facet E1, and so on. Once the six facet scores have been calculated for each domain, they are summed
  • 155. to create the raw score for each domain. Thus, the sum of facets N1, N2, N3, N4, N5, and N6 becomes the raw score for domain N. These raw scores for each domain are entered into the corresponding box at the bottom of the answer sheet. Plotting the profile is the next step in the scoring process. There are two profile forms that can be used with Form S: adults (21 years of age and older) and college (17 to 20). Profiles are plotted separately for men and women with each of these forms and are on opposite sides of the same page. The college-age profile form is used for all individuals aged 17 to 20 no matter whether they are in college. To remove the ambiguity, it would be more accurate to say that the "young adult" form should be used for all individuals between the ages of 17 and 20 and not call it a "college" profile form. Once the correct profile form has been selected for the person's age and gender, all the raw scores from the answer sheet are transferred to the appropriate column of the profile sheet (see Figure 10.1). The first five columns on the profile sheet are the five domains (N, E, O, A, and C) and then the six facets for each domain are presented in order. The raw score on each domain and facet is indicated by either circling the number or marking it with an "x." Once the individual's scores on the five domains have been plotted, a solid line is drawn to connect them. A similar procedure is followed for each of the six facets.
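The item, facet, and domain arithmetic described in this section can be sketched as follows. This is a minimal illustration of the scoring rule only: the function names are hypothetical, and the keyed directions in the example are invented for demonstration. The actual keying of each item is given by the answer sheet and Manual, not by this sketch.

```python
from typing import Sequence

# Response options in the fixed order used on the answer sheet.
OPTIONS = ("SD", "D", "N", "A", "SA")

ITEMS_PER_FACET = 8
FACETS_PER_DOMAIN = 6


def score_item(response: str, keyed_direction: str) -> int:
    """Score one item from 0 to 4.

    keyed_direction is "SA" when Strongly Agree is the scored direction
    (SD=0 ... SA=4) and "SD" when Strongly Disagree is the scored
    direction (SD=4 ... SA=0).
    """
    position = OPTIONS.index(response)  # raises ValueError for unknown labels
    if keyed_direction == "SA":
        return position
    if keyed_direction == "SD":
        return 4 - position
    raise ValueError(f"Unknown keyed direction: {keyed_direction!r}")


def facet_raw_score(responses: Sequence[str], keyed_directions: Sequence[str]) -> int:
    """Sum the eight item scores on one facet row (range 0 to 32)."""
    if len(responses) != ITEMS_PER_FACET or len(keyed_directions) != ITEMS_PER_FACET:
        raise ValueError("A facet scale has exactly eight items")
    return sum(score_item(r, k) for r, k in zip(responses, keyed_directions))


def domain_raw_score(facet_scores: Sequence[int]) -> int:
    """Sum the six facet raw scores for one domain (range 0 to 192)."""
    if len(facet_scores) != FACETS_PER_DOMAIN:
        raise ValueError("A domain has exactly six facets")
    return sum(facet_scores)


if __name__ == "__main__":
    # Invented keying for illustration only; not the real scoring key.
    keys = ["SA", "SD", "SA", "SD", "SA", "SD", "SA", "SD"]
    answers = ["A", "D", "SA", "N", "A", "SD", "N", "D"]
    n1 = facet_raw_score(answers, keys)
    print("Facet raw score:", n1)
    print("Domain raw score:", domain_raw_score([n1, 20, 18, 15, 22, 17]))
```

Omitted and double-marked items, which the hand-scoring procedure flags on the answer sheet, are deliberately left out; a fuller version would have to decide how to treat them before summing.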
  • 156. Figure 10.1 NEO PI-R profile form for Domain and Facet scales (horizontal axis: NEO Domain and Facet Scales). The scores for the domains are not connected to the facet scores, and the sets of facets are plotted separately; that is, there will be seven separate lines or profiles on the form. ASSESSING VALIDITY One of the few areas of contention with the NEO PI-R is whether validity scales are necessary at all. The focus of this contention revolves around
  • 157. three issues: (1) whether responses to the NEO PI-R can be distorted and thus should be assessed; (2) the prevalence of such distortions within various groups of individuals; and (3) whether the use of validity scales to remove questionable profiles actually improves correlations with external criteria. Each of these issues is examined in turn. A variety of studies have demonstrated that the NEO PI-R, like all self-report instruments, can be distorted by students in simulation designs either in a positive (Ballenger, Caldwell-Andrews, & Baer, 2001; Griffin, Hesketh, & Grayson, 2004) or negative direction (Berry et al., 2001). It seems natural enough that distortions of responses occur less frequently in normal adults, where the NEO PI-R is used most often, because there is little motivation for doing so. The frequency of such distortions of responses also should decrease when the NEO PI-R is filled out anonymously, which typically happens in research. Again, finding that validity scales are not useful in normal adults and research settings would seem to reflect the nature of the participants and settings rather than the usefulness of the validity scales. However, in clinical and personnel screening settings, it seems probable that individuals
  • 158. may distort their responses in some manner and the preceding research demonstrates that scores on the NEO PI-R can be distorted. In both clinical and personnel selection settings, the examiner is concerned with assessing potential distortions to the domain and facet scales in this specific individual, because it will affect the interpretation of the scores. Thus, the finding that validity scales may be more useful in clinical and personnel selection settings would seem to reflect the nature of the setting. Several studies found that using the validity scales to remove NEO PI-R profiles with excessively distorted responses did not increase the relationship with external correlates (Piedmont, McCrae, Riemann, & Angleitner, 2000; Yang, Bagby, & Ryder, 2000). These findings typically occur when large groups of participants are assessed and the prevalence of such invalid profiles is relatively low. Several studies also have found that using the validity scales to remove NEO PI-R profiles with excessively distorted responses increased the relationship with external correlates (Caldwell-Andrews, Baer, & Berry, 2000; Young & Schinka, 2001). These findings typically occurred in clinical samples that would be more prone to distort their responses and in most cases were instructed to do so. Another way of framing this contention is whether response
  • 159. distortion is substance (a characteristic of the individual such as some form of psychopathology or personality trait) or style (an effortful alteration of responses that may be conscious or reflect lack of insight). In true diplomatic fashion, Morey et al. (2002) concluded that both substantive and stylistic variance may be involved in determining responses to the NEO PI-R in clinical patients. Third, high scores on the Openness (O) domain are not equivalent to intelligence, but rather to divergent thinking and creativity. They also do not imply that persons are unprincipled or without values. They are willing to entertain new ideas and can apply these ideas conscientiously. In a similar manner, low scores on the O domain do not mean that persons are closed, defensive, or authoritarian, but rather that they have a narrower scope and intensity of interest. "Openness may sound healthier or more mature to many psychologists, but the value of openness or closedness depends on the requirement of the situation, and both open and closed individuals perform useful functions in society" (Costa & McCrae, 1992, p. 15). Fourth, high scores on the Agreeableness (A) domain may seem to be more socially preferable and psychologically healthier, and such persons are
  • 160. generally easier to interact with. However, some situations require that the person be independent and skeptical of what is happening, and being too agreeable can actually be a detriment. Dependent Personality Disorder, which would be characterized by a high score on the Agreeableness domain, illustrates that a high score is not necessarily psychologically healthy. Finally, high scores on the Conscientiousness (C) domain reflect that the person is more active in planning and organized in carrying out their activities. These qualities may be expressed in academic and occupational achievement or in annoying, fastidious behaviors. Low scores on the Conscientiousness domain do not reflect that individuals are without principles to govern their behavior, but rather they are more lackadaisical in working toward their goals. The six facet scores for each domain are intended to flesh out the general qualities that have been described by the parent domain scale. Important differences can be identified between individuals who have similar scores on the parent domain and a different pattern of scores on the facet scales for that domain. Two individuals with similar scores on the Extraversion (E) domain, one of whom has primary elevations on Activity (E4) and Excitement Seeking (E5), while the other has primary elevations on Assertiveness (E2) and Positive Emotions (E6), are very different persons. The interpretation of the facet scales, in addition to the domain
  • 161. scales on the NEO PI-R, is recommended in most cases, and particularly in clinical, educational, and occupational assessments. It is conceivable in research applications that only the domain scales are relevant to the issue under study, and consequently, there is no reason to score and interpret the facet scales. It is very important to consider computer scoring the NEO PI-R when all the domain and facet scales are to be interpreted, because of the high probability of some scoring error in making that many calculations. Computer scoring also allows for the factor score for each domain to be computed directly rather than using the formulas presented in the Manual (Costa & McCrae, 1992, p. 8) to estimate them. APPLICATIONS As a self-report inventory, the NEO PI-R is easily administered in a wide variety of settings and for a variety of purposes. It is the most widely used self-report measure of personality in countries around the world. Costa and McCrae (2003) reported that there are 9 published translations, 25 validated translations, 8 research translations, and 3 more translations in progress. This 60-page, single-spaced bibliography illustrates the variety of issues and research on the NEO PI-R and NEO-FFI. Any comprehensive review of this literature is
  • 162. beyond the scope of this Handbook. There are numerous settings in which the NEO PI-R is appropriate for use: clinical, educational, medical, organizational, and research. The NEO PI-R is primarily used in educational, organizational, and research settings. The NEO PI-R is probably underutilized in clinical and medical settings and would seem worthy of wider usage in these settings. The NEO PI-R comes out of a long line of research on the five-factor model of personality, which was described earlier (p. 315) and will not be reiterated. The use of the NEO PI-R is discussed for each of these other four settings in turn. In clinical settings, the NEO PI-R can serve at least six useful purposes. First, it can provide a positive or nonpathological description of the person that can compensate for the heavy focus on psychopathology in most assessment tools and techniques. Most of the self-report inventories discussed in this Handbook have few, if any, positive statements to make about the person. Second, the focus on the more positive aspects of the person can help establish rapport and build the therapeutic alliance, and serves as an easy means of starting the feedback of the results of the assessment process before getting into the psychopathological issues. Third, there is a fairly extensive literature on the use of the NEO PI-R in the treatment of personality disorders (cf. Costa & Widiger, 2002). Fourth, the assessment of validity as described should be carried out routinely in clinical settings
  • 163. because of the higher probability of some type of response distortion. Fifth, knowledgeable others' ratings of the person using Form R can make an important contribution to understanding him or her, particularly when there is some reason to suspect that there may be some type of response distortion. Finally, the NEO PI-R is neither a diagnostic instrument nor a measure of psychopathology and cannot be used as the sole assessment tool or technique in a clinical setting. In educational settings, the NEO PI-R can be used in advising students about personality characteristics that will facilitate or impede their academic progress. There are areas of study, such as chemistry or accounting, where careful attention to detail is mandatory for success, and other areas, such as philosophy or literature, where the focus is on more abstract or larger conceptual issues, and careful attention to detail is much less necessary. Persons with high scores on the Conscientiousness (C) domain are more likely to be successful in chemistry or accounting, while persons with high scores on the Openness (O) domain are more likely to be successful in philosophy or literature. In neither example is academic success foreclosed in the other area, but these individuals may have to work harder to recognize how their natural personality style affects their academic performance and they may need to find methods for coping with these effects to increase the probability of success. The NEO PI-R also can be used in counseling students in academic settings, which would be
  • 164. considered a clinical setting and was discussed earlier. In medical settings, the NEO PI-R can be used to identify personality characteristics that might facilitate or impede treatment. The NEO PI-R will be better accepted by medical patients than other self-report inventories that have a heavy focus on psychopathology. Medical patients, particularly pain patients, are frequently upset at the thought of psychological assessment because they think that it implies that the problem "is all in their head." High scores on the Neuroticism (N) domain in medical patients can alert the examiner to review their background and history for the potential impact of psychopathology on the medical treatment. Medical patients with high scores on the Conscientiousness (C) domain would be expected to be more likely to follow through on the recommended steps for treatment, particularly as the treatment process becomes more complex or long-term. An interesting line of research has used the NEO PI-R in predicting risk for coronary heart disease (cf. Costa, McCrae, & Dembroski, 1989), and Vollrath and Torgersen (2002) used the NEO PI-R to predict risky health behavior in college students. Costa and McCrae (2003) have listed the multiple areas in behavioral medicine in which
  • 165. the NEO PI-R is being used. In occupational settings, the NEO PI-R can be used to identify personality characteristics that might facilitate or impede success in a specific occupation. As with educational settings, certain personality dimensions are more important in some occupations than others. These personality dimensions can be used in selecting candidates for specific occupations or in advising individuals on what occupations might be better suited for them. When the NEO PI-R is used to select potential candidates for specific occupations, the examiner must be aware that because examinees may simulate their scores on the appropriate domains, evaluating the validity of the NEO PI-R will be important (cf. Griffin et al., 2004). When an occupation requires significant amounts of interpersonal interactions, individuals with higher scores on the Extraversion (E) and Agreeableness (A) domains will be more likely to be successful than individuals with lower scores on these same domains. Conversely, when an occupation requires a significant amount of time by oneself, individuals with lower scores on the Extraversion and Agreeableness domains will be more likely to be successful than individuals with higher scores on these same domains. Again, the examiner is reminded that when individuals do not have the optimal scores on the personality dimensions for a specific occupation, their success is not precluded, but they need to be aware of the potential impact these personality
  • 166. dimensions may have on their performance. PSYCHOMETRIC FOUNDATIONS Demographic Variables Age Specific norms are not provided by age for adults on the NEO PI-R. There are some differences in young adults (<20) and a separate profile form and norms are used for them. The items on the NEO PI-3 are easier to read than those on the NEO PI-R, and the NEO PI-3 can be used for adolescents 12 years of age and older. Further research currently is being conducted to determine whether the NEO PI-3 can be considered as a replacement for the NEO PI-R at all ages. Terracciano, McCrae, Brant, and Costa (2005) examined age trends on the NEO PI-R in a sample of nearly two thousand adults in the Baltimore Longitudinal Study of Aging. There was a gradual curvilinear decline of slightly over one-half of a standard deviation in the Neuroticism (N) and Extraversion (E) domains from age 30 to age 90. There was a linear decline in the Openness (O) domain and linear increase in the Agreeableness (A) domain. There was a parabolic change in the Conscientiousness (C) domain with scores increasing until about age 70 and then slightly declining thereafter. All these changes in adulthood across the five domains were about one T score point per decade
  • 167. or slightly more than one-half of a standard deviation across the entire 60-year age span. A cross-sectional analysis of these data produced results that are similar to the longitudinal analysis. Terracciano et al. also provide similar information on all 30 of the facet scales on the NEO PI-R. Gender Gender does not create any general issues in NEO PI-R interpretation because separate norms (profile forms) are used for men and women. Any gender differences in how individuals responded to the items on each scale are removed when the raw scores are converted to T scores. Consequently, men and women with a T score of 60 (84th percentile) on Agreeableness (A) are one standard deviation above the mean, although women have a slightly higher raw score (~142) than men (~136; Costa & McCrae, 1992, Appendix C, p. 79). Costa, Terracciano, and McCrae (2001) analyzed gender differences in 26 cultures and found that these gender differences were typically less than one-half of a standard deviation (5 T points), and most were closer to one-quarter of a
  • 168. standard deviation, relative to variations within gender. Education The potential effects of education have not been investigated in any systematic manner on the NEO PI-R. It is not apparent that such research would yield any significant findings given the ease with which the NEO PI-R is read and the similar findings in factor structure across multiple cultures. Ethnicity The effects of ethnicity per se on NEO PI-R performance have not been studied, if ethnicity is construed as being different from culture. However, the prolific literature on the cross-cultural use of the NEO PI-R is discussed briefly in the next section. Cross-Cultural Implementation Costa and McCrae (2003) reported that there are 9 published translations, 25 validated translations, 8 research translations, and 3 more translations in progress of the NEO PI-R. The breadth of the use of the NEO PI-R across various cultures can be seen by the fact that there are 79 contributing members to the Personality Profiles of Cultures Project who represent 51 cultures from six continents (McCrae, Terracciano, et al., 2005). This project is looking at the aggregate personality profiles of different cultures to assess whether they can
  • 169. provide insight into cultural differences and the stereotypes of national character (McCrae & Terracciano, 2006). The robustness of the factor structure of the NEO PI-R across these various cultures not only speaks to the usefulness of the NEO PI-R cross-culturally, but it allows for comparisons to be made into the actual differences in aggregate personality profiles. As would be expected, stereotypes of national character are erroneous (McCrae & Terracciano, 2006), similar to the erroneous conceptualization that all patients within a specific diagnostic category are alike (pp. 60-61). There are small differences in these aggregate personality profiles across the different cultures, but much larger variability within cultures. These variations in aggregate personality profiles appear to reflect real differences that warrant further investigation. In summary, it appears that demographic variables have minimal impact on the NEO PI-R profile in most individuals. The fact that the NEO PI-R can be read to individuals and is available in many different languages makes its applicability even broader. Reliability The NEO PI-R Manual (Costa & McCrae, 1992, table 5, p. 44) reports the reliability
  • 170. (coefficient alpha) data for 1,539 individuals for Form S. Coefficient alpha ranged from .56 to .81 for the facet scales and .86 to .92 for the domain scales. The reliability data are quite good for the domain scales that contain 48 items each. As expected, the reliability data are somewhat lower, though still very respectable, for the facet scales that only have eight items each. A subset of the college students (N = 208) in the normative sample for the NEO PI-R were retested after an average of nearly 3 months with the NEO-FFI, which allowed determination of the reliability of the five domain scores. The test-retest correlations ranged from .75 to .83 across the five scales and averaged .79. The standard error of measurement is about 4 T points for the domain scales; that is, the individual's true score on the domain scales will be within ±4 T points two-thirds of the time. Stability There is impressive research on the long-term stability of NEO PI-R scores. Costa and McCrae (1988) reported that the stability coefficients over a 6-year period in a large sample of adults for the domains of N (Neuroticism), E (Extraversion), and O (Openness) were .83, .82, and .83, respectively. The stability coefficients over a 3-year period for the domains of A (Agreeableness) and C (Conscientiousness) were .63 and .79, respectively. These stability
  • 171. coefficients are higher and over a longer time period than for any of the other self-report inventories reviewed in this Handbook. CONCLUDING COMMENTS The voluminous literature on the five-factor model of personality provides solid underpinnings for the NEO PI-R (Costa & McCrae, 1992). The Personality Profiles of Cultures Project that represents 51 cultures from six continents (McCrae, Terracciano, et al., 2005) shows how well regarded the NEO PI-R is internationally. More widespread use of the NEO PI-R in clinical and medical settings to provide a positive perspective on the person is warranted given the heavy bias toward psychopathology in virtually all other assessment tests and techniques. The existence of a parallel form for rating of the person by a knowledgeable other (Form R) is an invaluable source of information any time there is reason to suspect any type of response distortion, and it seems particularly helpful in clinical and medical settings. REFERENCES Bagby, R. M., Rector, N. A., Bindseil, K., Dickens, S. F., Levitan, R. D., & Kennedy, S. H. (1998). Self-reports and informant ratings of personalities of depressed outpatients. American Journal of Psychiatry, 155, 437-438. Ballenger, J. F., Caldwell-Andrews, A., & Baer, R. A. (2001). Effects of positive impression management
  • 172. on the NEO PI-R in a clinical population. Psychological Assessment, 13, 254-260. Berry, D. T. R., Bagby, R. M., Smerz, J., Rinaldo, J. C., Caldwell-Andrews, A., & Baer, R. A. (2001). Effectiveness of NEO PI-R research validity scales for discriminating analog malingering and genuine psychopathology. Journal of Personality Assessment, 76, 496-516. Caldwell-Andrews, A. (2001). Relationships between MMPI-2 validity scales and NEO PI-R experimental validity scales in police candidates. Unpublished doctoral dissertation, University of Kentucky. Caldwell-Andrews, A., Baer, R. A., & Berry, D. T. R. (2000). Effects of response sets on NEO PI-R scores and their relation to external criteria. Journal of Personality Assessment, 74, 472-488. Costa, P. T., Jr., & McCrae, R. R. (1985). The NEO Personality Inventory manual. Odessa, FL: Psychological Assessment Resources. Costa, P. T., Jr., & McCrae, R. R. (1988). Personality in adulthood: A six-year longitudinal study of self-reports and spouse ratings on the NEO PI. Journal of Personality and Social Psychology, 54, 853-863.
  • 173. Costa, P. T., Jr., & McCrae, R. R. (1992). Revised NEO Personality Inventory (NEO PI-R) and NEO Five-Factor Inventory (NEO-FFI) professional manual. Odessa, FL: Psychological Assessment Resources. Costa, P. T., Jr., & McCrae, R. R. (2003). Bibliography for the NEO PI-R and NEO-FFI. Lutz, FL: Psychological Assessment Resources. Available at www3.parinc.com/uploads/pdfs/NEO_bib.pdf. Costa, P. T., Jr., McCrae, R. R., & Dembroski, T. M. (1989). Agreeableness vs. antagonism: Explication of a potential risk factor for CHD. In A. Siegman & T. M. Dembroski (Eds.), In search of coronary-prone behavior: Beyond Type A (pp. 41-63). Hillsdale, NJ: Erlbaum. Costa, P. T., Jr., Terracciano, A., & McCrae, R. R. (2001). Gender differences in personality traits across cultures: Robust and surprising findings. Journal of Personality and Social Psychology, 81, 322-331. Costa, P. T., Jr., & Widiger, T. A. (2002). Personality disorders and the five-factor model of personality (2nd ed.). Washington, DC: American Psychological Association. Fiedler, E. R., Oltmanns, T. F., & Turkheimer, E. (2004). Traits associated with personality disorders and adjustment to military life: Predictive validity of self and peer reports. Military Medicine, 169, 32-40. Finn, S. (1996). Using the MMPI-2 as a therapeutic
  • 174. intervention. Minneapolis: University of Minnesota Press. Fischer, C. T. (1994). Individualizing psychological assessment. Hillsdale, NJ: Erlbaum. Griffin, B., Hesketh, B., & Grayson, D. (2004). Applicants faking good: Evidence of bias in the NEO PI-R. Personality and Individual Differences, 36, 1545-1558. McCrae, R. R., Costa, P. T., Jr., & Martin, T. A. (2005). The NEO PI-3: A more readable revised NEO Personality Inventory. Journal of Personality Assessment, 84, 260-270. McCrae, R. R., Costa, P. T., Jr., Parker, W. D., Mills, C. J., Terracciano, A., De Fruyt, F., et al. (2002). Personality trait development from age 12 to age 18: Longitudinal, cross-sectional, and cross-cultural analyses. Journal of Personality and Social Psychology, 83, 1456-1468. McCrae, R. R., & Terracciano, A. (2006). National character and personality. Current Directions in Psychological Science, 15, 156-161. McCrae, R. R., Terracciano, A., & 79 Members of the Personality Profiles of Cultures Project. (2005). Personality profiles of cultures: Aggregate personality traits. Journal of Personality and Social Psychology, 89, 407-425. Morey, L. C. (1991). Personality Assessment Inventory: Professional manual. Odessa, FL: Psychological Assessment Resources.
  • 175. Morey, L. C., Quigley, B. D., Sanislow, C. A., Skodol, A. E., McGlashan, T. H., Shea, M. T., et al. (2002). Substance or style? An investigation of the NEO PI-R validity scales. Journal of Personality Assessment, 79, 583-599. Pauls, C. A., & Crost, N. W. (2005). Effects of different instructional sets on the construct validity of the NEO PI-R. Personality and Individual Differences, 39, 297-308. Piedmont, R. L., McCrae, R. R., Riemann, R., & Angleitner, A. (2000). On the invalidity of validity scales: Evidence from self-report and observer ratings in volunteer samples. Journal of Personality and Social Psychology, 78, 582-593. Schinka, J. A., Kinder, B. N., & Kremer, T. (1997). Research validity scales for the NEO PI-R: Development and initial validation. Journal of Personality Assessment, 68, 127-138. Terracciano, A., McCrae, R. R., Brant, L. J., & Costa, P. T., Jr. (2005). Hierarchical linear modeling analyses of the NEO PI-R scales in the Baltimore Longitudinal Study of Aging. Psychology and Aging, 20, 493-506. Vollrath, M., & Torgersen, S. (2002). Who takes health risks? A probe into eight personality types. Personality and Individual Differences, 32, 1185-1198.
Wiggins, J. S. (Ed.). (1996). The five-factor model of personality. New York: Guilford Press.
Yang, J., Bagby, R. M., & Ryder, A. G. (2000). Response style and the NEO PI-R: Validity scales and spousal ratings in a Chinese psychiatric sample. Assessment, 7, 389-402.
Young, M. S., & Schinka, J. A. (2001). Research validity scales for the NEO PI-R: Additional evidence for reliability and validity. Journal of Personality Assessment, 76, 412-420.

Chapter 9
PERSONALITY ASSESSMENT INVENTORY
The Personality Assessment Inventory (PAI; Morey, 1991) is a broadband measure of the major dimensions of psychopathology found in Axis I disorders and some Axis II disorders of the DSM-IV-TR (American Psychiatric Association, 2000). The PAI consists of 4 validity, 11 clinical, 5 treatment consideration, and 2 interpersonal scales (see Table 9.1). There also are three or four subscales for 9 of the 11 clinical scales and for one treatment consideration scale. Finally, a PAI Structural Summary provides the tables for scoring and profiles for plotting supplemental indices. Table 9.2 provides
the general information on the PAI.
HISTORY
The PAI (Morey, 1991) was developed following a sequential, construct-validation strategy. Based on the extant research, the underlying construct for most of the clinical syndrome scales is multidimensional, and so the scale measuring each clinical syndrome was to be composed of several subscales. Once these component subscales were identified, items were written so that the content was directly relevant for each one. Each item in the original item pool of over 2,200 items then was rated by four individuals for its appropriateness for the specific subscale. Then four experts were asked to assign items to the appropriate scale, and items that did not reach 75% agreement were either dropped or rewritten. These items then were reviewed by a bias-review panel as to whether they could be perceived as being offensive on the basis of gender, race, religion, or ethnic-group membership. Any item that was perceived as being offensive or could inappropriately identify a normal behavior as psychopathology was deleted. Expert judges, who were nationally recognized within the content area of each scale, then sorted the remaining items to ensure that each item was related to its actual construct for each scale on the PAI. The overall agreement was 94.3% among these judges for the 776 items that were retained for the alpha version of the
PAI. Groups of college students then completed the alpha version of the PAI in one of three conditions: (1) standard, in which students were asked to respond frankly and honestly; (2) positive-impression management, in which the students were asked to respond as if they were trying to impress a potential employer; and (3) malingering, in which the students were asked to simulate the responses of a person with a mental disorder. Items for the beta version of the PAI were selected on six bases: (1) reasonable variability across the construct, essentially an item-difficulty parameter; (2) a positive, corrected part-whole correlation of the item with the total score of the other items on the scale; (3) the corrected part-whole correlation was higher than the correlation with measures of social desirability and positive and negative impression management; (4) a higher correlation with their own scale than other scales; (5) less face valid or "transparent" measures of the construct embodied in the scale; and (6) absence of gender differences. Using these criteria, a total of 597 items were retained for the beta version of the PAI.

Table 9.1 Personality Assessment Inventory (PAI) scales

Validity Scales
ICN        Inconsistency
INF        Infrequency
NIM        Negative Impression Management
PIM        Positive Impression Management

Clinical Scales
SOM        Somatic Complaints
  SOM-C    Conversion
  SOM-S    Somatization
  SOM-H    Health Concerns
ANX        Anxiety
  ANX-C    Cognitive
  ANX-A    Affective
  ANX-P    Physiological
ARD        Anxiety-Related Disorders
  ARD-O    Obsessive-Compulsive
  ARD-P    Phobias
  ARD-T    Traumatic Stress
DEP        Depression
  DEP-C    Cognitive
  DEP-A    Affective
  DEP-P    Physiological
MAN        Mania
  MAN-A    Activity Level
  MAN-G    Grandiosity
  MAN-I    Irritability
PAR        Paranoia
  PAR-R    Resentment
  PAR-H    Hypervigilance
  PAR-P    Persecution
SCZ        Schizophrenia
  SCZ-P    Psychotic Experience
  SCZ-S    Social Detachment
  SCZ-T    Thought Disorder
BOR        Borderline Features
  BOR-A    Affective Instability
  BOR-I    Identity Problems
  BOR-N    Negative Relationships
  BOR-S    Self-Harm
ANT        Antisocial Features
  ANT-A    Antisocial Behaviors
  ANT-E    Egocentricity
  ANT-S    Stimulus-Seeking
ALC        Alcohol Problems
DRG        Drug Problems

Treatment Consideration Scales
AGG        Aggression
  AGG-A    Aggressive Attitude
  AGG-V    Verbal Aggression
  AGG-P    Physical Aggression
SUI        Suicidal Ideation
STR        Stress
NON        Nonsupport
RXR        Treatment Rejection

Interpersonal Scales
DOM        Dominance
WRM        Warmth

Table 9.2 Personality Assessment Inventory (PAI)

Authors:                  Morey
Published:                1991
Edition:                  1st
Publisher:                Psychological Assessment Resources
Website:                  www.parinc.com
Age range:                18+
Reading level:            4th grade
Administration formats:   paper/pencil, computer, CD, cassette
Additional languages:     Arabic, French Canadian, Korean, Norwegian, Serbian, Slovene, and Swedish
Number of items:          344
Response format:          False/Not at all True, Slightly True, Mainly True, Very True
Administration time:      40-50 minutes
Primary scales:           4 Validity, 11 Clinical, 5 Treatment Considerations, 2 Interpersonal
Additional scales:        Subscales for 9 clinical scales and 1 Treatment Consideration scale
Hand scoring:             Self-scoring answer sheet
General texts:            Morey (2003), Morey (2007a)
Computer interpretation:  Psychological Assessment Resources (Clinical: Morey; Corrections: Morey & Edens) www.parinc.com

The beta version of the PAI was administered to three groups of individuals: (1) community adults; (2) clinical patients; and (3) college students with either positive impression or malingering instructions. Similar item characteristics were assessed for the beta version of the PAI as were assessed with the alpha version. The final 344 items on the PAI represented the best balance of all these item characteristics, including the requirement that no item could be scored on more than one scale; there are no overlapping items on the PAI.
Normative data for the PAI were collected from three groups: (1) 1,462 community-dwelling adults, from which a census-matched subsample of 1,000 was selected; (2) 1,265 clinical patients from 69 clinical sites; and (3) 1,051
college students. The norms for the PAI are based on the 1,000 individuals from the census-matched sample. The skyline profile on the standard profile form demarcates two standard deviations above the mean in the clinical sample, allowing the clinician to compare the individual simultaneously with both the census-matched and clinical samples (see Figure 9.1).

Figure 9.1 PAI profile form (PAI Scales, Side A; raw scores and T scores for each scale).

Short Form of the PAI
The first 160 items of the PAI can be used to provide a reasonable estimate of 20 of the 22 scales, that is, all scales but Inconsistency (ICN) and Stress (STR). These estimates are possible because the items with the largest item-scale correlations were located at the
beginning of the test when the final version of the PAI was developed. Table 11.1 in the Personality Assessment Inventory Professional Manual (Morey, 1991, p. 142) provides the descriptive characteristics for these 160 items. The short form should be used only in the most unusual circumstances, and the estimated scores must be considered as generating only the most tentative interpretive hypotheses. Frazier, Naugle, and Haggerty (2006) found that agreement between the short and full forms of the PAI was affected adversely when the validity scales were elevated. They also noted that individuals with lower levels of ability were more likely to leave items missing and produce invalid protocols. These individuals are the very ones for whom the short form was designed. The hope was that it would provide information about the presence of psychopathology that otherwise might not be available from a self-report inventory.
PAI-A (Adolescent)
As a result of interest by professionals in using the PAI with adolescents in clinical settings, work was begun in 1999 on piloting an adolescent version of the inventory (Morey, 2007b). The intent of this work was to explore the applicability of an adolescent version that would closely parallel the adult version of the PAI. It would retain the structure and, as much as possible, the items of the adult version rather than be an entirely new version targeted specifically at an adolescent population. The development of the PAI-A involved
an adaptation of the items of the adult PAI so that the content was meaningful when applied to adolescents. The approach taken was a conservative one: the question was not whether the item was optimized to capture the experience of an adolescent, but rather whether the item would retain its original meaning when read by the adolescent. This conservative approach was merited in that the items on the adult PAI had been selected on the basis of numerous criteria, and the rewording or replacement of items could have significant and unanticipated effects on the final properties of the adolescent version and its interpretability as parallel to the adult version. Thus, these revisions included rewordings of relatively few items and involved close equivalents of the original wording. The next stage in development involved collecting a diverse and representative sample of adolescent patients, and determining the psychometric comparability of items on the adolescent and adult versions. A relatively small number of items were identified that appeared to have different characteristics in adolescent patients than in adult patients, and the decision was made to explore the impact of eliminating these items. On the basis of these analyses, items were removed in an effort to eliminate the most problematic items and yield an item distribution pattern that would closely parallel the adult instrument. On the basis of this strategy, the final PAI-A included 264 items. The PAI-A was then standardized using a census-matched normative sample of 707 adolescents aged 12 to 18, as well as
a diverse clinical sample of 1,160 patients in the same age range. The average internal consistency for the 22 scales was .79 in the community sample and .80 in the clinical sample, while the average test-retest reliability for these scales was .78 over an interval of approximately 18 days.
ADMINISTRATION
The first issue in the administration of the PAI is ensuring that the individual is invested in the process. Taking a few extra minutes to answer any questions the individual may have about why the PAI is being administered and how the results will be used will pay excellent dividends. The clinician should work diligently to make the assessment process a collaborative activity with the individual to obtain the desired information. This issue of therapeutic assessment (Finn, 1996; Fischer, 1994) was covered in depth in Chapter 2 (pp. 43-44).
Reading level is a crucial factor in determining whether a person can complete the PAI; inadequate reading ability (to be discussed) is a major cause of inconsistent patterns of item endorsement. Morey (1991) suggests that most individuals
who can read at the fourth-grade level can take the PAI with little or no difficulty because the items are written at a fourth-grade level or less. The PAI has the easiest reading level of any of the self-report inventories reviewed in this Handbook. As such, one reason for selecting the PAI is the larger number of clients who can complete it successfully compared with the MMPI-2 (Butcher, Dahlstrom, Graham, Tellegen, & Kaemmer, 1989) and the MCMI-III (Millon, Davis, & Millon, 1997), both of which are written at the eighth-grade level.
SCORING
Scoring the PAI is relatively straightforward either by hand or computer. A different answer sheet is used for hand scoring (Form HS Answer Sheet) and optical scanning (Form SS Answer Sheet), so the proper answer sheet must be selected for the method of scoring. If the PAI is administered by computer, the computer automatically scores it. If the individual's responses to the items have been placed on an answer sheet, these responses can be entered into the computer by the clinician for scoring or they can be hand scored. If the clinician enters the item responses into the computer for scoring, they should be double entered so that any data entry errors can be identified.
The first step in hand scoring is to examine the answer sheet carefully and indicate omitted items and double-marked items by drawing a line
through all four responses to these items with brightly colored ink. Also, cleaning up the answer sheet is helpful and facilitates scoring. Responses that were changed need to be erased completely if possible, or clearly marked with an "X" so that the clinician is aware that this response has not been endorsed by the client.
The PAI (Morey, 1991) and the NEO PI-R (Costa & McCrae, 1992) are the only self-report inventories reviewed in this Handbook that do not use "true/false" items. Both of these inventories have the same publisher (Psychological Assessment Resources), which may account for not using "true/false" items. The PAI uses a four-point Likert scale ranging from "false, not at all true," "slightly true," "mainly true," to "very true." These potential response options always are presented in this same order on the answer sheet. When "very true" is the scored direction for a specific item, the response options are scored as 0, 1, 2, or 3 ("very true"). When "false, not at all true" is the scored direction, the preceding four response options are scored as 3 ("false, not at all true"), 2, 1, or 0. Thus, the total raw score on an eight-item scale, which is the characteristic number of items on each subscale of the clinical scales, can range from 0 to 24. It is imperative
that the clinician realize that the total score is the sum of the response options for each scale, not the total number of items endorsed on the scale, which is the method for scoring the MCMI-III, MMPI-2, and MMPI-A.
The PAI is easier to score than other self-report inventories because no templates are required. The answer sheet, on which the person records his or her responses, is self-scoring. The items on each scale are designated by ruled and shaded boxes that are identified by scale abbreviations. The total raw score for each scale or subscale is entered in the corresponding box with the same abbreviation on Side B of the profile form. The subscales for the various scales on the PAI are plotted on Side B of the profile form. The total scores for all scales, which are the sum of the scores on the subscales, are entered on Side A of the profile form.
Although this process of hand scoring may sound somewhat complex, it is straightforward and can be carried out in 10 to 15 minutes. It is advisable to have another person double-check all the scoring and transferring of numbers to catch any scoring or transcription errors before the interpretive process begins.
ASSESSING VALIDITY
Figure 9.2 provides the flowchart for assessing the validity of this specific administration of the PAI, and the criteria for using this flowchart are provided in Table 9.3. The clinician is
reminded that the criteria provided in Table 9.3 are continuous, yet ultimately the decisions that must be made in implementation of the flowchart in Figure 9.2 are dichotomous. General guidelines will be provided for translating these continuous data into dichotomous decisions on the PAI, but these guidelines need to be considered within the constraints of this specific client and the circumstances for the evaluation.
Item Omissions
Morey (1991) recommends that more than 95% of the items should be endorsed if the PAI is to be interpreted; that is, no more than 17 items (.05 × 344 = 17.2) should be omitted. Table 9.3 shows that omitting 17 items is somewhere between the 93rd and 98th percentile in both normal and clinical samples. Morey also recommends that more than 80% of the items should be endorsed for any individual scale to be interpreted. The subscales of the clinical scales all have six to eight items, so the omission of two items from an eight-item subscale (6/8 = 75%) would mean that subscale should not be interpreted, although the entire scale could be interpreted if more than 80% of its items were endorsed.
Consistency of Item Endorsement
Consistency of item endorsement on the PAI is assessed by the Inconsistency Scale (ICN) and the Infrequency Scale (INF). The Inconsistency Scale (ICN) consists of 10 pairs of
shown on this form also allows the clinician to compare the individual's scores on each scale with the clinical sample.
APPLICATIONS
As a self-report inventory, the PAI is easily administered in a wide variety of settings and for a variety of purposes. Although the PAI was developed as a broadband measure of psychopathology in clinical settings, its use has gradually been extended to forensic and criminal settings, neuropsychological settings, and medical settings. One of the primary reasons for its rising popularity in these settings is that it is shorter and easier to read than the other self-report inventories.
Somewhat different issues must be considered in the administration of the PAI in personnel selection and forensic settings compared with the more usual clinical setting. These general issues were covered in Chapter 6 with the MMPI-2 (pp. 197-198) and will not be repeated here, but they should be consulted by anyone who is using the PAI in personnel selection or forensic settings for the first time.
One of the considerations in the use of any assessment test or technique in forensic settings is whether it will meet the legal standards for
admissibility. These considerations were raised in Chapter 8 with the MCMI-III (pp. 276-277) because various authors have opined that the MCMI-III does or does not meet these legal standards. Morey, Warner, and Hopwood (2007) have described how the PAI meets the legal standards for admissibility. In a survey of forensic psychologists, Lally (2003) reported that the PAI was rated as being acceptable for the evaluation of mental status at the time of the offense, risk for violence, risk for sexual violence, competency to stand trial, and malingering.
The PAI is increasingly being used in correctional settings because it is shorter and easier to read than other self-report inventories. Edens, Cruise, and Buffington-Vollum (2001) have provided a general overview of the issues involved in using the PAI in forensic and correctional settings. Edens and Ruiz (2006) reported that elevated scores on the Positive Impression Management (PIM > T56) scale in conjunction with elevated scores on the Antisocial Features (ANT > T59) scale predicted institutional misconduct among male inmates. Caperton, Edens, and Johnson (2004) found that elevated scores on the Antisocial Features (ANT > T69) scale identified sex offenders who were more likely to be management risks while in prison. In addition, Kucharski, Duncan, Egan, and Falkenbach (2006) found that three levels of psychopathy as measured by the PCL-R were not related to scores on the Negative Impression Management (NIM) scale, the Malingering Index (MAL),
or Rogers' discriminant function (RDF), and that the criminal defendants with higher levels of psychopathy were not more likely to malinger as measured by the PAI scales.
Finally, the PAI is being used in neuropsychological settings to evaluate whether the effects of brain injury have produced any psychological sequelae. Demakis et al. (2007) found that 34.7% of their sample of 95 individuals who had suffered a traumatic brain injury did not elevate any clinical scale on the PAI above a T score of 69. This number of unelevated profiles in individuals with brain injury is commonly found (cf. Warriner, Rourke, Velikonja, & Metham, 2003). The most common two-point codetypes were SCZ/BDL (Schizophrenia/Borderline Features), 18.9%; DEP/SCZ (Depression/Schizophrenia), 12.6%; and SOM/ANX (Somatic Complaints/Anxiety), 10.5%.
PSYCHOMETRIC FOUNDATIONS
Demographic Variables
Age
Morey (1996a) reported that age has minimal impact on the PAI scale scores. Individuals who
were 18 to 29 years of age elevated the Paranoia (PAR) scale 5 T points, the Borderline Features (BOR) scale 6 T points, the Antisocial Features (ANT) scale 7 T points, the Aggression (AGG) scale 5 T points, and the Stress (STR) scale 4 T points higher than other age groups. The subscales primarily affected by these elevations were Paranoia-Persecution (PAR-P), Borderline Features-Identity Problems (BOR-I), Antisocial Features-Stimulus Seeking (ANT-S), and Aggression-Verbal Aggression (AGG-V). There are no subscales for Stress (STR). Individuals who were 60 or more years of age scored about 4 T points lower on these same five scales. The subscales primarily lowered were Paranoia-Resentment (PAR-R), Borderline Features-Identity Problems (BOR-I), Antisocial Features-Antisocial Behavior (ANT-A), and Aggression-Physical Aggression (AGG-P).
Gender
Gender does not create any general issues in PAI interpretation because the items were selected to eliminate gender bias. Men elevated the Antisocial Features (ANT) scale by 3 T points more than women (Morey, 1996a). This elevation primarily impacted the Antisocial Features-Antisocial Behavior (ANT-A) subscale.
Education
The potential effects of education have not been investigated in any systematic manner on the PAI, although such research clearly is needed.
Ethnicity
The effects of ethnicity on PAI performance also have not been investigated in any systematic manner. Morey (1996a) reported that nonwhite individuals elevated the Paranoia (PAR) scale 6 T points compared with White individuals. This elevation primarily impacted the Paranoia-Hypervigilance (PAR-H) subscale.
Reliability
The PAI Professional Manual (Morey, 1991, Appendix E) reported the reliability data for 75 community-dwelling adults who were retested after an average of 24 days and 80 undergraduate students who were retested at 28 days. The test-retest correlations ranged from .85 to .94 in the adult sample and from .66 to .90 in the student sample across the 11 clinical scales. The standard error of measurement ranges from 2.8 to 4.6 T points for these 11 clinical scales; that is, the individual's "true" score on the clinical scales will be within ±3 to 5 T points of the obtained score two-thirds of the time.
Codetype Stability
There are limited empirical data that indicate how consistently individuals will obtain the same two highest clinical scales on two successive administrations of the PAI. Codetype
stability was examined in all 155 individuals who were part of the examination of retest reliability just described. When only the single highest scale was examined across the two administrations, 57.4% had the same high-point scale. When this analysis was limited only to those individuals with significant elevations (20/155), 76.9% had the same high-point scale. These data should only be considered to be an estimate of the actual codetype stability of the PAI. Because only a single high-point scale was considered, there has to be a lower rate of stability when the two highest scales are required to be the same. On the other hand, clinical samples would produce higher elevations on the PAI clinical scales than these normal individuals, and the preceding data suggest that concordance rates would be higher for more elevated profiles.
CONCLUDING COMMENTS
The PAI (Morey, 1991) is the newest of the self-report inventories reviewed in this Handbook. The PAI is gradually gaining a wide base of usage because it is shorter than all other self-report inventories except the MCMI-III and it has the lowest reading level of any of them. There has been a substantial increase in research with the PAI in each ensuing year that continues to validate its use in a number of different settings.
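The scoring and item-omission rules described earlier in this chapter lend themselves to a small worked example. The following Python sketch is illustrative only and is not the publisher's scoring software: the function names are hypothetical, responses are assumed to be coded 0 to 3 in answer-sheet order ("false, not at all true" through "very true"), and omitted items are assumed to be recorded as None and simply left out of the sum, a point on which the chapter is silent.

```python
from typing import Optional, Sequence

TOTAL_ITEMS = 344                              # PAI item count given in the chapter
MAX_OMITTED_ITEMS = int(0.05 * TOTAL_ITEMS)    # .05 x 344 = 17.2, so at most 17 omissions
MIN_SCALE_COMPLETION = 0.80                    # more than 80% of a scale's items must be answered


def item_score(response: int, reverse_keyed: bool = False) -> int:
    """Score one response coded 0..3 in answer-sheet order.

    When "very true" is the scored direction the code is used as is;
    when "false, not at all true" is the scored direction it is reversed.
    """
    if response not in (0, 1, 2, 3):
        raise ValueError("responses must be coded 0, 1, 2, or 3")
    return 3 - response if reverse_keyed else response


def raw_scale_score(responses: Sequence[Optional[int]],
                    reverse_keyed: Sequence[bool]) -> int:
    """Sum the item scores for one scale, skipping omitted (None) items (assumption)."""
    if len(responses) != len(reverse_keyed):
        raise ValueError("responses and keying must have the same length")
    return sum(item_score(r, k)
               for r, k in zip(responses, reverse_keyed)
               if r is not None)


def scale_interpretable(responses: Sequence[Optional[int]]) -> bool:
    """True when more than 80% of the scale's items were answered."""
    if not responses:
        raise ValueError("a scale needs at least one item")
    answered = sum(r is not None for r in responses)
    return answered / len(responses) > MIN_SCALE_COMPLETION


def protocol_interpretable(n_omitted: int) -> bool:
    """True when no more than 17 of the 344 items were omitted."""
    return n_omitted <= MAX_OMITTED_ITEMS
```

For an eight-item subscale, the sketch reproduces the 0 to 24 raw-score range given in the Scoring section, and it flags a subscale with two omitted items (6/8 = 75%) as not interpretable, in line with the Item Omissions guideline.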
REFERENCES
American Psychiatric Association. (2000). Diagnostic and statistical manual of mental disorders (4th ed., text rev.). Washington, DC: Author.
Bagby, R. M., Nicholson, R. A., Bacchiochi, J. R., Ryder, A. G., & Bury, A. S. (2002). The predictive capacity of the MMPI-2 and PAI validity scales and indexes to detect coached and uncoached feigning. Journal of Personality Assessment, 78, 69-86.
Baity, M. R., Siefert, C. J., Chambers, A., & Blais, M. A. (2007). Deceptiveness with the PAI: A study of naïve faking with psychiatric inpatients. Journal of Personality Assessment, 88, 16-24.
Butcher, J. N., Dahlstrom, W. G., Graham, J. R., Tellegen, A. M., & Kaemmer, B. (1989). MMPI-2: Manual for administration and scoring. Minneapolis: University of Minnesota Press.
Caperton, J. D., Edens, J. F., & Johnson, J. K. (2004). Predicting sex offender institutional adjustment and treatment compliance using the PAI. Psychological Assessment, 16, 187-191.
Cashel, M. L., Rogers, R., & Sewell, K. (1995). The PAI and the detection of defensiveness. Assessment, 2, 333-342.
Clark, M. E., Gironda, R. J., & Young, R. W. (2003). Detection of back random responding: Effectiveness of MMPI-2 and PAI validity indices. Psychological Assessment, 15, 223-234.
Costa, P. T., Jr., & McCrae, R. R. (1992). Revised NEO Personality Inventory (NEO PI-R) and NEO Five-Factor Inventory (NEO-FFI) professional manual. Odessa, FL: Psychological Assessment Resources.
Demakis, G. J., Hammond, F., Knotts, A., Cooper, D. B., Clement, P., Kennedy, J., et al. (2007). The PAI in individuals with traumatic brain injury. Archives of Clinical Neuropsychology, 22, 123-130.
Edens, J. F., Cruise, K. R., & Buffington-Vollum, J. K. (2001). Forensic and correctional applications of the PAI. Behavioral Sciences and the Law, 19, 519-543.
Edens, J. F., Poythress, N. G., & Watkins-Clay, M. M. (2007). Detection of malingering in psychiatric unit and general population prison inmates: A comparison of the PAI, SIMS, and SIRS. Journal of Personality Assessment, 88, 33-42.
Edens, J. F., & Ruiz, M. A. (2006). On the validity of validity scales: The importance of defensive responding in the prediction of institutional misconduct. Psychological Assessment, 18, 220-224.
Finn, S. (1996). Using the MMPI-2 as a therapeutic intervention. Minneapolis: University of Minnesota Press.
Fischer, C. T. (1994). Individualizing psychological assessment. Hillsdale, NJ: Erlbaum.
Frazier, T. W., Naugle, R. I., & Haggerty, K. A. (2006). Psychometric adequacy and comparability of the short and full forms of the PAI. Psychological Assessment, 18, 324-333.
Hopwood, C. J., Morey, L. C., Rogers, R., & Sewell, K. (2007). Malingering on the PAI: Identification of specific feigned disorders. Journal of Personality Assessment, 88, 43-48.
Kucharski, L. T., Duncan, S., Egan, S. S., & Falkenbach, D. M. (2006). Psychopathy and malingering of psychiatric disorder in criminal defendants. Behavioral Sciences and the Law, 24, 633-644.
Kucharski, L. T., Toomey, J. P., Fila, K., & Duncan, S. (2007). Detection of malingering of psychiatric disorder with the PAI: An investigation of criminal defendants. Journal of Personality Assessment, 88, 25-32.
Lally, S. J. (2003). What tests are acceptable for use in forensic evaluations? A survey of experts. Professional Psychology: Research and Practice, 34, 491-498.
Millon, T., Davis, R., & Millon, C. (1997). MCMI-III manual (2nd ed.). Minneapolis, MN: National Computer Systems.
Morey, L. C. (1991). Personality Assessment Inventory professional manual. Odessa, FL: Psychological Assessment Resources.
Morey, L. C. (1996a). An interpretive guide to the PAI. Odessa, FL: Psychological Assessment Resources.
Morey, L. C. (1996b). PAI structural summary. Odessa, FL: Psychological Assessment Resources.
Morey, L. C. (1999). PAI interpretive explorer module manual. Odessa, FL: Psychological Assessment Resources.
Morey, L. C. (2003). Essentials of PAI assessment. Hoboken, NJ: Wiley.
Morey, L. C. (2007a). An interpretive guide to the PAI. Odessa, FL: Psychological Assessment Resources.
Morey, L. C. (2007b). Personality Assessment Inventory-Adolescent professional manual. Odessa, FL: Psychological Assessment Resources.
Morey, L. C., & Hopwood, C. J. (2004). Efficiency of a strategy for detecting back random responding on the PAI. Psychological Assessment, 16, 197-200.
Morey, L. C., Warner, M. B., & Hopwood, C. J. (2007). Personality Assessment Inventory: Issues in legal and forensic settings. In A. M. Goldstein (Ed.), Forensic psychology: Emerging topics and expanding roles (pp. 97-126). Hoboken, NJ: Wiley.
Peebles, J., & Moore, R. J. (1998). Detecting socially desirable responding with the PAI: The Positive Impression Management scale and the Defensiveness Index. Journal of Clinical Psychology, 54, 621-628.
Rogers, R., Sewell, K. W., Morey, L. C., & Ustad, K. L. (1996). Detection of feigned mental disorders on the Personality Assessment Inventory: A discriminant analysis. Journal of Personality Assessment, 67, 629-640.
Warriner, E. M., Rourke, B. P., Velikonja, D., & Metham, L. (2003). Subtypes of emotional and behavioral sequelae in patients with traumatic brain injury. Journal of Clinical and Experimental Neuropsychology, 25, 904-917.

Chapter 6
MINNESOTA MULTIPHASIC PERSONALITY INVENTORY-2
The Minnesota Multiphasic Personality Inventory-2 (MMPI-2; Butcher, Dahlstrom, Graham, Tellegen, & Kaemmer, 1989; Butcher et al., 2001) is a broadband measure of the major dimensions of psychopathology found in Axis I disorders and some Axis II disorders of the DSM-IV-TR (American Psychiatric Association, 2000). The MMPI-2 consists of 9 validity and 10 clinical scales in the basic profile, along with 15 content scales, 9 restructured clinical scales, and 20 supplementary scales (see Table 6.1). There also are subscales for most of the clinical and content
scales, with easily over 120 scales that can be scored and interpreted on the MMPI-2. Table 6.2 provides general information on the MMPI-2.
HISTORY
Hathaway and McKinley (1940) sought to develop a multifaceted or multiphasic personality inventory, now known as the Minnesota Multiphasic Personality Inventory (MMPI), that would surmount the shortcomings of the previous personality inventories. These shortcomings included (a) relying on how the researcher thought individuals should respond to the content of items rather than validating how they actually responded to the items; (b) using only face-valid items whose purpose or intent was easily understood; and (c) failing to assess whether individuals were trying to distort their responses to the items in some manner. Instead of using independent sets of tests, each with a specific purpose, Hathaway and McKinley included in a single inventory a wide sampling of behavior of significance to psychologists. They wanted to create a large pool of items from which various scales could be constructed, in the hope of evolving a greater variety of valid personality descriptions than was currently available.
MMPI (Original Version)
To this end, Hathaway and McKinley (1940) assembled more than 1,000 items from psychiatric textbooks, other personality inventories, and clinical
experience. The items were written as declarative statements in the first-person singular, and most were phrased in the affirmative. Using a subset of 504 items, Hathaway and McKinley constructed a series of quantitative scales that could be used to assess various categories of psychopathology. The items had to be answered differently by the criterion group (e.g., hypochondriacal patients) as compared with normal groups. Since their approach was strictly empirical and no theoretical rationale was posited as the basis for accepting or rejecting items on a specific scale, it is not always possible to discern why a particular item distinguishes the criterion group from the normal group. Rather, items were selected solely because the criterion group answered them differently than other groups.
Table 6.1 Minnesota Multiphasic Personality Inventory-2 (MMPI-2) scales
Validity Scales
  ?         Cannot Say
  VRIN      Variable Response Consistency
  TRIN      True Response Consistency
  F         Infrequency
  FB        Back Infrequency
  Fp        Infrequency Psychopathology
  L         Lie
  K         Correction
  S         Superlative
Clinical Scales
  1 (Hs)    Hypochondriasis
  2 (D)     Depression
  3 (Hy)    Hysteria
  4 (Pd)    Psychopathic Deviate
  5 (Mf)    Masculinity-Femininity
  6 (Pa)    Paranoia
  7 (Pt)    Psychasthenia
  8 (Sc)    Schizophrenia
  9 (Ma)    Hypomania
  0 (Si)    Social Introversion
Restructured Clinical Scales
  RCd       Demoralization
  RC1som    Somatization
  RC2lpe    Low Positive Emotionality
  RC3cyn    Cynicism
  RC4asb    Antisocial Behavior
  RC6per    Persecutory Ideas
  RC7dne    Dysfunctional Negative Emotions
  RC8abx    Aberrant Experiences
  RC9hpm    Hypomanic Activation
Content Scales
  ANX       Anxiety
  FRS       Fears
  OBS       Obsessions
  DEP       Depression
  HEA       Health Concerns
  BIZ       Bizarre Mentation
  ANG       Anger
  CYN       Cynicism
  ASP       Antisocial Practices
  TPA       Type A
  LSE       Low Self-Esteem
  SOD       Social Discomfort
  FAM       Family Problems
  WRK       Work Interference
  TRT       Negative Treatment Indicators
PSY-5 Scales
  AGGR      Aggression
  PSYC      Psychoticism
  DISC      Disconstraint
  NEGE      Negative Emotionality
  INTR      Introversion/Low Positive Emotionality
Supplementary Scales
 Broad Personality Characteristics
  A         Anxiety
  R         Repression
  Es        Ego Strength
  Do        Dominance
  Re        Social Responsibility
 Generalized Emotional Distress
  Mt        College Maladjustment
  PK        PTSD-Keane
  MDS       Marital Distress
 Behavioral Dyscontrol
  Ho        Hostility
  O-H       Overcontrolled Hostility
  MAC-R     MacAndrew Alcoholism-Revised
  AAS       Addiction Admission
  APS       Addiction Potential
 Gender Role
  GF        Gender Role-Feminine
  GM        Gender Role-Masculine
For each of the criterion groups and the normative group, the frequency of "True" and "False" responses was calculated for each item. An item was tentatively selected for a scale if the difference in frequency of response between the criterion group and the normative group was at least twice the standard error of the proportions of true/false responses of the two groups being compared. Having selected items according to this procedure, Hathaway and McKinley then eliminated some of them for various reasons. First, the frequency of the criterion group's response was required to be greater than 10% for nearly all items; those items that yielded infrequent deviant response rates from the criterion group were excluded even if
they were highly significant statistically because they represented so few criterion cases. Additionally, items whose responses appeared to reflect biases on variables such as marital status or socioeconomic status were excluded. Evaluation of several methods of weighting individual items showed no advantage over using unweighted items. Therefore, each item simply received a weight of "one" in deriving a total score. In other words, a person's score on any MMPI scale is equal to the total number of items that the individual answers in the same manner as the criterion group.
Table 6.2 Minnesota Multiphasic Personality Inventory-2 (MMPI-2)
  Authors:                  Butcher, Dahlstrom, Graham, Tellegen, and Kaemmer
  Published:                1989
  Edition:                  2nd
  Publisher:                Pearson Assessments
  Website:                  www.PearsonAssessments.com
  Age Range:                18+
  Reading Level:            6th-8th grade
  Administration Formats:   paper/pencil, computer, CD, cassette
  Additional Languages:     Spanish, Hmong, and French for Canada
  Number of Items:          567
  Response Format:          True/False
  Administration Time:      60-90 minutes
  Primary Scales:           9 Validity, 10 Clinical, 15 Content
  Additional Scales:        5 PSY-5, 9 Restructured Clinical, 20 Supplementary
  Hand Scoring:             Templates
  General Texts:            Friedman et al. (2001); Graham (2006); Greene (2000); Nichols (2001)
  Computer Interpretation:  Caldwell Report (Caldwell); Pearson Assessments (Butcher); Psychological Assessment Resources (Greene)
The empirical approach to item selection used by Hathaway and McKinley, in fact, freed them of any concerns about how any individual interprets specific items because it assumes that the individual's self-report is just that and makes no a priori assumptions about the relationships between the individual's self-report and the individual's behavior. Items are selected for inclusion in a specific scale only because the
criterion group answered the items differently than the normative group irrespective of whether the item content is actually an accurate description of the criterion group. Any relationship between individuals' responses on a given scale and their behavior must be demonstrated empirically.
MMPI-2 (Restandardized Version)
The MMPI-2 (Butcher et al., 1989, 2001) represents the restandardization of the MMPI that was needed to provide current norms for the inventory, develop a nationally representative and larger normative sample, provide appropriate representation of ethnic minorities, and update item content where needed. Continuity between the MMPI and the MMPI-2 was maintained because new criterion groups and item derivation procedures were not used on the standard validity and clinical scales. Thus, the items on the validity and clinical scales of the MMPI are essentially unchanged on the MMPI-2 except for the elimination of 13 items based on item content and the rewording of 68 items.
In the development of the MMPI-2, the Restandardization Committee (Butcher et al., 1989) started with the 550 items on the original MMPI; that is, they first deleted the 16 repeated items. They reworded 141 of these 550 items to
eliminate outdated and sexist language and to make these items more easily understood. Rewording these items did not change the correlations of the items with the total scale score in most cases (Ben-Porath & Butcher, 1989). Many of these items were omitted on the original MMPI because individuals did not understand them. Greene (1991, p. 57) provides examples of these items, such as playing "drop the handkerchief." The Restandardization Committee then added 154 provisional items that resulted in the 704 items on Form AX, which was used to collect the normative data for the MMPI-2.
When finalizing the items to be included on the MMPI-2, the Restandardization Committee deleted 77 items from the original MMPI in addition to the 13 items deleted from the standard validity and clinical scales and the 16 repeated items. Consequently, most special and research scales that have been developed on the MMPI are still capable of being scored unless the scale has an emphasis on religious content or the items are drawn predominantly from the last 150 items on the original MMPI. The Restandardization Committee included 68 of the 141 items that had been rewritten, and they incorporated 107 of the provisional items to assess major content areas that were not covered in the original MMPI item pool. The rationale for including and dropping items
from Form AX that resulted in the 567 items on the MMPI-2 has not been made explicit.
The MMPI-2 was standardized on a sample of 2,600 individuals who resided in seven different states (California, Minnesota, North Carolina, Ohio, Pennsylvania, Virginia, and Washington) to reflect national census parameters on age, marital status, ethnicity, education, and occupational status. The normative sample for the MMPI-2 varies significantly from the original normative sample for the MMPI in several areas: years of education, representation of ethnic minorities, and occupational status. The individuals in the normative sample for the MMPI-2 also are more representative of the United States as a whole because national census parameters were used in their collection. However, they still varied from the census parameters on years of education and occupational status. The potential impact of this higher level of education and occupation in the MMPI-2 normative sample on codetype and scale interpretation has been a focus of concern (Caldwell, 1997c; Helmes & Reddon, 1993). However, Schinka and LaLone (1997) compared the MMPI-2 restandardization sample with a census-matched subsample created within it and found only one difference that exceeded 3 T score points between these two samples on the standard validity and clinical scales, content scales, and supplementary scales.
The extant literature that has examined the empirical correlates of MMPI-2 scales and
codetypes has been consistent with the correlates reported for their MMPI counterparts (Archer, Griffin, & Aiduk, 1995; Graham, Ben-Porath, & McNulty, 1999). It appears safe to assume that the correlates of well-defined MMPI-2 codetypes (the two highest clinical scales composing the codetype should be at least five T points higher than the next highest clinical scale) and the individual validity and clinical scales will be very similar to those for the MMPI. The data are less clear for MMPI-2 codetypes that are not well-defined, although it still will be safe to interpret the individual validity and clinical scales in these codetypes using MMPI correlates given the minimal change at the scale level.
New sets of scales have been developed with the MMPI-2 item pool: content scales (Butcher, Graham, Williams, & Ben-Porath, 1990); content component scales (Ben-Porath & Sherwood, 1993); personality psychopathology five scales (PSY-5: Harkness, McNulty, Ben-Porath, & Graham, 2002); and restructured clinical scales (Tellegen et al., 2003).
Several major reviews of the MMPI-2 (Butcher, Graham, & Ben-Porath, 1995; Butcher & Rouse, 1996; Caldwell, 1997c; Greene, Gwin, & Staal, 1997; Helmes & Reddon, 1993) provide summaries from a variety of perspectives on this
venerable instrument. These reviews provide the interested reader with an excellent starting point for looking at the current status of the MMPI-2. Butcher et al. (1995) and Greene et al. (1997) also outline the general steps that researchers need to follow and issues that need to be addressed in conducting research with the MMPI-2. It is to be hoped that researchers will heed the advice dispensed in these reviews to enhance the quality of the data that are being collected.
Unlike the MMPI, which was used with all ages, the MMPI-2 is to be used only with adults 18 years of age and older. Adolescents are to be tested with the MMPI-A (Butcher et al., 1992), which is designed specifically for them (see Chapter 7).
ADMINISTRATION
The first requirement in the administration of the MMPI-2 is ensuring that the individual is invested in the process. It will pay excellent dividends to spend a few extra minutes answering any questions the individual may have about why the MMPI-2 is being administered and how the results will be used. The clinician should work diligently to make the assessment process a collaborative activity with the individual to obtain the desired information. This issue of therapeutic assessment (Finn, 1996; Fischer, 1994) was covered in more depth in Chapter 2 (pp. 43-44).
Reading level is a crucial factor in determining whether a
person can complete the MMPI-2; inadequate reading ability is a major cause of inconsistent patterns of item endorsement to be discussed later. Butcher et al. (1989) suggest that most clients who have had at least 8 years of formal education can take the MMPI-2 with little or no difficulty because the items are written on an eighth-grade level or less.
A number of authors (Dahlstrom, Archer, Hopkins, Jackson, & Dahlstrom, 1994; Paolo, Ryan, & Smith, 1992; Schinka & Borum, 1993) have studied the readability of MMPI-2 and MMPI-A items. There was general concurrence that the average readability of the MMPI-2 and MMPI-A is in the range of fifth to sixth grade. The scales requiring the highest reading levels were 9 (Ma: Hypomania), the three content scales of Antisocial Practices (ASP), Cynicism (CYN), and Type A (TPA), and several of the Harris and Lingoes (1955) subscales: Hy2 (Need for Affection), Pa3 (Naivete), Sc5 (Lack of Ego Mastery, Defective Inhibition), Ma1 (Amorality), Ma2 (Psychomotor Acceleration), Ma3 (Imperturbability), and Ma4 (Ego Inflation). On most of these scales, at least 25% of their items required more than an eighth-grade reading level.
These estimates of the required grade level are conservative because they are based on assessing the readability of individual MMPI-2 items or groups of items. They are not based on the difficulty of understanding what is meant by saying either "true" or "false" to a specific item. The reader can assess this problem directly by trying to understand exactly what is meant by saying "false" to an MMPI-2 item that is worded in the negative. What do individuals actually mean when they say "false" to an item such as "I do not always have pain in my back"? Schinka and Borum did suggest that individuals be asked to read MMPI-2 items 114, 226, and 445 if they have completed less than a 10th-grade education to determine whether their reading skills are adequate. Dahlstrom et al. (1994) also noted that the instructions for the MMPI-2 actually were more difficult than the items on the test, so clinicians should be sure the individual fully understands them.
SCORING
Scoring the MMPI-2 is relatively straightforward either by hand or computer. If the MMPI-2 is administered by computer, the computer automatically scores it. If the individual's responses to the items have been placed on an answer sheet, these responses can be entered into the computer by the clinician for scoring or they can be hand-scored. If the clinician enters the item responses into the computer for scoring, they should be double entered so that any data entry errors can be identified.
The first step in hand-scoring is to examine the answer sheet carefully and indicate omitted items and double-marked items by drawing a line with
brightly colored ink through both the "true" and "false" responses to these items. Also, cleaning up the answer sheet helps facilitate scoring. Responses that were changed need to be erased completely if possible, or clearly marked with an "X" so that the clinician is aware that this response has not been endorsed by the client.
There is one scale that must always be scored without a template. The Cannot Say (?) scale score is the total number of items not marked and double marked. All the other scales are scored by placing a plastic template over the answer sheet with a small box drawn at the scored (deviant) response, either "true" or "false," for each item on the scale. The total number of such items marked equals the client's raw score for that scale; this score is recorded in the proper space on the answer sheet. One scale, Scale 5 (Mf: Masculinity-Femininity), is scored differently for men and women, and unusually high or low scores on this scale might indicate that the wrong template was used. Among women, a raw score less than 30 is unusual, and such raw scores should at least arouse a suspicion that the wrong template was used in scoring the scale. All scoring templates are made of plastic and must be kept away from heat.
Plotting the profile is the next step in the scoring process. In essence, the clinician transfers all the raw scores from the answer sheet to the appropriate column of the profile sheet (see Figure 6.1). Some precautions must be taken and data
calculations performed. First, separate profile sheets are used for men and women as with the scoring templates for Scale 5; an unusually high or low score plotted for Scale 5 should alert the clinician to the possibility that the wrong profile sheet was selected. Second, each column on the profile sheet is used to represent the raw scores for a specific scale. Each dash represents a raw score of 1 with the larger dashes marking increments of 5. Thus, the clinician notes the individual's raw score on the scale being plotted and makes a point or dot at the appropriate dash. Once the clinician has plotted the individual's scores on the eight validity scales, a solid line is drawn to connect them. The raw score on the Cannot Say (?) scale is merely recorded in the proper space in the lower left-hand corner of the profile sheet.
A similar procedure is followed to plot the 10 clinical scales except that five of the clinical scales (1 [Hs: Hypochondriasis], 4 [Pd: Psychopathic Deviate], 7 [Pt: Psychasthenia], 8 [Sc: Schizophrenia], and 9 [Ma: Hypomania]) are K-corrected; that is, a fraction of K is added to the raw score before the individual's score is plotted. For these five scales that
Spike 3 codetypes. A client with a T score of 60 on the F scale is almost 15 points higher than the mean for Spike 3 codetypes, and nearly 40 points lower
than the mean for 6-8/8-6 codetypes. A T score of 60 is unusual in both of these codetypes; in the former it is higher than expected and in the latter it is much lower than expected. Similar variations can be seen in the T scores for Scales 2 (D: Depression) and 8 (Sc: Schizophrenia).
A codetype analysis can be further refined by considering additional clinical scales to create three- and four-point codetypes. A number of two-point codetypes have frequent three-point variants that should be considered in the interpretation of the MMPI-2, such as variants of 2-4/4-2 (2-4/4-2-(3), 2-4/4-2-(7), 2-4/4-2-(8)) and 2-7/7-2 (2-7/7-2-(1), 2-7/7-2-(3), 2-7/7-2-(8), 2-7/7-2-(0)) codetypes. Again, the interpretation of a client's score on a given scale will change as the prototypic score changes in the three-point codetypes within a particular group.
The final "group" with which the MMPI-2 can be compared in the interpretive process is the individual, or idiographic, interpretation. In this comparison, the relative elevations of the scales become important because they indicate which content domains are more or less important for this particular individual. An individual who has T scores of 75 and 60 on the content scales of Depression (DEP) and Anxiety (ANX), respectively, is saying that symptoms of depression are more of a problem than symptoms of anxiety. The MMPI-2 content (Butcher et al., 1990) and content component
(Ben-Porath & Sherwood, 1993) scales are an excellent means of developing such an idiographic interpretation of an individual's MMPI-2 profile, because the various content domains can be juxtaposed so that the clinician can compare them directly.
APPLICATIONS
As a self-report inventory, the MMPI-2 is easily administered in a wide variety of settings and for a variety of purposes. Although the MMPI was developed originally in a clinical setting with a primary focus on establishing a diagnosis for the person (Hathaway & McKinley, 1940), its uses quickly broadened to include more general descriptions of the behavior and symptoms of most forms of psychopathology (cf. Dahlstrom et al., 1972). This use was followed by extensions into the screening of applicants in personnel selection settings and a multitude of uses in forensic settings.
Somewhat different issues must be considered in the administration of the MMPI-2 in personnel selection and forensic settings compared with the more usual clinical setting. First, not only is the administration not going to be therapeutic, the MMPI-2 results have the potential to cause a fairly negative impact on the individual. The individual may not be selected in a personnel-screening setting or be less likely to be considered for custody of children because of the acknowledgment of significant psychopathology.
Second, the assessment of validity is particularly important because different forensic settings can have a significant impact on the data that are obtained from an individual. Items particularly sensitive to this impact are likely to be those items about which an individual is not sure or ambivalent in responding. In civil forensic settings such as personal injury, workers' compensation, and insurance disability claims, this impact is likely to be in the opposite direction from that in parenting examinations or personnel selection. Portraying oneself as being more impaired in cases for civil damages is likely to benefit an individual's claim; portraying oneself as being less impaired and more psychologically healthy is likely to benefit an individual's chances of being selected, or at least not screened out, in a personnel-screening setting. Consequently, it behooves the forensic psychologist to know what types of MMPI-2 scores and profiles are to be expected in every forensic setting.
There also are different expectations of whether to report problematic behaviors and symptoms in criminal cases. Individuals who are being evaluated for competency to stand trial or for the introduction of mitigating circumstances during the sentencing phase after a conviction for murder versus probation or parole should have different expectations
of the problematic behaviors and symptoms of psychopathology that are, or are not, to be reported. Individuals in the former forensic contexts would be expected to report any and all problematic behaviors or symptoms that might be in any way relevant to their circumstances, while individuals in the latter would not be expected to report any problematic behaviors or symptoms.
Third, in a forensic setting it must be kept in mind that the MMPI-2 is being used to address a specific psycholegal issue rather than as a general screen for psychopathology. Thus, the interpretations provided of the MMPI-2 must be relevant to this psycholegal issue. For example, the mere presence of psychopathology as indicated by elevation of several clinical scales on the MMPI-2 may not be directly relevant to the psycholegal issue of quality of parenting skills in a child-custody examination or the ability to understand legal proceedings in a competency hearing.
Finally, whether it is the prosecution (plaintiff) or the defense (defendant) that has retained the forensic psychologist also may impact the problematic behaviors and symptoms reported by an individual, but there are minimal empirical data on this point. Hasemann (1997) provided data on workers' compensation claimants who were evaluated by forensic psychologists for both the defense and the plaintiff. The claimant reported more symptoms and distress to the forensic psychologist retained by the defense
attorney. Consequently, some of the differences in examinations performed by forensic psychologists on the same individual may reflect that he actually describes problematic behaviors and symptoms differently depending on whether he believes that the forensic psychologist is likely to be sensitive or insensitive to his self-report. The underlying heuristic of an individual is likely to be that the opposing forensic psychologist will require more proof to be able or willing to perceive and report an individual as being impaired. These results suggest that being examined by the plaintiff's expert and then by the defense's expert over the same psycholegal issue should be considered as different forensic contexts rather than as the same one.
PSYCHOMETRIC FOUNDATIONS
Demographic Variables
Age
Specific norms are not provided by age on the MMPI-2, even though it is well known that there are substantial effects of age below the age of 20. These age effects are reflected in the development of separate sets of adolescent norms for the original MMPI (Marks & Briggs, 1972), and the restandardization of a different form of the MMPI for adolescents
(MMPI-A: Butcher et al., 1992). Colligan and his colleagues (Colligan, Osborne, Swenson, & Offord, 1983, 1989) found substantial effects of age on MMPI performance in their contemporary normative sample with differences of 10 or more T points between 18- and 19-year-olds and 70-year-olds on Scales L (Lie) and 9 (Ma: Hypomania).
Several MMPI-2 scales demonstrate differences of nearly 5 T points between 20-year-olds and 60-year-olds (Butcher et al., 1989, 2001; Caldwell, 1997b, 1997c; Greene & Schinka, 1995) with scores on Scales L (Lie: women only), 1 (Hs: Hypochondriasis), and 3 (Hy: Hysteria) increasing and Scales 4 (Pd: Psychopathic Deviate) and 9 (Ma: Hypomania) decreasing with age. Given that these age comparisons involve different cohorts, it is not possible to know whether these effects actually reflect the influence of age or simply differences between the cohorts. Butcher et al. (1991) found few effects of age in older (>60) men and they saw no reason for age-related norms in these men.
Gender
Gender does not create any general issues in MMPI-2 interpretation because separate norms (profile forms) are used for men and women. Any gender differences in how individuals responded to the items on each scale are removed when the raw scores are converted to T scores. Consequently, men and women with a T score of 70 on Scale 2 (D: Depression) are one standard deviation above the mean, although women have
endorsed more items (30) than men (28). When the MMPI-2 is computer scored by Pearson Assessments, unigender norms also are provided for each scale. Even a cursory perusal of these unigender norms will show that men and women have very similar scores on all MMPI-2 scales except for those three scales specifically related to gender (Scale 5 [Mf: Masculinity-Femininity]; Gender-Role Feminine [GF]; Gender-Role Masculine [GM]).
Education
The potential effects of education have not been investigated in any systematic manner either on the MMPI or the MMPI-2, although such research is needed. When the men and women in the MMPI-2 normative group with less than a high school education were contrasted with men and women with postgraduate education (Dahlstrom & Tellegen, 1993, pp. 58-59), the differences on the following scales exceeded 5 T points: L (Lie: women only), F (Infrequency), K (Correction), 5 (Mf: Masculinity-Femininity), and 0 (Si: Social Introversion). Men and women with less than a high school education had a higher score in all these comparisons except for Scales K (Correction) and 5 (Mf: Masculinity-Femininity).
When psychiatric patients with 8 years or less of education were contrasted with patients with 16 or more years of education (Caldwell, 1997b), the differences ranged from 4 to 8 T points on all the scales except 3 (Hy: Hysteria). The individuals with less education had higher scores in all these comparisons except for Scales K
(Correction) and 5 (Mf: Masculinity-Femininity).
Occupation
There do not appear to be any systematic effects for occupation or income within the MMPI-2 normative group (Dahlstrom & Tellegen, 1993; Long, Graham, & Timbrook, 1994). There have been no studies of the effects of these two factors in psychiatric patients.
Ethnicity
The effects of ethnicity on MMPI performance have been reviewed by Dahlstrom, Lachar, and Dahlstrom (1986) and Greene (1987), and they concluded that there is not any consistent pattern of scale differences between any two ethnic groups. A similar conclusion has been offered in several other reviews of the effect of ethnicity on MMPI-2 performance (Greene, 1991, 2000; Hall, Bansal, & Lopez, 1999).
Multivariate regressions of age, education, gender, ethnicity, and occupation on the standard validity and clinical scales in the MMPI-2 normative group (Dahlstrom & Tellegen, 1993) and psychiatric patients (Caldwell, 1997 [age, education,
and gender only]; Schinka, LaLone, & Greene, 1998) have shown that the percentage of variance accounted for by these factors does not exceed 10%. Such small percentages of variance are unlikely to impact the interpretation of most MMPI-2 profiles. The one exception is Scale 5 (Mf: Masculinity-Femininity) in which slightly over 50% of the variance is accounted for by gender.
In summary, demographic variables appear to have minimal impact on the MMPI-2 profile in most individuals. It may be important to monitor the validity of the MMPI-2 profile more closely in persons with limited education and lower occupations. A major reason that demographic effects are seen in these persons may simply reflect that the reading level of the MMPI-2 is approximately the eighth grade (Butcher et al., 1989, 2001; Greene, 2000).
Reliability
The MMPI-2 Manual (Butcher et al., 1989, 2001, Appendix E) reports the reliability data for 82 men and 111 women who were retested after an average of 8.58 days. The test-retest correlations ranged from .54 to .93 across the 10 clinical scales and averaged .74. The standard error of measurement is about 5 T points for the clinical scales; that is, the individual's true score on the clinical scales will be within ±5 T
points two-thirds of the time. The test-retest correlations for the 15 content scales range from .77 to .91 and averaged .85. The standard error of measurement is about 4 T points for the content scales; that is, the individual's true score on the content scales will be within ±4 T points two-thirds of the time.
Codetype Stability
There is little empirical data indicating how consistently clients will obtain the same codetype on two successive administrations of the MMPI or the MMPI-2. The research on the stability of the MMPI has historically focused on the reliability of individual scales, as discussed, which leaves unanswered whether clients' codetypes have remained unchanged. There would be at least some cause for concern if a client obtained a 4-9/9-4 codetype on one occasion and on a second administration of the MMPI-2 a few months later in another setting obtained a 2-7/7-2 codetype.
Graham, Smith, and Schwartz (1986) have provided the only empirical data on the stability of MMPI codetypes for a large sample (N = 405) of psychiatric inpatients. They
reported 42.7%, 44.0%, and 27.7% agreement across an average interval of approximately 3 months for high-point, low-point, and two-point codetypes, respectively. If the patients were classified into the categories of neurotic, psychotic, and characterological, 58.1% remained in the same category when retested.
Greene, Davis, and Morse (1993, August) reported the stability of the MMPI in 454 alcoholic inpatients who had been retested after an interval of approximately 6 months. Approximately 40% of the men and 32% of the women had the same single high-point scale on the two successive administrations of the MMPI. However, they had the same two-point codetype only 12% and 13% of the time, respectively. Almost 30% of these men and women had two totally different high-point scales when they took the MMPI on their second admission.
These data on codetype stability, or more accurately the lack thereof, suggest several important conclusions. First, clinicians should be cautious about making long-term predictions from a single administration of the MMPI-2. Rather, an MMPI-2 profile should be interpreted as reflecting the individual's current status. Second, it is not clear whether the shifts that do occur in codetypes across time reflect meaningful changes in the clients' behaviors, psychometric instability of the MMPI-2, or some combination of both
factors.

CONCLUDING COMMENTS

The MMPI-2 (Butcher et al., 1989, 2001) is the oldest and the most widely used of the self-report inventories. Its numerous validity scales have served it well in assessing the many forms of response distortion that are encountered in the various settings in which the MMPI-2 is administered. The MMPI-2 is the prototype of an empirically derived test in which the correlates of individual scales and codetypes are determined through research. Its long history of use has produced an extensive research base on most of the major issues in the assessment of psychopathology.

REFERENCES

American Psychiatric Association. (2000). Diagnostic and statistical manual of mental disorders (4th ed., text rev.). Washington, DC: Author.
Arbisi, P. A., & Ben-Porath, Y. S. (1995). An MMPI-2 infrequent response scale for use with psychopathological populations: The Infrequency Psychopathology scale: F(p). Psychological Assessment, 7, 424-431.
Archer, R. P., Griffin, R., & Aiduk, R. (1995). MMPI-2 clinical correlates for ten common codes. Journal of Personality Assessment, 65, 391-407.
Bathurst, K., Gottfried, A. W., & Gottfried, A. E. (1997). Normative data for the MMPI-2 in child custody litigation. Psychological Assessment, 9, 205-211.
Ben-Porath, Y. S., & Butcher, J. N. (1989). Psychometric stability of rewritten MMPI items. Journal of Personality Assessment, 53, 645-653.
Ben-Porath, Y. S., & Sherwood, N. E. (1993). The MMPI-2 content component scales: Development, psychometric characteristics, and clinical application. Minneapolis: University of Minnesota Press.
Butcher, J. N., Aldwin, C. M., Levenson, M. R., Ben-Porath, Y. S., Spiro, A., & Bosse, R. (1991). Personality and aging: A study of the MMPI-2 among older men. Psychology and Aging, 6, 361-370.
Butcher, J. N., Dahlstrom, W. G., Graham, J. R., Tellegen, A. M., & Kaemmer, B. (1989). MMPI-2: Manual for administration and scoring. Minneapolis: University of Minnesota Press.
Butcher, J. N., Graham, J. R., & Ben-Porath, Y. S. (1995). Methodological problems and issues in MMPI, MMPI-2, and MMPI-A research. Psychological Assessment, 7, 320-329.
Butcher, J. N., Graham, J. R., Ben-Porath, Y. S., Tellegen, A. M., Dahlstrom, W. G., & Kaemmer, B. (2001). MMPI-2: Manual for administration and scoring (Rev. ed.). Minneapolis: University of Minnesota Press.
Butcher, J. N., Graham, J. R., Williams, C. L., & Ben-Porath, Y. S. (1990). Development and use of the MMPI-2 content scales. Minneapolis: University of Minnesota Press.
Butcher, J. N., & Han, K. (1995). Development of an MMPI-2 scale to assess the presentation of self in a superlative manner: The S scale. In J. N. Butcher & C. D. Spielberger (Eds.), Advances in personality assessment (Vol. 10, pp. 25-50). Hillsdale, NJ: Erlbaum.
Butcher, J. N., & Rouse, S. V. (1996). Personality: Individual differences and clinical assessment. Annual Review of Psychology, 47, 87-111.
Butcher, J. N., Williams, C. L., Graham, J. R., Archer, R. P., Tellegen, A., Ben-Porath, Y. S., et al. (1992). MMPI-A (Minnesota Multiphasic Personality Inventory-Adolescent): Manual for administration, scoring, and interpretation. Minneapolis: University of Minnesota Press.
Caldwell, A. B. (1997a). [MMPI-2 data research file for clinical patients]. Unpublished raw data.
Caldwell, A. B. (1997b). [MMPI-2 data research file for personnel applicants]. Unpublished raw data.
Caldwell, A. B. (1997c). Whither goest our redoubtable mentor the MMPI/MMPI-2? Journal of Personality Assessment, 68, 47-66.
Caldwell, A. B. (1998). [MMPI-2 data research file for pain patients]. Unpublished raw data.
Caldwell, A. B. (2006). Maximal measurement or meaningful measurement: The interpretive challenges of the MMPI-2 Restructured Clinical (RC) scales. Journal of Personality Assessment, 87, 193-201.
Colligan, R. C., Osborne, D., Swenson, W. M., & Offord, K. P. (1983). The MMPI: A contemporary normative study. New York: Praeger.
Colligan, R. C., Osborne, D., Swenson, W. M., & Offord, K. P. (1989). The MMPI: A contemporary normative study (2nd ed.). Odessa, FL: Psychological Assessment Resources.
Cord, E. L. J., Sajwaj, T. E., Tolliver, D. K., & Ford, T. W. (1997, June). Normative update on MMPI-2 data for a large federal power utility. Paper presented at the 32nd annual Symposium on Recent Developments in the use of the MMPI-2 and MMPI-A, Minneapolis, MN.
Dahlstrom, W. G., Archer, R. P., Hopkins, D. G., Jackson, E., & Dahlstrom, L. E. (1994). Assessing the readability of the Minnesota Multiphasic Inventory Instruments: The MMPI, MMPI-2, MMPI-A. Minneapolis: University of Minnesota Press.
Dahlstrom, W. G., Lachar, D., & Dahlstrom, L. E. (1986). MMPI patterns of American minorities. Minneapolis: University of Minnesota Press.
Dahlstrom, W. G., & Tellegen, A. (1993). Socioeconomic status and the MMPI-2: The relation of MMPI-2 patterns to levels of education and occupation. Minneapolis: University of Minnesota Press.
Dahlstrom, W. G., Welsh, G. S., & Dahlstrom, L. E. (1972). An MMPI handbook: Vol. I. Clinical interpretation (Rev. ed.). Minneapolis: University of Minnesota Press.
Finn, S. (1996). Using the MMPI-2 as a therapeutic intervention. Minneapolis: University of Minnesota Press.
Fischer, C. T. (1994). Individualizing psychological assessment. Hillsdale, NJ: Erlbaum.
Fowler, R. A., Butcher, J. N., & Williams, C. L. (2000). Essentials of MMPI-2 and MMPI-A interpretation (2nd ed.). Minneapolis: University of Minnesota Press.
Friedman, A. F., Lewak, R., Nichols, D. S., & Webb, J. T. (2001). Psychological assessment with the MMPI-2 (2nd ed.). Hillsdale, NJ: Erlbaum.
Gough, H. G. (1950). The F minus K dissimulation index for the MMPI. Journal of Consulting Psychology, 14, 408-413.
Gough, H. G. (1954). Some common misconceptions about neuroticism. Journal of Consulting Psychology, 18, 287-292.
Graham, J. R. (2006). MMPI-2: Assessing personality and psychopathology (4th ed.). New York: Oxford University Press.
Graham, J. R., Ben-Porath, Y. S., & McNulty, J. L. (1999). MMPI-2 correlates for outpatient community mental health settings. Minneapolis: University of Minnesota Press.
Graham, J. R., Smith, R. L., & Schwartz, G. F. (1986). Stability of MMPI configurations for psychiatric inpatients. Journal of Consulting and Clinical Psychology, 54, 375-380.
Greene, R. L. (1987). Ethnicity and MMPI performance: A review. Journal of Consulting and Clinical Psychology, 55, 497-512.
Greene, R. L. (1991). The MMPI-2/MMPI: An interpretive manual. Boston: Allyn & Bacon.
Greene, R. L. (2000). The MMPI-2: An interpretive manual. Boston: Allyn & Bacon.
Greene, R. L., & Brown, R. C. (2006). MMPI-2 adult interpretive system (3rd ed.). Lutz, FL: Psychological Assessment Resources.
Greene, R. L., Davis, L. J., Jr., & Morse, R. M. (1993, August). Stability of MMPI codetypes in alcoholic inpatients. Paper presented at the annual meeting of the American Psychological Association, San Francisco.
Greene, R. L., Gwin, R., & Staal, M. (1997). Current status of MMPI-2 research: A methodological overview. Journal of Personality Assessment, 68, 20-36.
Greene, R. L., & Schinka, J. A. (1995). [MMPI-2 data research file for psychiatric inpatients and outpatients]. Unpublished raw data.
Hall, G. C. N., Bansal, A., & Lopez, I. R. (1999). Ethnicity and psychopathology: A meta-analytic review of 31 years of comparative MMPI/MMPI-2 research. Psychological Assessment, 11, 186-197.
Harkness, A. R., & McNulty, J. L. (1994). The Personality Psychopathology Five (PSY-5): Issue from the pages of a diagnostic manual instead of a dictionary. In S. Strack & M. Lorr (Eds.), Differentiating normal and abnormal personality (pp. 291-315). New York: Springer.
Harkness, A. R., McNulty, J. L., Ben-Porath, Y. S., & Graham, J. R. (2002). MMPI-2 Personality Psychopathology Five (PSY-5) scales: Gaining an overview for case conceptualization and treatment planning. Minneapolis: University of Minnesota Press.
Harris, R. E., & Lingoes, J. C. (1955). Subscales for the MMPI: An aid to profile interpretation. Unpublished manuscript, University of California.
Hasemann, D. M. (1997). Practices and findings of mental health professionals conducting workers' compensation evaluations. Unpublished doctoral dissertation, University of Kentucky.
Hathaway, S. R., & McKinley, J. C. (1940). A multiphasic personality schedule (Minnesota): Pt. I. Construction of the schedule. Journal of Psychology, 10, 249-254.
Helmes, E., & Reddon, J. R. (1993). A perspective on developments in assessing psychopathology: A critical review of the MMPI and MMPI-2. Psychological Bulletin, 113, 453-471.
Keller, L. S., & Butcher, J. N. (1991). Assessment of chronic pain patients with the MMPI-2. Minneapolis: University of Minnesota Press.
Koss, M. P., & Butcher, J. N. (1973). A comparison of psychiatric patients' self-report with other sources of clinical information. Journal of Research in Personality, 7, 225-236.
Lachar, D., & Wrobel, T. A. (1974). Validating clinicians' hunches: Construction of a new MMPI critical item set. Journal of Consulting and Clinical Psychology, 47, 277-284.
Lees-Haley, P. R. (1997). MMPI-2 base rates for 492 personal injury plaintiffs: Implications and challenges for forensic assessment. Journal of Clinical Psychology, 53, 745-755.
Long, K. A., Graham, J. R., & Timbrook, R. E. (1994). Socioeconomic status and MMPI-2 interpretation. Measurement and Evaluation in Counseling and Development, 27, 158-177.
MacAndrew, C. (1965). The differentiation of male alcoholic outpatients from nonalcoholic psychiatric outpatients by means of the MMPI. Quarterly Journal of Studies on Alcohol, 26, 238-246.
Marks, P. A., & Briggs, P. F. (1972). Adolescent norm tables for the MMPI. In W. G. Dahlstrom, G. S. Welsh, & L. E. Dahlstrom (Eds.), An MMPI handbook: Vol. I. Clinical interpretation (Rev. ed., pp. 388-399). Minneapolis: University of Minnesota Press.
Meehl, P. E. (1957). When should we use our heads instead of the formula? Journal of Counseling Psychology, 4, 268-273.
Megargee, E. I., Mercer, S. J., & Carbonell, J. L. (1999). MMPI-2 with male and female state and federal prison inmates. Psychological Assessment, 11, 177-185.
Nichols, D. S. (2001). Essentials of MMPI-2 assessment. New York: Wiley.
Paolo, A. M., Ryan, J. J., & Smith, A. J. (1992). Reading difficulty of MMPI-2 subscales. Journal of Clinical Psychology, 47, 529-532.
Paulhus, D. L. (1984). Two-component models of socially desirable responding. Journal of Personality and Social Psychology, 46, 598-609.
Paulhus, D. L. (1986). Self-deception and impression management in test responses. In A. Angleitner & J. S. Wiggins (Eds.), Personality assessment via questionnaires: Current issues in theory and measurement (pp. 143-165). Berlin, Germany: Springer-Verlag.
Schinka, J. A., & Borum, R. (1993). Readability of adult psychopathology inventories. Psychological Assessment, 5, 384-386.
Schinka, J. A., & LaLone, L. (1997). MMPI-2 norms: Comparisons with a census-matched subsample. Psychological Assessment, 9, 307-311.
Schinka, J. A., LaLone, L., & Greene, R. L. (1998). Effects of psychopathology and demographic characteristics on MMPI-2 scale scores. Journal of Personality Assessment, 70, 197-211.
Tellegen, A., Ben-Porath, Y. S., McNulty, J. L., Arbisi, P. A., Graham, J. R., & Kaemmer, B. (2003). The MMPI-2 Restructured Clinical Scales: Development, validation, and interpretation. Minneapolis: University of Minnesota Press.
Weed, N. C., Butcher, J. N., & Ben-Porath, Y. S. (1995). MMPI-2 measures of substance abuse. In J. N. Butcher & C. D. Spielberger (Eds.), Advances in personality assessment (Vol. 10, pp. 121-145). Hillsdale, NJ: Erlbaum.
Weed, N. C., Butcher, J. N., McKenna, T., & Ben-Porath, Y. S. (1992). New measures for assessing alcohol and drug abuse with the MMPI-2: The APS and AAS. Journal of Personality Assessment, 58, 389-404.
Welsh, G. S. (1956). Factor dimensions A and R. In G. S. Welsh & W. G. Dahlstrom (Eds.), Basic readings on the MMPI in psychology and medicine (pp. 264-281). Minneapolis: University of Minnesota Press.

Chapter 8
MILLON CLINICAL MULTIAXIAL INVENTORY-III

The Millon Clinical Multiaxial Inventory-III (MCMI-III; Millon, Davis, & Millon, 1994, 1997) is a broadband measure of the major dimensions of psychopathology found in Axis II disorders and some Axis I disorders of the DSM-IV-TR (American Psychiatric Association, 2000). The MCMI-III consists of 4 validity (modifier) scales, 11 personality style scales, 3 severe personality style scales, 7 clinical syndrome scales, and 3 severe clinical syndrome scales (see Table 8.1). Table 8.2 provides the general information on the MCMI-III. In contrast to the MMPI-2 (Butcher, Dahlstrom, Graham, Tellegen, & Kaemmer, 1989), which has 120+ additional scales, the MCMI-III does not have any subscales for these basic sets of scales or separate content scales, so there are only 28 total scales on the MCMI-III. Consequently, learning to interpret the MCMI-III is more straightforward than learning to interpret the MMPI-2. Recently, Grossman and del Rio (2005) described the development of 35 facet scales for the 14 personality style scales that represent the first such
attempt to create subscales for any of the versions of the MCMI. These facet scales are very new, so there is little research on them or clinical information on their use. They are described briefly later in this chapter.

HISTORY

Millon (1983; Millon & Davis, 1996) conceptualized an evolutionary framework for personality in which the interface of three polarities (pleasure-pain, active-passive, self-other) determines an individual's specific personality style as an adaptation to the environment. The pleasure-pain polarity involves either seeking pleasure as a way of enhancing life or avoiding pain as a way of constricting life. The active-passive polarity involves either working to change unfavorable aspects of the environment or accepting unfavorable aspects that cannot be changed.

Table 8.3 presents the functional processes and structural domains for each of the 14 personality disorder styles in the MCMI-III. Millon et al. (1997) believe that each cell of this matrix contains the diagnostic attribute or criterion that best captures the personality style within that specific functional process or structural domain. Reading down each column provides an overview of how each personality style differs on each functional process or structural domain. Reading across each row provides an overview of how each personality style can be described.
Millon's conceptual system for personality disorders does not map directly onto the DSM-IV-TR (American Psychiatric Association, 2000) Axis II personality disorders. The latter is an atheoretical categorical system that describes the behaviors and symptoms needed to make a specific personality disorder diagnosis. Millon also includes personality disorders such as Sadistic (Aggressive) and Depressive on the MCMI-III that are not included in the DSM-IV-TR.

Table 8.1 Millon Clinical Multiaxial Inventory-III (MCMI-III)

Modifying Indices (Validity Scales)
  V     Validity Index
  X     Disclosure Index
  Y     Desirability Index
  Z     Debasement Index
Personality Styles
  1     Schizoid
  2A    Avoidant
  2B    Depressive
  3     Dependent
  4     Histrionic
  5     Narcissistic
  6A    Antisocial
  6B    Sadistic (Aggressive)
  7     Compulsive
  8A    Negativistic (Passive-Aggressive)
  8B    Masochistic
Severe Personality Styles
  S     Schizotypal
  C     Borderline
  P     Paranoid
Clinical Syndromes
  A     Anxiety Disorder
  H     Somatoform Disorder
  N     Bipolar Disorder: Manic
  D     Dysthymic Disorder
  B     Alcohol Dependence
  T     Drug Dependence
  R     Posttraumatic Stress Disorder
Severe Clinical Syndromes
  SS    Thought Disorder
  CC    Major Depression
  PP    Delusional Disorder

MCMI (First Edition)

The original MCMI (Millon, 1977) had five major distinguishing features when compared with the MMPI (Hathaway & McKinley, 1951), which was the
primary self-report inventory in use at the time. First, the MCMI was developed following Millon's comprehensive clinical theory described earlier, in contrast to the atheoretical or empirical development of the original MMPI (see Chapter 6). Second, the MCMI contained specific scales to assess personality disorders, the more enduring personality characteristics of patients, which would be incorporated into Axis II of the then forthcoming diagnostic system, that is, DSM-III (American Psychiatric Association, 1980). Third, the comparison group consisted of a representative sample of psychiatric patients instead of normal individuals, which would facilitate differential diagnosis among patients. Fourth, scores on the scales were transformed into actuarial base rates. These base rates reflected the actual frequency with which various forms of psychopathology occurred rather than traditional standard scores, which measure how far the person deviates from the mean of normal individuals. Finally, the MCMI was designed to use as few items as possible to achieve these goals. At 175 items, the MCMI was and remains the shortest self-report inventory that is a broadband measure of the major dimensions of psychopathology.

Table 8.2 Millon Clinical Multiaxial Inventory-III (MCMI-III)

  Authors:                  Millon, Davis, Millon
  Published:                1994
  Edition:                  3rd
  Publisher:                Pearson Assessments
  Website:                  www.PearsonAssessments.com/tests/MCML3
  Age range:                18+
  Reading level:            8th grade
  Administration formats:   Paper/pencil, computer, CD, cassette
  Languages:                Spanish
  Number of items:          175
  Response format:          True/False
  Administration time:      25-30 minutes
  Primary scales:           4 Validity, 11 Personality Styles, 3 Severe Personality Styles, 7 Clinical Syndromes, 3 Severe Clinical Syndromes
  Additional scales:        35 (42) Facet
  Hand scoring:             Templates
  General texts:            Choca (2004), Craig (2005), Jankowski (2002), Millon et al. (1997), Retzlaff (1995), Strack (2002)
  Computer interpretation:  Pearson Assessments (Millon); Psychological Assessment Resources (Craig)

The original MCMI had four items that evaluated whether the person had read the items. These four items would become the Validity (V) scale, which assesses the consistency of item endorsement, in the ensuing editions of the MCMI. The original MCMI did not have explicit validity scales to assess the accuracy of item endorsement. Instead, a weight factor was developed based on the variation of the person's score from the midpoint of the total raw score for the eight basic personality scales. When this total raw score was below 110, the person was thought to be too cautious in reporting problematic behaviors and symptoms of psychopathology, so their scores would need to be adjusted upward. Conversely, when the total raw score was above 130, the person was thought to be too open or self-revealing, so their scores would need to be adjusted downward.
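The cutoff logic of the original weight factor can be illustrated with a minimal Python sketch. Only the two cutoffs (110 and 130) come from the text. The function name, the "none" label for totals between the cutoffs, and the sample scores are assumptions made for illustration, and the size of the adjustment is not modeled because the text does not give it.

```python
# Sketch of the original MCMI weight-factor decision, under stated assumptions.
LOW_CUTOFF = 110   # totals below this were adjusted upward (from the text)
HIGH_CUTOFF = 130  # totals above this were adjusted downward (from the text)


def weight_factor_direction(personality_raw_scores):
    """Return the direction in which scores would be adjusted.

    personality_raw_scores: raw scores on the eight basic personality scales.
    Treating totals from 110 through 130 as needing no adjustment is an
    assumption; the text states only the two outer cases.
    """
    if len(personality_raw_scores) != 8:
        raise ValueError("expected raw scores for the eight basic personality scales")
    total = sum(personality_raw_scores)
    if total < LOW_CUTOFF:
        return "upward"      # respondent thought to be too cautious
    if total > HIGH_CUTOFF:
        return "downward"    # respondent thought to be too self-revealing
    return "none"


if __name__ == "__main__":
    # Hypothetical raw scores, chosen only to exercise each branch.
    for scores in ([12] * 8, [15] * 8, [17] * 8):
        print(sum(scores), weight_factor_direction(scores))
```

The sketch returns only a direction, which is all the description above supports.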
Table 8.3 Expression of personality disorders across the functional and structural domains of personality

Functional Processes

  Disorder          Expressive Acts   Interpersonal Conduct   Cognitive Style   Regulatory Mechanisms
  1  Schizoid       Impassive         Unengaged               Impoverished      Intellectualization
  2A Avoidant       Fretful           Aversive                Distracted        Fantasy
  2B Depressive     Disconsolate      Defenseless             Pessimistic       Asceticism
  3  Dependent      Incompetent       Submissive              Naïve             Introjection
  4  Histrionic     Dramatic          Attention-Seeking       Flighty           Dissociation
  5  Narcissistic   Haughty           Exploitive              Expansive         Rationalization
  6A Antisocial     Impulsive         Irresponsible           Deviant           Acting Out
  6B Sadistic       Precipitate       Abrasive                Dogmatic          Isolation
  7  Compulsive     Disciplined       Respectful              Constricted       Reaction Formation
  8A Negativistic   Resentful         Contrary                Skeptical         Displacement
  8B Masochistic    Abstinent         Deferential             Diffident         Exaggeration
  S  Schizotypal    Eccentric         Secretive               Autistic          Undoing
  C  Borderline     Spasmodic         Paradoxical             Capricious        Regression
  P  Paranoid       Defensive         Provocative             Suspicious        Projection

Structural Attributes

  Disorder          Self-Image      Object Representation   Morphologic Organization   Mood/Temperament
  1  Schizoid       Complacent      Meager                  Undifferentiated           Apathetic
  2A Avoidant       Alienated       Vexatious               Fragile                    Anguished
  2B Depressive     Worthless       Forsaken                Depleted                   Melancholic
  3  Dependent      Inept           Immature                Inchoate                   Pacific
  4  Histrionic     Gregarious      Shallow                 Disjointed                 Fickle
  5  Narcissistic   Admirable       Contrived               Spurious                   Insouciant
  6A Antisocial     Autonomous      Debased                 Unruly                     Callous
  6B Sadistic       Combative       Pernicious              Eruptive                   Hostile
  7  Compulsive     Conscientious   Concealed               Compartmentalized          Solemn
  8A Negativistic   Discontented    Vacillating             Divergent                  Irritable
  8B Masochistic    Undeserving     Discredited             Inverted                   Dysphoric
  S  Schizotypal    Estranged       Chaotic                 Fragmented                 Distraught or Insensitive
  C  Borderline     Uncertain       Incompatible            Split                      Labile
  P  Paranoid       Inviolable      Unalterable             Inelastic                  Irascible

Note: Self-Other are reversed in Compulsive and Negativistic.
Source: MCMI-III Manual, second edition (p. 27), by T. Millon, R. Davis, and C. Millon, 1997, Minneapolis, MN: National Computer Systems. Reprinted with permission from table 2.2.

This weight factor would become an explicit validity (modifier) scale (Disclosure [X]) on the ensuing forms of the MCMI.

MCMI-II (Second Edition)
The second edition of the MCMI (MCMI-II; Millon, 1987) appeared in 1987 to enhance several features of the original MCMI. Two new personality disorder scales (Aggressive/Sadistic and Self-Defeating [Masochistic]) and three validity (modifier) scales (Disclosure [X], Desirability [Y], and Debasement [Z]) were added to the profile form. Forty-five new items (45/175 = 25.7%) were added to replace 45 extant items that did not add sufficient discriminating power to their scales. Modifications also were made in the MCMI-II items to bring the scales into closer coordination with DSM-III-R (American Psychiatric Association, 1987). An item-weighting procedure was added wherein items with greater prototypicality for a given scale were given higher weights of 2 or 3. If an item was endorsed in the nonscored direction, it was assigned a weight of 0. If an item was endorsed in the scored direction, it was assigned a weight of 1, 2, or 3 depending on how prototypical the item was for that scale, with the most prototypical items assigned a weight of 3.

The replacement of one-quarter of the items from the original MCMI limits the generalizability of its results to the MCMI-II. Even though the scales still have the same names, the actual items composing a scale may have changed substantially. The introduction of the increased weighting of prototypical items on each MCMI-II scale also alters the relationship among the items within the scale and with other scales.
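The item-weighting procedure described above amounts to a weighted sum, which the following Python sketch illustrates. The scoring key, item numbers, and function name are hypothetical and are not taken from any MCMI-II template. Only the rule itself (0 for a nonscored response, 1 to 3 for a scored response) comes from the text.

```python
# Sketch of a prototypicality-weighted raw score, under stated assumptions.


def weighted_raw_score(responses, scoring_key):
    """Sum the item weights for responses given in the scored direction.

    responses:   dict mapping item number -> True/False answer
                 (an item missing from the dict is treated as omitted).
    scoring_key: dict mapping item number -> (scored_response, weight),
                 where weight is 1, 2, or 3. This key is hypothetical.
    """
    total = 0
    for item, (scored_response, weight) in scoring_key.items():
        if weight not in (1, 2, 3):
            raise ValueError(f"item {item}: weight must be 1, 2, or 3")
        answer = responses.get(item)
        # A response in the nonscored direction (or an omission) adds 0.
        if answer is not None and answer == scored_response:
            total += weight
    return total


if __name__ == "__main__":
    # Hypothetical three-item scale: item 3 is the most prototypical.
    key = {3: (True, 3), 10: (False, 1), 27: (True, 2)}
    answers = {3: True, 10: True, 27: True}
    print(weighted_raw_score(answers, key))  # 3 + 0 + 2
```

Because a single endorsement of a highly prototypical item can contribute as much as three less prototypical items, small changes in responding can shift a scale's raw score noticeably.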
MCMI-III (Third Edition)

The third edition of the MCMI (Millon et al., 1994, 1997) appeared in 1994 with four major changes. First, 95 new items (95/175 = 54.3%) were introduced to parallel the substantive nature of the then forthcoming DSM-IV criteria (American Psychiatric Association, 1994). Second, two new scales were added: one personality style scale (Depressive) and one clinical syndrome scale (Posttraumatic Stress Disorder). Third, a small set of items was added to strengthen the Noteworthy Responses in the areas of child abuse, anorexia, and bulimia. Finally, the weighting of items was reduced to only two levels, with the more prototypical items for a specific scale adding two points to the raw score.

Research results from the MCMI-II need to be generalized to the MCMI-III cautiously because over one-half of the items were changed. The emphasis in these new items also tended to be on DSM-IV criteria. It appears that the emphasis in the MCMI-III is toward the DSM-IV criteria for personality disorders, whereas the emphasis in the MCMI-II was toward Millon's theory.

ADMINISTRATION

The first issue in the administration of the MCMI-III is ensuring that the individual is invested in the process. Taking a few extra minutes to answer any questions the individual may have about why the MCMI-III is being administered and how the results will be used will pay excellent dividends. This issue may be even more important with the MCMI-III than with other self-report inventories because of the relatively limited number of items on each scale and the extensive item overlap, which quickly compounds the effect of the individual distorting responses to even a few items. The clinician should work diligently to make the assessment process a collaborative activity with the individual to obtain the desired information. This issue of therapeutic assessment (Finn, 1996; Fischer, 1994) was covered in more depth in Chapter 2 (pp. 43-44).

Reading level is a crucial factor in determining whether a person can complete the MCMI-III; inadequate reading ability is a major cause of inconsistent patterns of item endorsement. Millon et al. (1997) suggest that most clients who have had at least 8 years of formal education can take the MCMI-III with little or no difficulty because the items are written at an eighth-grade level or less. If there is some concern about the person's reading level, he or she can be asked to read a few items out loud to obtain a quick estimate of whether reading is a problem. For individuals who find reading difficult, the MCMI-III can be presented by CD or audiocassette tape.
SCORING

Scoring the MCMI-III by hand is a complex process that commonly results in scoring errors (Millon et al., 1997, p. 112). If computer scoring is not available, each MCMI-III should be hand scored and profiled independently by two different individuals and their scores verified against each other to catch such errors. If the MCMI-III is administered by computer, the computer automatically scores it. If the individual's responses to the items have been placed on an answer sheet, these responses can be entered into the computer by the clinician for scoring or they can be hand scored. If the clinician enters the item responses into the computer for scoring, they should be double entered to identify any data entry errors.

The first step in hand scoring is to examine the answer sheet carefully and indicate omitted items and double-marked items by drawing a line through both the "true" and "false" responses to these latter items in brightly colored ink. Cleaning up the answer sheet also facilitates scoring. Responses that were changed need to be erased completely if possible, or clearly marked with an "X" so that the clinician is aware that this response has not been endorsed by the client.

The next step is to determine whether any of the three Validity (V) scale items (65, 110, 157) have been endorsed as being "True." If two or more of these items have been endorsed as being "True," scoring is unwarranted and should stop; it is probably unwarranted even if only one of them has been endorsed as "True."

The number of omitted items, which is the total number of items not marked and double marked, is scored without a template. There is no standard place on the profile form on which the number of omitted items is reported, so the clinician should state explicitly whether, and how many, items have been omitted.

All the other scales except for Scale X (Disclosure) are scored by placing a plastic template over the answer sheet with a small box drawn at the scored (deviant) response, either "true" or "false," for each item on the scale. The responses on the MCMI-III are weighted either "1" or "2," with the responses weighted "2" being prototypic for that scale. The sum of these weighted responses equals the client's raw score for that scale; this raw score is recorded in the proper space on the

Critical Items (Noteworthy Responses)

Critical items on the MCMI-III are identified as Noteworthy Responses (Millon et al., 1997, Appendix E). These Noteworthy Responses are divided into six categories: (1) Health Preoccupations; (2) Interpersonal Alienation; (3) Emotional Dyscontrol; (4) Self-Destructive Potential; (5) Childhood Abuse; and (6) Eating Disorders. The deviant response to all these items is "True." These items are intended to alert the clinician to specific items that warrant close review. All the items except one within Health Preoccupations are found on Scale H (Somatoform). The Eating Disorder items are not scored on any extant MCMI-III scale and must be reviewed directly. Items 154 and 171 reflect suicide attempts and suicidal ideation and need to be reviewed any time they are endorsed or omitted.

APPLICATIONS

As a self-report inventory, the MCMI-III is used routinely in clinical settings as well as correctional and substance abuse settings. However, the MCMI-III is not to be used "with normal populations or for purposes other than establishing a diagnostic screening and clinical assessment. ... To administer the MCMI-III to a wider range of problems or class of subjects, such as those found in business or industry, or to identify neurologic lesions, or to use it for the assessment of general personality traits among college students is to apply the instrument to settings and samples for which it is neither intended nor appropriate" (Millon et al., 1997, p. 6). Choca (2004) has suggested that there is nothing wrong with giving the MCMI-III to normal individuals or other samples on which the MCMI-III was not standardized, as long as the clinician keeps in mind the standardization group to which the person is being compared.
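The screening rules given earlier under Scoring and Critical Items (the three Validity items, the count of omitted items, and the review of items 154 and 171) can be illustrated with a short Python sketch. The item numbers and the decision rules come from the text. The function name, the status labels, and the convention of recording omitted or double-marked items as `None` are assumptions made for illustration.

```python
# Sketch of pre-scoring screening checks, under stated assumptions.
VALIDITY_ITEMS = (65, 110, 157)   # Validity (V) scale items, from the text
SUICIDE_ITEMS = (154, 171)        # reviewed whenever endorsed or omitted
N_ITEMS = 175


def screen_answer_sheet(responses):
    """Apply the screening checks to one answer sheet.

    responses: dict mapping item number -> True, False, or None, where None
    (or a missing key) stands for an omitted or double-marked item.
    """
    omitted = [i for i in range(1, N_ITEMS + 1) if responses.get(i) is None]
    v_true = sum(1 for i in VALIDITY_ITEMS if responses.get(i) is True)

    # Status labels are hypothetical names for the three cases in the text.
    if v_true >= 2:
        status = "stop"            # scoring is unwarranted
    elif v_true == 1:
        status = "questionable"    # scoring is probably unwarranted
    else:
        status = "proceed"

    # Endorsed (True) or omitted (None) responses both call for review.
    review = [i for i in SUICIDE_ITEMS if responses.get(i) is not False]
    return {"omitted": omitted, "validity_true": v_true,
            "status": status, "review": review}


if __name__ == "__main__":
    sheet = {i: False for i in range(1, N_ITEMS + 1)}
    sheet[65] = True    # one Validity item endorsed
    sheet[154] = True   # endorsed
    del sheet[171]      # omitted
    print(screen_answer_sheet(sheet))
```

A check of this kind would not replace the clinician's own review of the answer sheet; it only makes the order of the decisions explicit.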
The MCMI-III also is used in forensic settings, and several authors have provided guidelines for its use (McCann, 2002; Schutte, 2001). There has been substantial debate about whether the MCMI-III meets the federal standards for evidence in legal settings, with advocates pro (Craig, 2006; Dyer, 2005) and con (Lally, 2003; Rogers, Salekin, & Sewell, 1999). Review of these issues is beyond the scope of this text, but the forensic psychologist does need to be well informed about all of them before using the MCMI-III.

Somewhat different issues must be considered in the administration of the MCMI-III in forensic settings compared with the more usual clinical setting. These issues were reviewed in Chapter 6 on the MMPI-2 (pp. 197-198) and will not be reiterated here. These issues need to be considered carefully because the validity (modifier) scales on the MCMI-III appear to be relatively insensitive to response distortions (Morgan, Schoenberg, Dorr, & Burke, 2002; Schoenberg, Dorr, & Morgan, 2003), although Schoenberg, Dorr, and Morgan (2006) developed a discriminant function that looked promising in identifying college students who were simulating psychopathology.

Millon et al. (1997) have stated that in child-custody settings when "custody battles reach the point of requiring psychological evaluation, they constitute such a degree of interpersonal difficulty that the evaluation becomes a clinical matter" (p. 144). McCann, Flens, and Campagna (2001) have reported normative data for 259 child-custody examinees. The mean MCMI-III profile for these examinees showed an elevation on Scale Y (Social Desirability) and subclinical elevations on Scales 4 (Histrionic), 5 (Narcissistic), and 7 (Compulsive). Lampel (1999) reported elevations on the same four MCMI-III scales in 50 divorcing couples. Halon (2001) has questioned whether elevations on these four scales in child-custody samples reflect personality difficulties or normal personality characteristics.

PSYCHOMETRIC FOUNDATIONS

Demographic Variables

Age

There are minimal effects of age on any of the MCMI-III scales (Raddy et al., 2005). Raw scores tend to decrease slightly past the age of 50 except on Scales 4 (Histrionic), 5 (Narcissistic), and 7 (Compulsive). Raw scores increased slightly in individuals over 50 on these three scales. Dean and Choca (2001) reported similar results when male psychiatric patients were classified as younger (18 to 40) or older (60+). The older patients had lower scores on all MCMI-III scales except Scales 4 (Histrionic), 5 (Narcissistic), and 7 (Compulsive).
  • 266. Gender
Gender does not create any general issues in MCMI-III interpretation because separate base rate (BR) scores are used for men and women. Any gender differences in how individuals responded to the items on each scale are removed when the raw scores are converted to BR scores. Lindsay, Sankis, and Widiger (2000) reported that women were more likely to endorse the items on Scale 4 (Histrionic).

Education
There is no research that has looked at the effects of education on MCMI-III scales.

Ethnicity
About 15% of the development and cross-validation samples for the MCMI-III were nonwhite. Millon et al. (1997) reported that some differences were found for the demographic variables (unspecified), but these differences appear to reflect known differences in prevalence of the disorder. Some ethnic differences were noted on the MCMI-I and MCMI-II, but no published research has looked at the effects of ethnicity on the MCMI-III, although several dissertations have examined ethnic differences on it. The absence of published research on the MCMI-III is remarkable because such research is so common with the MMPI/MMPI-2. Until such research is published on the MCMI-III, the MCMI-III should
  • 267. be used cautiously with nonwhite individuals.

Reliability
The MCMI-III Manual (Millon et al., 1997, Table 3.3, p. 58) reports reliability data for 87 individuals who were retested after an interval of 5 to 14 days. The test-retest correlations ranged from .82 to .96 across the scales, with a median of .91, which is very stable. Measures of the internal consistency of each scale (Cronbach's alpha) also were quite good, with only six scales (5 [Narcissistic]: .67; 6A [Antisocial]: .77; 6B [Sadistic/Aggressive]: .79; 7 [Compulsive]: .66; N [Bipolar: Manic]: .71; PP [Delusional Disorder]: .79) below .80.

  • 268. Table 8.7 Standard error of measurement for MCMI-III scales in male psychiatric patients (a)

                                  Raw Scores                  SEM in BR Units at Base Rate
Scale                            M      SD     SEM  Alpha*      60      75      85

Personality Styles
1 (Schizoid)                  9.83    5.52    4.47     .81    3.35    2.23    5.14
2A (Avoidant)                 8.94    6.64    5.91     .89    3.56    1.35    3.72
2B (Depressive)               9.58    6.77    6.02     .89    3.32    1.66    4.98
3 (Dependent)                 8.55    5.86    4.98     .85    4.01    2.81    5.02
4 (Histrionic)               11.80    5.47    4.43     .81      NA      NA      NA
5 (Narcissistic)             13.06    4.75    3.18     .67    6.28    5.34    4.71
6A (Antisocial)              10.78    6.02    4.64     .77    3.45    2.59    2.16
6B (Sadistic)                 9.67    6.06    4.79     .79    1.04    1.46    5.43
7 (Compulsive)               14.12    5.34    3.52     .66    3.69      NA      NA
8A (Negativistic)            10.39    6.51    5.41     .83    4.07    1.48    4.44
8B (Masochistic)              7.32    5.69    4.95     .87    1.62    1.01    5.86

Severe Personality Styles
S (Schizotypal)               8.01    6.65    5.66     .85    1.77    1.77    4.60
C (Borderline)               10.02    6.67    5.67     .85    2.64    3.17    3.53
P (Paranoid)                  8.96    6.55    5.50     .84    1.64    4.00    5.45

Clinical Syndromes
A (Anxiety)                   8.25    5.71    4.91     .86    5.09    2.65    2.65
H (Somatoform)                7.23    4.76    4.09     .86    1.95    7.33    7.33
N (Bipolar: Manic)            6.99    4.39    3.12     .71    2.57    4.81    6.41
D (Dysthymia)                 9.55    6.03    5.31     .88    3.39    1.32    5.65
B (Alcohol Dependence)        8.93    6.00    4.92     .82    3.86    2.03    3.46
T (Drug Dependence)           8.86    6.29    5.22     .83    1.92    5.56      NA
R (PTSD)                      8.92    6.47    5.76     .89    1.74    3.47      NA

Severe Clinical Syndromes
SS (Thought Disorder)         8.77    6.15    5.35     .87    1.50    4.68      NA
CC (Major Depression)         9.54    6.61    5.95     .90    1.34    4.20    5.04
PP (Delusional Disorder)      3.79    3.83    3.03     .79    2.64    5.61    7.26

Validity Scales (Modifier Scales)
X (Disclosure)              119.85   34.43      NA      NA
Y (Desirability)             11.92    4.74    4.07     .86    6.14    4.91      NA
Z (Debasement)               14.46    8.84    8.40     .95    1.55    1.79      NA

*N = 1,924.
(a) Haddy et al. (2005).

  • 269. The standard error of measurement for all MCMI-III scales is provided in Table 8.7 at BR scores of 60, 75, and 85 for male psychiatric patients (Haddy et al., 2005). (There were not a sufficient number of women in this sample to compute standard errors of measurement for them. The standard errors of measurement for raw scores in men and women were generally similar, suggesting that the standard errors of measurement for men could be used in women, too.) The standard error of measurement was calculated in raw score units for each scale and then converted to BR scores at these three points. For example, the standard error of measurement for Scale 1 (Schizoid) is 3.35, 2.23, and 5.14 at BR scores of 60, 75, and 85, respectively. These values differ because the distribution of scores is not uniform around these points. When the SEM is about 3 BR points for one of these scales, the individual's true score will be within ±3 BR points two-thirds of the time. The standard error of measurement for BR scores around 75 tends to be small, which means that BR scores above that cutting score are very likely to remain elevated despite any error of measurement. On the other hand, the standard error of measurement for BR scores around 85 tends to be about twice as large as at 75, which means that BR scores
  • 270. above cutting scores of 85 are more likely to change. The maximum BR scores on Scales 4 (Histrionic) and 7 (Compulsive) in men are 84 and 83, respectively. Thus, it is not possible for a man to have a BR score of 85 or above on either scale, and the standard error of measurement could not be calculated at that point. The maximum BR scores on these same two scales in women are 92 and 91, respectively.

CONCLUDING COMMENTS
The MCMI-III is the self-report inventory most widely used to assess personality disorders. The MCMI-III should be considered any time the presence of a personality disorder is suspected in an individual; personality disorders are a frequently overlooked set of diagnoses given the more dramatic symptoms in most Axis I disorders. Computer scoring is almost mandatory for the MCMI-III given the complexity and time-consuming nature of hand scoring. Clinicians must understand the derivation and use of BR scores for the accurate interpretation of the scale scores.

REFERENCES
American Psychiatric Association. (1980). Diagnostic and statistical manual of mental disorders (3rd ed.). Washington, DC: Author.
American Psychiatric Association. (1987). Diagnostic and statistical manual of mental disorders (3rd ed., rev.). Washington, DC: Author.
American Psychiatric Association. (1994). Diagnostic and
  • 271. statistical manual of mental disorders (4th ed.). Washington, DC: Author.
American Psychiatric Association. (2000). Diagnostic and statistical manual of mental disorders (4th ed., text rev.). Washington, DC: Author.
Butcher, J. N., Dahlstrom, W. G., Graham, J. R., Tellegen, A. M., & Kaemmer, B. (1989). MMPI-2: Manual for administration and scoring. Minneapolis: University of Minnesota Press.
Charter, R. A., & Lopez, M. N. (2002). MCMI-III: The inability of the validity conditions to detect random responders. Journal of Clinical Psychology, 58, 1615-1617.
Choca, J. P. (2004). Interpretive guide to the Millon Clinical Multiaxial Inventory (3rd ed.). Washington, DC: American Psychological Association.
Craig, R. J. (Ed.). (2005). New directions in interpreting the MCMI-III: Essays on current issues. Hoboken, NJ: Wiley.
Craig, R. J. (2006). The MCMI-III. In R. P. Archer (Ed.),
  • 272. Forensic uses of clinical assessment instruments (pp. 121-145). Mahwah, NJ: Erlbaum.
Dean, K. J., & Choca, J. (2001, August). Psychological changes of emotionally disturbed men with age. Paper presented at the annual meeting of the American Psychological Association, San Francisco.
Dyer, F. J. (2005). Forensic applications of the MCMI-III in light of recent controversies. In R. J. Craig (Ed.), New directions in interpreting the MCMI-III (pp. 201-226). Hoboken, NJ: Wiley.
Finn, S. (1996). Using the MMPI-2 as a therapeutic intervention. Minneapolis: University of Minnesota Press.
Fischer, C. T. (1994). Individualizing psychological assessment. Hillsdale, NJ: Erlbaum.
Grossman, S. D., & del Rio, C. (2005). The MCMI-III facet subscales. In R. J. Craig (Ed.), New directions in interpreting the MCMI-III (pp. 3-31). Hoboken, NJ: Wiley.
Haddy, C., Strack, S., & Choca, J. P. (2005). Linking personality disorders and clinical syndromes on the MCMI-III. Journal of Personality Assessment, 84, 193-204.
Halon, R. L. (2001). The MCMI-III: The normal quartet in child custody cases. American Journal of Forensic Psychology, 19, 57-75.
  • 273. Hathaway, S. R., & McKinley, J. C. (1951). MMPI manual. New York: Psychological Corporation.
Jankowski, D. (2002). A beginner's guide to the MCMI-III. Washington, DC: American Psychological Association.
Lally, S. J. (2003). What tests are acceptable for use in forensic evaluations?: A survey of experts. Professional Psychology: Research and Practice, 34, 491-498.
Lampel, A. K. (1999). Use of the MCMI-III in evaluating child custody litigants. American Journal of Forensic Psychology, 17, 19-31.
Lindsay, K. A., Sankis, L. M., & Widiger, T. A. (2000). Sex and gender bias in self-report personality disorder inventories. Journal of Personality Disorders, 14, 218-232.
Mandell, D. (1997). An investigation of the effects of item omissions on the Millon Clinical Multiaxial Inventory-II (MCMI-II). Unpublished doctoral dissertation, Fairleigh Dickinson University, Teaneck, NJ.
McCann, J. T. (2002). Guidelines for the forensic applications of the MCMI-III. Journal of Forensic Psychology Practice, 2, 55-70.
McCann, J. T., Flens, J. T., & Campagna, V. (2001). The MCMI-III in child custody evaluations: A normative study. Journal of Forensic Psychology Practice, 1, 27-44.
Millon, T. (1977). MCMI manual. Minneapolis, MN:
  • 274. Interpretive Scoring Systems.
Millon, T. (1983). Modern psychopathology: A biosocial approach to maladaptive learning and functioning. Prospect Heights, IL: Waveland Press.
Millon, T. (1987). Manual for the MCMI-II (2nd ed.). Minneapolis, MN: National Computer Systems.
Millon, T., & Davis, R. D. (1996). Disorders of personality: DSM-IV and beyond (Rev. ed.). New York: Wiley.
Millon, T., Davis, R., & Millon, C. (1994). MCMI-III manual. Minneapolis, MN: National Computer Systems.
Millon, T., Davis, R., & Millon, C. (1997). MCMI-III manual (2nd ed.). Minneapolis, MN: National Computer Systems.
Morgan, C. D., Schoenberg, M. R., Dorr, D., & Burke, M. J. (2002). Overreport on the MCMI-III: Concurrent validation with the MMPI-2 using a psychiatric inpatient sample. Journal of Personality Assessment, 78, 288-300.
Paulhus, D. L. (1984). Two-component models of socially desirable responding. Journal of Personality and Social Psychology, 46, 598-609.
Retzlaff, P. D. (1995). Tactical psychotherapy of the personality
  • 275. disorders: An MCMI-III-based approach. Needham Heights, MA: Allyn & Bacon.
Retzlaff, P. D., Ofman, P., Hyer, L., & Matheson, S. (1994). MCMI-II high-point codes: Severe personality disorder and clinical syndrome extensions. Journal of Clinical Psychology, 30, 228-234.
Retzlaff, P. D., Stoner, J., & Kleinsasser, D. (2002). The use of the MCMI-III in the screening and triage of offenders. International Journal of Offender Therapy and Comparative Criminology, 46, 319-332.
Rogers, R., Salekin, R. T., & Sewell, K. W. (1999). Validation of the MCMI for Axis II disorders: Does it meet the Daubert standard? Law and Human Behavior, 23, 425-443.
Schoenberg, M. R., Dorr, D., & Morgan, C. D. (2003). The ability of the MCMI-III to detect malingering. Psychological Assessment, 15, 198-204.
Schoenberg, M. R., Dorr, D., & Morgan, C. D. (2006). Development of discriminant functions to detect dissimulation for the MCMI-III. Journal of Forensic Psychiatry and Psychology, 17, 405-416.
Schutte, J. W. (2001). Using the MCMI-III in forensic evaluations. American Journal of Forensic Psychology, 19, 5-20.
Strack, S. (2002). Essentials of Millon inventories assessment. Hoboken, NJ: Wiley.
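
Supplementary illustration. The Reliability section of this chapter states that when the standard error of measurement (SEM) is about 3 BR points, the true score falls within ±3 BR points about two-thirds of the time, and that elevations near a cutting score of 75 are more stable than those near 85. The following is a minimal Python sketch of that reasoning, not a scoring procedure from the MCMI-III manual. It assumes normally distributed measurement error and applies the tabled SEM at the nearest BR point to the observed score; both are simplifying assumptions, and the function names are hypothetical. The SEM values used are the Scale 1 (Schizoid) entries from Table 8.7.

```python
import math
from typing import Tuple


def coverage_within(k_sem: float) -> float:
    """Probability that a normally distributed error lies within +/- k_sem SEMs.

    Assumption: measurement error is normal; the chapter only says "two-thirds".
    """
    return math.erf(k_sem / math.sqrt(2.0))


def br_band(observed_br: float, sem_br: float, k_sem: float = 1.0) -> Tuple[float, float]:
    """Return the (low, high) band of observed_br +/- k_sem * sem_br in BR units."""
    half_width = k_sem * sem_br
    return (observed_br - half_width, observed_br + half_width)


def stays_at_or_above(observed_br: float, sem_br: float, cutoff: float,
                      k_sem: float = 1.0) -> bool:
    """True if the whole +/- k_sem SEM band lies at or above the cutting score."""
    low, _ = br_band(observed_br, sem_br, k_sem)
    return low >= cutoff


if __name__ == "__main__":
    # Share of true scores expected within +/- 1 SEM (about two-thirds).
    print(round(coverage_within(1.0), 3))

    # Scale 1 (Schizoid), Table 8.7: SEM is 2.23 near BR 75 and 5.14 near BR 85.
    # The observed scores of 78 and 88 are hypothetical examples.
    print(br_band(78, 2.23), stays_at_or_above(78, 2.23, cutoff=75))
    print(br_band(88, 5.14), stays_at_or_above(88, 5.14, cutoff=85))
```

In this sketch, a hypothetical observed BR score of 78 keeps its entire ±1 SEM band above the cutting score of 75, whereas a hypothetical observed score of 88 has a band that dips below 85, which mirrors the chapter's point that scores above 85 are more likely to change.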