Designing with the
Mind in Mind
Simple Guide to Understanding
User Interface Design Guidelines
Second Edition
Jeff Johnson
AMSTERDAM • BOSTON • HEIDELBERG • LONDON
NEW YORK • OXFORD • PARIS • SAN DIEGO
SAN FRANCISCO • SINGAPORE • SYDNEY • TOKYO
Morgan Kaufmann is an imprint of Elsevier
Acquiring Editor: Meg Dunkerley
Editorial Project Manager: Heather Scherer
Project Manager: Priya Kumaraguruparan
Designer: Matthew Limbert
Morgan Kaufmann is an imprint of Elsevier
225 Wyman Street, Waltham, MA, 02451, USA
Copyright © 2014, 2010 Elsevier Inc. All rights reserved.
No part of this publication may be reproduced or transmitted in
any form or by any means, electronic
or mechanical, including photocopying, recording, or any
information storage and retrieval system,
without permission in writing from the publisher. Details on
how to seek permission, further
information about the Publisher’s permissions policies and our
arrangements with organizations such
as the Copyright Clearance Center and the Copyright Licensing
Agency, can be found at our website:
www.elsevier.com/permissions.
This book and the individual contributions contained in it are
protected under copyright by the
Publisher (other than as may be noted herein).
Notices
Knowledge and best practice in this field are constantly
changing. As new research and experience
broaden our understanding, changes in research methods or professional practices may become
necessary. Practitioners and researchers must always rely on
their own experience and knowledge in
evaluating and using any information or methods described
herein. In using such information or
methods they should be mindful of their own safety and the
safety of others, including parties for
whom they have a professional responsibility. To the fullest
extent of the law, neither the Publisher nor
the authors, contributors, or editors, assume any liability for
any injury and/or damage to persons or
property as a matter of products liability, negligence or otherwise, or from any use or operation of any
methods, products, instructions, or ideas contained in the
material herein.
Library of Congress Cataloging-in-Publication Data
Application submitted
British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British
Library
ISBN: 978-0-12-407914-4
Printed in China
14 15 16 17 10 9 8 7 6 5 4 3 2 1
For information on all Morgan Kaufmann publications,
visit our Web site at www.mkp.com
Contents

Acknowledgments
Foreword
Introduction
CHAPTER 1 Our Perception is Biased
CHAPTER 2 Our Vision is Optimized to See Structure
CHAPTER 3 We Seek and Use Visual Structure
CHAPTER 4 Our Color Vision is Limited
CHAPTER 5 Our Peripheral Vision is Poor
CHAPTER 6 Reading is Unnatural
CHAPTER 7 Our Attention is Limited; Our Memory is Imperfect
CHAPTER 8 Limits on Attention Shape Our Thought and Action
CHAPTER 9 Recognition is Easy; Recall is Hard
CHAPTER 10 Learning from Experience and Performing Learned Actions are Easy; Novel Actions, Problem Solving, and Calculation are Hard
CHAPTER 11 Many Factors Affect Learning
CHAPTER 12 Human Decision Making is Rarely Rational
CHAPTER 13 Our Hand–Eye Coordination Follows Laws
CHAPTER 14 We Have Time Requirements
Epilogue
Appendix
Bibliography
Index
Acknowledgments
I could not have written this book without a lot of help and the support of many people.

First are the students of the human–computer interaction course I taught as an Erskine Fellow at the University of Canterbury in New Zealand in 2006. It was for them that I developed a lecture providing a brief background in perceptual and cognitive psychology—just enough to enable them to understand and apply user-interface design guidelines. That lecture expanded into a professional development course, then into the first edition of this book. My need to prepare more comprehensive psychological background for an upper-level course in human–computer interaction that I taught at the University of Canterbury in 2013 provided motivation for expanding the topics covered and improving the explanations in this second edition.

Second, I thank my colleagues at the University of Canterbury who provided ideas, feedback on my ideas, and illustrations for the second edition’s new chapter on Fitts’ law: Professor Andy Cockburn, Dr. Sylvain Malacria, and Mathieu Nancel. I also thank my colleague and friend Professor Tim Bell for sharing user-interface examples and for other help while I was at the university working on the second edition.

Third, I thank the reviewers of the first edition—Susan Fowler, Robin Jeffries, Tim McCoy, and Jon Meads—and of the second edition—Susan Fowler, Robin Jeffries, and James Hartman. They made many helpful comments and suggestions that allowed me to greatly improve the book.

Fourth, I am grateful to four cognitive science researchers who directed me to important references, shared useful illustrations with me, or allowed me to bounce ideas off of them:

• Professor Edward Adelson, Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology.
• Professor Dan Osherson, Department of Psychology, Princeton University.
• Dr. Dan Bullock, Department of Cognitive and Neural Systems, Boston University.
• Dr. Amy L. Milton, Department of Psychology and Downing College, University of Cambridge.

The book also was helped greatly by the care, oversight, logistical support, and nurturing provided by the staff at Elsevier, especially Meg Dunkerley, Heather Scherer, Lindsay Lawrence, and Priya Kumaraguruparan.

Last but not least, I thank my wife and friend Karen Ande for her love and support while I was researching and writing this book.
Foreword
It is gratifying to see this book go into a second edition because of the endorsement that implies for maturing the field of human–computer interaction beyond pure empirical methods.

Human–computer interaction (HCI) as a topic is basically simple. There is a person of some sort who wants to do some task, like write an essay or pilot an airplane. What makes the activity HCI is inserting a mediating computer. In principle, our person could have done the task without the computer. She could have used a quill pen and ink, for example, or flown an airplane that uses hydraulic tubes to work the controls. These are not quite HCI. They do use intermediary tools or machines, and the process of their design and the facts of their use bear resemblance to those of HCI. In fact, they fit into HCI’s uncle discipline of human factors. But it is the computer, and the process of contingent interaction the computer renders possible, that makes HCI distinctive.
The computer can transform a task’s representation and needed skills. It can change the linear writing process into something more like sculpturing, the writer roughing out the whole, then adding or subtracting bits to refine the text. It can change the piloting process into a kind of supervision, letting the computer, with inputs of speed, altitude, and location and outputs of throttle, flap, and rudder, do the actual flying. And if instead of one person we have a small group or a mass crowd, or if instead of a single computer we have a network of communicating mobile or embedded computers, or if instead of a simple task we have impinging cultural or coordination considerations, then we get the many variants of computer mediation that form the broad spectrum of HCI.
The components of a discipline of HCI would also seem simple. There is an artifact that must be engineered and implemented. There is the process of design for the interaction itself and the objects, virtual or physical, with which to interact. Then there are all the principles, abstractions, theories, facts, and phenomena surrounding HCI to know about. Let’s call the first interaction engineering (e.g., using Harel statecharts to guide implementation), the second, interaction design (e.g., the design of the workflow for a smartphone to record diet), and the third, perhaps a little overly grandly, interaction science (e.g., the use of Fitts’ law to design button sizes in an application). The hard bit for HCI is that fitting these three together is not easy. Besides work in HCI itself, each has its own literature not friendly to outsiders. The present book was written to bridge the gap between the relevant science that has been built up from the psychological literature and HCI design problems where the science could be of use.
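(To unpack that last example for readers who want it now—Chapter 13 treats it at length: in its common “Shannon” formulation, Fitts’ law predicts the time T to point at a target from the target’s distance D and its width W along the line of motion,

    T = a + b \log_2\!\left(\frac{D}{W} + 1\right)

where a and b are constants fitted empirically to the device and the user population. Nearer and bigger targets are hit faster, which is why the law can inform button sizing.)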
Actually, the importance of linking engineering, design, and science together in HCI goes deeper. HCI is a technology. As Brian Arthur in his book The Nature of Technology tells us, technologies largely derive from other technologies, not science. The flat panel displays now common are a substitute for CRT devices of yore, and these go back to modified radar screens on the Whirlwind computer. Furthermore, technologies are composed of parts that are themselves technologies. A laptop computer has a display for output, a keyboard and a touchpad for input, several storage systems, and so on, each with its own technologies. But eventually all these technologies ground out in some phenomenon of nature that is not a technology, and here is a place where science plays a role. Some keyboard input devices use the natural phenomenon of electrical capacitance to sense keystrokes. Pressing a key brings two D-shaped pads close to a printed circuit board that is covered by an insulating film, thereby changing the pattern of capacitance. That is to say, this keyboard harnesses the natural phenomenon of capacitance in a reliable way that can be exploited to provide the HCI function of signaling an intended interaction to the computer.
Many natural phenomena are easy to understand and exploit by simple observation or modest tinkering. No science needed. But some, like capacitance, are much less obvious, and then you really need science to understand them. In some cases, the HCI system that is built generates its own phenomena, and you need science to understand the unexpected, emergent properties of seemingly obvious things. People sometimes believe that because they can intuitively understand the easy cases (e.g., with usability testing), they can understand all the cases. But this is not necessarily true. The natural phenomena to be exploited in HCI range from abstractions of computer science, such as the notion of the working set, to psychological theories of human cognition, perception, and movement, such as the nature of vision. Psychology, the area addressed by this book, is an area with an especially messy and at times contradictory literature, but it is also especially rich in phenomena that can be exploited for HCI technology.

I think it is underappreciated how important it is for the future development of HCI as a discipline that the field develops a supporting science base, as illustrated by the current book for the field of psychology. It also involves HCI growing some of its own science bits.
Why is this important? There are at least three reasons. First, having some sort of theory enables explanatory evaluation. The use of A-B testing is limited if you don’t know why there was a difference. On the other hand, if you have a theory that lets you interpret the difference, then you can fix it. You will never understand the problems of why a windows-based user interface can take excessive time to use by doing usability testing, for example, if you don’t have the theoretical concept of the window working set. Second, it enables generative design. It allows a shift in representation of the design space. Once it is realized that a very important property of pointing devices is the bandwidth of the human muscle group to which a transducer is going to be applied, then the problem gets reformulated in terms of how to connect those muscles and the consequence for the rest of the design. Third, it supports the codification of knowledge. Only by having theories and abstractions can we concisely cumulate our results and develop a field with sufficient power and depth.
Why isn’t there wider use of science or theory in HCI? There are obvious reasons, like the fact that it isn’t easy to get the relevant science linkages or results in the first place, that it’s hard to make the connection with science in almost any engineering field, and that often the connection is made, but invisibly packaged, in a way that nonspecialists never need to see it. The poet tosses capacitance with his finger, but only knows he writes a poem. He thinks he writes with love, because someone understood electricity.

But, mainly, I think there isn’t wider use of science or theory in HCI because it is difficult to put that knowledge into a form that is easily useful at the time of design need. Jeff Johnson in this book is careful to connect theory with design choice, and to do it in a practical way. He has accumulated grounded design rules that reach across the component parts of HCI, making it easier for designers to keep them in mind as they design.
Stuart K. Card
Introduction
USER-INTERFACE DESIGN RULES: WHERE DO THEY COME FROM AND HOW CAN THEY BE USED EFFECTIVELY?

For as long as people have been designing interactive computer systems, some have attempted to promote good design by publishing user-interface design guidelines (also called design rules). Early ones included:

• Cheriton (1976) proposed user-interface design guidelines for early interactive (time-shared) computer systems.
• Norman (1983a, 1983b) presented design rules for software user interfaces based on human cognition, including cognitive errors.
• Smith and Mosier (1986) wrote perhaps the most comprehensive set of user-interface design guidelines.
• Shneiderman (1987) included “Eight Golden Rules of Interface Design” in the first edition of his book Designing the User Interface and in all later editions.
• Brown (1988) wrote a book of design guidelines, appropriately titled Human–Computer Interface Design Guidelines.
• Nielsen and Molich (1990) offered a set of design rules for use in heuristic evaluation of user interfaces, and Nielsen and Mack (1994) updated them.
• Marcus (1992) presented guidelines for graphic design in online documents and user interfaces.

In the twenty-first century, additional user-interface design guidelines have been offered by Stone et al. (2005); Koyani et al. (2006); Johnson (2007); and Shneiderman and Plaisant (2009). Microsoft, Apple Computer, and Oracle publish guidelines for designing software for their platforms (Apple Computer, 2009; Microsoft Corporation, 2009; Oracle Corporation/Sun Microsystems, 2001).

How valuable are user-interface design guidelines? That depends on who applies them to design problems.
USER-INTERFACE DESIGN AND EVALUATION REQUIRES UNDERSTANDING AND EXPERIENCE
Following user-interface design guidelines is not as straightforward as following cooking recipes. Design rules often describe goals rather than actions. They are purposefully very general to make them broadly applicable, but that means that their exact meaning and applicability to specific design situations is open to interpretation.

Complicating matters further, more than one rule will often seem applicable to a given design situation. In such cases, the applicable design rules often conflict—that is, they suggest different designs. This requires designers to determine which competing design rule is more applicable to the given situation and should take precedence.
Design problems, even without competing design guidelines, often have multiple conflicting goals. For example:
• Bright screen and long battery life
• Lightweight and sturdy
• Multifunctional and easy to learn
• Powerful and simple
• High resolution and fast loading
• WYSIWYG (what you see is what you get) and usable by blind people
Satisfying all the design goals for a computer-based product or service usually requires tradeoffs—lots and lots of tradeoffs. Finding the right balance point between competing design rules requires further tradeoffs.
Given all of these complications, user-interface design rules and guidelines must be applied thoughtfully, not mindlessly, by people who are skilled in the art of user-interface design and/or evaluation. User-interface design rules and guidelines are more like laws than like rote recipes. Just as a set of laws is best applied and interpreted by lawyers and judges who are well versed in the laws, a set of user-interface design guidelines is best applied and interpreted by people who understand the basis for the guidelines and have learned from experience in applying them. Unfortunately, with a few exceptions (e.g., Norman, 1983a), user-interface design guidelines are provided as simple lists of design edicts with little or no rationale or background.
Furthermore, although many early members of the user-interface design and usability profession had backgrounds in cognitive psychology, most newcomers to the field do not. That makes it difficult for them to apply user-interface design guidelines sensibly. Providing that rationale and background education is the focus of this book.
COMPARING USER-INTERFACE DESIGN GUIDELINES
Table I.1 places the two best-known user-interface guideline lists side by side to show the types of rules they contain and how they compare to each other (see the Appendix for additional guidelines lists). For example, both lists start with a rule calling for consistency in design. Both lists include a rule about preventing errors. The Nielsen–Molich rule to “help users recognize, diagnose, and recover from errors” corresponds closely to the Shneiderman–Plaisant rule to “permit easy reversal of actions.” “User control and freedom” corresponds to “make users feel they are in control.” There is a reason for this similarity, and it isn’t just that later authors were influenced by earlier ones.
WHERE DO DESIGN GUIDELINES COME FROM?
For present purposes, the detailed design rules in each set of guidelines, such as those in Table I.1, are less important than what they have in common: their basis and origin. Where did these design rules come from? Were their authors—like clothing fashion designers—simply trying to impose their own personal design tastes on the computer and software industries?

If that were so, the different sets of design rules would be very different from each other, as the various authors sought to differentiate themselves from the others. In fact, all of these sets of user-interface design guidelines are quite similar if we ignore differences in wording, emphasis, and the state of computer technology when each set was written. Why?
The answer is that all of the design rules are based on human psychology: how people perceive, learn, reason, remember, and convert intentions into action. Many authors of design guidelines had at least some background in psychology that they applied to computer system design.
For example, Don Norman was a professor, researcher, and prolific author in the field of cognitive psychology long before he began writing about human–computer interaction. Norman’s early human–computer design guidelines were based on research—his own and others’—on human cognition. He was especially interested in cognitive errors that people often make and how computer systems can be designed to lessen or eliminate the impact of those errors.
Table I.1 Two Best-Known Lists of User-Interface Design Guidelines

Shneiderman (1987); Shneiderman and Plaisant (2009):
• Strive for consistency
• Cater to universal usability
• Offer informative feedback
• Design task flows to yield closure
• Prevent errors
• Permit easy reversal of actions
• Make users feel they are in control
• Minimize short-term memory load

Nielsen and Molich (1990):
• Consistency and standards
• Visibility of system status
• Match between system and real world
• User control and freedom
• Error prevention
• Recognition rather than recall
• Flexibility and efficiency of use
• Aesthetic and minimalist design
• Help users recognize, diagnose, and recover from errors
• Provide online documentation and help
Similarly, other authors of user-interface design guidelines—for example, Brown, Shneiderman, Nielsen, and Molich—used knowledge of perceptual and cognitive psychology to try to improve the design of usable and useful interactive systems.

Bottom line: user-interface design guidelines are based on human psychology.

By reading this book, you will learn the most important aspects of the psychology underlying user-interface and usability design guidelines.
INTENDED AUDIENCE OF THIS BOOK
This book is intended mainly for software design and development professionals who have to apply user-interface and interaction design guidelines. This includes interaction designers, user-interface designers, user-experience designers, graphic designers, and hardware product designers. It also includes usability testers and evaluators, who often refer to design heuristics when reviewing software or analyzing observed usage problems.

A second intended audience is students of interaction design and human–computer interaction. A third intended audience is software development managers who want enough of a background in the psychological basis of user-interface design rules to understand and evaluate the work of the people they manage.
CHAPTER 1
Our Perception is Biased
Our perception of the world around us is not a true depiction of what is actually there. Our perceptions are heavily biased by at least three factors:

• The past: our experience
• The present: the current context
• The future: our goals
PERCEPTION BIASED BY EXPERIENCE
Experience—your past perceptions—can bias your current perception in several different ways.
Perceptual priming
Imagine that you own a large insurance company. You are meeting with a real estate manager, discussing plans for a new campus of company buildings. The campus consists of a row of five buildings, the last two with T-shaped courtyards providing light for the cafeteria and fitness center. If the real estate manager showed you the map in Figure 1.1, you would see five black shapes representing the buildings.

Now imagine that instead of a real estate manager, you are meeting with an advertising manager. You are discussing a new billboard ad to be placed in certain markets around the country. The advertising manager shows you the same image, but in this scenario the image is a sketch of the ad, consisting of a single word: LIFE. In this scenario, you see a word, clearly and unambiguously.
When your perceptual system has been primed to see building shapes, you see building shapes, and the white areas between the buildings barely register in your perception. When your perceptual system has been primed to see text, you see text, and the black areas between the letters barely register.
A relatively famous example of how priming the mind can affect perception is an image, supposedly by R. C. James,1 that initially looks to most people like a random splattering of paint (see Fig. 1.2), similar to the work of the painter Jackson Pollock. Before reading further, look at the image.

Only after you are told that it is a Dalmatian dog sniffing the ground near a tree can your visual system organize the image into a coherent picture. Moreover, once you’ve seen the dog, it is hard to go back to seeing just a random collection of spots.
1 Published in Lindsay and Norman (1972), Figure 3-17, p. 146.
FIGURE 1.1
Building map or word? What you see depends on what you were told to see.
FIGURE 1.2
Image showing the effect of mental priming of the visual system. What do you see?
These priming examples are visual, but priming can also bias other types of perception, such as sentence comprehension. For example, the headline “New Vaccine Contains Rabies” would probably be understood differently by people who had recently heard stories about contaminated vaccines than by people who had recently heard stories about successful uses of vaccines to fight diseases.
Familiar perceptual patterns or frames
Much of our lives are spent in familiar situations: the rooms in our homes, our yards, our routes to and from school or work, our offices, neighborhood parks, stores, restaurants, etc. Repeated exposure to each type of situation builds a pattern in our minds of what to expect to see there. These perceptual patterns, which some researchers call frames, include the objects or events that are usually encountered in that situation.

For example, you know most rooms in your home well enough that you need not constantly scrutinize every detail. You know how they are laid out and where most objects are located. You can probably navigate much of your home in total darkness. But your experience with homes is broader than your specific home. In addition to having a pattern for your home, your brain has one for homes in general. It biases your perception of all homes, familiar and new. In a kitchen, you expect to see a stove and a sink. In a bathroom, you expect to see a toilet, a sink, and a shower or a bathtub (or both).

Mental frames for situations bias our perception to see the objects and events expected in each situation. They are a mental shortcut: by eliminating the need for us to constantly scrutinize every detail of our environment, they help us get around in our world. However, mental frames also make us see things that aren’t really there.

For example, if you visit a house in which there is no stove in the kitchen, you might nonetheless later recall seeing one, because your mental frame for kitchens has a strong stove component. Similarly, part of the frame for eating at a restaurant is paying the bill, so you might recall paying for your dinner even if you absentmindedly walked out without paying. Your brain also has frames for back yards, schools, city streets, business offices, supermarkets, dentist visits, taxis, air travel, and other familiar situations.

Anyone who uses computers, websites, or smartphones has frames for the desktop and files, web browsers, websites, and various types of applications and online services. For example, when they visit a new Web site, experienced Web users expect to see a site name and logo, a navigation bar, some other links, and maybe a search box. When they book a flight online, they expect to specify trip details, examine search results, make a choice, and make a purchase.

Because of the perceptual frames users of computer software and websites have, they often click buttons or links without looking carefully at them. Their perception of the display is based more on what their frame for the situation leads them to expect than on what is actually on the screen. This sometimes confounds software designers, who expect users to see what is on the screen—but that isn’t how human vision works.
For example, if the positions of the “Next” and “Back” buttons on the last page of a multistep dialog box2 switched, many people would not immediately notice the switch (see Fig. 1.3). Their visual system would have been lulled into inattention by the consistent placement of the buttons on the prior several pages. Even after unintentionally going backward a few times, they might continue to perceive the buttons in their standard locations. This is why consistent placement of controls is a common user-interface guideline, to ensure that reality matches the user’s frame for the situation.

2 Multistep dialog boxes are called wizards in user-interface designer jargon.

FIGURE 1.3
The “Next” button is perceived to be in a consistent location, even when it isn’t.
Similarly, if we are trying to find something but it is in a different place or looks different from usual, we might miss it even though it is in plain view, because our mental frames tune us to look for expected features in expected locations. For example, if the “Submit” button on one form in a Web site is shaped differently or is a different color from those on other forms on the site, users might not find it. This expectation-induced blindness is discussed more later in this chapter in the “Perception Biased by Goals” section.
Habituation
A third way in which experience biases perception is called habituation. Repeated exposure to the same (or highly similar) perceptions dulls our perceptual system’s sensitivity to them. Habituation is a very low-level phenomenon of our nervous system: it occurs at a neuronal level. Even primitive animals like flatworms and amoebas, with very simple nervous systems, habituate to repeated stimuli (e.g., mild electric shocks or light flashes). People, with our complex nervous systems, habituate to a range of events, from low-level ones like a continually beeping tone, to medium-level ones like a blinking ad on a Web site, to high-level ones like a person who tells the same jokes at every party or a politician giving a long, repetitious speech.

We experience habituation in computer usage when the same error messages or “Are you sure?” confirmation messages appear again and again. People initially notice them and perhaps respond, but eventually click them closed reflexively without bothering to read them.
Habituation is also a factor in a recent phenomenon variously labeled “social media burnout” (Nichols, 2013), “social media fatigue,” or “Facebook vacations” (Rainie et al., 2013): newcomers to social media sites and tweeting are initially excited by the novelty of microblogging about their experiences, but sooner or later get tired of wasting time reading tweets about every little thing that their “friends” do or see—for example, “Man! Was that ever a great salmon salad I had for lunch today.”
Attentional blink
Another low-level biasing of perception by past experience occurs just after we spot or hear something important. For a very brief period following the recognition—between 0.15 and 0.45 second—we are nearly deaf and blind to other visual stimuli, even though our ears and eyes stay functional. Researchers call this the attentional blink (Raymond et al., 1992; Stafford and Webb, 2005).3 It is thought to be caused by the brain’s perceptual and attention mechanisms being briefly fully occupied with processing the first recognition.

3 Chapter 14 discusses the attentional blink interval in the context of other perceptual intervals.
A classic example: you are in a subway car as it enters a station, planning to meet two friends at that station. As the train arrives, your car passes one of your friends, and you spot him briefly through your window. In the next split second, your window passes your other friend, but you fail to notice her because her image hit your retina during the attentional blink that resulted from your recognition of your first friend.
When people use computer-based systems and online services, attentional blink can cause them to miss information or events if things appear in rapid succession. A popular modern technique for making documentary videos is to present a series of still photographs in rapid succession.4 This technique is highly prone to attentional blink effects: if an image really captures your attention (e.g., it has a strong meaning for you), you will probably miss one or more of the immediately following images. In contrast, a captivating image in an auto-running slideshow (e.g., on a Web site or an information kiosk) is unlikely to cause attentional blink (i.e., missing the next image), because each image typically remains displayed for several seconds.
PERCEPTION BIASED BY CURRENT CONTEXT
When we try to understand how our visual perception works, it is tempting to think of it as a bottom-up process, combining basic features such as edges, lines, angles, curves, and patterns into figures and ultimately into meaningful objects. To take reading as an example, you might assume that our visual system first recognizes shapes as letters and then combines letters into words, words into sentences, and so on.

But visual perception—reading in particular—is not strictly a bottom-up process. It includes top-down influences too. For example, the word in which a character appears may affect how we identify the character (see Fig. 1.4).
Similarly, our overall comprehension of a sentence or a paragraph can even influence what words we see in it. For example, the same letter sequence can be read as different words depending on the meaning of the surrounding paragraph (see Fig. 1.5).
Contextual biasing of vision need not involve reading. The Müller–Lyer illusion is a famous example (see Fig. 1.6): the two horizontal lines are the same length, but the outward-pointing “fins” cause our visual system to see the top line as longer than the line with inward-pointing “fins.” This and other optical illusions (see Fig. 1.7) trick us because our visual system does not use accurate, optimal methods to perceive the world. It developed through evolution, a semi-random process that layers jury-rigged—often incomplete and inaccurate—solutions on top of each other. It works fine most of the time, but it includes a lot of approximations, kludges, hacks, and outright “bugs” that cause it to fail in certain cases.

4 For an example, search YouTube for “history of the world in two minutes.”

FIGURE 1.4
The same character is perceived as H or A depending on the surrounding letters.
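The Müller–Lyer figure is easy to reproduce if you want to test the illusion on yourself. Here is a minimal sketch (my construction, in Python’s standard tkinter toolkit, which the book does not prescribe) that draws the two equal-length lines of Figure 1.6:

```python
import tkinter as tk

root = tk.Tk()
canvas = tk.Canvas(root, width=300, height=160, bg="white")
canvas.pack()

def line_with_fins(y, fins_out):
    """Draw a 160-pixel horizontal line at height y, with fins at both ends."""
    x0, x1, fin = 70, 230, 15
    canvas.create_line(x0, y, x1, y, width=2)
    d = fin if fins_out else -fin  # fin direction is all that differs
    for x, sign in ((x0, -1), (x1, 1)):
        canvas.create_line(x, y, x + sign * d, y - fin, width=2)
        canvas.create_line(x, y, x + sign * d, y + fin, width=2)

line_with_fins(50, fins_out=True)    # looks longer
line_with_fins(110, fins_out=False)  # looks shorter, yet same length
root.mainloop()
```

Both calls draw a line of exactly 160 pixels; only the fins differ, yet most viewers see the top line as longer.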
The examples in Figures 1.6 and 1.7 show vision being biased by visual context. However, biasing of perception by the current context works between different senses too. Perceptions in any of our five senses may affect simultaneous perceptions in any of our other senses. What we feel with our tactile sense can be biased by what we hear, see, or smell. What we see can be biased by what we hear, and what we hear can be biased by what we see. The following are two examples of visual perception affecting what we hear:
• McGurk effect. If you watch a video of someone saying “bah, bah, bah,” then “dah, dah, dah,” then “vah, vah, vah,” but the audio is “bah, bah, bah” throughout, you will hear the syllable indicated by the speaker’s lip movement rather than the syllable actually in the audio track.5 Only by closing or averting your eyes do you hear the syllable as it really is. I’ll bet you didn’t know you could read lips, and in fact do so many times a day.

5 Go to YouTube, search for “McGurk effect,” and view (and hear) some of the resulting videos.

FIGURE 1.5
The same phrase (“Fold napkins. Polish silverware. Wash dishes.” versus “French napkins. Polish silverware. German dishes.”) is perceived differently depending on the list it appears in.
FIGURE 1.6
Müller–Lyer illusion: equal-length horizontal lines appear to have different lengths.
• Ventriloquism. Ventriloquists don’t throw their voice; they just learn to talk without moving their mouths much. Viewers’ brains perceive the talking as coming from the nearest moving mouth: that of the ventriloquist’s puppet (Eagleman, 2012).

An example of the opposite—hearing biasing vision—is the illusory flash effect. When a spot is flashed once briefly on a display but is accompanied by two quick beeps, it appears to flash twice. Similarly, the perceived rate of a blinking light can be adjusted by the frequency of a repeating click (Eagleman, 2012).
Later chapters explain how visual perception, reading, and recognition function in the human brain. For now, I will simply say that the pattern of neural activity that corresponds to recognizing a letter, a word, a face, or any object includes input from neural activity stimulated by the context. This context includes other nearby perceived objects and events, and even reactivated memories of previously perceived objects and events.

FIGURE 1.7
(A) The checkerboard does not bulge in the middle; (B) the triangle sides are not bent; and (C) the red vertical lines are parallel.
Context biases perception not only in people but also in lower animals. A friend of mine often brought her dog with her in her car when running errands. One day, as she drove into her driveway, a cat was in the front yard. The dog saw it and began barking. My friend opened the car door, and the dog jumped out and ran after the cat, which turned and jumped through a bush to escape. The dog dove into the bush but missed the cat. The dog remained agitated for some time afterward.

Thereafter, for as long as my friend lived in that house, whenever she arrived at home with her dog in the car, he would get excited, bark, jump out of the car as soon as the door was opened, dash across the yard, and leap into the bush. There was no cat, but that didn’t matter. Returning home in the car was enough to make the dog see one—perhaps even smell one. However, walking home, as the dog did after being taken for his daily walk, did not evoke the “cat mirage.”
PERCEPTION BIASED BY GOALS
In addition to being biased by our past experience and the present context, our perception is influenced by our goals and plans for the future. Specifically, our goals:

• Guide our perceptual apparatus, so we sample what we need from the world around us.
• Filter our perceptions: things unrelated to our goals tend to be filtered out preconsciously, never registering in our conscious minds.
For example, when people navigate through software or a Web site, seeking information or a specific function, they don’t read carefully. They scan screens quickly and superficially for items that seem related to their goal. They don’t simply ignore items unrelated to their goals; they often don’t even notice them.

To see this, glance at Figure 1.8 and look for scissors, and then immediately flip back to this page. Try it now.

Did you spot the scissors? Now, without looking back at the toolbox, can you say whether there is a screwdriver in the toolbox too?
Our goals filter our perceptions in other perceptual senses as well as in vision. A familiar example is the “cocktail party” effect. If you are conversing with someone at a crowded party, you can focus your attention to hear mainly what he or she is saying, even though many other people are talking near you. The more interested you are in the conversation, the more strongly your brain filters out surrounding chatter. If you are bored by what your conversational partner is saying, you will probably hear much more of the conversations around you.
The effect was first documented in studies of air-traffic controllers, who were able to carry on a conversation with the pilots of their assigned aircraft even though many different conversations were occurring simultaneously on the same radio frequency, coming out of the same speaker in the control room (Arons, 1992). Research suggests that our ability to focus on one conversation among several simultaneous ones depends not only on our interest level in the conversation but also on objective factors, such as the similarity of voices in the cacophony, the amount of general “noise” (e.g., clattering dishes or loud music), and the predictability of what your conversational partner is saying (Arons, 1992).
This filtering of perception by our goals is particularly true for adults, who tend to be more focused on goals than children are. Children are more stimulus-driven: their perception is less filtered by their goals. This characteristic makes them more distractible than adults, but it also makes them less biased as observers.
A parlor game demonstrates this age difference in perceptual filtering. It is similar to the Figure 1.8 exercise. Most households have a catch-all drawer for kitchen implements or tools. From your living room, send a visitor to the room where the catch-all drawer is, with instructions to fetch you a specific tool, such as measuring spoons or a pipe wrench. When the person returns with the tool, ask whether another specific tool was in the drawer. Most adults will not know what else was in the drawer. Children—if they can complete the task without being distracted by all the cool stuff in the drawer—will often be able to tell you more about what else was there.
Perceptual filtering can also be seen in how people navigate websites. Suppose I put you on the homepage of New Zealand’s University of Canterbury (see Fig. 1.9) and asked you to find information about financial support for postgraduate students in the computer science department. You would scan the page and probably quickly click one of the links that share words with the goal that I gave you: Departments (top left), Scholarships (middle), then Postgraduate Students (bottom left) or Postgraduate (right). If you’re a “search” person, you might instead go right to the Search box (top right), type words related to the goal, and click “Go.”

Whether you browse or search, it is likely that you would leave the homepage without noticing that you were randomly chosen to win $100 (bottom right). Why? Because that was not related to your goal.
FIGURE 1.8
Toolbox: Are there scissors here?
What is the mechanism by which our current goals bias our perception? There are two:

• Influencing where we look. Perception is active, not passive. Think of your perceptual senses not as simply filtering what comes to you, but rather as reaching out into the world and pulling in what you need to perceive. Your hands, your primary touch sensors, literally do this, but the rest of your senses do it too. You constantly move your eyes, ears, hands, feet, body, and attention so as to sample exactly the things in your environment that are most relevant to what you are doing or about to do (Ware, 2008). If you are looking on a Web site for a campus map, your eyes and pointer-controlling hand are attracted to anything that might lead you to that goal. You more or less ignore anything unrelated to your goal.

• Sensitizing our perceptual system to certain features. When you are looking for something, your brain can prime your perception to be especially sensitive to features of what you are looking for (Ware, 2008). For example, when you are looking for a red car in a large parking lot, red cars will seem to pop out as you scan the lot, and cars of other colors will barely register in your consciousness, even though you do in some sense see them. Similarly, when you are trying to find your spouse in a dark, crowded room, your brain “programs” your auditory system to be especially sensitive to the combination of frequencies that make up his or her voice.

FIGURE 1.9
University of Canterbury Web site: navigating sites requires perceptual filtering.
TAKING BIASED PERCEPTION INTO ACCOUNT WHEN DESIGNING

All these sources of perceptual bias of course have implications for user-interface design. Here are three.
Avoid ambiguity
Avoid ambiguous information displays, and test your design to verify that all users interpret the display in the same way. Where ambiguity is unavoidable, either rely on standards or conventions to resolve it, or prime users to resolve the ambiguity in the intended way.

For example, computer displays often shade buttons and text fields to make them look raised in relation to the background surface (see Fig. 1.10). This appearance relies on a convention, familiar to most experienced computer users, that the light source is at the top left of the screen. If an object were depicted as lit by a light source in a different location, users would not see the object as raised.
Be consistent
Place information and controls in consistent locations. Controls and data displays that serve the same function on different pages should be placed in the same position on each page on which they appear. They should also have the same color, text fonts, shading, and so on. This consistency allows users to spot and recognize them quickly.
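One way to make such consistency hard to break is to centralize it in code. The sketch below (my illustration, in Python’s tkinter; the book prescribes no particular toolkit) builds every wizard page’s “Back” and “Next” buttons through one shared helper, so the buttons cannot drift between pages—the concern raised with Figure 1.3:

```python
import tkinter as tk

def add_wizard_buttons(page, on_back, on_next, next_label="Next"):
    """Give every wizard page the same Back/Next footer.

    Routing all pages through this one helper keeps the buttons in a
    consistent place, so users' frame for "wizard page" stays valid.
    """
    footer = tk.Frame(page)
    footer.pack(side=tk.BOTTOM, fill=tk.X, padx=8, pady=8)
    # Pack order: "Next" first so it is always the rightmost control,
    # with "Back" immediately to its left -- on every page, including
    # the last one, where only the label changes.
    tk.Button(footer, text=next_label, command=on_next).pack(side=tk.RIGHT)
    tk.Button(footer, text="Back", command=on_back).pack(side=tk.RIGHT)

root = tk.Tk()
page = tk.Frame(root, width=320, height=180)
page.pack(fill=tk.BOTH, expand=True)
add_wizard_buttons(page, on_back=root.quit, on_next=root.quit,
                   next_label="Finish")  # same position, different label
root.mainloop()
```

Because even the final page calls the same helper, “Finish” appears exactly where “Next” has been all along.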
Understand the goals
Users come to a system with goals they want to achieve. Designers should understand those goals. Realize that users’ goals may vary, and that their goals strongly influence what they perceive. Ensure that at every point in an interaction, the information users need is available, prominent, and maps clearly to a possible user goal, so users will notice and use the information.
FIGURE 1.10
Buttons on computer screens, like this “Search” button, are often shaded to make them look three dimensional, but the convention works only if the light source is assumed to be on the top left.
CHAPTER 2
Our Vision is Optimized to See Structure
Early in the twentieth century, a group of German psychologists sought to explain how human visual perception works. They observed and catalogued many important visual phenomena. One of their basic findings was that human vision is holistic: our visual system automatically imposes structure on visual input and is wired to perceive whole shapes, figures, and objects rather than disconnected edges, lines, and areas. The German word for “shape” or “figure” is Gestalt, so these theories became known as the Gestalt principles of visual perception.

Today’s perceptual and cognitive psychologists regard the Gestalt theory of perception as more of a descriptive framework than an explanatory and predictive theory. Today’s theories of visual perception tend to be based heavily on the neurophysiology of the eyes, optic nerve, and brain (see Chapters 4–7).

Not surprisingly, the findings of neurophysiological researchers support the observations of the Gestalt psychologists. We really are—along with other animals—“wired” to perceive our surroundings in terms of whole objects (Stafford and Webb, 2005; Ware, 2008). Consequently, the Gestalt principles are still valid—if not as a fundamental explanation of visual perception, at least as a framework for describing it. They also provide a useful basis for guidelines for graphic design and user-interface design (Soegaard, 2007).

For present purposes, the most important Gestalt principles are Proximity, Similarity, Continuity, Closure, Symmetry, Figure/Ground, and Common Fate. The following sections describe each principle and provide examples from both static graphic design and user-interface design.
GESTALT PRINCIPLE: PROXIMITY
The Gestalt principle of Proximity is that the relative distance between objects in a display affects our perception of whether and how the objects are organized into subgroups. Objects that are near each other (relative to other objects) appear grouped, while those that are farther apart do not.

In Figure 2.1A, the stars are closer together horizontally than they are vertically, so we see three rows of stars, while the stars in Figure 2.1B are closer together vertically than they are horizontally, so we see three columns.
FIGURE 2.1
Proximity: items that are closer appear grouped as rows (A) and columns (B).
FIGURE 2.2
In Outlook’s Distribution List Membership dialog box, list buttons are in a group box, separate from the control buttons.
The Proximity principle has obvious relevance to the layout of control panels or data forms in software, Web sites, and electronic appliances. Designers often separate groups of on-screen controls and data displays by enclosing them in group boxes or by placing separator lines between groups (see Fig. 2.2).
However, according to the Proximity principle, items on a display can be visually grouped simply by spacing them closer to each other than to other controls, without group boxes or visible borders (see Fig. 2.3). Many graphic design experts recommend this approach to reduce visual clutter and code size in a user interface (Mullet and Sano, 1994).
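As a concrete sketch of this guideline (my example, in Python’s tkinter, with hypothetical field names), the form below creates two visible groups purely with whitespace—no group boxes or separator lines:

```python
import tkinter as tk

GROUPS = [
    ["Name", "Street", "City"],    # address group
    ["Home phone", "Work phone"],  # phone group
]

root = tk.Tk()
for g, group in enumerate(GROUPS):
    for i, label in enumerate(group):
        # Rows within a group sit 2 pixels apart; 16 pixels of empty
        # space above the first row of each later group is the only
        # thing that makes the two groups visible.
        top_pad = 16 if (g > 0 and i == 0) else 2
        row = tk.Frame(root)
        row.pack(anchor="w", padx=12, pady=(top_pad, 0))
        tk.Label(row, text=label, width=12, anchor="w").pack(side=tk.LEFT)
        tk.Entry(row, width=24).pack(side=tk.LEFT)
root.mainloop()
```

Delete the extra padding and the two groups collapse into one undifferentiated column, which is exactly the failure Figure 2.4 illustrates.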
FIGURE 2.3
In Mozilla Thunderbird’s Subscribe Folders dialog box, controls are grouped using the Proximity principle.

FIGURE 2.4
In Discreet’s Software Installer, poorly spaced radio buttons look grouped in vertical columns.
Conversely, if controls are poorly spaced (e.g., if connected controls are too far apart), people will have trouble perceiving them as related, making the software harder to learn and remember. For example, the Discreet Software Installer displays six horizontal pairs of radio buttons, each representing a two-way choice, but their spacing, due to the Proximity principle, makes them appear to be two vertical sets of radio buttons, each representing a six-way choice, at least until users try them and learn how they operate (see Fig. 2.4).
GESTALT PRINCIPLE: SIMILARITY
Another factor that affects our perception of grouping is expressed in the Gestalt principle of Similarity: objects that look similar appear grouped, all other things being equal. In Figure 2.5, the slightly larger, “hollow” stars are perceived as a group.
The Page Setup dialog box in Mac OS applications uses the Similarity and Proximity principles to convey groupings (see Fig. 2.6). The three very similar and tightly spaced Orientation settings are clearly intended to appear grouped. The three menus are not so tightly spaced, but they look similar enough that they appear related even though that probably wasn’t intended.
Similarly, the text fields in a form at book publisher Elsevier’s Web site are organized into an upper group of eight for the name and address, a group of three split fields for phone numbers, and two single text fields. The four menus, in addition to being data fields, help separate the text field groups (see Fig. 2.7). By contrast, the labels are too far from their fields to seem connected to them.
FIGURE 2.5
Similarity: items appear grouped if they look more similar to each other than to other objects.
FIGURE 2.6
Mac OS Page Setup dialog box. The Similarity and Proximity principles are used to group the Orientation settings.

FIGURE 2.7
Similarity makes the text fields appear grouped in this online form at Elsevier.com.
GESTALT PRINCIPLE: CONTINUITY
In addition to the two Gestalt principles concerning our tendency to organize objects into groups, several Gestalt principles describe our visual system’s tendency to resolve ambiguity or fill in missing data in such a way as to perceive whole objects. The first such principle, the principle of Continuity, states that our visual perception is biased to perceive continuous forms rather than disconnected segments.

For example, in Figure 2.8A, we automatically see two crossing lines—one blue and one orange. We don’t see two separate orange segments and two separate blue ones, and we don’t see a blue-and-orange V on top of an upside-down orange-and-blue V. In Figure 2.8B, we see a sea monster in water, not three pieces of one.
A well-known example of the use of the Continuity principle in graphic design is the IBM® logo. It consists of disconnected blue patches, and yet it is not at all ambiguous; it is easily seen as three bold letters, perhaps viewed through something like venetian blinds (see Fig. 2.9).
FIGURE 2.8
Continuity: human vision is biased to see continuous forms, even adding missing data if necessary.

FIGURE 2.9
The IBM company logo uses the Continuity principle to form letters from disconnected patches.
Slider controls are a user-interface example of the Continuity principle. We see a slider as depicting a single range controlled by a handle that appears somewhere on the slider, not as two separate ranges separated by the handle (see Fig. 2.10A). Even displaying different colors on each side of a slider’s handle doesn’t completely “break” our perception of a slider as one continuous object, although ComponentOne’s choice of strongly contrasting colors (gray vs. red) certainly strains that perception a bit (see Fig. 2.10B).
GESTALT PRINCIPLE: CLOSURE
Related to Continuity is the Gestalt principle of Closure, which states that our visual system automatically tries to close open figures so that they are perceived as whole objects rather than separate pieces. Thus, we perceive the disconnected arcs in Figure 2.11A as a circle.

Our visual system is so strongly biased to see objects that it can even interpret a totally blank area as an object. We see the combination of shapes in Figure 2.11B as a white triangle overlapping another triangle and three black circles, even though the figure really only contains three V shapes and three black pac-men.
The Closure principle is often applied in graphical user interfaces (GUIs). For example, GUIs often represent collections of objects (e.g., documents or messages) as stacks (see Fig. 2.12). Just showing one whole object and the edges of others “behind” it is enough to make users perceive a stack of objects, all whole.
FIGURE 2.10
Continuity: we see a slider as a single slot with a handle somewhere on it, not as two slots separated by a handle: (A) Mac OS and (B) ComponentOne.
GESTALT PRINCIPLE: SYMMETRY
A third fact about our tendency to see objects is captured in the Gestalt principle of Symmetry. It states that we tend to parse complex scenes in a way that reduces the complexity. The data in our visual field usually has more than one possible interpretation, but our vision automatically organizes and interprets the data so as to simplify it and give it symmetry.

For example, we see the complex shape on the far left of Figure 2.13 as two overlapping diamonds, not as two touching corner bricks or a pinch-waist octahedron with a square in its center. A pair of overlapping diamonds is simpler than the other two interpretations shown on the right—it has fewer sides and more symmetry.
In printed graphics and on computer screens, our visual system’s reliance on the Symmetry principle can be exploited to represent three-dimensional objects on a two-dimensional display. This can be seen in a cover illustration for Paul Thagard’s book Coherence in Thought and Action (Thagard, 2002; see Fig. 2.14) and in a three-dimensional depiction of a cityscape (see Fig. 2.15).
FIGURE 2.12
Icons depicting stacks of objects exhibit the Closure principle: partially visible objects are perceived as whole.

FIGURE 2.11
Closure: human vision is biased to see whole objects, even when they are incomplete.
GESTALT PRINCIPLE: FIGURE/GROUND
The next Gestalt principle that describes how our visual system
structures the data
it receives is Figure/Ground. This principle states that our mind
separates the visual
field into the figure (the foreground) and ground (the
background). The foreground
consists of the elements of a scene that are the object of our
primary attention, and
the background is everything else.
The Figure/Ground principle also specifies that the visual
system’s parsing of
scenes into figure and ground is influenced by characteristics of
the scene. For example, when a small object or color patch overlaps a larger one, we tend to perceive the smaller object as the figure and the larger object as the ground (see Fig. 2.16).
not= or
FIGURE 2.13
Symmetry: the human visual system tries to resolve complex
scenes into combinations of simple,
symmetrical shapes.
FIGURE 2.14
The cover of the book Coherence in Thought and Action
(Thagard, 2002) uses the symmetry,
Closure, and Continuity principles to depict a cube.
However, our perception of figure versus ground is not
completely determined
by scene characteristics. It also depends on the viewer’s focus
of attention. Dutch
artist M. C. Escher exploited this phenomenon to produce
ambiguous images in
which figure and ground switch roles as our attention shifts (see
Fig. 2.17).
In user-interface and Web design, the Figure/Ground principle
is often used to
place an impression-inducing background “behind” the primary
displayed content
FIGURE 2.16
Figure/Ground: when objects overlap, we see the smaller as the
figure and the larger as the
ground.
FIGURE 2.15
Symmetry: the human visual system parses very complex two-dimensional images into three-dimensional scenes.
(see Fig. 2.18). The background can convey information (e.g.,
the user’s current location), or it can suggest a theme, brand, or mood for interpretation of the content.
Figure/Ground is also often used to pop up information over
other content. Content that was formerly the figure—the focus of the users’ attention—temporarily becomes the background for new information, which appears briefly as the new
FIGURE 2.17
M. C. Escher exploited figure/ground ambiguity in his art.
FIGURE 2.18
Figure/Ground is used at AndePhotos.com to display a thematic
watermark “behind” the content.
figure (see Fig. 2.19). This approach is usually better than
temporarily replacing the
old information with the new information, because it provides
context that helps
keep people oriented regarding their place in the interaction.
GESTALT PRINCIPLE: COMMON FATE
The previous six Gestalt principles concerned perception of
static (unmoving) figures
and objects. One final Gestalt principle—Common Fate—
concerns moving objects.
The Common Fate principle is related to the Proximity and Similarity principles—like them, it affects whether we perceive objects as grouped. The Common Fate principle states that objects that move together are perceived as grouped or related.
For example, in a display showing dozens of pentagons, if seven
of them wiggled
in synchrony, people would see them as a related group, even if
the wiggling pentagons were separated from each other and looked no different from all the other pentagons (see Fig. 2.20).
Common motion—implying common fates—is used in some
animations to show
relationships between entities. For example, Google’s
GapMinder graphs animate dots
representing nations to show changes over time in various
factors of economic development. Countries that move together share development histories (see Fig. 2.21).
FIGURE 2.19
Figure/Ground is used at PBS.org’s mobile Web site to pop up a
call-to-action “over” the page
content.
GESTALT PRINCIPLES: COMBINED
Of course, in real-world visual scenes, the Gestalt principles work in concert, not in isolation. For example, a typical Mac OS desktop usually exemplifies six of the seven principles described here (excluding Common Fate): Proximity, Similarity, Continuity, Closure, Symmetry, and Figure/Ground (see Fig. 2.22). On a typical desktop, Common Fate is used (along with Similarity) when a user selects several files or folders and drags them as a group to a new location (see Fig. 2.23).
FIGURE 2.20
Common Fate: items appear grouped or related if they move
together.
FIGURE 2.21
Common Fate: GapMinder animates dots to show which nations have similar development histories (for details, animations, and videos, visit GapMinder.org).
FIGURE 2.22
All of the Gestalt principles except Common Fate play a role in
this portion of a Mac OS desktop.
FIGURE 2.23
Similarity and Common Fate: when users drag folders that they
have selected, common highlighting and motion make the selected folders appear grouped.
With all these Gestalt principles operating at once, unintended visual relationships can be implied by a design. A recommended practice, after designing a display, is to view it with each of the Gestalt principles in mind—Proximity, Similarity, Continuity, Closure, Symmetry, Figure/Ground, and Common Fate—to see if the design suggests any relationships between elements that you do not intend.
CHAPTER 3
We Seek and Use Visual Structure
Chapter 2 used the Gestalt principles of visual perception to
show how our visual
system is optimized to perceive structure. Perceiving structure
in our environment
helps us make sense of objects and events quickly. Chapter 2
also mentioned that
when people are navigating through software or Web sites, they
don’t scrutinize
screens carefully and read every word. They scan quickly for
relevant information.
This chapter presents examples to show that when information
is presented in a
terse, structured way, it is easier for people to scan and
understand.
Consider two presentations of the same information about an
airline flight reservation. The first presentation is unstructured prose text; the second is structured text in outline form (see Fig. 3.1). The structured presentation of the reservation can be scanned and understood much more quickly than the prose presentation.
The more structured and terse the presentation of information,
the more quickly
and easily people can scan and comprehend it. Look at the
Contents page from the
California Department of Motor Vehicles (see Fig. 3.2). The
wordy, repetitive links
slow users down and “bury” the important words they need to
see.
Unstructured:
You are booked on United flight 237, which departs from
Auckland at 14:30 on Tuesday 15 Oct and arrives at San
Francisco at 11:40 on Tuesday 15 Oct.
Structured:
Flight: United 237, Auckland → San Francisco
Depart: 14:30 Tue 15 Oct
Arrive: 11:40 Tue 15 Oct
FIGURE 3.1
Structured presentation of airline reservation information is
easier to scan and understand.
Compare that with a terser, more structured hypothetical design
that factors out
needless repetition and marks as links only the words that
represent options
(see Fig. 3.3). All options presented in the actual Contents page
are available in the
revision, yet it consumes less screen space and is easier to scan.
Displaying search results is another situation in which
structuring data and avoiding repetitive “noise” can improve people’s ability to scan quickly and find what they seek. In 2006, search results at HP.com included so much repeated navigation data and metadata for each retrieved item that they were useless. By 2009, HP had eliminated the repetition and structured the results, making them easier to scan and more useful (see Fig. 3.4).
Of course, for information displays to be easy to scan, it is not
enough merely to
make them terse, structured, and nonrepetitious. They must also
conform to the
rules of graphic design, some of which were presented in
Chapter 2.
For example, a prerelease version of a mortgage calculator on a
real estate Web
site presented its results in a table that violated at least two
important rules of
graphic design (see Fig. 3.5A). First, people usually read
(online or offline) from top
to bottom, but the labels for calculated amounts were below
their corresponding
values. Second, the labels were just as close to the value below
as to their own
FIGURE 3.2
Contents page at the California Department of Motor Vehicles
(DMV) Web site buries the
important information in repetitive prose.
Licenses & ID Cards: Renewals, Duplicates, Changes
• Renew license: in person by mail by Internet
• Renew: instruction permit
• Apply for duplicate: license ID card
• Change of: name address
• Register as: organ donor
FIGURE 3.3
California DMV Web site Contents page with repetition
eliminated and better visual structure.
value, so proximity (see Chapter 2) could not be used to
perceive that labels were
grouped with their values. To understand this mortgage results
table, users had to
scrutinize it carefully and slowly figure out which labels went
with which
numbers.
(A) (B)
FIGURE 3.4
In 2006, HP.com’s site search produced repetitious, “noisy”
results (A), but by 2009 was
improved (B).
[Figure content: Mortgage Summary — Monthly Payment $1,840.59; Number of Payments 360; Total of Payments $662,611.22; Interest Total $318,861.22; Tax Total $93,750.00; PMI Total $0.00; Pay off Date Sep 2037.]
(A) (B)
FIGURE 3.5
(A) Mortgage summary presented by a software mortgage
calculator; (B) an improved design.
The revised design, in contrast, allows users to perceive the
correspondence
between labels and values without conscious thought (see Fig.
3.5B).
STRUCTURE ENHANCES PEOPLE’S ABILITY TO SCAN LONG NUMBERS
Even small amounts of information can be made easier to scan
if they are structured.
Two examples are telephone numbers and credit card numbers
(see Fig. 3.6). Traditionally, such numbers were broken into parts to make them easier to scan and remember.
A long number can be broken up in two ways: either the user
interface breaks it
up explicitly by providing a separate field for each part of the
number, or the interface provides a single number field but lets users break the number into parts with spaces or punctuation (see Fig. 3.7A). However, many of today’s computer presentations of phone and credit card numbers do not segment the numbers and do not
Easy: (415) 123 4567
Hard: 4151234567
Easy: 1234 5678 9012 3456
Hard: 1234567890123456
FIGURE 3.6
Telephone and credit card numbers are easier to scan and
understand when segmented.
(A)
(B)
FIGURE 3.7
(A) At Democrats.org, credit card numbers can include spaces.
(B) At StuffIt.com, they cannot,
making them harder to scan and verify.
allow users to include spaces or other punctuation (see Fig. 3.7B). This limitation makes it harder for people to scan a number or verify that they typed it correctly, and so is considered a user-interface design blooper (Johnson, 2007). Forms presented in software and Web sites should accept credit card numbers, social security numbers, phone numbers, and so on in a variety of different formats and parse them into the internal format.
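As a concrete illustration of that guideline, here is a minimal sketch—with hypothetical function names, not code from any site shown in the figures—of accepting a card number in several formats and normalizing it to an internal, digits-only form that can be redisplayed segmented:

// Hypothetical helpers illustrating the guideline; not from the book.

// Accept digits with any mix of spaces, dots, or dashes; store digits only.
function normalizeCardNumber(input: string): string | null {
  const digits = input.replace(/[\s.-]/g, ""); // strip common separators
  return /^\d{13,19}$/.test(digits) ? digits : null; // card numbers are 13-19 digits
}

// Redisplay the internal form segmented in groups of four for easy scanning.
function formatCardNumber(digits: string): string {
  return digits.replace(/(\d{4})(?=\d)/g, "$1 ");
}

console.log(normalizeCardNumber("1234-5678 9012.3456")); // "1234567890123456"
console.log(formatCardNumber("1234567890123456"));       // "1234 5678 9012 3456"

The design point is that the user may type in whatever segmentation helps them scan and verify, while the software keeps a single canonical internal value.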
Segmenting data fields can provide useful visual structure even
when the data to
be entered is not, strictly speaking, a number. Dates are an
example of a case in
which segmented fields can improve readability and help
prevent data entry errors,
as shown by a date field at Bank of America’s Web site (see
Fig. 3.8).
DATA-SPECIFIC CONTROLS PROVIDE EVEN MORE STRUCTURE
A step up in structure from segmented data fields is data-specific controls. Instead of using simple text fields—whether segmented or not—designers can use controls that are designed specifically to display (and accept as input) a value of a specific type. For example, dates can be presented (and accepted) in the form of menus combined with pop-up calendar controls (see Fig. 3.9).
It is also possible to provide visual structure by mixing segmented text fields with data-specific controls, as demonstrated by an email address field at Southwest Airlines’ Web site (see Fig. 3.10).
FIGURE 3.8
At BankOfAmerica.com, segmented data fields provide useful
structure.
FIGURE 3.10
At SWA.com email addresses are entered into fields structured
to accept parts of the address.
FIGURE 3.9
At NWA.com, dates are displayed and entered using a control
that is specifically designed for dates.
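On the web, data-specific date controls of this kind are available natively. A minimal sketch (assuming a browser page; the field itself is hypothetical) that attaches the built-in calendar control:

// Sketch: attach a native date picker; the browser renders the calendar
// control and returns an unambiguous ISO 8601 value (e.g., "2014-10-15").
const dateField = document.createElement("input");
dateField.type = "date";            // data-specific control, not a free text field
dateField.addEventListener("change", () => {
  console.log(dateField.value);     // internal format, no parsing ambiguity
});
document.body.appendChild(dateField);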
VISUAL HIERARCHY LETS PEOPLE FOCUS ON THE RELEVANT INFORMATION
One of the most important goals in structuring information presentations is to provide a visual hierarchy—an arrangement that:
• Breaks the information into distinct sections, and breaks large sections into subsections.
• Labels each section and subsection prominently and in such a way as to clearly identify its content.
• Presents the sections and subsections as a hierarchy, with higher-level sections presented more strongly than lower-level ones.
A visual hierarchy allows people, when scanning information, to instantly separate what is relevant to their goals from what is irrelevant, and to focus their attention on the relevant information. They find what they are looking for more quickly because they can easily skip everything else.
Try it for yourself. Look at the two information displays in
Figure 3.11 and find the
information about prominence. How much longer does it take
you to find it in the
nonhierarchical presentation?
Create a Clear Visual Hierarchy
Organize and prioritize the contents of a page by
using size, prominence, and content relationships.
Let’s look at these relationships more closely:
• Size. The more important a headline is, the larger
its font size should be. Big bold headlines help to
grab the user’s attention as they scan the Web
page.
• Content Relationships. Group similar content
types by displaying the content in a similar visual
style, or in a clearly defined area.
• Prominence. The more important the headline or
content, the higher up the page it should be placed.
The most important or popular content should
always be positioned prominently near the top of
the page, so users can view it without having to
scroll too far.
Create a Clear Visual Hierarchy
Organize and prioritize the contents
of a page by using size, prominence,
and content relationships. Let’s look
at these relationships more closely.
The more important a headline is,
the larger its font size should be.
Big bold headlines help to grab the
user’s attention as they scan the
Web page. The more important the
headline or content, the higher up
the page it should be placed. The
most important or popular content
should always be positioned
prominently near the top of the page,
so users can view it without having to
scroll too far. Group similar content
types by displaying the content in a
similar visual style, or in a clearly
defined area.
(A) (B)
FIGURE 3.11
Find the advice about prominence in each of these displays.
Prose text format (A) makes
people read everything. Visual hierarchy (B) lets people ignore
information irrelevant to their
goals.
The examples in Figure 3.11 show the value of visual hierarchy
in a textual,
read-only information display. Visual hierarchy is equally
important in interactive
control panels and forms—perhaps even more so. Compare
dialog boxes from
two different music software products (see Fig. 3.12). The
Reharmonize dialog
box of Band-in-a-Box has poor visual hierarchy, making it hard
for users to find
things quickly. In contrast, GarageBand’s Audio/MIDI control
panel has good
visual hierarchy, so users can quickly find the settings they are
interested in.
(A)
(B)
FIGURE 3.12
Visual hierarchy in interactive control panels and forms lets
users find settings quickly: (A)
Band-in-a-Box (bad) and (B) GarageBand (good).
CHAPTER 4
Our Color Vision is Limited
Human color perception has both strengths and limitations, many of which are relevant to user-interface design. For example:
• Our vision is optimized to detect contrasts (edges), not absolute brightness.
• Our ability to distinguish colors depends on how colors are presented.
• Some people have color-blindness.
• The user’s display and viewing conditions affect color perception.
To understand these qualities of human color vision, let’s start
with a brief
description of how the human visual system processes color
information from the
environment.
HOW COLOR VISION WORKS
If you took introductory psychology or neurophysiology in
college, you probably
learned that the retina at the back of the human eye—the surface
onto which the eye
focuses images—has two types of light receptor cells: rods and
cones. You probably
also learned that the rods detect light levels but not colors,
while the cones detect
colors. Finally, you probably learned that there are three types
of cones—sensitive to
red, green, and blue light—suggesting that our color vision is
similar to video cam-
eras and computer displays, which detect or project a wide
variety of colors through
combinations of red, green, and blue pixels.
What you learned in college is only partly right. People with
normal vision do in
fact have rods and three types of cones1 in their retinas. The
rods are sensitive to
overall brightness while the three types of cones are sensitive to
different
1 People with color-blindness may have fewer than three, and
some women have four, cone types (Eagleman,
2012).
frequencies of light. But that is where the truth departs from
what most people
learned in college, until recently.
First, those of us who live in industrialized societies hardly use our rods at all. They function only at low levels of light. They are for getting around in poorly lighted environments—the environments our ancestors lived in until the nineteenth century. Today, we use our rods only when we are having dinner by candlelight, feeling our way around our dark house at night, camping outside after dark, etc.
(see Chapter 5). In
bright daylight and modern artificially lighted environments—
where we spend most of
our time—our rods are completely maxed out, providing no
useful information. Most of
the time, our vision is based entirely on input from our cones
(Ware, 2008).
So how do our cones work? Are the three types of cones
sensitive to red, green,
and blue light, respectively? In fact, each type of cone is
sensitive to a wider range of
light frequencies than you might expect, and the sensitivity
ranges of the three types
overlap considerably. In addition, the overall sensitivity of the
three types of cones
differs greatly (see Fig. 4.1A):
• Low frequency. These cones are sensitive to light over almost the entire range of visible light, but are most sensitive to the middle (yellow) and low (red) frequencies.
• Medium frequency. These cones respond to light ranging from the high-frequency blues through the lower middle-frequency yellows and oranges. Overall, they are less sensitive than the low-frequency cones.
• High frequency. These cones are most sensitive to light at the upper end of the visible light spectrum—violets and blues—but they also respond weakly to middle frequencies, such as green. These cones are much less sensitive overall than the other two types of cones, and also less numerous. One result is that our eyes are much less sensitive to blues and violets than to other colors.
Compare a graph of the light sensitivity of our retinal cone cells
(Fig. 4.1A) to
what the graph might look like if electrical engineers had
designed our retinas as a
mosaic of receptors sensitive to red, green, and blue, like a
camera (Fig. 4.1B).
[Two-panel plot, (A) and (B): x-axis, Wavelength (nanometers), 400–700; y-axis, Relative absorbance, 0–1.0; panel (A) shows curves labeled L, M, and H.]
FIGURE 4.1
Sensitivity of the three types of retinal cones (A) versus
artificial red, green, and blue receptors (B).
Given the odd relationships among the sensitivities of our three
types of retinal
cone cells, one might wonder how the brain combines the
signals from the cones to
allow us to see a broad range of colors.
The answer is by subtraction. Neurons in the visual cortex at the back of our brain subtract the signals coming over the optic nerves from the medium- and low-frequency cones, producing a red–green difference signal channel. Other neurons in the visual cortex subtract the signals from the high- and low-frequency cones, yielding a yellow–blue difference signal channel. A third group of neurons in the visual cortex adds the signals coming from the low- and medium-frequency cones to produce an overall luminance (or black–white) signal channel.2 These three channels are called color-opponent channels.
The brain then applies additional subtractive processes to all three color-opponent channels: signals coming from a given area of the retina are effectively subtracted from similar signals coming from nearby areas of the retina.
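The subtraction is simple enough to sketch in code. The following toy model uses illustrative signal names, unit weights, and sign conventions—assumptions, not measured physiology—to show how the three cone signals could combine into the three opponent channels:

// Toy model of color-opponent coding; names and sign conventions are
// illustrative assumptions, not physiological constants.
interface ConeResponse {
  low: number;    // low-frequency (L) cones
  medium: number; // medium-frequency (M) cones
  high: number;   // high-frequency (H) cones
}

function opponentChannels(c: ConeResponse) {
  return {
    redGreen: c.low - c.medium,  // difference of low- and medium-frequency signals
    yellowBlue: c.low - c.high,  // difference of low- and high-frequency signals
    luminance: c.low + c.medium, // brightness sum omits the insensitive H cones
  };
}

console.log(opponentChannels({ low: 0.8, medium: 0.6, high: 0.2 }));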
VISION IS OPTIMIZED FOR CONTRAST, NOT BRIGHTNESS
All this subtraction makes our visual system much more
sensitive to differences in
color and brightness—that is, to contrasting colors and edges—
than to absolute
brightness levels.
To see this, look at the inner bar in Figure 4.2. The inner bar looks darker on the right, but in fact it is one solid shade of gray. To our contrast-sensitive visual system, it looks lighter on the left and darker on the right because the outer rectangle is darker on the left and lighter on the right.
The sensitivity of our visual system to contrast rather than to
absolute brightness
is an advantage: it helped our distant ancestors recognize a
leopard in the nearby
bushes as the same dangerous animal whether they saw it in
bright noon sunlight or
in the early morning hours of a cloudy day. Similarly, being
sensitive to color
2 The overall brightness sum omits the signal from the high-frequency (blue–violet) cones. Those cones are so insensitive that their contribution to the total would be negligible, so omitting them makes little difference.
FIGURE 4.2
The inner gray bar looks darker on the right, but in fact is all
one shade of gray.
contrasts rather than to absolute colors allows us to see a rose
as the same red
whether it is in the sun or the shade.
Brain researcher Edward H. Adelson at the Massachusetts
Institute of Technology
developed an outstanding illustration of our visual system’s
insensitivity to absolute
brightness and its sensitivity to contrast (see Fig. 4.3). As
difficult as it may be to
believe, square A on the checkerboard is exactly the same shade
as square B. Square
B only appears white because it is depicted as being in the
cylinder’s shadow.
THE ABILITY TO DISCRIMINATE COLORS DEPENDS ON HOW COLORS ARE PRESENTED
Even our ability to detect differences between colors is limited. Because of how our visual system works, three presentation factors affect our ability to distinguish colors from each other:
• Paleness. The paler (less saturated) two colors are, the harder it is to tell them apart (see Fig. 4.4A).
• Color patch size. The smaller or thinner objects are, the harder it is to distinguish their colors (see Fig. 4.4B). Text is often thin, so the exact color of text is often hard to determine.
• Separation. The more separated color patches are, the more difficult it is to distinguish their colors, especially if the separation is great enough to require eye motion between patches (see Fig. 4.4C).
Several years ago, the online travel website ITN.net used two
pale colors—white
and pale yellow—to indicate which step of the reservation
process the user was on
(see Fig. 4.5). Some site visitors couldn’t see which step they
were on.
FIGURE 4.3
The squares marked A and B are the same gray. We see B as white because it is depicted as being in the cylinder’s shadow.
(A) (B) (C)
FIGURE 4.4
Factors affecting ability to distinguish colors: (A) paleness, (B)
size, and (C) separation.
FIGURE 4.5
The pale color marking the current step makes it hard for users
to see which step in the airline
reservation process they are on in ITN.net’s 2003 website.
[Line chart whose legend uses tiny color patches for series S1, S6, S11, and S16.]
FIGURE 4.6
Tiny color patches in this chart legend are hard to distinguish.
[Bar chart with a legend of large color patches: Beverages, Condiments, Confections, Dairy products, Grains/Cereals, Meat/Poultry, Produce, Seafood; x-axis 0–800.]
FIGURE 4.7
Large color patches make it easier to distinguish the colors.
FIGURE 4.8
The difference in color between visited and unvisited links is
too subtle in MinneapolisFed.org’s
website.
Small color patches are often seen in data charts and plots. Many business graphics packages produce legends on charts and plots, but make the color patches in the legend very small (see Fig. 4.6). Color patches in chart legends should be large to help people distinguish the colors (see Fig. 4.7).
On websites, a common use of color is to distinguish
unfollowed links from
already followed ones. On some sites, the “followed” and
“unfollowed” colors are too
similar. The website of the Federal Reserve Bank of
Minneapolis (see Fig. 4.8) has
this problem. Furthermore, the two colors are shades of blue,
the color range in
which our eyes are least sensitive. Can you spot the two
followed links?3
3 Already followed links in Figure 4.8: Housing Units
Authorized and House Price Index.
COLOR-BLINDNESS
A fourth factor of color presentation that affects design principles for interactive systems is whether the colors can be distinguished by people who have common types of color-blindness. Having color-blindness doesn’t mean an inability to see colors. It just means that one or more of the color subtraction channels (see the “How Color Vision Works” section) don’t function normally, making it difficult to distinguish certain pairs of colors.
Approximately 8% of men and slightly under 0.5% of women have a color perception deficit: difficulty discriminating certain pairs of colors (Wolfmaier, 1999). The most common type of color-blindness is red–green; other types are much rarer. Figure 4.9 shows color pairs that people with red–green color-blindness have trouble distinguishing.
(A)
(C)
(B)
FIGURE 4.9
Red–green color-blind people can’t distinguish (A) dark red from black, (B) blue from purple, and (C) light green from white.
FIGURE 4.10
MoneyDance’s graph uses colors some users can’t distinguish.
FIGURE 4.11
MoneyDance’s graph rendered in grayscale.
(A) (B)
FIGURE 4.12
Google logo: (A) normal and (B) after red–green color-blindness filter.
The home finance application MoneyDance provides a graphical breakdown of household expenses, using color to indicate the various expense categories (see Fig. 4.10). Unfortunately, many of the colors are hues that color-blind people cannot tell apart. For example, people with red–green color-blindness cannot distinguish the blue from the purple or the green from the khaki. If you are not color-blind, you can get an idea of which colors in an image will be hard to distinguish by converting the image to grayscale (see Fig. 4.11), but, as described in the “Guidelines for Using Color” section later in this chapter, it is best to run the image through a color-blindness filter or simulator (see Fig. 4.12).
EXTERNAL FACTORS THAT INFLUENCE THE ABILITY TO DISTINGUISH COLORS
Factors concerning the external environment also impact people’s ability to distinguish colors. For example:
• Variation among color displays. Computer displays vary in how they display colors, depending on their technologies, driver software, or color settings. Even monitors of the same model with the same settings may display colors slightly differently. Something that looks yellow on one display may look beige on another. Colors that are clearly different on one may look the same on another.
• Grayscale displays. Although most displays these days are color, there are devices, especially small handheld ones, with grayscale displays. For instance, Figure 4.11 shows that a grayscale display can make areas of different colors look the same.
• Display angle. Some computer displays, particularly LCD ones, work much better when viewed straight on than at an angle. When LCD displays are viewed at an angle, colors—and color differences—often are altered.
• Ambient illumination. Strong light on a display washes out colors before it washes out light and dark areas, reducing color displays to grayscale ones, as anyone who has tried to use a bank ATM in direct sunlight knows. In offices, glare and venetian blind shadows can mask color differences.
These four external factors are usually out of the software designer’s control. Designers should, therefore, keep in mind that they don’t have full control of users’ color-viewing experience. Colors that seem highly distinguishable in the development facility, on the development team’s computer displays and under normal office lighting conditions, may not be as distinguishable in some of the environments where the software is used.
GUIDELINES FOR USING COLOR
In interactive software systems that rely on color to convey information, follow these five guidelines to ensure that users receive the information:
1. Distinguish colors by saturation and brightness, as well as hue. Avoid subtle color differences. Make sure the contrast between colors is high (but see guideline 5). One way to test whether colors are different enough is to view them in grayscale (a minimal sketch of such a check follows this list). If you can’t distinguish the colors when they are rendered in grays, they aren’t different enough.
2. Use distinctive colors. Recall that our visual system combines the signals from retinal cone cells to produce three color-opponent channels: red–green, yellow–blue, and black–white (luminance). The colors that people can distinguish most easily are those that cause a strong signal (positive or negative) on one of the three color-perception channels, and neutral signals on the other two channels. Not surprisingly, those colors are red, green, yellow, blue, black, and white (see Fig. 4.13). All other colors cause signals on more than one color channel, and so our visual system cannot distinguish them from other colors as quickly and easily as it can distinguish those six colors (Ware, 2008).
3. Avoid color pairs that color-blind people cannot distinguish. Such pairs include dark red versus black, dark red versus dark green, blue versus purple, and light green versus white. Don’t use dark reds, blues, or violets against any dark colors. Instead, use dark reds, blues, and violets against light yellows and greens. Use an online color-blindness simulator4 to check web pages and images to see how people with various color-vision deficiencies would see them.
4. Use color redundantly with other cues. Don’t rely on color alone. If you use color to mark something, mark it another way as well. Apple’s iPhoto uses both color and a symbol to distinguish “smart” photo albums from regular albums (see Fig. 4.14).
5. Separate strong opponent colors. Placing opponent colors right next to or on top of each other causes a disturbing shimmering sensation, and so it should be avoided (see Fig. 4.15).
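As promised under guideline 1, here is a minimal sketch of a grayscale check, using the standard Rec. 601 luma weights; the similarity threshold is an arbitrary assumption:

// Convert a color to grayscale luminance using Rec. 601 luma weights.
function luminance([r, g, b]: [number, number, number]): number {
  return 0.299 * r + 0.587 * g + 0.114 * b; // 0-255 scale
}

// Flag color pairs that would look alike in grayscale; threshold assumed.
function tooSimilarInGray(a: [number, number, number],
                          b: [number, number, number],
                          threshold = 30): boolean {
  return Math.abs(luminance(a) - luminance(b)) < threshold;
}

console.log(tooSimilarInGray([255, 0, 0], [0, 128, 0]));   // red vs. green: true
console.log(tooSimilarInGray([255, 255, 0], [0, 0, 255])); // yellow vs. blue: false

Note that pure red and medium green fail the check even though their hues differ strongly—exactly the kind of pair guideline 1 warns about.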
As shown in Figure 4.5, ITN.net used only pale yellow to mark customers’ current step in making a reservation, which is too subtle. A simple way to strengthen the marking would be to make the current step bold and increase the saturation of the
4 Search the Web for “color-blindness filter” or “color-blindness simulator.”
FIGURE 4.13
The most distinctive colors: black, white, red, green, yellow,
blue. Each color causes a strong
signal on only one color-opponent channel.
FIGURE 4.14
Apple’s iPhoto uses color plus a symbol to distinguish two
types of albums.
FIGURE 4.15
Opponent colors, placed on or directly next to each other, clash.
yellow (see Fig. 4.16A). But ITN.net opted for a totally new design, which also uses color redundantly with shape (see Fig. 4.16B).
A graph from the Federal Reserve Bank uses shades of gray (see Fig. 4.17). This is a well-designed graph: any sighted person could read it.
(A)
(B)
FIGURE 4.16
ITN.net’s current step is highlighted in two ways: with color
and shape.
FIGURE 4.17
MinneapolisFed.org’s graph uses shade differences visible to all
sighted people, on any display.
CHAPTER 5
Our Peripheral Vision is Poor
Chapter 4 explained that the human visual system differs from a
digital camera in
the way it detects and processes color. Our visual system also
differs from a camera
in its resolution. On a digital camera’s photo sensor,
photoreceptive elements are
spread uniformly in a tight matrix, so the spatial resolution is
constant across the
entire image frame. The human visual system is not like that.
This chapter explains why
• Stationary items in muted colors presented in the periphery of people’s visual field often will not be noticed.
• Motion in the periphery is usually noticed.
RESOLUTION OF THE FOVEA COMPARED TO THE PERIPHERY
The spatial resolution of the human visual field drops greatly
from the center to the
edges. There are three reasons for this:
• Pixel density. Each eye has 6 to 7 million retinal cone cells. They are packed much more tightly in the center of our visual field—a small region called the fovea—than they are at the edges of the retina (see Fig. 5.1). The fovea has about 158,000 cone cells in each square millimeter. The rest of the retina has only 9,000 cone cells per square millimeter.
• Data compression. Cone cells in the fovea connect 1:1 to the ganglial neuron cells that begin the processing and transmission of visual data, while elsewhere on the retina, multiple photoreceptor cells (cones and rods) connect to each ganglion cell. In technical terms, information from the visual periphery is compressed (with data loss) before transmission to the brain, while information from the fovea is not.
• Processing resources. The fovea is only about 1% of the retina, but the brain’s visual cortex devotes about 50% of its area to input from the fovea. The other half of the visual cortex processes data from the remaining 99% of the retina.
The result is that our vision has much, much greater resolution in the center of our visual field than elsewhere (Lindsay and Norman, 1972; Waloszek, 2005). Said in developer jargon: in the center 1% of your visual field (i.e., the fovea), you have a high-resolution TIFF; everywhere else, you have only a low-resolution JPEG. That is nothing like a digital camera.
To visualize how small the fovea is compared to your entire
visual field, hold your
arm straight out and look at your thumb. Your thumbnail,
viewed at arm’s length,
corresponds approximately to the fovea (Ware, 2008). While
you have your eyes
focused on the thumbnail, everything else in your visual field
falls outside of your
fovea on your retina.
In the fovea, people with normal vision have very high
resolution: they can
resolve several thousand dots within that region—better
resolution than many of
today’s pocket digital cameras. Just outside of the fovea, the
resolution is already
down to a few dozen dots per inch viewed at arm’s length. At
the edges of our
vision, the “pixels” of our visual system are as large as a melon
(or human head) at
arm’s length (see Fig. 5.2).
Even though our eyes have more rods than cones—125 million versus 6–7 million—peripheral vision has much lower resolution than foveal vision. This is because while most of our cone cells are densely packed in the fovea (1% of the retina’s area), the rods are spread out over the rest of the retina (99% of the retina’s area). In people with normal vision, peripheral vision is about 20/200, which in the United States is considered
[Plot: number of receptors per square millimeter (0–180,000) versus angle from the fovea (70° on one side to 80° on the other), showing rod and cone densities across the retina and the gap at the blind spot.]
FIGURE 5.1
Distribution of photoreceptor cells (cones and rods) across the
retina. From Lindsay and Norman
(1972).
legally blind. Think about that: in the periphery of your visual
field, you are legally
blind. Here is how brain researcher David Eagleman (2012;
page 23) describes it:
The resolution in your peripheral vision is roughly equivalent to
looking through a frosted
shower door, and yet you enjoy the illusion of seeing the
periphery clearly. … Wherever
you cast your eyes appears to be in sharp focus, and therefore
you assume the whole
visual world is in focus.
If our peripheral vision has such low resolution, one might
wonder why we don’t see
the world in a kind of tunnel vision where everything is out of
focus except what we
are directly looking at now. Instead, we seem to see our
surroundings sharply and
clearly all around us. We experience this illusion because our
eyes move rapidly and
constantly about three times per second even when we don’t
realize it, focusing our
fovea on selected pieces of our environment. Our brain fills in
the rest in a gross,
impressionistic way based on what we know and expect.1 Our
brain does not have to
maintain a high-resolution mental model of our environment
because it can order the
eyes to sample and resample details in the environment as
needed (Clark, 1998).
For example, as you read this page, your eyes dart around,
scanning and reading.
No matter where on the page your eyes are focused, you have
the impression of
viewing a complete page of text, because, of course, you are.
1 Our brains also fill in perceptual gaps that occur during rapid (saccadic) eye movements, when vision is suppressed (see Chapter 14).
(A) (B)
FIGURE 5.2
The resolution of our visual field is high in the center but much
lower at the edges. Right image
from Vision Research, Vol. 14 (1974), Elsevier.
But now, imagine that you are viewing this page on a computer
screen, and the
computer is tracking your eye movements and knows where
your fovea is on the
page. Imagine that wherever you look, the right text for that
spot on the page is
shown clearly in the small area corresponding to your fovea, but
everywhere else on
the page, the computer shows random, meaningless text. As your fovea flits around the page, the computer quickly updates each area where your fovea stops to show the correct text there, while the last position of your fovea returns to textual noise. Amazingly, experiments have shown that people rarely notice this: not only can they read, they believe that they are viewing a full page of meaningful text (Clark, 1998). However, it does slow people’s reading, even if they don’t realize it (Larson, 2004).
The fact that retinal cone cells are distributed tightly in and near the fovea, and sparsely in the periphery of the retina, affects not only spatial resolution but also color resolution. We can discriminate colors better in the center of our visual field than at the edges.
Another interesting fact about our visual field is that it has a
gap—a small area (blind
spot) in which we see nothing. The gap corresponds to the spot
on our retina where the
optic nerve and blood vessels exit the back of the eye (see Fig.
5.1). There are no retinal
rod or cone cells at that spot, so when the image of an object in
our visual field happens
to fall on that part of the retina, we don’t see it. We usually
don’t notice this hole in our
vision because our brain fills it in with the surrounding content,
like a graphic artist
using Photoshop to fill in a blemish on a photograph by copying
nearby background
pixels.
People sometimes experience the blind spot when they gaze at
stars. As you look
at one star, a nearby star may disappear briefly into the blind
spot until you shift your
gaze. You can also observe the gap by trying the exercise in
Figure 5.3. Some people
have other gaps resulting from imperfections on the retina,
retinal damage, or brain
strokes that affect the visual cortex,2 but the optic nerve gap is
an imperfection
everyone shares.
IS THE VISUAL PERIPHERY GOOD FOR ANYTHING?
It seems that the fovea is better than the periphery at just about
everything. One might
wonder why we have peripheral vision. What is it good for? Our
peripheral vision
serves three important functions: it guides the fovea, detects motion, and lets us see better in the dark.
2 See VisionSimulations.com.
FIGURE 5.3
To “see” the retinal gap, cover your left eye, hold this book
near your face, and focus your right
eye on the +. Move the book slowly away from you, staying
focused on the +. The @ will disappear
at some point.
Function 1: Guides the fovea
First, peripheral vision provides low-resolution cues to guide
our eye movements so that
our fovea visits all the interesting and crucial parts of our visual
field. Our eyes don’t scan
our environment randomly. They move so as to focus our fovea
on important things, the
most important ones (usually) first. The fuzzy cues on the
outskirts of our visual field
provide the data that helps our brain plan where to move our
eyes, and in what order.
For example, when we scan a medicine label for a “use by”
date, a fuzzy blob in
the periphery with the vague form of a date is enough to cause
an eye movement that
lands the fovea there to allow us to check it. If we are browsing
a produce market
looking for strawberries, a blurry reddish patch at the edge of
our visual field draws
our eyes and our attention, even though sometimes it may turn
out to be radishes
instead of strawberries. If we hear an animal growl nearby, a
fuzzy animal-like shape
in the corner of our eye will be enough to zip our eyes in that
direction, especially if
the shape is moving toward us (see Fig. 5.4).
How peripheral vision guides and augments central, foveal
vision is discussed
more in the “Visual Search Is Linear Unless Targets ‘Pop’ in
the Periphery” section
later in this chapter.
Function 2: Detects motion
A related guiding function of peripheral vision is that it is good
at detecting motion.
Anything that moves in our visual periphery, even slightly, is
likely to draw our
attention—and hence our fovea—toward it. The reason for this
phenomenon is that
our ancestors—including prehuman ones—were selected for
their ability to spot
food and avoid predators. As a result, even though we can move
our eyes under
conscious, intentional control, some of the mechanisms that
control where they look
are preconscious, involuntary, and very fast.
FIGURE 5.4
A moving shape at the edge of our vision draws our eye: it
could be food, or it might consider us food.
What if we have no reason to expect that there might be
anything interesting in
a certain spot in the periphery,3 and nothing in that spot attracts
our attention? Our
eyes may never move our fovea to that spot, so we may never
see what is there.
Function 3: Lets us see better in the dark
A third function of peripheral vision is to allow us to see in
low-light conditions—for
example, on starlit nights, in caves, around campfires, etc.
These were conditions under
which vision evolved, and in which people—like the animals
that preceded them on
Earth—spent much of their time until the invention of the
electric light bulb in the 1800s.
Just as the rods are overloaded in well-lighted conditions (see Chapter 4), the cones don’t function very well in low light, so our rods take over. Low-light, rods-only vision is called scotopic vision. An interesting fact is that because there are no rods in the fovea, you can see objects better in low-light conditions (e.g., faint stars) if you don’t look directly at them.
EXAMPLES FROM COMPUTER USER INTERFACES
The low acuity of our peripheral vision explains why software and website users fail to notice error messages in some applications and websites. When someone clicks a button or a link, that is usually where his or her fovea is positioned. Everything on the screen that is not within 1–2 centimeters of the click location (assuming normal computer viewing distance) is in peripheral vision, where resolution is low. If, after the click, an error message appears in the periphery, it should not be surprising that the person might not notice it.
For example, at InformaWorld.com, the online publications
website of Informa
Healthcare, if a user enters an incorrect username or password
and clicks “Sign In,”
an error message appears in a “message bar” far away from
where the user’s eyes are
most likely focused (see Fig. 5.5). The red word “Error” might
appear in the user’s
peripheral vision as a small reddish blob, which would help
draw the eyes in that
direction. However, the red blob could fall into a gap in the
viewer’s visual field, and
so not be noticed at all.
Consider the sequence of events from a user’s point of view.
The user enters a
username and password and then clicks “Sign In.” The page
redisplays with blank
fields. The user thinks “Huh? I gave it my login information and
hit ‘Sign In,’ didn’t I?
Did I hit the wrong button?” The user reenters the username and
password, and
clicks “Sign In” again. The page redisplays with empty fields
again. Now the user is
really confused. The user sighs (or curses), sits back in his chair
and lets his eyes scan
the screen. Suddenly noticing the error message, the user says
“A-ha! Has that error
message been there all along?”
3 See Chapter 1 on how expectations bias our perceptions.
Even when an error message is placed nearer to the center of the viewer’s visual field than in the preceding example, other factors can diminish its visibility. For example, until recently the website of Airborne.com signaled a login failure by displaying an error message in red just above the Login ID field (see Fig. 5.6). This error message is entirely in red and fairly near the “Login” button where the user’s eyes are probably focused. Nonetheless, some users would not notice this error message when it first appeared. Can you think of any reasons people might not initially see this error message?
One reason is that even though the error message is much closer to where users will be looking when they click the “Login” button, it is still in the periphery, not in the fovea. The fovea is small: just a centimeter or two on a computer screen, assuming the user is the usual distance from the screen. A second reason is that the error message is not the only thing near the top of the page that is red. The page title is also red. Resolution in the periphery is low, so when the error message appears, the user’s visual system may not register any change: there was something red up there before, and there still is (see Fig. 5.7).
If the page title were black or any other color besides red, the
red error message
would be more likely to be noticed, even though it appears in
the periphery of the
users’ visual field.
FIGURE 5.5
This error message for a faulty sign-in appears in peripheral
vision, where it will probably be
missed.
COMMON METHODS OF MAKING MESSAGES VISIBLE
There are several common and well-known methods of ensuring that an error message will be seen:
• Put it where users are looking. People focus in predictable places when interacting with graphical user interfaces (GUIs). In Western societies, people tend to traverse forms and control panels from upper left to lower right. While moving the screen pointer, people usually look either at where it is or where they are moving it to. When people click a button or link, they can usually be assumed to be looking directly at it, at least for a few moments afterward. Designers can use this predictability to position error messages near where they expect users to be looking.
FIGURE 5.6
This error message for a faulty login is missed by some users
even though it is not far from the
“Login” button.
FIGURE 5.7
Simulation of a user’s visual field while the fovea is fixed on
the “Login” button.
FIGURE 5.8
This error message for faulty sign-in is displayed more
prominently, near where users will be
looking.
• Mark the error. Somehow mark the error prominently to indicate clearly that something is wrong. Often this can be done by simply placing the error message near what it refers to, unless that would place the message too far from where users are likely to be looking.
• Use an error symbol. Make errors or error messages more visible by marking them with an error symbol.
• Reserve red for errors. By convention, in interactive computer systems the color red connotes alert, danger, problem, error, etc. Using red for any other information on a computer display invites misinterpretation. But suppose you are designing a website for Stanford University, which has red as its school color. Or suppose you are designing for a Chinese market, where red is considered an auspicious, positive color. What do you do? Use another color for errors, mark them with error symbols, or use stronger methods (see the next section).
An improved version of the InformaWorld sign-in error screen uses several of these techniques (see Fig. 5.8).
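A minimal sketch of the first two methods (the field name and validation rule here are hypothetical): place the message next to the offending control, where the user’s fovea already is, and mark it with a symbol:

// Sketch: show an error message right next to the field it refers to,
// marked with a symbol, instead of in a distant message bar.
function showInlineError(field: HTMLInputElement, message: string): void {
  const note = document.createElement("span");
  note.className = "field-error";          // style this red, with high contrast
  note.textContent = "\u26A0 " + message;  // warning symbol adds a redundant cue
  field.insertAdjacentElement("afterend", note); // lands near the user's fovea
}

const username = document.querySelector<HTMLInputElement>("#username");
if (username && username.value.trim() === "") {
  showInlineError(username, "Please enter your username.");
}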
At America Online’s website, the form for registering for a new email account follows the guidelines pretty well (see Fig. 5.9). Data fields with errors are marked with red error symbols. Error messages are displayed in red and are near the error. Furthermore, most of the error messages appear as soon as an erroneous entry is made, when the user is still focused on that part of the form, rather than only after the user submits the form. It is unlikely that AOL users will miss seeing these error messages.
HEAVY ARTILLERY FOR MAKING USERS NOTICE MESSAGES
If the common, conventional methods of making users notice messages are not enough, three stronger methods are available to user-interface designers: pop-up messages in error dialog boxes, sound (e.g., beeps), and brief wiggling or blinking. However, these methods, while very effective, have significant negative effects, so they should be used sparingly and with great care.
Method 1: Pop-up message in error dialog box
Displaying an error message in a dialog box sticks it right in the user’s face, making it hard to miss. Error dialog boxes interrupt the user’s work and demand immediate attention. That is good if the error message signals a critical condition, but it can annoy people if such an approach is used for a minor message, such as confirming the execution of a user-requested action.
The annoyance of pop-up messages rises with the degree of modality. Nonmodal pop-ups allow users to ignore them and continue working. Application-modal pop-ups block any further work in the application that displayed the error, but allow users to interact with other software on their computer. System-modal pop-ups block any user action until the dialog has been dismissed.
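Browsers expose part of this distinction directly through the standard HTMLDialogElement API; a minimal sketch—show() for a nonmodal message, showModal() for a modal one:

// Sketch of the modality spectrum using the standard <dialog> element.
const dialog = document.createElement("dialog");
dialog.textContent = "Unsaved changes may be lost.";
document.body.appendChild(dialog);

dialog.show();      // nonmodal: users can ignore it and keep working
dialog.close();

dialog.showModal(); // modal: blocks interaction with the page until dismissed

(There is no true system-modal dialog on the web; showModal() blocks only the page that opened it.)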
Application-modal pop-ups should be used sparingly—for example, only when application data may be lost if the user doesn’t attend to the error. System-modal
FIGURE 5.9
New member registration at AOL.com displays error messages
prominently, near each error.
pop-ups should be used extremely rarely—basically only when the system is about to crash and take hours of work with it, or if people will die if the user misses the error message.
On the Web, an additional reason to avoid pop-up error dialog
boxes is that some
people set their browsers to block all pop-up windows. If your
website relies on
pop-up error messages, some users may never see them.
REI.com has an example of a pop-up dialog being used to display an error message. The message is displayed when someone who is registering as a new customer omits required fields in the form (see Fig. 5.10). Is this an appropriate use of a pop-up dialog? AOL.com (see Fig. 5.9) shows that missing-data errors can be signaled quite well without pop-up dialogs, so REI.com’s use of them seems a bit heavy-handed.
Examples of more appropriate use of error dialog boxes come
from Microsoft
Excel (see Fig. 5.11A) and Adobe InDesign (see Fig. 5.11B). In
both cases, loss of data
is at stake.
Method 2: Use sound (e.g., beep)
When a computer beeps, that tells its user something has
happened that requires
attention. The person’s eyes reflexively begin scanning the
screen for whatever caused
the beep. This can allow the user to notice an error message that
is someplace other
than where the user was just looking, such as in a standard error
message box on the
display. That is the value of beeping.
However, imagine many people in a cubicle work environment or a classroom, all using an application that signals all errors and warnings by beeping. Such a workplace would be very annoying, to say the least. Worse, people wouldn’t be able to tell whether their own computer or someone else’s was beeping.
FIGURE 5.10
REI’s pop-up dialog box signals required data that was omitted.
It is hard to miss, but perhaps
overkill.
The opposite situation is noisy work environments (e.g.,
factories or computer
server rooms), where auditory signals emitted by an application
might be masked by
ambient noise. Even in non-noisy environments, some computer
users simply prefer
quiet, and mute the sound on their computers or turn it way
down.
For these reasons, signaling errors and other conditions with sound is a remedy that can be used only in very special, controlled situations.
Computer games often use sound to signal events and conditions. In games, sound isn’t annoying; it is expected. Its use in games is widespread, even in game arcades, where dozens of machines are all banging, roaring, buzzing, clanging, beeping, and playing music at once. (Well, it is annoying to parents who have to go into the arcades and endure all the screeching and booming to retrieve their kids, but the games aren’t designed for parents.)
Method 3: Wiggle or blink briefly
As described earlier in this chapter, our peripheral vision is
good at detecting motion,
and motion in the periphery causes reflexive eye movements
that bring the motion
into the fovea. User-interface designers can make use of this by
wiggling or flashing
messages briefly when they want to ensure that users see them.
It doesn’t take much motion to trigger an eye movement in that direction; just a tiny bit is enough to make a viewer’s eyes zip over. Millions of years of evolution have had quite an effect.
As an example of using motion to attract users’ attention, Apple’s iCloud online service briefly shakes the entire dialog box horizontally when a user enters an invalid username or password (see Fig. 5.12). In addition to clearly indicating “No” (like a person shaking his head), this attracts the user’s eyeballs, guaranteed.
(Because, after all, the motion in the corner of your eye might
be a leopard.)
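A brief shake of this kind is easy to sketch with the standard Web Animations API; the offsets are arbitrary, and the 300-millisecond duration is an assumption chosen to respect the quarter- to half-second limit discussed later in this section:

// Sketch: briefly shake an element horizontally to draw peripheral
// vision toward it, as iCloud's login dialog does.
function shake(element: HTMLElement): void {
  element.animate(
    [
      { transform: "translateX(0)" },
      { transform: "translateX(-8px)" },
      { transform: "translateX(8px)" },
      { transform: "translateX(-4px)" },
      { transform: "translateX(0)" },
    ],
    { duration: 300 } // brief: well under the half-second annoyance threshold
  );
}

const loginBox = document.querySelector<HTMLElement>("#login-dialog"); // hypothetical id
if (loginBox) shake(loginBox); // e.g., after a failed password check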
The most common use of blinking in computer user interfaces (other than advertisements) is in menu bars. When an action (e.g., Edit or Copy) is selected from a
FIGURE 5.11
Appropriate pop-up error dialogs: (A) Microsoft Excel and (B)
Adobe InDesign.
menu, it usually blinks once before the menu closes to confirm
that the system “got”
the command—that is, that the user didn’t miss the menu item.
This use of blinking
is very common. It is so quick that most computer users aren’t
even aware of it, but
if menu items didn’t blink once, we would have less confidence
that we actually
selected them.
Motion and blinking, like pop-up dialog boxes and beeping,
must be used spar-
ingly. Most experienced computer users consider wiggling,
blinking objects on
screen to be annoying. Most of us have learned to ignore
displays that blink because
many such displays are advertisements. Conversely, a few
computer users have atten-
tional impairments that make it difficult for them to ignore
something that is blink-
ing or wiggling.
Therefore, if wiggling or blinking is used, it should be brief—it
should last about
a quarter- to a half-second, no longer. Otherwise, it quickly
goes from an uncon-
scious attention-grabber to a conscious annoyance.
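Here is a sketch of such a brief wiggle in TypeScript, using the browser's Web Animations API. The 300-millisecond duration keeps the motion within the quarter- to half-second guideline, and checking the "prefers-reduced-motion" setting accommodates users who find motion disruptive; both specifics are illustrative choices rather than requirements.

function shakeBriefly(el: HTMLElement): void {
  // Users with attentional or vestibular issues can opt out of motion.
  if (window.matchMedia("(prefers-reduced-motion: reduce)").matches) return;
  el.animate(
    [
      { transform: "translateX(0)" },
      { transform: "translateX(-8px)" },
      { transform: "translateX(8px)" },
      { transform: "translateX(-4px)" },
      { transform: "translateX(0)" },
    ],
    { duration: 300 } // brief: an unconscious attention-grabber, not an annoyance
  );
}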
Use heavy-artillery methods sparingly to avoid habituating your
users
There is one final reason to use the preceding heavy-artillery
methods sparingly (i.e.,
only for critical messages): to avoid habituating your users.
When pop-ups, sound,
motion, and blinking are used too often to attract users’
attention, a psychological
phenomenon called habituation sets in (see Chapter 1). Our
brain pays less and less
attention to any stimulus that occurs frequently.
It is like the old fable of the boy who cried “Wolf!” too often:
eventually, the vil-
lagers learned to ignore his cries, so when a wolf actually did
come, his cries went
unheeded. Overuse of strong attention-getting methods can
cause important mes-
sages to be blocked by habituation.
FIGURE 5.12
Apple’s iCloud shakes the dialog box briefly on login errors to
attract a user’s fovea toward it.
VISUAL SEARCH IS LINEAR UNLESS TARGETS
“POP” IN THE PERIPHERY
As explained earlier, one function of peripheral vision is to
drive our eyes to focus the
fovea on important things—things we are seeking or that might
be a threat. Objects
moving in our peripheral vision fairly reliably “yank” our eyes
in that direction.
When we are looking for an object, our entire visual system,
including the periph-
ery, primes itself to detect that object. In fact, the periphery is a
crucial component in
visual search, despite its low spatial and color resolution.
However, just how helpful
the periphery is in aiding visual search depends strongly on
what we are looking for.
Look quickly at Figure 5.13 and find the Z.
To find the Z, you had to scan carefully through the characters
until your fovea
landed on it. In the lingo of vision researchers, the time to find
the Z is linear: it
depends approximately linearly on the number of distracting
characters and the
position of the Z among them.
Now look quickly at Figure 5.14 and find the bold character.
That was much easier (i.e., faster), wasn’t it? You didn’t have
to scan your fovea
carefully through the distracting characters. Your periphery
quickly detected the
boldness and determined its location, and because that is what
you were seeking,
FIGURE 5.13
Finding the Z requires scanning carefully through the
characters.
FIGURE 5.14
Finding the bold letter does not require scanning through
everything.
your visual system moved your fovea there. Your periphery
could not determine
exactly what was bold—that is beyond its resolution and
abilities—but it did locate the
boldness. In vision-researcher lingo, the periphery was primed
to look for boldness in
parallel over its entire area, and boldness is a distinctive feature
of the target, so search-
ing for a bold target is nonlinear. In designer lingo, we simply
say that boldness “pops
out” (“pops” for short) in the periphery, assuming that only the
target is bold.
Color “pops” even more strongly. Compare counting the L’s in
Figure 5.15 with
counting the blue characters in Figure 5.16.
What else makes things “pop” in the periphery? As described
earlier, the periph-
ery easily detects motion, so motion “pops.” Generalizing from
boldness, we also
can say that font weight “pops,” because if all but one of the
characters on a display
were bold, the nonbold character would stand out. Basically, a
visual target will pop
out in your periphery if it differs from surrounding objects in
features the periphery
can detect. The more distinctive features a target has, the more strongly it “pops,” assuming the periphery can detect those features.
Using peripheral “pop” in design
Designers use peripheral “pop” to focus the attention of a
product’s users, as well as
to allow users to find information faster. Chapter 3 described
how visual hierarchy—
titles, headings, boldness, bullets, and indenting—can make it
easier for users to spot
FIGURE 5.15
Counting L’s is hard; character shape doesn’t “pop” among
characters.
FIGURE 5.16
Counting blue characters is easy because color “pops.”
and extract from text the information they need. Glance back at
Figure 3.11 in Chap-
ter 3 and see how the headings and bullets make the topics and
subtopics “pop” so
readers can go right to them.
Many interactive systems use color to indicate status, usually
reserving red for
problems. Online maps and some vehicle GPS devices mark
traffic jams with red so
they stand out (see Fig. 5.17). Systems for controlling air traffic
mark potential colli-
sions in red (see Fig. 5.18). Applications for monitoring servers
and networks use
color to show the health status of assets or groups of them (see
Fig. 5.19).
These are all uses of peripheral “pop” to make important
information stand out
and visual search nonlinear.
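A small TypeScript sketch of this convention follows; the status names and color values are illustrative. Because color vision is limited (see Chapter 4), the sketch also sets a text fallback rather than relying on color alone.

type Status = "ok" | "warning" | "critical";

const statusColor: Record<Status, string> = {
  ok: "#2e7d32",       // green: no action needed
  warning: "#f9a825",  // amber: attention soon
  critical: "#c62828", // red: reserved for real problems, so it stays distinctive
};

function renderStatus(el: HTMLElement, status: Status): void {
  el.style.backgroundColor = statusColor[status];
  el.title = status; // don't rely on color alone
}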
When there are many possible targets
Sometimes in displays of many items, any of them could be
what the user wants.
Examples include command menus (see Fig. 5.20A) and object palettes (see Fig. 5.20B). Let’s assume that the application cannot anticipate which item or items a user is likely to want, and so cannot highlight them in advance. That is a fair assumption for today’s applications.4 Are
users doomed to have to search linearly through such displays
for the item they want?
That depends. Designers can try to make each item so
distinctive that when a
specific one is the user’s target, the user’s peripheral vision will
be able to spot it among
4 But in the not-too-distant future it might not be.
FIGURE 5.17
Google Maps uses color to show traffic conditions. Red
indicates traffic jams.
FIGURE 5.18
Air traffic control systems often use red to make potential
collisions stand out.
FIGURE 5.19
Paessler’s monitoring tool uses color to show the health of
network components.
all the other items. Designing distinctive sets of icons is hard—
especially when the
set is large—but it can be done (see Johnson et al., 1989).
Designing sets of icons that
are so distinctive that they can be distinguished in peripheral
vision is very hard, but
not impossible. For example, if a user goes to the Mac OS
application palette to open
his or her calendar, a white rectangular blob in the periphery
with something black
in the middle is more likely to attract the user’s eye than a blue
circular blob (see Fig.
5.20B). The trick is not to get too fancy and detailed with the
icons—give each one
a distinctive color and gross shape.
On the other hand, if the potential targets are all words, as in
command menus
(see Fig. 5.20A), visual distinctiveness is not an option. In textual
menus and lists,
visual search will be linear, at least at first. With practice, users
learn the positions of
frequently used items in menus, lists, and palettes, so searching
for particular items is
no longer linear.
That is why applications should never move items around in
menus, lists, or
palettes. Doing that prevents users from learning item positions,
thereby dooming
them to search linearly forever. Therefore, “dynamic menus” is
considered a major
user-interface design blooper (Johnson, 2007).
FIGURE 5.20
(A) Microsoft Word Tools menu, and (B) Mac OS application palette.
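The stable-ordering rule is easy to honor in code. In the minimal TypeScript sketch below (the menu items and the highlight threshold are hypothetical), usage data may drive what gets emphasized, but never the order in which items appear.

const MENU_ITEMS = ["Spelling", "Thesaurus", "Word Count", "Macros"] as const;

function renderMenu(
  usageCount: Map<string, number>
): { label: string; highlight: boolean }[] {
  // Anti-pattern (the "dynamic menus" blooper): sorting items by recent
  // use re-arranges the menu and prevents users from learning positions.
  return MENU_ITEMS.map((label) => ({
    label,
    // Frequency may drive emphasis, but never position:
    highlight: (usageCount.get(label) ?? 0) > 10,
  }));
}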
CHAPTER 6
Reading is Unnatural
Most people in industrialized nations grow up in households and
school districts that
promote education and reading. They learn to read as young
children and become good
readers by adolescence. As adults, most of our activities during
a normal day involve
reading. The process of reading—deciphering words into their
meaning—is for most
educated adults automatic, leaving our conscious minds free to
ponder the meaning
and implications of what we are reading. Because of this
background, it is common for
good readers to consider reading to be as “natural” a human
activity as speaking is.
WE’RE WIRED FOR LANGUAGE, BUT NOT FOR READING
Speaking and understanding spoken language is a natural human
ability, but reading is
not. Over hundreds of thousands—perhaps millions—of years,
the human brain
evolved the neural structures necessary to support spoken
language. As a result, normal
humans are born with an innate ability to learn as toddlers, with
no systematic training,
whatever language they are exposed to. After early childhood,
this ability decreases
significantly. By adolescence, learning a new language is the
same as learning any other
skill: it requires instruction and practice, and the learning and
processing are handled
by different brain areas from those that handled it in early
childhood (Sousa, 2005).
In contrast, writing and reading did not exist until a few
thousand years BCE and
did not become common until only four or five centuries ago—
long after the human
brain had evolved into its modern state. At no time during
childhood do our brains
show any special innate ability to learn to read. Instead, reading
is an artificial skill
that we learn by systematic instruction and practice, like
playing a violin, juggling,
or reading music (Sousa, 2005).
Many people never learn to read well, or at all
Because people are not innately “wired” to learn to read,
children who either lack
caregivers who read to them or who receive inadequate reading
instruction in
school may never learn to read. There are a great many such
people, especially in
the developing world. By comparison, very few people never
learn a spoken
language.
For a variety of reasons, some people who learn to read never
become good at it.
Perhaps their parents did not value and promote reading.
Perhaps they attended
substandard schools or didn’t attend school at all. Perhaps they
learned a second
language but never learned to read well in that language. People
who have cognitive
or perceptual impairments such as dyslexia may never read
easily.
A person’s ability to read is specific to a language and a script
(a system of writ-
ing). To see what text looks like to someone who cannot read,
just look at a para-
graph printed in a language and script that you do not know (see
Fig. 6.1).
Alternatively, you can approximate the feeling of illiteracy by
taking a page writ-
ten in a familiar script and language—such as a page of this
book—and turning it
upside down. Turn this book upside down and try reading the
next few paragraphs.
This exercise only approximates the feeling of illiteracy. You
will discover that the
inverted text appears foreign and illegible at first, but after a
minute you will be able
to read it, albeit slowly and laboriously.
Learning to read = training our visual system
Learning to read involves training our visual system to
recognize patterns—the
patterns exhibited by text. These patterns run a gamut from low
level to high
level:
• Lines, contours, and shapes are basic visual features that our
brain recognizes
innately. We don’t have to learn to recognize them.
• Basic features combine to form patterns that we learn to
identify as charac-
ters—letters, numeric digits, and other standard symbols. In
ideographic
scripts, such as Chinese, symbols represent entire words or
concepts.
FIGURE 6.1
To see how it feels to be illiterate, look at text printed in a
foreign script: (A) Amharic and
(B) Tibetan.
• In alphabetic scripts, patterns of characters form morphemes,
which we learn
to recognize as packets of meaning—for example, “farm,”
“tax,” “-ed,” and “-ing”
are morphemes in English.
• Morphemes combine to form patterns that we recognize as
words—for example,
“farm,” “tax,” “-ed,” and “-ing” can be combined to form the
words “farm,”
“farmed,” “farming,” “tax,” “taxed,” and “taxing.” Even
ideographic scripts
include symbols that serve as morphemes or modifiers of
meaning rather than as
words or concepts.
• Words combine to form patterns that we learn to recognize as
phrases, idiom-
atic expressions, and sentences.
• Sentences combine to form paragraphs.
Actually, only part of our visual system is trained to recognize
textual patterns
involved in reading: the fovea and a small area immediately
surrounding it (known as the
perifovea), and the downstream neural networks running
through the optic nerve to
the visual cortex and into various parts of our brain. The neural
networks starting else-
where in our retinas do not get trained to read. More about this
is explained later in the
chapter.
Learning to read also involves training the brain’s systems that
control eye move-
ment to move our eyes in a specific way over text. The main
direction of eye move-
ment depends on the direction in which the language we are
reading is written:
European-language scripts are read left to right, many Middle Eastern language
scripts are read right to left, and some language scripts are read
top to bottom.
Beyond that, the precise eye movements differ depending on
whether we are read-
ing, skimming for overall meaning, or scanning for specific
words.
How we read
Assuming our visual system and brain have successfully been
trained, reading
becomes semi-automatic or fully automatic—both the eye
movement and the
processing.
As explained earlier, the center of our visual field—the fovea
and perifovea—is
the only part of our visual field that is trained to read. All text
that we read enters our
visual system after being scanned by the central area, which
means that reading
requires a lot of eye movement.
As explained in Chapter 5 on the discussion of peripheral
vision, our eyes con-
stantly jump around, several times a second. Each of these
movements, called sac-
cades, lasts about 0.1 second. Saccades are ballistic, like firing
a shell from a cannon:
their endpoint is determined when they are triggered, and once
triggered, they
always execute to completion. As described in earlier chapters,
the destinations of
saccadic eye movements are programmed by the brain from a
combination of our
goals, events in the visual periphery, events detected and
localized by other percep-
tual senses, and past history including training.
When we read, we may feel that our eyes scan smoothly across
the lines of text,
but that feeling is incorrect. In reality, our eyes continue with
saccades during read-
ing, but the movements generally follow the line of text. They
fix our fovea on a
word, pause there for a fraction of a second to allow basic
patterns to be captured
and transmitted to the brain for further analysis, then jump to
the next important
word (Larson, 2004). Eye fixations while reading always land
on words, usually near
the center, never on word boundaries (see Fig. 6.2). Very
common small connector
and function words like “a,” “and,” “the,” “or,” “is,” and “but”
are usually skipped
over, their presence either detected in perifoveal vision or
simply assumed. Most of
the saccades during reading are in the text’s normal reading
direction, but a few—
about 10%—jump backwards to previous words. At the end of
each line of text, our
eyes jump to where our brain guesses the next line begins.1
How much can we take in during each eye fixation during
reading? For reading
European-language scripts at normal reading distances and text-
font sizes, the fovea
clearly sees 3–4 characters on either side of the fixation point.
The perifovea sees out
about 15–20 characters from the fixation point, but not very
clearly (see Fig. 6.3).
According to reading researcher Kevin Larson (2004), the
reading area in and around
the fovea consists of three distinct zones (for European-
language scripts):
Closest to the fixation point is where word recognition takes
place. This zone is usu-
ally large enough to capture the word being fixated, and often
includes smaller function
words directly to the right of the fixated word. The next zone
extends a few letters past
the word recognition zone, and readers gather preliminary
information about the next let-
ters in this zone. The final zone extends out to 15 letters past
the fixation point. Informa-
tion gathered out this far is used to identify the length of
upcoming words and to identify
the best location for the next fixation point.
1 Later we will see that centered text disrupts the brain’s guess
about where the next line starts.
FIGURE 6.2
Saccadic eye movements during reading jump between
important words.
FIGURE 6.3
Visibility of words in a line of text, with fovea fixed on the
word “years.”
Because our visual system has been trained to read, perception
around the fixation
point is asymmetrical: it is more sensitive to characters in the
reading direction than
in the other direction. For European-language scripts, this is
toward the right. That
makes sense because characters to the left of the fixation point
have usually already
been read.
IS READING FEATURE-DRIVEN OR CONTEXT-DRIVEN?
As explained earlier, reading involves recognizing features and
patterns. Pattern rec-
ognition, and therefore reading, can be either a bottom-up,
feature-driven process,
or a top-down, context-driven process.
In feature-driven reading, the visual system starts by identifying
simple features—
line segments in a certain orientation or curves of a certain
radius—on a page or display,
and then combines them into more complex features, such as
angles, multiple curves,
shapes, and patterns. Then the brain recognizes certain shapes
as characters or symbols
representing letters, numbers, or, for ideographic scripts, words.
In alphabetic scripts,
groups of letters are perceived as morphemes and words. In all
types of scripts, sequences
of words are parsed into phrases, sentences, and paragraphs that
have meaning.
Feature-driven reading is sometimes referred to as “bottom-up”
or “context-free.”
The brain’s ability to recognize basic features—lines, edges,
angles, etc.—is built in
and therefore automatic from birth. In contrast, recognition of
morphemes, words,
and phrases has to be learned. It starts out as a nonautomatic,
conscious process
requiring conscious analysis of letters, morphemes, and words,
but with enough
practice it becomes automatic (Sousa, 2005). Obviously, the
more common a mor-
pheme, word, or phrase, the more likely that recognition of it
will become auto-
matic. With ideographic scripts such as Chinese, which have
many times more
symbols than alphabetic scripts do, people typically take many
years longer to
become skilled readers.
Context-driven or top-down reading operates in parallel with
feature-driven read-
ing but it works the opposite way: from whole sentences or the
gist of a paragraph
down to the words and characters. The visual system starts by
recognizing high-
level patterns like words, phrases, and sentences, or by knowing
the text’s meaning
in advance. It then uses that knowledge to figure out—or
guess—what the compo-
nents of the high-level pattern must be (Boulton, 2009).
Context-driven reading is
less likely to become fully automatic because most phrase-level
and sentence-level
patterns and contexts don’t occur frequently enough to allow
their recognition to
become burned into neural firing patterns. But there are
exceptions, such as idiom-
atic expressions.
To experience context-driven reading, glance quickly at Figure
6.4, then immedi-
ately direct your eyes back here and finish reading this
paragraph. Try it now. What
did the text say?
Now look at the same sentence again more carefully. Do you
read it the same way
now?
Also, based on what we have already read and our knowledge of
the world, our
brains can sometimes predict text that the fovea has not yet read
(or its meaning),
allowing us to skip reading it. For example, if at the end of a
page we read “It was a
dark and stormy,” we would expect the first word on the next
page to be “night.” We
would be surprised if it was some other word (e.g., “cow”).
Feature-driven, bottom-up reading dominates; context assists
It has been known for decades that reading involves both
feature-driven (bottom-up)
processing and context-driven (top-down) processing. In
addition to being able to
figure out the meaning of a sentence by analyzing the letters
and words in it, people
can determine the words of a sentence by knowing the meaning
of the sentence, or
the letters in a word by knowing what word it is (see Fig. 6.5).
The question is: Is
skilled reading primarily bottom-up or top-down, or is neither
mode dominant?
Early scientific studies of reading—from the late 1800s through
about 1980—
seemed to show that people recognize words first and from that
determine what
letters are present. The theory of reading that emerged from
those findings was that
our visual system recognizes words primarily from their overall
shape. This theory
failed to account for certain experimental results and so was
controversial among
reading researchers, but it nonetheless gained wide acceptance
among nonresearch-
ers, especially in the graphic design field (Larson, 2004;
Herrmann, 2011).
(A) Mray had a ltilte lmab, its feclee was withe as sown. And ervey wehre taht Mray wnet, the lmab was srue to go.
(B) Twinkle twinkle little star how I wonder what you are
FIGURE 6.5
Top-down reading: most readers, especially those who know the
songs from which these text
passages are taken, can read these passages even though the
words (A) have all but their first
and last letters scrambled and (B) are mostly obscured.
The rain in Spain falls
manly in the the plain
FIGURE 6.4
Top-down recognition of the expression can inhibit seeing the
actual text.
Similarly, educational researchers in the 1970s applied
information theory to
reading, and assumed that because of redundancies in written
language, top-down,
context-driven reading would be faster than bottom-up, feature-
driven reading. This
assumption led them to hypothesize that reading for highly
skilled (fast) readers
would be dominated by context-driven (top-down) processing.
This theory was
probably responsible for many speed-reading methods of the
1970s and 1980s, which
supposedly trained people to read fast by taking in whole
phrases and sentences at
a time.
However, empirical studies of readers conducted since then
have demonstrated
conclusively that those early theories were false. Summing up
the research are state-
ments from reading researchers Kevin Larson (2004) and Keith
Stanovich (Boulton,
2009), respectively:
Word shape is no longer a viable model of word recognition.
The bulk of scientific evi-
dence says that we recognize a word’s component letters, then
use that visual informa-
tion to recognize a word.
Context [is] important, but it’s a more important aid for the
poorer reader who doesn’t
have automatic context-free recognition instantiated.
In other words, reading consists mainly of context-free, bottom-
up, feature-driven
processes. In skilled readers, these processes are well learned to
the point of being
automatic. Context-driven reading today is considered mainly a
backup method that,
although it operates in parallel with feature-based reading, is
only relevant when
feature-driven reading is difficult or insufficiently automatic.
Skilled readers may resort to context-based reading when
feature-based read-
ing is disrupted by poor presentation of information (see
examples later in this
chapter). Also, in the race between context-based and feature-
based reading to
decipher the text we see, contextual cues sometimes win out
over features. As an
example of context-based reading, Americans visiting England
sometimes mis-
read “to let” signs as “toilet,” because in the United States they
see the word “toi-
let” often, but they almost never see the phrase “to let”—
Americans use “for
rent” instead.
In less skilled readers, feature-based reading is not automatic; it
is conscious and
laborious. Therefore, more of their reading is context-based.
Their involuntary use
of context-based reading and nonautomatic feature-based
reading consumes short-
term cognitive capacity, leaving little for comprehension.2 They
have to focus on
2 Chapter 10 describes the differences between automatic and
controlled cognitive processing. Here, we will
simply say that controlled processes burden working memory,
while automatic processes do not.
deciphering the stream of words, leaving no capacity for
constructing the meaning
of sentences and paragraphs. That is why poor readers can read
a passage aloud but
afterward have no idea what they just read.
Why is context-free (bottom-up) reading not automatic in some
adults? Some peo-
ple didn’t get enough experience reading as young children for
the feature-driven
recognition processes to become automatic. As they grow up,
they find reading men-
tally laborious and taxing, so they avoid reading, which
perpetuates and compounds
their deficit (Boulton, 2009).
SKILLED AND UNSKILLED READING USE DIFFERENT
PARTS OF THE BRAIN
Before the 1980s, researchers who wanted to understand which
parts of the brain
are involved in language and reading were limited mainly to
studying people who
had suffered brain injuries. For example, in the mid-19th
century, doctors found that
people with brain damage near the left temple—an area now
called Broca’s area
after the doctor who discovered it—can understand speech but
have trouble speak-
ing, and that people with brain damage near the left ear—now
called Wernicke’s area—
cannot understand speech (Sousa, 2005) (see Fig. 6.6).
In recent decades, new methods of observing the operation of
functioning brains
in living people have been developed: electroencephalography
(EEG), functional
magnetic resonance imaging (fMRI), and functional magnetic
resonance spectros-
copy (fMRS). These methods allow researchers to watch the
response in different
areas of a person’s brain—including the sequence in which they
respond—as the
person perceives various stimuli or performs specific tasks
(Minnery and Fine,
2009).
Broca’s area
Wernicke’s area
FIGURE 6.6
The human brain, showing Broca’s area and Wernicke’s area.
Using these methods, researchers have discovered that the
neural pathways
involved in reading differ for novice versus skilled readers. Of
course, the first area
to respond during reading is the occipital (or visual) cortex at
the back of the brain.
That is the same regardless of a person’s reading skill. After
that, the pathways
diverge (Sousa, 2005):
• Novice. First an area of the brain just above and behind
Wernicke’s area
becomes active. Researchers have come to view this as the area
where, at least
with alphabetic scripts such as English and German, words are
“sounded out”
and assembled—that is, letters are analyzed and matched with
their correspond-
ing sounds. The word-analysis area then communicates with
Broca’s area and
the frontal lobe, where morphemes and words—units of
meaning—are recog-
nized and overall meaning is extracted. For ideographic
languages, where sym-
bols represent whole words and often have a graphical
correspondence to their
meaning, sounding out of words is not part of reading.
• Advanced. The word-analysis area is skipped. Instead the
occipitotemporal area
(behind the ear, not far from the visual cortex) becomes active.
The prevailing
view is that this area recognizes words as a whole without
sounding them out,
and then that activity activates pathways toward the front of the
brain that cor-
respond to the word’s meaning and mental image. Broca’s area
is only slightly
involved.
Findings from brain scan methods of course don’t indicate
exactly what pro-
cesses are being used, but they support the theory that advanced
readers use differ-
ent processes from those novice readers use.
POOR INFORMATION DESIGN CAN DISRUPT READING
Careless writing or presentation of text can reduce skilled
readers’ automatic, con-
text-free reading to conscious, context-based reading, burdening
working memory,
thereby decreasing speed and comprehension. In unskilled
readers, poor text pre-
sentation can block reading altogether.
Uncommon or unfamiliar vocabulary
One way software often disrupts reading is by using unfamiliar
vocabulary—words
the intended readers don’t know very well or at all.
One type of unfamiliar terminology is computer jargon,
sometimes known as
“geek speak.” For example, an intranet application displayed
the following error mes-
sage if a user tried to use the application after more than 15
minutes of letting it sit
idle:
Your session has expired. Please reauthenticate.
The application was for finding resources—rooms, equipment,
etc.—within
the company. Its users included receptionists, accountants, and
managers, as
well as engineers. Most nontechnical users would not
understand the word
“reauthenticate,” so they would drop out of automatic reading
mode into con-
scious wondering about the message’s meaning. To avoid
disrupting reading, the
application’s developers could have used the more familiar
instruction, “Login
again.” For a discussion of how “geek speak” in computer-based
systems affects
learning, see Chapter 11.
Reading can also be disrupted by uncommon terms even if they
are not computer
technology terms. Here are some rare English words, including
many that appear
mainly in contracts, privacy statements, or other legal
documents:
• Aforementioned: mentioned previously
• Bailiwick: the region in which a sheriff has legal powers; more generally: domain of control
• Disclaim: renounce any claim to or connection with; disown; repudiate
• Heretofore: up to the present time; before now
• Jurisprudence: the principles and theories on which a legal system is based
• Obfuscate: make something difficult to perceive or understand
• Penultimate: next to the last, as in “the next to the last chapter of a book”
When readers—even skilled ones—encounter such a word, their
automatic read-
ing processes probably won’t recognize it. Instead, their brain
uses less automatic
processes, such as sounding out the word’s parts and using them
to figure out its
meaning, figuring out the meaning from the context in which
the word appears, or
looking the word up in a dictionary.
Difficult scripts and typefaces
Even when the vocabulary is familiar, reading can be disrupted
by typefaces with
unfamiliar or hard-to-distinguish shapes. Context-free,
automatic reading is based
on recognizing letters and words bottom-up from their lower-
level visual features.
Our visual system is quite literally a neural network that must
be trained to recog-
nize certain combinations of shapes as characters. Therefore, a
typeface with dif-
ficult-to-recognize features and shapes will be hard to read. For
example, try to
read Abraham Lincoln’s Gettysburg Address in an outline
typeface in ALL CAPS
(see Fig. 6.7).
Comparison studies show that skilled readers read uppercase
text 10–15% more
slowly than lowercase text. Current-day researchers attribute
that difference mainly
to a lack of practice reading uppercase text, not to an inherent
lower recognizability
of uppercase text (Larson, 2004). Nonetheless, it is important
for designers to be
aware of the practice effect (Herrmann, 2011).
Tiny fonts
Another way to make text hard to read in software applications,
websites, and elec-
tronic appliances is to use fonts that are too small for their
intended readers’ visual
system to resolve. For example, try to read the first paragraph
of the U.S. Constitu-
tion in a seven-point font (see Fig. 6.8).
Developers sometimes use tiny fonts because they have a lot of
text to display in
a small amount of space. But if the intended users of the system
cannot read the text,
or can read it only laboriously, the text might as well not be
there.
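Designers who control the implementation can catch this problem mechanically. Below is a rough TypeScript sketch that flags on-screen text rendered smaller than a minimum size; the 11-pixel default is an assumption for illustration, since the right minimum depends on the intended users and their viewing conditions.

function flagTinyText(root: HTMLElement, minPx = 11): HTMLElement[] {
  const offenders: HTMLElement[] = [];
  for (const el of Array.from(root.querySelectorAll<HTMLElement>("*"))) {
    const size = parseFloat(getComputedStyle(el).fontSize);
    // Only flag elements that actually contain text.
    if (el.textContent?.trim() && size < minPx) offenders.push(el);
  }
  return offenders;
}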
Text on noisy background
Visual noise in and around text can disrupt recognition of
features, characters, and
words, and therefore drop reading out of automatic feature-
based mode into a
more conscious and context-based mode. In software user
interfaces and websites,
visual noise often results from designers’ placing text over a
patterned background
or displaying text in colors that contrast poorly with the
background, as an exam-
ple from Arvanitakis.com shows (see Fig. 6.9).
FIGURE 6.7
Text in ALL CAPS is harder to read because we are not
practiced at doing it. Outline typefaces
complicate feature recognition. This example demonstrates
both.
FIGURE 6.8
The opening paragraph of the U.S. Constitution, presented in a
seven-point font.
There are situations in which designers intend to make text hard
to read. For
example, a common security measure on the Web is to ask site
users to identify dis-
torted words, as proof that they are live human beings and not Internet “’bots.”
This relies on the fact that most people can read text that
Internet ’bots cannot cur-
rently read. Text displayed as a challenge to test a registrant’s
humanity is called a
captcha3 (see Fig. 6.10).
Of course, most text displayed in a user interface should be easy
to read. A
patterned background need not be especially strong to disrupt
people’s ability to
read text placed over it. For example, the Federal Reserve
Bank’s collection of
websites formerly provided a mortgage calculator that was
decorated with a
repeating pastel background with a home and neighborhood
theme. Although
well-intentioned, the decorated background made the calculator
hard to read
(see Fig. 6.11).
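Poor text/background contrast can be quantified. The sketch below implements the WCAG 2.x relative-luminance and contrast-ratio formulas in TypeScript; WCAG recommends a ratio of at least 4.5:1 for normal body text.

function relativeLuminance([r, g, b]: [number, number, number]): number {
  const lin = (v: number) => {
    const c = v / 255; // convert a 0–255 sRGB channel to linear light
    return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
  };
  return 0.2126 * lin(r) + 0.7152 * lin(g) + 0.0722 * lin(b);
}

function contrastRatio(
  fg: [number, number, number],
  bg: [number, number, number]
): number {
  const [hi, lo] = [relativeLuminance(fg), relativeLuminance(bg)].sort(
    (a, b) => b - a
  );
  return (hi + 0.05) / (lo + 0.05);
}

// Example: dark gray text on white:
//   contrastRatio([68, 68, 68], [255, 255, 255])  // ≈ 9.7:1, comfortably readable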
Information buried in repetition
Visual noise can also come from the text itself. If successive
lines of text contain
a lot of repetition, readers receive poor feedback about what
line they are focused
on, and it is hard to pick out the important information. Recall, for example, the California Department of Motor Vehicles web site in Chapter 3 (see Fig. 3.2).
3 The term originally comes from the word “capture,” but it is
also said to be an acronym for “Completely
Automated Public Turing test to tell Computers and Humans
Apart.”
FIGURE 6.10
Text that is intentionally displayed with noise so that Web-
crawling software cannot read it is
called a captcha.
FIGURE 6.9
Arvanitakis.com uses text on a noisy background and poor color
contrast.
Another example of repetition that creates noise is the computer
store on
Apple.com. The pages for ordering a laptop computer list
different keyboard
options for a computer in a very repetitive way, making it hard
to see that the
essential difference between the keyboards is the language that
they support
(see Fig. 6.12).
Centered text
One aspect of reading that is highly automatic in most skilled
readers is eye move-
ment. In automatic (fast) reading, our eyes are trained to go
back to the same hori-
zontal position and down one line. If text is centered or right-
aligned, each line of
text starts in a different horizontal position. Automatic eye
movements, therefore,
take our eyes back to the wrong place, so we must consciously
adjust our gaze to the
actual start of each line. This drops us out of automatic mode
and slows us down
greatly. With poetry and wedding invitations, that is probably
okay, but with any
FIGURE 6.11
The Federal Reserve Bank’s online mortgage calculator
displayed text on a patterned
background.
FIGURE 6.12
Apple.com’s “Buy Computer” page lists options in which the
important information (keyboard
language compatibility) is buried in repetition.
http://Apple.com
other type of text, it is a disadvantage. An example of centered
prose text is provided
by the web site of FargoHomes, a real estate company (see Fig.
6.13). Try reading the
text quickly to demonstrate to yourself how your eyes move.
The same site also centers numbered lists, really messing up
readers’ automatic
eye movement (see Fig. 6.14). Try scanning the list quickly.
FIGURE 6.13
FargoHomes.com centers text, thwarting automatic eye
movement patterns.
FIGURE 6.14
FargoHomes.com centers numbered items, really thwarting
automatic eye movement patterns.
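Centered body text is also easy to detect mechanically. Below is a rough TypeScript sketch that flags paragraphs taller than about one line that are set centered or right-aligned; the height heuristic is an illustrative assumption.

function flagCenteredProse(root: HTMLElement): HTMLElement[] {
  const offenders: HTMLElement[] = [];
  for (const p of Array.from(root.querySelectorAll<HTMLElement>("p, li"))) {
    const style = getComputedStyle(p);
    // "normal" line-height parses as NaN, so fall back to ~1.2 × font size.
    const lineH =
      parseFloat(style.lineHeight) || parseFloat(style.fontSize) * 1.2;
    const multiLine = p.getBoundingClientRect().height > 1.5 * lineH;
    if (multiLine && (style.textAlign === "center" || style.textAlign === "right"))
      offenders.push(p);
  }
  return offenders;
}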
Design implications: Don’t disrupt reading; support it!
Obviously, a designer’s goal should be to support reading, not
disrupt it. Skilled (fast)
reading is mostly automatic and mostly based on feature,
character, and word recog-
nition. The easier the recognition, the easier and faster the
reading. Less skilled read-
ing, by contrast, is greatly assisted by contextual cues.
Designers of interactive systems can support both reading
methods by following
these guidelines:
1) Ensure that text in user interfaces allows the feature-based
automatic
processes to function effectively by avoiding the disruptive
flaws
described earlier: difficult or tiny fonts, patterned backgrounds,
centering,
etc.
2) Use restricted, highly consistent vocabularies—sometimes
referred to in the
industry as plain language4 or simplified language (Redish,
2007).
3) Format text to create a visual hierarchy (see Chapter 3) to
facilitate easy scan-
ning: use headings, bulleted lists, tables, and visually
emphasized words (see
Fig. 6.15).
Experienced information architects, content editors, and graphic
designers can
be very useful in ensuring that text is presented to support easy
scanning and
reading.
4 For more information on plain language, see the U.S.
government website, www.plainlanguage.gov.
FIGURE 6.15
Microsoft Word’s “Help” homepage is easy to scan and read.
MUCH OF THE READING REQUIRED BY SOFTWARE
IS UNNECESSARY
In addition to committing design mistakes that disrupt reading,
many software user
interfaces simply present too much text, requiring users to read
more than is neces-
sary. Consider how much unnecessary text there is in a dialog
box for setting text
entry properties in the SmartDraw application (see Fig. 6.16).
Software designers often justify lengthy instructions by arguing:
“We need all
that text to explain clearly to users what to do.” However,
instructions can often be
shortened with no loss of clarity. Let’s examine how the Jeep
company, between
2002 and 2007, shortened its instructions for finding a local
Jeep dealer (see Fig.
6.17):
1) 2002: The “Find a Dealer” page displayed a large paragraph
of prose text, with
numbered instructions buried in it, and a form asking for more
information than
needed to find a dealer near the user.
FIGURE 6.16
SmartDraw’s “Text Entry Properties” dialog box displays too
much text for its simple functionality.
FIGURE 6.17
Between 2002 and 2007, Jeep.com drastically reduced the
reading required by “Find a Dealer.”
2) 2003: The instructions on the “Find a Dealer” page had
been boiled down to
three bullet points, and the form required less information.
3) 2007: “Find a Dealer” had been cut to one field (zip code)
and a “Go” button on
the homepage.
Even when text describes products rather than explaining
instructions, it is
counterproductive to put all a vendor wants to say about a
product into a lengthy
prose description that people have to read from start to end.
Most potential custom-
ers cannot or will not read it. Compare Costco.com’s
descriptions of laptop comput-
ers in 2007 with those in 2009 (see Fig. 6.18).
Design implications: Minimize the need for reading
Too much text in a user interface loses poor readers, who
unfortunately are a signifi-
cant percentage of the population. Too much text even alienates
good readers: it
turns using an interactive system into an intimidating amount of
work.
FIGURE 6.18
Between 2007 and 2009, Costco.com drastically reduced the
text in product descriptions.
Minimize the amount of prose text in a user interface; don’t
present users with
long blocks of prose text to read. In instructions, use the least
amount of text that
gets most users to their intended goals. In a product description,
provide a brief
overview of the product and let users request more detail if they
want it. Technical
writers and content editors can assist greatly in doing this. For
additional advice on
how to eliminate unnecessary text, see Krug (2005) and Redish
(2007).
TEST ON REAL USERS
Finally, designers should test their designs on the intended user
population to be
confident that users can read all essential text quickly and
effortlessly. Some testing
can be done early, using prototypes and partial
implementations, but it should also
be done just before release. Fortunately, last-minute changes to
text font sizes and
formats are usually easy to make.
CHAPTER 7
Our Attention is Limited; Our Memory is Imperfect
Just as the human visual system has strengths and weaknesses,
so do human attention
and memory. This chapter describes some of those strengths and
weaknesses as back-
ground for understanding how we can design interactive systems
to support and aug-
ment attention and memory rather than burdening or confusing
them. We will start
with an overview of how memory works, and how it is related to
attention.
SHORT- VERSUS LONG-TERM MEMORY
Psychologists historically have distinguished short-term
memory from long-term
memory. Short-term memory covers situations in which
information is retained for
intervals ranging from a fraction of a second to a few minutes.
Long-term memory
covers situations in which information is retained over longer
periods (e.g., hours,
days, years, even lifetimes).
It is tempting to think of short- and long-term memory as
separate memory stores.
Indeed, some theories of memory have considered them
separate. After all, in a digi-
tal computer, the short-term memory stores (central processing
unit [CPU] data reg-
isters) are separate from the long-term memory stores (random
access memory
[RAM], hard disk, flash memory, CD-ROM, etc.). More direct
evidence comes from
findings that damage to certain parts of the human brain results
in short-term mem-
ory deficits but not long-term ones, or vice versa. Finally, the
speed with which infor-
mation or plans can disappear from our immediate awareness
contrasts sharply with
the seeming permanence of our memory of important events in
our lives, faces of
significant people, activities we have practiced, and information
we have studied.
These phenomena led many researchers to theorize that short-
term memory is a
separate store in the brain where information is held temporarily
after entering
through our perceptual senses (e.g., visual or auditory), or after
being retrieved from
long-term memory (see Fig. 7.1).
A MODERN VIEW OF MEMORY
Recent research on memory and brain function indicates that
short- and long-term
memory are functions of a single memory system—one that is
more closely linked
with perception than previously thought (Jonides et al., 2008).
Long-term memory
Perceptions enter through the visual, auditory, olfactory,
gustatory, or tactile sensory
systems and trigger responses starting in areas of the brain
dedicated to each sense
(e.g., visual cortex, auditory cortex), then spread into other
areas of the brain that are
not specific to any particular sensory modality. The sensory
modality–specific areas
of the brain detect only simple features of the data, such as a
dark–light edge, diago-
nal line, high-pitched tone, sour taste, red color, or rightward
motion. Downstream
areas of the brain combine low-level features to detect higher-
level features of the
input, such as animal, the word “duck,” Uncle Kevin, minor
key, threat, or fairness.
As described in Chapter 1, the set of neurons activated by a
perceived stimulus
depends on both the features and context of the stimulus. The
context is as impor-
tant as the features of the stimulus in determining what neural
patterns are acti-
vated. For example, a dog barking near you when you are
walking in your
neighborhood activates a different pattern of neural activity in
your brain than the
same sound heard when you are safely inside your car. The
more similar two percep-
tual stimuli are—that is, the more features and contextual
elements they share—the
more overlap there is between the sets of neurons that fire in
response to them.
The initial strength of a perception depends on how much it is
amplified or damp-
ened by other brain activity. All perceptions create some kind of
trace, but some are
so weak that they can be considered as not registered: the
pattern was activated
once but never again.
Memory formation consists of changes in the neurons involved
in a neural activ-
ity pattern, which make the pattern easier to reactivate in the
future.1 Some such
changes result from chemicals released near neural endings that
boost or inhibit
their sensitivity to stimulation. These changes last only until the
chemicals dissipate
1 There is evidence that the long-term neural changes associated
with learning occur mainly during sleep, sug-
gesting that separating learning sessions by periods of sleep
may facilitate learning (Stafford and Webb, 2005).
FIGURE 7.1
Traditional (antiquated) view of short-term versus long-term
memory.
or are neutralized by other chemicals. More permanent changes
occur when neu-
rons grow and branch, forming new connections with others.
Activating a memory consists of reactivating the same pattern
of neural activity
that occurred when the memory was formed. Somehow the brain
distinguishes ini-
tial activations of neural patterns from reactivations—perhaps
by measuring the rela-
tive ease with which the pattern was reactivated. New
perceptions very similar to
the original ones reactivate the same patterns of neurons,
resulting in recognition if
the reactivated perception reaches awareness. In the absence of
a similar percep-
tion, stimulation from activity in other parts of the brain can
also reactivate a pattern
of neural activity, which if it reaches awareness results in
recall.
The more often a neural memory pattern is reactivated, the
stronger it becomes—
that is, the easier it is to reactivate—which in turn means that
the perception it cor-
responds to is easier to recognize and recall. Neural memory
patterns can also be
strengthened or weakened by excitatory or inhibitory signals
from other parts of the
brain.
A particular memory is not located in any specific spot in the
brain. The neural
activity pattern comprising a memory involves a network of
millions of neurons
extending over a wide area. Activity patterns for different
memories overlap,
depending on which features they share. Removing, damaging,
or inhibiting neu-
rons in a particular part of the brain typically does not
completely wipe out mem-
ories that involve those neurons, but rather just reduces the
detail or accuracy of
the memory by deleting features.2 However, some areas in a
neural activity pat-
tern may be critical pathways, so that removing, damaging, or
inhibiting them
may prevent most of the pattern from activating, thereby
effectively eliminating
the corresponding memory.
For example, researchers have long known that the
hippocampus, twin seahorse-
shaped neural clusters near the base of the brain, plays an
important role in storing
long-term memories. The modern view is that the hippocampus
is a controlling
mechanism that directs neural rewiring so as to “burn”
memories into the brain’s
wiring. The amygdala, two jellybean-shaped clusters on the
frontal tips of the hip-
pocampus, has a similar role, but it specializes in storing
memories of emotionally
intense, threatening situations (Eagleman, 2012).
Cognitive psychologists view human long-term memory as
consisting of several
distinct functions:
• Semantic long-term memory stores facts and relationships.
• Episodic long-term memory records past events.
• Procedural long-term memory remembers action sequences.
These distinctions, while important and interesting, are beyond
the scope of this
book.
2 This is similar to the effect of cutting pieces out of a
holographic image: it reduces the overall resolution of
the image, rather than removing areas of it, as with an ordinary
photograph.
Short-term memory
The processes just discussed are about long-term memory. What
about short-term
memory? What psychologists call short-term memory is actually
a combination of
phenomena involving perception, attention, and retrieval from
long-term memory.
One component of short-term memory is perceptual. Each of our
perceptual
senses has its own very brief short-term “memory” that is the
result of residual neural
activity after a perceptual stimulus ceases, like a bell that rings
briefly after it is
struck. Until they fade away, these residual perceptions are
available as possible
input to our brain’s attention and memory-storage mechanisms,
which integrate
input from our various perceptual systems, focus our awareness
on some of that
input, and store some of it in long-term memory. These sensory-
specific residual
perceptions together comprise a minor component of short-term
memory. Here, we
are only interested in them as potential inputs to working
memory.
Also available as potential input to working memory are long-
term memories
reactivated through recognition or recall. As explained earlier,
each long-term mem-
ory corresponds to a specific pattern of neural activity
distributed across our brain.
While activated, a memory pattern is a candidate for our
attention and therefore
potential input for working memory.
The human brain has multiple attention mechanisms, some
voluntary and some
involuntary. They focus our awareness on a very small subset of
the perceptions and
activated long-term memories while ignoring everything else.
That tiny subset of all
the available information from our perceptual systems and our
long-term memories
that we are aware of right now is the main component of our
short-term memory,
the part that cognitive scientists often call working memory. It
integrates informa-
tion from all of our sensory modalities and our long-term
memory. Henceforth, we
will restrict our discussion of short-term memory to working
memory.
So what is working memory? First, here is what it is not: it is
not a store—it is not
a place in the brain where memories and perceptions go to be
worked on. And it is
nothing like accumulators or fast random-access memory in
digital computers.
Instead, working memory is our combined focus of attention:
everything that we
are conscious of at a given time. More precisely, it is a few
perceptions and long-term
memories that are activated enough that we remain aware of
them over a short
period. Psychologists also view working memory as including
an executive func-
tion—based mainly in the frontal cerebral cortex—that
manipulates items we are
attending to and, if needed, refreshes their activation so they
remain in our aware-
ness (Baddeley, 2012).
A useful—if oversimplified—analogy for memory is a huge,
dark, musty ware-
house. The warehouse is full of long-term memories, piled
haphazardly (not stacked
neatly), intermingled and tangled, and mostly covered with dust
and cobwebs. Doors
along the walls represent our perceptual senses: sight, hearing,
smell, taste, touch.
They open briefly to let perceptions in. As perceptions enter,
they are briefly illumi-
nated by light coming in from outside, but they quickly are
pushed (by more enter-
ing perceptions) into the dark tangled piles of old memories.
In the ceiling of the warehouse are a small fixed number of
searchlights, con-
trolled by the attention mechanism’s executive function
(Baddeley, 2012). They
swing around and focus on items in the memory piles,
illuminating them for a while
until they swing away to focus elsewhere. Sometimes one or
two searchlights focus
on new items after they enter through the doors. When a
searchlight moves to focus
on something new, whatever it had been focusing on is plunged
into darkness.
The small fixed number of searchlights represents the limited
capacity of work-
ing memory. What is illuminated by them (and briefly through
the open doors) rep-
resents the contents of working memory: out of the vast
warehouse’s entire contents,
the few items we are attending to at any moment. See Figure 7.2
for a visual.
The warehouse analogy is too simple and should not be taken
too seriously. As
Chapter 1 explained, our senses are not just passive doorways
into our brains,
through which our environment “pushes” perceptions. Rather,
our brain actively
and continually seeks out important events and features in our
environment and
“pulls” perceptions in as needed (Ware, 2008). Furthermore, the
brain is buzzing
with activity most of the time and its internal activity is only
modulated—not deter-
mined—by sensory input (Eagleman, 2012). Also, as described
earlier, memories are
embodied as networks of neurons distributed around the brain,
not as objects in a
specific location. Finally, activating a memory in the brain can
activate related ones;
our warehouse-with-searchlights analogy doesn’t represent that.
Nonetheless, the analogy—especially the part about the
searchlights—illustrates
that working memory is a combination of several foci of
attention—the currently
FIGURE 7.2
Modern view of memory: a dark warehouse full of stuff (long-
term memory), with searchlights
focused on a few items (short-term memory).
activated neural patterns of which we are aware—and that the
capacity of working
memory is extremely limited, and the content at any given
moment is very volatile.
What about the earlier finding that damage to some parts of the
brain causes
short-term memory deficits, while other types of brain damage
cause long-term
memory deficits? The current interpretation is that some types
of damage decrease
or eliminate the brain’s ability to focus attention on specific
objects and events,
while other types of damage harm the brain’s ability to store or
retrieve long-term
memories.
CHARACTERISTICS OF ATTENTION AND WORKING MEMORY
As noted, working memory is equal to the focus of our
attention. Whatever is in that
focus is what we are conscious of at any moment. But what
determines what we
attend to and how much we can attend to at any given time?
Attention is highly focused and selective
Most of what is going on around you at this moment you are unaware of. Your perceptual system and brain sample very selectively from your surroundings, because they don’t have the capacity to process everything.
Right now you are conscious of the last few words and ideas you’ve read, but probably not the color of the wall in front of you. But now that I’ve shifted your attention, you are conscious of the wall’s color, and may have forgotten some of the ideas you read on the previous page.
Chapter 1 described how our perception is filtered and biased by our goals. For example, if you are looking for your friend in a crowded shopping mall, your visual system “primes” itself to notice people who look like your friend (including how he or she is dressed), and barely notice everything else. Simultaneously, your auditory system primes itself to notice voices that sound like your friend’s voice, and even footsteps that sound like those of your friend. Human-shaped blobs in your peripheral vision and sounds localized by your auditory system that match your friend snap your eyes and head toward them. While you look, anyone looking or sounding similar to your friend attracts your attention, and you won’t notice other people or events that would normally have interested you.
Besides focusing on objects and events related to our current
goals, our attention
is drawn to:
• Movement, especially movement near or toward us. For example, something jumps at you while you walk on a street, or something swings toward your head in a haunted house ride at an amusement park, or a car in an adjacent lane suddenly swerves toward your lane (see the discussion of the flinch reflex in Chapter 14).
• Threats. Anything that signals or portends danger to us or people in our care.
• Faces of other people. We are primed from birth to notice faces more than other objects in our environment.
• Sex and food. Even if we are happily married and well fed, these things attract our attention. Even the mere words probably quickly got your attention.
These things, along with our current goals, draw our attention involuntarily. We don’t become aware of something in our environment and then orient ourselves toward it. It’s the other way around: our perceptual system detects something attention-worthy and orients us toward it preconsciously, and only afterwards do we become aware of it.3
Capacity of attention (a.k.a. working memory)
The primary characteristics of working memory are its low
capacity and volatility.
But what is the capacity? In terms of the warehouse analogy
presented earlier, what
is the small fixed number of searchlights?
Many college-educated people have read about “the magical
number seven, plus or
minus two,” proposed by cognitive psychologist George Miller
in 1956 as the limit on
the number of simultaneous unrelated items in human working
memory (Miller, 1956).
Miller’s characterization of the working memory limit naturally
raises several
questions:
• What are the items in working memory? They are current perceptions and retrieved memories. They are goals, numbers, words, names, sounds, images, odors—anything one can be aware of. In the brain, they are patterns of neural activity.
• Why must items be unrelated? Because if two items are related, they correspond to one big neural activity pattern—one set of features—and hence one item, not two.
• Why the fudge-factor of plus or minus two? Because researchers cannot measure with perfect accuracy how much people can keep track of, and because of differences between individuals in working memory capacity.
Later research in the 1960s and 1970s found Miller’s estimate to be too high. In the experiments Miller considered, some of the items presented to people to remember could be “chunked” (i.e., considered related), making it appear that people’s working memory was holding more items than it actually was. Furthermore, all the subjects in Miller’s experiments were college students. Working memory capacity varies in the general population. When the experiments were revised to disallow unintended chunking and include noncollege students as subjects, the average capacity of working memory was shown to be more like four plus or minus one—that is, three to five items (Broadbent, 1975; Mastin, 2010). Thus, in our warehouse analogy, there would be only four searchlights.
3 Exactly how long afterwards is discussed in Chapter 14.
More recent research has cast doubt on the idea that the
capacity of working
memory should be measured in whole items or “chunks.” It
turns out that in early
working memory experiments, people were asked to briefly
remember items (e.g.,
words or images) that were quite different from each other—
that is, they had very
few features in common. In such a situation, people don’t have
to remember every
feature of an item to recall it a few seconds later; remembering
some of its features
is enough. So people appeared to recall items as a whole, and
therefore working
memory capacity seemed measurable in whole items.
Recent experiments have given people items to remember that are similar—that is, they share many features. In that situation, to recall an item and not confuse it with other items, people must remember more of its features. In these experiments, researchers found that people remember more details (i.e., features) of some items than of others, and the items they remember in greater detail are the ones they paid more attention to (Bays and Husain, 2008). This suggests that the unit of attention—and therefore the capacity of working memory—is best measured in item features rather than whole items or “chunks” (Cowan et al., 2004). This jibes with the modern view of the brain as a feature-recognition device, but it is controversial among memory researchers: some argue that the basic capacity of human working memory is three to five whole items, and that this capacity is reduced when people attend to many details (i.e., features) of those items (Alvarez and Cavanagh, 2004).
Bottom line: The true capacity of human working memory is
still a research topic.
The second important characteristic of working memory is its volatility. Cognitive psychologists used to say that new items arriving in working memory often bump old ones out, but that way of describing the volatility is based on the view of working memory as a temporary storage place for information. The modern view of working memory as the current focus of attention makes the volatility even clearer: focusing attention on new information turns attention away from some of what it was focusing on. That is why the searchlight analogy is useful.
However we describe it, information can easily be lost from working memory. If items in working memory don’t get combined or rehearsed, they are at risk of having the focus shifted away from them. This volatility applies to goals as well as to the details of objects. Losing items from working memory corresponds to forgetting or losing track of something you were doing. We have all had such experiences, for example:
• Going to another room for something, but once there we can’t remember why we came.
• Taking a phone call, and afterward not remembering what we were doing before the call.
• Something yanks our attention away from a conversation, and then we can’t remember what we were talking about.
• In the middle of adding a long list of numbers, something distracts us, so we have to start over.
WORKING MEMORY TEST
To test your working memory, get a pen or pencil and two blank sheets of paper and follow these instructions:
1. Place one blank sheet of paper after this page in the book and use it to cover the next page.
2. Flip to the next page for three seconds, pull the paper cover down and read the black numbers at the top, and flip back to this page. Don’t peek at other numbers on that page unless you want to ruin the test.
3. Say your phone number backward, out loud.
4. Now write down the black numbers from memory. … Did you get all of them?
5. Flip back to the next page for three seconds, read the red numbers (under the black ones), and flip back.
6. Write down the numbers from memory. These would be easier to recall than the first ones if you noticed that they are the first seven digits of π (3.141592), because then they would be only one number, not seven.
7. Flip back to the next page for three seconds, read the green numbers, and flip back.
8. Write down the numbers from memory. If you noticed that they are odd numbers from 1 to 13, they would be easier to recall, because they would be three chunks (“odd, 1, 13” or “odd, seven from 1”), not seven.
9. Flip back to the next page for three seconds, read the orange words, and flip back.
10. Write down the words from memory. … Could you recall them all?
11. Flip back to the next page for three seconds, read the blue words, and flip back.
12. Write down the words from memory. … It was certainly a lot easier to recall them all because they form a sentence, so they could be memorized as one sentence rather than seven words.

The test stimuli (printed on the page following the test in the book):
Black: 3 8 4 7 5 3 9
Red: 3 1 4 1 5 9 2
Green: 1 3 5 7 9 11 13
Orange: town river corn string car shovel
Blue: what is the meaning of life
IMPLICATIONS OF WORKING MEMORY CHARACTERISTICS FOR USER-INTERFACE DESIGN
The capacity and volatility of working memory have many implications for the design of interactive computer systems. The basic implication is that user interfaces should help people remember essential information from one moment to the next. Don’t require people to remember system status or what they have done, because their attention is focused on their primary goal and progress toward it. Specific examples follow.
Modes
The limited capacity and volatility of working memory are one reason why user-interface design guidelines often say either to avoid designs that have modes or to provide adequate mode feedback. In a moded user interface, some user actions have different effects depending on what mode the system is in. For example:
• In a car, pressing the accelerator pedal can move the car either forwards, backwards, or not at all, depending on whether the transmission is in drive, reverse, or neutral. The transmission sets a mode in the car’s user interface.
• In many digital cameras, pressing the shutter button can either snap a photo or start a video recording, depending on which mode is selected.
• In a drawing program, clicking and dragging normally selects one or more graphic objects on the drawing, but when the software is in “draw rectangle” mode, clicking and dragging adds a rectangle to the drawing and stretches it to the desired size.
Moded user interfaces have advantages; that is why many
interactive systems
have them. Modes allow a device to have more functions than
controls: the same
control provides different functions in different modes. Modes
allow an interactive
system to assign different meanings to the same gestures to
reduce the number of
gestures users must learn.
However, one well-known disadvantage of modes is that people
often make mode
errors: they forget what mode the system is in and do the wrong
thing by mistake
(Johnson, 1990). This is especially true in systems that give
poor feedback about what
the current mode is. Because of the problem of mode errors,
many user-interface design
guidelines say to either avoid modes or provide strong feedback
about which mode the
system is in. Human working memory is too unreliable for
designers to assume that
users can, without clear, continuous feedback, keep track of
what mode the system is
in, even when the users are the ones changing the system from
one mode to another.
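To make the guideline concrete, here is a minimal TypeScript sketch of mode feedback. It is not from the book; the Mode type and the setMode and renderModeIndicator names are invented for illustration. The point is that the mode is changed in exactly one place, which also refreshes an always-visible indicator, so users read the mode from the screen rather than holding it in working memory.

// Minimal sketch of a moded drawing tool with continuous mode feedback.
// All names here are illustrative assumptions, not an API from the book.
type Mode = "select" | "drawRectangle";

let currentMode: Mode = "select";

// Single entry point for mode changes, so the on-screen indicator
// can never drift out of sync with the actual mode.
function setMode(next: Mode): void {
  currentMode = next;
  renderModeIndicator(currentMode);
}

// Continuously visible feedback: in a real UI this would update a
// status bar or change the cursor, not just log to the console.
function renderModeIndicator(mode: Mode): void {
  console.log(mode === "select" ? "Mode: Select" : "Mode: Draw rectangle");
}

// The same gesture (click-drag) has different effects per mode.
function onClickDrag(): void {
  if (currentMode === "select") {
    console.log("Selecting objects");
  } else {
    console.log("Drawing a rectangle");
  }
}

setMode("drawRectangle"); // indicator immediately shows the new mode
onClickDrag();            // "Drawing a rectangle"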
Search results
When people use a search function on a computer to find information, they enter the search terms, start the search, and then review the results. Evaluating the results often requires knowing what the search terms were. If working memory were less limited, people would always remember, when browsing the results, what they had entered as search terms just a few seconds earlier. But as we have seen, working memory is very limited. When the results appear, a person’s attention naturally turns away from what he or she entered and toward the results. Therefore, it should be no surprise that people viewing search results often do not remember the search terms they just typed.
Unfortunately, some designers of online search functions don’t understand that. Search results sometimes don’t show the search terms that generated the results. For example, in 2007, the search results page at Slate.com provided search fields so users could search again, but didn’t show what a user had searched for (see Fig. 7.3A). A recent version of the site shows the user’s search terms (see Fig. 7.3B), reducing the burden on users’ working memory.
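A sketch of the fix follows; it is illustrative only (the SearchResult shape and renderResults name are assumptions, not Slate.com’s code). The results page simply echoes the query back at the top, so users never have to recall what they typed.

// Sketch: echo the user's search terms on the results page so that
// working memory does not have to retain them. Illustrative names.
interface SearchResult {
  title: string;
  url: string;
}

function renderResults(query: string, results: SearchResult[]): string {
  const lines: string[] = [];
  // The echoed query is the crucial line for working memory relief.
  lines.push(`Results for "${query}" (${results.length} found):`);
  for (const r of results) {
    lines.push(`- ${r.title} (${r.url})`);
  }
  return lines.join("\n");
}

console.log(renderResults("working memory", [
  { title: "Working memory", url: "https://example.com/wm" },
]));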
Calls to action
A well-known “netiquette” guideline for writing email
messages, especially messages
that require responses or ask the recipients to do something, is
to restrict each message
to one topic. If a message contains multiple topics or requests,
its recipients may focus
on one of them (usually the first one), get engrossed in
responding to that, and forget to
respond to the rest of the email. The guideline to put different
topics or requests into
separate emails is a direct result of the limited capacity of
human attention.
Web designers are familiar with a similar guideline: avoid putting competing calls to action on a page. Each page should have only one dominant call to action—or one for each possible user goal—so as not to overwhelm users’ attention capacity and send them down paths that don’t achieve their (or the site owner’s) goals.
FIGURE 7.3
Slate.com search results: (A) in 2007, users’ search terms were not shown, but (B) in 2013, search terms are shown.
A related guideline: Once users have specified their goal, don’t
distract them from
accomplishing it by displaying extraneous links and calls to
action. Instead, guide
them to the goal by using a design pattern called the process
funnel (van Duyne
et al., 2002; see also Johnson, 2007).
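As a sketch of how a process funnel might be enforced in code (an illustration under assumed step names, not an implementation from the cited books), global navigation and other competing calls to action are suppressed until the user’s chosen flow completes:

// Sketch of a process funnel: once a user commits to a goal, hide
// extraneous links until the flow ends. Step names are invented.
type FunnelStep = "cart" | "shipping" | "payment" | "confirmation";

function showGlobalNavigation(step: FunnelStep): boolean {
  // Inside the funnel only the current task (and a way back) is shown;
  // unrelated links and promotions are suppressed until the goal is met.
  return step === "confirmation";
}

console.log(showGlobalNavigation("payment"));      // false: keep the user focused
console.log(showGlobalNavigation("confirmation")); // true: normal navigation returns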
Instructions
If you asked a friend for a recipe or for directions to her home, and she gave you a long sequence of steps, you probably would not try to remember it all. You would know that you could not reliably keep all of the instructions in your working memory, so you would write them down or ask your friend to send them to you by email. Later, while following the instructions, you would put them where you could refer to them until you reached the goal.
Similarly, interactive systems that display instructions for multistep operations should allow people to refer to the instructions while executing them until completing all the steps. Most interactive systems do this (see Fig. 7.4), but some do not (see Fig. 7.5).
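A minimal sketch of the good behavior, with invented step text (not the Windows Help implementation): the whole instruction list stays on screen while the user works, and the current step is merely highlighted.

// Sketch: keep multistep instructions visible during execution.
// Step wording and function names are illustrative assumptions.
const steps: string[] = [
  "Open the network settings panel",
  "Choose your wireless network",
  "Enter the network password",
];

function renderSteps(current: number): string {
  // Every step remains displayed; users refer to the list instead of
  // holding the remaining steps in working memory.
  return steps
    .map((step, i) => `${i === current ? ">" : " "} ${i + 1}. ${step}`)
    .join("\n");
}

console.log(renderSteps(1));
//   1. Open the network settings panel
// > 2. Choose your wireless network
//   3. Enter the network password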
Navigation depth
Using a software product, digital device, phone menu system, or Web site often involves navigating to the user’s desired information or goal. It is well established that navigation hierarchies that are broad and shallow are easier for most people—especially those who are nontechnical—to find their way around in than narrow, deep hierarchies (Cooper, 1999). This applies to hierarchies of application windows and dialog boxes, as well as to menu hierarchies (Johnson, 2007).
FIGURE 7.4
Instructions in Windows Help files remain displayed while users
follow them.
A related guideline: In hierarchies deeper than two levels,
provide navigation
“breadcrumb” paths to constantly remind users where they are
(Nielsen, 1999; van
Duyne et al., 2002).
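A breadcrumb trail is simple to generate from the user’s position in the hierarchy. The sketch below is illustrative (the Crumb type and the markup are invented), not code from the cited sources:

// Sketch: render a breadcrumb trail so deep hierarchies constantly
// remind users where they are. Names and markup are assumptions.
interface Crumb {
  label: string;
  href: string;
}

function renderBreadcrumbs(trail: Crumb[]): string {
  // Ancestors are links; the current page is plain text.
  return trail
    .map((c, i) => (i < trail.length - 1 ? `<a href="${c.href}">${c.label}</a>` : c.label))
    .join(" › ");
}

console.log(renderBreadcrumbs([
  { label: "Home", href: "/" },
  { label: "Products", href: "/products" },
  { label: "Cameras", href: "/products/cameras" },
]));
// <a href="/">Home</a> › <a href="/products">Products</a> › Cameras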
These guidelines, like the others mentioned earlier, are based on the limited capacity of human working memory. Requiring users to drill down through eight levels of dialog boxes, web pages, menus, or tables—especially with no visible reminders of their location—will probably exceed their working memory capacity, causing them to forget where they came from or what their overall goals were.
CHARACTERISTICS OF LONG-TERM MEMORY
Long-term memory differs from working memory in many
respects. Unlike working
memory, it actually is a memory store.
However, specific memories are not stored in any one neuron or location in the brain. As described earlier, memories, like perceptions, consist of patterns of activation of large sets of neurons. Related memories correspond to overlapping patterns of activated neurons. This means that every memory is stored in a distributed fashion, spread among many parts of the brain. In this way, long-term memory in the brain is similar to holographic light images.
Long-term memory evolved to serve our ancestors and us very well in getting around in our world. However, it has many weaknesses: it is error-prone, impressionistic, free-associative, idiosyncratic, retroactively alterable, and easily biased by a variety of factors at the time of recording or retrieval. Let’s examine some of these weaknesses.
Error-prone
Nearly everything we’ve ever experienced is stored in our long-term memory. Unlike working memory, the capacity of human long-term memory seems almost unlimited. Adult human brains each contain about 86 billion neurons (Herculano-Houzel, 2009). As described earlier, individual neurons do not store memories; memories are encoded by networks of neurons acting together. Even if only some of the brain’s neurons are involved in memory, the large number of neurons allows for a great many different combinations of them, each capable of representing a different memory. Still, no one has yet measured or even estimated the maximum information capacity of the human brain.4 Whatever the capacity is, it’s a lot.

4 The closest researchers have come is Landauer’s (1986) use of the average human learning rate to calculate the amount of information a person can learn in a lifetime: 10⁹ bits, or a few hundred megabytes.

FIGURE 7.5
Instructions for Windows XP wireless setup start by telling users to close the instructions.
However, what is in long-term memory is not an accurate, high-resolution recording of our experiences. In terms familiar to computer engineers, one could characterize long-term memory as using heavy compression methods that drop a great deal of information. Images, concepts, events, sensations, actions—all are reduced to combinations of abstract features. Different memories are stored at different levels of detail—that is, with more or fewer features.
For example, the face of a man you met briefly who is not
important to you might
be stored simply as an average Caucasian male face with a
beard, with no other
details—a whole face reduced to three features. If you were
asked later to describe
the man in his absence, the most you could honestly say was
that he was a “white
guy with a beard.” You would not be able to pick him out of a
police lineup of other
Caucasian men with beards. In contrast, your memory of your
best friend’s face
includes many more features, allowing you to give a more
detailed description and
pick your friend out of any police lineup. Nonetheless, it is still
a set of features, not
anything like a bitmap image.
As another example, I have a vivid childhood memory of being
run over by a plow
and badly cut, but my father says it happened to my brother.
One of us is wrong.
In the realm of human–computer interaction, a Microsoft Word user may remember that there is a command to insert a page number, but may not remember which menu the command is in. That specific feature may not have been recorded when the user learned how to insert page numbers. Alternatively, perhaps the menu-location feature was recorded, but just does not reactivate with the rest of the memory pattern when the user tries to recall how to insert a page number.
Weighted by emotions
Chapter 1 described a dog that remembered seeing a cat in his
front yard every time
he returned home in the family car. The dog was excited when
he first saw the cat,
so his memory of it was strong and vivid.
A comparable human example: an adult could easily have strong memories of her first day at nursery school, but probably not of her tenth. On the first day, she was probably upset about being left at the school by her parents, whereas by the tenth day, being left there was nothing unusual.
Retroactively alterable
Suppose that while you are on an ocean cruise with your family, you see a whale shark. Years later, when you and your family are discussing the trip, you might remember seeing a whale, and one of your relatives might recall seeing a shark. For both of you, some details in long-term memory were dropped because they did not fit a common concept.
A true example comes from 1983, when the late President Ronald Reagan was speaking with Jewish leaders during his first term as president. He spoke about being in Europe during World War II and helping to liberate Jews from the Nazi concentration camps. The trouble was, he was never in Europe during World War II. When he was an actor, he was in a movie about World War II, made entirely in Hollywood. That important detail was missing from his memory.
IMPLICATIONS OF LONG-TERM MEMORY CHARACTERISTICS FOR USER-INTERFACE DESIGN
The main thing that the characteristics of long-term memory
imply is that people
need tools to augment it. Since prehistoric times, people have
invented technologies
to help them remember things over long periods: notched sticks,
knotted ropes,
mnemonics, verbal stories and histories retold around
campfires, writing, scrolls,
books, number systems, shopping lists, checklists, phone
directories, datebooks,
accounting ledgers, oven timers, computers, portable digital
assistants (PDAs),
online shared calendars, etc.
Given that humankind has a need for technologies that augment memory, it seems clear that software designers should try to provide software that fulfills that need.
A LONG-TERM MEMORY TEST
Test your long-term memory by answering the following questions:
1. Was there a roll of tape in the toolbox in Chapter 1?
2. What was your previous phone number?
3. Which of these words were not in the list presented in the working memory test earlier in this chapter: city, stream, corn, auto, twine, spade?
4. What was your first-grade teacher’s name? Second grade? Third grade? …
5. What Web site was presented earlier that does not show search terms when it displays search results?
Regarding question 3: When words are memorized, often what is retained is the concept, rather than the exact word that was presented. For example, one could hear the word “town” and later recall it as “city.”
At the very least, designers should avoid developing systems that burden long-term memory. Yet that is exactly what many interactive systems do.
Authentication is one functional area in which many software systems place burdensome demands on users’ long-term memory. For example, a web application developed a few years ago told users to change their personal identification number (PIN) “to a number that is easy to remember,” but then imposed restrictions that made it impossible to do so (see Fig. 7.6). Whoever wrote those instructions seems to have realized that the PIN requirements were unreasonable, because the instructions end by advising users to write down their PIN! Never mind that writing a PIN down creates a security risk and adds yet another memory task: users must remember where they hid their written-down PIN.
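The underlying design lesson is that PIN rules should block only genuinely weak choices rather than forbidding every memorable one. The validator below is a hypothetical policy sketched for illustration; it is not the policy shown in Figure 7.6.

// Sketch: reject only trivially guessable PINs, leaving room for
// choices users can actually remember. The policy is an assumption.
function isReasonablePin(pin: string): boolean {
  if (!/^\d{4,6}$/.test(pin)) return false;     // must be 4-6 digits
  if (/^(\d)\1+$/.test(pin)) return false;      // all one digit, e.g. "1111"
  if ("0123456789".includes(pin)) return false; // ascending run, e.g. "2345"
  if ("9876543210".includes(pin)) return false; // descending run, e.g. "8765"
  return true;
}

console.log(isReasonablePin("2580")); // true: memorable keypad column
console.log(isReasonablePin("1234")); // false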
Another example of burdening people’s long-term memory for the sake of security comes from Intuit.com. To purchase software, visitors must register. The site requires users to select a security question from a menu (see Fig. 7.7). What if you can’t answer any of the questions? What if you don’t recall your first pet’s name, your high school mascot, or any of the answers to the other questions?
But that isn’t where the memory burden ends. Some questions
could have sev-
eral possible answers. Many people had several elementary
schools, childhood
friends, or heroes. To register, they must choose a question and
then remember
which answer they gave to Intuit.com. How? Probably by
writing it down some-
where. Then, when Intuit.com asks them the security question,
they have to
remember where they put the answer. Why burden people’s
memory, when it
would be easy to let users make up a security question for
which they can easily
recall the one possible answer?
Such unreasonable demands on people’s long-term memory counteract the security and productivity that computer-based applications supposedly provide (Schrage, 2005), as users:
• Place sticky notes on or near computers or “hide” them in desk drawers.
• Contact customer support to recover passwords they cannot recall.
• Use passwords that are easy for others to guess.
• Set up systems with no login requirements at all, or with one shared login and password.

FIGURE 7.6
Instructions tell users to create an easy-to-remember PIN, but the restrictions make that impossible.
The registration form at NetworkSolutions.com represents a small step toward more usable security. Like Intuit.com, it offers a choice of security questions, but it also allows users to create their own security question—one for which they can more easily remember the answer (see Fig. 7.8).
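A registration form can support this with a few lines of logic. The sketch below is illustrative (the field and function names are invented, not NetworkSolutions.com’s code): the user either picks a stock question or supplies a custom one, and the custom one wins.

// Sketch: accept either a stock security question or a user-written
// one, which the user can more easily answer later. Names invented.
interface SecurityQA {
  question: string;
  answer: string;
}

const STOCK_QUESTIONS: string[] = [
  "What was your first pet's name?",
  "What was your high school mascot?",
];

function buildSecurityQA(
  customQuestion: string | null,
  stockIndex: number,
  answer: string
): SecurityQA {
  // A self-authored question has one answer the user reliably recalls.
  const question = customQuestion ?? STOCK_QUESTIONS[stockIndex];
  if (!question || answer.trim() === "") {
    throw new Error("A question and a non-empty answer are required.");
  }
  return { question, answer };
}

const qa = buildSecurityQA("What street did my grandmother live on?", 0, "Maple");
console.log(qa.question);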
Another implication of long-term memory characteristics for
interactive systems
is that learning and long-term retention are enhanced by user-
interface consistency.
FIGURE 7.7
Intuit.com’s registration burdens long-term memory: users may
have no unique, memorable
answer for any of the questions.
FIGURE 7.8
NetworkSolutions.com allows users to create their own security question.
Designing with the  Mind in MindSimple Guide to Unde.docx

More Related Content

PDF
Distributed User Interfaces Usability And Collaboration 1st Edition Pedro G V...
PDF
Universal Ux Design Building Multicultural User Experience Alberto Ferreira
PDF
Designing for the User Experience in Learning Systems Evangelos Kapros
PDF
Engineering a Compiler 2nd Edition Keith Cooper
PDF
Designing for the User Experience in Learning Systems Evangelos Kapros all ch...
PDF
Projects As Sociotechnical Systems In Engineering Education Meyer
PDF
Human computer interaction 3rd Edition Alan Dix
PDF
Leonardo s Laptop Human Needs and the New Computing Technologies 1st Edition ...
Distributed User Interfaces Usability And Collaboration 1st Edition Pedro G V...
Universal Ux Design Building Multicultural User Experience Alberto Ferreira
Designing for the User Experience in Learning Systems Evangelos Kapros
Engineering a Compiler 2nd Edition Keith Cooper
Designing for the User Experience in Learning Systems Evangelos Kapros all ch...
Projects As Sociotechnical Systems In Engineering Education Meyer
Human computer interaction 3rd Edition Alan Dix
Leonardo s Laptop Human Needs and the New Computing Technologies 1st Edition ...

Similar to Designing with the Mind in MindSimple Guide to Unde.docx (20)

PDF
Humantech Ethical And Scientific Foundations 1st Edition Kim Vicente
PDF
Designing Around People CWUAAT 2016 1st Edition Pat Langdon
PDF
Leonardo s Laptop Human Needs and the New Computing Technologies 1st Edition ...
PDF
Human computer interaction 3rd Edition Alan Dix
PDF
Project
PDF
Human factors and web development 2nd Edition Julie Ratner (Editor)
PDF
Cultural Differences in Human Computer Interaction 1st Edition Rüdiger Heimgä...
PDF
Internet Computing Principles Of Distributed Systems And Emerging Internetbas...
PDF
Ethical Assessments Of Emerging Technologies Appraising The Moral Plausibilit...
PDF
Thriving Systems Theory And Metaphordriven Modeling 1st Edition Leslie J Wagu...
PDF
Human factors and web development 2nd Edition Julie Ratner (Editor)
PDF
Cultural Differences In Humancomputer Interaction 1st Edition Rdiger Heimgrtner
PDF
Ecoop 2014 Objectoriented Programming 28th European Conference Uppsala Sweden...
PDF
Research methods in human-computer interaction 2 ed Edition Feng - eBook PDF
PDF
Human factors and web development 2nd Edition Julie Ratner (Editor)
PDF
Co-creating Digital Public Services for an Ageing Society: Evidence for User-...
PDF
Openaccess Multimodality And Writing Center Studies 1st Edition Elisabeth H B...
PDF
Kumar_Akshat
PDF
Thesis Shaw
PDF
Humancomputer Interaction 3rd Edition Alan Dix
Humantech Ethical And Scientific Foundations 1st Edition Kim Vicente
Designing Around People CWUAAT 2016 1st Edition Pat Langdon
Leonardo s Laptop Human Needs and the New Computing Technologies 1st Edition ...
Human computer interaction 3rd Edition Alan Dix
Project
Human factors and web development 2nd Edition Julie Ratner (Editor)
Cultural Differences in Human Computer Interaction 1st Edition Rüdiger Heimgä...
Internet Computing Principles Of Distributed Systems And Emerging Internetbas...
Ethical Assessments Of Emerging Technologies Appraising The Moral Plausibilit...
Thriving Systems Theory And Metaphordriven Modeling 1st Edition Leslie J Wagu...
Human factors and web development 2nd Edition Julie Ratner (Editor)
Cultural Differences In Humancomputer Interaction 1st Edition Rdiger Heimgrtner
Ecoop 2014 Objectoriented Programming 28th European Conference Uppsala Sweden...
Research methods in human-computer interaction 2 ed Edition Feng - eBook PDF
Human factors and web development 2nd Edition Julie Ratner (Editor)
Co-creating Digital Public Services for an Ageing Society: Evidence for User-...
Openaccess Multimodality And Writing Center Studies 1st Edition Elisabeth H B...
Kumar_Akshat
Thesis Shaw
Humancomputer Interaction 3rd Edition Alan Dix
Ad

More from simonithomas47935 (20)

DOCX
Hours, A. (2014). Reading Fairy Tales and Playing A Way of Treati.docx
DOCX
How are authentication and authorization alike and how are the.docx
DOCX
How are self-esteem and self-concept different What is the or.docx
DOCX
How are morality and religion similar and how are they different.docx
DOCX
How are financial statements used to evaluate business activities.docx
DOCX
How are Japanese and Chinese Americans similar How are they differe.docx
DOCX
Hot Spot PolicingPlace can be an important aspect of crime and.docx
DOCX
HOSP3075 Brand Analysis Paper 1This is the first of three assignme.docx
DOCX
Hou, J., Li, Y., Yu, J. & Shi, W. (2020). A Survey on Digital Fo.docx
DOCX
How (Not) to be Secular by James K.A. SmithSecular (1)—the ea.docx
DOCX
Hopefully, you enjoyed this class on Digital Media and Society.Q.docx
DOCX
hoose (1) one childhood experience from the list provided below..docx
DOCX
honesty, hard work, caring, excellence HIS 1110 Dr. .docx
DOCX
hoose one of the four following visualsImage courtesy o.docx
DOCX
HomeworkChoose a site used by the public such as a supermark.docx
DOCX
Homework 2 Please answer the following questions in small paragraph.docx
DOCX
HomeNotificationsMy CommunityBBA 2010-16J-5A21-S1, Introductio.docx
DOCX
HomeAnnouncementsSyllabusDiscussionsQuizzesGra.docx
DOCX
Homeless The Motel Kids of Orange CountyWrite a 1-2 page pa.docx
DOCX
Home work 8 Date 042220201. what are the different between.docx
Hours, A. (2014). Reading Fairy Tales and Playing A Way of Treati.docx
How are authentication and authorization alike and how are the.docx
How are self-esteem and self-concept different What is the or.docx
How are morality and religion similar and how are they different.docx
How are financial statements used to evaluate business activities.docx
How are Japanese and Chinese Americans similar How are they differe.docx
Hot Spot PolicingPlace can be an important aspect of crime and.docx
HOSP3075 Brand Analysis Paper 1This is the first of three assignme.docx
Hou, J., Li, Y., Yu, J. & Shi, W. (2020). A Survey on Digital Fo.docx
How (Not) to be Secular by James K.A. SmithSecular (1)—the ea.docx
Hopefully, you enjoyed this class on Digital Media and Society.Q.docx
hoose (1) one childhood experience from the list provided below..docx
honesty, hard work, caring, excellence HIS 1110 Dr. .docx
hoose one of the four following visualsImage courtesy o.docx
HomeworkChoose a site used by the public such as a supermark.docx
Homework 2 Please answer the following questions in small paragraph.docx
HomeNotificationsMy CommunityBBA 2010-16J-5A21-S1, Introductio.docx
HomeAnnouncementsSyllabusDiscussionsQuizzesGra.docx
Homeless The Motel Kids of Orange CountyWrite a 1-2 page pa.docx
Home work 8 Date 042220201. what are the different between.docx
Ad

Recently uploaded (20)

PDF
A GUIDE TO GENETICS FOR UNDERGRADUATE MEDICAL STUDENTS
PPTX
CHAPTER IV. MAN AND BIOSPHERE AND ITS TOTALITY.pptx
PDF
Chinmaya Tiranga quiz Grand Finale.pdf
PDF
What if we spent less time fighting change, and more time building what’s rig...
PDF
Complications of Minimal Access Surgery at WLH
PPTX
History, Philosophy and sociology of education (1).pptx
PDF
Empowerment Technology for Senior High School Guide
PPTX
Digestion and Absorption of Carbohydrates, Proteina and Fats
PDF
Practical Manual AGRO-233 Principles and Practices of Natural Farming
PPTX
Orientation - ARALprogram of Deped to the Parents.pptx
DOC
Soft-furnishing-By-Architect-A.F.M.Mohiuddin-Akhand.doc
PDF
advance database management system book.pdf
PPTX
Onco Emergencies - Spinal cord compression Superior vena cava syndrome Febr...
PPTX
Unit 4 Skeletal System.ppt.pptxopresentatiom
PDF
1_English_Language_Set_2.pdf probationary
PPTX
UV-Visible spectroscopy..pptx UV-Visible Spectroscopy – Electronic Transition...
PPTX
A powerpoint presentation on the Revised K-10 Science Shaping Paper
PDF
medical_surgical_nursing_10th_edition_ignatavicius_TEST_BANK_pdf.pdf
PDF
LDMMIA Reiki Yoga Finals Review Spring Summer
PDF
Trump Administration's workforce development strategy
A GUIDE TO GENETICS FOR UNDERGRADUATE MEDICAL STUDENTS
CHAPTER IV. MAN AND BIOSPHERE AND ITS TOTALITY.pptx
Chinmaya Tiranga quiz Grand Finale.pdf
What if we spent less time fighting change, and more time building what’s rig...
Complications of Minimal Access Surgery at WLH
History, Philosophy and sociology of education (1).pptx
Empowerment Technology for Senior High School Guide
Digestion and Absorption of Carbohydrates, Proteina and Fats
Practical Manual AGRO-233 Principles and Practices of Natural Farming
Orientation - ARALprogram of Deped to the Parents.pptx
Soft-furnishing-By-Architect-A.F.M.Mohiuddin-Akhand.doc
advance database management system book.pdf
Onco Emergencies - Spinal cord compression Superior vena cava syndrome Febr...
Unit 4 Skeletal System.ppt.pptxopresentatiom
1_English_Language_Set_2.pdf probationary
UV-Visible spectroscopy..pptx UV-Visible Spectroscopy – Electronic Transition...
A powerpoint presentation on the Revised K-10 Science Shaping Paper
medical_surgical_nursing_10th_edition_ignatavicius_TEST_BANK_pdf.pdf
LDMMIA Reiki Yoga Finals Review Spring Summer
Trump Administration's workforce development strategy

Designing with the Mind in MindSimple Guide to Unde.docx

  • 1. Designing with the Mind in Mind Simple Guide to Understanding User Interface Design Guidelines Second Edition This page intentionally left blank Designing with the Mind in Mind Simple Guide to Understanding User Interface Design Guidelines Second Edition Jeff Johnson AMSTERDAM • BOSTON • HEIDELBERG • LONDON NEW YORK • OXFORD • PARIS • SAN DIEGO SAN FRANCISCO • SINGAPORE • SYDNEY • TOKYO Morgan Kaufmann is an imprint of Elsevier
  • 2. Acquiring Editor: Meg Dunkerley Editorial Project Manager: Heather Scherer Project Manager: Priya Kumaraguruparan Designer: Matthew Limbert Morgan Kaufmann is an imprint of Elsevier 225 Wyman Street, Waltham, MA, 02451, USA Copyright © 2014, 2010 Elsevier Inc. All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher’s permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions. This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein). Notices Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods or professional practices, may become necessary. Practitioners and researchers must always rely on their own experience and knowledge in
  • 3. evaluating and using any information or methods described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility. To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability,negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein. Library of Congress Cataloging-in-Publication Data Application submitted British Library Cataloguing-in-Publication Data A catalogue record for this book is available from the British Library ISBN: 978-0-12-407914-4 Printed in China 14 15 16 17 10 9 8 7 6 5 4 3 2 1 For information on all Morgan Kaufmann publications, visit our Web site at www.mkp.com http://www.elsevier.com/permissions http://www.mkp.com v Contents
  • 4. Acknowledgments ............................................................................................... .......vii Foreword ............................................................................................... ..................... ix Introduction ............................................................................................... .............. xiii CHAPTER 1 Our Perception is Biased .............................................. 1 CHAPTER 2 Our Vision is Optimized to See Structure ...................... 13 CHAPTER 3 We Seek and Use Visual Structure .............................. 29 CHAPTER 4 Our Color Vision is Limited .......................................... 37 CHAPTER 5 Our Peripheral Vision is Poor ...................................... 49 CHAPTER 6 Reading is Unnatural .................................................. 67 CHAPTER 7 Our Attention is Limited; Our Memory is Imperfect ........ 87 CHAPTER 8 Limits on Attention Shape Our Thought and Action ...... 107 CHAPTER 9 Recognition is Easy; Recall is Hard ........................... 121 CHAPTER 10 Learning from Experience and Performing Learned Actions are Easy; Novel Actions, Problem Solving, and Calculation are Hard .......................................... 131 CHAPTER 11 Many Factors Affect Learning .................................... 149
  • 5. CHAPTER 12 Human Decision Making is Rarely Rational ................ 169 CHAPTER 13 Our Hand–Eye Coordination Follows Laws .................. 187 CHAPTER 14 We Have Time Requirements ..................................... 195 Epilogue ............................................................................................... ................... 217 Appendix ............................................................................................... .................. 219 Bibliography ............................................................................................... ............. 223 Index ............................................................................................... ........................ 229 This page intentionally left blank vii Acknowledgments I could not have written this book without a lot of help and the support of many people. First are the students of the human–computer interaction course
  • 6. I taught as an Erskine Fellow at the University of Canterbury in New Zealand in 2006. It was for them that I developed a lecture providing a brief background in perceptual and cognitive psychology—just enough to enable them to understand and apply user-inter- face design guidelines. That lecture expanded into a professional development course, then into the first edition of this book. My need to prepare more comprehensive psy- chological background for an upper-level course in human– computer interaction that I taught at the University of Canterbury in 2013 provided motivation for expanding the topics covered and improving the explanations in this second edition. Second, I thank my colleagues at the University of Canterbury who provided ideas, feedback on my ideas, and illustrations for the second edition’s new chapter on Fitts’ law: Professor Andy Cockburn, Dr. Sylvain Malacria, and Mathieu Nancel. I also thank my colleague and friend Professor Tim Bell for sharing user-interface exam- ples and for other help while I was at the university working on the second edition. Third, I thank the reviewers of the first edition—Susan Fowler, Robin Jeffries, Tim McCoy, and Jon Meads—and of the second edition—Susan Fowler, Robin Jef- fries, and James Hartman. They made many helpful comments and suggestions that allowed me to greatly improve the book.
  • 7. Fourth, I am grateful to four cognitive science researchers who directed me to important references, shared useful illustrations with me, or allowed me to bounce ideas off of them: • Professor Edward Adelson, Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology. • Professor Dan Osherson, Department of Psychology, Princeton University. • Dr. Dan Bullock, Department of Cognitive and Neural Systems, Boston University. • Dr. Amy L. Milton, Department of Psychology and Downing College, University of Cambridge. The book also was helped greatly by the care, oversight, logistical support, and nurturing provided by the staff at Elsevier, especially Meg Dunkerley, Heather Scherer, Lindsay Lawrence, and Priya Kumaraguruparan. Last but not least, I thank my wife and friend Karen Ande for her love and support while I was researching and writing this book. This page intentionally left blank
  • 8. ix Foreword It is gratifying to see this book go into a second edition because of the endorsement that implies for maturing the field of human–computer interaction beyond pure empirical methods. Human–computer interaction (HCI) as a topic is basically simple. There is a per- son of some sort who wants to do some task like write an essay or pilot an airplane. What makes the activity HCI is inserting a mediating computer. In principle, our person could have done the task without the computer. She could have used a quill pen and ink, for example, or flown an airplane that uses hydraulic tubes to work the controls. These are not quite HCI. They do use intermediary tools or machines, and the process of their design and the facts of their use bear resemblance to those of HCI. In fact, they fit into HCI’s uncle discipline of human factors. But it is the com- puter, and the process of contingent interaction the computer renders possible, that makes HCI distinctive. The computer can transform a task’s representation and needed skills. It can change the linear writing process into something more like
  • 9. sculpturing, the writer roughing out the whole, then adding or subtracting bits to refine the text. It can change the piloting process into a kind of supervision, letting the computer with inputs of speed, altitude, and location and outputs of throttle, flap, and rudder, do the actual flying. And if instead of one person we have a small group or a mass crowd, or if instead of a single computer we have a network of communicating mobile or embedded computers, or if instead of a simple task we have impinging cultural or coordination considerations, then we get the many variants of computer mediation that form the broad spectrum of HCI. The components of a discipline of HCI would also seem simple. There is an arti- fact that must be engineered and implemented. There is the process of design for the interaction itself and the objects, virtual or physical, with which to interact. Then there are all the principles, abstractions, theories, facts, and phenomena surround- ing HCI to know about. Let’s call the first interaction engineering (e.g., using Harel statecharts to guide implementation), the second, interaction design (e.g., the design of the workflow for a smartphone to record diet), and the third, perhaps a little overly grandly, interaction science (e.g., the use of Fitts’ law to design button sizes in an application). The hard bit for HCI is that fitting these three together is not easy. Beside work in HCI itself, each has its own literature not
  • 10. friendly to outsiders. The present book was written to bridge the gap between the relevant science that has been built up from the psychological literature and HCI design problems where the science could be of use. Actually, the importance of linking engineering, design, and science together in HCI goes deeper. HCI is a technology. As Brian Arthur in his book The Nature of Forewordx Technology tells us, technologies largely derive from other technologies, not sci- ence. The flat panel displays now common are a substitute for CRT devices of yore, and these go back to modified radar screens on the Whirlwind computer. Further- more, technologies are composed of parts that are themselves technologies. A laptop computer has a display for output and a key and a touchpad for input and several storage systems, and so on, each with its own technologies. But eventually all these technologies ground out in some phenomenon of nature that is not a technology, and here is a place where science plays a role. Some keyboard input devices use the natural phenomenon of electrical capacitance to sense keystrokes. Pressing a key brings two D-shaped pads close to a printed circuit board that is covered by an insu-
  • 11. lating film, thereby changing the pattern of capacitance. That is to say, this keyboard harnesses the natural phenomenon of capacitance in a reliable way that can be exploited to provide the HCI function of signaling an intended interaction to the computer. Many natural phenomena are easy to understand and exploit by simple observa- tion or modest tinkering. No science needed. But some, like capacitance, are much less obvious, and then you really need science to understand them. In some cases, the HCI system that is built generates its own phenomena, and you need science to understand the unexpected, emergent properties of seemingly obvious things. Peo- ple sometimes believe that because they can intuitively understand the easy cases (e.g., with usability testing), they can understand all the cases. But this is not neces- sarily true. The natural phenomena to be exploited in HCI range from abstractions of computer science, such as the notion of the working set, to psychological theories of human cognition, perception, and movement, such as the nature of vision. Psy- chology, the area addressed by this book, is an area with an especially messy and at times contradictory literature, but it is also especially rich in phenomena that can be exploited for HCI technology. I think it is underappreciated how important it is for the future development of
  • 12. HCI as a discipline that the field develops a supporting science base as illustrated by the current book for the field of psychology. It also involves HCI growing some of its own science bits. Why is this important? There are at least three reasons. First, having some sort of theory enables explanatory evaluation. The use of A-B testing is limited if you don’t know why there was a difference. On the other hand, if you have a theory that lets you interpret the difference, then you can fix it. You will never understand the prob- lems of why a windows-based user interface can take excessive time to use by doing usability testing, for example, if you don’t have the theoretical concept of the win- dow working set. Second, it enables generative design. It allows a shift in represen- tation of the design space. Once it is realized that a very important property of pointing devices is the bandwidth of the human motor group to which a transducer is going to be applied, then the problem gets reformulated to terms of how to con- nect those muscles and the consequence for the rest of the design. Third, it supports the codification of knowledge. Only by having theories and abstractions can we concisely cumulate our results and develop a field with sufficient power and depth. xiForeword
  • 13. Why isn’t there wider use of science or theory in HCI? There are obvious reasons, like the fact that it isn’t easy to get the relevant science linkages or results in the first place, that it’s hard to make the connection with science in almost any engineering field, and that often the connection is made, but invisibly packaged, in a way that nonspecialists never need to see it. The poet tosses capacitance with his finger, but only knows he writes a poem. He thinks he writes with love, because someone understood electricity. But, mainly, I think there isn’t wider use of science or theory in HCI because it is difficult to put that knowledge into a form that is easily useful at the time of design need. Jeff Johnson in this book is careful to connect theory with design choice, and to do it in a practical way. He has accumulated grounded design rules that reach across the component parts of HCI, making it easier for designers as they design to keep them in mind. Stuart K. Card This page intentionally left blank
  • 14. xiii Introduction USER-INTERFACE DESIGN RULES: WHERE DO THEY COME FROM AND HOW CAN THEY BE USED EFFECTIVELY? For as long as people have been designing interactive computer systems, some have attempted to promote good design by publishing user-interface design guidelines (also called design rules). Early ones included: • Cheriton (1976) proposed user-interface design guidelines for early interactive (time-shared) computer systems. • Norman (1983a, 1983b) presented design rules for software user interfaces based on human cognition, including cognitive errors. • Smith and Mosier (1986) wrote perhaps the most comprehensive set of user- interface design guidelines. • Shneiderman (1987) included “Eight Golden Rules of Interface Design” in the first edition of his book Designing the User Interface and in all later editions. • Brown (1988) wrote a book of design guidelines, appropriately titled Human– Computer Interface Design Guidelines. • Nielsen and Molich (1990) offered a set of design rules for use in heuristic
  • 15. evaluation of user interfaces, and Nielsen and Mack (1994) updated them. • Marcus (1992) presented guidelines for graphic design in online documents and user interfaces. In the twenty-first century, additional user-interface design guidelines have been offered by Stone et al. (2005); Koyani et al. (2006); Johnson (2007); and Shneiderman and Plaisant (2009). Microsoft, Apple Computer, and Oracle publish guidelines for designing software for their platforms (Apple Computer, 2009; Microsoft Corporation, 2009; Oracle Corporation/Sun Microsystems, 2001). How valuable are user-interface design guidelines? That depends on who applies them to design problems. USER-INTERFACE DESIGN AND EVALUATION REQUIRES UNDERSTANDING AND EXPERIENCE Following user-interface design guidelines is not as straightforward as following cooking recipes. Design rules often describe goals rather than actions. They are purposefully very general to make them broadly applicable, but that means that Introductionxiv their exact meaning and applicability to specific design situations is open to
  • 16. interpretation. Complicating matters further, more than one rule will often seem applicable to a given design situation. In such cases, the applicable design rules often conflict— that is, they suggest different designs. This requires designers to determine which competing design rule is more applicable to the given situation and should take precedence. Design problems, even without competing design guidelines, often have multiple conflicting goals. For example: • Bright screen and long battery life • Lightweight and sturdy • Multifunctional and easy to learn • Powerful and simple • High resolution and fast loading • WYSIWYG (what you see is what you get) and usable by blind people Satisfying all the design goals for a computer-based product or service usually requires tradeoffs—lots and lots of tradeoffs. Finding the right balance point between competing design rules requires further tradeoffs. Given all of these complications, user-interface design rules and
  • 17. guidelines must be applied thoughtfully, not mindlessly, by people who are skilled in the art of user- interface design and/or evaluation. User-interface design rules and guidelines are more like laws than like rote recipes. Just as a set of laws is best applied and inter- preted by lawyers and judges who are well versed in the laws, a set of user-interface design guidelines is best applied and interpreted by people who understand the basis for the guidelines and have learned from experience in applying them. Unfortunately, with a few exceptions (e.g., Norman, 1983a), user-interface design guidelines are provided as simple lists of design edicts with little or no rationale or background. Furthermore, although many early members of the user-interface design and usability profession had backgrounds in cognitive psychology, most newcomers to the field do not. That makes it difficult for them to apply user- interface design guide- lines sensibly. Providing that rationale and background education is the focus of this book. COMPARING USER-INTERFACE DESIGN GUIDELINES Table I.1 places the two best-known user-interface guideline lists side by side to show the types of rules they contain and how they compare to each other (see the Appendix for additional guidelines lists). For example, both
  • 18. lists start with a rule call- ing for consistency in design. Both lists include a rule about preventing errors. The xvIntroduction Nielsen–Molich rule to “help users recognize, diagnose, and recover from errors” corresponds closely to the Shneiderman–Plaisant rule to “permit easy reversal of actions.” “User control and freedom” corresponds to “make users feel they are in control.” There is a reason for this similarity, and it isn’t just that later authors were influenced by earlier ones. WHERE DO DESIGN GUIDELINES COME FROM? For present purposes, the detailed design rules in each set of guidelines, such as those in Table I.1, are less important than what they have in common: their basis and origin. Where did these design rules come from? Were their authors—like clothing fashion designers—simply trying to impose their own personal design tastes on the computer and software industries? If that were so, the different sets of design rules would be very different from each other, as the various authors sought to differentiate themselves from the others. In fact, all of these sets of user-interface design guidelines are quite similar if we ignore differences in wording, emphasis, and the state of
computer technology when each set was written. Why?

The answer is that all of the design rules are based on human psychology: how people perceive, learn, reason, remember, and convert intentions into action. Many authors of design guidelines had at least some background in psychology that they applied to computer system design. For example, Don Norman was a professor, researcher, and prolific author in the field of cognitive psychology long before he began writing about human–computer interaction. Norman’s early human–computer design guidelines were based on research—his own and others’—on human cognition. He was especially interested in cognitive errors that people often make and how computer systems can be designed to lessen or eliminate the impact of those errors.

Table I.1 Two Best-Known Lists of User-Interface Design Guidelines

Shneiderman (1987); Shneiderman and Plaisant (2009):
• Strive for consistency
• Cater to universal usability
• Offer informative feedback
• Design task flows to yield closure
• Prevent errors
• Permit easy reversal of actions
• Make users feel they are in control
• Minimize short-term memory load

Nielsen and Molich (1990):
• Consistency and standards
• Visibility of system status
• Match between system and real world
• User control and freedom
• Error prevention
• Recognition rather than recall
• Flexibility and efficiency of use
• Aesthetic and minimalist design
• Help users recognize, diagnose, and recover from errors
• Provide online documentation and help

Similarly, other authors of user-interface design guidelines—for example, Brown, Shneiderman, Nielsen, and Molich—used knowledge of perceptual and cognitive psychology to try to improve the design of usable and useful interactive systems. Bottom line: User-interface design guidelines are based on human psychology. By reading this book, you will learn the most important aspects of the psychology underlying user-interface and usability design guidelines.

INTENDED AUDIENCE OF THIS BOOK

This book is intended mainly for software design and development professionals
who have to apply user-interface and interaction design guidelines. This includes interaction designers, user-interface designers, user-experience designers, graphic designers, and hardware product designers. It also includes usability testers and evaluators, who often refer to design heuristics when reviewing software or analyzing observed usage problems. A second intended audience is students of interaction design and human–computer interaction. A third intended audience is software development managers who want enough of a background in the psychological basis of user-interface design rules to understand and evaluate the work of the people they manage.

CHAPTER 1 Our Perception is Biased

Our perception of the world around us is not a true depiction of what is actually there. Our perceptions are heavily biased by at least three factors:
• The past: our experience
• The present: the current context
• The future: our goals

PERCEPTION BIASED BY EXPERIENCE

Experience—your past perceptions—can bias your current perception in several different ways.

Perceptual priming

Imagine that you own a large insurance company. You are meeting with a real estate manager, discussing plans for a new campus of company buildings. The campus consists of a row of five buildings, the last two with T-shaped courtyards providing light for the cafeteria and fitness center. If the real estate manager showed you the map in Figure 1.1, you would see five black shapes representing the buildings.

Now imagine that instead of a real estate manager, you are meeting with an advertising manager. You are discussing a new billboard ad to be placed in certain markets around the country. The advertising manager shows you the same image, but in this scenario the image is a sketch of the ad, consisting of a single word: LIFE. In this scenario, you see a word, clearly and unambiguously.

When your perceptual system has been primed to see building shapes, you see building shapes, and the white areas between the buildings barely register in your perception. When your perceptual system has been primed to
  • 23. see text, you see text, and the black areas between the letters barely register. 1 CHAPTER 1 Our Perception is Biased2 A relatively famous example of how priming the mind can affect perception is an image, supposedly by R. C. James,1 that initially looks to most people like a random splattering of paint (see Fig. 1.2) similar to the work of the painter Jackson Pollack. Before reading further, look at the image. Only after you are told that it is a Dalmatian dog sniffing the ground near a tree can your visual system organize the image into a coherent picture. Moreover, once you’ve seen the dog, it is hard to go back to seeing just a random collection of spots. 1 Published in Lindsay and Norman (1972), Figure 3-17, p. 146. FIGURE 1.1 Building map or word? What you see depends on what you were told to see. FIGURE 1.2 Image showing the effect of mental priming of the visual system. What do you see?
These priming examples are visual, but priming can also bias other types of perception, such as sentence comprehension. For example, the headline “New Vaccine Contains Rabies” would probably be understood differently by people who had recently heard stories about contaminated vaccines than by people who had recently heard stories about successful uses of vaccines to fight diseases.

Familiar perceptual patterns or frames

Much of our lives are spent in familiar situations: the rooms in our homes, our yards, our routes to and from school or work, our offices, neighborhood parks, stores, restaurants, etc. Repeated exposure to each type of situation builds a pattern in our minds of what to expect to see there. These perceptual patterns, which some researchers call frames, include the objects or events that are usually encountered in that situation.

For example, you know most rooms in your home well enough that you need not constantly scrutinize every detail. You know how they are laid out and where most objects are located. You can probably navigate much of your home in total darkness. But your experience with homes is broader than your specific home. In addition to having a pattern for your home, your brain has one for homes in general. It biases your perception of all homes, familiar and new. In a kitchen, you expect to see a stove and a sink. In a bathroom, you expect to see a toilet, a sink, and a shower or a bathtub (or both).

Mental frames for situations bias our perception to see the objects and events expected in each situation. They are a mental shortcut: by eliminating the need for us to constantly scrutinize every detail of our environment, they help us get around in our world. However, mental frames also make us see things that aren’t really there. For example, if you visit a house in which there is no stove in the kitchen, you might nonetheless later recall seeing one, because your mental frame for kitchens has a strong stove component. Similarly, part of the frame for eating at a restaurant is paying the bill, so you might recall paying for your dinner even if you absentmindedly walked out without paying. Your brain also has frames for back yards, schools, city streets, business offices, supermarkets, dentist visits, taxis, air travel, and other familiar situations.

Anyone who uses computers, websites, or smartphones has frames for the desktop and files, web browsers, websites, and various types of applications and online services. For example, when they visit a new Web site, experienced Web users
expect to see a site name and logo, a navigation bar, some other links, and maybe a search box. When they book a flight online, they expect to specify trip details, examine search results, make a choice, and make a purchase. Because of the perceptual frames users of computer software and websites have, they often click buttons or links without looking carefully at them. Their perception of the display is based more on what their frame for the situation leads them to expect than on what is actually on the screen. This sometimes confounds software designers, who expect users to see what is on the screen—but that isn’t how human vision works.

For example, if the positions of the “Next” and “Back” buttons on the last page of a multistep dialog box2 switched, many people would not immediately notice the switch (see Fig. 1.3). Their visual system would have been lulled into inattention by the consistent placement of the buttons on the prior several pages. Even after unintentionally going backward a few times, they might continue to perceive the buttons in their standard locations. This is why consistent placement of controls is a common user-interface guideline, to ensure that reality matches the user’s frame for the situation.

2 Multistep dialog boxes are called wizards in user-interface designer jargon.

FIGURE 1.3 The “Next” button is perceived to be in a consistent location, even when it isn’t.

Similarly, if we are trying to find something but it is in a different place or looks different from usual, we might miss it even though it is in plain view, because our mental frames tune us to look for expected features in expected locations. For example, if the “Submit” button on one form in a Web site is shaped differently or is a different color from those on other forms on the site, users might not find it. This expectation-induced blindness is discussed more later in this chapter in the “Perception Biased by Goals” section.

Habituation

A third way in which experience biases perception is called habituation. Repeated exposure to the same (or highly similar) perceptions dulls our perceptual system’s sensitivity to them. Habituation is a very low-level phenomenon of our nervous
system: it occurs at a neuronal level. Even primitive animals like flatworms and amoebas, with very simple nervous systems, habituate to repeated stimuli (e.g., mild electric shocks or light flashes). People, with our complex nervous systems, habituate to a range of events, from low-level ones like a continually beeping tone, to medium-level ones like a blinking ad on a Web site, to high-level ones like a person who tells the same jokes at every party or a politician giving a long, repetitious speech.

We experience habituation in computer usage when the same error messages or “Are you sure?” confirmation messages appear again and again. People initially notice them and perhaps respond, but eventually click them closed reflexively without bothering to read them. Habituation is also a factor in a recent phenomenon variously labeled “social media burnout” (Nichols, 2013), “social media fatigue,” or “Facebook vacations” (Rainie et al., 2013): newcomers to social media sites and tweeting are initially excited by the novelty of microblogging about their experiences, but sooner or later get tired of wasting time reading tweets about every little thing that their “friends” do or see—for example, “Man! Was that ever a great salmon salad I had for lunch today.”

Attentional blink

Another low-level biasing of perception by past experience occurs just after we spot or hear something important. For a very brief period following the recognition—between 0.15 and 0.45 second—we are nearly deaf and blind to other visual stimuli, even though our ears and eyes stay functional. Researchers call this the attentional blink (Raymond et al., 1992; Stafford and Webb, 2005).3 It is thought to be caused by the brain’s perceptual and attention mechanisms being briefly fully occupied with processing the first recognition.

3 Chapter 14 discusses the attentional blink interval in the context of other perceptual intervals.

A classic example: You are in a subway car as it enters a station, planning to meet two friends at that station. As the train arrives, your car passes one of your friends, and you spot him briefly through your window. In the next split second, your window passes your other friend, but you fail to notice her because her image hit your retina during the attentional blink that resulted from your recognition of your first friend.

When people use computer-based systems and online services, attentional blink can cause them to miss information or events if things appear in rapid succession.
A popular modern technique for making documentary videos is to present a series of still photographs in rapid succession.4 This technique is highly prone to attentional blink effects: if an image really captures your attention (e.g., it has a strong meaning for you), you will probably miss one or more of the immediately following images. In contrast, a captivating image in an auto-running slideshow (e.g., on a Web site or an information kiosk) is unlikely to cause attentional blink (i.e., missing the next image) because each image typically remains displayed for several seconds.

4 For an example, search YouTube for “history of the world in two minutes.”

PERCEPTION BIASED BY CURRENT CONTEXT

When we try to understand how our visual perception works, it is tempting to think of it as a bottom-up process, combining basic features such as edges, lines, angles, curves, and patterns into figures and ultimately into meaningful objects. To take reading as an example, you might assume that our visual system first recognizes shapes as letters and then combines letters into words, words into sentences, and so on. But visual perception—reading in particular—is not strictly a bottom-up process. It includes top-down influences too. For example, the word in which a character appears may affect how we identify the character (see Fig. 1.4). Similarly, our overall comprehension of a sentence or a paragraph can even influence what words we see in it. For example, the same letter sequence can be read as different words depending on the meaning of the surrounding paragraph (see Fig. 1.5).

FIGURE 1.4 The same character is perceived as H or A depending on the surrounding letters.
FIGURE 1.5 The same phrase is perceived differently depending on the list it appears in:
Fold napkins. Polish silverware. Wash dishes.
French napkins. Polish silverware. German dishes.

Contextual biasing of vision need not involve reading. The Müller–Lyer illusion is a famous example (see Fig. 1.6): the two horizontal lines are the same length, but the outward-pointing “fins” cause our visual system to see the top line as longer than the line with inward-pointing “fins.” This and other optical illusions (see Fig. 1.7) trick us because our visual system does not use accurate, optimal methods to perceive the world. It developed through evolution, a semi-random process that layers jury-rigged—often incomplete and inaccurate—solutions on top of each other. It works fine most of the time, but it includes a lot of approximations, kludges, hacks, and outright “bugs” that cause it to fail in certain cases.

FIGURE 1.6 Müller–Lyer illusion: equal-length horizontal lines appear to have different lengths.

The examples in Figures 1.6 and 1.7 show vision being biased by visual context. However, biasing of perception by the current context works between different senses too. Perceptions in any of our five senses may affect simultaneous perceptions in any of our other senses. What we feel with our tactile sense can be biased by what we hear, see, or smell. What we see can be biased by what we hear, and what we hear can be biased by what we see. Here are two examples of visual perception affecting what we hear:

• McGurk effect. If you watch a video of someone saying “bah, bah, bah,” then “dah, dah, dah,” then “vah, vah, vah,” but the audio is “bah, bah, bah” throughout, you will hear the syllable indicated by the speaker’s lip movement rather than the syllable actually in the audio track.5 Only by closing or averting your eyes do you hear the syllable as it really is. I’ll bet you didn’t know you could read lips, and in fact do so many times a day.

5 Go to YouTube, search for “McGurk effect,” and view (and hear) some of the resulting videos.

• Ventriloquism. Ventriloquists don’t throw their voice; they just learn to talk without moving their mouths much. Viewers’ brains perceive the talking as coming from the nearest moving mouth: that of the ventriloquist’s puppet (Eagleman, 2012).

An example of the opposite—hearing biasing vision—is the illusory flash effect. When a spot is flashed once briefly on a display but is accompanied by two quick beeps, it appears to flash twice. Similarly, the perceived rate of a blinking light can be adjusted by the frequency of a repeating click (Eagleman, 2012).

Later chapters explain how visual perception, reading, and recognition function in the human brain. For now, I will simply say that the pattern of neural activity that corresponds to recognizing a letter, a word, a face, or any object includes input from
neural activity stimulated by the context. This context includes other nearby perceived objects and events, and even reactivated memories of previously perceived objects and events.

FIGURE 1.7 (A) The checkerboard does not bulge in the middle; (B) the triangle sides are not bent; and (C) the red vertical lines are parallel.

Context biases perception not only in people but also in lower animals. A friend of mine often brought her dog with her in her car when running errands. One day, as she drove into her driveway, a cat was in the front yard. The dog saw it and began barking. My friend opened the car door and the dog jumped out and ran after the cat, which turned and jumped through a bush to escape. The dog dove into the bush but missed the cat. The dog remained agitated for some time afterward. Thereafter, for as long as my friend lived in that house, whenever she arrived at home with her dog in the car, he would get excited, bark, jump out of the car as soon as the door was opened, dash across the yard, and leap into the bush. There was no cat, but that didn’t matter. Returning home in the car was enough to make the dog see one—perhaps even smell one. However, walking home, as the dog did after being taken for his daily walk, did not evoke the “cat mirage.”

PERCEPTION BIASED BY GOALS

In addition to being biased by our past experience and the present context, our perception is influenced by our goals and plans for the future. Specifically, our goals:
• Guide our perceptual apparatus, so we sample what we need from the world around us.
• Filter our perceptions: things unrelated to our goals tend to be filtered out preconsciously, never registering in our conscious minds.

For example, when people navigate through software or a Web site, seeking information or a specific function, they don’t read carefully. They scan screens quickly and superficially for items that seem related to their goal. They don’t simply ignore items unrelated to their goals; they often don’t even notice them. To see this, glance at Figure 1.8 and look for scissors, and then immediately flip back to this page. Try it now. Did you spot the scissors? Now, without looking back at the
toolbox, can you say whether there is a screwdriver in the toolbox too?

Our goals filter our perceptions in other perceptual senses as well as in vision. A familiar example is the “cocktail party” effect. If you are conversing with someone at a crowded party, you can focus your attention to hear mainly what he or she is saying even though many other people are talking near you. The more interested you are in the conversation, the more strongly your brain filters out surrounding chatter. If you are bored by what your conversational partner is saying, you will probably hear much more of the conversations around you.

The effect was first documented in studies of air-traffic controllers, who were able to carry on a conversation with the pilots of their assigned aircraft even though many different conversations were occurring simultaneously on the same radio frequency, coming out of the same speaker in the control room (Arons, 1992). Research suggests that our ability to focus on one conversation among several simultaneous ones depends not only on our interest level in the conversation, but also on objective factors, such as the similarity of voices in the cacophony, the amount of general “noise” (e.g., clattering dishes or loud music), and the predictability of what your conversational partner is saying (Arons, 1992).

This filtering of perception by our goals is particularly true for adults, who tend to be more focused on goals than children are. Children are more stimulus-driven: their perception is less filtered by their goals. This characteristic makes them more distractible than adults, but it also makes them less biased as observers. A parlor game demonstrates this age difference in perceptual filtering. It is similar to the Figure 1.8 exercise. Most households have a catch-all drawer for kitchen implements or tools. From your living room, send a visitor to the room where the catch-all drawer is, with instructions to fetch you a specific tool, such as measuring spoons or a pipe wrench. When the person returns with the tool, ask whether another specific tool was in the drawer. Most adults will not know what else was in the drawer. Children—if they can complete the task without being distracted by all the cool stuff in the drawer—will often be able to tell you more about what else was there.

Perceptual filtering can also be seen in how people navigate websites. Suppose I put you on the homepage of New Zealand’s University of Canterbury (see Fig. 1.9) and asked you to find information about financial support for postgraduate students
  • 39. too. You constantly move your eyes, ears, hands, feet, body, and attention so as to sample exactly the things in your environment that are most relevant to what you are doing or about to do (Ware, 2008). If you are looking on a Web site for a campus map, your eyes and pointer-controlling hand are attracted to anything that might lead you to that goal. You more or less ignore anything unrelated to your goal. l Sensitizing our perceptual system to certain features. When you are look- ing for something, your brain can prime your perception to be especially sensi- tive to features of what you are looking for (Ware, 2008). For example, when you are looking for a red car in a large parking lot, red cars will seem to pop out as you scan the lot, and cars of other colors will barely register in your con- sciousness, even though you do in some sense see them. Similarly, when you are FIGURE 1.9 University of Canterbury Web site: navigating sites requires perceptual filtering. CHAPTER 1 Our Perception is Biased12 trying to find your spouse in a dark, crowded room, your brain “programs” your
  • 40. auditory system to be especially sensitive to the combination of frequencies that make up his or her voice. TAKING BIASED PERCEPTION INTO ACCOUNT WHEN DESIGNING All these sources of perceptual bias of course have implications for user-interface design. Here are three. Avoid ambiguity Avoid ambiguous information displays, and test your design to verify that all users interpret the display in the same way. Where ambiguity is unavoidable, either rely on standards or conventions to resolve it, or prime users to resolve the ambiguity in the intended way. For example, computer displays often shade buttons and text fields to make them look raised in relation to the background surface (see Fig. 1.10). This appearance relies on a convention, familiar to most experienced computer users, that the light source is at the top left of the screen. If an object were depicted as lit by a light source in a different location, users would not see the object as raised. Be consistent Place information and controls in consistent locations. Controls and data displays that serve the same function on different pages should be placed in the same position on each page on which they appear. They should also have the
  • 41. same color, text fonts, shading, and so on. This consistency allows users to spot and recognize them quickly. Understand the goals Users come to a system with goals they want to achieve. Designers should under- stand those goals. Realize that users’ goals may vary, and that their goals strongly influence what they perceive. Ensure that at every point in an interaction, the infor- mation users need is available, prominent, and maps clearly to a possible user goal, so users will notice and use the information. Search FIGURE 1.10 Buttons on computer screens are often shaded to make them look three dimensional, but the convention works only if the light source is assumed to be on the top left. Designing with the Mind in Mind. http://dx.doi.org/10.1016/B978-0-12-407914-4.00002-6 © 2014 Elsevier Inc. All rights reserved. CHAPTER 13 Our Vision is Optimized to See Structure 2
Early in the twentieth century, a group of German psychologists sought to explain how human visual perception works. They observed and catalogued many important visual phenomena. One of their basic findings was that human vision is holistic: our visual system automatically imposes structure on visual input and is wired to perceive whole shapes, figures, and objects rather than disconnected edges, lines, and areas. The German word for “shape” or “figure” is Gestalt, so these theories became known as the Gestalt principles of visual perception.

Today’s perceptual and cognitive psychologists regard the Gestalt theory of perception as more of a descriptive framework than an explanatory and predictive theory. Today’s theories of visual perception tend to be based heavily on the neurophysiology of the eyes, optic nerve, and brain (see Chapters 4–7). Not surprisingly, the findings of neurophysiological researchers support the observations of the Gestalt psychologists. We really are—along with other animals—“wired” to perceive our surroundings in terms of whole objects (Stafford and Webb, 2005; Ware, 2008). Consequently, the Gestalt principles are still valid—if not as a fundamental explanation of visual perception, at least as a framework for describing it. They also provide a useful basis for guidelines for graphic design and user-interface design (Soegaard, 2007).

For present purposes, the most important Gestalt principles are Proximity, Similarity, Continuity, Closure, Symmetry, Figure/Ground, and Common Fate. The following sections describe each principle and provide examples from both static graphic design and user-interface design.

GESTALT PRINCIPLE: PROXIMITY

The Gestalt principle of Proximity is that the relative distance between objects in a display affects our perception of whether and how the objects are organized into
  • 44. In Outlook’s Distribution List Membership dialog box, list buttons are in a group box, separate from the control buttons. 15Gestalt principle: proximity The Proximity principle has obvious relevance to the layout of control panels or data forms in software, Web sites, and electronic appliances. Designers often sepa- rate groups of on-screen controls and data displays by enclosing them in group boxes or by placing separator lines between groups (see Fig. 2.2). However, according to the Proximity principle, items on a display can be visually grouped simply by spacing them closer to each other than to other controls, without group boxes or visible borders (see Fig. 2.3). Many graphic design experts recom- mend this approach to reduce visual clutter and code size in a user interface (Mullet and Sano, 1994). FIGURE 2.3 In Mozilla Thunderbird’s Subscribe Folders dialog box, controls are grouped using the Proximity principle. FIGURE 2.4 In Discreet’s Software Installer, poorly spaced radio buttons
  • 45. look grouped in vertical columns. CHAPTER 2 Our Vision is Optimized to See Structure16 Conversely, if controls are poorly spaced (e.g., if connected controls are too far apart) people will have trouble perceiving them as related, making the software harder to learn and remember. For example, the Discreet Software Installer displays six horizontal pairs of radio buttons, each representing a two- way choice, but their spacing, due to the Proximity principle, makes them appear to be two vertical sets of radio buttons, each representing a six-way choice, at least until users try them and learn how they operate (see Fig. 2.4). GESTALT PRINCIPLE: SIMILARITY Another factor that affects our perception of grouping is expressed in the Gestalt principle of Similarity, where objects that look similar appear grouped, all other things being equal. In Figure 2.5, the slightly larger, “hollow” stars are perceived as a group. The Page Setup dialog box in Mac OS applications uses the Similarity and Proxim- ity principles to convey groupings (see Fig. 2.6). The three very similar and tightly spaced Orientation settings are clearly intended to appear grouped. The three menus are not so tightly spaced but look similar enough that they
  • 46. appear related even though that probably wasn’t intended. Similarly, the text fields in a form at book publisher Elsevier’s Web site are orga- nized into an upper group of eight for the name and address, a group of three split fields for phone numbers, and two single text fields. The four menus, in addition to being data fields, help separate the text field groups (see Fig. 2.7). By contrast, the labels are too far from their fields to seem connected to them. FIGURE 2.5 Similarity: items appear grouped if they look more similar to each other than to other objects. 17Gestalt principle: similarity FIGURE 2.6 Mac OS Page Setup dialog box. The Similarity and Proximity principles are used to group the Orientation settings. FIGURE 2.7 Similarity makes the text fields appear grouped in this online form at Elsevier.com. http://Elsevier.com
  • 47. CHAPTER 2 Our Vision is Optimized to See Structure18 GESTALT PRINCIPLE: CONTINUITY In addition to the two Gestalt principles concerning our tendency to organize objects into groups, several Gestalt principles describe our visual system’s tendency to resolve ambiguity or fill in missing data in such a way as to perceive whole objects. The first such principle, the principle of Continuity, states that our visual perception is biased to perceive continuous forms rather than disconnected segments. For example, in Figure 2.8A, we automatically see two crossing lines—one blue and one orange. We don’t see two separate orange segments and two separate blue ones, and we don’t see a blue-and-orange V on top of an upside- down orange-and- blue V. In Figure 2.8B, we see a sea monster in water, not three pieces of one. A well-known example of the use of the continuity principle in graphic design is the IBM® logo. It consists of disconnected blue patches, and yet it is not at all ambig- uous; it is easily seen as three bold letters, perhaps viewed through something like venetian blinds (see Fig. 2.9). (A) (B) FIGURE 2.8 Continuity: Human vision is biased to see continuous forms,
  • 48. even adding missing data if necessary. FIGURE 2.9 The IBM company logo uses the Continuity principle to form letters from disconnected patches. 19Gestalt principle: closure Slider controls are a user-interface example of the Continuity principle. We see a slider as depicting a single range controlled by a handle that appears somewhere on the slider, not as two separate ranges separated by the handle (see Fig. 2.10A). Even displaying different colors on each side of a slider’s handle doesn’t completely “break” our perception of a slider as one continuous object, although Componen- tOne’s choice of strongly contrasting colors (gray vs. red) certainly strains that per- ception a bit (see Fig. 2.10B). GESTALT PRINCIPLE: CLOSURE Related to Continuity is the Gestalt principle of Closure, which states that our visual system automatically tries to close open figures so that they are perceived as whole objects rather than separate pieces. Thus, we perceive the disconnected arcs in Fig- ure 2.11A as a circle. Our visual system is so strongly biased to see objects that it can
  • 49. even interpret a totally blank area as an object. We see the combination of shapes in Figure 2.11B as a white triangle overlapping another triangle and three black circles, even though the figure really only contains three V shapes and three black pac-men. The Closure principle is often applied in graphical user interfaces (GUIs). For example, GUIs often represent collections of objects (e.g., documents or messages) as stacks (see Fig. 2.12). Just showing one whole object and the edges of others “behind” it is enough to make users perceive a stack of objects, all whole. (A) (B) FIGURE 2.10 Continuity: we see a slider as a single slot with a handle somewhere on it, not as two slots separated by a handle: (A) Mac OS and (B) ComponentOne. CHAPTER 2 Our Vision is Optimized to See Structure20 GESTALT PRINCIPLE: SYMMETRY A third fact about our tendency to see objects is captured in the Gestalt principle of Symmetry. It states that we tend to parse complex scenes in a way that reduces the
  • 50. complexity. The data in our visual field usually has more than one possible interpre- tation, but our vision automatically organizes and interprets the data so as to simplify it and give it symmetry. For example, we see the complex shape on the far left of Figure 2.13 as two over- lapping diamonds, not as two touching corner bricks or a pinch- waist octahedron with a square in its center. A pair of overlapping diamonds is simpler than the other two interpretations shown on the right—it has fewer sides and more symmetry than the other two interpretations. In printed graphics and on computer screens, our visual system’s reliance on the symmetry principle can be exploited to represent three- dimensional objects on a two-dimensional display. This can be seen in a cover illustration for Paul Thagard’s book Coherence in Thought and Action (Thagard, 2002; see Fig. 2.14) and in a three-dimensional depiction of a cityscape (see Fig. 2.15). FIGURE 2.12 Icons depicting stacks of objects exhibit the Closure principle: partially visible objects are perceived as whole. (A) (B) FIGURE 2.11
  • 51. Closure: Human vision is biased to see whole objects, even when they are incomplete. 21Gestalt principle: figure/ground GESTALT PRINCIPLE: FIGURE/GROUND The next Gestalt principle that describes how our visual system structures the data it receives is Figure/Ground. This principle states that our mind separates the visual field into the figure (the foreground) and ground (the background). The foreground consists of the elements of a scene that are the object of our primary attention, and the background is everything else. The Figure/Ground principle also specifies that the visual system’s parsing of scenes into figure and ground is influenced by characteristics of the scene. For exam- ple, when a small object or color patch overlaps a larger one, we tend to perceive the smaller object as the figure and the larger object as the ground (see Fig. 2.16). not= or FIGURE 2.13 Symmetry: the human visual system tries to resolve complex scenes into combinations of simple, symmetrical shapes. FIGURE 2.14
  • 52. The cover of the book Coherence in Thought and Action (Thagard, 2002) uses the symmetry, Closure, and Continuity principles to depict a cube. CHAPTER 2 Our Vision is Optimized to See Structure22 However, our perception of figure versus ground is not completely determined by scene characteristics. It also depends on the viewer’s focus of attention. Dutch artist M. C. Escher exploited this phenomenon to produce ambiguous images in which figure and ground switch roles as our attention shifts (see Fig. 2.17). In user-interface and Web design, the Figure/Ground principle is often used to place an impression-inducing background “behind” the primary displayed content FIGURE 2.16 Figure/Ground: when objects overlap, we see the smaller as the figure and the larger as the ground. FIGURE 2.15 Symmetry: the human visual system parses very complex two- dimensional images into three- dimensional scenes.
  • 53. 23Gestalt principle: figure/ground (see Fig. 2.18). The background can convey information (e.g., the user’s current loca- tion), or it can suggest a theme, brand, or mood for interpretation of the content. Figure/Ground is also often used to pop up information over other content. Con- tent that was formerly the figure—the focus of the users’ attention—temporarily becomes the background for new information, which appears briefly as the new FIGURE 2.17 M. C. Escher exploited figure/ground ambiguity in his art. FIGURE 2.18 Figure/Ground is used at AndePhotos.com to display a thematic watermark “behind” the content. http://AndePhotos.com CHAPTER 2 Our Vision is Optimized to See Structure24 figure (see Fig. 2.19). This approach is usually better than temporarily replacing the old information with the new information, because it provides context that helps keep people oriented regarding their place in the interaction. GESTALT PRINCIPLE: COMMON FATE
  • 54. The previous six Gestalt principles concerned perception of static (unmoving) figures and objects. One final Gestalt principle—Common Fate— concerns moving objects. The cCommon Fate principle is related to the Proximity and Similarity principles— like them, it affects whether we perceive objects as grouped. The Common Fate prin- ciple states that objects that move together are perceived as grouped or related. For example, in a display showing dozens of pentagons, if seven of them wiggled in synchrony, people would see them as a related group, even if the wiggling penta- gons were separated from each other and looked no different from all the other pentagons (see Fig. 2.20). Common motion—implying common fates—is used in some animations to show relationships between entities. For example, Google’s GapMinder graphs animate dots representing nations to show changes over time in various factors of economic devel- opment. Countries that move together share development histories (see Fig. 2.21). FIGURE 2.19 Figure/Ground is used at PBS.org’s mobile Web site to pop up a call-to-action “over” the page content.
  • 55. 25Gestalt principles: combined GESTALT PRINCIPLES: COMBINED Of course, in real-world visual scenes, the Gestalt principles work in concert, not in isolation. For example, a typical Mac OS desktop usually exemplifies six of the seven principles described here, excluding Common Fate): Proximity, Similarity, Continu- ity, Closure, Symmetry, and Figure/Ground (see Fig. 2.22). On a typical desktop, Common Fate is used (along with similarity) when a user selects several files or fold- ers and drags them as a group to a new location (see Fig. 2.23). FIGURE 2.20 Common Fate: items appear grouped or related if they move together. FIGURE 2.21 Common fate: GapMinder animates dots to show which nations have similar development histories (for details, animations, and videos, visit GapMinder.org). http://GapMinder.org CHAPTER 2 Our Vision is Optimized to See Structure26 FIGURE 2.22 All of the Gestalt principles except Common Fate play a role in this portion of a Mac OS desktop.
  • 56. FIGURE 2.23 Similarity and Common Fate: when users drag folders that they have selected, common highlight- ing and motion make the selected folders appear grouped. 27Gestalt principles: combined With all these Gestalt principles operating at once, unintended visual relation- ships can be implied by a design. A recommended practice, after designing a display, is to view it with each of the Gestalt principles in mind— Proximity, Similarity, Con- tinuity, Closure, Symmetry, Figure/Ground, and Common Fate—to see if the design suggests any relationships between elements that you do not intend. This page intentionally left blank Designing with the Mind in Mind. http://dx.doi.org/10.1016/B978-0-12-407914-4.00003-8 © 2014 Elsevier Inc. All rights reserved. CHAPTER 29
  • 57. We Seek and Use Visual Structure Chapter 2 used the Gestalt principles of visual perception to show how our visual system is optimized to perceive structure. Perceiving structure in our environment helps us make sense of objects and events quickly. Chapter 2 also mentioned that when people are navigating through software or Web sites, they don’t scrutinize screens carefully and read every word. They scan quickly for relevant information. This chapter presents examples to show that when information is presented in a terse, structured way, it is easier for people to scan and understand. Consider two presentations of the same information about an airline flight reser- vation. The first presentation is unstructured prose text; the second is structured text in outline form (see Fig. 3.1). The structured presentation of the reservation can be scanned and understood much more quickly than the prose presentation. The more structured and terse the presentation of information, the more quickly and easily people can scan and comprehend it. Look at the Contents page from the California Department of Motor Vehicles (see Fig. 3.2). The wordy, repetitive links slow users down and “bury” the important words they need to see.
  • 58. 3 Unstructured: You are booked on United flight 237, which departs from Auckland at 14:30 on Tuesday 15 Oct and arrives at San Francisco at 11:40 on Tuesday 15 Oct. Structured: Flight: United 237, Auckland San Francisco Depart: 14:30 Tue 15 Oct Arrive: 11:40 Tue 15 Oct FIGURE 3.1 Structured presentation of airline reservation information is easier to scan and understand. CHAPTER 3 We Seek and Use Visual Structure30 Compare that with a terser, more structured hypothetical design that factors out needless repetition and marks as links only the words that represent options (see Fig. 3.3). All options presented in the actual Contents page are available in the revision, yet it consumes less screen space and is easier to scan. Displaying search results is another situation in which
  • 59. structuring data and avoid- ing repetitive “noise” can improve people’s ability to scan quickly and find what they seek. In 2006, search results at HP.com included so much repeated navigation data and metadata for each retrieved item that they were useless. By 2009, HP had elimi- nated the repetition and structured the results, making them easier to scan and more useful (see Fig. 3.4). Of course, for information displays to be easy to scan, it is not enough merely to make them terse, structured, and nonrepetitious. They must also conform to the rules of graphic design, some of which were presented in Chapter 2. For example, a prerelease version of a mortgage calculator on a real estate Web site presented its results in a table that violated at least two important rules of graphic design (see Fig. 3.5A). First, people usually read (online or offline) from top to bottom, but the labels for calculated amounts were below their corresponding values. Second, the labels were just as close to the value below as to their own FIGURE 3.2 Contents page at the California Department of Motor Vehicles (DMV) Web site buries the important information in repetitive prose. Licenses & ID Cards: Renewals, Duplicates, Changes
  • 60. • Renew license: in person by mail by Internet • Renew: instruction permit • Apply for duplicate: license ID card • Change of: name address • Register as: organ donor FIGURE 3.3 California DMV Web site Contents page with repetition eliminated and better visual structure. http://HP.com 31CHAPTER 3 We Seek and Use Visual Structure value, so proximity (see Chapter 2) could not be used to perceive that labels were grouped with their values. To understand this mortgage results table, users had to scrutinize it carefully and slowly figure out which labels went with which numbers. (A) (B) FIGURE 3.4 In 2006, HP.com’s site search produced repetitious, “noisy” results (A), but by 2009 was improved (B). 360 0.00
  • 61. Mortgage Summary Monthly Payment $ 1,840.59 Number of Payments Total of Payments $ 662,611.22 Interest Total $ 318,861.22 Tax Total $ 93,750.00 PMI Total $ Pay off Date Sep 2037 (A) (B) FIGURE 3.5 (A) Mortgage summary presented by a software mortgage calculator; (B) an improved design. CHAPTER 3 We Seek and Use Visual Structure32 The revised design, in contrast, allows users to perceive the correspondence between labels and values without conscious thought (see Fig. 3.5B). STRUCTURE ENHANCES PEOPLE’S ABILITY TO SCAN LONG NUMBERS
  • 62. Even small amounts of information can be made easier to scan if they are structured. Two examples are telephone numbers and credit card numbers (see Fig. 3.6). Tradi- tionally, such numbers were broken into parts to make them easier to scan and remember. A long number can be broken up in two ways: either the user interface breaks it up explicitly by providing a separate field for each part of the number, or the inter- face provides a single number field but lets users break the number into parts with spaces or punctuation (see Fig. 3.7A). However, many of today’s computer presenta- tions of phone and credit card numbers do not segment the numbers and do not Easy: (415) 123 4567 Hard: 4151234567 Easy: 1234 5678 9012 3456 Hard: 1234567890123456 FIGURE 3.6 Telephone and credit card numbers are easier to scan and understand when segmented. (A) (B) FIGURE 3.7
  • 63. (A) At Democrats.org, credit card numbers can include spaces. (B) At StuffIt.com, they cannot, making them harder to scan and verify. http://Democrats.org http://StuffIt.com 33Data-specific controls provide even more structure allow users to include spaces or other punctuation (see Fig. 3.7B). This limitation makes it harder for people to scan a number or verify that they typed it correctly, and so is considered a user-interface design blooper ( Johnson, 2007). Forms presented in software and Web sites should accept credit card numbers, social security numbers, phone numbers, and so on in a variety of different formats and parse them into the internal format. Segmenting data fields can provide useful visual structure even when the data to be entered is not, strictly speaking, a number. Dates are an example of a case in which segmented fields can improve readability and help prevent data entry errors, as shown by a date field at Bank of America’s Web site (see Fig. 3.8). DATA-SPECIFIC CONTROLS PROVIDE EVEN MORE STRUCTURE A step up in structure from segmented data fields are data- specific controls. Instead of using simple text fields—whether segmented or not—
  • 64. designers can use controls that are designed specifically to display (and accept as input) a value of a specific type. For example, dates can be presented (and accepted) in the form of menus com- bined with pop-up calendar controls (see Fig. 3.9). It is also possible to provide visual structure by mixing segmented text fields with data-specific controls, as demonstrated by an email address field at Southwest Air- lines’ Web site (see Fig. 3.10). FIGURE 3.8 At BankOfAmerica.com, segmented data fields provide useful structure. FIGURE 3.10 At SWA.com email addresses are entered into fields structured to accept parts of the address. FIGURE 3.9 At NWA.com, dates are displayed and entered using a control that is specifically designed for dates. http://BankOfAmerica.com http://SWA.com http://NWA.com CHAPTER 3 We Seek and Use Visual Structure34 VISUAL HIERARCHY LETS PEOPLE FOCUS ON THE
  • 65. RELEVANT INFORMATION One of the most important goals in structuring information presentations is to pro- vide a visual hierarchy—an arrangement that: l Breaks the information into distinct sections, and breaks large sections into subsections. l Labels each section and subsection prominently and in such a way as to clearly identify its content. l Presents the sections and subsections as a hierarchy, with higher-level sections presented more strongly than lower-level ones. A visual hierarchy allows people, when scanning information, to instantly sepa- rate what is relevant to their goals from what is irrelevant, and to focus their atten- tion on the relevant information. They find what they are looking for more quickly because they can easily skip everything else. Try it for yourself. Look at the two information displays in Figure 3.11 and find the information about prominence. How much longer does it take you to find it in the nonhierarchical presentation? Create a Clear Visual Hierarchy Organize and prioritize the contents of a page by
  • 66. using size, prominence, and content relationships. Let’s look at these relationships more closely: • Size. The more important a headline is, the larger its font size should be. Big bold headlines help to grab the user’s attention as they scan the Web page. • Content Relationships. Group similar content types by displaying the content in a similar visual style, or in a clearly defined area. • Prominence. The more important the headline or content, the higher up the page it should be placed. The most important or popular content should always be positioned prominently near the top of the page, so users can view it without having to scroll too far. Create a Clear Visual Hierarchy Organize and prioritize the contents of a page by using size, prominence, and content relationships. Let’s look at these relationships more closely. The more important a headline is, the larger its font size should be. Big bold headlines help to grab the user’s attention as they scan the Web page. The more important the headline or content, the higher up the page it should be placed. The most important or popular content should always be positioned prominently near the top of the page, so users can view it without having to
  • 67. scroll too far. Group similar content types by displaying the content in a similar visual style, or in a clearly defined area. (A) (B) FIGURE 3.11 Find the advice about prominence in each of these displays. Prose text format (A) makes people read everything. Visual hierarchy (B) lets people ignore information irrelevant to their goals. 35Visual hierarchy lets people focus on the relevant information The examples in Figure 3.11 show the value of visual hierarchy in a textual, read-only information display. Visual hierarchy is equally important in interactive control panels and forms—perhaps even more so. Compare dialog boxes from two different music software products (see Fig. 3.12). The Reharmonize dialog box of Band-in-a-Box has poor visual hierarchy, making it hard for users to find things quickly. In contrast, GarageBand’s Audio/MIDI control panel has good visual hierarchy, so users can quickly find the settings they are interested in. (A)
  • 68. (B) FIGURE 3.12 Visual hierarchy in interactive control panels and forms lets users find settings quickly: (A) Band-in-a-Box (bad) and (B) GarageBand (good). CHAPTER 3 We Seek and Use Visual Structure36 Used by permission, www.OK/Cancel.com. http://www.OK/Cancel.com Designing with the Mind in Mind. http://dx.doi.org/10.1016/B978-0-12-407914-4.00004-X © 2014 Elsevier Inc. All rights reserved. CHAPTER 37 Our Color Vision is Limited Human color perception has both strengths and limitations, many of which are rel- evant to user-interface design. For example: l Our vision is optimized to detect contrasts (edges), not absolute brightness. l Our ability to distinguish colors depends on how colors are presented.
  • 69. l Some people have color-blindness. l The user’s display and viewing conditions affect color perception. To understand these qualities of human color vision, let’s start with a brief description of how the human visual system processes color information from the environment. HOW COLOR VISION WORKS If you took introductory psychology or neurophysiology in college, you probably learned that the retina at the back of the human eye—the surface onto which the eye focuses images—has two types of light receptor cells: rods and cones. You probably also learned that the rods detect light levels but not colors, while the cones detect colors. Finally, you probably learned that there are three types of cones—sensitive to red, green, and blue light—suggesting that our color vision is similar to video cam- eras and computer displays, which detect or project a wide variety of colors through combinations of red, green, and blue pixels. What you learned in college is only partly right. People with normal vision do in fact have rods and three types of cones1 in their retinas. The rods are sensitive to overall brightness while the three types of cones are sensitive to different 1 People with color-blindness may have fewer than three, and
  • 70. some women have four, cone types (Eagleman, 2012). 4 CHAPTER 4 Our Color Vision is Limited38 frequencies of light. But that is where the truth departs from what most people learned in college, until recently. First, those of us who live in industrialized societies hardly use our rods at all. They function only at low levels of light. They are for getting around in poorly lighted envi- ronments—the environments our ancestors lived in until the nineteenth century. Today, we use our rods only when we are having dinner by candlelight, feeling our way around our dark house at night, camping outside after dark, etc. (see Chapter 5). In bright daylight and modern artificially lighted environments— where we spend most of our time—our rods are completely maxed out, providing no useful information. Most of the time, our vision is based entirely on input from our cones (Ware, 2008). So how do our cones work? Are the three types of cones sensitive to red, green, and blue light, respectively? In fact, each type of cone is sensitive to a wider range of light frequencies than you might expect, and the sensitivity ranges of the three types
  • 71. overlap considerably. In addition, the overall sensitivity of the three types of cones differs greatly (see Fig. 4.1A): l Low frequency. These cones are sensitive to light over almost the entire range of visible light, but are most sensitive to the middle (yellow) and low (red) frequencies. l Medium frequency. These cones respond to light ranging from the high-fre- quency blues through the lower middle-frequency yellows and oranges. Over- all, they are less sensitive than the low-frequency cones. l High frequency. These cones are most sensitive to light at the upper end of the visible light spectrum—violets and blues—but they also respond weakly to middle frequencies, such as green. These cones are much less sensitive overall than the other two types of cones, and also less numerous. One result is that our eyes are much less sensitive to blues and violets than to other colors. Compare a graph of the light sensitivity of our retinal cone cells (Fig. 4.1A) to what the graph might look like if electrical engineers had designed our retinas as a mosaic of receptors sensitive to red, green, and blue, like a camera (Fig. 4.1B). 1.0
  • 72. 0.8 0.6 0.4 0.2 (B)(A) 400 500 600 700 Wavelength (nanometers) L M H 400 500 600 700 Wavelength (nanometers) 0.2 0.4 0.6 0.8 1.0 R e
  • 73. at ve a bs or ba nc e FIGURE 4.1 Sensitivity of the three types of retinal cones (A) versus artificial red, green, and blue receptors (B). 39Vision is optimized for contrast, not brightness Given the odd relationships among the sensitivities of our three types of retinal cone cells, one might wonder how the brain combines the signals from the cones to allow us to see a broad range of colors. The answer is by subtraction. Neurons in the visual cortex at the back of our brain subtract the signals coming over the optic nerves from the medium- and low- frequency cones, producing a red–green difference signal channel. Other neurons in the visual cortex subtract the signals from the high- and low- frequency cones, yield-
VISION IS OPTIMIZED FOR CONTRAST, NOT BRIGHTNESS

All this subtraction makes our visual system much more sensitive to differences in color and brightness—that is, to contrasting colors and edges—than to absolute brightness levels. To see this, look at the inner bar in Figure 4.2. The inner bar looks darker on the right, but in fact is one solid shade of gray. To our contrast-sensitive visual system, it looks lighter on the left and darker on the right because the outer rectangle is darker on the left and lighter on the right.

FIGURE 4.2
The inner gray bar looks darker on the right, but in fact is all one shade of gray.

The sensitivity of our visual system to contrast rather than to absolute brightness is an advantage: it helped our distant ancestors recognize a leopard in the nearby bushes as the same dangerous animal whether they saw it in bright noon sunlight or in the early morning hours of a cloudy day. Similarly, being sensitive to color contrasts rather than to absolute colors allows us to see a rose as the same red whether it is in the sun or the shade.

Brain researcher Edward H. Adelson at the Massachusetts Institute of Technology developed an outstanding illustration of our visual system's insensitivity to absolute brightness and its sensitivity to contrast (see Fig. 4.3). As difficult as it may be to believe, square A on the checkerboard is exactly the same shade as square B. Square B only appears white because it is depicted as being in the cylinder's shadow.

FIGURE 4.3
The squares marked A and B are the same gray. We see B as white because it is depicted as being in the cylinder's shadow.

THE ABILITY TO DISCRIMINATE COLORS DEPENDS ON HOW COLORS ARE PRESENTED

Even our ability to detect differences between colors is limited. Because of how our visual system works, three presentation factors affect our ability to distinguish colors from each other:

• Paleness. The paler (less saturated) two colors are, the harder it is to tell them apart (see Fig. 4.4A).
• Color patch size. The smaller or thinner objects are, the harder it is to distinguish their colors (see Fig. 4.4B). Text is often thin, so the exact color of text is often hard to determine.
• Separation. The more separated color patches are, the more difficult it is to distinguish their colors, especially if the separation is great enough to require eye motion between patches (see Fig. 4.4C).

Several years ago, the online travel website ITN.net used two pale colors—white and pale yellow—to indicate which step of the reservation process the user was on (see Fig. 4.5). Some site visitors couldn't see which step they were on.
FIGURE 4.4
Factors affecting ability to distinguish colors: (A) paleness, (B) size, and (C) separation.

FIGURE 4.5
The pale color marking the current step makes it hard for users to see which step in the airline reservation process they are on in ITN.net's 2003 website.

FIGURE 4.6
Tiny color patches in this chart legend are hard to distinguish. [The original shows a line chart of series S1, S6, S11, and S16 with very small legend swatches.]

Small color patches are often seen in data charts and plots. Many business graphics packages produce legends on charts and plots, but make the color patches in the legend very small (see Fig. 4.6). Color patches in chart legends should be large to help people distinguish the colors (see Fig. 4.7).

FIGURE 4.7
Large color patches make it easier to distinguish the colors. [The original shows a bar chart of food categories: Beverages, Condiments, Confections, Dairy products, Grains/Cereals, Meat/Poultry, Produce, Seafood.]

On websites, a common use of color is to distinguish unfollowed links from already followed ones. On some sites, the "followed" and "unfollowed" colors are too similar. The website of the Federal Reserve Bank of Minneapolis (see Fig. 4.8) has this problem. Furthermore, the two colors are shades of blue, the color range in which our eyes are least sensitive. Can you spot the two followed links?

3 Already followed links in Figure 4.8: Housing Units Authorized and House Price Index.

FIGURE 4.8
The difference in color between visited and unvisited links is too subtle in MinneapolisFed.org's website.

COLOR-BLINDNESS

A fourth factor of color presentation that affects design principles for interactive systems is whether the colors can be distinguished by people who have common types of color-blindness. Having color-blindness doesn't mean an inability to see colors. It just means that one or more of the color subtraction channels (see the "How Color Vision Works" section) don't function normally, making it difficult to distinguish certain pairs of colors. Approximately 8% of men and slightly under 0.5% of women have a color perception deficit: difficulty discriminating certain pairs of colors (Wolfmaier, 1999). The most common type of color-blindness is red–green; other types are much rarer. Figure 4.9 shows color pairs that people with red–green color-blindness have trouble distinguishing.
  • 81. mon type of color-blindness is red–green; other types are much rarer. Figure 4.9 shows color pairs that people with red–green color-blindness have trouble distinguishing. (A) (C) (B) FIGURE 4.9 Red–green color-blind people can’t distinguish (A) dark red from black, (B) blue from purple, and (C) light green from white. FIGURE 4.10 MoneyDance’s graph uses colors some users can’t distinguish. CHAPTER 4 Our Color Vision is Limited44 FIGURE 4.11 MoneyDance’s graph rendered in grayscale. (A) (B) FIGURE 4.12 Google logo: (A) normal and (B) after red–green color- blindness filter.
  • 82. The home finance application MoneyDance provides a graphical breakdown of household expenses, using color to indicate the various expense categories (see Fig. 4.10). Unfortunately, many of the colors are hues that color- blind people cannot tell apart. For example, people with red–green color-blindness cannot distinguish the blue from the purple or the green from the khaki. If you are not color-blind, you can get an idea of which colors in an image will be hard to distinguish by converting the image to grayscale (see Fig. 4.11), but, as described in the “Guidelines for Using Color” section later in this chapter, it is best to run the image through a color-blind- ness filter or simulator (see Fig. 4.12). EXTERNAL FACTORS THAT INFLUENCE THE ABILITY TO DISTINGUISH COLORS Factors concerning the external environment also impact people’s ability to distin- guish colors. For example: l Variation among color displays. Computer displays vary in how they dis- play colors, depending on their technologies, driver software, or color settings. 45Guidelines for using color Even monitors of the same model with the same settings may display colors slightly differently. Something that looks yellow on one display
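For designers who want to automate the grayscale check, the sketch below converts an image to grayscale in the browser. It is a minimal sketch, not a color-blindness simulator: it uses the standard 2D canvas API with the common Rec. 601 luminance weights, which, fittingly, give the blue channel very little weight.

// Sketch: render an image in grayscale to preview which colors may be hard
// to distinguish. Uses the standard 2D canvas API with the common Rec. 601
// luminance weights; note how little weight blue carries.
function toGrayscale(canvas: HTMLCanvasElement): void {
  const ctx = canvas.getContext("2d");
  if (!ctx) return;
  const image = ctx.getImageData(0, 0, canvas.width, canvas.height);
  const px = image.data; // RGBA bytes, 4 per pixel
  for (let i = 0; i < px.length; i += 4) {
    const y = 0.299 * px[i] + 0.587 * px[i + 1] + 0.114 * px[i + 2];
    px[i] = px[i + 1] = px[i + 2] = y; // replace R, G, B with luminance
  }
  ctx.putImageData(image, 0, 0);
}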
EXTERNAL FACTORS THAT INFLUENCE THE ABILITY TO DISTINGUISH COLORS

Factors concerning the external environment also impact people's ability to distinguish colors. For example:

• Variation among color displays. Computer displays vary in how they display colors, depending on their technologies, driver software, or color settings. Even monitors of the same model with the same settings may display colors slightly differently. Something that looks yellow on one display may look beige on another. Colors that are clearly different on one may look the same on another.
• Grayscale displays. Although most displays these days are color, there are devices, especially small handheld ones, with grayscale displays. For instance, Figure 4.11 shows that a grayscale display can make areas of different colors look the same.
• Display angle. Some computer displays, particularly LCD ones, work much better when viewed straight on than at an angle. When LCD displays are viewed at an angle, colors—and color differences—often are altered.
• Ambient illumination. Strong light on a display washes out colors before it washes out light and dark areas, reducing color displays to grayscale ones, as anyone who has tried to use a bank ATM in direct sunlight knows. In offices, glare and venetian blind shadows can mask color differences.

These four external factors are usually out of the software designer's control. Designers should, therefore, keep in mind that they don't have full control of users' color viewing experience. Colors that seem highly distinguishable in the development facility, on the development team's computer displays and under normal office lighting conditions, may not be as distinguishable in some of the environments where the software is used.
GUIDELINES FOR USING COLOR

In interactive software systems that rely on color to convey information, follow these five guidelines to assure that the users of the software receive the information:

1. Distinguish colors by saturation and brightness, as well as hue. Avoid subtle color differences. Make sure the contrast between colors is high (but see guideline 5). One way to test whether colors are different enough is to view them in grayscale. If you can't distinguish the colors when they are rendered in grays, they aren't different enough.
2. Use distinctive colors. Recall that our visual system combines the signals from retinal cone cells to produce three color-opponent channels: red–green, yellow–blue, and black–white (luminance). The colors that people can distinguish most easily are those that cause a strong signal (positive or negative) on one of the three color-perception channels, and neutral signals on the other two channels. Not surprisingly, those colors are red, green, yellow, blue, black, and white (see Fig. 4.13). All other colors cause signals on more than one color channel, and so our visual system cannot distinguish them from other colors as quickly and easily as it can distinguish those six colors (Ware, 2008).
3. Avoid color pairs that color-blind people cannot distinguish. Such pairs include dark red versus black, dark red versus dark green, blue versus purple, and light green versus white. Don't use dark reds, blues, or violets against any dark colors. Instead, use dark reds, blues, and violets against light yellows and greens. Use an online color-blindness simulator to check web pages and images to see how people with various color-vision deficiencies would see them.
4. Use color redundantly with other cues. Don't rely on color alone. If you use color to mark something, mark it another way as well. Apple's iPhoto uses both color and a symbol to distinguish "smart" photo albums from regular albums (see Fig. 4.14).
5. Separate strong opponent colors. Placing opponent colors right next to or on top of each other causes a disturbing shimmering sensation, and so it should be avoided (see Fig. 4.15).

4 Search the Web for "color-blindness filter" or "color-blindness simulator."
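Guideline 1's grayscale test can be applied to individual color pairs as well as to whole images. Here is a minimal sketch, assuming the same Rec. 601 luminance weights as in the earlier grayscale example; the pass/fail threshold is an arbitrary illustrative value, not an established standard.

// Sketch: guideline 1's grayscale test for a single color pair. The 0.2
// threshold is an arbitrary illustrative value.
type RGB = [number, number, number]; // 0-255 each

function luminance([r, g, b]: RGB): number {
  return (0.299 * r + 0.587 * g + 0.114 * b) / 255; // 0 (black) to 1 (white)
}

function distinguishableInGray(a: RGB, b: RGB, threshold = 0.2): boolean {
  return Math.abs(luminance(a) - luminance(b)) >= threshold;
}

// White versus pale yellow, as in the ITN.net example: fails the test.
distinguishableInGray([255, 255, 255], [255, 255, 200]); // -> false

Applied to white and a pale yellow, the luminance difference is only about 0.02, which is one way to quantify why ITN.net's step marking was so easy to miss.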
FIGURE 4.13
The most distinctive colors: black, white, red, green, yellow, blue. Each color causes a strong signal on only one color-opponent channel.

FIGURE 4.14
Apple's iPhoto uses color plus a symbol to distinguish two types of albums.

FIGURE 4.15
Opponent colors, placed on or directly next to each other, clash.

As shown in Figure 4.5, ITN.net used only pale yellow to mark customers' current step in making a reservation, which is too subtle. A simple way to strengthen the marking would be to make the current step bold and increase the saturation of the yellow (see Fig. 4.16A). But ITN.net opted for a totally new design, which also uses color redundantly with shape (see Fig. 4.16B).

FIGURE 4.16
ITN.net's current step is highlighted in two ways: with color and shape.

A graph from the Federal Reserve Bank uses shades of gray (see Fig. 4.17). This is a well-designed graph. Any sighted person could read it.

FIGURE 4.17
MinneapolisFed.org's graph uses shade differences visible to all sighted people, on any display.

CHAPTER 5
Our Peripheral Vision is Poor

Chapter 4 explained that the human visual system differs from a digital camera in the way it detects and processes color. Our visual system also
differs from a camera in its resolution. On a digital camera's photo sensor, photoreceptive elements are spread uniformly in a tight matrix, so the spatial resolution is constant across the entire image frame. The human visual system is not like that. This chapter explains why:

• Stationary items in muted colors presented in the periphery of people's visual field often will not be noticed.
• Motion in the periphery is usually noticed.

RESOLUTION OF THE FOVEA COMPARED TO THE PERIPHERY

The spatial resolution of the human visual field drops greatly from the center to the edges. There are three reasons for this:

• Pixel density. Each eye has 6 to 7 million retinal cone cells. They are packed much more tightly in the center of our visual field—a small region called the fovea—than they are at the edges of the retina (see Fig. 5.1). The fovea has about 158,000 cone cells in each square millimeter. The rest of the retina has only 9,000 cone cells per square millimeter.
• Data compression. Cone cells in the fovea connect 1:1 to the ganglion cells that begin the processing and transmission of visual data, while elsewhere on the retina, multiple photoreceptor cells (cones and rods) connect to each ganglion cell. In technical terms, information from the visual periphery is compressed (with data loss) before transmission to the brain, while information from the fovea is not.
• Processing resources. The fovea is only about 1% of the retina, but the brain's visual cortex devotes about 50% of its area to input from the fovea. The other half of the visual cortex processes data from the remaining 99% of the retina.

The result is that our vision has much, much greater resolution in the center of our visual field than elsewhere (Lindsay and Norman, 1972; Waloszek, 2005). Said in developer jargon: in the center 1% of your visual field (i.e., the fovea), you have a high-resolution TIFF, and everywhere else, you have only a low-resolution JPEG. That is nothing like a digital camera.

To visualize how small the fovea is compared to your entire visual field, hold your arm straight out and look at your thumb. Your thumbnail, viewed at arm's length, corresponds approximately to the fovea (Ware, 2008). While you have your eyes focused on the thumbnail, everything else in your visual field falls outside of your fovea on your retina.

In the fovea, people with normal vision have very high resolution: they can resolve several thousand dots within that region—better resolution than many of today's pocket digital cameras. Just outside of the fovea, the resolution is already down to a few dozen dots per inch viewed at arm's length. At the edges of our vision, the "pixels" of our visual system are as large as a melon (or human head) at arm's length (see Fig. 5.2).

Even though our eyes have more rods than cones—125 million versus 6–7 million—peripheral vision has much lower resolution than foveal vision. This is because while most of our cone cells are densely packed in the fovea (1% of the retina's area), the rods are spread out over the rest of the retina (99% of the retina's area). In people with normal vision, peripheral vision is about 20/200, which in the United States is considered legally blind. Think about that: in the periphery of your visual field, you are legally blind.

FIGURE 5.1
Distribution of photoreceptor cells (cones and rods) across the retina. From Lindsay and Norman (1972). [Plot: number of receptors per square millimeter against angle from the fovea in degrees; the blind spot appears as a gap in the curves.]

Here is how brain researcher David Eagleman (2012; page 23) describes it:

The resolution in your peripheral vision is roughly equivalent to looking through a frosted shower door, and yet you enjoy the illusion of seeing the periphery clearly. … Wherever you cast your eyes appears to be in sharp focus, and therefore you assume the whole visual world is in focus.

If our peripheral vision has such low resolution, one might wonder why we don't see the world in a kind of tunnel vision where everything is out of focus except what we are directly looking at now. Instead, we seem to see our surroundings sharply and clearly all around us. We experience this illusion because our eyes move rapidly and constantly about three times per second even when we don't realize it, focusing our fovea on selected pieces of our environment. Our brain fills in the rest in a gross, impressionistic way based on what we know and expect. Our brain does not have to maintain a high-resolution mental model of our environment because it can order the eyes to sample and resample details in the environment as needed (Clark, 1998).

For example, as you read this page, your eyes dart around, scanning and reading. No matter where on the page your eyes are focused, you have the impression of viewing a complete page of text, because, of course, you are.

1 Our brains also fill in perceptual gaps that occur during rapid (saccadic) eye movements, when vision is suppressed (see Chapter 14).

FIGURE 5.2
The resolution of our visual field is high in the center but much lower at the edges. Right image from Vision Research, Vol. 14 (1974), Elsevier.
But now, imagine that you are viewing this page on a computer screen, and the computer is tracking your eye movements and knows where your fovea is on the page. Imagine that wherever you look, the right text for that spot on the page is shown clearly in the small area corresponding to your fovea, but everywhere else on the page, the computer shows random, meaningless text. As your fovea flits around the page, the computer quickly updates each area where your fovea stops to show the correct text there, while the last position of your fovea returns to textual noise. Amazingly, experiments have shown that people rarely notice this: not only can they read, they believe that they are viewing a full page of meaningful text (Clark, 1998). However, it does slow people's reading, even if they don't realize it (Larson, 2004).

The fact that retinal cone cells are distributed tightly in and near the fovea, and sparsely in the periphery of the retina, affects not only spatial resolution but color resolution. We can discriminate colors better in the center of our visual field than at the edges.

Another interesting fact about our visual field is that it has a gap—a small area (blind spot) in which we see nothing. The gap corresponds to the spot on our retina where the optic nerve and blood vessels exit the back of the eye (see Fig. 5.1). There are no retinal rod or cone cells at that spot, so when the image of an object in our visual field happens to fall on that part of the retina, we don't see it. We usually don't notice this hole in our vision because our brain fills it in with the surrounding content, like a graphic artist using Photoshop to fill in a blemish on a photograph by copying nearby background pixels.

People sometimes experience the blind spot when they gaze at stars. As you look at one star, a nearby star may disappear briefly into the blind spot until you shift your gaze. You can also observe the gap by trying the exercise in Figure 5.3. Some people have other gaps resulting from imperfections on the retina, retinal damage, or brain strokes that affect the visual cortex, but the optic nerve gap is an imperfection everyone shares.

2 See VisionSimulations.com.

FIGURE 5.3
To "see" the retinal gap, cover your left eye, hold this book near your face, and focus your right eye on the +. Move the book slowly away from you, staying focused on the +. The @ will disappear at some point.

IS THE VISUAL PERIPHERY GOOD FOR ANYTHING?

It seems that the fovea is better than the periphery at just about everything. One might wonder why we have peripheral vision. What is it good for? Our peripheral vision serves three important functions: it guides the fovea, detects motion, and lets us see better in the dark.

Function 1: Guides fovea

First, peripheral vision provides low-resolution cues to guide our eye movements so that our fovea visits all the interesting and crucial parts of our visual field. Our eyes don't scan our environment randomly. They move so as to focus our fovea on important things, the most important ones (usually) first. The fuzzy cues on the outskirts of our visual field provide the data that helps our brain plan where to move our eyes, and in what order.

For example, when we scan a medicine label for a "use by" date, a fuzzy blob in the periphery with the vague form of a date is enough to cause an eye movement that lands the fovea there to allow us to check it. If we are browsing a produce market looking for strawberries, a blurry reddish patch at the edge of our visual field draws our eyes and our attention, even though sometimes it may turn out to be radishes instead of strawberries. If we hear an animal growl nearby, a fuzzy animal-like shape in the corner of our eye will be enough to zip our eyes in that direction, especially if the shape is moving toward us (see Fig. 5.4).

How peripheral vision guides and augments central, foveal vision is discussed more in the "Visual Search Is Linear Unless Targets 'Pop' in the Periphery" section later in this chapter.

FIGURE 5.4
A moving shape at the edge of our vision draws our eye: it could be food, or it might consider us food.

Function 2: Detects motion

A related guiding function of peripheral vision is that it is good at detecting motion. Anything that moves in our visual periphery, even slightly, is likely to draw our attention—and hence our fovea—toward it. The reason for this phenomenon is that our ancestors—including prehuman ones—were selected for their ability to spot food and avoid predators. As a result, even though we can move our eyes under conscious, intentional control, some of the mechanisms that control where they look are preconscious, involuntary, and very fast.

What if we have no reason to expect that there might be anything interesting in a certain spot in the periphery, and nothing in that spot attracts our attention? Our
eyes may never move our fovea to that spot, so we may never see what is there.

3 See Chapter 1 on how expectations bias our perceptions.

Function 3: Lets us see better in the dark

A third function of peripheral vision is to allow us to see in low-light conditions—for example, on starlit nights, in caves, around campfires, etc. These were conditions under which vision evolved, and in which people—like the animals that preceded them on Earth—spent much of their time until the invention of the electric light bulb in the 1800s. Just as the rods are overloaded in well-lighted conditions (see Chapter 4), the cones don't function very well in low light, so our rods take over. Low-light, rods-only vision is called scotopic vision. An interesting fact is that because there are no rods in the fovea, you can see objects better in low-light conditions (e.g., faint stars) if you don't look directly at them.

EXAMPLES FROM COMPUTER USER INTERFACES

The low acuity of our peripheral vision explains why software and website users fail to notice error messages in some applications and websites. When someone clicks a button or a link, that is usually where his or her fovea is positioned. Everything on the screen that is not within 1–2 centimeters of the click location (assuming normal computer viewing distance) is in peripheral vision, where resolution is low. If, after the click, an error message appears in the periphery, it should not be surprising that the person might not notice it.

For example, at InformaWorld.com, the online publications website of Informa Healthcare, if a user enters an incorrect username or password and clicks "Sign In," an error message appears in a "message bar" far away from where the user's eyes are most likely focused (see Fig. 5.5). The red word "Error" might appear in the user's peripheral vision as a small reddish blob, which would help draw the eyes in that direction. However, the red blob could fall into a gap in the viewer's visual field, and so not be noticed at all.

Consider the sequence of events from a user's point of view. The user enters a username and password and then clicks "Sign In." The page redisplays with blank fields. The user thinks "Huh? I gave it my login information and hit 'Sign In,' didn't I? Did I hit the wrong button?" The user reenters the username and password, and clicks "Sign In" again. The page redisplays with empty fields again. Now the user is really confused. The user sighs (or curses), sits back in his chair and lets his eyes scan the screen. Suddenly noticing the error message, the user says "A-ha! Has that error message been there all along?"

Even when an error message is placed nearer to the center of the viewer's visual field than in the preceding example, other factors can diminish its visibility. For example, until recently the website of Airborne.com signaled a login failure by displaying an error message in red just above the Login ID field (see Fig. 5.6). This error message is entirely in red and fairly near the "Login" button where the user's eyes are probably focused. Nonetheless, some users would not notice this error message when it first appeared.

Can you think of any reasons people might not initially see this error message? One reason is that even though the error message is much closer to where users will be looking when they click the "Login" button, it is still in the periphery, not in the fovea. The fovea is small: just a centimeter or two on a computer screen, assuming the user is the usual distance from the screen. A second reason is that the error message is not the only thing near the top of the page that is red. The page title is also red. Resolution in the periphery is low, so when the error message appears, the user's visual system may not register any change: there was something red up there before, and there still is (see Fig. 5.7).
If the page title were black or any other color besides red, the red error message would be more likely to be noticed, even though it appears in the periphery of the users' visual field.

FIGURE 5.5
This error message for a faulty sign-in appears in peripheral vision, where it will probably be missed. [The figure marks the positions of the error message and the user's fovea.]

FIGURE 5.6
This error message for a faulty login is missed by some users even though it is not far from the "Login" button.

FIGURE 5.7
Simulation of a user's visual field while the fovea is fixed on the "Login" button.

COMMON METHODS OF MAKING MESSAGES VISIBLE

There are several common and well-known methods of ensuring that an error message will be seen:

• Put it where users are looking. People focus in predictable places when interacting with graphical user interfaces (GUIs). In Western societies, people tend to traverse forms and control panels from upper left to lower right. While moving the screen pointer, people usually look either at where it is or where they are moving it to. When people click a button or link, they can usually be assumed to be looking directly at it, at least for a few moments afterward. Designers can use this predictability to position error messages near where they expect users to be looking.
• Mark the error. Somehow mark the error prominently to indicate clearly that something is wrong. Often this can be done by simply placing the error message near what it refers to, unless that would place the message too far from where users are likely to be looking.
• Use an error symbol. Make errors or error messages more visible by marking them with an error symbol.
• Reserve red for errors. By convention, in interactive computer systems the color red connotes alert, danger, problem, error, etc. Using red for any other information on a computer display invites misinterpretation. But suppose you are designing a website for Stanford University, which has red as its school color. Or suppose you are designing for a Chinese market, where red is considered an auspicious, positive color. What do you do? Use another color for errors, mark them with error symbols, or use stronger methods (see the next section).
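A minimal sketch combining the first three methods: the message appears right next to the offending field, in red, marked with a symbol. The element lookup, class name, and the warning-sign character standing in for an error icon are assumptions about the host page, not a prescribed implementation.

// Sketch: combine "put it where users are looking," "mark the error," and
// "use an error symbol." The element id and class name are assumptions.
function showFieldError(field: HTMLElement, text: string): void {
  const msg = document.createElement("span");
  msg.className = "field-error";
  msg.style.color = "red";            // reserve red for errors
  msg.textContent = "\u26A0 " + text; // symbol marks the message as an error
  // Place the message right after the field, in or near the user's fovea.
  field.insertAdjacentElement("afterend", msg);
}

const passwordField = document.getElementById("password"); // assumed id
if (passwordField) {
  showFieldError(passwordField, "Invalid username or password.");
}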
  • 104. the user is still focused on that part of the form, rather than only after the user submits the form. It is unlikely that AOL users will miss seeing these error messages. HEAVY ARTILLERY FOR MAKING USERS NOTICE MESSAGES If the common, conventional methods of making users notice messages are not enough, three stronger methods are available to user-interface designers: pop-up mes- sage in error dialog box, use of sound (e.g., beep), and wiggle or blink briefly. How- ever, these methods, while very effective, have significant negative effects, so they should be used sparingly and with great care. Method 1: Pop-up message in error dialog box Displaying an error message in a dialog box sticks it right in the user’s face, making it hard to miss. Error dialog boxes interrupt the user’s work and demand immediate attention. That is good if the error message signals a critical condition, but it can annoy people if such an approach is used for a minor message, such as confirming the execu- tion of a user-requested action. The annoyance of pop-up messages rises with the degree of modality. Nonmodal pop-ups allow users to ignore them and continue working. Application-modal pop- ups block any further work in the application that displayed the error, but allow users to interact with other software on their computer. System- modal pop-ups
Application-modal pop-ups should be used sparingly—for example, only when application data may be lost if the user doesn't attend to the error. System-modal pop-ups should be used extremely rarely—basically only when the system is about to crash and take hours of work with it, or if people will die if the user misses the error message.

On the Web, an additional reason to avoid pop-up error dialog boxes is that some people set their browsers to block all pop-up windows. If your website relies on pop-up error messages, some users may never see them.

REI.com has an example of a pop-up dialog being used to display an error message. The message is displayed when someone who is registering as a new customer omits required fields in the form (see Fig. 5.10). Is this an appropriate use of a pop-up dialog? AOL.com (see Fig. 5.9) shows that missing data errors can be signaled quite well without pop-up dialogs, so REI.com's use of them seems a bit heavy-handed.

FIGURE 5.10
REI's pop-up dialog box signals required data that was omitted. It is hard to miss, but perhaps overkill.

Examples of more appropriate use of error dialog boxes come from Microsoft Excel (see Fig. 5.11A) and Adobe InDesign (see Fig. 5.11B). In both cases, loss of data is at stake.

FIGURE 5.11
Appropriate pop-up error dialogs: (A) Microsoft Excel and (B) Adobe InDesign.

Method 2: Use sound (e.g., beep)

When a computer beeps, that tells its user something has happened that requires attention. The person's eyes reflexively begin scanning the screen for whatever caused the beep. This can allow the user to notice an error message that is someplace other than where the user was just looking, such as in a standard error message box on the display. That is the value of beeping.

However, imagine many people in a cubicle work environment or a classroom, all using an application that signals all errors and warnings by beeping. Such a workplace would be very annoying, to say the least. Worse, people wouldn't be able to tell whether their own computer or someone else's was beeping.

The opposite situation is noisy work environments (e.g., factories or computer server rooms), where auditory signals emitted by an application might be masked by ambient noise. Even in non-noisy environments, some computer users simply prefer quiet, and mute the sound on their computers or turn it way down. For these reasons, signaling errors and other conditions with sound is a remedy that can be used only in very special, controlled situations.

Computer games often use sound to signal events and conditions. In games, sound isn't annoying; it is expected. Its use in games is widespread, even in game arcades, where dozens of machines are all banging, roaring, buzzing, clanging, beeping, and playing music at once. (Well, it is annoying to parents who have to go into the arcades and endure all the screeching and booming to retrieve their kids, but the games aren't designed for parents.)

Method 3: Wiggle or blink briefly

As described earlier in this chapter, our peripheral vision is good at detecting motion, and motion in the periphery causes reflexive eye movements that bring the motion into the fovea. User-interface designers can make use of this by wiggling or flashing
messages briefly when they want to ensure that users see them. It doesn't take much motion to trigger eye movement toward the motion. Just a tiny bit of motion is enough to make a viewer's eyes zip over in that direction. Millions of years of evolution have had quite an effect.

As an example of using motion to attract users' eye attention, Apple's iCloud online service briefly shakes the entire dialog box horizontally when a user enters an invalid username or password (see Fig. 5.12). In addition to clearly indicating "No" (like a person shaking his head), this attracts the users' eyeballs, guaranteed. (Because, after all, the motion in the corner of your eye might be a leopard.)

FIGURE 5.12
Apple's iCloud shakes the dialog box briefly on login errors to attract a user's fovea toward it.
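A shake like iCloud's can be approximated with the Web Animations API. The 10-pixel amplitude and 300-millisecond duration below are assumptions, chosen to stay within the "brief" guidance given later in this section.

// Sketch: briefly shake an element horizontally, in the spirit of the
// iCloud example, using the Web Animations API. Amplitude and duration
// are assumptions; keep such motion brief.
function shake(el: HTMLElement): void {
  el.animate(
    [
      { transform: "translateX(0)" },
      { transform: "translateX(-10px)" },
      { transform: "translateX(10px)" },
      { transform: "translateX(-6px)" },
      { transform: "translateX(6px)" },
      { transform: "translateX(0)" },
    ],
    { duration: 300, easing: "ease-out" }
  );
}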
The most common use of blinking in computer user interfaces (other than advertisements) is in menu bars. When an action (e.g., Edit or Copy) is selected from a menu, it usually blinks once before the menu closes to confirm that the system "got" the command—that is, that the user didn't miss the menu item. This use of blinking is very common. It is so quick that most computer users aren't even aware of it, but if menu items didn't blink once, we would have less confidence that we actually selected them.

Motion and blinking, like pop-up dialog boxes and beeping, must be used sparingly. Most experienced computer users consider wiggling, blinking objects on screen to be annoying. Most of us have learned to ignore displays that blink because many such displays are advertisements. Conversely, a few computer users have attentional impairments that make it difficult for them to ignore something that is blinking or wiggling. Therefore, if wiggling or blinking is used, it should be brief—it should last about a quarter- to a half-second, no longer. Otherwise, it quickly goes from an unconscious attention-grabber to a conscious annoyance.

Use heavy-artillery methods sparingly to avoid habituating your users

There is one final reason to use the preceding heavy-artillery methods sparingly (i.e., only for critical messages): to avoid habituating your users. When pop-ups, sound, motion, and blinking are used too often to attract users' attention, a psychological phenomenon called habituation sets in (see Chapter 1). Our brain pays less and less attention to any stimulus that occurs frequently.

It is like the old fable of the boy who cried "Wolf!" too often: eventually, the villagers learned to ignore his cries, so when a wolf actually did come, his cries went unheeded. Overuse of strong attention-getting methods can cause important messages to be blocked by habituation.

VISUAL SEARCH IS LINEAR UNLESS TARGETS "POP" IN THE PERIPHERY

As explained earlier, one function of peripheral vision is to drive our eyes to focus the fovea on important things—things we are seeking or that might be a threat. Objects moving in our peripheral vision fairly reliably "yank" our eyes in that direction. When we are looking for an object, our entire visual system, including the periphery, primes itself to detect that object. In fact, the periphery is a crucial component in visual search, despite its low spatial and color resolution. However, just how helpful the periphery is in aiding visual search depends strongly on what we are looking for.

Look quickly at Figure 5.13 and find the Z.

FIGURE 5.13
Finding the Z requires scanning carefully through the characters.

To find the Z, you had to scan carefully through the characters until your fovea landed on it. In the lingo of vision researchers, the time to find the Z is linear: it depends approximately linearly on the number of distracting characters and the position of the Z among them.

Now look quickly at Figure 5.14 and find the bold character.

FIGURE 5.14
Finding the bold letter does not require scanning through everything.

That was much easier (i.e., faster), wasn't it? You didn't have to scan your fovea carefully through the distracting characters. Your periphery quickly detected the boldness and determined its location, and because that is what you were seeking,
  • 111. However, just how helpful the periphery is in aiding visual search depends strongly on what we are looking for. Look quickly at Figure 5.13 and find the Z. To find the Z, you had to scan carefully through the characters until your fovea landed on it. In the lingo of vision researchers, the time to find the Z is linear: it depends approximately linearly on the number of distracting characters and the position of the Z among them. Now look quickly at Figure 5.14 and find the bold character. That was much easier (i.e., faster), wasn’t it? You didn’t have to scan your fovea carefully through the distracting characters. Your periphery quickly detected the boldness and determined its location, and because that is what you were seeking, FIGURE 5.13 Finding the Z requires scanning carefully through the characters. FIGURE 5.14 Finding the bold letter does not require scanning through everything. 63Visual search is linear unless targets “pop” in the periphery
  • 112. your visual system moved your fovea there. Your periphery could not determine exactly what was bold—that is beyond its resolution and abilities—but it did locate the boldness. In vision-researcher lingo, the periphery was primed to look for boldness in parallel over its entire area, and boldness is a distinctive feature of the target, so search- ing for a bold target is nonlinear. In designer lingo, we simply say that boldness “pops out” (“pops” for short) in the periphery, assuming that only the target is bold. Color “pops” even more strongly. Compare counting the L’s in Figure 5.15 with counting the blue characters in Figure 5.16. What else makes things “pop” in the periphery? As described earlier, the periph- ery easily detects motion, so motion “pops.” Generalizing from boldness, we also can say that font weight “pops,” because if all but one of the characters on a display were bold, the nonbold character would stand out. Basically, a visual target will pop out in your periphery if it differs from surrounding objects in features the periphery can detect. The more distinctive features of the target, the more it “pops,” assuming the periphery can detect those features. Using peripheral “pop” in design Designers use peripheral “pop” to focus the attention of a product’s users, as well as to allow users to find information faster. Chapter 3 described
  • 113. how visual hierarchy— titles, headings, boldness, bullets, and indenting—can make it easier for users to spot FIGURE 5.15 Counting L’s is hard; character shape doesn’t “pop” among characters. FIGURE 5.16 Counting blue characters is easy because color “pops.” CHAPTER 5 Our Peripheral Vision is Poor64 and extract from text the information they need. Glance back at Figure 3.11 in Chap- ter 3 and see how the headings and bullets make the topics and subtopics “pop” so readers can go right to them. Many interactive systems use color to indicate status, usually reserving red for problems. Online maps and some vehicle GPS devices mark traffic jams with red so they stand out (see Fig. 5.17). Systems for controlling air traffic mark potential colli- sions in red (see Fig. 5.18). Applications for monitoring servers and networks use color to show the health status of assets or groups of them (see Fig. 5.19). These are all uses of peripheral “pop” to make important information stand out
  • 114. and visual search nonlinear. When there are many possible targets Sometimes in displays of many items, any of them could be what the user wants. Examples include command menus (see Fig. 5.20A) and object pallets (see Fig. 5.20B). Let’s assume that the application cannot anticipate which item or items a user is likely to want, and highlight those. That is a fair assumption for today’s applications.4 Are users doomed to have to search linearly through such displays for the item they want? That depends. Designers can try to make each item so distinctive that when a specific one is the user’s target, the user’s peripheral vision will be able to spot it among 4 But in the not-too-distant future it might not be. FIGURE 5.17 Google Maps uses color to show traffic conditions. Red indicates traffic jams. 65Visual search is linear unless targets “pop” in the periphery FIGURE 5.18 Air traffic control systems often use red to make potential collisions stand out. FIGURE 5.19
  • 115. Paessler’s monitoring tool uses color to show the health of network components. CHAPTER 5 Our Peripheral Vision is Poor66 all the other items. Designing distinctive sets of icons is hard— especially when the set is large—but it can be done (see Johnson et. al, 1989). Designing sets of icons that are so distinctive that they can be distinguished in peripheral vision is very hard, but not impossible. For example, if a user goes to the Mac OS application pallet to open his or her calendar, a white rectangular blob in the periphery with something black in the middle is more likely to attract the user’s eye than a blue circular blob (see Fig. 5.20B). The trick is not to get too fancy and detailed with the icons—give each one a distinctive color and gross shape. On the other hand, if the potential targets are all words, as in command menus (see Fig. 20A), visual distinctiveness is not an option. In textual menus and lists, visual search will be linear, at least at first. With practice, users learn the positions of frequently used items in menus, lists, and pallets, so searching for particular items is no longer linear. That is why applications should never move items around in menus, lists, or
  • 116. pallets. Doing that prevents users from learning item positions, thereby dooming them to search linearly forever. Therefore, “dynamic menus” is considered a major user-interface design blooper ( Johnson, 2007). FIGURE 5.20 (A) Microsoft Word Tools menu, and (B) MacOS application pallet. Designing with the Mind in Mind. http://dx.doi.org/10.1016/B978-0-12-407914-4.00006-3 © 2014 Elsevier Inc. All rights reserved. CHAPTER 67 Reading is Unnatural 6 Most people in industrialized nations grow up in households and school districts that promote education and reading. They learn to read as young children and become good readers by adolescence. As adults, most of our activities during a normal day involve reading. The process of reading—deciphering words into their meaning—is for most educated adults automatic, leaving our conscious minds free to ponder the meaning and implications of what we are reading. Because of this background, it is common for
good readers to consider reading to be as "natural" a human activity as speaking is.

WE'RE WIRED FOR LANGUAGE, BUT NOT FOR READING

Speaking and understanding spoken language is a natural human ability, but reading is not. Over hundreds of thousands—perhaps millions—of years, the human brain evolved the neural structures necessary to support spoken language. As a result, normal humans are born with an innate ability to learn as toddlers, with no systematic training, whatever language they are exposed to. After early childhood, this ability decreases significantly. By adolescence, learning a new language is the same as learning any other skill: it requires instruction and practice, and the learning and processing are handled by different brain areas from those that handled it in early childhood (Sousa, 2005).

In contrast, writing and reading did not exist until a few thousand years BCE and did not become common until only four or five centuries ago—long after the human brain had evolved into its modern state. At no time during childhood do our brains show any special innate ability to learn to read. Instead, reading is an artificial skill that we learn by systematic instruction and practice, like playing a violin, juggling, or reading music (Sousa, 2005).

Many people never learn to read well, or at all

Because people are not innately "wired" to learn to read, children who either lack caregivers who read to them or who receive inadequate reading instruction in school may never learn to read. There are a great many such people, especially in the developing world. By comparison, very few people never learn a spoken language.

For a variety of reasons, some people who learn to read never become good at it. Perhaps their parents did not value and promote reading. Perhaps they attended substandard schools or didn't attend school at all. Perhaps they learned a second language but never learned to read well in that language. People who have cognitive or perceptual impairments such as dyslexia may never read easily.

A person's ability to read is specific to a language and a script (a system of writing). To see what text looks like to someone who cannot read, just look at a paragraph printed in a language and script that you do not know (see Fig. 6.1).

FIGURE 6.1
To see how it feels to be illiterate, look at text printed in a foreign script: (A) Amharic and (B) Tibetan.

Alternatively, you can approximate the feeling of illiteracy by taking a page written in a familiar script and language—such as a page of this book—and turning it upside down. Turn this book upside down and try reading the next few paragraphs. This exercise only approximates the feeling of illiteracy. You will discover that the inverted text appears foreign and illegible at first, but after a minute you will be able to read it, albeit slowly and laboriously.

Learning to read = training our visual system

Learning to read involves training our visual system to recognize patterns—the patterns exhibited by text. These patterns run a gamut from low level to high level:

• Lines, contours, and shapes are basic visual features that our brain recognizes innately. We don't have to learn to recognize them.
• Basic features combine to form patterns that we learn to identify as characters—letters, numeric digits, and other standard symbols. In ideographic scripts, such as Chinese, symbols represent entire words or concepts.
• In alphabetic scripts, patterns of characters form morphemes, which we learn to recognize as packets of meaning—for example, "farm," "tax," "-ed," and "-ing" are morphemes in English.
• Morphemes combine to form patterns that we recognize as words—for example, "farm," "tax," "-ed," and "-ing" can be combined to form the words "farm," "farmed," "farming," "tax," "taxed," and "taxing." Even ideographic scripts include symbols that serve as morphemes or modifiers of meaning rather than as words or concepts.
• Words combine to form patterns that we learn to recognize as phrases, idiomatic expressions, and sentences.
• Sentences combine to form paragraphs.

Actually, only part of our visual system is trained to recognize the textual patterns involved in reading: the fovea and a small area immediately surrounding it (known as the perifovea), and the downstream neural networks running through the optic nerve to the visual cortex and into various parts of our brain. The neural networks starting elsewhere in our retinas do not get trained to read. More about this is explained later in the chapter.

Learning to read also involves training the brain's systems that control eye movement to move our eyes in a specific way over text. The main direction of eye movement depends on the direction in which the language we are reading is written: European language scripts are read left to right, many Middle Eastern language scripts are read right to left, and some language scripts are read top to bottom. Beyond that, the precise eye movements differ depending on whether we are reading, skimming for overall meaning, or scanning for specific words.

How we read

Assuming our visual system and brain have successfully been trained, reading becomes semi-automatic or fully automatic—both the eye movement and the processing. As explained earlier, the center of our visual field—the fovea and perifovea—is the only part of our visual field that is trained to read. All text that we read enters our visual system after being scanned by the central area, which means that reading requires a lot of eye movement.

As explained in the discussion of peripheral vision in Chapter 5, our eyes constantly jump around, several times a second. Each of these movements, called saccades, lasts about 0.1 second. Saccades are ballistic, like firing a shell from a cannon: their endpoint is determined when they are triggered, and once triggered, they always execute to completion. As described in earlier chapters, the destinations of saccadic eye movements are programmed by the brain from a combination of our goals, events in the visual periphery, events detected and localized by other perceptual senses, and past history including training.
  • 122. the destinations of saccadic eye movements are programmed by the brain from a combination of our goals, events in the visual periphery, events detected and localized by other percep- tual senses, and past history including training. CHAPTER 6 Reading is Unnatural70 When we read, we may feel that our eyes scan smoothly across the lines of text, but that feeling is incorrect. In reality, our eyes continue with saccades during read- ing, but the movements generally follow the line of text. They fix our fovea on a word, pause there for a fraction of a second to allow basic patterns to be captured and transmitted to the brain for further analysis, then jump to the next important word (Larson, 2004). Eye fixations while reading always land on words, usually near the center, never on word boundaries (see Fig. 6.2). Very common small connector and function words like “a,” “and,” “the,” “or,” “is,” and “but” are usually skipped over, their presence either detected in perifoveal vision or simply assumed. Most of the saccades during reading are in the text’s normal reading direction, but a few— about 10%—jump backwards to previous words. At the end of each line of text, our eyes jump to where our brain guesses the next line begins.1 How much can we take in during each eye fixation during
  • 123. reading? For reading European-language scripts at normal reading distances and text- font sizes, the fovea clearly sees 3–4 characters on either side of the fixation point. The perifovea sees out about 15–20 characters from the fixation point, but not very clearly (see Fig. 6.3). According to reading researcher Kevin Larson (2004), the reading area in and around the fovea consists of three distinct zones (for European- language scripts): Closest to the fixation point is where word recognition takes place. This zone is usu- ally large enough to capture the word being fixated, and often includes smaller function words directly to the right of the fixated word. The next zone extends a few letters past the word recognition zone, and readers gather preliminary information about the next let- ters in this zone. The final zone extends out to 15 letters past the fixation point. Informa- tion gathered out this far is used to identify the length of upcoming words and to identify the best location for the next fixation point. 1 Later we will see that centered text disrupts the brain’s guess about where the next line starts. FIGURE 6.2 Saccadic eye movements during reading jump between important words. FIGURE 6.3
  • 124. Visibility of words in a line of text, with fovea fixed on the word “years.” 71Is reading feature-driven or context-driven? Because our visual system has been trained to read, perception around the fixation point is asymmetrical: it is more sensitive to characters in the reading direction than in the other direction. For European-language scripts, this is toward the right. That makes sense because characters to the left of the fixation point have usually already been read. IS READING FEATURE-DRIVEN OR CONTEXT-DRIVEN? As explained earlier, reading involves recognizing features and patterns. Pattern rec- ognition, and therefore reading, can be either a bottom-up, feature-driven process, or a top-down, context-driven process. In feature-driven reading, the visual system starts by identifying simple features— line segments in a certain orientation or curves of a certain radius—on a page or display, and then combines them into more complex features, such as angles, multiple curves, shapes, and patterns. Then the brain recognizes certain shapes as characters or symbols representing letters, numbers, or, for ideographic scripts, words. In alphabetic scripts, groups of letters are perceived as morphemes and words. In all types of scripts, sequences
  • 125. of words are parsed into phrases, sentences, and paragraphs that have meaning. Feature-driven reading is sometimes referred to as “bottom-up” or “context-free.” The brain’s ability to recognize basic features—lines, edges, angles, etc.—is built in and therefore automatic from birth. In contrast, recognition of morphemes, words, and phrases has to be learned. It starts out as a nonautomatic, conscious process requiring conscious analysis of letters, morphemes, and words, but with enough practice it becomes automatic (Sousa, 2005). Obviously, the more common a mor- pheme, word, or phrase, the more likely that recognition of it will become auto- matic. With ideographic scripts such as Chinese, which have many times more symbols than alphabetic scripts do, people typically take many years longer to become skilled readers. Context-driven or top-down reading operates in parallel with feature-driven read- ing but it works the opposite way: from whole sentences or the gist of a paragraph down to the words and characters. The visual system starts by recognizing high- level patterns like words, phrases, and sentences, or by knowing the text’s meaning in advance. It then uses that knowledge to figure out—or guess—what the compo- nents of the high-level pattern must be (Boulton, 2009). Context-driven reading is less likely to become fully automatic because most phrase-level
  • 126. and sentence-level patterns and contexts don’t occur frequently enough to allow their recognition to become burned into neural firing patterns. But there are exceptions, such as idiom- atic expressions. To experience context-driven reading, glance quickly at Figure 6.4, then immedi- ately direct your eyes back here and finish reading this paragraph. Try it now. What did the text say? Now look at the same sentence again more carefully. Do you read it the same way now? CHAPTER 6 Reading is Unnatural72 Also, based on what we have already read and our knowledge of the world, our brains can sometimes predict text that the fovea has not yet read (or its meaning), allowing us to skip reading it. For example, if at the end of a page we read “It was a dark and stormy,” we would expect the first word on the next page to be “night.” We would be surprised if it was some other word (e.g., “cow”). Feature-driven, bottom-up reading dominates; context assists It has been known for decades that reading involves both feature-driven (bottom-up) processing and context-driven (top-down) processing. In addition to being able to
figure out the meaning of a sentence by analyzing the letters and words in it, people can determine the words of a sentence by knowing the meaning of the sentence, or the letters in a word by knowing what word it is (see Fig. 6.5). The question is: Is skilled reading primarily bottom-up or top-down, or is neither mode dominant?

Early scientific studies of reading—from the late 1800s through about 1980—seemed to show that people recognize words first and from that determine what letters are present. The theory of reading that emerged from those findings was that our visual system recognizes words primarily from their overall shape. This theory failed to account for certain experimental results and so was controversial among reading researchers, but it nonetheless gained wide acceptance among nonresearchers, especially in the graphic design field (Larson, 2004; Herrmann, 2011).

(A) Mray had a ltilte lmab, its feclee was withe as sown. And ervey wehre taht Mray wnet, the lmab was srue to go.
(B) Twinkle twinkle little star how I wonder what you are

FIGURE 6.5 Top-down reading: most readers, especially those who know the
songs from which these text passages are taken, can read these passages even though the words (A) have all but their first and last letters scrambled and (B) are mostly obscured.

The rain in Spain falls manly in the the plain

FIGURE 6.4 Top-down recognition of the expression can inhibit seeing the actual text.

Similarly, educational researchers in the 1970s applied information theory to reading, and assumed that because of redundancies in written language, top-down, context-driven reading would be faster than bottom-up, feature-driven reading. This assumption led them to hypothesize that reading for highly skilled (fast) readers would be dominated by context-driven (top-down) processing. This theory was probably responsible for many speed-reading methods of the 1970s and 1980s, which supposedly trained people to read fast by taking in whole phrases and sentences at a time. However, empirical studies of readers conducted since then have demonstrated conclusively that those early theories were false. Summing up
the research are statements from reading researchers Kevin Larson (2004) and Keith Stanovich (Boulton, 2009), respectively:

“Word shape is no longer a viable model of word recognition. The bulk of scientific evidence says that we recognize a word’s component letters, then use that visual information to recognize a word.”

“Context [is] important, but it’s a more important aid for the poorer reader who doesn’t have automatic context-free recognition instantiated.”

In other words, reading consists mainly of context-free, bottom-up, feature-driven processes. In skilled readers, these processes are well learned to the point of being automatic. Context-driven reading today is considered mainly a backup method that, although it operates in parallel with feature-based reading, is only relevant when feature-driven reading is difficult or insufficiently automatic.

Skilled readers may resort to context-based reading when feature-based reading is disrupted by poor presentation of information (see examples later in this chapter). Also, in the race between context-based and feature-based reading to decipher the text we see, contextual cues sometimes win out over features. As an example of context-based reading, Americans visiting England sometimes misread “to let” signs as “toilet,” because in the United States they
see the word “toilet” often, but they almost never see the phrase “to let”—Americans use “for rent” instead.

In less skilled readers, feature-based reading is not automatic; it is conscious and laborious. Therefore, more of their reading is context-based. Their involuntary use of context-based reading and nonautomatic feature-based reading consumes short-term cognitive capacity, leaving little for comprehension.² They have to focus on deciphering the stream of words, leaving no capacity for constructing the meaning of sentences and paragraphs. That is why poor readers can read a passage aloud but afterward have no idea what they just read.

² Chapter 10 describes the differences between automatic and controlled cognitive processing. Here, we will simply say that controlled processes burden working memory, while automatic processes do not.

Why is context-free (bottom-up) reading not automatic in some adults? Some people didn’t get enough experience reading as young children for the feature-driven recognition processes to become automatic. As they grow up, they find reading mentally laborious and taxing, so they avoid reading, which
perpetuates and compounds their deficit (Boulton, 2009).

SKILLED AND UNSKILLED READING USE DIFFERENT PARTS OF THE BRAIN

Before the 1980s, researchers who wanted to understand which parts of the brain are involved in language and reading were limited mainly to studying people who had suffered brain injuries. For example, in the mid-19th century, doctors found that people with brain damage near the left temple—an area now called Broca’s area after the doctor who discovered it—can understand speech but have trouble speaking, and that people with brain damage near the left ear—now called Wernicke’s area—cannot understand speech (Sousa, 2005) (see Fig. 6.6).

In recent decades, new methods of observing the operation of functioning brains in living people have been developed: electroencephalography (EEG), functional magnetic resonance imaging (fMRI), and functional magnetic resonance spectroscopy (fMRS). These methods allow researchers to watch the response in different areas of a person’s brain—including the sequence in which they respond—as the person perceives various stimuli or performs specific tasks (Minnery and Fine, 2009).
FIGURE 6.6 The human brain, showing Broca’s area and Wernicke’s area.

Using these methods, researchers have discovered that the neural pathways involved in reading differ for novice versus skilled readers. Of course, the first area to respond during reading is the occipital (or visual) cortex at the back of the brain. That is the same regardless of a person’s reading skill. After that, the pathways diverge (Sousa, 2005):

• Novice. First an area of the brain just above and behind Wernicke’s area becomes active. Researchers have come to view this as the area where, at least with alphabetic scripts such as English and German, words are “sounded out” and assembled—that is, letters are analyzed and matched with their corresponding sounds. The word-analysis area then communicates with Broca’s area and the frontal lobe, where morphemes and words—units of meaning—are recognized and overall meaning is extracted. For ideographic languages, where symbols represent whole words and often have a graphical correspondence to their meaning, sounding out of words is not part of reading.
• Advanced. The word-analysis area is skipped. Instead, the occipitotemporal area (behind the ear, not far from the visual cortex) becomes active. The prevailing view is that this area recognizes words as a whole without sounding them out, and its activity in turn activates pathways toward the front of the brain that correspond to the word’s meaning and mental image. Broca’s area is only slightly involved.

Findings from brain-scan methods of course don’t indicate exactly what processes are being used, but they support the theory that advanced readers use different processes from those novice readers use.

POOR INFORMATION DESIGN CAN DISRUPT READING

Careless writing or presentation of text can reduce skilled readers’ automatic, context-free reading to conscious, context-based reading, burdening working memory and thereby decreasing speed and comprehension. In unskilled readers, poor text presentation can block reading altogether.

Uncommon or unfamiliar vocabulary

One way software often disrupts reading is by using unfamiliar vocabulary—words the intended readers don’t know very well or at all. One type of unfamiliar terminology is computer jargon, sometimes known as “geek speak.” For example, an intranet application displayed
the following error message if a user tried to use the application after more than 15 minutes of letting it sit idle:

Your session has expired. Please reauthenticate.

The application was for finding resources—rooms, equipment, etc.—within the company. Its users included receptionists, accountants, and managers, as well as engineers. Most nontechnical users would not understand the word “reauthenticate,” so they would drop out of automatic reading mode into conscious wondering about the message’s meaning. To avoid disrupting reading, the application’s developers could have used the more familiar instruction, “Login again.” For a discussion of how “geek speak” in computer-based systems affects learning, see Chapter 11.
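One lightweight way to keep such jargon out of shipped user interfaces is to centralize user-facing strings so that a content editor can review and replace them wholesale. The TypeScript sketch below is purely illustrative: the catalog structure, the message names, and the notify callback are assumptions, not anything from the application discussed above.

    // Hypothetical sketch: keep user-facing strings in one reviewable catalog,
    // and prefer the plain-language variant over the jargon one.
    const messages = {
      sessionExpired: {
        jargon: "Your session has expired. Please reauthenticate.",
        plain: "You were signed out because this page sat idle. Please log in again.",
      },
    } as const;

    function showSessionExpired(notify: (text: string) => void): void {
      // "notify" is an assumed UI callback (e.g., a toast or dialog).
      notify(messages.sessionExpired.plain);
    }

With the strings gathered in one place, terms like “reauthenticate” can be audited and reworded without touching application logic.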
Reading can also be disrupted by uncommon terms even if they are not computer technology terms. Here are some rare English words, including many that appear mainly in contracts, privacy statements, or other legal documents:

• Aforementioned: mentioned previously
• Bailiwick: the region in which a sheriff has legal powers; more generally, domain of control
• Disclaim: renounce any claim to or connection with; disown; repudiate
• Heretofore: up to the present time; before now
• Jurisprudence: the principles and theories on which a legal system is based
• Obfuscate: make something difficult to perceive or understand
• Penultimate: next to the last, as in “the next to the last chapter of a book”

When readers—even skilled ones—encounter such a word, their automatic reading processes probably won’t recognize it. Instead, their brain uses less automatic processes, such as sounding out the word’s parts and using them to figure out its meaning, figuring out the meaning from the context in which the word appears, or looking the word up in a dictionary.

Difficult scripts and typefaces

Even when the vocabulary is familiar, reading can be disrupted by typefaces with unfamiliar or hard-to-distinguish shapes. Context-free, automatic reading is based on recognizing letters and words bottom-up from their lower-level visual features. Our visual system is quite literally a neural network that must be trained to
recognize certain combinations of shapes as characters. Therefore, a typeface with difficult-to-recognize features and shapes will be hard to read. For example, try to read Abraham Lincoln’s Gettysburg Address in an outline typeface in ALL CAPS (see Fig. 6.7).

Comparison studies show that skilled readers read uppercase text 10–15% more slowly than lowercase text. Current-day researchers attribute that difference mainly to a lack of practice reading uppercase text, not to an inherently lower recognizability of uppercase text (Larson, 2004). Nonetheless, it is important for designers to be aware of the practice effect (Herrmann, 2011).

Tiny fonts

Another way to make text hard to read in software applications, websites, and electronic appliances is to use fonts that are too small for their intended readers’ visual system to resolve. For example, try to read the first paragraph of the U.S. Constitution in a seven-point font (see Fig. 6.8).

Developers sometimes use tiny fonts because they have a lot of text to display in a small amount of space. But if the intended users of the system cannot read the text, or can read it only laboriously, the text might as well not be there.
Text on noisy background

Visual noise in and around text can disrupt recognition of features, characters, and words, and therefore drop reading out of automatic feature-based mode into a more conscious and context-based mode. In software user interfaces and websites, visual noise often results from designers placing text over a patterned background or displaying text in colors that contrast poorly with the background, as an example from Arvanitakis.com shows (see Fig. 6.9).

FIGURE 6.7 Text in ALL CAPS is harder to read because we are not practiced at doing it. Outline typefaces complicate feature recognition. This example demonstrates both.

We the people of the United States, in Order to form a more perfect Union, establish Justice, insure domestic Tranquility, provide for the common defense, promote the general Welfare, and secure the Blessings of Liberty to ourselves and our Posterity, do ordain and establish this Constitution for the United States of America.

FIGURE 6.8 The opening paragraph of the U.S. Constitution, presented in a seven-point font.
There are situations in which designers intend to make text hard to read. For example, a common security measure on the Web is to ask site users to identify distorted words, as proof that they are live human beings and not Internet ’bots. This relies on the fact that most people can read text that Internet ’bots cannot currently read. Text displayed as a challenge to test a registrant’s humanity is called a captcha³ (see Fig. 6.10).

Of course, most text displayed in a user interface should be easy to read. A patterned background need not be especially strong to disrupt people’s ability to read text placed over it. For example, the Federal Reserve Bank’s collection of websites formerly provided a mortgage calculator that was decorated with a repeating pastel background with a home and neighborhood theme. Although well intentioned, the decorated background made the calculator hard to read (see Fig. 6.11).

Information buried in repetition

Visual noise can also come from the text itself. If successive lines of text contain a lot of repetition, readers receive poor feedback about what line they are focused
on, plus it is hard to pick out the important information. For example, recall the example from the California Department of Motor Vehicles web site in Chapter 3 (see Fig. 3.2).

³ The term originally comes from the word “capture,” but it is also said to be an acronym for “Completely Automated Public Turing test to tell Computers and Humans Apart.”

FIGURE 6.10 Text that is intentionally displayed with noise so that Web-crawling software cannot read it is called a captcha.

FIGURE 6.9 Arvanitakis.com uses text on a noisy background and poor color contrast.

Another example of repetition that creates noise is the computer store on Apple.com. The pages for ordering a laptop computer list different keyboard options for a computer in a very repetitive way, making it hard to see that the essential difference between the keyboards is the language that they support (see Fig. 6.12).
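A content fix for this kind of repetition is often mechanical: strip the repeated boilerplate and surface only what varies. The TypeScript sketch below is a hypothetical illustration; the option strings and the parsing rule are invented, not Apple’s actual data.

    // Hypothetical option labels that repeat everything except the language.
    const options = [
      "Backlit Keyboard (English) & User's Guide (English)",
      "Backlit Keyboard (French) & User's Guide (French)",
      "Backlit Keyboard (German) & User's Guide (German)",
    ];

    // Keep only the differentiator: here, the first parenthesized term.
    function distinguishers(labels: string[]): string[] {
      return labels.map((label) => {
        const match = label.match(/\(([^)]+)\)/);
        return match ? match[1] : label;
      });
    }

    console.log(distinguishers(options)); // ["English", "French", "German"]

Listing “English, French, German” and stating the shared description once makes the essential difference visible at a glance.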
Centered text

One aspect of reading that is highly automatic in most skilled readers is eye movement. In automatic (fast) reading, our eyes are trained to go back to the same horizontal position and down one line. If text is centered or right-aligned, each line of text starts in a different horizontal position. Automatic eye movements therefore take our eyes back to the wrong place, so we must consciously adjust our gaze to the actual start of each line. This drops us out of automatic mode and slows us down greatly. With poetry and wedding invitations, that is probably okay, but with any
  • 141. 6.13). Try reading the text quickly to demonstrate to yourself how your eyes move. The same site also centers numbered lists, really messing up readers’ automatic eye movement (see Fig. 6.14). Try scanning the list quickly. Exclusive Buyer Agency Offer (No Cost) Service to Home Buyers! Dan and Lida want to work for you if: ....................................................................................... Would you like to avoid sellers agents who are pushing, selling, and trying to make sales quotas? Do you want your agent to be on your side and not the sellers side? Do you expect your agent to be responsible and professional....? If you don’t like to have your time wasted, Dan and Lida want to work for you.... If you understand that everything we say and do, is to save you time, money, and keep you out of trouble.... -and if you understand that some agents income and allegiances are in direct competition with your best interests.... -and if you understand that we take risks, give you 24/7 access, and put aside other paying business for you... -and if you understand that we have a vested interest in helping you learn to make all the right choices... - then, call us now, because Dan and Lida want to work for you!! FIGURE 6.13
FIGURE 6.13 FargoHomes.com centers text, thwarting automatic eye movement patterns.

FIGURE 6.14 FargoHomes.com centers numbered items, really thwarting automatic eye movement patterns.

Design implications: Don’t disrupt reading; support it!

Obviously, a designer’s goal should be to support reading, not disrupt it. Skilled (fast) reading is mostly automatic and mostly based on feature, character, and word recognition. The easier the recognition, the easier and faster the reading. Less skilled reading, by contrast, is greatly assisted by contextual cues. Designers of interactive systems can support both reading methods by following these guidelines (a small code sketch of guideline 1 follows the list):

1) Ensure that text in user interfaces allows the feature-based automatic processes to function effectively by avoiding the disruptive flaws described earlier: difficult or tiny fonts, patterned backgrounds, centering, etc.
2) Use restricted, highly consistent vocabularies—sometimes referred to in the industry as plain language⁴ or simplified language (Redish, 2007).
3) Format text to create a visual hierarchy (see Chapter 3) to facilitate easy scanning: use headings, bulleted lists, tables, and visually emphasized words (see Fig. 6.15).

Experienced information architects, content editors, and graphic designers can be very useful in ensuring that text is presented to support easy scanning and reading.

⁴ For more information on plain language, see the U.S. government website, www.plainlanguage.gov.

FIGURE 6.15 Microsoft Word’s “Help” homepage is easy to scan and read.
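As a concrete, hedged illustration of guideline 1, the TypeScript sketch below bundles this chapter’s anti-disruption rules into one reusable style. The specific property values are illustrative assumptions, not prescriptions from the book.

    // A minimal sketch of reading-friendly text presentation.
    const readableText: Partial<CSSStyleDeclaration> = {
      textAlign: "left",      // avoid centered body text (Figs. 6.13, 6.14)
      fontSize: "16px",       // avoid tiny fonts (Fig. 6.8)
      background: "#ffffff",  // avoid patterned backgrounds (Figs. 6.9, 6.11)
      color: "#1a1a1a",       // strong text-background contrast
      textTransform: "none",  // avoid ALL CAPS body text (Fig. 6.7)
    };

    function applyReadableText(element: HTMLElement): void {
      Object.assign(element.style, readableText);
    }

Defining such a style once and applying it everywhere also enforces consistency, which itself supports automatic reading.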
MUCH OF THE READING REQUIRED BY SOFTWARE IS UNNECESSARY

In addition to committing design mistakes that disrupt reading, many software user interfaces simply present too much text, requiring users to read more than is necessary. Consider how much unnecessary text there is in a dialog box for setting text entry properties in the SmartDraw application (see Fig. 6.16).

FIGURE 6.16 SmartDraw’s “Text Entry Properties” dialog box displays too much text for its simple functionality.

Software designers often justify lengthy instructions by arguing: “We need all that text to explain clearly to users what to do.” However, instructions can often be shortened with no loss of clarity. Let’s examine how the Jeep company, between 2002 and 2007, shortened its instructions for finding a local Jeep dealer (see Fig. 6.17; a sketch of the 2007 version follows the list):

• 2002: The “Find a Dealer” page displayed a large paragraph of prose text, with numbered instructions buried in it, and a form asking for more information than needed to find a dealer near the user.

FIGURE 6.17 Between 2002 and 2007, Jeep.com drastically reduced the reading required by “Find a Dealer.”
• 2003: The instructions on the “Find a Dealer” page had been boiled down to three bullet points, and the form required less information.
• 2007: “Find a Dealer” had been cut to one field (zip code) and a “Go” button on the homepage.
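The 2007 endpoint is simple enough to express in a few lines. The TypeScript sketch below is a hypothetical reconstruction of such a one-field form; the markup, names, and URL are assumptions, not Jeep’s actual code.

    // Hypothetical one-field dealer finder: one input, one button,
    // and no instructional prose needed.
    function dealerFinderHtml(): string {
      return `
        <form action="/find-dealer" method="get">
          <label for="zip">Zip code</label>
          <input id="zip" name="zip" inputmode="numeric" maxlength="5" />
          <button type="submit">Go</button>
        </form>`;
    }

When the interface itself makes the goal obvious, most instructional text becomes unnecessary.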
Even when text describes products rather than explaining instructions, it is counterproductive to put all a vendor wants to say about a product into a lengthy prose description that people have to read from start to end. Most potential customers cannot or will not read it. Compare Costco.com’s descriptions of laptop computers in 2007 with those in 2009 (see Fig. 6.18).

FIGURE 6.18 Between 2007 and 2009, Costco.com drastically reduced the text in product descriptions.

Design implications: Minimize the need for reading

Too much text in a user interface loses poor readers, who unfortunately are a significant percentage of the population. Too much text even alienates good readers: it turns using an interactive system into an intimidating amount of work.

Minimize the amount of prose text in a user interface; don’t present users with long blocks of prose text to read. In instructions, use the least amount of text that gets most users to their intended goals. In a product description, provide a brief overview of the product and let users request more detail if they want it. Technical writers and content editors can assist greatly in doing this. For additional advice on how to eliminate unnecessary text, see Krug (2005) and Redish (2007).

TEST ON REAL USERS

Finally, designers should test their designs on the intended user population to be confident that users can read all essential text quickly and effortlessly. Some testing can be done early, using prototypes and partial implementations, but it should also be done just before release. Fortunately, last-minute changes to text font sizes and formats are usually easy to make.
CHAPTER 7
Our Attention is Limited; Our Memory is Imperfect

Just as the human visual system has strengths and weaknesses, so do human attention and memory. This chapter describes some of those strengths and weaknesses as background for understanding how we can design interactive systems to support and augment attention and memory rather than burdening or confusing them. We will start with an overview of how memory works, and how it is related to attention.

SHORT- VERSUS LONG-TERM MEMORY

Psychologists historically have distinguished short-term memory from long-term memory. Short-term memory covers situations in which information is retained for intervals ranging from a fraction of a second to a few minutes. Long-term memory covers situations in which information is retained over longer periods (e.g., hours, days, years, even lifetimes).

It is tempting to think of short- and long-term memory as separate memory stores. Indeed, some theories of memory have considered them separate. After all, in a digital computer, the short-term memory stores (central processing unit [CPU] data registers) are separate from the long-term memory stores (random
access memory [RAM], hard disk, flash memory, CD-ROM, etc.). More direct evidence comes from findings that damage to certain parts of the human brain results in short-term memory deficits but not long-term ones, or vice versa. Finally, the speed with which information or plans can disappear from our immediate awareness contrasts sharply with the seeming permanence of our memory of important events in our lives, faces of significant people, activities we have practiced, and information we have studied. These phenomena led many researchers to theorize that short-term memory is a separate store in the brain where information is held temporarily after entering through our perceptual senses (e.g., visual or auditory), or after being retrieved from long-term memory (see Fig. 7.1).

A MODERN VIEW OF MEMORY

Recent research on memory and brain function indicates that short- and long-term memory are functions of a single memory system—one that is more closely linked with perception than previously thought (Jonides et al., 2008).

Long-term memory
Perceptions enter through the visual, auditory, olfactory, gustatory, or tactile sensory systems and trigger responses starting in areas of the brain dedicated to each sense (e.g., visual cortex, auditory cortex), then spread into other areas of the brain that are not specific to any particular sensory modality. The sensory modality–specific areas of the brain detect only simple features of the data, such as a dark–light edge, diagonal line, high-pitched tone, sour taste, red color, or rightward motion. Downstream areas of the brain combine low-level features to detect higher-level features of the input, such as animal, the word “duck,” Uncle Kevin, minor key, threat, or fairness.

As described in Chapter 1, the set of neurons activated by a perceived stimulus depends on both the features and context of the stimulus. The context is as important as the features of the stimulus in determining what neural patterns are activated. For example, a dog barking near you when you are walking in your neighborhood activates a different pattern of neural activity in your brain than the same sound heard when you are safely inside your car. The more similar two perceptual stimuli are—that is, the more features and contextual elements they share—the more overlap there is between the sets of neurons that fire in response to them.

The initial strength of a perception depends on how much it is amplified or
dampened by other brain activity. All perceptions create some kind of trace, but some are so weak that they can be considered as not registered: the pattern was activated once but never again.

Memory formation consists of changes in the neurons involved in a neural activity pattern, which make the pattern easier to reactivate in the future.¹ Some such changes result from chemicals released near neural endings that boost or inhibit their sensitivity to stimulation. These changes last only until the chemicals dissipate or are neutralized by other chemicals.

¹ There is evidence that the long-term neural changes associated with learning occur mainly during sleep, suggesting that separating learning sessions by periods of sleep may facilitate learning (Stafford and Webb, 2005).

FIGURE 7.1 Traditional (antiquated) view of short-term versus long-term memory.
More permanent changes occur when neurons grow and branch, forming new connections with others.

Activating a memory consists of reactivating the same pattern of neural activity that occurred when the memory was formed. Somehow the brain distinguishes initial activations of neural patterns from reactivations—perhaps by measuring the relative ease with which the pattern was reactivated. New perceptions very similar to the original ones reactivate the same patterns of neurons, resulting in recognition if the reactivated perception reaches awareness. In the absence of a similar perception, stimulation from activity in other parts of the brain can also reactivate a pattern of neural activity, which if it reaches awareness results in recall.

The more often a neural memory pattern is reactivated, the stronger it becomes—that is, the easier it is to reactivate—which in turn means that the perception it corresponds to is easier to recognize and recall. Neural memory patterns can also be strengthened or weakened by excitatory or inhibitory signals from other parts of the brain.

A particular memory is not located in any specific spot in the brain. The neural activity pattern comprising a memory involves a network of millions of neurons
extending over a wide area. Activity patterns for different memories overlap, depending on which features they share. Removing, damaging, or inhibiting neurons in a particular part of the brain typically does not completely wipe out memories that involve those neurons, but rather just reduces the detail or accuracy of the memory by deleting features.² However, some areas in a neural activity pattern may be critical pathways, so that removing, damaging, or inhibiting them may prevent most of the pattern from activating, thereby effectively eliminating the corresponding memory.

For example, researchers have long known that the hippocampus, twin seahorse-shaped neural clusters near the base of the brain, plays an important role in storing long-term memories. The modern view is that the hippocampus is a controlling mechanism that directs neural rewiring so as to “burn” memories into the brain’s wiring. The amygdala, two jellybean-shaped clusters on the frontal tips of the hippocampus, has a similar role, but it specializes in storing memories of emotionally intense, threatening situations (Eagleman, 2012).

Cognitive psychologists view human long-term memory as consisting of several distinct functions:

• Semantic long-term memory stores facts and relationships.
• Episodic long-term memory records past events.
• Procedural long-term memory remembers action sequences.

These distinctions, while important and interesting, are beyond the scope of this book.

² This is similar to the effect of cutting pieces out of a holographic image: it reduces the overall resolution of the image, rather than removing areas of it, as with an ordinary photograph.

Short-term memory

The processes just discussed are about long-term memory. What about short-term memory? What psychologists call short-term memory is actually a combination of phenomena involving perception, attention, and retrieval from long-term memory.

One component of short-term memory is perceptual. Each of our perceptual senses has its own very brief short-term “memory” that is the result of residual neural activity after a perceptual stimulus ceases, like a bell that rings briefly after it is struck. Until they fade away, these residual perceptions are available as possible input to our brain’s attention and memory-storage mechanisms, which integrate
input from our various perceptual systems, focus our awareness on some of that input, and store some of it in long-term memory. These sensory-specific residual perceptions together comprise a minor component of short-term memory. Here, we are only interested in them as potential inputs to working memory.

Also available as potential input to working memory are long-term memories reactivated through recognition or recall. As explained earlier, each long-term memory corresponds to a specific pattern of neural activity distributed across our brain. While activated, a memory pattern is a candidate for our attention and therefore potential input for working memory.

The human brain has multiple attention mechanisms, some voluntary and some involuntary. They focus our awareness on a very small subset of the perceptions and activated long-term memories while ignoring everything else. That tiny subset of all the available information from our perceptual systems and our long-term memories that we are aware of right now is the main component of our short-term memory, the part that cognitive scientists often call working memory. It integrates information from all of our sensory modalities and our long-term memory. Henceforth, we will restrict our discussion of short-term memory to working memory.
So what is working memory? First, here is what it is not: it is not a store—it is not a place in the brain where memories and perceptions go to be worked on. And it is nothing like accumulators or fast random-access memory in digital computers. Instead, working memory is our combined focus of attention: everything that we are conscious of at a given time. More precisely, it is a few perceptions and long-term memories that are activated enough that we remain aware of them over a short period. Psychologists also view working memory as including an executive function—based mainly in the frontal cerebral cortex—that manipulates items we are attending to and, if needed, refreshes their activation so they remain in our awareness (Baddeley, 2012).

A useful—if oversimplified—analogy for memory is a huge, dark, musty warehouse. The warehouse is full of long-term memories, piled haphazardly (not stacked neatly), intermingled and tangled, and mostly covered with dust and cobwebs. Doors along the walls represent our perceptual senses: sight, hearing, smell, taste, touch. They open briefly to let perceptions in. As perceptions enter, they are briefly illuminated by light coming in from outside, but they quickly are pushed (by more entering perceptions) into the dark tangled piles of old memories.
In the ceiling of the warehouse are a small fixed number of searchlights, controlled by the attention mechanism’s executive function (Baddeley, 2012). They swing around and focus on items in the memory piles, illuminating them for a while until they swing away to focus elsewhere. Sometimes one or two searchlights focus on new items after they enter through the doors. When a searchlight moves to focus on something new, whatever it had been focusing on is plunged into darkness.

The small fixed number of searchlights represents the limited capacity of working memory. What is illuminated by them (and briefly through the open doors) represents the contents of working memory: out of the vast warehouse’s entire contents, the few items we are attending to at any moment. See Figure 7.2 for a visual.

The warehouse analogy is too simple and should not be taken too seriously. As Chapter 1 explained, our senses are not just passive doorways into our brains, through which our environment “pushes” perceptions. Rather, our brain actively and continually seeks out important events and features in our environment and “pulls” perceptions in as needed (Ware, 2008). Furthermore, the brain is buzzing with activity most of the time and its internal activity is only modulated—not
determined—by sensory input (Eagleman, 2012). Also, as described earlier, memories are embodied as networks of neurons distributed around the brain, not as objects in a specific location. Finally, activating a memory in the brain can activate related ones; our warehouse-with-searchlights analogy doesn’t represent that.

FIGURE 7.2 Modern view of memory: a dark warehouse full of stuff (long-term memory), with searchlights focused on a few items (short-term memory).

Nonetheless, the analogy—especially the part about the searchlights—illustrates that working memory is a combination of several foci of attention—the currently
activated neural patterns of which we are aware—and that the capacity of working memory is extremely limited, and the content at any given moment is very volatile.

What about the earlier finding that damage to some parts of the brain causes short-term memory deficits, while other types of brain damage cause long-term memory deficits? The current interpretation is that some types of damage decrease or eliminate the brain’s ability to focus attention on specific objects and events, while other types of damage harm the brain’s ability to store or retrieve long-term memories.

CHARACTERISTICS OF ATTENTION AND WORKING MEMORY

As noted, working memory is equal to the focus of our attention. Whatever is in that focus is what we are conscious of at any moment. But what determines what we attend to and how much we can attend to at any given time?

Attention is highly focused and selective

Most of what is going on around you at this moment you are unaware of. Your perceptual system and brain sample very selectively from your surroundings, because they don’t have the capacity to process everything.

Right now you are conscious of the last few words and ideas you’ve read, but probably not the color of the wall in front of you. But now that I’ve shifted your attention, you are conscious of the wall’s color, and may have forgotten some of the ideas you read on the previous page.

Chapter 1 described how our perception is filtered and biased by our goals. For
example, if you are looking for your friend in a crowded shopping mall, your visual system “primes” itself to notice people who look like your friend (including how he or she is dressed), and barely notice everything else. Simultaneously, your auditory system primes itself to notice voices that sound like your friend’s voice, and even footsteps that sound like those of your friend. Human-shaped blobs in your peripheral vision and sounds localized by your auditory system that match your friend snap your eyes and head toward them. While you look, anyone looking or sounding similar to your friend attracts your attention, and you won’t notice other people or events that would normally have interested you.

Besides focusing on objects and events related to our current goals, our attention is drawn to:

• Movement, especially movement near or toward us. For example, something jumps at you while you walk on a street, or something swings toward your head in a haunted house ride at an amusement park, or a car in an adjacent lane suddenly swerves toward your lane (see the discussion of the flinch reflex in Chapter 14).
• Threats. Anything that signals or portends danger to us or people in our care.
• Faces of other people. We are primed from birth to notice faces more than other objects in our environment.
• Sex and food. Even if we are happily married and well fed, these things attract our attention. Even the mere words probably quickly got your attention.

These things, along with our current goals, draw our attention involuntarily. We don’t become aware of something in our environment and then orient ourselves toward it. It’s the other way around: our perceptual system detects something attention-worthy and orients us toward it preconsciously, and only afterwards do we become aware of it.³

³ Exactly how long afterwards is discussed in Chapter 14.

Capacity of attention (a.k.a. working memory)

The primary characteristics of working memory are its low capacity and volatility. But what is the capacity? In terms of the warehouse analogy presented earlier, what is the small fixed number of searchlights?

Many college-educated people have read about “the magical number seven, plus or minus two,” proposed by cognitive psychologist George Miller in 1956 as the limit on the number of simultaneous unrelated items in human working memory (Miller, 1956).
Miller’s characterization of the working memory limit naturally raises several questions:

• What are the items in working memory? They are current perceptions and retrieved memories. They are goals, numbers, words, names, sounds, images, odors—anything one can be aware of. In the brain, they are patterns of neural activity.
• Why must items be unrelated? Because if two items are related, they correspond to one big neural activity pattern—one set of features—and hence one item, not two.
• Why the fudge factor of plus or minus two? Because researchers cannot measure with perfect accuracy how much people can keep track of, and because of differences between individuals in working memory capacity.

Later research in the 1960s and 1970s found Miller’s estimate to be too high. In the experiments Miller considered, some of the items presented to people to remember could be “chunked” (i.e., considered related), making it appear that people’s working memory was holding more items than it actually was. Furthermore, all the subjects in Miller’s experiments were college students. Working memory capacity varies in the general population. When the experiments were revised to disallow unintended
chunking and include noncollege students as subjects, the average capacity of working memory was shown to be more like four plus or minus one—that is, three to five items (Broadbent, 1975; Mastin, 2010). Thus, in our warehouse analogy, there would be only four searchlights.

More recent research has cast doubt on the idea that the capacity of working memory should be measured in whole items or “chunks.” It turns out that in early working memory experiments, people were asked to briefly remember items (e.g., words or images) that were quite different from each other—that is, they had very few features in common. In such a situation, people don’t have to remember every feature of an item to recall it a few seconds later; remembering some of its features is enough. So people appeared to recall items as a whole, and therefore working memory capacity seemed measurable in whole items.

Recent experiments have given people items to remember that are similar—that is, they share many features. In that situation, to recall an item and not confuse it with other items, people must remember more of its features. In
these experiments, researchers found that people remember more details (i.e., features) of some items than of others, and the items they remember in greater detail are the ones they paid more attention to (Bays and Husain, 2008). This suggests that the unit of attention—and therefore the capacity of working memory—is best measured in item features rather than whole items or “chunks” (Cowan et al., 2004). This jibes with the modern view of the brain as a feature-recognition device, but it is controversial among memory researchers, some of whom argue that the basic capacity of human working memory is three to five whole items, but that is reduced if people attend to a large number of details (i.e., features) of the items (Alvarez and Cavanagh, 2004). Bottom line: the true capacity of human working memory is still a research topic.

The second important characteristic of working memory is how volatile it is. Cognitive psychologists used to say that new items arriving in working memory often bump old ones out, but that way of describing the volatility is based on the view of working memory as a temporary storage place for information. The modern view of working memory as the current focus of attention makes it even clearer: focusing attention on new information turns it away from some of what it was focusing on. That is why the searchlight analogy is useful.
However we describe it, information can easily be lost from working memory. If items in working memory don’t get combined or rehearsed, they are at risk of having the focus shifted away from them. This volatility applies to goals as well as to the details of objects. Losing items from working memory corresponds to forgetting or losing track of something you were doing. We have all had such experiences, for example:

• Going to another room for something, but once there we can’t remember why we came.
• Taking a phone call, and afterward not remembering what we were doing before the call.
• Something yanks our attention away from a conversation, and then we can’t remember what we were talking about.
• In the middle of adding a long list of numbers, something distracts us, so we have to start over.
  • 165. of paper and follow these instructions: 1. Place one blank sheet of paper after this page in the book and use it to cover the next page. 2. Flip to the next page for three seconds, pull the paper cover down and read the black numbers at the top, and flip back to this page. Don’t peek at other numbers on that page unless you want to ruin the test. 3. Say your phone number backward, out loud. 4. Now write down the black numbers from memory. … Did you get all of them? 5. Flip back to the next page for three seconds, read the red numbers (under the black ones), and flip back. 6. Write down the numbers from memory. These would be easier to recall than the first ones if you noticed that they are the first seven digits of π (3.141592), because then they would be only one number, not seven. 7. Flip back to the next page for 3 seconds, read the green numbers, and flip back. 8. Write down the numbers from memory. If you noticed that they are odd numbers from 1 to 13, they would be easier to recall, because they would be three chunks (“odd, 1, 13” or “odd, seven from 1”), not seven.
9. Flip back to the next page for three seconds, read the orange words, and flip back.
10. Write down the words from memory. … Could you recall them all?
11. Flip back to the next page for three seconds, read the blue words, and flip back.
12. Write down the words from memory. … It was certainly a lot easier to recall them all because they form a sentence, so they could be memorized as one sentence rather than seven words.

The test stimuli referred to above (from the following page):

Black numbers: 3 8 4 7 5 3 9
Red numbers: 3 1 4 1 5 9 2
Green numbers: 1 3 5 7 9 11 13
Orange words: town river corn string car shovel
Blue words: what is the meaning of life
IMPLICATIONS OF WORKING MEMORY CHARACTERISTICS FOR USER-INTERFACE DESIGN

The capacity and volatility of working memory have many implications for the design of interactive computer systems. The basic implication is that user interfaces should help people remember essential information from one moment to the next. Don’t require people to remember system status or what they have done, because their attention is focused on their primary goal and progress toward it. Specific examples follow.

Modes

The limited capacity and volatility of working memory is one reason why user-interface design guidelines often say to either avoid designs that have modes or provide adequate mode feedback. In a moded user interface, some user actions have different effects depending on what mode the system is in. For example:

• In a car, pressing the accelerator pedal can move the car either forwards, backwards, or not at all, depending on whether the transmission is in drive, reverse, or neutral. The transmission sets a mode in the car’s user interface.
• In many digital cameras, pressing the shutter button can either snap a photo or start a video recording, depending on which mode is selected.
• In a drawing program, clicking and dragging normally selects one or more graphic objects on the drawing, but when the software is in “draw rectangle” mode, clicking and dragging adds a rectangle to the drawing and stretches it to the desired size.

Moded user interfaces have advantages; that is why many interactive systems have them. Modes allow a device to have more functions than controls: the same control provides different functions in different modes. Modes allow an interactive system to assign different meanings to the same gestures to reduce the number of gestures users must learn. However, one well-known disadvantage of modes is that people often make mode errors: they forget what mode the system is in and do the wrong thing by mistake (Johnson, 1990). This is especially true in systems that give poor feedback about what the current mode is.

Because of the problem of mode errors, many user-interface design guidelines say to either avoid modes or provide strong feedback about which mode the system is in. Human working memory is too unreliable for designers to assume that users can, without clear, continuous feedback, keep track of what mode the system is in, even when the users are the ones changing the system from one mode to another.
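Here is a minimal TypeScript sketch of the “strong mode feedback” guideline; the editor class, mode names, and status-bar element are illustrative assumptions, not any particular product’s design.

    // Explicit mode state plus always-visible feedback, so users don't have
    // to hold the current mode in working memory.
    enum Mode {
      Select = "Select",
      DrawRectangle = "Draw rectangle",
    }

    class DrawingEditor {
      private mode: Mode = Mode.Select;

      constructor(private statusBar: { textContent: string | null }) {
        this.showMode();
      }

      setMode(mode: Mode): void {
        this.mode = mode;
        this.showMode(); // refresh the feedback on every mode change
      }

      private showMode(): void {
        // A continuous, visible reminder of the active mode.
        this.statusBar.textContent = `Mode: ${this.mode}`;
      }

      handleDrag(): void {
        if (this.mode === Mode.DrawRectangle) {
          // add a rectangle and stretch it to the dragged size
        } else {
          // select the objects under the drag
        }
      }
    }

The key design choice is that the mode indicator is updated in exactly one place, so the display can never drift out of sync with the actual mode.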
Search results

When people use a search function on a computer to find information, they enter the search terms, start the search, and then review the results. Evaluating the results often requires knowing what the search terms were. If working memory were less limited, people would always remember, when browsing the results, what they had entered as search terms just a few seconds earlier. But as we have seen, working memory is very limited. When the results appear, a person’s attention naturally turns away from what he or she entered and toward the results. Therefore, it should be no surprise that people viewing search results often do not remember the search terms they just typed.

Unfortunately, some designers of online search functions don’t understand that. Search results sometimes don’t show the search terms that generated the results. For example, in 2007, the search results page at Slate.com provided search fields so users could search again, but didn’t show what a user had searched for (see Fig. 7.3A). A recent version of the site shows the user’s search terms (see Fig. 7.3B), reducing the burden on users’ working memory.
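The fix is cheap to implement: carry the query along with the results and restate it in the results header. A hedged TypeScript sketch, with an assumed data shape:

    // Echo the user's search terms with the results (as in Fig. 7.3B),
    // so users need not recall what they typed.
    interface SearchResults {
      query: string;
      hits: string[];
    }

    function renderResults(results: SearchResults): string {
      const header = `Results for "${results.query}" (${results.hits.length} found)`;
      const lines = results.hits.map((hit, i) => `${i + 1}. ${hit}`);
      return [header, ...lines].join("\n");
    }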
Calls to action

A well-known “netiquette” guideline for writing email messages, especially messages that require responses or ask the recipients to do something, is to restrict each message to one topic. If a message contains multiple topics or requests, its recipients may focus on one of them (usually the first one), get engrossed in responding to that, and forget to respond to the rest of the email. The guideline to put different topics or requests into separate emails is a direct result of the limited capacity of human attention.

Web designers are familiar with a similar guideline: Avoid putting competing calls to action on a page. Each page should have only one dominant call to action—or one for each possible user goal—so as not to overwhelm users’ attention capacity and cause them to go down paths that don’t achieve their (or the site owner’s) goals.

FIGURE 7.3 Slate.com search results: (A) in 2007, users’ search terms were not shown, but (B) in 2013, search terms are shown.

A related guideline: Once users have specified their goal, don’t
distract them from accomplishing it by displaying extraneous links and calls to action. Instead, guide them to the goal by using a design pattern called the process funnel (van Duyne et al., 2002; see also Johnson, 2007).

Instructions

If you asked a friend for a recipe or for directions to her home, and she gave you a long sequence of steps, you probably would not try to remember it all. You would know that you could not reliably keep all of the instructions in your working memory, so you would write them down or ask your friend to send them to you by email. Later, while following the instructions, you would put them where you could refer to them until you reached the goal.

Similarly, interactive systems that display instructions for multistep operations should allow people to refer to the instructions while executing them until completing all the steps. Most interactive systems do this (see Fig. 7.4), but some do not (see Fig. 7.5).

FIGURE 7.4 Instructions in Windows Help files remain displayed while users follow them.

FIGURE 7.5 Instructions for Windows XP wireless setup start by telling users to close the instructions.
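One simple way to implement “keep instructions visible” is a persistent checklist pane that marks progress instead of closing. The TypeScript sketch below uses assumed names, and the steps shown are invented for illustration, not the actual Windows XP wizard text.

    // A persistent, progress-marking checklist that stays on screen
    // while the user works through the steps.
    const steps: string[] = [
      "Open the network settings panel",
      "Choose your wireless network",
      "Enter the network password",
    ];

    function renderChecklist(completed: Set<number>): string {
      return steps
        .map((step, i) => `${completed.has(i) ? "[x]" : "[ ]"} ${i + 1}. ${step}`)
        .join("\n");
    }

    // Re-render after each completed step; the instructions never disappear.
    console.log(renderChecklist(new Set([0])));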
Navigation depth

Using a software product, digital device, phone menu system, or Web site often involves navigating to the user’s desired information or goal. It is well established that navigation hierarchies that are broad and shallow are easier for most people—especially those who are nontechnical—to find their way around in than narrow, deep hierarchies (Cooper, 1999). This applies to hierarchies of application windows and dialog boxes, as well as to menu hierarchies (Johnson, 2007).
A related guideline: In hierarchies deeper than two levels, provide navigation “breadcrumb” paths to constantly remind users where they are (Nielsen, 1999; van Duyne et al., 2002).

These guidelines, like the others mentioned earlier, are based on the limited capacity of human working memory. Requiring a user to drill down through eight levels of dialog boxes, web pages, menus, or tables—especially with no visible reminders of their location—will probably exceed the user’s working memory capacity, thereby causing him or her to forget where he or she came from or what his or her overall goals were.
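Rendering such a breadcrumb is nearly a one-liner once the navigation path is tracked. A small TypeScript sketch with an assumed data shape:

    // Show users where they are in the hierarchy, e.g.,
    // "Home > Products > Laptops".
    type Crumb = { label: string; url: string };

    function breadcrumbTrail(path: Crumb[]): string {
      return path.map((crumb) => crumb.label).join(" > ");
    }

    console.log(
      breadcrumbTrail([
        { label: "Home", url: "/" },
        { label: "Products", url: "/products" },
        { label: "Laptops", url: "/products/laptops" },
      ])
    ); // "Home > Products > Laptops"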
CHARACTERISTICS OF LONG-TERM MEMORY

Long-term memory differs from working memory in many respects. Unlike working memory, it actually is a memory store. However, specific memories are not stored in any one neuron or location in the brain. As described earlier, memories, like perceptions, consist of patterns of activation of large sets of neurons. Related memories correspond to overlapping patterns of activated neurons. This means that every memory is stored in a distributed fashion, spread among many parts of the brain. In this way, long-term memory in the brain is similar to holographic light images.

Long-term memory evolved to serve our ancestors and us very well in getting around in our world. However, it has many weaknesses: it is error-prone, impressionist, free-associative, idiosyncratic, retroactively alterable, and easily biased by a variety of factors at the time of recording or retrieval. Let’s examine some of these weaknesses.

Error-prone

Nearly everything we’ve ever experienced is stored in our long-term memory. Unlike working memory, the capacity of human long-term memory seems almost unlimited. Adult human brains each contain about 86 billion neurons (Herculano-Houzel, 2009). As described earlier, individual neurons do not store memories; memories are encoded by networks of neurons acting together. Even if only some of the brain’s neurons are involved in memory, the large number of neurons allows for a great many different combinations of them, each capable of representing a different memory. Still, no one has yet measured or even estimated the maximum information capacity of the human brain.⁴ Whatever the capacity is, it’s a lot.

However, what is in long-term memory is not an accurate, high-resolution recording of our experiences. In terms familiar to computer engineers, one could characterize long-term memory as using heavy compression methods that drop a great deal of information. Images, concepts, events, sensations, actions—all are reduced to combinations of abstract features.

Different memories are stored at different levels of detail—that is, with more or fewer features. For example, the face of a man you met briefly who is not important to you might be stored simply as an average Caucasian male face with a beard, with no other details—a whole face reduced to three features. If you were asked later to describe the man in his absence, the most you could honestly say was that he was a “white guy with a beard.” You would not be able to pick him out of a police lineup of other Caucasian men with beards. In contrast, your memory of your best friend’s face includes many more features, allowing you to give a more detailed description and pick your friend out of any police lineup. Nonetheless, it is still a set of features, not anything like a bitmap image.

As another example, I have a vivid childhood memory of being run over by a plow and badly cut, but my father says it happened to my brother. One of us is wrong.

In the realm of human–computer interaction, a Microsoft Word user may remember that there is a command to insert a page number, but may not remember which menu the command is in. That specific feature may not have been recorded when the user learned how to insert page numbers. Alternatively, perhaps the menu-location feature was recorded, but just does not reactivate with the rest of the memory pattern when the user tries to recall how to insert a page number.

Weighted by emotions

Chapter 1 described a dog that remembered seeing a cat in his front yard every time he returned home in the family car. The dog was excited when he first saw the cat,
  • 175. asked later to describe the man in his absence, the most you could honestly say was that he was a “white guy with a beard.” You would not be able to pick him out of a police lineup of other Caucasian men with beards. In contrast, your memory of your best friend’s face includes many more features, allowing you to give a more detailed description and pick your friend out of any police lineup. Nonetheless, it is still a set of features, not anything like a bitmap image. As another example, I have a vivid childhood memory of being run over by a plow and badly cut, but my father says it happened to my brother. One of us is wrong. In the realm of human–computer interaction, a Microsoft Word user may remem- ber that there is a command to insert a page number, but may not remember which menu the command is in. That specific feature may not have been recorded when the user learned how to insert page numbers. Alternatively, perhaps the menu-loca- tion feature was recorded, but just does not reactivate with the rest of the memory pattern when the user tries to recall how to insert a page number. Weighted by emotions Chapter 1 described a dog that remembered seeing a cat in his front yard every time he returned home in the family car. The dog was excited when he first saw the cat,
Weighted by emotions

Chapter 1 described a dog that remembered seeing a cat in his front yard every time he returned home in the family car. The dog was excited when he first saw the cat, so his memory of it was strong and vivid. A comparable human example: an adult could easily have strong memories of her first day at nursery school, but probably not of her tenth. On the first day, she was probably upset about being left at the school by her parents, whereas by the tenth day, being left there was nothing unusual.

Retroactively alterable

Suppose that while you are on an ocean cruise with your family, you see a whale-shark. Years later, when you and your family are discussing the trip, you might remember seeing a whale, and one of your relatives might recall seeing a shark. For both of you, some details in long-term memory were dropped because they did not fit a common concept.
A true example comes from 1983, when the late President Ronald Reagan was speaking with Jewish leaders during his first term as president. He spoke about being in Europe during World War II and helping to liberate Jews from the Nazi concentration camps. The trouble was, he was never in Europe during World War II. When he was an actor, he was in a movie about World War II, made entirely in Hollywood. That important detail was missing from his memory.

IMPLICATIONS OF LONG-TERM MEMORY CHARACTERISTICS FOR USER-INTERFACE DESIGN

The main thing that the characteristics of long-term memory imply is that people need tools to augment it. Since prehistoric times, people have invented technologies to help them remember things over long periods: notched sticks, knotted ropes, mnemonics, verbal stories and histories retold around campfires, writing, scrolls, books, number systems, shopping lists, checklists, phone directories, datebooks, accounting ledgers, oven timers, computers, personal digital assistants (PDAs), online shared calendars, etc.

Given that humankind has a need for technologies that augment memory, it seems clear that software designers should try to provide software that fulfills that need.

A LONG-TERM MEMORY TEST

Test your long-term memory by answering the following questions:

1. Was there a roll of tape in the toolbox in Chapter 1?
2. What was your previous phone number?
3. Which of these words were not in the list presented in the working memory test earlier in this chapter: city, stream, corn, auto, twine, spade?
4. What was your first-grade teacher’s name? Second grade? Third grade? …
5. What Web site was presented earlier that does not show search terms when it displays search results?

Regarding question 3: When words are memorized, often what is retained is the concept rather than the exact word that was presented. For example, one could hear the word “town” and later recall it as “city.”

At the very least, designers should avoid developing systems that burden long-term memory. Yet that is exactly what many interactive systems do.

Authentication is one functional area in which many software systems place burdensome demands on users’ long-term memory. For example, a web application developed a few years ago told users to change their personal identification number (PIN) “to a number that is easy to remember,” but then imposed restrictions that made it impossible to do so (see Fig. 7.6). Whoever wrote those instructions seems to have realized that the PIN requirements were unreasonable, because the instructions end by advising users to write down their PIN! Never mind that writing a PIN down creates a security risk and adds yet another memory task: users must remember where they hid their written-down PIN.

FIGURE 7.6 Instructions tell users to create an easy-to-remember PIN, but the restrictions make that impossible.
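Figure 7.6’s actual restrictions are not reproduced in the text, so the sketch below invents rules of the same flavor to show the underlying problem: each added restriction tends to strike out exactly the PINs people can remember, leaving only arbitrary ones.

    from itertools import product

    def allowed(pin, birth_year="1962"):
        # Hypothetical restrictions in the spirit of Fig. 7.6 (not the real ones):
        if len(set(pin)) == 1:
            return False                # no repeated digits, e.g., 7777
        if pin in "0123456789" or pin in "9876543210":
            return False                # no ascending/descending runs, e.g., 1234
        if pin == birth_year:
            return False                # no birth years
        if pin.endswith("00"):
            return False                # no "round" numbers
        return True

    pins = ["".join(digits) for digits in product("0123456789", repeat=4)]
    survivors = [p for p in pins if allowed(p)]
    print(len(survivors), "of", len(pins), "four-digit PINs remain")
    # The PINs struck out are precisely the memorable ones; what remains is
    # arbitrary, which is why users end up writing their PIN down.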
A contrasting example of burdening people’s long-term memory for the sake of security comes from Intuit.com. To purchase software, visitors must register. The site requires users to select a security question from a menu (see Fig. 7.7). What if you can’t answer any of the questions? What if you don’t recall your first pet’s name, your high school mascot, or any of the answers to the other questions?

FIGURE 7.7 Intuit.com’s registration burdens long-term memory: users may have no unique, memorable answer for any of the questions.

But that isn’t where the memory burden ends. Some questions could have several possible answers. Many people had several elementary schools, childhood friends, or heroes. To register, they must choose a question and then remember which answer they gave to Intuit.com. How? Probably by writing it down somewhere. Then, when Intuit.com asks them the security question, they have to remember where they put the answer. Why burden people’s memory, when it would be easy to let users make up a security question for which they can easily recall the one possible answer?
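One way a registration flow could honor that question is sketched below. The field names and the canned-question list are invented for illustration; the point is simply that the design accepts either a stock question or a question the user writes, whose one obvious answer only the user knows.

    from dataclasses import dataclass

    CANNED_QUESTIONS = [                # illustrative menu, not Intuit's real one
        "What was your first pet's name?",
        "What was your high school mascot?",
    ]

    @dataclass
    class SecurityQuestion:
        question: str
        answer: str

    def make_security_question(answer, canned_index=None, custom_question=None):
        # Let users pick from the menu OR write a question of their own.
        if custom_question is not None:
            return SecurityQuestion(custom_question, answer)
        return SecurityQuestion(CANNED_QUESTIONS[canned_index], answer)

    # A user with no memorable answer to any canned question supplies one
    # whose single obvious answer only she knows:
    q = make_security_question("Rosebud",
                               custom_question="What did I name my first sled?")

As discussed next, NetworkSolutions.com’s registration form takes essentially this approach.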
Such unreasonable demands on people’s long-term memory counteract the security and productivity that computer-based applications supposedly provide (Schrage, 2005), as users:

• Place sticky notes on or near computers or “hide” them in desk drawers.
• Contact customer support to recover passwords they cannot recall.
• Use passwords that are easy for others to guess.
• Set up systems with no login requirements at all, or with one shared login and password.

The registration form at NetworkSolutions.com represents a small step toward more usable security. Like Intuit.com, it offers a choice of security questions, but it also allows users to create their own security question, one for which they can more easily remember the answer (see Fig. 7.8).

FIGURE 7.8 NetworkSolutions.com’s registration form lets users create their own security question.
Another implication of long-term memory characteristics for interactive systems is that learning and long-term retention are enhanced by user-interface consistency.