4. WHY EVALUATE WITH “USABILITY EVALUATION”?
Following guidelines is never sufficient for good user interfaces.
We need both good design and user studies.
Similar to working with users in Contextual Inquiry.
Note on terminology: “users” and “subjects” are now usually called “participants.”
5. THE “DON’TS” IN USABILITY EVALUATIONS (1)
Don’t evaluate whether it works (that is quality assurance).
Don’t have the experimenters (you) evaluate it; recruit participants.
Don’t (just) ask participants questions. This is NOT an “opinion survey.” Instead, watch their behavior.
Don’t evaluate with groups: see how well the product works for each person individually (this is not a “focus group”).
6. “DON’TS” OF USABILITY EVALUATIONS (2)
Don’t train participants:
We need to see if they can figure it out themselves.
Don’t test the participant; evaluate the product.
It is NOT a “user test”; it is called a usability evaluation instead.
Don’t put your ego as a designer on the line.
7. ISSUE: RELIABILITY
Do the results generalize to other people?
There may be individual differences among participants.
If comparing two products, use statistics: confidence intervals, p < .01 (a small sketch follows below).
A small number of participants cannot evaluate the entire website or app; they are just a sample.
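A minimal sketch (not from the slides) of that statistics step, using made-up task-completion times and SciPy’s two-sample t-test:

    # Hypothetical example: comparing task-completion times (seconds)
    # for two products. The numbers below are made up for illustration.
    from scipy import stats

    product_a = [42.1, 38.5, 51.0, 45.2, 39.8, 47.3, 44.0, 41.6]
    product_b = [55.4, 49.9, 60.2, 52.7, 58.1, 50.5, 56.8, 53.3]

    # Two-sample t-test: is the difference in mean times reliable?
    t_stat, p_value = stats.ttest_ind(product_a, product_b)
    print(f"t = {t_stat:.2f}, p = {p_value:.4f}")

    # A p-value below the chosen threshold (the slide suggests p < .01)
    # supports the claim that the products really differ for the
    # population, not just for this sample of participants.
    if p_value < 0.01:
        print("Difference unlikely to be individual variation alone.")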
8. ISSUE: VALIDITY
Did the evaluation measure what we want?
Wrong participants.
“Confounding” factors: issues that were not controlled but may be relevant to the evaluation, such as other usability problems, settings, etc.
Ordering effects.
Learning effects.
Too much help given to some participants.
9. PLAN OUR EVALUATION
Goals:
Formative – helps decide features and design; Contextual Inquiry (back then) was formative.
Summative – evaluates the product; Usability Evaluation (now) is summative.
Pilot evaluations:
Preliminary evaluations to check materials, look for bugs, etc.
Evaluate the instructions, timing, etc.
Participants do not have to be representative.
10. EVALUATION DESIGN
Within Subjects
Each participant does all conditions.
Removes individual differences.
Adds ordering effects; to counter them, just randomize the order (see the sketch after this list)!
Between Subjects
Each participant does one condition.
Quicker for each participant.
But needs more participants, due to the huge variation among people.
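A minimal sketch, with hypothetical interface names, of how condition orders can be counterbalanced or simply randomized in a within-subjects design:

    # Hypothetical sketch: assigning condition orders so that ordering
    # effects average out across participants.
    import itertools
    import random

    conditions = ["Interface A", "Interface B", "Interface C"]

    # Full counterbalancing: cycle through every permutation.
    orders = list(itertools.permutations(conditions))
    for participant_id in range(12):
        order = orders[participant_id % len(orders)]
        print(f"P{participant_id + 1}: {' -> '.join(order)}")

    # With many conditions, full counterbalancing needs too many
    # participants; the slide's simpler alternative is to randomize
    # the order independently for each participant:
    random_order = random.sample(conditions, k=len(conditions))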
11. SOME MEASUREMENTS
Learnability
Efficiency
Errors
Web Analytics
Questionnaire
12. ANALYZING THE MEASUREMENT DATA
Numeric Data
Example: times, number of errors, etc.
Tables and plots using a spreadsheet.
Look for trends and outliers (a small sketch follows below).
Organize Problems by:
Scope: How widespread is the problem?
Severity: How critical is the problem?
Report template: http://www.cs.cmu.edu/~bam/uicourse/UsabilityEvalReport_template2016.docx
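A minimal sketch with made-up task times; the 1.5 x IQR outlier rule is an assumption, since the slide only says to look for trends and outliers:

    # Hypothetical sketch: summarizing task times and flagging outliers
    # with the common 1.5 * IQR rule.
    import statistics

    task_times = [34.2, 29.8, 41.0, 36.5, 120.4, 33.1, 38.9, 30.7]  # seconds

    times = sorted(task_times)
    q1, _, q3 = statistics.quantiles(times, n=4)  # quartiles
    iqr = q3 - q1

    outliers = [t for t in times
                if t < q1 - 1.5 * iqr or t > q3 + 1.5 * iqr]
    print(f"median = {statistics.median(times):.1f}s, outliers = {outliers}")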
13. GOAL LEVELS
Pick Levels for product:
• Theoretical best level
• Desired (planned) level
• Minimum acceptable level
• Current level or competitor's level
Example (errors per task): Best = 0, Desired = 1, Minimum acceptable = 2, Current = 5.
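A small hypothetical sketch of checking a measurement against these goal levels; the function name and thresholds are illustrative only:

    # Hypothetical sketch: classify a measured error count against the
    # slide's example goal levels (best 0, desired 1, minimum
    # acceptable 2, current 5 errors per task).
    GOAL_LEVELS = {"best": 0, "desired": 1,
                   "minimum acceptable": 2, "current": 5}

    def rate_errors(errors_per_task: float) -> str:
        """Return the best goal level that the measurement meets."""
        for label, threshold in GOAL_LEVELS.items():
            if errors_per_task <= threshold:
                return label
        return "worse than current"

    print(rate_errors(1.4))  # -> "minimum acceptable"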
14. QUESTIONNAIRE DESIGN (1)
Collect general demographic information that may be relevant
Evaluate feelings towards your product and other products
Important to design questionnaire carefully, otherwise:
Participants may find questions confusing
May not measure what you are interested in
15. QUESTIONNAIRE DESIGN (2)
“Likert scale”
Propose a statement and let people agree or disagree:
The product was easy to use:  agree 1 .. 2 .. 3 .. 4 .. 5 disagree
“Semantic differential scale”
Two opposite feelings:
Finding the right information was:  difficult -2 .. -1 .. 0 .. 1 .. 2 easy
If there are multiple choices, have participants rank order them:
Rank the choices in order of preference (with 1 being most preferred and 4 being least):
Interface #1  Interface #2  Interface #3  Interface #4
(in a real survey, describe the interfaces)
16. QUESTION DESIGN (STRATEGY)
Apply clear writing. Use simple sentences.
If participants make mistakes, the questionnaire is invalid.
Put all positive answers in the same column. Do not alternate positive and negative wording!
This website was easy to use.
It was difficult to find what I needed on this website.
(The second item is negatively worded; a scoring sketch follows below.)
Use ranges in the answer options:
Up to 1,000
1,000 – 10,000
Bigger than 10,000
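A small hypothetical sketch of scoring such a questionnaire; reverse-coding negatively worded items is standard survey practice, but the exact scheme here is an assumption:

    # Hypothetical sketch: scoring a 5-point Likert questionnaire.
    # If a negatively worded item does slip in, its score must be
    # reverse-coded before averaging.
    SCALE_MAX = 5

    responses = {
        "This website was easy to use.": 4,
        "It was difficult to find what I needed on this website.": 2,
    }
    negative_items = {
        "It was difficult to find what I needed on this website.",
    }

    scores = []
    for item, answer in responses.items():
        if item in negative_items:
            answer = SCALE_MAX + 1 - answer  # reverse-code: 2 becomes 4
        scores.append(answer)

    print(f"mean satisfaction = {sum(scores) / len(scores):.2f} / {SCALE_MAX}")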
17. STANDARD (VALIDATED) QUESTIONNAIRES
“Questionnaire for User Interface Satisfaction” (QUIS)
Chin, J.P., Diehl, V.A., Norman, K.L. (1988). Development of an Instrument Measuring User Satisfaction of the Human-Computer Interface. ACM CHI'88 Proceedings, 213-218.
http://hcibib.org/perlman/question.cgi?form=QUIS
18. OTHER QUESTIONNAIRE EXAMPLE
See the UX Book, page 446.
19. VIDEOTAPING
Useful, but very slow to analyze.
Good for demonstrating problems to developers or management.
Facilitates impact analysis.
20. “THINK ALOUD” PROTOCOLS
Get the participant to continuously verbalize their thoughts.
Encourage participants to expand on whatever seems interesting.
May need to “coach” the participant to keep talking.
Ask general questions:
“What did you expect?”
“What are you thinking now?”
Not:
“What do you think that button is for?”
“Why didn’t you click here?”
21. NUMBER OF PARTICIPANTS
About 30 for statistical studies.
As few as 5 for a usability evaluation.
Reference: https://www.nngroup.com/articles/how-many-test-users/
Testing more participants didn't result in appreciably more insights (see the sketch below).
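A sketch of the diminishing-returns model from Nielsen and Landauer that underlies the cited article; L = 0.31 is their reported average:

    # The share of usability problems found by n participants is
    # roughly 1 - (1 - L)^n, with L ~ 0.31 problems revealed per
    # participant on average (Nielsen & Landauer).
    L = 0.31

    for n in range(1, 11):
        found = 1 - (1 - L) ** n
        print(f"{n:2d} participants: ~{found:.0%} of problems found")
    # Around n = 5 the curve is already near 85%, which is why about
    # five participants per round is the usual recommendation.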
22. ETHICAL CONSIDERATIONS
No harm to the participants.
Emphasize that the product is being evaluated, not the participants.
Results of the evaluation and participants’ identities are kept confidential.
Stop the evaluation if the participant is too upset.
At the end, ask for comments and thank the participants.
23. HAWTHORNE EFFECT (1)
Definition: When people are aware that they are being observed, they change their normal behavior unintentionally.
Example: You are observing how a participant interacts with an app. The participant is informed that their actions on the app will be recorded. As a result, the participant may be extra careful not to make mistakes on the app to avoid embarrassment.
24. HAWTHORNE EFFECT (2)
Solution: Inform the participant that there’s no right or wrong way of completing their tasks during the research or experiment. Provide smaller warm-up tasks at the beginning of the session so that the participant can become comfortable with the environment.
#11:Learnability: Time to learn how to do specific tasks (at a specific proficiency).
Efficiency: (Expert) Time to execute benchmark (typical) tasks. Throughput.
Errors: Error rate per task. Time spent on errors. Error severity.
Lots of measures from web analytics: abandonment rates, completion rates, clickthroughs, % completions, etc.
Subjective satisfaction: Questionnaire.
Performance measurements: time, number of tasks completed, number of errors, severity of errors, number of times help was needed, quality of results, emotions, etc.
Decide in advance what is relevant.
Can get quantifiable, objective numbers (“Usability Engineering”).
Can instrument the software to take measurements (a small sketch follows below), or try to log results “live” or from videotape.
Some measures are available from web analytics.
Emotions and preferences come from questionnaires and from apparent frustration or happiness with the product.
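A minimal hypothetical sketch of such instrumentation; run_task and the logged fields are illustrative, not from the notes:

    # Hypothetical sketch: a timer that logs how long each benchmark
    # task takes and how many errors occurred.
    import json
    import time

    log = []

    def run_task(task_name, task_fn):
        """Run one evaluation task and record timing and error count."""
        start = time.monotonic()
        errors = task_fn()  # assumed to return the number of errors observed
        log.append({
            "task": task_name,
            "seconds": round(time.monotonic() - start, 2),
            "errors": errors,
        })

    # Example usage with a stand-in task:
    run_task("find product page", lambda: 1)
    print(json.dumps(log, indent=2))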
#14:Collect general demographic information that may be relevant
Age, sex, computer experience, etc.
Evaluate feelings towards your product and other products
Important to design questionnaire carefully
Participants may find questions confusing
May not answer the question you think you are asking
May not measure what you are interested in
#16:It is very hard to design questions that cannot be misunderstood or misread.
Use clear writing and simple sentences.
If participants make mistakes, the questionnaire is invalid.
For example, keep all positive answers in one column; do not alternate (ref: http://www.measuringu.com/positive-negative.php).
This website was easy to use.
It was difficult to find what I needed on this website.
Alternating wording to force people to pay attention doesn’t work; participant confusion outweighs any benefit.
Examples of problems:
“How big is the codebase for this project?”
300
50k SLOC
342658
2000 files
Big
Revised: have ranges instead of a textbox:
Up to 1000 LOC
1000 – 10,000
10,000 – 100,000
100,000 – 1,000,000
Bigger than 1,000,000
#19:Often useful for measuring after the evaluation.
But very slow to analyze and transcribe.
Useful for demonstrating problems to developers and management; it is compelling to see someone struggling.
Facilitates impact analysis (a small sketch follows below):
Which problems will be most important to fix?
How many participants were affected, and how much time was wasted on each problem?
But careful notetaking will often suffice when usability problems are noticed.
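One possible way to compute an impact score; the frequency-times-time formula and problem names are assumptions for illustration, not from the notes:

    # Hypothetical sketch of a simple impact analysis: weight each
    # problem by how many participants hit it and how much time it
    # cost them, then sort by that score.
    problems = [
        {"name": "confusing checkout button",
         "participants_hit": 5, "minutes_lost": 12.0},
        {"name": "hidden search field",
         "participants_hit": 2, "minutes_lost": 3.5},
    ]

    def impact(p):
        return p["participants_hit"] * p["minutes_lost"]

    for p in sorted(problems, key=impact, reverse=True):
        print(f"{p['name']}: impact score {impact(p):.1f}")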
#20:“Single most valuable usability engineering method” – Nielsen.
Get the participant to continuously verbalize their thoughts.
Find out why the participant does things: what they thought would happen, why they are stuck, frustrated, etc.
Encourage participants to expand on whatever seems interesting.
But it interferes with timings.
May need to “coach” the participant to keep talking; it is unnatural to describe what you are thinking.
Ask general questions: “What did you expect?”, “What are you thinking now?”
Not: “What do you think that button is for?”, “Why didn’t you click here?” These will “give away” the answer or bias the participant.
Alternative: have two participants and encourage discussion.
#21:About 30 for statistical studies.
As few as 5 for usability evaluation.
Can update the product after each participant to correct problems.
But can be misled by “spurious behavior” of a single person: accidents, or someone who is just not representative.
cite: https://www.nngroup.com/articles/how-many-test-users/
Five participants cannot evaluate all of the product; use different tests for different parts.
Can’t just do longer tests.
#22:No harm to the participants.
Emotional distress: highly trained people are especially concerned about looking foolish.
Emphasize that the product is being evaluated, not the participant.
Results of the evaluation and participants’ identities are kept confidential.
Stop the evaluation if the participant is too upset.
At the end, ask for comments, explain any deceptions, and thank the participants.
#25:Who runs the experiment?
Trained usability engineers know how to run a valid usability evaluation; they are called “facilitators.”
Good methodology is important: 2-3 vs. 5-6 of 8 usability problems found.
But it is useful for developers & designers to watch, and to be available if the product (i.e., the system) crashes or the participant gets completely stuck.
But you have to keep them from interfering (Randy Pausch’s strategy).
Having at least one observer (notetaker) is useful.
A common error is helping too early; don’t!
Where to evaluate?
Usability labs: cameras, two-way mirrors, specialists, and a separate observation and control room.
Should disclose who is watching.
Having a lab may increase the number of usability evaluations in an organization.
But you can usually perform an evaluation anywhere, using a portable video recorder, screen recorder, etc.
Stages of an Evaluation
Preparation
Introduction
Running the evaluation
Cleanup after the evaluation
Preparation and Introduction
Make sure the evaluation is ready to go before the participant arrives.
Introduce the observation phase; say the purpose is to evaluate the software.
Consent form.
Pre-test questionnaire.
Give instructions, including how to do a think-aloud.
Write down a script to make sure it is consistent for all participants.
Final instructions (“rules”):
Say that you won’t be able to answer questions during the session, but if questions cross their mind, they should say them aloud.
“If you forget to think aloud, I’ll say ‘Please keep talking.’”