USABILITY TESTING
- Punto Damar P -
WHY ?
" Since the limitation of data, and the
lack of theoretical foundation in
Game Design, most of games have
been developed based solely on own
experiences and intuitions of the
Designer. As the result, about 80% of
games fail on the market every
year."
( Game Software Industry Report in AlienBrain product catalog. NxN
software. 2001 )
WHY ? (2)
"However, it is necessary to point out that,
too often, video game interfaces are an
afterthought. The reason is, too many
project managers assume the most
important part of a software
development project is the programming,
and then the interface can come later. As
the result, insufficient time is assigned for
interface design which may leads to a poor
quality interface." ( Fox 2005 )
MORE INFORMATION ...
"Human Computer Interaction in
Game Design"
- Nguyen Hung -
http://www.theseus.fi/bitstream/handle/10024/43234/Nguyen_Hung.pdf?sequence=1
MORE INFORMATION ... (2)
"Quantifying the
User Experience"
- Jeff Sauro / James R. Lewis -
USABILITY TESTING ?
DEBUGGING != USABILITY TESTING
BUG FREE != USABLE
Usability testing
HOW DO WE DO IT ?
• Compare it to a specific benchmark or
goal.
• Use statistical methods to get more
precise answers.
• Get statistically significant evidence
from small samples.
HOW DO WE SET A
BENCHMARK ?
• Based on historical data obtained from
previous tests that included the task.
• Based on findings reported in published
scientific or marketing research.
• Negotiate criteria with the stakeholders who
are responsible for the product.
HOW DO WE SET A
BENCHMARK ? (2)
Some suggestions :
• The best objective basis is data from previous
usability studies of predecessor or competitive
products.
• The source of historical data should be studies of
similar types of participants, completing the same
tasks, under the same conditions.
• Negotiate with other stakeholders for the final set of
shared goals.
HOW DO WE SET A
BENCHMARK ? (3)
Some other suggestions :
• Establish some specific objectives
immediately, so you can measure
improvements.
• Revise your product in the early stages.
• Do not change reasonable goals to
accommodate an unusable product.
COMPARING A COMPLETION RATE
TO A BENCHMARK
small sample test & large sample test
SMALL SAMPLE TEST
• success / fail
• "small" sample size = the total number of
users tested is less than 30.
HERE'S THE FORMULA
( brace yourselves )
Use the exact probabilities from the binomial distribution :
$$p(x) = \frac{n!}{x!\,(n-x)!}\, p^{x} (1-p)^{(n-x)}$$
where :
x = the number of users who successfully completed the task
n = sample size
p = benchmark
LIFE HACK ..
Use Microsoft Excel's function :
BINOMDIST()
EXAMPLE 1
Eight of nine users successfully
completed a task.
Is there sufficient evidence to conclude
that at least 70% of all users would
be able to complete the same task ?
ANSWER
$$p(8) = \frac{9!}{8!\,(9-8)!}\, 0.7^{8} (1-0.7)^{(9-8)} = 0.1556$$
$$p(9) = \frac{9!}{9!\,(9-9)!}\, 0.7^{9} (1-0.7)^{(9-9)} = 0.04035$$
OR..
= BINOMDIST (8 , 9 , 0.7 , FALSE) = 0.1556
= BINOMDIST (9 , 9 , 0.7 , FALSE) = 0.04035
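For readers without Excel, the same two probabilities can be sketched in Python. This is a minimal illustration using only the standard library; the function name `binomial_pmf` is an assumption made for this example and does not come from the slides.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n trials with success rate p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Example 1: nine users tested against a 70% benchmark.
p_eight = binomial_pmf(8, 9, 0.7)  # counterpart of BINOMDIST(8, 9, 0.7, FALSE)
p_nine = binomial_pmf(9, 9, 0.7)   # counterpart of BINOMDIST(9, 9, 0.7, FALSE)

print(round(p_eight, 4), round(p_nine, 5))  # 0.1556 0.04035
```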
CONCLUSION
0.1556 + 0.04035 = 0.1960
If the true completion rate were 70%, the
probability of 8 or 9 successes out of nine
attempts would be 0.1960.
(1 - 0.1960) * 100 = 80.4%
There is an 80.4% chance that the
completion rate exceeds 70%
MID - PROBABILITY
0.5*(0.1556) + 0.04035 = 0.1182
The mid-probability counts only half of the
probability of the observed result (8 successes).
(1 - 0.1182) * 100 = 88.2%
There is an 88.2% chance that the
completion rate exceeds 70%
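The exact and mid-probability calculations differ only in how much weight the observed outcome receives. The sketch below shows both under that reading; `completion_confidence` and its parameters are hypothetical names chosen for illustration.

```python
from math import comb


def completion_confidence(successes: int, n: int, benchmark: float,
                          mid_p: bool = False) -> float:
    """Confidence that the true completion rate exceeds the benchmark."""

    def pmf(x: int) -> float:
        return comb(n, x) * benchmark**x * (1 - benchmark) ** (n - x)

    # Probability of results more extreme than the one observed.
    tail = sum(pmf(x) for x in range(successes + 1, n + 1))
    # The observed result counts fully (exact) or by half (mid-probability).
    weight = 0.5 if mid_p else 1.0
    return 1 - (weight * pmf(successes) + tail)


exact = completion_confidence(8, 9, 0.7)
mid = completion_confidence(8, 9, 0.7, mid_p=True)
print(f"exact: {exact:.1%}, mid-p: {mid:.1%}")  # exact: 80.4%, mid-p: 88.2%
```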
IMPORTANT NOTES
• The result is not suitable for production, but it is
sufficient to show that efforts are better spent on
improving other functions.
• The probability we computed is called an "exact"
probability, not because it is exactly correct, but
because it is calculated directly from the binomial
distribution rather than approximated.
• This result tends to be conservative.
LARGE SAMPLE TEST
• success / fail
• "large" sample size = at least 15 failures
and 15 successes.
HERE'S THE FORMULA
( brace yourselves again)
Use normal approximation to the binomial :
$$z = \frac{\hat{p} - p}{\sqrt{\dfrac{p(1-p)}{n}}}$$
where :
$\hat{p}$ = the observed completion rate expressed as a proportion
p = benchmark
n = number of users tested
EXAMPLE 2
85 out of 100 users were able to
successfully locate a specific product
and add it to their shopping cart.
Is there enough evidence to conclude
that at least 75% of all users can
complete this task successfully ?
ANSWER
$$z = \frac{0.85 - 0.75}{\sqrt{\dfrac{0.75(1-0.75)}{100}}} = 2.309$$
• Use NORMSDIST() to convert the z-score into a
cumulative probability.
• Final result = abs( NORMSDIST(2.309) - 1 )
= 0.0105
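The large-sample calculation can also be sketched without a spreadsheet. The example below builds the standard normal cumulative probability from `math.erf`; the names `normal_cdf` and `large_sample_test` are assumptions made for this illustration.

```python
from math import erf, sqrt


def normal_cdf(z: float) -> float:
    """Standard normal cumulative probability (what NORMSDIST returns)."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def large_sample_test(successes: int, n: int, benchmark: float):
    """Return the z-score and the one-tailed probability above it."""
    p_hat = successes / n
    z = (p_hat - benchmark) / sqrt(benchmark * (1 - benchmark) / n)
    return z, 1 - normal_cdf(z)


# Example 2: 85 of 100 users succeeded, benchmark of 75%.
z, p_value = large_sample_test(85, 100, 0.75)
print(f"z = {z:.3f}, p = {p_value:.4f}")
```

The printed values should agree with the spreadsheet result to about three decimal places.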
CONCLUSION
0.0105 * 100 = 1.05 %
There is around a 99% chance that at
least 75% of users can complete the
task.
COMPARING A TASK TIME TO A
BENCHMARK
HERE'S THE FORMULA
$$t = \frac{\ln(\text{benchmark}) - \hat{x}_{\ln}}{\dfrac{s_{\ln}}{\sqrt{n}}}$$
where :
$\hat{x}_{\ln}$ = mean of the log values
$s_{\ln}$ = standard deviation of the log values
EXAMPLE 3
11 users completed a task in a financial
application.
Task times : 90, 59, 54, 55, 171, 86, 107,
53, 79, 72, 157
Is there enough evidence that the average
task time is less than 100 seconds?
ANSWER
• Task Times =
90, 59, 54, 55, 171, 86, 107, 53, 79, 72, 157
• Log-transformed times =
4.5, 4.08, 3.99, 4.01, 5.14, 4.45, 4.67, 3.97, 4.37, 4.28, 5.06
• Mean of log times = 4.41
• Geometric mean of task times = EXP(4.41) =
82.3
• Standard deviation of log times = 0.411
• Log of benchmark (100s) = 4.61
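The summary values above can be reproduced with a few lines of standard-library Python. This is a sketch, and the variable names are chosen for illustration. Because it works from unrounded logs, the last digit may differ slightly from the rounded figures on the slide.

```python
from math import exp, log
from statistics import mean, stdev

task_times = [90, 59, 54, 55, 171, 86, 107, 53, 79, 72, 157]  # seconds

log_times = [log(t) for t in task_times]  # natural logs of each time
log_mean = mean(log_times)                # mean of the log values
log_sd = stdev(log_times)                 # sample standard deviation (n - 1)
geometric_mean = exp(log_mean)            # back-transformed to seconds
log_benchmark = log(100)                  # benchmark of 100 seconds

print(f"mean of logs = {log_mean:.2f}")
print(f"geometric mean = {geometric_mean:.1f} s")
print(f"sd of logs = {log_sd:.2f}")
print(f"log of benchmark = {log_benchmark:.2f}")
```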
ANSWER (2)
find the t-statistic value
$$t = \frac{4.61 - 4.41}{\dfrac{0.411}{\sqrt{11}}} = \frac{0.19}{0.124} = 1.53$$
Use the probability on 10 degrees of freedom
(n-1);
TDIST(1.53,10,1) = 0.0785
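As a rough cross-check of the t-test, the sketch below computes the statistic from the raw task times and approximates the one-tailed probability by integrating the t density with Simpson's rule, since the Python standard library has no t-distribution function. All function names are hypothetical. With unrounded inputs the statistic comes out near 1.57 rather than 1.53, so the probability is a little lower than the TDIST value shown above.

```python
from math import gamma, log, pi, sqrt
from statistics import mean, stdev


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def one_tailed_p(t: float, df: int, steps: int = 2000) -> float:
    """P(T > t), integrating the density over [0, t] with Simpson's rule."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


def task_time_test(times, benchmark):
    """t-statistic and one-tailed probability for log times vs. a benchmark."""
    logs = [log(x) for x in times]
    t = (log(benchmark) - mean(logs)) / (stdev(logs) / sqrt(len(logs)))
    return t, one_tailed_p(t, len(logs) - 1)


times = [90, 59, 54, 55, 171, 86, 107, 53, 79, 72, 157]
t_stat, p_value = task_time_test(times, benchmark=100)
print(f"t = {t_stat:.2f}, one-tailed p = {p_value:.3f}")
```

Calling `one_tailed_p(1.53, 10)` plays the role of `TDIST(1.53,10,1)` in this sketch.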
CONCLUSION
The probability of seeing an average time of 82.3
seconds if the actual population time is greater
than 100 seconds is around 7.85%
OR
We can be 92.15% confident that users can
complete this task in less than 100 seconds.
IMPORTANT NOTES
• What is the geometric mean?
The best estimate of the middle task time for
small-sample usability data (fewer than 25 users).
• How about large-sample usability data?
Use the sample median.
(won't be explained here)
TOOLS
http://pencil.evolus.vn/
https://marvelapp.com/ https://proto.io/
http://www.invisionapp.com/
FIXING COST
Source : Theo Allen
UNIFIED PROCESS MODEL
https://en.wikipedia.org/wiki/Unified_Process
THANK YOU
Punto Damar P.
facebook.com/puntodamar
@ puntodamar
BikinGame.com
