Effective Fault-Localization Techniques for Concurrent Software

Sangmin Park
08/06/2014

Committee: Rich Vuduc, Mayur Naik, Alex Orso, Milos Prvulovic, Mark Grechanik
(Mary Jean Harrold)
Impact of Concurrency Bugs
Northeast Blackout; Facebook IPO Glitch
Debugging Concurrency Bugs

Concurrency bugs are rated as the most difficult type of bug.

Survey at Microsoft [Godefroid08]:
• 72% rated concurrency bugs ‘very hard’ or ‘hard’ to debug
• 83% rated concurrency bugs ‘most severe’ or ‘severe’

StackOverflow, “What is the hardest bug?” (http://bit.ly/sohardest):
#1: Concurrency bugs (40%, 101/255)
Debugging Concurrency Bugs

Concurrency bugs are difficult to locate, understand, and fix.

Difficult to locate: “Intermittently I get the following error. I would be grateful if anyone could shed any light on this issue.” (BugID 27315)

Difficult to locate and understand: “I’ve noticed and reproduced crashes with the following stack trace. … I have no clues on why this crash occurs.” (MySQL BugID 3596)

Difficult to fix: a survey found that 40% of initial patches to concurrency bugs are themselves buggy, the highest ratio among all software bugs [Yin, FSE11].
Challenges

Debugging concurrent programs [McDowell 89]: non-determinism and complex state changes.
Debugging Process

Software + Testcase → Fault Localization → Fault Understanding → Fault Correction
Debugging Process

Software + Testcase → Fault Localization → Fault Understanding → Fault Correction

Before the proposal:
1. Localize single-variable faults [ICSE 2010]
2. Localize multi-variable faults [ICST 2012]
3. Provide fault explanation [ISSTA 2013]
After the proposal:
4. User study
Thesis Statement

Dynamic fault-localization techniques can assist developers in locating and understanding non-deadlock concurrency bugs by identifying suspicious memory-access patterns and providing calling contexts and methods.
Concurrency Bugs

Bug Type             Ratio   Bug Cause
Deadlock             30%     Mutual exclusion, hold/wait, no preemption, circular wait
Order violation      22%     Memory access orders
Atomicity violation  47%     Memory access orders
Others               1%

* Learning from Mistakes: A Comprehensive Study on Real World Concurrency Bug Characteristics [Lu08].
Order Violation
* https://bugzilla.mozilla.org/show_bug.cgi?id=61369

Thread 1:
  void init(…) {
    mThread = CreateThread();    // W
  }

Thread 2:
  void foo(…) {
    mState = mThread->State;     // R
  }

If the read (R) in foo() runs before the write (W) in init(), mThread is still uninitialized.
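For illustration, here is a minimal Java sketch of the same W-R order violation; the class and member names are made up for this example, not taken from the Mozilla code.

// Hypothetical Java analogue of the order violation above:
// thread 2 may read `worker` before thread 1 has written it.
public class OrderViolationDemo {
    private static Thread worker; // shared, no synchronization

    public static void main(String[] args) throws InterruptedException {
        Thread t1 = new Thread(() -> {
            worker = new Thread(() -> {});         // W: plays the role of init()
        });
        Thread t2 = new Thread(() -> {
            System.out.println(worker.getState()); // R: plays the role of foo();
        });                                        // NullPointerException if R runs first
        t1.start();
        t2.start();
        t1.join();
        t2.join();
    }
}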
Atomicity Violation
* https://bugzilla.mozilla.org/show_bug.cgi?id=73291

char* str;   // shared vars
int length;  // locked by L

Thread 1:
  …
  lock(L); lptr = str; unlock(L);      // R(str)
  …
  lock(L); llen = length; unlock(L);   // R(length)

Thread 2:
  …
  lock(L); str = newStr; unlock(L);          // W(str)
  …
  lock(L); length = newLength; unlock(L);    // W(length)

Every access is locked, but Thread 2's two writes (W, W) can interleave between Thread 1's two reads (R, R), so lptr and llen can reflect different versions of the (str, length) pair.
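A Java sketch of the same shape may make the violation easier to see; the field names mirror the C code above, but the class itself is illustrative.

// Illustrative Java version of the atomicity violation above: every access
// to str/length is locked, but the *pair* of accesses is not atomic.
public class AtomicityViolationDemo {
    private final Object lock = new Object();
    private char[] str = "hello".toCharArray();
    private int length = str.length;

    int read() {                                  // Thread 1's role
        char[] lptr;
        int llen;
        synchronized (lock) { lptr = str; }       // R(str)
        // Thread 2 may run both of its writes right here.
        synchronized (lock) { llen = length; }    // R(length)
        return llen - lptr.length;                // non-zero => saw a torn pair
    }

    void write(char[] newStr) {                   // Thread 2's role
        synchronized (lock) { str = newStr; }           // W(str)
        synchronized (lock) { length = newStr.length; } // W(length)
    }
}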
Patterns for Concurrency Bugs

Order violation:     R1(x) W2(x)
Atomicity violation: R1(x) W2(x) W2(y) R1(y)

(Notation: R = read, W = write; the subscript is the thread; the accessed shared variable is in parentheses.)
Patterns for Concurrency Bugs

Type                                       Memory Access Patterns
Order violation                            R1(x) W2(x);  W1(x) R2(x);  W1(x) W2(x)
Atomicity violation (one variable)         R1(x) W2(x) R1(x);  W1(x) W2(x) R1(x);  R1(x) W2(x) W1(x);  W1(x) R2(x) W1(x);  W1(x) W2(x) W1(x)
Atomicity violation (multiple variables)   Nine four-access patterns over two variables x and y, e.g., R1(x) W2(x) W2(y) R1(y); see the Unicorn paper [ICST 2012] for the full list

* The patterns were identified by previous work [Lu06, Vaziri06, Hammer08].

We developed fault-localization techniques for these patterns, as sketched below.
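As a concrete illustration of what detecting such a pattern means, the sketch below scans a recorded access trace for the R1(x) W2(x) R1(x) pattern. The Access type and the trace format are inventions for this example, not the actual tools' data structures; a real detector would also match non-adjacent accesses within a window rather than only adjacent triples.

import java.util.List;

// Hypothetical trace scanner for one single-variable pattern.
// kind is 'R' or 'W'; thread is a thread id; var names the shared variable.
record Access(int thread, char kind, String var) {}

class PatternScanner {
    // Flags R1(x) W2(x) R1(x): two reads of x by one thread with an
    // intervening write of x by a different thread.
    static boolean hasUnserializableRWR(List<Access> trace) {
        for (int i = 0; i + 2 < trace.size(); i++) {
            Access a = trace.get(i), b = trace.get(i + 1), c = trace.get(i + 2);
            if (a.kind() == 'R' && b.kind() == 'W' && c.kind() == 'R'
                    && a.var().equals(b.var()) && b.var().equals(c.var())
                    && a.thread() == c.thread() && a.thread() != b.thread()) {
                return true;
            }
        }
        return false;
    }
}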
Prior Work

1. Localize single-variable faults [ICSE 2010]
2. Localize multi-variable faults [ICST 2012]
3. Provide fault explanation [ISSTA 2013]
4. User study
Fault Localization for Single-Variable Faults: FALCON

Pipeline: Software + Testcase → Falcon [dynamic pattern detection → single-variable patterns → statistical fault localization] → ranked list for single-variable bugs:
1. R-W-R
2. R-W-W
3. R-W-W
4. W-W-W
5. R-W-W
…

• Pros: effective in ranking patterns
• Cons: misses multi-variable faults (30% of non-deadlock concurrency bugs [Lu 08])
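The statistical step can be pictured as follows: each detected pattern is scored by how much more often it appears in failing runs than in passing runs. The metric below (share of failing occurrences) is a common simple choice and only stands in for Falcon's actual formula; the pattern names and counts are invented.

import java.util.*;

// Sketch of pattern ranking from pass/fail occurrence counts.
// counts maps a pattern id to {count in failing runs, count in passing runs}.
class SuspiciousnessRanker {
    static List<Map.Entry<String, Double>> rank(Map<String, int[]> counts) {
        List<Map.Entry<String, Double>> ranked = new ArrayList<>();
        for (var e : counts.entrySet()) {
            double fail = e.getValue()[0], pass = e.getValue()[1];
            ranked.add(Map.entry(e.getKey(), fail / (fail + pass)));
        }
        ranked.sort(Map.Entry.<String, Double>comparingByValue().reversed());
        return ranked;
    }

    public static void main(String[] args) {
        Map<String, int[]> counts = Map.of(
                "R-W-R on balance", new int[]{9, 1},   // mostly in failing runs
                "W-W-W on log",     new int[]{2, 8});  // mostly in passing runs
        rank(counts).forEach(e ->
                System.out.println(e.getValue() + "  " + e.getKey()));
    }
}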
Fault Localization for Multiple-Variable Faults: UNICORN

Pipeline: Software + Testcase → Unicorn [dynamic pair detection → pairs → pattern combination → patterns → statistical fault localization] → ranked list for single-/multi-variable bugs:
1. R-W-R
2. R-W-W-W
3. R-W-W-R
4. W-W-W-W
5. R-W
…

• Pros: effective in ranking patterns
• Cons: misses contextual information
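The pair-to-pattern combination step, in a deliberately simplified form: two conflicting access pairs whose thread roles mirror each other can be stitched into a four-access multi-variable candidate. The string encoding and the matching rule here are assumptions for illustration, not Unicorn's actual algorithm.

import java.util.Optional;

// Hypothetical combination of two access pairs such as
// "R1(x) W2(x)" and "W2(y) R1(y)" into "R1(x) W2(x) W2(y) R1(y)".
class PatternCombiner {
    static Optional<String> combine(String pair1, String pair2) {
        String[] a = pair1.split(" ");
        String[] b = pair2.split(" ");
        char t1 = a[0].charAt(1);   // thread of pair1's first access
        char t2 = a[1].charAt(1);   // thread of pair1's second access
        // Combine only when the second pair hands control back: 2 ... 1
        if (t1 != t2 && b[0].charAt(1) == t2 && b[1].charAt(1) == t1) {
            return Optional.of(pair1 + " " + pair2);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(combine("R1(x) W2(x)", "W2(y) R1(y)")); // combined
        System.out.println(combine("R1(x) W2(x)", "R1(y) W2(y)")); // empty
    }
}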
Fault Explanation: GRIFFIN

Pipeline: Software + Testcase → Unicorn fault localization → patterns per execution → Griffin [pattern clustering → clustered patterns → context reconstruction] → bug graphs (memory accesses + calling stacks + suspicious methods).

Example bug graph:

Thread 1                      Thread 2
150 Foo()
270 int getS()
271   return s;        // R
                              851 b.s += c.s;   // W
                              852 b.a += c.a;   // W
152 Foo()
680 void Bar()
681   a.s = b.s;       // R
682   a.a = b.a;       // R

Griffin is effective in clustering memory accesses and locating the bug at method level.
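One way to picture the pattern-clustering step (a sketch of the idea, not Griffin's actual algorithm): patterns whose memory-access sets overlap are merged transitively into one cluster, so each cluster approximates one bug. The map keys and the "line:var" access encoding are assumptions for this example.

import java.util.*;

// Hypothetical clustering: group patterns transitively by shared accesses.
// accessesOf maps a pattern id to the set of accesses (e.g., "681:b.s") it contains.
class PatternClusterer {
    static List<Set<String>> cluster(Map<String, Set<String>> accessesOf) {
        List<Set<String>> patternClusters = new ArrayList<>();
        List<Set<String>> accessClusters  = new ArrayList<>();
        for (var e : accessesOf.entrySet()) {
            Set<String> patterns = new HashSet<>(Set.of(e.getKey()));
            Set<String> accesses = new HashSet<>(e.getValue());
            // Absorb every existing cluster that shares an access.
            for (int i = accessClusters.size() - 1; i >= 0; i--) {
                if (!Collections.disjoint(accessClusters.get(i), accesses)) {
                    accesses.addAll(accessClusters.remove(i));
                    patterns.addAll(patternClusters.remove(i));
                }
            }
            accessClusters.add(accesses);
            patternClusters.add(patterns);
        }
        return patternClusters;
    }
}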
Usefulness
FALCON, UNICORN, GRIFFIN (recap of the three pipelines)
Goal

To determine whether these fault-localization techniques (FALCON, UNICORN, GRIFFIN) help developers in understanding and fixing concurrency bugs.

All 3 techniques were implemented as Eclipse tools.
Debugging Tools

Tool               Output                                                  Comments
Tracer (baseline)  Dump of shared-memory accesses from a failing execution • Based on ConcurrencyExplorer (at Microsoft)
                                                                           • A tool used for debugging*
Unicorn            Ranked list of memory-access patterns                   • Based on Unicorn
                                                                           • Unicorn subsumes Falcon
Griffin            List of memory accesses with calling context            • Based on Griffin

* Other tools (e.g., TIE, Jive, Jove) focus on visualizing thread interactions.
Tool: Tracer

Output: dump of memory accesses, with a thread selector and, per access, the thread, source location, variable, …

Compared to ConcurrencyExplorer:
• Same output (dump of memory accesses)
• Same outlook (tool + editor)
Tool: Unicorn

Output: ranked list of memory-access patterns (e.g., an R-W-W pattern).

Compared to Tracer:
+ memory patterns
- thread identifier
Tool: Griffin

Output: list of memory accesses with calling context (interleaving view).

Compared to Unicorn:
+ clustered memory accesses
+ suspicious methods
+ calling context
Hypotheses

• H1 (understanding): Unicorn > Tracer
  ✦ Unicorn provides a summary of bugs
• H2 (understanding): Griffin > Unicorn, Tracer
  ✦ Griffin provides more context information
• H3 (fix): Unicorn, Griffin > Tracer
  ✦ Understand better => fix better
Study Setup

• 3 subject programs
• 32 developers
• Protocol
Study Setup

3 Java programs:
- Bank Account (100 LoC)
- Shop (300 LoC)
- List (25 KLoC)
Subject 1: Bank Account

Two users, each with a balance of $100, concurrently perform Deposit $300, Withdraw $100, and Transfer $100. The expected balance sequence is $100 → $400 → $300 → $300 for each user, but under a racy interleaving the final balances come out as $200 and $400 instead.

• Size: 100 LoC
• Difficulty: Easy
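The underlying defect is the classic lost update. A minimal sketch, simplified to two concurrent deposits and with assumed names:

// Sketch of the Bank Account race: balance += amount is a read-modify-write,
// so two concurrent updates can both read the old balance and one is lost.
public class AccountDemo {
    private int balance = 100;

    void deposit(int amount) {
        int old = balance;        // R(balance)
        balance = old + amount;   // W(balance): may overwrite a concurrent update
    }

    public static void main(String[] args) throws InterruptedException {
        AccountDemo shared = new AccountDemo();
        Thread u1 = new Thread(() -> shared.deposit(300));
        Thread u2 = new Thread(() -> shared.deposit(300));
        u1.start(); u2.start();
        u1.join();  u2.join();
        System.out.println(shared.balance); // 700 expected; 400 if one update is lost
    }
}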
Subject 2: Shop

A Supplier puts items into a shared Shop (PutItem); multiple Customers take items (GetItem).
Bug: the program crashes with an exception at Shop.

• Size: 300 LoC
• Difficulty: Medium
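Such a crash is typically produced by a check-then-act race. The sketch below shows one plausible shape of such a Shop; it is illustrative, not the study's exact code.

import java.util.*;

// Hypothetical Shop with a check-then-act atomicity violation: the emptiness
// check and the removal are each atomic, but not atomic together.
class ShopDemo {
    private final List<String> items =
            Collections.synchronizedList(new ArrayList<>());

    void putItem(String item) {            // Supplier
        items.add(item);
    }

    String getItem() {                     // Customer
        if (!items.isEmpty()) {            // check
            return items.remove(0);        // act: another customer may have
        }                                  // taken the last item in between,
        return null;                       // causing IndexOutOfBoundsException
    }
}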
Subject 3: List

Initially, three synchronized lists A, B, and C are created. Threads then concurrently perform B.add(item) (three times), C.add(item) (three times), B.clear(), and A.addAll(B). Under a racy interleaving, A ends up containing a null element among the items.

• Size: 25 KLoC
• Difficulty: Hard
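The mechanism behind the depicted null, made explicit in a sketch: the study's bug sits inside library code, but this hypothetical copy loop exposes the same non-atomic size-then-elements read.

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical non-atomic bulk copy: the size of B is read first, then the
// elements, so a concurrent B.clear() leaves null padding in A.
class ListDemo {
    static void addAllNonAtomic(List<Object> a, List<Object> b) {
        int n = b.size();                            // R: size of B
        for (int i = 0; i < n; i++) {                // B may shrink meanwhile
            a.add(i < b.size() ? b.get(i) : null);   // R: element of B (or gone)
        }
    }

    public static void main(String[] args) throws InterruptedException {
        List<Object> a = Collections.synchronizedList(new ArrayList<>());
        List<Object> b = Collections.synchronizedList(new ArrayList<>());
        for (int i = 0; i < 100_000; i++) b.add("Item");

        Thread t1 = new Thread(() -> addAllNonAtomic(a, b));
        Thread t2 = new Thread(b::clear);
        t1.start(); t2.start();
        t1.join();  t2.join();
        System.out.println(a.contains(null));        // may print true
    }
}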
Study Setup

32 developers:
- Graduate students
- Development experience: 2 to 30 years (median 11)
- Concurrency experience: 7 beginners, 10 experts
Study Design

Factorial design over subjects S1-S3 and tools T1-T3; each participant sees every subject and every tool once, in one of six assignments:
1) S1-T1, S2-T2, S3-T3
2) S1-T1, S2-T3, S3-T2
3) S1-T2, S2-T1, S3-T3
4) S1-T2, S2-T3, S3-T1
5) S1-T3, S2-T1, S3-T2
6) S1-T3, S2-T2, S3-T1
Study Setup

Protocol:
- 1 hr 30 min = 20 min tutorial + 20 min per task + 10 min buffer
- Task = debug + survey
- 5 surveys
Surveys

Background:
• Programming experience
• Concurrency experience

For each task:
• Usefulness
• Understanding
• Fix

Final:
• Rank of the tools
• General feedback

Evaluation

Scores (1 to 5 scale):
• Usefulness
• Understanding: graded
• Fix: ranking-based

Hypothesis testing:
• For each task, we performed an unpaired t-test between users of different tools
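For concreteness, a per-task comparison could be computed as below. The scores are invented, and the Apache Commons Math call is just one standard way to run an unpaired (Welch) two-sample t-test; the study's real data is at the URLs listed later.

import org.apache.commons.math3.stat.inference.TTest;

// Sketch of the hypothesis test: compare scores of two tool groups with an
// unpaired t-test. Sample values here are made up.
public class TTestDemo {
    public static void main(String[] args) {
        double[] unicornScores = {4, 3, 5, 4, 3, 4, 2, 4, 5, 3};
        double[] tracerScores  = {3, 2, 4, 3, 3, 2, 3, 4, 2, 3};

        double p = new TTest().tTest(unicornScores, tracerScores); // two-sided p-value
        System.out.printf("p = %.3f (%ssignificant at 0.05)%n",
                p, p < 0.05 ? "" : "not ");
    }
}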
Overall Result

Score Type     Hypothesis           Task 1: Bank Account   Task 2: Shop   Task 3: List
Usefulness     Griffin > Tracer      0.67                   2.53           2.17
Usefulness     Griffin > Unicorn    -0.14                   0.31           1.44
Usefulness     Unicorn > Tracer      0.81                   2.22           0.72
Understanding  Griffin > Tracer      0.96                   0.18           0.98
Understanding  Griffin > Unicorn     0.62                   0.07           1.11
Understanding  Unicorn > Tracer     -0.07                   0.11          -0.13
Fix            Griffin > Tracer      0.24                   0.64           0.42
Fix            Griffin > Unicorn     0.51                   0.09           1.11
Fix            Unicorn > Tracer     -0.29                   0.56          -0.69

* Numbers = mean difference (range -4 to 4); bold in the slides = statistically significant (p < 0.05).
Hypothesis Testing

• H1 (understanding): Unicorn > Tracer
• H2 (understanding): Griffin > Unicorn, Tracer
• H3 (fix): Unicorn, Griffin > Tracer
Analysis by Tool Preference

• How many participants rate Griffin as the best tool?
• Did these participants actually understand bugs better?
Results by Tool Preference

Task                  Score Type     Group-T (2)   Group-U (7)   Group-G (21)
Task 1: Bank Account  Understanding  3.0           3.75          3.78
                      Fix            2.0           2.37          3.05
Task 2: Shop          Understanding  3.33          4.12          4.26
                      Fix            2.33          3.75          4.0
Task 3: List          Understanding  2.66          2.75          3.05
                      Fix            1.33          2.87          2.68

* Numbers in headers = number of participants; numbers in other cells = average scores.

• How many participants rate Griffin as the best tool? 21 (70%)
• Did these participants actually understand bugs better? Yes
Discussion: Tool Usage

Observed usage: Griffin to track the bug, Tracer to confirm it.

“There are three dimensions to think about: Time vs. Thread vs. Context. Griffin showed these quite effectively. However, the other two tools lacked in these aspects.”
• “Tracer might be useful for simple code. However, overall it won’t scale in real-life scenarios because most programs are complex.”
• “Tracer wasn’t very useful on this task because there were too many threads and instructions to keep track of.”
Discussion: Improvements

Suggested improvements: fix advice, interactive debugging, visual improvement.
Future Work

Software + Testcase → Fault Localization → Fault Understanding → Fault Correction

• Fault localization: increase bug coverage, reduce overhead, use multiple inputs
• Fault understanding: improve visualization, support interactive debugging
• Fault correction: provide fix advice
Publicly Available Data

Data              Location
Unicorn           http://www.cc.gatech.edu/~sangminp/unicorn
Griffin           http://www.cc.gatech.edu/~sangminp/griffin
Subject Programs  http://www.cc.gatech.edu/~sangminp/bugs
User Study        http://www.cc.gatech.edu/~sangminp/concurrency-study
Contributions
(Recap of the hypotheses, the publicly available data, and the FALCON, UNICORN, and GRIFFIN pipelines.)
Backup Slides
Why did you implement Tracer as an Eclipse plugin?

• To minimize the effect of UI:
  • Same IDE: Eclipse
  • Similar UI for all debuggers: similar colors, list view
  • Language difference: C# vs. Java
Concurrency Explorer
(Screenshot: shared-memory dump and editor windows.)
ConcurrencyExplorer vs. Tracer

             ConcurrencyExplorer                            Tracer
Output       Memory dump (source line, thread, object ID)   Memory dump (source line, thread)
UI elements  Window for dump; editor for source             Window for dump; editor for source
IDE          Visual Studio                                  Eclipse

* ConcurrencyExplorer doesn’t show values of variables.
Tools for Concurrent S/W

Jive and Jove
• Link: http://cs.brown.edu/~spr/research/visjove.html
• Show thread interactions
• Not focused on showing bugs
Tools for Concurrent S/W

TIE
• Link: https://www.youtube.com/watch?v=kbNXlLAkPgU
• Shows thread interactions
• Not focused on showing bugs
Tools for Concurrent S/W

Concurrency Visualizer (for performance): the snapshot shows inter-thread dependencies.
Tutorial

• Tutorial on Java concurrency
• Bugs: order/atomicity violations
• Fix strategies
• Example program
• Demo on debugging tools
Survey Links

• Background: https://docs.google.com/forms/d/1xthnR5Ibw8q1qrqn-WrBti5zjFVD1b-nYZZ1S4RurMM/viewform
• Task: https://docs.google.com/forms/d/1SNlg4anVAZmR99EZjErvwG0rnW2nLrLYpwqZjNfQ-yc/viewform
• Final: https://docs.google.com/forms/d/1L3_Intjm6oSwoZp3wWfHIv8Z2jcpNcbbn1nPeuiv1b8/viewform
Fix Strategies
Study Design

Assignment                Beginners  Experts
1) S1-T1, S2-T2, S3-T3        1         3
2) S1-T1, S2-T3, S3-T2        1         1
3) S1-T2, S2-T1, S3-T3        1         1
4) S1-T2, S2-T3, S3-T1        1         2
5) S1-T3, S2-T1, S3-T2        1         2
6) S1-T3, S2-T2, S3-T1        2         1

• Setup: random distribution of participants
• Results: no significant score differences between groups
Factorial Design (https://explorable.com/factorial-design)

• “Factorial designs are extremely useful to psychologists and field scientists as a preliminary study, allowing them to judge whether there is a link between variables, whilst reducing the possibility of experimental error and confounding variables.”
• “The main disadvantage is the difficulty of experimenting with more than two factors, or many levels.”
Eclipse Navigation Data

               Task 1: Bank Account   Task 2: Shop   Task 3: List
Tracer users   54.5                   46.11          75.1
Unicorn users  63.4                   60.6           62.33
Griffin users  59.11                  69             39.11

• Numbers = average navigation actions (clicks + keyboard)
• For Task 3, Griffin users navigated less, but the difference is not statistically significant.
Why is Fixing more difficult?

• Many strategies:
  • Adding a lock
  • Adding a condition (if, while)
  • Switch statements, …
• Many decisions in one strategy:
  • Where should we add a lock?
  • Should we use an existing lock or add a new one?
• Fixes can become bugs (e.g., adding a new lock -> deadlock), as sketched below
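The last point is worth a sketch: adding a new lock can interact with an existing one and turn a data-race fix into a deadlock. The class and lock names are illustrative.

// Hypothetical fix gone wrong: pathA and pathB acquire the old and the newly
// added lock in opposite orders, so two threads can deadlock.
public class FixIntroducesDeadlock {
    private final Object oldLock = new Object();
    private final Object newLock = new Object(); // lock added by the "fix"

    void pathA() {
        synchronized (oldLock) {
            synchronized (newLock) { /* patched critical section */ }
        }
    }

    void pathB() {
        synchronized (newLock) {     // opposite acquisition order
            synchronized (oldLock) { /* another critical section */ }
        }
    }
}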
Limitations

• Participants: size, quality
• Factorial design
• Debugging: no editing
Related Work

• Empirical studies for sequential bugs/debuggers
  • Weiser; Whyline; Parnin & Orso
• Empirical studies for concurrency
  • For writing faster code
  • For education
• Empirical studies for concurrency bugs/debuggers
  • Sadowski and Yi’s study
