Fighting Software Inefficiency Through Automated Bug Detection
Shan Lu
University of Chicago
A little bit about myself
My life :)
I learned a lot from …
Bug Detection Background
Fighting software bugs is crucial
•  Software is everywhere
  – http://en.wikipedia.org/wiki/List_of_software_bugs
•  Software bugs are widespread and costly
  – Lead to 40% of system downtime [Blueprints 2000]
  – Cost $312 billion per year [Cambridge 2013]
What is your favorite bug?
•  How many of you have been bothered by bugs?
How to detect bugs?
•  Study & understand real-world bugs
•  Discover patterns of common bugs
  – Source code level
  – Binary code level
  – …
•  Design pattern-matching program analysis
  – Static analysis
  – Dynamic analysis
  – …
Bug detection examples
– Memory bug detection
  •  Pattern: over-bound writes, …
  •  Detection: check memory accesses & …
  •  Example: P = malloc(10); P[100] = 'a';
– Concurrency bug detection
  •  Pattern: data races, atomicity violations, …
  •  Detection: check memory accesses & …
  •  Example: Thread 1: if (P) *P = 'a';   Thread 2: P = NULL;
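One way to picture "check memory accesses" for the concurrency example is a checker over a recorded access trace. The sketch below is illustrative only: it reads the fragment above as one thread running `if (P) *P = 'a';` while another runs `P = NULL;`, and it flags adjacent local-remote-local access triples on a single shared variable that cannot be reordered into a serial execution (read-write-read, write-write-read, write-read-write, read-write-write). The class and method names are hypothetical, and looking only at adjacent triples is a simplifying assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class AtomicityTraceSketch {
    /** One access to a single shared variable: which thread, and read ('R') or write ('W'). */
    static final class Access {
        final int thread;
        final char op;

        Access(int thread, char op) {
            this.thread = thread;
            this.op = op;
        }
    }

    /**
     * Returns the start index of every adjacent (local, remote, local) triple
     * whose read/write pattern is not equivalent to any serial order.
     */
    static List<Integer> findSuspiciousTriples(List<Access> trace) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i + 2 < trace.size(); i++) {
            Access first = trace.get(i);
            Access middle = trace.get(i + 1);
            Access last = trace.get(i + 2);
            boolean interleaved =
                    first.thread == last.thread && first.thread != middle.thread;
            if (!interleaved) {
                continue;
            }
            String pattern = "" + first.op + middle.op + last.op;
            if (pattern.equals("RWR") || pattern.equals("WWR")
                    || pattern.equals("WRW") || pattern.equals("RWW")) {
                hits.add(i);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Thread 1 reads P in the check, thread 2 writes P, thread 1 reads P again to dereference it.
        List<Access> trace = Arrays.asList(
                new Access(1, 'R'), new Access(2, 'W'), new Access(1, 'R'));
        System.out.println(findSuspiciousTriples(trace)); // prints [0]
    }
}
```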
Why Performance-Bug Detection?
How did this start?
•  One of our bug detectors is strangely slow
  – Why not profiling?
    •  Lots of noise in profiling
    •  Measuring cost, not inefficiency
•  Am I able to do this?
•  Where should I start?
An Empirical Study of Real-World Performance Bugs
Are there performance bugs?
Are they important?
What types of performance bugs are there?
Understanding and detecting real-world performance bugs [PLDI '12]
Methodology
Application  Software Type                            Language  MLOC  Bug DB History  Tags              # Bugs
Apache       Command-line Utility + Server + Library  C/Java    0.45  13 y            N/A               25
Chrome       GUI Application                          C/C++     14.0  4 y             N/A               10
GCC          Compiler                                 C/C++     5.7   13 y            Compile-time-hog  10
Mozilla      GUI Application                          C++/JS    4.7   14 y            perf              36
MySQL        Server Software                          C/C++/C#  1.3   10 y            S5                28
Total: 109
Findings
•  Are there performance bugs?
  – Yes
•  Are they important?
  – Some are
•  What types of performance bugs are there?
  – What are their root causes?
  – Where are they typically located?
  – How are they usually fixed?
Bug Examples
Patch for Apache-Ant Bug 34464:
+ int i = -k.length();
- while (s.indexOf(k) == -1)
+ while (i++<0 ||
+        s.substring(i).indexOf(k)==-1)
    {s.append(nextchar());}
Mozilla Bug 490742 & Patch:
  for (i = 0; i < tabs.length; i++) {
    …
    tabs[i].doTransact();
  }
+ doAggregateTransact(tabs);
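The idea behind the Apache-Ant patch above can be sketched in runnable form. This is not the actual Ant code: the input is modeled as a `String` instead of `nextchar()`, and the names `readUntilSlow`, `readUntilFast`, and `windowStart` are hypothetical. The slow version rescans the whole buffer after every appended character (quadratic work overall), while the fast version only compares the last `k.length()` characters, which is the only place a new match can end.

```java
class AntSearchSketch {
    /** Buggy shape: rescans the whole buffer after every appended character. */
    static String readUntilSlow(String input, String k) {
        StringBuilder s = new StringBuilder();
        int pos = 0;
        while (pos < input.length() && s.indexOf(k) == -1) {
            s.append(input.charAt(pos++)); // stands in for nextchar()
        }
        return s.toString();
    }

    /** Patched idea: only the tail of the buffer can complete a new match. */
    static String readUntilFast(String input, String k) {
        StringBuilder s = new StringBuilder();
        int pos = 0;
        int windowStart = -k.length(); // negative while the buffer is shorter than k
        while (pos < input.length()
                && (windowStart < 0 || !s.substring(windowStart).equals(k))) {
            s.append(input.charAt(pos++));
            windowStart++; // keep the window aligned with the last k.length() characters
        }
        return s.toString();
    }

    public static void main(String[] args) {
        System.out.println(readUntilSlow("abcdef", "cd")); // prints abcd
        System.out.println(readUntilFast("abcdef", "cd")); // prints abcd
    }
}
```

Both methods stop at the same point; the difference is only in how much of the buffer each check touches.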
What is next?
Can we detect performance bugs?
What “pattern” did we find?
A Patch-Based Inefficiency Detector
Static inefficiency patterns exist
•  Statically checkable inefficiency patterns exist
Patch for Apache-Ant Bug 34464:
+ int i = -k.length();
- while (s.indexOf(k) == -1)
+ while (i++<0 ||
+        s.substring(i).indexOf(k)==-1)
    {s.append(nextchar());}
Mozilla Bug 490742 & Patch:
  for (i = 0; i < tabs.length; i++) {
    …
    tabs[i].doTransact();
  }
+ doAggregateTransact(tabs);
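The shape of the Mozilla patch above, one aggregate transaction instead of one transaction per tab, can be illustrated with a small sketch. Everything here is hypothetical: `Store` is a stand-in for whatever backing store makes each commit expensive, and tabs are modeled as plain strings. The sketch only assumes that per-transaction cost dominates.

```java
import java.util.Arrays;
import java.util.List;

class BatchTransactSketch {
    /** Hypothetical store that counts how many updates and commits it has seen. */
    static final class Store {
        int updates = 0;
        int commits = 0;

        void update(String tab) {
            updates++;
        }

        void commit() {
            commits++; // assumed to be the expensive step
        }
    }

    /** Original shape: every tab gets its own transaction. */
    static void transactEach(Store store, List<String> tabs) {
        for (String tab : tabs) {
            store.update(tab);
            store.commit();
        }
    }

    /** Patched shape: all tabs share one aggregate transaction. */
    static void transactAggregate(Store store, List<String> tabs) {
        for (String tab : tabs) {
            store.update(tab);
        }
        store.commit();
    }

    public static void main(String[] args) {
        List<String> tabs = Arrays.asList("tab1", "tab2", "tab3");
        Store perTab = new Store();
        transactEach(perTab, tabs);
        Store batched = new Store();
        transactAggregate(batched, tabs);
        System.out.println(perTab.commits + " vs " + batched.commits); // prints 3 vs 1
    }
}
```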
How to get these patterns?
•  Manually extract from patches
[Figure labels: Not Contain Rules; Dynamic Rules; LLVM Checkers; Python Checkers]
Detection Results
•  17 checkers find PPPs in original buggy versions
•  13 checkers find 332 PPPs in latest versions
[Figure labels: Found by cross-application checking; Inherits from buggy versions; Introduced later]
* PPP: Potential Performance Problem
Inefficiency pattern based bug detection is promising!
What is next?
Do we have to manually specify rules?
Can we build generic detectors?
Toddler
A dynamic and generic detector targeting inefficient nested loops
Toddler: Detecting Performance Problems via Similar Memory-Access Patterns [ICSE '13]
What are generic inefficiency patterns?
Previous example
Apache-Ant Bug 34464:
  while (s.indexOf(k) == -1)
    {s.append(nextchar());}
Password: abcdefg
Password: abcdefgh
Password: abcdefghi
What is the pattern?
•  What types of nested loops are likely inefficient?
  – Many inner loops are similar to each other
    •  Some instructions keep reading similar sequences of values
abcdefg
abcdefgh
abcdefghi
How to detect?
•  Input: Test code + system under test
•  Steps:
  1.  Instrument the system under test
      Monitor loop starts/ends & memory reads inside loops
  2.  Analyze the trace produced by instrumentation
      Identify repetitive memory-read sequences
•  Output: Loops that are likely performance bugs
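The analysis step above can be sketched on a toy trace. This is a minimal illustration, not Toddler's algorithm: each inner-loop execution is assumed to have been reduced to the string of values it read, two executions count as similar when their common prefix covers a chosen fraction of the longer one, and a loop is reported when enough consecutive executions are similar. The similarity measure, the thresholds, and all names are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.List;

class SimilarReadsSketch {
    /** Length of the longest common prefix of two read sequences. */
    static int commonPrefix(String a, String b) {
        int n = Math.min(a.length(), b.length());
        int i = 0;
        while (i < n && a.charAt(i) == b.charAt(i)) {
            i++;
        }
        return i;
    }

    /** Similar if the common prefix covers at least `ratio` of the longer sequence. */
    static boolean similar(String a, String b, double ratio) {
        int longer = Math.max(a.length(), b.length());
        return longer > 0 && commonPrefix(a, b) >= ratio * longer;
    }

    /** Longest run of consecutive inner-loop executions that read similar sequences. */
    static int longestSimilarRun(List<String> readsPerExecution, double ratio) {
        if (readsPerExecution.isEmpty()) {
            return 0;
        }
        int best = 1;
        int run = 1;
        for (int i = 1; i < readsPerExecution.size(); i++) {
            boolean sim = similar(readsPerExecution.get(i - 1), readsPerExecution.get(i), ratio);
            run = sim ? run + 1 : 1;
            best = Math.max(best, run);
        }
        return best;
    }

    /** Reports the loop when at least `minRun` consecutive executions look alike. */
    static boolean looksInefficient(List<String> readsPerExecution, double ratio, int minRun) {
        return longestSimilarRun(readsPerExecution, ratio) >= minRun;
    }

    public static void main(String[] args) {
        // Values read by three consecutive inner-loop executions, as in the password example.
        List<String> trace = Arrays.asList("abcdefg", "abcdefgh", "abcdefghi");
        System.out.println(looksInefficient(trace, 0.8, 3)); // prints true
    }
}
```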
Evaluation Subjects and New Bugs
Application            Description          LOC               Known Bugs  New Bugs  Fixed  Confirmed
Ant                    Build tool           109,765           1           8         1      0
Apache Collections     Collections library  51,516            1           20        10     4
Groovy                 Dynamic language     136,994           1           2         2      0
Google Core Libraries  Collections library  156,004           2           10        1      2
JFreeChart             Chart framework      64,184            1           1         0      0
JMeter                 Load testing tool    86,549            1           0         0      0
Lucene                 Text search engine   320,899           2           0         0      0
PDFBox                 PDF framework        78,578            1           0         0      0
Solr                   Search server        373,138           1           0         0      0
JDK                    Standard library     -                 -           2         0      0
JUnit                  Testing framework    -                 -           1         1      0
9 Apps + 2 Libs                             50,000 – 320,000  11          44        15     6
Toddler vs. HProf
Known Bug                 Bug Detected?   False P.  Rank   Slowdown
                          TODD.  PROF     TODD.     PROF   TODD.   PROF
Ant                       ✓      ✗        0         19.3   13.7    4.2
Apache Collections        ✓      ✓        0         1.0    10.0    2.1
Groovy                    ✓      ✓        0         3.7    15.5    3.7
Google Core Libraries #1  ✓      ✓        0         1.8    9.0     3.8
Google Core Libraries #2  ✓      ✗        0         5.3    7.5     3.2
JFreeChart                ✓      ✗        0         53.7   13.4    8.8
JMeter                    ✓      ✗        0         10.3   8.5     1.9
Lucene #1                 ✓      ✗        0         7.7    6.8     2.5
Lucene #2                 ✓      ✓        0         3.1    25.4    3.1
PDFBox                    ✓      ✗        1         18.8   51.8    12.1
Solr                      ✓      ✗        0         178.3  114.2   7.1
Total: 11                 11     4        1         n/a    15.9X   4.0X
What is next?
Why are so many bugs not fixed by developers?
Caramel
A static and generic detector targeting inefficient loops with simple patches
CARAMEL: Detecting and Fixing Performance Problems That Have Non-Intrusive Fixes [ICSE '15]
Won SIGSOFT Distinguished Paper Award
Why are perf. bugs not fixed?
[Figure labels: Correctness; Maintainability; Manual effort; Potential speedup under certain workload]
Can we detect bugs with simple fixes?
What is the pattern?
•  What is a typical simple fix for an inefficient loop?
    for(…)
  +   if (cond) break;
•  What types of bugs have the above type of fix?
  – We thought for a long time …
Bug example
boolean alreadyPresent = false;
while (isActualEmbeddedProperty.hasNext()) {
  if (alreadyPresent) break; // CondBreak FIX
  if (oldVal.getStr().equals(newVal.getStr()))
    alreadyPresent = true;
  if ( ! alreadyPresent )
    prop.container().addProp(newVal); // side effect
}
•  New bug in PDFBox found by us, fixed by developers
•  Developers fix bugs that have CondBreak fixes:
  – Waste computation in loops
  – Fix is non-intrusive
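The effect of a CondBreak fix can be seen in a reduced, runnable form. The sketch below is not the PDFBox code: it models the loop as a membership search over a list of strings, and the names `searchWithoutBreak`, `searchWithBreak`, and `Outcome` are hypothetical. Both variants return the same answer; only the number of comparisons changes, which is what makes the fix non-intrusive.

```java
import java.util.Arrays;
import java.util.List;

class CondBreakSketch {
    /** The answer of one search plus how many elements were compared. */
    static final class Outcome {
        final boolean present;
        final int comparisons;

        Outcome(boolean present, int comparisons) {
            this.present = present;
            this.comparisons = comparisons;
        }
    }

    /** Original shape: keeps iterating after the answer is already known. */
    static Outcome searchWithoutBreak(List<String> existing, String newVal) {
        boolean alreadyPresent = false;
        int comparisons = 0;
        for (String oldVal : existing) {
            comparisons++;
            if (oldVal.equals(newVal)) {
                alreadyPresent = true;
            }
        }
        return new Outcome(alreadyPresent, comparisons);
    }

    /** Fixed shape: one added conditional break, nothing else changes. */
    static Outcome searchWithBreak(List<String> existing, String newVal) {
        boolean alreadyPresent = false;
        int comparisons = 0;
        for (String oldVal : existing) {
            if (alreadyPresent) {
                break; // CondBreak fix
            }
            comparisons++;
            if (oldVal.equals(newVal)) {
                alreadyPresent = true;
            }
        }
        return new Outcome(alreadyPresent, comparisons);
    }

    public static void main(String[] args) {
        List<String> existing = Arrays.asList("a", "b", "c", "d", "e");
        System.out.println(searchWithoutBreak(existing, "b").comparisons); // prints 5
        System.out.println(searchWithBreak(existing, "b").comparisons);    // prints 2
    }
}
```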
What Bugs Have CondBreak Fixes?
Rows: How Is Computation Wasted?   Columns: Where Is Computation Wasted?
                Every Iteration  Late Iterations  Early Iterations
No-Result       Type 1           Type 2           Type Y
Useless-Result  Type X           Type 3           Type 4
Ingredient 1: Result Instruction
boolean alreadyPresent = false;
while (isActualEmbeddedProperty.hasNext()) {
  if (alreadyPresent) break; // CondBreak FIX
  if (oldVal.getStr().equals(newVal.getStr()))
    alreadyPresent = true;
  if ( ! alreadyPresent )
    prop.container().addProp(newVal); // side effect
}
[Annotation: Result Instruction]
Ingredient 2: Instruction-Condition
boolean alreadyPresent = false;
while (isActualEmbeddedProperty.hasNext()) {
  if (alreadyPresent) break; // CondBreak FIX
  if (oldVal.getStr().equals(newVal.getStr()))
    alreadyPresent = true;
  if ( ! alreadyPresent )
    prop.container().addProp(newVal); // Result Ins.
}
[Annotation: Instruction-Condition]
Ingredient 3: Loop-Condition
boolean alreadyPresent = false;
while (isActualEmbeddedProperty.hasNext()) {
  if (alreadyPresent) break; // CondBreak FIX
  if (oldVal.getStr().equals(newVal.getStr()))
    alreadyPresent = true;
  if ( ! alreadyPresent )
    prop.container().addProp(newVal); // Result Ins.
}
[Annotation: Instruction-Condition, also Loop-Condition]
Evaluation Subjects and New Bugs
•  15 applications
  –  11 Java, 4 C/C++
•  150 new bugs
•  116 bugs fixed
  –  51 in Java
  –  65 in C/C++
•  Only 4 rejected
•  22 bugs in GCC fixed
•  149/150 fixed automatically
•  Only 23 false positives
Application    Description         LOC         Bugs
Ant            Build tool          140,674     1
Groovy         Dynamic language    161,487     9
JMeter         Load testing tool   114,645     4
Log4J          Logging framework   51,936      6
Lucene         Text search engine  441,649     14
PDFBox         PDF framework       108,796     10
Sling          Web app. framework  202,171     6
Solr           Search server       176,937     2
Struts         Web app. framework  175,026     4
Tika           Content extraction  50,503      1
Tomcat         Web server          295,223     4
Google Chrome  Web browser         13,371,208  22
GCC            Compiler            1,445,425   22
Mozilla        Web browser         5,893,397   27
MySQL          Database server     1,774,926   18
Different aspects of fighting bugs
[Figure labels: In-house bug detection; In-field failure recovery; In-field failure diagnosis; In-house bug fixing; Low overhead; High accuracy (shown twice)]
Work from my group
In-house bug detection
  – concurrency bugs: [ASPLOS06]; [SOSP07]; [ASPLOS09]; [ASPLOS10]; [ASPLOS11]; [OOPSLA13]
  – performance bugs: [PLDI12]; [ICSE13]; [ICSE15]
In-field failure recovery
  – concurrency bugs: [ASPLOS13.A]; [FSE14]
  – performance bugs: Not yet
In-field failure diagnosis
  – concurrency bugs: [OOPSLA10]; [ASPLOS13.B]; [ASPLOS14]; [OOPSLA16*]
  – performance bugs: [OOPSLA14]
In-house bug fixing
  – concurrency bugs: [PLDI11]; [OSDI12]; [FSE16]
  – performance bugs: [CAV13]
Conclusions & Future Work
[Figure labels: Constraints/Requirements; Techniques; Bugs]
Thanks!
Questions?
My collaborators
•  Prof. Darko Marinov
•  Adrian Nistor
•  Linhai Song
