Icpc08b.ppt
Motivation
□ >50% of maintenance time is spent trying to understand the program
□ Where are the features, requirements, etc. in the code?
□ What is this code for?
□ Why is it hard to understand and change the program?
ICPC 2008, Marc Eaddy

What is a “concern”?
Anything that affects the implementation of a program
□ Feature, requirement, design pattern, coding idiom, etc.
□ Raison d'être for code
□ Every line of code exists to satisfy some concern

Concern location problem
[Figure: Concerns and Program Elements]
Concern–code relationship hard to obtain
□ Concern–code relationship undocumented
□ Reverse engineer the relationship
□ (but, which one?)

Software pruning
□ Remove code that supports certain features, reqs, etc.
□ Reduce program’s footprint
□ Support different platforms
□ Simplify program

Prune dependency rule [ACOM’07]
□ Code is prune dependent on a concern if pruning the concern requires removing or altering the code
□ Must alter code that depends on removed code
□ Prevent compile errors
□ Eliminate “dead code”
□ Easy to determine/approximate

Automated concern location
Concern–code relationship predicted by an “expert”
□ Experts mine clues in code, docs, etc.
□ Existing techniques use only 1 or 2 experts
□ Our solution: Cerberus
1. Information retrieval
2. Execution tracing
3. Prune dependency analysis

IR-based concern location
□ i.e., Google for code
□ Program entities are documents
□ Requirements are queries
[Figure: the “Array.join” requirement matched against source code containing js_join()]

Vector space model [Salton]
□ Parse code and reqs doc to extract term vectors
  □ NativeArray.js_join() method → “native,” “array,” “join”
  □ “Array.join” requirement → “array,” “join”
□ Our contributions
  □ Expand abbreviations
    □ numconns → number, connections, numberconnections
  □ Index fields
□ Weight terms (tf · idf)
  □ Term frequency (tf)
  □ Inverse document frequency (idf)
□ Similarity = cosine of the angle between document and query vectors

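The vector space model above can be sketched roughly as follows. This is a minimal illustration, not the Cerberus implementation: the class name VectorSpaceSketch, the entity names other than NativeArray.js_join, and the choice idf = ln(N / df) are assumptions made for the example. The tokenizer also keeps the "js" prefix as a term, which the slide's term list does not show, and abbreviation expansion and field indexing are left out.

```java
import java.util.*;

final class VectorSpaceSketch {
    // Split an identifier or phrase into lowercase terms
    // at non-letter characters and at camelCase boundaries.
    static List<String> terms(String text) {
        List<String> out = new ArrayList<>();
        for (String part : text.split("[^A-Za-z]+|(?<=[a-z])(?=[A-Z])")) {
            if (!part.isEmpty()) out.add(part.toLowerCase(Locale.ROOT));
        }
        return out;
    }

    // Inverse document frequency per term: ln(N / df). Assumed formula.
    static Map<String, Double> idf(Collection<List<String>> docs) {
        Map<String, Double> df = new HashMap<>();
        for (List<String> d : docs)
            for (String t : new HashSet<>(d)) df.merge(t, 1.0, Double::sum);
        Map<String, Double> result = new HashMap<>();
        for (Map.Entry<String, Double> e : df.entrySet())
            result.put(e.getKey(), Math.log(docs.size() / e.getValue()));
        return result;
    }

    // Term vector weighted by tf * idf; terms unknown to the corpus get weight 0.
    static Map<String, Double> tfIdf(List<String> terms, Map<String, Double> weights) {
        Map<String, Double> v = new HashMap<>();
        for (String t : terms) v.merge(t, 1.0, Double::sum);
        v.replaceAll((t, tf) -> tf * weights.getOrDefault(t, 0.0));
        return v;
    }

    // Cosine of the angle between two sparse vectors (0 if either is all zeros).
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0, na = 0, nb = 0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0.0);
            na += e.getValue() * e.getValue();
        }
        for (double x : b.values()) nb += x * x;
        return (na == 0 || nb == 0) ? 0.0 : dot / Math.sqrt(na * nb);
    }

    // Rank program entities (name -> text) by similarity to the requirement query.
    static List<String> rank(Map<String, String> entities, String query) {
        Map<String, List<String>> docTerms = new LinkedHashMap<>();
        entities.forEach((name, text) -> docTerms.put(name, terms(text)));
        Map<String, Double> weights = idf(docTerms.values());
        Map<String, Double> q = tfIdf(terms(query), weights);
        Map<String, Double> score = new HashMap<>();
        docTerms.forEach((name, ts) -> score.put(name, cosine(tfIdf(ts, weights), q)));
        List<String> names = new ArrayList<>(entities.keySet());
        names.sort(Comparator.comparingDouble((String n) -> -score.get(n)));
        return names;
    }

    public static void main(String[] args) {
        // Hypothetical corpus: each entity's "document" is just its qualified name.
        Map<String, String> entities = new LinkedHashMap<>();
        entities.put("NativeString.js_concat", "NativeString.js_concat");
        entities.put("NativeArray.js_sort", "NativeArray.js_sort");
        entities.put("NativeArray.js_join", "NativeArray.js_join");
        System.out.println(rank(entities, "Array.join"));
    }
}
```

With this toy corpus, the terms shared by every entity ("native", "js") receive an idf of zero, so the ranking is driven by "array" and "join" and NativeArray.js_join comes first.
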
Tracing-based concern location
□ Observe elements activated when concern is exercised
□ Unit tests for each concern
□ e.g., find elements uniquely activated by a concern
Unit test for “Array.join”:
// Exercise Array.join and report whether it produced the expected string
var a = new Array(1, 2);
if (a.join(',') == "1,2") {
    print("Test passed");
} else {
    print("Test failed");
}
[Figure: call graph of the test, showing js_construct and js_join]

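One way to read "elements uniquely activated by a concern" is as a set difference over per-concern traces. The sketch below assumes each concern's unit tests have already been traced into a set of element names; the class name TraceSketch and the second concern ("Array.sort" with js_sort) are hypothetical and only there to make the difference visible.

```java
import java.util.*;

final class TraceSketch {
    // Elements activated by `concern` and by no other concern's tests.
    static Set<String> uniquelyActivated(Map<String, Set<String>> traces, String concern) {
        Set<String> unique =
            new TreeSet<>(traces.getOrDefault(concern, Collections.<String>emptySet()));
        for (Map.Entry<String, Set<String>> e : traces.entrySet()) {
            if (!e.getKey().equals(concern)) unique.removeAll(e.getValue());
        }
        return unique;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> traces = new HashMap<>();
        // Trace for the slide's unit test: the array is constructed, then joined.
        traces.put("Array.join", new HashSet<>(Arrays.asList("js_construct", "js_join")));
        // Hypothetical second concern that also constructs an array.
        traces.put("Array.sort", new HashSet<>(Arrays.asList("js_construct", "js_sort")));
        System.out.println(uniquelyActivated(traces, "Array.join")); // prints [js_join]
    }
}
```

Here js_construct is dropped because both concerns activate it, which leaves js_join as the only element unique to “Array.join”.
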
Prune dependency analysis
□ Infer relevant elements based on structural relationship to relevant element e (seed)
  □ Assumes we already have some seeds
□ Prune dependency analysis
  □ Evaluates the prune dependency rule using program analysis
  □ Find references to e
  □ Find superclasses and subclasses of e

PDA example
Source Code
interface A {
    public void foo();
}
public class B implements A {
    public void foo() { ... }
    public void bar() { ... }
}
public class C {
    public static void main() {
        B b = new B();   // main references B
        b.bar();         // main calls B.bar()
    }
}
Program Dependency Graph
[Figure: program dependency graph with nodes A, B, C, A.foo, B.foo, B.bar, and C.main; edge labels: contains, inherits, refs, calls]

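The two lookups named on the prune dependency analysis slide (references to e, and superclasses and subclasses of e) can be sketched over the example graph as follows. This is an illustrative reading, not the Cerberus implementation: the class name PruneDependencySketch, the edge list (read off the example source code), the choice to treat "calls" as a kind of reference, and the single-step, non-transitive lookup are all assumptions. The "contains" edges are stored but not followed.

```java
import java.util.*;

final class PruneDependencySketch {
    // A directed, labeled edge of the program dependency graph.
    static final class Edge {
        final String from, label, to;
        Edge(String from, String label, String to) {
            this.from = from; this.label = label; this.to = to;
        }
    }

    // Edges derived from the A/B/C example source code.
    static List<Edge> exampleGraph() {
        return Arrays.asList(
            new Edge("A", "contains", "A.foo"),
            new Edge("B", "contains", "B.foo"),
            new Edge("B", "contains", "B.bar"),
            new Edge("C", "contains", "C.main"),
            new Edge("B", "inherits", "A"),
            new Edge("C.main", "refs", "B"),
            new Edge("C.main", "calls", "B.bar"));
    }

    // One step from the seed: elements that reference or call it,
    // plus its direct supertypes and subtypes.
    static Set<String> related(List<Edge> graph, String seed) {
        Set<String> out = new TreeSet<>();
        for (Edge e : graph) {
            boolean reference = e.label.equals("refs") || e.label.equals("calls");
            if (reference && e.to.equals(seed)) out.add(e.from);   // references to the seed
            if (e.label.equals("inherits")) {
                if (e.from.equals(seed)) out.add(e.to);            // supertype of the seed
                if (e.to.equals(seed)) out.add(e.from);            // subtype of the seed
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(related(exampleGraph(), "B"));     // prints [A, C.main]
        System.out.println(related(exampleGraph(), "B.bar")); // prints [C.main]
    }
}
```

Starting from seed B, the sketch reports C.main (which references B) and A (which B implements); starting from B.bar, it reports only C.main, the caller that would have to change if bar were pruned.
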
Cerberus
Cerberus ≈ PROMESIR + SNIAFL
Cerberus effectiveness
[Figure: Cerberus effectiveness results]
[Figure: Cerberus effectiveness results, ignoring “No results found”]

Future work
□ Improve PDA
  □ Reimplemented using Soot and Polyglot
  □ Generalize using prune dependency predicates
  □ Improve precision using points-to analysis
  □ Improve accuracy using
    □ Dominator heuristic
    □ Variable liveness analysis
□ Improve accuracy of Cerberus
  □ Combine experts using matrix linear regression

Cerberus contributions
□ Effectively combined 3 concern location techniques
□ PDA boosts performance of other techniques

Questions?
Marc Eaddy
Columbia University
eaddy@cs.columbia.edu
