Identifying Crosscutting Concerns Using Historical Code Changes
Bram Adams
Zhen Ming Jiang
Ahmed E. Hassan
SAIL, Queen's University
http://sailhome.cs.queensu.ca/~bram/
What are crosscutting concerns?
Crosscutting Concerns
multi-threading
tracing
exception handling
data persistence
security
memory cleanup
3D rendering
performance
sound support
1. Which concerns are implemented?
2. Where?
3. How are concerns composed together?
(Crosscutting) Concern Mining
1. What is a Crosscutting Concern?
2. The Concern Mining Process and its Shortcomings
3. COMMIT
4. Case Study
5. Conclusion
Concern Mining Process (steps 1 2 3 4)
data source
concern seeds
concerns
expanded concerns
concern composition
Concern Mining Techniques :-)
MANUAL :-(
S1: Limited Context
thread(), process(), block(), clean()
mutex, semaphore_t, address, sender, subject, DEFINED_LINUX
CVS: thread()
S2: Noise
S3: No Composition
random, encrypt, decrypt, seed
COMMIT: COncern Mining using Mutual Information over Time
CVS
limited context, noise, no composition
analyze historical changes to all code entities
statistical clustering based on mutual information
S1. Historical Data Sources
CVS transactions
function call or variable access added
intentional co-addition of calls and accesses
concern seed
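The co-addition idea on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each CVS transaction has already been reduced to the set of function calls and variable accesses it added, and the entity names and the `co_addition_counts` helper are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical input: for each CVS transaction, the set of function calls or
# variable accesses that the transaction added (all names are made up).
transactions = [
    {"lock()", "unlock()", "mutex"},
    {"lock()", "unlock()"},
    {"log()", "lock()", "unlock()"},
    {"log()"},
]

def co_addition_counts(transactions):
    """Count, for every unordered pair of entities, the number of
    transactions in which both entities were added together."""
    counts = Counter()
    for added in transactions:
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(added), 2):
            counts[pair] += 1
    return counts

counts = co_addition_counts(transactions)
```

Pairs that are added together repeatedly, such as `lock()` and `unlock()` here, are the kind of candidates the slide calls concern seeds; how COMMIT actually scores them is covered by the mutual information step.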
S2. Mutual Information
How much does the occurrence of one code entity reveal about the occurrence of another?
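The question on this slide corresponds to the textbook definition of mutual information between two binary events. The sketch below assumes that definition and a transaction representation as sets of added entities; the slides do not give the exact formula or normalization COMMIT uses, so the function name and inputs are illustrative only.

```python
import math

def mutual_information(transactions, a, b):
    """Mutual information (in bits) between the binary events
    'a was added in a transaction' and 'b was added in a transaction'."""
    n = len(transactions)
    # Joint counts over the four combinations of (a present?, b present?).
    joint = {(x, y): 0 for x in (True, False) for y in (True, False)}
    for added in transactions:
        joint[(a in added, b in added)] += 1
    mi = 0.0
    for (x, y), c in joint.items():
        if c == 0:
            continue  # a zero-probability cell contributes nothing
        p_xy = c / n
        p_x = sum(joint[(x, v)] for v in (True, False)) / n
        p_y = sum(joint[(u, y)] for u in (True, False)) / n
        mi += p_xy * math.log2(p_xy / (p_x * p_y))
    return mi

# Hypothetical data: "a" and "b" always appear together in the first list
# and independently of each other in the second.
always_together = [{"a", "b"}, {"a", "b"}, set(), set()]
independent = [{"a", "b"}, {"a"}, {"b"}, set()]
mi_together = mutual_information(always_together, "a", "b")
mi_independent = mutual_information(independent, "a", "b")
```

With these inputs the first pair carries one full bit of information about each other and the second pair carries none, which is the contrast a clustering step can exploit.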
S3. Concern Relations
seed graph
composite concern
simple concern
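One way to read the seed graph is sketched below. It rests on an assumption the slides do not confirm: that seeds are nodes, relations between seeds are undirected edges, a connected group of several seeds forms a composite concern, and an isolated seed is a simple concern. The seed names and the `group_seeds` helper are hypothetical.

```python
from collections import defaultdict

def group_seeds(seeds, relations):
    """Return the connected components of an undirected seed graph."""
    neighbours = defaultdict(set)
    for s, t in relations:
        neighbours[s].add(t)
        neighbours[t].add(s)
    seen, components = set(), []
    for seed in seeds:
        if seed in seen:
            continue
        # Depth-first traversal collecting everything reachable from seed.
        stack, component = [seed], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        components.append(component)
    return components

# Hypothetical seeds and relations between them.
seeds = ["s1", "s2", "s3", "s4"]
relations = [("s1", "s2"), ("s2", "s3")]
components = group_seeds(seeds, relations)
composite = [c for c in components if len(c) > 1]   # assumed: composite concerns
simple = [c for c in components if len(c) == 1]     # assumed: simple concerns
```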
Case Study
1996-2002 (800 kLOC)
1993-2003 (2 MLOC)
Comparative Study
CBFA: similar entity names ✖
HAM: identical set of callers ✖
COMMIT: mutual information ✔
limited context, noise, no composition
CVS, CVS, snapshot
Study Design
CBFA: top 20
HAM: top 20
COMMIT: top 20
concern?
H1. Richer Data Sources Yield Richer Seeds
CVS
[Bar charts comparing CBFA, HAM and COMMIT: #non-function entities and #functions]
50%, 79%, 83%, 29%, 88%, 75%
H2. COMMIT Identifies a Larger Percentage of Unique Concerns
[Two bar charts, scale 0 to 100, comparing CBFA, HAM and COMMIT]
56%, 56%
87.5%, 50%
H3. COMMIT complements CBFA and HAM (1)
[Two Venn diagrams over CBFA, HAM and COMMIT; region counts as extracted: 0, 0, 1, 0, 8, 14, 9 and 1, 0, 0, 0, 9, 14, 9]
H3. COMMIT complements CBFA and HAM (2)
kernel
device drivers: d1 d2 d3 d4 d5 d6 d7 d8 d9
CBFA concern (e.g., driver API)
HAM concern (e.g., cloned driver code)
COMMIT concern (e.g., driver + infrastructure)
ODBC Data Retrieval Composite Concern
1. connection configuration
2. connection error handling
3. data transfer
4. SQL-to-ODBC conversion
5. ODBC-to-ESQL conversion
6. conversion error handling
36 seeds
ODBC Data Retrieval Concern
5 other composite concerns
Threats to Validity
• generalizability to other systems
• subjectivity: substantial agreement (Kappa)
• seed quality not checked
• threshold optimization is task-specific
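The Kappa mentioned in the subjectivity bullet can be illustrated with Cohen's kappa for two raters. This is a sketch under an assumption: the slides do not say which kappa variant was used or how many raters took part, and the ratings below are made up. On the commonly cited Landis and Koch scale, values from 0.61 to 0.80 are labelled "substantial" agreement.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labelling the same items:
    (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement from each rater's own label frequencies.
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical ratings: 1 = "this seed is a concern", 0 = "it is not".
rater_a = [1, 1, 0, 0]
rater_b = [1, 1, 0, 1]
kappa = cohens_kappa(rater_a, rater_b)
```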
Crosscutting Concerns: multi-threading, tracing, exception handling, data persistence, security, memory cleanup, 3D rendering, performance, sound support
Concern Mining Shortcomings
S1. limited seed context
S2. noise between seeds
S3. no composition of concerns
COMMIT
CVS transactions
function call or variable access added
intentional co-addition of calls and accesses
concern seed
COMMIT complements CBFA and HAM
[Two Venn diagrams over CBFA, HAM and COMMIT; region counts as extracted: 0, 0, 1, 0, 8, 14, 9 and 1, 0, 0, 0, 9, 14, 9]
QUESTIONS?

Icse2010 adams presentation