Automated Evolution of Feature Logging Statement Levels Using Git Histories and Degree of Interest
Science of Computer Programming, Volume 214, 1 Feb 2022, 102724
Yiming Tang (1), Allan Spektor (2), Raffi Khatchadourian (2, 3), Mehdi Bagherzadeh (4)
(1) Concordia University, Canada
(2) City University of New York (CUNY) Hunter College, USA
(3) City University of New York (CUNY) Graduate Center, USA
(4) Oakland University, USA
IEEE International Conference on Software Analysis, Evolution & Re-engineering
March 17, 2022, Honolulu, HI, USA (remote)
Introduction Motivation Approach Evaluation Conclusion Logging Issues
Logging in Modern Software in the Big Data Era
Logging is pervasive in modern software.
Big data systems deal with high volumes of transactions.
Source code is tangled with scattered logging statements capturing
important event information.
Essential for reporting security and privacy breaches.
Yiming Tang, Allan Spektor, Raffi Khatchadourian, Mehdi Bagherzadeh Automated Evolution of Feature Logging Statement Levels 2 / 12
Feature Logging Statements
Modern software is also feature-heavy, implementing hundreds of
features.
Logging statements—although more informational—also capture
important aspects of feature implementations.
Useful for validating feature implementations and diagnosing
unintended interactions with other features.
Logging Issues
Source: Stuart Pilbrow / CC BY-SA
(https://creativecommons.org/licenses/by-sa/2.0)
Too much logging causes information overload.
Makes postmortem analysis difficult.
Understanding system behavior in production and diagnosing problems
can be challenging.
Also challenging during development as logs pertaining to auxiliary
features are tangled with those under current development.
Feature Logging Statement Level Evolution
Logging statements are typically associated with a log level.
The level dictates whether the log is emitted at run time.
Example
logger.log(Level.FINER, "Health:" + systemHealthStatus());
Outputs system health iff the run time level of logger ≤ Level.FINER.
As software evolves, the levels of logging statements correlated with
surrounding feature implementations may also need to be modified.
Ideally, feature log levels would evolve with the system as it is
developed.
Higher log levels (e.g., INFO) would be assigned to logs corresponding
to features with more current stakeholder interest.
Lower log levels (e.g., FINEST) would be assigned to those with less
interest.
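The level check described on this slide can be sketched with java.util.logging (JUL). This is a minimal illustration, not code from the tool; the class name LevelThresholdDemo and the helper wouldEmit are hypothetical names introduced here.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

class LevelThresholdDemo {
    // A statement passes the logger's check when the statement's level is
    // at least as high as the logger's run-time level.
    static boolean wouldEmit(Level loggerLevel, Level statementLevel) {
        Logger logger = Logger.getAnonymousLogger();
        logger.setLevel(loggerLevel);
        return logger.isLoggable(statementLevel);
    }

    public static void main(String[] args) {
        // Logger at FINER: a FINER statement passes the check.
        System.out.println(wouldEmit(Level.FINER, Level.FINER));
        // Logger at INFO: a FINER statement is filtered out.
        System.out.println(wouldEmit(Level.INFO, Level.FINER));
        // Logger at FINEST: a FINER statement passes the check.
        System.out.println(wouldEmit(Level.FINEST, Level.FINER));
    }
}
```

Note that JUL handlers carry their own levels (the default ConsoleHandler level is INFO), so actually seeing FINER output on the console also requires lowering the handler level.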
Automation Approach Overview
Figure: Logging Level rejuvenation approach overview (details in paper).
Automatically evolve feature logging statement levels.
Mine Git repositories to discover the “interestingness” of code
surrounding feature logging statements.
Adapt the Mylyn degree-of-interest (DOI) model [Kersten and Murphy,
2005].
What is Mylyn?
Standard Eclipse Integrated Development Environment (IDE) plug-in.
Focuses graphical components of the IDE.
Only “interesting” artifacts related to the currently active task are
revealed [Kersten and Murphy, 2006].
The more interaction with an artifact (e.g., file), the more
prominent it appears in the IDE.
Less recently used artifacts appear less prominently.
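The interaction-and-recency idea behind a DOI model can be sketched as follows. This is a deliberately simplified illustration, not Mylyn's actual implementation or API; the class DoiSketch and the constants BUMP and DECAY are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

class DoiSketch {
    private static final double BUMP = 1.0;   // assumed gain per interaction
    private static final double DECAY = 0.1;  // assumed loss per interaction elsewhere
    private final Map<String, Double> interest = new HashMap<>();

    /** Records one interaction: the touched artifact gains interest, all others lose some. */
    void interact(String artifact) {
        interest.replaceAll((name, value) -> name.equals(artifact) ? value : value - DECAY);
        interest.merge(artifact, BUMP, Double::sum);
    }

    /** Returns the current degree of interest, or 0.0 for an artifact never touched. */
    double doi(String artifact) {
        return interest.getOrDefault(artifact, 0.0);
    }
}
```

With this scheme, an artifact that is touched often and recently ends up with a higher value than one touched earlier and then left alone, which mirrors the prominence behavior described above.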
Mylyn Adaptation
Programmatically manipulate a DOI model using Git code changes.
Transform source code to “rejuvenate” feature logging statement
levels.
Pull those related to features whose implementations are worked on
more, and more recently, to the forefront.
Push those related to features whose implementations are worked on
less, and less recently, to the background.
Goals
Reduce information overload.
Support system evolution.
Automatically bring more relevant features to developers’ attention
and vice versa.
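One way to picture the "pull" and "push" steps is a mapping from DOI values to log levels. The sketch below assumes equal-width bucketing of the observed DOI range over a fixed list of candidate JUL levels; both the bucketing scheme and the candidate list are assumptions for illustration, and the names LevelRejuvenationSketch and levelFor are hypothetical. The paper describes the actual transformation.

```java
import java.util.List;
import java.util.logging.Level;

class LevelRejuvenationSketch {
    // Candidate levels from least to most prominent (an assumed selection).
    static final List<Level> CANDIDATES =
            List.of(Level.FINEST, Level.FINER, Level.FINE, Level.CONFIG, Level.INFO);

    /** Maps a DOI value in [minDoi, maxDoi] to a level by equal-width bucketing. */
    static Level levelFor(double doi, double minDoi, double maxDoi) {
        if (maxDoi <= minDoi) {
            return CANDIDATES.get(0); // degenerate range: fall back to the lowest level
        }
        double fraction = (doi - minDoi) / (maxDoi - minDoi);
        int bucket = (int) (fraction * CANDIDATES.size());
        // Clamp so that doi == maxDoi lands in the top bucket.
        bucket = Math.max(0, Math.min(CANDIDATES.size() - 1, bucket));
        return CANDIDATES.get(bucket);
    }
}
```

Under these assumptions, a logging statement in frequently and recently changed code would be raised toward INFO, while one in code untouched for a long time would be lowered toward FINEST.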
Implementation
Implemented as an open-source plug-in to the Eclipse IDE.
May also be used with popular build systems via plug-ins.
Supports two popular logging frameworks, SLF4J and JUL.
Integrates with JGit and Mylyn.
Available at https://git.io/fjlTY.
Research Questions
1. How applicable is our tool to, and how does it behave with,
real-world open-source software?
2. Does our tool help developers focus on feature implementation bugs?
3. Do developers find the results acceptable? What is the impact of
our tool?
Evaluation Overview
18 Java projects, ~3 MLOC, and ~4K logging statements.
Fully automated analysis running time:
10.66 secs per analyzed logging statement.
0.89 secs per KLOC changed.
Developers do not actively think about how their logging statement
levels evolve with their software.
Successfully analyzes 99.26% of candidate logging statements.
Increases log level distributions by an average of ~20%.
Ideally transforms log levels in bug contexts ~83% of the time.
Preliminary pull request study: changes successfully integrated into 2
large and popular open-source projects (comparable to related work
[S. Li et al., 2018]).
More details in the paper!
Conclusion
Feature logging statements document important values and track
progress of feature implementations.
As interest in features evolves, feature logging levels may also require
modification to combat information overload.
Our approach discovers and rectifies mismatches between feature
interest levels and logging levels.
Results show that the technique is promising in alleviating the
burden of manually evolving logging levels.
Future Work
Expand pull request study.
Issue wide-scale developer surveys.
Enhance feature logging statement classification heuristics with ML.
Appendix Additional Material
For Further Reading I
Apache Software Foundation (2020). Log4j. Log4j 2 Architecture. url:
http://logging.apache.org/log4j/2.x/manual/architecture.html#Logger_Hierarchy (visited on 06/12/2020).
Chen, Boyuan and Zhen Ming (Jack) Jiang (2017). “Characterizing and Detecting Anti-Patterns in the Logging Code”. In:
International Conference on Software Engineering. ICSE ’17. Buenos Aires, Argentina: IEEE Press, pp. 71–81. isbn:
9781538638682. doi: 10.1109/ICSE.2017.15.
Eclipse Foundation, Inc. (2020). JGit. url: http://eclip.se/gF (visited on 03/02/2020).
Hassani, Mehran et al. (Mar. 2018). “Studying and detecting log-related issues”. In: Empirical Software Engineering. issn:
1573-7616. doi: 10.1007/s10664-018-9603-z. url: https://doi.org/10.1007/s10664-018-9603-z.
He, Pinjia et al. (2018). “Characterizing the Natural Language Descriptions in Software Logging Statements”. In: International
Conference on Automated Software Engineering. ASE 2018. Montpellier, France: ACM, pp. 178–189. isbn: 9781450359375. doi:
10.1145/3238147.3238193.
Kabinna, Suhas et al. (Feb. 2018). “Examining the Stability of Logging Statements”. In: Empirical Softw. Engg. 23.1,
pp. 290–333. issn: 1382-3256. doi: 10.1007/s10664-017-9518-0.
Kersten, Mik and Gail C. Murphy (2005). “Mylar: a degree-of-interest model for IDEs”. In: International Conference on
Aspect-Oriented Software Development. Chicago, Illinois: ACM, pp. 159–168. isbn: 1-59593-042-6. doi:
10.1145/1052898.1052912.
Kersten, Mik and Gail C. Murphy (2006). “Using Task Context to Improve Programmer Productivity”. In: ACM Symposium on
the Foundations of Software Engineering. SIGSOFT ’06/FSE-14. Portland, Oregon, USA: ACM, pp. 1–11. isbn: 1-59593-468-5.
doi: 10.1145/1181775.1181777.
Li, Heng, Weiyi Shang, and Ahmed E. Hassan (Aug. 2017). “Which Log Level Should Developers Choose for a New Logging
Statement?” In: Empirical Softw. Engg. 22.4, pp. 1684–1716. issn: 1382-3256. doi: 10.1007/s10664-016-9456-2.
For Further Reading II
Li, Shanshan et al. (2018). “Logtracker: Learning Log Revision Behaviors Proactively from Software Evolution History”. In:
International Conference on Program Comprehension. ICPC ’18. Gothenburg, Sweden: ACM, pp. 178–188. isbn:
978-1-4503-5714-2. doi: 10.1145/3196321.3196328.
Oracle (2018). Logger (Java SE 10 & JDK 10). url:
http://docs.oracle.com/javase/10/docs/api/java/util/logging/Logger.html (visited on 02/29/2020).
Related Work
Source: Jonathan Joseph Bondhus / CC BY-SA
(https://creativecommons.org/licenses/by-sa/3.0)
Existing approaches [Chen and Jiang, 2017; Hassani et al., 2018; He
et al., 2018; Kabinna et al., 2018; H. Li et al., 2017] are inclined
to focus on either new logging statements or log messages.
Logger hierarchies [Apache Software Foundation, 2020; Oracle, 2018]
may be useful but still require manual maintenance.
Rename Refactorings & Copying
Program elements (e.g., methods) changed in Git may no longer exist
in the current project version.
Must process rename refactorings.
Maintain a data structure that associates rename relationships
between program elements, e.g., method signatures.
Use lightweight refactoring approximations.
Use copy detection features of Git at the file level.
New copy “inherits” old DOI values.
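The rename-tracking data structure mentioned above could look roughly like the following. This is a minimal sketch under the assumption that elements are identified by signature strings; the class RenameTrackerSketch and its methods are hypothetical names, not the tool's actual API.

```java
import java.util.HashMap;
import java.util.Map;

class RenameTrackerSketch {
    // Maps an old element signature to the signature it was renamed to.
    private final Map<String, String> renamedTo = new HashMap<>();

    void recordRename(String oldSignature, String newSignature) {
        renamedTo.put(oldSignature, newSignature);
    }

    /** Follows rename links until reaching a signature with no recorded successor. */
    String currentSignature(String historicalSignature) {
        String signature = historicalSignature;
        int hops = 0;
        // The hop bound guards against cycles such as a rename that is later reverted.
        while (renamedTo.containsKey(signature) && hops++ < renamedTo.size()) {
            signature = renamedTo.get(signature);
        }
        return signature;
    }
}
```

Resolving a historical signature to its current name lets changes recorded against an old method name contribute to the DOI value of the method as it exists today.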
Classifying Feature Logging Statements
Logging levels are often used to differentiate various logging
“categories” (e.g., severe errors, security breaches).
Need to distinguish between these and feature logs.
Derive a set of heuristics based on first-hand developer interactions.
Also distinguish between less-critical debugging logs (e.g., tracing)
using a keyword-based approach.
Goals
Focus on only manipulating logging statements tied to features to better
align them with developers’ current interests.
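A keyword-based filter of the kind described on this slide might be sketched as below. The keyword list and the rule that WARNING and above are treated as error categories are assumptions made for illustration; the authors' heuristics were derived from developer interactions and are detailed in the paper. The class FeatureLogClassifierSketch is a hypothetical name.

```java
import java.util.List;
import java.util.Locale;
import java.util.logging.Level;

class FeatureLogClassifierSketch {
    // Hypothetical keywords that mark category or tracing messages rather than feature logs.
    static final List<String> NON_FEATURE_KEYWORDS =
            List.of("fail", "error", "exception", "security", "enter", "exit");

    static boolean isFeatureLog(Level level, String message) {
        // Assumption: WARNING and above denote error categories, not feature logs.
        if (level.intValue() >= Level.WARNING.intValue()) {
            return false;
        }
        String lower = message.toLowerCase(Locale.ROOT);
        return NON_FEATURE_KEYWORDS.stream().noneMatch(lower::contains);
    }
}
```

Only statements classified as feature logs would then be candidates for level changes; the rest keep their original levels.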
Automated Evolution of Feature Logging Statement Levels Using Git Histories and Degree of Interest

  • 1. Automated Evolution of Feature Logging Statement Levels Using Git Histories and Degree of Interest Science of Computer Programming, Volume 214, 1 Feb 2022, 102724 Yiming Tang1 Allan Spektor2 Raffi Khatchadourian2,3 Mehdi Bagherzadeh4 1 Concordia University, Canada 2 City University of New York (CUNY) Hunter College, USA 3 City University of New York (CUNY) Graduate Center, USA 4 Oakland University, USA IEEE International Conference on Software Analysis, Evolution & Re-engineering March 17, 2022, Honolulu, HI, USA (remote)
  • 2. Introduction Motivation Approach Evaluation Conclusion Logging Issues Logging in Modern Software in the Big Data Era Logging is pervasive in the modern software. Big data systems deal with high-volumes of transactions. Source code is tangled with scattered logging statements capturing important event information. Essential for reporting security and privacy breaches. Yiming Tang, Allan Spektor, Raffi Khatchadourian, Mehdi Bagherzadeh Automated Evolution of Feature Logging Statement Levels 2 / 12
  • 3. Introduction Motivation Approach Evaluation Conclusion Logging Issues Feature Logging Statements Modern software is also feature-heavy, implementing hundreds of features. Logging statements—although more informational—also capture important aspects of feature implementations. Useful for validating feature implementations and diagnosing unintended interactions with other features. Yiming Tang, Allan Spektor, Raffi Khatchadourian, Mehdi Bagherzadeh Automated Evolution of Feature Logging Statement Levels 3 / 12
  • 4. Introduction Motivation Approach Evaluation Conclusion Logging Issues Logging Issues Source: Stuart Pilbrow / CC BY-SA (https://creativecommons.org/licenses/by-sa/2.0) Too much logging causes information overload. Makes postmortem analysis difficult. Understanding system behavior in production and diagnosing problems can be challenging. Also challenging during development as logs pertaining to auxiliary features are tangled with those under current development. Yiming Tang, Allan Spektor, Raffi Khatchadourian, Mehdi Bagherzadeh Automated Evolution of Feature Logging Statement Levels 4 / 12
  • 5. Introduction Motivation Approach Evaluation Conclusion Feature Logging Statement Level Evolution Logging statements are typically associated with a log level. Dictates if the log should be emitted, if at all. Example logger.log(Level.FINER, "Health:" + systemHealthStatus()); Outputs system health iff the run time level of logger ≤ Level.FINER. As software evolves, logging statements levels correlated with surrounding feature implementations may also need to be modified. Ideally, feature log levels would evolve with the system as it is developed. Higher log levels (e.g., INFO) being assigned to logs corresponding to features with more current stakeholder interest. Lower log levels for those with less interest (e.g., FINEST). Yiming Tang, Allan Spektor, Raffi Khatchadourian, Mehdi Bagherzadeh Automated Evolution of Feature Logging Statement Levels 5 / 12
  • 6. Introduction Motivation Approach Evaluation Conclusion Overview DOI Manipulation Automation Approach Overview Figure: Logging Level rejuvenation approach overview (details in paper). Automatically evolve feature logging statement levels. Mine Git repositories to discover the “interestingness” of code surrounding feature logging statements. Adapt Mylyn degree of interest (DOI) model [Kersten and Murphy, 2005]. Yiming Tang, Allan Spektor, Raffi Khatchadourian, Mehdi Bagherzadeh Automated Evolution of Feature Logging Statement Levels 6 / 12
  • 7. Introduction Motivation Approach Evaluation Conclusion Overview DOI Manipulation What is Mylyn? Standard Eclipse Integrated Development Environment (IDE) plug-in. Focuses graphical components of the IDE. Only “interesting” artifacts related to the currently active task are revealed [Kersten and Murphy, 2006]. The more interaction with an artifact (e.g., file), the more prominent it appears in the IDE. Less recently used artifacts appear less prominently. Yiming Tang, Allan Spektor, Raffi Khatchadourian, Mehdi Bagherzadeh Automated Evolution of Feature Logging Statement Levels 7 / 12
  • 8. Introduction Motivation Approach Evaluation Conclusion Overview DOI Manipulation Mylyn Adaptation Programmatically manipulate a DOI model using Git code changes. Transform source code to “rejuvenate” feature logging statement levels. Pull those related to features whose implementation is worked on more and more recently to the forefront. Push those related to features whose implementations are worked on less and less recently to the background. Goals Reduce information overload. Support system evolution. Automatically bring more relevant features to developers’ attention and vice-versa. Yiming Tang, Allan Spektor, Raffi Khatchadourian, Mehdi Bagherzadeh Automated Evolution of Feature Logging Statement Levels 8 / 12
  • 9. Introduction Motivation Approach Evaluation Conclusion Overview DOI Manipulation Mylyn Adaptation Programmatically manipulate a DOI model using Git code changes. Transform source code to “rejuvenate” feature logging statement levels. Pull those related to features whose implementation is worked on more and more recently to the forefront. Push those related to features whose implementations are worked on less and less recently to the background. Goals Reduce information overload. Support system evolution. Automatically bring more relevant features to developers’ attention and vice-versa. Yiming Tang, Allan Spektor, Raffi Khatchadourian, Mehdi Bagherzadeh Automated Evolution of Feature Logging Statement Levels 8 / 12
  • 10. Introduction Motivation Approach Evaluation Conclusion Overview DOI Manipulation Mylyn Adaptation Programmatically manipulate a DOI model using Git code changes. Transform source code to “rejuvenate” feature logging statement levels. Pull those related to features whose implementation is worked on more and more recently to the forefront. Push those related to features whose implementations are worked on less and less recently to the background. Goals Reduce information overload. Support system evolution. Automatically bring more relevant features to developers’ attention and vice-versa. Yiming Tang, Allan Spektor, Raffi Khatchadourian, Mehdi Bagherzadeh Automated Evolution of Feature Logging Statement Levels 8 / 12
  • 11. Introduction Motivation Approach Evaluation Conclusion Overview DOI Manipulation Mylyn Adaptation Programmatically manipulate a DOI model using Git code changes. Transform source code to “rejuvenate” feature logging statement levels. Pull those related to features whose implementation is worked on more and more recently to the forefront. Push those related to features whose implementations are worked on less and less recently to the background. Goals Reduce information overload. Support system evolution. Automatically bring more relevant features to developers’ attention and vice-versa. Yiming Tang, Allan Spektor, Raffi Khatchadourian, Mehdi Bagherzadeh Automated Evolution of Feature Logging Statement Levels 8 / 12
  • 12. Implementation Implemented as an open-source plug-in to the Eclipse IDE. May also be used with popular build systems via plug-ins. Supports two popular logging frameworks, SLF4J and JUL. Integrates with JGit and Mylyn. Available at https://git.io/fjlTY.
  • 13. Introduction Motivation Approach Evaluation Conclusion Research Questions 1 How applicable is our tool to and how does it behave with real-world open source software? 2 Does our tool help developers focus on feature implementation bugs? 3 Do developers find the results acceptable? What is the impact of our tool? Yiming Tang, Allan Spektor, Raffi Khatchadourian, Mehdi Bagherzadeh Automated Evolution of Feature Logging Statement Levels 10 / 12
  • 14. Introduction Motivation Approach Evaluation Conclusion Evaluation Overview 18 Java projects, ˜3 MLOC, and ˜4K logging statements. Fully-automated analysis running-time: 10.66 secs per analyzed logging statement. 0.89 secs per KLOC changed. Developers do not actively think about how their logging statement levels evolve with their software. Successfully analyzes 99.26% of candidate logging statements. Increases log level distributions by an average of ˜20%. Ideally transforms log levels in bug contexts ˜83% of the time. Preliminary pull request study successfully integrated into 2 large and popular open-source projects (comparable to related work [S. Li et al., 2018]). More details in the paper! Yiming Tang, Allan Spektor, Raffi Khatchadourian, Mehdi Bagherzadeh Automated Evolution of Feature Logging Statement Levels 11 / 12
  • 15. Conclusion: Feature logging statements document important values and track the progress of feature implementations. As interest in features evolves, feature logging levels may also require modification to combat information overload. Our approach discovers and rectifies mismatches between feature interest levels and logging levels. Results show that the technique is promising in alleviating the burden of manually evolving logging levels. Future Work: Expand the pull request study. Conduct wide-scale developer surveys. Enhance feature logging statement classification heuristics with ML.
  • 17. Appendix: Additional Material. For Further Reading I
      Apache Software Foundation (2020). Log4j. Log4j 2 Architecture. url: http://logging.apache.org/log4j/2.x/manual/architecture.html#Logger_Hierarchy (visited on 06/12/2020).
      Chen, Boyuan and Zhen Ming (Jack) Jiang (2017). “Characterizing and Detecting Anti-Patterns in the Logging Code”. In: International Conference on Software Engineering. ICSE ’17. Buenos Aires, Argentina: IEEE Press, pp. 71–81. isbn: 9781538638682. doi: 10.1109/ICSE.2017.15.
      Eclipse Foundation, Inc. (2020). JGit. url: http://eclip.se/gF (visited on 03/02/2020).
      Hassani, Mehran et al. (Mar. 2018). “Studying and detecting log-related issues”. In: Empirical Software Engineering. issn: 1573-7616. doi: 10.1007/s10664-018-9603-z. url: https://doi.org/10.1007/s10664-018-9603-z.
      He, Pinjia et al. (2018). “Characterizing the Natural Language Descriptions in Software Logging Statements”. In: International Conference on Automated Software Engineering. ASE 2018. Montpellier, France: ACM, pp. 178–189. isbn: 9781450359375. doi: 10.1145/3238147.3238193.
      Kabinna, Suhas et al. (Feb. 2018). “Examining the Stability of Logging Statements”. In: Empirical Software Engineering 23.1, pp. 290–333. issn: 1382-3256. doi: 10.1007/s10664-017-9518-0.
      Kersten, Mik and Gail C. Murphy (2005). “Mylar: a degree-of-interest model for IDEs”. In: International Conference on Aspect-Oriented Software Development. Chicago, Illinois: ACM, pp. 159–168. isbn: 1-59593-042-6. doi: 10.1145/1052898.1052912.
      Kersten, Mik and Gail C. Murphy (2006). “Using Task Context to Improve Programmer Productivity”. In: ACM Symposium on the Foundations of Software Engineering. SIGSOFT ’06/FSE-14. Portland, Oregon, USA: ACM, pp. 1–11. isbn: 1-59593-468-5. doi: 10.1145/1181775.1181777.
      Li, Heng, Weiyi Shang, and Ahmed E. Hassan (Aug. 2017). “Which Log Level Should Developers Choose for a New Logging Statement?” In: Empirical Software Engineering 22.4, pp. 1684–1716. issn: 1382-3256. doi: 10.1007/s10664-016-9456-2.
  • 18. For Further Reading II
      Li, Shanshan et al. (2018). “Logtracker: Learning Log Revision Behaviors Proactively from Software Evolution History”. In: International Conference on Program Comprehension. ICPC ’18. Gothenburg, Sweden: ACM, pp. 178–188. isbn: 978-1-4503-5714-2. doi: 10.1145/3196321.3196328.
      Oracle (2018). Logger (Java SE 10 & JDK 10). url: http://docs.oracle.com/javase/10/docs/api/java/util/logging/Logger.html (visited on 02/29/2020).
  • 19. Related Work: Source: Jonathan Joseph Bondhus / CC BY-SA (https://creativecommons.org/licenses/by-sa/3.0). Existing approaches [Chen and Jiang, 2017; Hassani et al., 2018; He et al., 2018; Kabinna et al., 2018; H. Li et al., 2017] tend to focus on either new logging statements or log messages. Logger hierarchies [Apache Software Foundation, 2020; Oracle, 2018] may be helpful but still require manual maintenance.
  • 20. Rename Refactorings & Copying: Program elements (e.g., methods) changed in Git may no longer exist in the current project version, so rename refactorings must be processed. Maintain a data structure that records rename relationships between program elements, e.g., method signatures. Use lightweight refactoring approximations. Use the copy detection features of Git at the file level. A new copy “inherits” the old DOI values.
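The rename bookkeeping described on this slide can be sketched roughly as follows. This is an illustrative data structure only: the class `RenameTracker`, its method names, and the use of plain strings as element identifiers are all hypothetical, and it assumes commits are replayed from oldest to newest. The slide applies copy detection at the file level; the sketch uses generic identifiers for brevity.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: carries DOI values across renames and copies of program elements. */
final class RenameTracker {
    // Maps a stale identifier to the identifier that replaced it.
    private final Map<String, String> renamedTo = new HashMap<>();
    // DOI values, always keyed by the most recent identifier.
    private final Map<String, Double> doi = new HashMap<>();

    /** Follows the rename chain to the identifier used in the current project version. */
    String currentSignature(String signature) {
        String current = signature;
        while (renamedTo.containsKey(current)) {
            current = renamedTo.get(current);
        }
        return current;
    }

    /** Records a rename and moves any accumulated DOI to the new identifier. */
    void recordRename(String oldSignature, String newSignature) {
        String from = currentSignature(oldSignature);
        String to = currentSignature(newSignature);
        if (from.equals(to)) {
            return; // already linked; also prevents cycles in the chain
        }
        renamedTo.put(from, to);
        Double carried = doi.remove(from);
        if (carried != null) {
            doi.merge(to, carried, Double::sum);
        }
    }

    /** Adds interest for an element, even if it is referred to by an old name. */
    void addInterest(String signature, double amount) {
        doi.merge(currentSignature(signature), amount, Double::sum);
    }

    /** A new copy starts with ("inherits") the DOI value of its source. */
    void recordCopy(String source, String copy) {
        doi.put(currentSignature(copy), interestOf(source));
    }

    double interestOf(String signature) {
        return doi.getOrDefault(currentSignature(signature), 0.0);
    }
}
```

Resolving every lookup through `currentSignature` means that edits recorded against a method's old name still count toward the element that exists in the current version.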
  • 21. Classifying Feature Logging Statements: Logging levels are often used to differentiate various logging “categories” (e.g., severe errors, security breaches). Need to distinguish between these and feature logs. Derive a set of heuristics based on first-hand developer interactions. Also distinguish less-critical debugging logs (e.g., tracing) using a keyword-based approach. Goal: Manipulate only logging statements tied to features, to better align them with developers’ current interests.
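A keyword-based classification along these lines might look like the sketch below. The keyword list, the rule that WARNING and above denote a logging "category", and the names `FeatureLogClassifier` and `Kind` are assumptions made for illustration; the actual heuristics were derived from developer interactions and are described in the paper.

```java
import java.util.Locale;
import java.util.logging.Level;

/** Hypothetical sketch: separates feature logs from category and debugging logs. */
final class FeatureLogClassifier {
    enum Kind { CATEGORY, DEBUGGING, FEATURE }

    // Illustrative keywords only; not the tool's actual list.
    private static final String[] DEBUG_KEYWORDS = {"trace", "entering", "exiting", "dump"};

    private FeatureLogClassifier() {}

    static Kind classify(Level level, String message) {
        // Assumption: WARNING and above mark a logging category (e.g., severe errors),
        // so such statements are left untouched.
        if (level.intValue() >= Level.WARNING.intValue()) {
            return Kind.CATEGORY;
        }
        String lower = message.toLowerCase(Locale.ROOT);
        for (String keyword : DEBUG_KEYWORDS) {
            if (lower.contains(keyword)) {
                return Kind.DEBUGGING;
            }
        }
        // Everything else is treated as a feature log and is eligible for transformation.
        return Kind.FEATURE;
    }
}
```

Only statements classified as `FEATURE` would then be passed to the level transformation, which keeps error reporting and tracing output stable.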