This study introduces a system that classifies self-admitted technical debt (SATD) in source code comments and commits into five categories: requirement, design, defect, test, and documentation. It combines several natural language processing (NLP) techniques with machine learning classifiers. The system is evaluated on two datasets, where classifiers such as random forest and convolutional neural networks reach high accuracy, particularly when their input features come from pre-trained language models. The study stresses that identifying SATD matters for effective software development because it bears directly on code quality and maintenance.
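To make the classification task concrete, the following is a minimal sketch of a five-way SATD comment classifier. It is not the study's pipeline: it swaps in a multinomial naive Bayes model with Laplace smoothing over bag-of-words counts, standing in for the random forest, convolutional neural network, and pre-trained language model features the study describes, so that it runs on the standard library alone. The training comments and the names `TRAIN`, `tokenize`, `train`, and `predict` are hypothetical and invented for illustration.

```python
import math
import re
from collections import Counter, defaultdict

# The five SATD categories named in the text.
CATEGORIES = ["requirement", "design", "defect", "test", "documentation"]

# Hypothetical toy training comments, invented for illustration only.
TRAIN = [
    ("TODO: feature not implemented yet, add support for export", "requirement"),
    ("TODO: implement the missing login feature", "requirement"),
    ("HACK: this class is a workaround, refactor the ugly design", "design"),
    ("this workaround is ugly, refactor later", "design"),
    ("FIXME: bug here, crashes on null input", "defect"),
    ("FIXME: this bug causes wrong results", "defect"),
    ("TODO: add unit tests for this method", "test"),
    ("tests are flaky, need more unit tests", "test"),
    ("TODO: document this function, javadoc missing", "documentation"),
    ("missing documentation for the parameters", "documentation"),
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(samples):
    """Count words per category, samples per category, and the vocabulary."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    vocab = set()
    for text, label in samples:
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        class_counts[label] += 1
        vocab.update(tokens)
    return word_counts, class_counts, vocab


def predict(text, model):
    """Return the category with the highest naive Bayes log-probability."""
    word_counts, class_counts, vocab = model
    total_samples = sum(class_counts.values())
    # Words never seen in training carry no evidence and are skipped.
    tokens = [tok for tok in tokenize(text) if tok in vocab]
    best_label, best_score = None, -math.inf
    for label in CATEGORIES:
        if class_counts[label] == 0:
            continue  # no training data for this category
        score = math.log(class_counts[label] / total_samples)
        # Laplace (add-one) smoothing over the training vocabulary.
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokens:
            score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAIN)
print(predict("FIXME: bug crashes the parser", model))
```

A realistic system would replace the word counts with richer features, such as embeddings from a pre-trained language model, and would train on labeled datasets rather than ten hand-written comments.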