The document presents a stacking ensemble method for identifying the authors of source code segments written by multiple authors. The ensemble combines deep neural networks, random forests, and support vector machines and achieves a classification accuracy of 87%. The approach characterizes writing style with language-independent code metrics, and it addresses challenges such as variation in author style and the selection of code metrics. Experimental results show that the stacking ensemble classifier performs significantly better than each of the individual classification methods.
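As a rough illustration of what language-independent code metrics can look like, the sketch below computes a few layout features that need no language-specific parsing. The function name `style_metrics` and the four metrics are assumptions chosen for illustration; the summary does not state which metrics the document actually uses.

```python
from statistics import mean
from typing import Dict


def style_metrics(code: str) -> Dict[str, float]:
    """Compute layout metrics that do not depend on the language's syntax.

    Hypothetical feature set: the metrics used in the document are not
    listed in this summary.
    """
    lines = code.splitlines()
    non_blank = [ln for ln in lines if ln.strip()]
    total_chars = len(code)
    return {
        # Average length of lines that contain something other than whitespace.
        "mean_line_length": mean(len(ln) for ln in non_blank) if non_blank else 0.0,
        # Share of lines that are empty or whitespace-only.
        "blank_line_ratio": (len(lines) - len(non_blank)) / len(lines) if lines else 0.0,
        # Share of all characters that are whitespace (spaces, tabs, newlines).
        "whitespace_ratio": sum(ch.isspace() for ch in code) / total_chars if total_chars else 0.0,
        # Share of non-blank lines that begin with a tab character.
        "tab_indent_ratio": sum(ln.startswith("\t") for ln in non_blank) / len(non_blank) if non_blank else 0.0,
    }


# A small C-like fragment; the same function works on any language's text.
sample = "int main() {\n\treturn 0;\n}\n\n"
features = style_metrics(sample)
print(features)
```

Each segment would yield one such feature vector, which could then be passed to the base classifiers and the stacking layer described above.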