This document examines software modularization using data mining techniques. It investigates clustering as a way to group related source code entities into modules. The method preprocesses a dataset of source code files and their function calls to compute relationships between files. Clustering then groups related files based on these relationships, and each resulting group is treated as a module. The resulting modules are compared with modularizations suggested by experts, using metrics such as precision and recall. The goal of this modularization is software that is easier to develop and maintain.
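
The pipeline described above (derive file relationships from function calls, cluster, then score against expert suggestions) can be illustrated with a minimal sketch. The source does not name the clustering algorithm, the relationship measure, or how precision and recall are computed, so everything here is an assumption made for illustration: the relationship is the number of calls between two files, the clustering is a simple threshold-based grouping (connected components), and precision and recall are computed over pairs of files placed in the same module. The file names, the `threshold` parameter, and all function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def coupling(calls):
    """Count calls between each unordered pair of distinct files (assumed measure)."""
    weights = Counter()
    for caller, callee in calls:
        if caller != callee:  # calls within one file say nothing about file pairs
            weights[frozenset((caller, callee))] += 1
    return weights


def cluster(files, weights, threshold):
    """Group files connected by at least `threshold` calls, using union-find.

    This is a stand-in for whatever clustering method the study uses.
    """
    parent = {f: f for f in files}

    def find(f):
        while parent[f] != f:
            parent[f] = parent[parent[f]]  # path compression
            f = parent[f]
        return f

    for pair, weight in weights.items():
        if weight >= threshold:
            a, b = pair
            parent[find(a)] = find(b)

    modules = {}
    for f in files:
        modules.setdefault(find(f), set()).add(f)
    return list(modules.values())


def same_module_pairs(modules):
    """All unordered file pairs that share a module."""
    return {frozenset(p) for m in modules for p in combinations(sorted(m), 2)}


def precision_recall(predicted, expert):
    """Pairwise precision and recall of predicted modules against expert modules."""
    pred, gold = same_module_pairs(predicted), same_module_pairs(expert)
    hits = len(pred & gold)
    precision = hits / len(pred) if pred else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical toy data: (calling file, called file) records.
calls = [
    ("a.c", "b.c"), ("a.c", "b.c"), ("b.c", "a.c"),
    ("c.c", "d.c"), ("c.c", "d.c"),
    ("b.c", "c.c"),
]
files = ["a.c", "b.c", "c.c", "d.c"]
expert = [{"a.c", "b.c", "c.c"}, {"d.c"}]  # hypothetical expert suggestion

modules = cluster(files, coupling(calls), threshold=2)
precision, recall = precision_recall(modules, expert)
print(modules, precision, recall)
```

On this toy data the sketch groups `a.c` with `b.c` and `c.c` with `d.c`. Only one of the two predicted pairs matches the expert grouping, so precision is 0.5, and only one of the three expert pairs is recovered, so recall is 1/3. Raising or lowering `threshold` trades one metric against the other, which is the kind of comparison the evaluation step is meant to support.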