This study presents an automatic text classification system for Pashto-language documents, addressing a gap caused by the lack of available datasets. Of the machine learning models evaluated, the multilayer perceptron (MLP) combined with TF-IDF feature extraction reached an accuracy of 94%. The study also highlights challenges specific to Pashto, including its status as a low-resource language and the effect of dialectal variation on text representation.
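To make the TF-IDF step concrete, the following is a minimal sketch of one common weighting variant, not the study's actual implementation. The function name `tfidf`, the choice of length-normalized term frequency with an unsmoothed `ln(N / df)` inverse document frequency, and the toy corpus of English placeholder tokens are all assumptions made for illustration. The study's own tokenization and weighting details are not given here.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    Other variants (smoothed idf, L2 normalization) are also in common use.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical toy corpus: placeholder tokens, split on whitespace.
corpus = [
    "sport match team".split(),
    "team win match".split(),
    "economy market price".split(),
]
weights = tfidf(corpus)
```

In this sketch, a term that appears in only one document (such as `sport`) receives a higher weight than one shared across documents (such as `match`), and a term present in every document receives a weight of zero. Vectors of this kind would then serve as input features for a classifier such as an MLP.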