This document describes how to predict loan defaults with machine learning models. It begins by introducing the business problem: banks lose money when customers default on their loans. It then describes how the loan dataset is preprocessed, which includes handling missing data, label encoding the categorical variables, and balancing the classes with the SMOTE and SMOTEENN techniques. Logistic regression, decision tree, AdaBoost, and random forest models are trained on both the original and the balanced datasets. The random forest model trained on the SMOTEENN-balanced data achieved the best accuracy, 92%. That model is then pickled and served through a Flask web application that lets users predict loan defaults.
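To make the balancing step concrete, here is a minimal sketch of the idea behind SMOTE: each synthetic minority-class row is placed at a random point on the line between an existing minority row and one of its nearest minority neighbours. This is an illustration under stated assumptions, not the document's own code. The function name `smote_oversample`, the two toy features, and the class sizes are all hypothetical, and the sketch uses only the standard library rather than a SMOTE library. It also leaves out the Edited Nearest Neighbours cleaning pass that SMOTEENN applies after oversampling.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    Hypothetical helper for illustration only, not a library API.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        # Choose one of the k nearest neighbours at random.
        neighbour = rng.choice(others[:k])
        # Place the new point somewhere on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Assumed toy data: two scaled features for defaulted loans (the rare class).
defaults = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.4), (0.9, 2.1)]
n_non_defaults = 10  # assumed size of the majority class

# Generate enough synthetic defaults to match the majority class size.
new_rows = smote_oversample(defaults, n_new=n_non_defaults - len(defaults))
balanced_defaults = defaults + new_rows
print(len(balanced_defaults))
```

Because every synthetic row is an interpolation of two real minority rows, it stays inside the region the minority class already occupies. Interpolation only makes sense for numeric features, which is one reason the categorical variables are label encoded before balancing.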