University of Rwanda Digital Repository

Supervised machine learning modeling for credit risk prediction and evaluation in Malawi.


dc.contributor.author Chidothi, Alexander Peter
dc.date.accessioned 2025-11-03T15:42:10Z
dc.date.available 2025-11-03T15:42:10Z
dc.date.issued 2023
dc.identifier.uri http://dr.ur.ac.rw/handle/123456789/2663
dc.description Master's Dissertation en_US
dc.description.abstract Credit institutions in Malawi are facing difficulties in their operations due to their unreliable methods of assessing credit risk. They tend to group subjects to predict their credit risk, which results in biased predictions. Some use qualitative methods to determine credit risk, and their data is limited to their own portfolio. This study aimed to overcome these challenges by creating a robust and reliable machine learning model to predict credit risk as probability of default and credit score sequentially, based on demographic and loan details. Big data was sourced from a credit reference bureau that collects data from all credit institutions in the country. The research design was quantitative: data was analyzed using the Knowledge Discovery in Databases process, which involved problem formulation, data collection, pre-processing and cleaning, transformation, mining tasks and methods, result evaluation and visualization, and knowledge discovery. After the data was cleaned, transformed and split into training, validation and testing datasets, six classifier models were trained: K-Nearest Neighbors, Naïve Bayes, Logistic Regression, Support Vector Machine, Decision Tree and Random Forest. The target variable was credit status, a binary variable with values ‘defaulted’ or ‘paid up’. The features, a combination of numerical and categorical data types, were sex, marital status, age, district, number of dependents, principal amount, payment terms, source of income, collateral type, interest type, interest calculation method, participation code, liability type and loan duration. A decision tree with maximum depth 10 was preferred to the others based on performance evaluation using accuracy score, ROC curve, precision score, recall score and confusion matrices. This model was used to predict probabilities of default using the testing data.
Credit score was predicted from the probabilities of default by equating the maximum FICO score to the minimum probability and the minimum FICO score to the maximum probability. The study concludes with recommendations to implement either unsupervised or deep learning models for the same exercise. en_US
dc.language.iso en en_US
dc.publisher University of Rwanda en_US
dc.subject Credit risk · Machine learning · Probability of default · Credit score · Knowledge Discovery in Databases · FICO scores en_US
dc.title Supervised machine learning modeling for credit risk prediction and evaluation in Malawi. en_US
dc.type Dissertation en_US
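
The abstract describes deriving a credit score from a predicted probability of default by pairing the minimum probability with the maximum FICO score and the maximum probability with the minimum FICO score. The sketch below illustrates one way such a mapping could work. It is not the dissertation's implementation: the linear interpolation between the two endpoints, the 300 to 850 score range (the commonly cited base FICO range), the default probability bounds of 0 and 1, and the function name `pd_to_score` are all assumptions made for illustration.

```python
# Illustrative sketch only; the record does not state the exact formula used.
FICO_MIN = 300  # assumed lower bound of the score range
FICO_MAX = 850  # assumed upper bound of the score range


def pd_to_score(pd, pd_min=0.0, pd_max=1.0,
                score_min=FICO_MIN, score_max=FICO_MAX):
    """Map a probability of default to a score by linear interpolation.

    pd_min maps to score_max and pd_max maps to score_min, so a lower
    probability of default always yields a higher score.
    """
    if pd_max <= pd_min:
        raise ValueError("pd_max must be greater than pd_min")
    if not pd_min <= pd <= pd_max:
        raise ValueError("pd must lie between pd_min and pd_max")
    # Position of pd within [pd_min, pd_max], as a fraction from 0 to 1.
    fraction = (pd - pd_min) / (pd_max - pd_min)
    # Invert the fraction so that higher risk gives a lower score.
    return score_max - fraction * (score_max - score_min)


if __name__ == "__main__":
    for p in (0.0, 0.25, 0.5, 1.0):
        print(p, pd_to_score(p))
```

If the endpoints are instead taken to be the smallest and largest probabilities observed in the test predictions, those values can be passed as `pd_min` and `pd_max`; the abstract does not say which reading was used.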

