dc.description.abstract |
The importance of Micro, Small and Medium Enterprises for East Africa’s economic endurance and innovation has been acknowledged by policy makers, credit institutions and business owners.
These enterprises operate in a highly uncertain and volatile economic environment that is constantly affected by external factors such as politics and pandemics, which makes it challenging to analyse information and evaluate whether a Micro, Small and Medium Enterprise will be profitable. This prediction problem shows that there is a need for a model that facilitates an unbiased approach to Micro, Small and Medium Enterprise profitability prediction. In this dissertation, the objective is to understand how machine learning can be applied to survey data to predict Micro, Small and Medium Enterprise profit growth, and to develop a reproducible model via web-based deployment.
Prediction of Micro, Small and Medium Enterprise profitability growth currently relies almost exclusively on data that credit institutions collect through their relationships with customers and through physical business site visits. This data is used to assess the business. It is therefore very difficult to apply this method on a wide scale in a repeatable and automated way for growth prediction. It may also be a hindrance for Micro, Small and Medium Enterprises that are startups, since these usually lack a substantial credit history. In this dissertation, data from the national small business survey in Uganda is used and machine learning methods are applied, as they are known to provide a better balance of speed and accuracy in the decision-making process than previously used methods.
Overall, three different models are applied to predict profit growth: random forest, extreme gradient boosting, and logistic regression. These models allow the performance of traditional models to be compared with that of advanced models. Goodness-of-fit tests are applied to the models, and the best performers are the two ensemble methods, extreme gradient boosting and random forest, with accuracies of 92.31% and 92.02%, respectively. The most relevant variables in the best-performing models are ‘sales made last year’, ‘operation time’ and ‘business owner education level’. The models generated in this dissertation can be used to predict Micro, Small and Medium Enterprise profit growth rate in a repeatable way, using annually available survey data. |
en_US |