Abstract:
Crop yield estimates are a crucial indicator computed by the National Institute of Statistics of Rwanda (NISR), in cooperation with the Ministry of Agriculture and Animal Resources (MINAGRI), through the Seasonal Agricultural Survey (SAS). They are used to monitor agricultural programs and policies, to address key agricultural issues, and to inform policy makers and other stakeholders.
A comprehensive review of work on crop yield prediction shows that some recent studies estimate crop yield as the total reported harvested quantity divided by the harvested area of that crop, without considering other factors that may affect crop yield in Rwanda. The cultivation period in Rwanda is divided into three seasons: Season A, which runs from September to February of the following year; Season B, from March to June; and Season C, from June to August. This study analyzes the crop yields of three consecutive years using data mining techniques. Specifically, it assesses the effectiveness of the Artificial Neural Network (ANN) model for crop yield prediction under typical environmental factors, compares the effectiveness of the Polynomial Linear Regression and Multiple Linear Regression models with that of the Artificial Neural Network model, and assesses the impact of various inputs on the yield of the main crops in Rwanda. The focus of the study is the development of data mining techniques in the agricultural field. Various descriptive methods are used to summarize and preprocess the attributes and to test the statistical significance of the results obtained. Several regression models, namely Linear Regression, Multiple Linear Regression, Polynomial Regression, and Artificial Neural Network, are proposed to accurately predict the yield of maize, beans, Irish potatoes, and paddy rice. The Artificial Neural Network
(ANN) model predicts maize and paddy rice yields better than the Polynomial Linear Regression (PLR) and Multiple Linear Regression (MLR) models, with MAE (0.05), MSE (0.29), RMSE (0.54) and R² (0.99) for maize, and MAE (2.80), MSE (15.810), RMSE (12.57) and R² (0.99) for paddy rice, while PLR predicts better results for Irish potatoes and bush beans, with MAE (0.40), MSE (3.30), RMSE (1.82) and R² (0.95) for Irish potatoes, and MAE (0.11), MSE (3.47), RMSE (1.18) and R² (0.95) for bush beans. The best regression model for each crop was selected based on R², MAE, MSE, and RMSE, so that its predictions can help farmers see how they may improve their yields. The findings indicate that the ANN performs well in predicting the yields of major crops produced in Rwanda.
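
The four evaluation measures named above (MAE, MSE, RMSE, and R²) can be computed as in the following minimal sketch. It uses the standard textbook definitions and is not the study's own code; the function name `regression_metrics` and the small `observed` and `predicted` arrays are hypothetical and serve only as illustration.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R^2 for observed vs. predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    errors = y_true - y_pred

    mae = np.mean(np.abs(errors))        # mean absolute error
    mse = np.mean(errors ** 2)           # mean squared error
    rmse = np.sqrt(mse)                  # root mean squared error
    ss_res = np.sum(errors ** 2)         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot           # coefficient of determination

    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}

# Made-up values for illustration only (not data from the study).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

Under these definitions, lower MAE, MSE, and RMSE and an R² closer to 1 indicate a better fit, which is the sense in which the models are compared per crop.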