Housing Price Prediction over Countrywide Data: A comparison of XGBoost and Random Forest regressor models
Abstract: The aim of this research project is to investigate how an XGBoost regressor compares to a Random Forest regressor in predicting housing prices, using two data sets. The comparison considers training time, inference time and three evaluation metrics: R², RMSE and MAPE. The data sets are described in detail, together with background on the regressor models used. The method involves substantial cleaning of both data sets, hyperparameter tuning to find optimal parameters, and 5-fold cross-validation to obtain good performance estimates. The finding of this research project is that XGBoost performs better on both small and large data sets. While the Random Forest model can achieve results similar to those of the XGBoost model, it needs a much longer training time, between 2 and 50 times as long, and has a longer inference time, around 40 times as long. This makes XGBoost especially advantageous on larger data sets.
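For readers unfamiliar with the three evaluation metrics named in the abstract, the following is a minimal sketch of their standard textbook definitions in plain Python. It is not taken from the essay: the function names and the example prices are hypothetical, and the essay itself may compute the metrics with a library or report MAPE as a percentage rather than a fraction.

```python
import math

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean squared error, in the same unit as the target (here: price)."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(y_true))

def mape(y_true, y_pred):
    """Mean absolute percentage error as a fraction; assumes no true value is zero."""
    relative_errors = [abs((t - p) / t) for t, p in zip(y_true, y_pred)]
    return sum(relative_errors) / len(y_true)

# Hypothetical housing prices and predictions, for illustration only.
y_true = [200000.0, 300000.0, 400000.0]
y_pred = [210000.0, 290000.0, 400000.0]

print("R2:  ", r2_score(y_true, y_pred))
print("RMSE:", rmse(y_true, y_pred))
print("MAPE:", mape(y_true, y_pred))
```

A higher R² is better, while lower RMSE and MAPE are better. MAPE is scale-independent, which makes it convenient when comparing results across data sets with different price levels.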