Forecasting High Yield Corporate Bond Industry Excess Return

University essay from KTH/Matematisk statistik

Author: Carlos Junior Lopez Vydrin; [2018]


Abstract: In this thesis, we apply unsupervised and supervised statistical learning methods to the high-yield corporate bond market with the goal of predicting its future excess return. We analyse the excess return of industry-based indices of high-yield corporate bonds belonging to the Chemical, Metals, Paper, Building Materials, Packaging, Telecom, and Electric Utility industries. To predict the excess return of these high-yield corporate bond industry indices, we use externally given, market-observable financial time series from 96 different assets and indices that we believe to be of predictive value for the excess return. These input time series cover major equity indices, corporate credit spreads, FX currencies, stock, bond, and FX volatility, swap rates, swap spreads, certain commodities, and macroeconomic surprise indices. After pre-processing the input data, we arrive at 154 predictors that are used in a two-phase implementation procedure consisting of an unsupervised time series Agglomerative Hierarchical clustering and a supervised Random Forest regression model. We use the Hierarchical time series clustering and the Random Forest unbiased variable importance estimates as a means to reduce our input predictor space to the ten most influential predictor variables for each industry. These ten most influential predictors are then used in a Random Forest regression model to predict the [1, 3, 5, 10]-day future cumulative excess return. To accommodate the characteristics of sequential time series data, we also apply a sliding window method to the input predictors and the response variable in our Random Forest model. The results show that excess returns in the various industries under study are predictable using Random Forest regression with our market-observable input data. The out-of-sample coefficient of determination R²out is in the majority of cases statistically significant at the 0.01 level.
The predictability varies across the industries and in some cases depends on whether we apply the sliding window method. Furthermore, applying the sliding window method to the predictors and the response variable yielded, in the majority of cases, statistically significant improvements in the mean-squared prediction error. The variable importance estimates from such models show that the excess return time series exhibit some degree of autocorrelation.
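
To make the sliding window and the out-of-sample evaluation more concrete, the following is a minimal sketch in plain Python and not the thesis implementation. It rests on several assumptions that the abstract does not confirm: the h-day cumulative excess return is taken as the simple sum of the next h daily excess returns, the window stacks the most recent values of every predictor together with the lagged response, and R²out is measured against a constant benchmark such as the training-sample mean. The function names sliding_window_design and r2_out_of_sample are hypothetical.

```python
from typing import List, Sequence, Tuple


def sliding_window_design(
    predictors: Sequence[Sequence[float]],
    response: Sequence[float],
    window: int,
    horizon: int,
) -> Tuple[List[List[float]], List[float]]:
    """Build lagged features and forward cumulative targets.

    predictors: one row of predictor values per day.
    response:   daily excess return per day (same length as predictors).
    window:     number of most recent days stacked into each feature row.
    horizon:    number of future days summed into the target (assumed additive).
    """
    n = len(response)
    X: List[List[float]] = []
    y: List[float] = []
    # Day t needs `window` days of history and `horizon` days of future data.
    for t in range(window - 1, n - horizon):
        row: List[float] = []
        for lag in range(window):
            row.extend(predictors[t - lag])  # predictor values at day t - lag
            row.append(response[t - lag])    # lagged response as an extra feature
        X.append(row)
        # Target: sum of excess returns over days t+1 .. t+horizon.
        y.append(sum(response[t + 1 : t + 1 + horizon]))
    return X, y


def r2_out_of_sample(
    y_true: Sequence[float], y_pred: Sequence[float], benchmark: float
) -> float:
    """1 - SSE(model) / SSE(constant benchmark), computed on held-out data."""
    sse_model = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    sse_bench = sum((a - benchmark) ** 2 for a in y_true)
    return 1.0 - sse_model / sse_bench


if __name__ == "__main__":
    # Toy data: one predictor, six days, purely illustrative.
    toy_predictors = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
    toy_response = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
    X, y = sliding_window_design(toy_predictors, toy_response, window=2, horizon=3)
    print(len(X), len(X[0]))
```

The resulting X and y could then be split chronologically and passed to any regressor, for example scikit-learn's RandomForestRegressor, once for each horizon in [1, 3, 5, 10]. Including the lagged response among the features is what would let variable importance estimates pick up autocorrelation in the excess return series.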
