Processing, Modeling, and Forecasting: A Time Series Analysis of Sick Leave Absences in Sweden and Evaluating the Impact of Macro Factors

University essay from Lunds universitet/Matematisk statistik

Abstract: Sick absence affects companies both operationally and economically, and the matter has become increasingly prominent following the COVID-19 pandemic. MedHelp Care is an e-health company working towards increasing workplace wellness by offering insights into absence and rehabilitation matters. Forecasting future absence levels could help with matters such as staffing and provide more insight into which factors affect absenteeism. Moreover, gauging how external factors affect absence could deepen this knowledge further. Now that absence data is digital, it can be modeled using statistical tools. The dataset was aggregated to a daily basis and preprocessed to better represent the Swedish labour force. Then, by applying time series analysis and machine learning methods, predictions of sick absence were formed. Datasets of the Swedish stock market and of the discourse around COVID-19 on Twitter were each processed into time series, and their relation to sick absence was explored using cross-correlation. The sets were then individually incorporated into autoregressive models with exogenous inputs, and their impact was evaluated. Results show that the SARIMA model is better at predicting short and medium horizons, while the machine learning methods used were better suited to long horizons. Results from the exogenous models were mixed: the OMX stock market set improved short-term prediction accuracy while other horizons were largely unaffected, and the Twitter set worsened performance for all horizon lengths. While the model results are promising, their current complexity hinders large-scale deployment, and further research is needed to verify the effect of exogenous datasets.
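
To illustrate the cross-correlation step mentioned in the abstract, the following is a minimal sketch in Python with NumPy. It is not the essay's implementation: the function name `cross_correlation`, the lag range, and the synthetic series are all assumptions made for illustration, and no real absence, stock market, or Twitter data is used.

```python
import numpy as np


def cross_correlation(x, y, max_lag):
    """Sample cross-correlation between x[t] and y[t + lag].

    Returns a dict mapping each lag in [-max_lag, max_lag] to its estimate.
    A peak at a positive lag suggests that x leads y by that many steps.
    (Hypothetical helper, written for this illustration only.)
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim != 1 or x.shape != y.shape:
        raise ValueError("x and y must be 1-D arrays of equal length")
    n = len(x)
    xc = x - x.mean()
    yc = y - y.mean()
    denom = n * x.std() * y.std()
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            num = np.sum(xc[: n - lag] * yc[lag:])
        else:
            num = np.sum(xc[-lag:] * yc[: n + lag])
        result[lag] = num / denom
    return result


# Synthetic stand-ins (assumption): x plays the role of an exogenous series,
# y the role of an absence series that echoes x three days later.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = np.roll(x, 3)  # y[t] = x[t - 3], apart from wrap-around at the start

ccf = cross_correlation(x, y, max_lag=10)
best_lag = max(ccf, key=ccf.get)  # by construction, the peak falls at lag 3
```

In a setting like the one described, a clear peak at some lag would suggest which delayed values of an external series might be worth including as exogenous inputs; a flat cross-correlation would suggest little linear relation at the lags examined.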
