Probabilistic Forecasting through Reformer Conditioned Normalizing Flows
Abstract: Forecasts are essential for human decision-making in many fields, such as weather forecasting, retail pricing, and stock prediction. Recently, the Transformer neural network, commonly used for sequence-to-sequence tasks, has shown great potential for achieving state-of-the-art forecasting results when combined with density estimation models such as autoregressive flows. The main benefit of the resulting model, the Transformer Masked Autoregressive Flow (TMAF), is its architecture, which significantly improves computational efficiency compared to older architectures such as recurrent neural networks. However, the Transformer's attention mechanism carries a high computational cost, with O(N²) time complexity in the sequence length N. In an attempt to mitigate this limitation, this thesis introduces a new forecasting model, the Reformer Masked Autoregressive Flow (RMAF), based on the TMAF, in which we replace the Transformer component with a Reformer. The Reformer is a modified Transformer with a more efficient attention function and reversible residual layers in place of standard residual layers. While the Reformer has shown great promise in reducing computational complexity on long-sequence machine translation tasks, our analysis shows that the overhead it introduces leads to a 7-8 times increase in allocated memory compared to the TMAF to reach the same forecasting quality on the Solar and Electricity datasets.
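The O(N²) cost mentioned above comes from standard scaled dot-product attention, where every query position attends to every key position, so the score matrix has shape N × N. A minimal NumPy sketch (illustrative only; the function name and shapes are assumptions, not code from the thesis) makes the quadratic term visible:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Standard Transformer attention. The intermediate score matrix
    has shape (N, N), which is the source of the O(N^2) time and
    memory cost that the Reformer's LSH attention aims to reduce."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)            # shape (N, N): quadratic in N
    # numerically stable softmax over the key axis
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights                # output: (N, d_k)

N, d = 128, 16                                 # illustrative sizes
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((N, d)) for _ in range(3))
out, weights = scaled_dot_product_attention(Q, K, V)
print(out.shape, weights.shape)                # (128, 16) (128, 128)
```

Doubling N quadruples the size of `weights`; the Reformer sidesteps this by hashing queries and keys into buckets so each position attends only within its bucket, at the price of the bookkeeping overhead the abstract refers to.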