Fernando Moreno-Pino, University of Oxford ([email protected])
This repository implements in PyTorch two different deep learning models for time series forecasting: DeepAR
("DeepAR: Probabilistic Forecasting with Autoregressive Recurrent Networks") and ConvTrans
("Enhancing the Locality and Breaking the Memory Bottleneck of Transformer on Time Series Forecasting").
In this section, we formally state the problem of time series forecasting and introduce a base architecture that represents the core of most state-of-the-art deep learning-based autoregressive models.
Given a set of $N$ univariate time series $\{z_{1:t_{0}-1}^{i}\}_{i=1}^{N}$, where $z_{1:t_{0}-1}^{i} \doteq (z_{1}^{i}, z_{2}^{i}, \ldots, z_{t_{0}-1}^{i})$ and $z_{t}^{i} \in \mathbb{R}$ denotes the value of the $i$-th time series at time $t$, the goal is to model the conditional distribution of the future trajectories given the past, $p\left(z_{t_{0}:T}^{i} \mid z_{1:t_{0}-1}^{i}, \mathbf{x}_{1:T}^{i}\right)$, where $\mathbf{x}_{1:T}^{i}$ are associated covariates, assumed to be known at all time steps.
Several state-of-the-art deep-autoregressive models, including DeepAR and ConvTrans, share a high-level architecture (see Figure 1) characterised by the following components:

- Embedding Function, $\mathbf{e}_{t}^{i} = f_{\phi}\left(\mathbf{e}_{t-1}^{i}, z^{i}_{t-1}, \mathbf{x}_{t}^{i} \right) \in \mathbb{R}^{D}$, where $f_{\phi}(\cdot)$ is the transition function with parameters $\phi$. At each time step $t$, the embedding function takes as input the previous time step's embedding $\mathbf{e}_{t-1}^{i}$, the previous value of the time series $z_{t-1}^{i}$, and the current covariates $\mathbf{x}_{t}^{i}$. This function can be implemented using various architectures, such as an RNN, an LSTM, a Temporal Convolutional Network (TCN), or a Transformer model (see the sketch after this list).
- Probabilistic Model, $p\left(z_{t}^{i} \mid \mathbf{e}_{t}^{i} \right)$, with parameters $\psi$, which utilises the embedding $\mathbf{e}_{t}^{i}$ to estimate the next value of the time series, $\hat{z}_{t}^{i}$. Typically, this probabilistic model is implemented as a neural network that parameterises the required probability distribution. For instance, a Gaussian distribution can be represented through its mean $\mu = g_{\mu}(\mathbf{w}_{\mu}^{T} \mathbf{e}_{t}^{i} + b_{\mu})$ and standard deviation $\sigma = \log \left(1 + \exp \left(g_{\sigma}(\mathbf{w}_{\sigma}^{T} \mathbf{e}_{t}^{i} + b_{\sigma})\right)\right)$, where $g_{\mu}$ and $g_{\sigma}$ are neural networks.
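To make the two components concrete, here is a minimal PyTorch sketch of the base architecture, assuming an LSTM as the transition function $f_{\phi}$ and a Gaussian probabilistic model; all class and variable names are illustrative, not the repository's actual API.

```python
import torch
import torch.nn as nn

class AutoregressiveBase(nn.Module):
    """Minimal sketch of the base architecture (illustrative names).

    - Embedding function f_phi: an LSTM whose output plays the role of
      e_t, fed with [z_{t-1}, x_t] at every time step.
    - Probabilistic model: linear heads producing the Gaussian mean mu
      and a softplus-constrained standard deviation sigma.
    """

    def __init__(self, cov_dim: int, hidden_dim: int = 40):
        super().__init__()
        self.rnn = nn.LSTM(input_size=1 + cov_dim,
                           hidden_size=hidden_dim, batch_first=True)
        self.mu = nn.Linear(hidden_dim, 1)
        self.presigma = nn.Linear(hidden_dim, 1)
        self.softplus = nn.Softplus()  # sigma = log(1 + exp(.)) > 0

    def forward(self, z_prev, x):
        # z_prev: (B, T, 1) lagged targets z_{t-1};  x: (B, T, cov_dim)
        e, _ = self.rnn(torch.cat([z_prev, x], dim=-1))  # e_t for all t
        return self.mu(e), self.softplus(self.presigma(e))
```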
Figure 1: Base architecture of deep learning-based autoregressive models. Gray represents observed variables.
The model's parameters, $\{\phi, \psi\}$, are learned by maximising the log-likelihood of the observed data, $\mathcal{L} = \sum_{i=1}^{N} \sum_{t=1}^{T} \log p\left(z_{t}^{i} \mid \mathbf{e}_{t}^{i}\right)$.
All quantities required for computing the log-likelihood are deterministic, so no approximate inference is required.
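As an illustration, the maximum-likelihood objective can be implemented with `torch.distributions`; the training-step names (`model`, `optimizer`) are assumptions, not taken from the repository.

```python
import torch
from torch.distributions import Normal

def gaussian_nll(mu, sigma, z):
    """Negative log-likelihood of z under N(mu, sigma^2); minimising it
    maximises the log-likelihood defined above."""
    return -Normal(mu, sigma).log_prob(z).mean()

# Sketch of one training step (model/optimizer assumed to exist):
# mu, sigma = model(z_prev, x)        # embeddings -> Gaussian parameters
# loss = gaussian_nll(mu, sigma, z)   # z: observed targets
# optimizer.zero_grad(); loss.backward(); optimizer.step()
```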
During both training and testing, the conditioning range $z_{1:t_{0}-1}^{i}$ is first fed through the network to initialise its internal state (i.e., to warm up the embedding) before the forecasting range $\{t_{0}, \ldots, T\}$ is processed.
For forecasting, predictions are made by sampling directly from the model, $\hat{z}_{t}^{i} \sim p\left(z_{t}^{i} \mid \mathbf{e}_{t}^{i}\right)$ for $t \geq t_{0}$, where the model uses the previous time step's prediction $\hat{z}_{t-1}^{i}$ in place of the unobserved ground truth to compute the embedding, $\mathbf{e}_{t}^{i} = f_{\phi}\left(\mathbf{e}_{t-1}^{i}, \hat{z}_{t-1}^{i}, \mathbf{x}_{t}^{i}\right)$.
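A possible implementation of this ancestral-sampling loop, reusing the hypothetical `AutoregressiveBase` sketched above (again, not the repository's actual API):

```python
import torch

@torch.no_grad()
def sample_forecast(model, z_cond, x_cond, x_future, num_samples=100):
    """Monte Carlo forecast sketch: warm up on the conditioning range,
    then feed each sampled value back as the next step's input.

    z_cond: (B, T0, 1) lagged observed values, x_cond: (B, T0, cov_dim),
    x_future: (B, H, cov_dim). Returns (num_samples, B, H, 1) samples.
    """
    horizon = x_future.size(1)
    trajectories = []
    for _ in range(num_samples):
        # Initialise the internal state on the conditioning range.
        _, state = model.rnn(torch.cat([z_cond, x_cond], dim=-1))
        z_prev, samples = z_cond[:, -1:, :], []
        for t in range(horizon):
            step_in = torch.cat([z_prev, x_future[:, t:t + 1, :]], dim=-1)
            e, state = model.rnn(step_in, state)
            mu = model.mu(e)
            sigma = model.softplus(model.presigma(e))
            z_prev = torch.normal(mu, sigma)  # z_hat_t ~ p(z_t | e_t)
            samples.append(z_prev)
        trajectories.append(torch.cat(samples, dim=1))
    return torch.stack(trajectories)
```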
Note that Transformers, unlike RNNs or LSTMs, do not compute the embedding sequentially. Accordingly, when the embedding is obtained through a Transformer while preserving the autoregressive scheme described above, the Transformer is used in decoder-only mode, i.e., with a causal attention mask so that each position can only attend to earlier time steps.
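With PyTorch's built-in attention modules, this causal (decoder-only) behaviour can be obtained with an upper-triangular additive mask; a minimal sketch:

```python
import torch

def causal_mask(T: int) -> torch.Tensor:
    """Additive mask: position t may only attend to positions <= t, which
    keeps a Transformer embedding autoregressive (decoder-only mode)."""
    return torch.triu(torch.full((T, T), float("-inf")), diagonal=1)

# e.g. passed as `mask`/`attn_mask` to nn.TransformerEncoder or
# nn.MultiheadAttention so the embedding e_t never sees the future.
```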
Figure 2: Unrolled base architecture. To the left of the forecast horizon lies the conditioning range; to its right, the forecasting range.
Common metrics used for forecasting evaluation are the Normalized Deviation (ND) and Root Mean Square Error (RMSE):

$$\mathrm{ND} = \frac{\sum_{i,t} \left|z_{t}^{i} - \hat{z}_{t}^{i}\right|}{\sum_{i,t} \left|z_{t}^{i}\right|}, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{N\left(T - t_{0} + 1\right)} \sum_{i,t} \left(z_{t}^{i} - \hat{z}_{t}^{i}\right)^{2}},$$

where the sums run over all $N$ time series and over the forecasting range $\{t_{0}, \ldots, T\}$. Also, the quantile loss, $\mathrm{QL}_{\rho}$, is often reported to assess the quality of the predictive distribution at a given quantile $\rho \in (0, 1)$ (typically $\rho = 0.5$ and $\rho = 0.9$):

$$\mathrm{QL}_{\rho} = 2\,\frac{\sum_{i,t} P_{\rho}\left(z_{t}^{i}, \hat{z}_{t}^{i}\right)}{\sum_{i,t} \left|z_{t}^{i}\right|}, \qquad P_{\rho}(z, \hat{z}) = \begin{cases} \rho \left(z - \hat{z}\right) & \text{if } z > \hat{z}, \\ (1 - \rho)\left(\hat{z} - z\right) & \text{otherwise,} \end{cases}$$

where $\hat{z}_{t}^{i}$ here denotes the predicted $\rho$-quantile.
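These metrics are straightforward to compute; a small NumPy sketch under the definitions above (helper names are illustrative):

```python
import numpy as np

def nd(z, z_hat):
    """Normalized Deviation over the forecasting range."""
    return np.abs(z - z_hat).sum() / np.abs(z).sum()

def rmse(z, z_hat):
    """Root Mean Square Error over the forecasting range."""
    return np.sqrt(np.mean((z - z_hat) ** 2))

def quantile_loss(z, z_hat_rho, rho):
    """Normalised rho-quantile loss; z_hat_rho is the predicted rho-quantile."""
    diff = z - z_hat_rho
    penalty = np.where(diff > 0, rho * diff, (rho - 1.0) * diff)
    return 2.0 * penalty.sum() / np.abs(z).sum()
```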
- Salinas, D., Flunkert, V., Gasthaus, J., & Januschowski, T. (2020). DeepAR: Probabilistic forecasting with autoregressive recurrent networks. International Journal of Forecasting, 36(3), 1181-1191.
- Li, S., Jin, X., Xuan, Y., Zhou, X., Chen, W., Wang, Y. X., & Yan, X. (2019). Enhancing the locality and breaking the memory bottleneck of transformer on time series forecasting. Advances in Neural Information Processing Systems, 32.
- Moreno-Pino, F., Olmos, P. M., & Artés-Rodríguez, A. (2023). Deep autoregressive models with spectral attention. Pattern Recognition, 133, 109014.
The repo contains a `dlts.yml` file, a conda environment specification that allows running both models. To create and activate the environment:

```bash
conda env create -f dlts.yml
conda activate dlts
```
Alternatively, you can create a new conda environment with Python 3.10 (the version used for testing) and install the remaining packages (see requirements.txt):

```bash
conda create --name name-of-env python=3.10
conda activate name-of-env
pip install -r requirements.txt
```
(Note: it is recommended to run the code in VS Code).
DeepAR: Probabilistic Forecasting with Autoregressive Recurrent Networks [DeepAR]
- Paper: https://arxiv.org/abs/1704.04110.
- Code from: https://github.com/husnejahan/DeepAR-pytorch.
- Download the electricity dataset: https://archive.ics.uci.edu/ml/datasets/ElectricityLoadDiagrams20112014.
- Move it into the `/DeepAR/` folder and run: `python preprocess_elect.py`.
- Run the main file: `python main.py`.
Note: once you have downloaded the data, instead of directly running the `main.py` source code, you can explore the notebook `/DeepAR/DeepAR_intro.ipynb`, which introduces some of the most important functions of DeepAR.
- Results will be saved in `/DeepAR/experiments/base_model/`.
Note that, if training is taking too long, you can use the `max_samples` option when setting up the `WeightedSampler` to train on only a subset of the time series (but bear in mind that the model will probably not converge to a proper solution and test performance will drop):

```python
sampler = WeightedSampler(data_dir, args.dataset, replacement=True, max_samples=5000)
```
Enhancing the Locality and Breaking the Memory Bottleneck of Transformer on Time Series Forecasting [ConvTrans]
- Paper: https://arxiv.org/abs/1907.00235.
- Run the main file: `python main.py`. The datasets used are located in `/ConvTrans/data/`.
If training is taking too long, you can set the argument `train-ins-num` to a lower value to train on only a subset of the time series (but bear in mind that the model will probably not converge to a proper solution and test performance will drop).
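For instance, an invocation along the lines of `python main.py --train-ins-num 1000` should work, although the exact flag syntax is an assumption; check the argparse setup in `main.py`.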
The model will report the results in the command line.