| Title: | Temporal Encoder-Masked Probabilistic Ensemble Regressor |
|---|---|
| Description: | Implements a probabilistic ensemble time-series forecaster that combines an auto-encoder with a neural decision forest whose split variables are learned through a differentiable feature-mask layer. Functions are written with 'torch' tensors and provide CRPS (Continuous Ranked Probability Scores) training plus mixture-distribution post-processing. |
| Authors: | Giancarlo Vercellino [aut, cre, cph] |
| Maintainer: | Giancarlo Vercellino <[email protected]> |
| License: | GPL-3 |
| Version: | 1.1.0 |
| Built: | 2026-05-20 09:15:00 UTC |
| Source: | https://github.com/cran/temper |
A multivariate dataset of closing prices for several major tech stocks over time. Source: Yahoo Finance.
    data(dummy_set)
A data frame with 2133 observations of 4 variables:
Character vector of dates in "YYYY-MM-DD" format.
Numeric. Closing prices for Tesla.
Numeric. Closing prices for Microsoft.
Numeric. Closing prices for MARA Holdings.
    data(dummy_set)
    plot(as.Date(dummy_set$dates), dummy_set$TSLA.Close, type = "l")
Temper trains and deploys a hybrid forecasting model that couples three elements: a temporal auto-encoder, which compresses a sliding window of length 'past' into a latent representation of size 'latent_dim'; a masked neural decision forest, an ensemble of 'n_trees' soft decision trees of depth 'depth' whose feature-level dropout is governed by 'init_prob' and annealed by a Gumbel–Softmax with parameter 'temperature'; and a CRPS (Continuous Ranked Probability Score) loss that blends the probabilistic forecasting error with a reconstruction term ('lambda_rec × MSE'). The result is a set of multi-step probabilistic forecasts and their fan chart. Model weights are optimized with Adam or one of the other supported optimizers, with optional early stopping.
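The blended objective described above can be illustrated in base R. This is a minimal sketch, not the package's 'torch' implementation: it assumes the CRPS is estimated from forecast draws with the identity CRPS(F, y) = E|X - y| - 0.5 E|X - X'|, and the helper names `crps_sample` and `combined_loss` are hypothetical. The source does not say which CRPS estimator the package uses.

```r
# Sample-based CRPS estimate for one observation y, given draws from the forecast.
# Uses the identity CRPS(F, y) = E|X - y| - 0.5 * E|X - X'| with X, X' ~ F.
crps_sample <- function(draws, y) {
  term_obs    <- mean(abs(draws - y))
  term_spread <- mean(abs(outer(draws, draws, "-")))
  term_obs - 0.5 * term_spread
}

# Hypothetical helper: add a reconstruction MSE to the forecast CRPS,
# weighted by lambda_rec (assumed form: CRPS + lambda_rec * MSE).
combined_loss <- function(draws, y, window, reconstruction, lambda_rec = 0.3) {
  crps <- crps_sample(draws, y)
  mse  <- mean((window - reconstruction)^2)
  crps + lambda_rec * mse
}

set.seed(1)
draws  <- rnorm(500, mean = 0, sd = 1)  # toy forecast draws for one horizon
window <- c(1.0, 1.2, 0.9)              # toy input window
recon  <- c(1.1, 1.1, 1.0)              # toy auto-encoder reconstruction
combined_loss(draws, y = 0.2, window = window, reconstruction = recon)
```

With `lambda_rec = 0` the objective reduces to the forecast CRPS alone; larger values push the encoder toward faithful reconstruction of the input window.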
    temper(
      ts, future, past, latent_dim,
      n_trees = 30, depth = 6, init_prob = 0.8, temperature = 0.5,
      n_bases = 10, train_rate = 0.7, epochs = 30, optimizer = "adam",
      lr = 0.005, batch = 32, lambda_rec = 0.3, patience = 15,
      verbose = TRUE, alpha = 0.1, dates = NULL, seed = 42
    )
| Argument | Description |
|---|---|
| ts | Numeric vector of length at least past + future. Represents the input time series in levels (not log-returns). Missing values are automatically imputed using na_kalman. |
| future | Integer |
| past | Integer |
| latent_dim | Integer |
| n_trees | Integer |
| depth | Integer |
| init_prob | Numeric in |
| temperature | Positive numeric. Temperature parameter for the Gumbel–Softmax distribution used during feature masking. Lower values lead to harder (closer to binary) masks; higher values encourage smoother gradients. Default: 0.5. |
| n_bases | Integer |
| train_rate | Numeric in |
| epochs | Positive integer. Maximum number of training epochs. Inspect the loss plot to choose a suitable number of epochs. Default: 30. |
| optimizer | Character string. Optimizer to use for training (adam, adamw, sgd, rprop, rmsprop, adagrad, asgd, adadelta). Default: adam. |
| lr | Positive numeric. Learning rate for the optimizer. Default: 0.005. |
| batch | Positive integer. Mini-batch size used during training. Default: 32. |
| lambda_rec | Non-negative numeric. Weight applied to the reconstruction loss relative to the probabilistic CRPS forecasting loss. Default: 0.3. |
| patience | Positive integer. Number of consecutive epochs without improvement on the validation CRPS before early stopping is triggered. Default: 15. |
| verbose | Logical. If |
| alpha | Numeric in |
| dates | Optional |
| seed | Optional integer. Used to seed both R and Torch random number generators for reproducibility. Default: 42. |
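The effect of 'temperature' on mask hardness can be seen in a small base R simulation. This sketch assumes a two-category Gumbel-Softmax (a relaxed Bernoulli, sometimes called a binary concrete distribution). The function `relaxed_mask` and its arguments `prob` and `tau` are illustrative names, not the package's internals, which are written with 'torch'.

```r
# Relaxed Bernoulli mask: a two-category Gumbel-Softmax.
# `prob` plays the role of init_prob and `tau` of temperature; both names
# are illustrative assumptions.
relaxed_mask <- function(n, prob, tau) {
  gumbel <- function(k) -log(-log(runif(k)))   # standard Gumbel draws
  logit  <- log(prob) - log(1 - prob)
  z <- (logit + gumbel(n) - gumbel(n)) / tau
  1 / (1 + exp(-z))                            # mask values between 0 and 1
}

set.seed(42)
soft <- relaxed_mask(1000, prob = 0.8, tau = 5)    # high temperature
hard <- relaxed_mask(1000, prob = 0.8, tau = 0.1)  # low temperature

# Share of mask values within 0.05 of either 0 or 1
near_binary <- function(m) mean(m < 0.05 | m > 0.95)
c(tau_5 = near_binary(soft), tau_0.1 = near_binary(hard))
```

At the low temperature most mask values sit close to 0 or 1, while at the high temperature they stay in the interior of the interval, which is the trade-off between hard masks and smooth gradients noted for 'temperature'.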
A named list with four components:
A ggplot in which training and validation CRPS are plotted against epoch number, useful for diagnosing over-/under-fitting.
A length-'future' list. Each element contains four empirical distribution functions (pdf, cdf, icdf, sampler) created by empfun.
A ggplot object showing the historical series, median forecast and predictive interval. A print-ready fan chart.
An object measuring the wall-clock training time.
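To make the four kinds of distribution functions listed above concrete, the following base R sketch builds a density, CDF, quantile function and sampler from a vector of draws. It is a hypothetical stand-in, not the package's own empfun: `make_empirical` and the element names are assumptions chosen to mirror the (pdf, cdf, icdf, sampler) description.

```r
# Hypothetical stand-in for an empirical-distribution builder.
make_empirical <- function(draws) {
  dens <- density(draws)  # kernel density estimate on a grid
  list(
    pdf     = approxfun(dens$x, dens$y, yleft = 0, yright = 0),
    cdf     = ecdf(draws),
    icdf    = function(p) unname(quantile(draws, probs = p, type = 7)),
    sampler = function(n) sample(draws, size = n, replace = TRUE)
  )
}

set.seed(7)
draws <- rnorm(2000, mean = 100, sd = 5)  # toy forecast draws for one horizon
f <- make_empirical(draws)

# Central 80% interval from the quantile function, checked against the CDF
level    <- 0.8
interval <- f$icdf(c((1 - level) / 2, 1 - (1 - level) / 2))
coverage <- f$cdf(interval[2]) - f$cdf(interval[1])
```

An interval is read from the quantile function (icdf), whereas the CDF maps a value of the series to a cumulative probability.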
Maintainer: Giancarlo Vercellino [email protected] [copyright holder]
    set.seed(2025)
    ts <- cumsum(rnorm(250)) # synthetic price series
    fit <- temper(ts, future = 3, past = 20, latent_dim = 5, epochs = 2)

    # 80% predictive interval for the 3-step-ahead forecast
    pfun <- fit$pred_funs$t3$pfun
    pred_interval_80 <- c(pfun(0.1), pfun(0.9))

    # Visual diagnostics
    print(fit$plot)
    print(fit$loss)
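For the settings used in the example (250 points, past = 20, future = 3), the sliding windows can be sketched in base R with `embed()`. This only illustrates how input and target windows line up. It assumes windows are cut directly from the raw series, whereas the package may transform or scale the series first.

```r
set.seed(2025)
ts <- cumsum(rnorm(250))
past <- 20
future <- 3

# embed() returns one row per window with the most recent value first,
# so the columns are reversed to restore chronological order.
windows <- embed(ts, past + future)[, (past + future):1]

X <- windows[, 1:past]                      # inputs: `past` consecutive values
Y <- windows[, (past + 1):(past + future)]  # targets: the next `future` values

dim(X)  # 228 x 20
dim(Y)  # 228 x 3
```

The series yields 250 - (20 + 3) + 1 = 228 overlapping windows, which is why 'ts' must have length at least past + future.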