Title: | Automatic Sequence Prediction by Expansion of the Distance Matrix |
---|---|
Description: | Each sequence is predicted by expanding the distance matrix. The compact set of hyper-parameters is tuned through random search. |
Authors: | Giancarlo Vercellino |
Maintainer: | Giancarlo Vercellino <[email protected]> |
License: | GPL-3 |
Version: | 1.3.0 |
Built: | 2025-01-23 05:27:21 UTC |
Source: | https://github.com/cran/tetragon |
A data frame with daily and cumulative cases of Covid infections and deaths in Europe since March 2021.
covid_in_europe
A data frame with 5 columns and 163 rows.
www.ecdc.europa.eu
Each sequence is predicted by expanding the distance matrix. The compact set of hyper-parameters is tuned via grid or random search.
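To make the idea of a distance matrix among sequences concrete, here is a minimal sketch in base R. It assumes the sequences are overlapping windows of a single series built with `embed()`; the toy series and the `window_len` name are hypothetical. How tetragon itself segments the series and expands the matrix with the chosen distribution is not shown here, so this only illustrates the four `method` options through `stats::dist()`.

```r
# Toy series; the values are made up for illustration only
x <- c(1, 3, 2, 5, 4, 6, 8, 7)
window_len <- 3  # hypothetical window length, not a tetragon argument

# embed() returns each window in reverse time order, so flip the columns
windows <- embed(x, window_len)[, window_len:1]

# One pairwise distance matrix per distance method listed for `method`
methods <- c("euclidean", "manhattan", "maximum", "minkowski")
dist_list <- lapply(methods, function(m) as.matrix(dist(windows, method = m)))
names(dist_list) <- methods

# Distance between the first two windows under each method
sapply(dist_list, function(d) d[1, 2])
```

With the default power `p = 2`, `dist(..., method = "minkowski")` coincides with the Euclidean distance, so the two matrices are identical in this sketch.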
tetragon( df, seq_len = NULL, smoother = F, ci = 0.8, method = NULL, distr = NULL, n_windows = 3, n_sample = 30, dates = NULL, error_scale = "naive", error_benchmark = "naive", seed = 42 )
df |
A data frame with time features as columns. The features may be continuous or non-continuous variables. |
seq_len |
Positive integer. Number of time steps in the projected sequence. Default: NULL (random selection between maximum boundaries). |
smoother |
Logical. Perform optimal smoothing using standard loess. Default: FALSE |
ci |
Confidence interval. Default: 0.8. |
method |
String. Distance method for calculating distance matrix among sequences. Options are: "euclidean", "manhattan", "maximum", "minkowski". Default: NULL (random selection among all possible options). |
distr |
String. Distribution used to expand the distance matrix. Options are: "norm", "logis", "t", "exp", "chisq". Default: NULL (random selection among all possible options). |
n_windows |
Positive integer. Number of validation tests to measure/sample error. Default: 3 (a larger value is strongly recommended for a reliable accuracy estimate). |
n_sample |
Positive integer. Number of samples for random search. Default: 30. |
dates |
Date. Vector with dates for time features. |
error_scale |
String. Scale for the scaled error metrics (only for continuous variables). Two options: "naive" (average of naive one-step absolute error for the historical series) or "deviation" (standard error of the historical series). Default: "naive". |
error_benchmark |
String. Benchmark for the relative error metrics (only for continuous variables). Two options: "naive" (sequential extension of last value) or "average" (mean value of true sequence). Default: "naive". |
seed |
Positive integer. Random seed. Default: 42. |
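The `error_scale` and `error_benchmark` options can be illustrated with a small base R sketch. All values below are made up, the variable names are hypothetical, and mean absolute error is used only as an example metric; the package's own metric formulas may differ. For the "deviation" scale, `sd()` is used here as an assumption about what the spread of the historical series means.

```r
history <- c(10, 12, 11, 15, 14)  # made-up historical values
actual  <- c(16, 18, 17)          # made-up true future sequence
pred    <- c(15, 17, 19)          # made-up predicted sequence

# Example error metric: mean absolute error of the prediction
mae <- mean(abs(actual - pred))

# error_scale = "naive": average absolute one-step change in the history
naive_scale <- mean(abs(diff(history)))
# error_scale = "deviation": spread of the history (sd() is an assumption)
deviation_scale <- sd(history)

scaled_mae_naive <- mae / naive_scale
scaled_mae_dev   <- mae / deviation_scale

# error_benchmark = "naive": extend the last observed value
naive_bench <- rep(tail(history, 1), length(actual))
# error_benchmark = "average": mean value of the true sequence
avg_bench <- rep(mean(actual), length(actual))

# Relative error: prediction error divided by the benchmark's error
rel_mae_naive <- mae / mean(abs(actual - naive_bench))
rel_mae_avg   <- mae / mean(abs(actual - avg_bench))
```

A relative error below 1 means the prediction beats the chosen benchmark on this metric; above 1 means the benchmark does better.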
This function returns a list including:
exploration: list of all explored models, complete with predictions, testing metrics and plots
history: a table with the sampled models, hyper-parameters, validation errors
best: results for the best model including:
predictions: min, max, q25, q50, q75, quantiles at the selected ci, and additional summary measures for each point of the predicted sequences
testing_errors: testing errors for one-step and full-sequence predictions for each time feature
plots: confidence interval plot for each time feature
time_log
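A short sketch of how the returned list might be inspected. The top-level names come from the list above; the call is the documented example, and the code only runs if the tetragon package is installed. The `expected_top` and `res` names are hypothetical, and the sketch lists the components of `best` rather than assuming their exact layout.

```r
# Top-level components named in the list above
expected_top <- c("exploration", "history", "best", "time_log")

if (requireNamespace("tetragon", quietly = TRUE)) {
  library(tetragon)

  # Same call as the documented example
  res <- tetragon(covid_in_europe[, c(2, 4)], seq_len = 40, n_sample = 2)

  # Compare the returned names with the documented ones
  print(names(res))
  print(setdiff(expected_top, names(res)))

  # Sampled models, hyper-parameters and validation errors
  print(head(res$history))

  # Components of the best model, without assuming their internal layout
  str(res$best, max.level = 1)
}
```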
Giancarlo Vercellino [email protected]
tetragon(covid_in_europe[, c(2, 4)], seq_len = 40, n_sample = 2)