| Title: | Sequence Generalization Through Similarity Network |
|---|---|
| Description: | Proposes an application for sequence prediction generalizing the similarity within the network of previous sequences. |
| Authors: | Giancarlo Vercellino [aut, cre, cph] |
| Maintainer: | Giancarlo Vercellino <[email protected]> |
| License: | GPL-3 |
| Version: | 2.0.0 |
| Built: | 2026-05-19 08:24:57 UTC |
| Source: | https://github.com/cran/segen |
Sequence Generalization Through Similarity Network
segen( df, seq_len = NULL, similarity = NULL, dist_method = NULL, rescale = NULL, smoother = FALSE, ci = 0.8, error_scale = "naive", error_benchmark = "naive", n_windows = 10, n_samp = 30, dates = NULL, seed = 42, use_parallel = FALSE, parallel_workers = NULL )
| Argument | Description |
|---|---|
| df | data.frame of time features (all numeric OR all categorical). |
| seq_len | integer, forecasting horizon. If NULL, auto-sampled. |
| similarity | numeric in (0,1), similarity quantile. If NULL, sampled. |
| dist_method | character. Options: "euclidean", "manhattan", "maximum", "minkowski", "correlation", "dtw". If NULL, sampled from available methods (skips 'dtw' if pkg missing). |
| rescale | logical, rescale weights before normalization. |
| smoother | logical, apply loess smoothing for numeric features. |
| ci | numeric in (0,1), confidence level. |
| error_scale | "naive" or "deviation". |
| error_benchmark | "naive" or "average". |
| n_windows | integer, rolling validation windows. |
| n_samp | integer, random search samples. |
| dates | Date vector aligned with rows of df (optional). |
| seed | integer, RNG seed. |
| use_parallel | logical, use furrr/future for parallel exploration. |
| parallel_workers | NULL or integer, number of workers when parallel. |

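
The argument constraints listed above can be checked before a call. The sketch below is illustrative only: `check_segen_args()` is a hypothetical helper that is not part of segen, the `toy` data frame is synthetic, and treating factor or character columns as "categorical" is an assumption. The call to `segen()` runs only if the package is installed.

```r
# Hypothetical helper (NOT part of segen): checks a few documented
# constraints on the arguments before calling segen().
check_segen_args <- function(df, similarity = NULL, ci = 0.8, dist_method = NULL) {
  stopifnot(is.data.frame(df))
  is_num <- vapply(df, is.numeric, logical(1))
  # Assumption: "categorical" means factor or character columns.
  is_cat <- vapply(df, function(col) is.factor(col) || is.character(col), logical(1))
  # Documented requirement: all numeric OR all categorical.
  if (!(all(is_num) || all(is_cat))) stop("df must be all numeric or all categorical")
  in_unit <- function(x) is.numeric(x) && length(x) == 1 && x > 0 && x < 1
  if (!is.null(similarity) && !in_unit(similarity)) stop("similarity must lie in (0, 1)")
  if (!in_unit(ci)) stop("ci must lie in (0, 1)")
  methods <- c("euclidean", "manhattan", "maximum", "minkowski", "correlation", "dtw")
  if (!is.null(dist_method) && !(dist_method %in% methods)) stop("unknown dist_method")
  invisible(TRUE)
}

# Synthetic single-feature series (a random walk), for illustration only.
set.seed(42)
toy <- data.frame(x = cumsum(rnorm(300)))

check_segen_args(toy, similarity = 0.7, ci = 0.8, dist_method = "euclidean")

# Run the model only when segen is available; small n_windows and n_samp
# keep the random search short.
if (requireNamespace("segen", quietly = TRUE)) {
  fit <- segen::segen(toy, seq_len = 10, similarity = 0.7,
                      dist_method = "euclidean", n_windows = 3,
                      n_samp = 2, seed = 42)
}
```

Arguments left as NULL (here `rescale`) are sampled by the random search, as described in the table.
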
list with exploration, history, best_model, time_log.
This function returns a list including:
exploration: list of all non-null models, complete with predictions and error metrics
history: a table with the sampled models, hyper-parameters, validation errors
best_model: results for the best selected model according to the weighted average rank, including:
predictions: for continuous variables, min, max, q25, q50, q75, quantiles at selected ci, mean, sd, mode, skewness, kurtosis, IQR to range, risk ratio, upside probability and divergence for each point of the predicted sequences; for factor variables, min, max, q25, q50, q75, quantiles at selected ci, proportions, difformity (deviation of proportions normalized over the maximum possible deviation), entropy, upgrade probability and divergence for each point of the predicted sequences
testing_errors: testing errors for each time feature for the best selected model (for continuous variables: me, mae, mse, rmsse, mpe, mape, rmae, rrmse, rame, mase, smse, sce, gmrae; for factor variables: czekanowski, tanimoto, cosine, hassebrook, jaccard, dice, canberra, gower, lorentzian, clark)
plots: standard plots with confidence interval for each time feature
time_log
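
A short sketch of how the returned list might be inspected. `segen_components()` is a hypothetical helper, not part of segen, and `placeholder` is an empty stand-in that only mirrors the documented component names; it is not real output. Access to the dataset as `segen::time_features` assumes the package lazy-loads its data.

```r
# Hypothetical helper (NOT part of segen): reports which documented
# top-level components are present in a result list.
segen_components <- function(res) {
  expected <- c("exploration", "history", "best_model", "time_log")
  stats::setNames(expected %in% names(res), expected)
}

# Empty stand-in using the documented names only (no real results).
placeholder <- list(
  exploration = list(),
  history = data.frame(),
  best_model = list(predictions = list(), testing_errors = list(), plots = list()),
  time_log = NULL
)
segen_components(placeholder)

# With the package installed, the same check applies to a real result.
if (requireNamespace("segen", quietly = TRUE)) {
  res <- segen::segen(segen::time_features[, 1, drop = FALSE],
                      seq_len = 30, similarity = 0.7,
                      n_windows = 3, n_samp = 1)
  print(segen_components(res))
  print(res$history)                    # sampled models and validation errors
  print(res$best_model$testing_errors)  # testing errors per time feature
}
```
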
Giancarlo Vercellino [email protected]
Maintainer: Giancarlo Vercellino [email protected] [copyright holder]
segen(time_features[, 1, drop = FALSE], seq_len = 30, similarity = 0.7, n_windows = 3, n_samp = 1)
A data frame with daily prices for IBM and Microsoft since April 2020.
time_features
A data frame with 2 columns and 1324 rows.
finance.yahoo.com