covlmc | R Documentation |
This function fits a Variable Length Markov Chain with covariates (coVLMC) to a discrete time series coupled with a time series of covariates.
covlmc(
  x,
  covariate,
  alpha = 0.05,
  min_size = 5L,
  max_depth = 100L,
  keep_data = TRUE,
  control = covlmc_control(...),
  ...
)
x |
a discrete time series; can be numeric, character, factor or logical. |
covariate |
a data frame of covariates. |
alpha |
number in (0,1) (default: 0.05) cut off value in the pruning phase (in quantile scale). |
min_size |
number >= 1 (default: 5). Tune the minimum number of observations for a context in the growing phase of the context tree (see below for details). |
max_depth |
integer >= 1 (default: 100). Longest context considered in growing phase of the context tree. |
keep_data |
logical (defaults to |
control |
a list with control parameters, see |
... |
arguments passed to |
The model is built using the algorithm described in Zanin Zambom et al. As in the vlmc() approach, the algorithm first builds a context tree (see ctx_tree()). The min_size parameter is used to compute the actual minimum number of observations required per context in the growing phase of the tree. It is computed as min_size*(1+ncol(covariate)*d)*(s-1), where d is the length of the context (a.k.a. the depth in the tree) and s is the number of states. This corresponds to ensuring min_size observations per parameter of the logistic regression during the estimation phase.
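As a worked illustration of the rule above, the sketch below evaluates the formula for a few context depths. The helper name required_obs and its argument names are hypothetical and not part of the package; only the formula itself comes from this page.

```r
# Hypothetical helper (not part of mixvlmc): minimum number of observations
# required for a context of depth d, following the formula quoted above.
required_obs <- function(min_size, n_covariates, d, n_states) {
  min_size * (1 + n_covariates * d) * (n_states - 1)
}

# Default min_size = 5, one covariate column, three states, depths 0 to 3
depths <- 0:3
req <- required_obs(min_size = 5, n_covariates = 1, d = depths, n_states = 3)
data.frame(depth = depths, required = req)
```

Under these assumptions the requirement grows linearly with the depth: 10, 20, 30 and 40 observations for depths 0 to 3. With min_size = 15, one covariate and three states, the same rule gives 30, 60 and 90 for depths 0, 1 and 2.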
Then logistic models are adjusted in the leaves of the tree: the goal of each logistic model is to estimate the conditional distribution of the next state of the time series given the context (the recent past of the time series) and delayed versions of the covariates. A pruning strategy is used to simplify the models (mainly to reduce the time window associated with the covariates) and the tree itself.
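To make the role of a single logistic model more concrete, here is a minimal sketch in base R. It does not reproduce the internals of covlmc: the simulated data, the variable names and the choice of a single context of length 1 are all assumptions made for illustration.

```r
set.seed(42)
n <- 200
# Hypothetical data: a binary covariate z and a two-state series x whose
# next value depends on the previous covariate value.
z <- rbinom(n, 1, 0.5)
x <- rbinom(n, 1, ifelse(c(0, z[-n]) == 1, 0.8, 0.3))

# One row per time step t >= 2: next state, previous state, previous covariate
d <- data.frame(
  next_state = x[-1],
  prev_state = x[-n],
  prev_cov = z[-n]
)

# Logistic model for a single context of length 1 (previous state equal to 1)
ctx <- d[d$prev_state == 1, ]
fit <- glm(next_state ~ prev_cov, family = binomial(), data = ctx)

# Estimated P(next state = 1 | context, covariate) for both covariate values
p_hat <- predict(fit, newdata = data.frame(prev_cov = c(0, 1)), type = "response")
p_hat
```

In this sketch the covariate is delayed by one time step only; a longer context would bring in further delayed copies of the covariate as additional predictors.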
Parameters specified by control are used to fine tune the behaviour of the algorithm.
a fitted covlmc model.
By default, covlmc uses two different computing engines for logistic models:
when the time series has only two states, covlmc uses stats::glm() with a binomial link (stats::binomial());
when the time series has at least three states, covlmc uses VGAM::vglm() with a multinomial link (VGAM::multinomial()).
Both engines are able to detect degenerate cases and lead to more robust results than using nnet::multinom(). It is nevertheless possible to replace stats::glm() and VGAM::vglm() with nnet::multinom() by setting the global option mixvlmc.predictive to "multinom" (the default value is "glm"). Notice that while results should be comparable, there is no guarantee that they will be identical.
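The option can also be changed with base R alone. The sketch below uses only options() and getOption(); it assumes the option is read when covlmc() is called, so it would affect every call made between the two options() lines.

```r
# Switch the engine to nnet::multinom() and remember the previous setting
old <- options(mixvlmc.predictive = "multinom")
engine <- getOption("mixvlmc.predictive")
engine

# ... calls to covlmc() made here would use the "multinom" engine ...

# Restore the previous setting (an unset option comes back as unset)
options(old)
```

Unlike a scoped helper such as withr::with_options(), this pattern changes the option for the whole session until it is restored, so restoring it explicitly matters.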
Bühlmann, P. and Wyner, A. J. (1999), "Variable length Markov chains." Ann. Statist. 27 (2) 480-513 doi:10.1214/aos/1018031204
Zanin Zambom, A., Kim, S. and Lopes Garcia, N. (2022), "Variable length Markov chain with exogenous covariates." J. Time Ser. Anal., 43 (2) 312-328 doi:10.1111/jtsa.12615
cutoff.covlmc() and prune.covlmc() for post-pruning.
pc <- powerconsumption[powerconsumption$week == 5, ]
dts <- cut(pc$active_power, breaks = c(0, quantile(pc$active_power, probs = c(1 / 3, 2 / 3, 1))))
dts_cov <- data.frame(day_night = (pc$hour >= 7 & pc$hour <= 17))
m_cov <- covlmc(dts, dts_cov, min_size = 15)
draw(m_cov)
withr::with_options(
  list(mixvlmc.predictive = "multinom"),
  m_cov_nnet <- covlmc(dts, dts_cov, min_size = 15)
)
draw(m_cov_nnet)