vlmc | R Documentation |
Fit a Variable Length Markov Chain (VLMC) to a discrete time series,
in basically two steps:
First, a large Markov Chain is generated containing (all, if threshold.gen = 1) the context states of the time series.
In the second step, many states of the MC are collapsed by pruning
the corresponding context tree.
Currently, the “alphabet” may contain at most 26 different “character”s.
vlmc(dts,
cutoff.prune = qchisq(alpha.c, df=max(.1,alpha.len-1),lower.tail=FALSE)/2,
alpha.c = 0.05,
threshold.gen = 2,
code1char = TRUE, y = TRUE, debug = FALSE, quiet = FALSE,
dump = 0, ctl.dump = c(width.ct = 1+log10(n), nmax.set = -1) )
is.vlmc(x)
## S3 method for class 'vlmc'
print(x, digits = max(3, getOption("digits") - 3), ...)
dts |
a discrete “time series”; can be a numeric, character or factor. |
cutoff.prune |
non-negative number; the cutoff used for pruning;
defaults to half the |
alpha.c |
number in (0,1) used to specify |
threshold.gen |
integer |
code1char |
logical; if true (default), the data |
y |
logical; if true (default), the data |
debug |
logical; should debugging info be printed to stderr. |
quiet |
logical; if true, don't print some warnings. |
dump |
integer in |
ctl.dump |
integer of length 2, say |
x |
a fitted |
digits |
integer giving the number of significant digits for printing numbers. |
... |
potentially further arguments [Generic]. |
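The default for cutoff.prune shown in the usage above is half the upper alpha.c quantile of a chi-squared distribution with alpha.len - 1 degrees of freedom (but at least 0.1). The following is a minimal sketch of that computation in base R; the helper name default_cutoff is hypothetical and not part of the package.

```r
# Hypothetical helper (not part of the package): reproduces the default
# expression for cutoff.prune as written in the usage section.
default_cutoff <- function(alpha.len, alpha.c = 0.05) {
  qchisq(alpha.c, df = max(0.1, alpha.len - 1), lower.tail = FALSE) / 2
}

cut2 <- default_cutoff(2)  # binary alphabet, 1 degree of freedom
cut4 <- default_cutoff(4)  # four-letter alphabet, 3 degrees of freedom
print(c(binary = cut2, four = cut4))
```

Under this formula, a larger alphabet or a smaller alpha.c gives a larger cutoff, so pruning becomes more aggressive.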
A "vlmc" object, basically a list with components
nobs |
length of data series when fit. (was named |
threshold.gen , cutoff.prune |
the arguments (or their defaults). |
alpha.len |
the alphabet size. |
alpha |
the alphabet used, as one string. |
size |
a named integer vector of length (>=) 4, giving characteristic sizes of the fitted VLMC. Its named components are
|
vlmc.vec |
integer vector, containing (an encoding of) the fitted VLMC tree. |
y |
if |
call |
the |
Set cutoff = 0, thresh = 1 for getting a “perfect fit”,
i.e. a VLMC which perfectly re-predicts the data (apart from the first
observation). Note that even with cutoff = 0 some pruning may
happen, for all (terminal) nodes with δ = 0.
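As a hedged illustration of the note above, the sketch below spells out the two arguments by their full names, cutoff.prune and threshold.gen, and compares the size component of the default fit with that of the “perfect fit”. It assumes the function is provided by a package named VLMC and skips the fit if that package is not installed; the data are the dt1 series from the examples.

```r
# Same toy series as in the examples section.
f1 <- c(1, 0, 0, 0)
f2 <- rep(1:0, 2)
dt1 <- c(f1, f1, f2, f1, f2, f2, f1)

fit_default <- NULL
fit_perfect <- NULL
# Assumption: vlmc() lives in the package 'VLMC'; skip quietly otherwise.
if (requireNamespace("VLMC", quietly = TRUE)) {
  fit_default <- VLMC::vlmc(dt1)
  # No pruning cutoff, and every context state is generated:
  fit_perfect <- VLMC::vlmc(dt1, cutoff.prune = 0, threshold.gen = 1)
  print(fit_default$size)  # characteristic sizes of the default fit
  print(fit_perfect$size)  # characteristic sizes of the "perfect fit"
}
```

The abbreviated names cutoff and thresh in the note rely on R's partial argument matching; the full names are used here for clarity.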
Martin Maechler
Bühlmann P. and Wyner A. (1999) Variable Length Markov Chains. Annals of Statistics 27, 480–513.
Mächler M. and Bühlmann P. (2004) Variable Length Markov Chains: Methodology, Computing, and Software. J. Computational and Graphical Statistics 13 (2), 435–455.
Mächler M. (2004) VLMC — Implementation and R interface; working paper.
draw.vlmc, entropy, simulate.vlmc for “VLMC bootstrapping”.
f1 <- c(1,0,0,0)
f2 <- rep(1:0,2)
(dt1 <- c(f1,f1,f2,f1,f2,f2,f1))
(vlmc.dt1 <- vlmc(dt1))
vlmc(dt1, dump = 1,
ctl.dump = c(wid = 3, nmax = 20), debug = TRUE)
(vlmc.dt1c01 <- vlmc(dts = dt1, cutoff.prune = .1, dump=1))
data(presidents)
dpres <- cut(presidents, c(0,45,70, 100)) # three values + NA
table(dpres <- factor(dpres, exclude = NULL)) # NA as 4th level
levels(dpres)#-> make the alphabet -> warning
vlmc.pres <- vlmc(dpres, debug = TRUE)
vlmc.pres
## alphabet and its length:
vlmc.pres$alpha
stopifnot(
  length(print(strsplit(vlmc.pres$alpha, NULL)[[1]])) == vlmc.pres$alpha.len
)
## You can now use larger alphabets (up to 95 letters):
set.seed(7); it <- sample(40, 20000, replace=TRUE)
v40 <- vlmc(it)
v40
## even larger alphabets now give an error:
il <- sample(100, 10000, replace=TRUE)
ee <- tryCatch(vlmc(il), error= function(e)e)
stopifnot(is(ee, "error"))