# logLik: Log Likelihood of and between VLMC objects In VLMC: Variable Length Markov Chains ('VLMC') Models

## Description

Compute the log-likelihood or “entropy” of a fitted `vlmc` object. This is a method for the generic `logLik`.

## Usage

 ```1 2 3 4``` ```entropy(object) ## S3 method for class 'vlmc' logLik(object, ...) entropy2(ivlmc1, ivlmc2, alpha.len = ivlmc1[1]) ```

## Arguments

 `object` typically the result of `vlmc(..)`.
 `ivlmc1,ivlmc2` two `vlmc` (sub) trees, see `vlmc`. `alpha.len` positive integer specifying the alphabet length. `...` (potentially more arguments; required by generic)

## Details

The `logLik.vlmc()` method computes the log likelihood for a fitted `vlmc` object. `entropy` is an alias for `logLik` for reasons of back compatibility.

`entropy2` is less clear ... ... [[[ FIXME ]]] ... ...

## Value

a negative number, in some contexts typically further divided by `log(x\$alpha.len)`.

Note that the `logLik` method is used by the default method of the `AIC` generic function (from R version 1.4.x), and hence provides `AIC(object)` for vlmc objects. Also, since vlmc version 1.3-13, `BIC()` works as well.

## Author(s)

Martin Maechler

`deviance.vlmc`, `vlmc`, `draw.vlmc`.

## Examples

 ``` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34``` ```dd <- cumsum(rpois(999, 1.5)) %% 10 (vd <- vlmc(dd)) entropy(vd)# the bare number logLik(vd) logLik(vdL <- vlmc(dd, cutoff = 3)) entropy2(vd \$vlmc.vec, vdL\$vlmc.vec) ## AIC model selection: f1 <- c(1,0,0,0) # as in example(vlmc) f2 <- rep(1:0,2) (dt1 <- c(f1,f1,f2,f1,f2,f2,f1)) AIC(print(vlmc(dt1))) AIC(print(vlmc(dt1, cutoff = 2.6))) AIC(print(vlmc(dt1, cutoff = 0.4)))# these two differ ``not really'' AIC(print(vlmc(dt1, cutoff = 0.1))) ## Show how to compute it from the fitted conditional probabilities : logLikR <- function(x) { dn <- dimnames(pr <- predict(x)) sum(log(pr[cbind(2:nrow(pr), match(dn[[1]][-1], dn[[2]]))])) } all.equal( logLikR(vd), c(logLik (vd)), tol=1e-10) # TRUE, they do the same ## Compare different ones: [cheap example]: example(draw) for(n in ls()) if(is.vlmc(get(n))) { vv <- get(n) cat(n,":",formatC(logLik(vv) / log(vv\$alpha.len), format= "f", wid=10),"\n") } ```

### Example output

```'vlmc' a Variable Length Markov Chain;
alphabet '0123456789', |alphabet| = 10, n = 999.
Call: vlmc(dts = dd);  default cutoff =8.459
-> extensions (= \$size ) :
ord.MC   context nr.leaves     total
1        10        10       222
AIC =  3174
[1] -1497.225
'log Lik.' -1497.225 (df=90)
'log Lik.' -1286.197 (df=576)
[1] -641.7994
[1] 1 0 0 0 1 0 0 0 1 0 1 0 1 0 0 0 1 0 1 0 1 0 1 0 1 0 0 0

'vlmc' a Variable Length Markov Chain;
alphabet '01', |alphabet| = 2, n = 28.
Call: vlmc(dts = dt1);  default cutoff =1.921
-> extensions (= \$size ) :
ord.MC   context nr.leaves     total
3         4         3        26
AIC =  21.46
[1] 21.46023

'vlmc' a Variable Length Markov Chain;
alphabet '01', |alphabet| = 2, n = 28.
Call: vlmc(dts = dt1, cutoff.prune = 2.6);  cutoff =2.6
-> extensions (= \$size ) :
ord.MC   context nr.leaves     total
1         2         1        10
AIC =  27.51
[1] 27.50815

'vlmc' a Variable Length Markov Chain;
alphabet '01', |alphabet| = 2, n = 28.
Call: vlmc(dts = dt1, cutoff.prune = 0.4);  cutoff =0.4
-> extensions (= \$size ) :
ord.MC   context nr.leaves     total
7        11         5        62
AIC =  27.55
[1] 27.54518

'vlmc' a Variable Length Markov Chain;
alphabet '01', |alphabet| = 2, n = 28.
Call: vlmc(dts = dt1, cutoff.prune = 0.1);  cutoff =0.1
-> extensions (= \$size ) :
ord.MC   context nr.leaves     total
7        11         6        66
AIC =  27.55
[1] 27.54518
[1] TRUE

draw>   example(vlmc)

vlmc> f1 <- c(1,0,0,0)

vlmc> f2 <- rep(1:0,2)

vlmc> (dt1 <- c(f1,f1,f2,f1,f2,f2,f1))
[1] 1 0 0 0 1 0 0 0 1 0 1 0 1 0 0 0 1 0 1 0 1 0 1 0 1 0 0 0

vlmc> (vlmc.dt1  <- vlmc(dt1))

'vlmc' a Variable Length Markov Chain;
alphabet '01', |alphabet| = 2, n = 28.
Call: vlmc(dts = dt1);  default cutoff =1.921
-> extensions (= \$size ) :
ord.MC   context nr.leaves     total
3         4         3        26
AIC =  21.46

vlmc>  vlmc(dt1, dump = 1,
vlmc+       ctl.dump = c(wid = 3, nmax = 20), debug = TRUE)
vlmc: Alpha = '01' ; |X| = 2
vlmc: ctl.dump =  3 20
vlmc: n = |data| = 28, cutoff{prune} = 1.92073, threshold{gen} = 2
vlmc: |alphabet| = 2, alphabet = 01
generating... pruning... Dump{Tree} __after__ pruning:
Lev Ch|   0   1 |  tot |  num size : ..{set->list}..
------+--------------------------------------------------
0 x |  18  10 |   28 |   28   32 : 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 ..
1 0 |   8   9 |   17 |   17   32 : 2 3 4 6 7 8 10 12 14 15 16 18 20 22 24 26 27
2 0 |   4   3 |    7 |    7   16 : 3 4 7 8 15 16 27
3 0 |   0   3 |    3 |    3   16 : 4 8 16
3 1 |   4   0 |    4 |    4   16 : 3 7 15 27
1 1 |  10   0 |   10 |   10   16 : 1 5 9 11 13 17 19 21 23 25
computing differences['completing'] ... writing tree...
[0]01
0 0 0
[1]1 4 6
[2]2 0 0
[3]3 0 3
- -1
- -1

[3]3 4 0
- -1
- -1

- -1

[1]1 10 0
- -1
- -1

'vlmc' a Variable Length Markov Chain;
alphabet '01', |alphabet| = 2, n = 28.
Call: vlmc(dts = dt1, debug = TRUE, dump = 1, ctl.dump = c(wid = 3,     nmax = 20));  default cutoff =1.921
-> extensions (= \$size ) :
ord.MC   context nr.leaves     total
3         4         3        26
AIC =  21.46

vlmc> (vlmc.dt1c01 <- vlmc(dts = dt1, cutoff.prune = .1, dump=1))
Lev Ch|  0  1 | tot | num size : ..{set->list}..
------+---------------------------------------------
0 x | 18 10 |  28 |  28  32 : 0 1 2 3 4 5 6 7 8 9 10 11 12 ..
1 0 |  8  9 |  17 |  17  32 : 2 3 4 6 7 8 10 12 14 15 16 18 20 ..
2 0 |  4  3 |   7 |   7  16 : 3 4 7 8 15 16 27
3 0 |  0  3 |   3 |   3  16 : 4 8 16
3 1 |  4  0 |   4 |   4  16 : 3 7 15 27
2 1 |  4  6 |  10 |  10  16 : 2 6 10 12 14 18 20 22 24 26
3 0 |  3  6 |   9 |   9  16 : 6 10 12 14 18 20 22 24 26
4 0 |  1  2 |   3 |   3  16 : 6 10 18
5 0 |  1  2 |   3 |   3  16 : 6 10 18
6 1 |  1  2 |   3 |   3  16 : 6 10 18
7 0 |  0  2 |   2 |   2  16 : 10 18
4 1 |  2  4 |   6 |   6  16 : 12 14 20 22 24 26
5 0 |  2  4 |   6 |   6  16 : 12 14 20 22 24 26
6 0 |  0  2 |   2 |   2  16 : 12 20
6 1 |  2  2 |   4 |   4  16 : 14 22 24 26
1 1 | 10  0 |  10 |  10  16 : 1 5 9 11 13 17 19 21 23 25

'vlmc' a Variable Length Markov Chain;
alphabet '01', |alphabet| = 2, n = 28.
Call: vlmc(dts = dt1, cutoff.prune = 0.1, dump = 1);  cutoff =0.1
-> extensions (= \$size ) :
ord.MC   context nr.leaves     total
7        11         6        66
AIC =  27.55

vlmc> data(presidents)

vlmc> dpres <- cut(presidents, c(0,45,70, 100)) # three values + NA

vlmc> table(dpres <- factor(dpres, exclude = NULL)) # NA as 4th level

(0,45]  (45,70] (70,100]     <NA>
28       60       26        6

vlmc> levels(dpres)#-> make the alphabet -> warning
[1] "(0,45]"   "(45,70]"  "(70,100]" NA

vlmc> vlmc.pres <- vlmc(dpres, debug = TRUE)
vlmc: Alpha = '0123' ; |X| = 4
vlmc: ctl.dump =  3.079181 -1
vlmc: n = |data| = 120, cutoff{prune} = 3.90736, threshold{gen} = 2
vlmc: |alphabet| = 4, alphabet = 0123
generating... pruning... computing differences['completing'] ... writing tree...
[0]0123
0 1 2 1 2
[1]1 18 7 0 2
- -1
- -1
- -1
- -1

[1]1 9 33 5 1
- -1
- -1
[2]2 0 6 6 0
- -1
- -1
- -1
- -1

- -1

[1]1 0 12 14 0
- -1
- -1
- -1
- -1

- -1

vlmc> vlmc.pres

'vlmc' a Variable Length Markov Chain;
alphabet '0123', |alphabet| = 4, n = 120.
Call: vlmc(dts = dpres, debug = TRUE);  default cutoff =3.907
-> extensions (= \$size ) :
ord.MC   context nr.leaves     total
2         5         3        42
AIC =  227.6

vlmc> ## alphabet & and its length:
vlmc> vlmc.pres\$alpha
[1] "0123"

vlmc> stopifnot(
vlmc+   length(print(strsplit(vlmc.pres\$alpha,NULL)[[1]])) == vlmc.pres\$ alpha.len
vlmc+ )
[1] "0" "1" "2" "3"

vlmc> ## You now can use larger alphabets (up to 95) letters:
vlmc> set.seed(7); it <- sample(40, 20000, replace=TRUE)

vlmc> v40 <- vlmc(it)

vlmc> v40

'vlmc' a Variable Length Markov Chain;
alphabet 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMN', |alphabet| = 40, n = 20000.
Call: vlmc(dts = it);  default cutoff =27.29
-> extensions (= \$size ) :
ord.MC   context nr.leaves     total
0         1         1        82
AIC =  147590

vlmc> ## even larger alphabets now give an error:
vlmc> il <- sample(100, 10000, replace=TRUE)

vlmc> ee <- tryCatch(vlmc(il), error= function(e)e)

vlmc> stopifnot(is(ee, "error"))

draw>   draw(vlmc.dt1c01)
[x]-( 18 9| 27)-F
+--[0]-( 8 9| 17) <1.38>-F
|   +--[0]-( 4 3| 7) <0.14>-F
|   |   +--[0]-( 0 3| 3) <2.54>-T
|   |   '--[1]-( 4 0| 4) <2.24>-T
|   '--[1]-( 4 6| 10) <0.10>
|       '--[0]-( 3 6| 9) <0.09>-F
|           +--[0]-( 1 2| 3) <0.00>
|           |   '--[0]-( 1 2| 3) <0.00>
|           |       '--[1]-( 1 2| 3) <0.00>
|           |           '--[0]-( 0 2| 2) <0.81>-T
|           '--[1]-( 2 4| 6) <0.00>
|               '--[0]-( 2 4| 6) <0.00>-F
|                   +--[0]-( 0 2| 2) <0.81>-T
|                   '--[1]-( 2 2| 4) <0.24>-T
'--[1]-( 10 0| 10) <4.05>-T

draw>   draw(vlmc.dt1c01, flag = FALSE)
-4.000000

draw>   draw(vlmc.dt1c01, kind = 1)
[x]-( 18 9| 27)
+--[0]-( 8 9| 17) <1.38>
+---+--[0]-( 4 3| 7) <0.14>
+---+---+--[0]-( 0 3| 3) <2.54>
+---+---+--[1]-( 4 0| 4) <2.24>
+---+--[1]-( 4 6| 10) <0.10>
+---+---+--[0]-( 3 6| 9) <0.09>
+---+---+---+--[0]-( 1 2| 3) <0.00>
+---+---+---+---+--[0]-( 1 2| 3) <0.00>
+---+---+---+---+---+--[1]-( 1 2| 3) <0.00>
+---+---+---+---+---+---+--[0]-( 0 2| 2) <0.81>
+---+---+---+--[1]-( 2 4| 6) <0.00>
+---+---+---+---+--[0]-( 2 4| 6) <0.00>
+---+---+---+---+---+--[0]-( 0 2| 2) <0.81>
+---+---+---+---+---+--[1]-( 2 2| 4) <0.24>
+--[1]-( 10 0| 10) <4.05>

draw>   draw(vlmc.dt1)
[x]-( 18 9| 27)-F
+--[0]-( 8 9| 17) <1.38>
|   '--[0]-( 4 3| 7) <0.14>-F
|       +--[0]-( 0 3| 3) <2.54>-T
|       '--[1]-( 4 0| 4) <2.24>-T
'--[1]-( 10 0| 10) <4.05>-T

draw>   draw(vlmc.dt1, show = 3)
[x]-( 18 9| 27)-F
+--[0]-( 8 9| 17) <1.38>-.
|   +--[0]-( 4 3| 7) <0.14>-F
|   |   +--[0]-( 0 3| 3) <2.54>-T
|   |   '--[1]-( 4 0| 4) <2.24>-T
|   '--[1]-(___)
'--[1]-( 10 0| 10) <4.05>-T

draw>   draw(vlmc.dt1, cumulative = FALSE)
[x]-( 0 0| 0)-F
+--[0]-( 4 6| 10)
|   '--[0]-( 0 0| 0)-F
|       +--[0]-( 0 3| 3)-T
|       '--[1]-( 4 0| 4)-T
'--[1]-( 10 0| 10)-T
Warning messages:
1: In vlmc(dpres, debug = TRUE) :
alphabet with >1-letter strings; trying to abbreviate
2: In vlmc(it) : alphabet with >1-letter strings; trying to abbreviate
v40 : -19994.0949
vd :  -650.2367
vdL :  -558.5883
vlmc.dt1 :    -9.7095
vlmc.dt1c01 :    -4.0000
vlmc.pres :   -71.2724
```

VLMC documentation built on May 1, 2019, 11:32 p.m.