Prune a probabilistic suffix tree with a series of cut-offs and select the model having the lowest value of the selected information criterion. The available information criteria are the Akaike information criterion (AIC), AIC with a correction for finite sample sizes (AICc), and the Bayesian information criterion (BIC).
object |
a probabilistic suffix tree, i.e., an object of class |
gain |
character. The gain function used for pruning decisions. See |
C |
numeric. A vector of cutoff values. See |
criterion |
The criterion used to select the model, either AIC, AICc or BIC. AICc should be used when the ratio between the number of observations and the number of estimated parameters is low, which is often the case with VLMC models. Burnham and Anderson (2004) suggest using AICc instead of AIC when this ratio is lower than 40. |
output |
If |
The tune function selects, among a series of PSTs pruned with different values of the C cutoff, the model having the lowest value of the chosen criterion (AIC, AICc or BIC). The function can return either the selected PST or a data frame containing the statistics for each model. For more details, see Gabadinho and Ritschard (2016).
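As a rough illustration of how the three criteria rank candidate models, the base R sketch below applies the standard textbook formulas to made-up numbers. The log-likelihoods, parameter counts and sample size are hypothetical, and the code does not call the PST package or reproduce how tune computes its statistics internally.

```r
# Hypothetical values for three pruned models (NOT produced by PST):
# log-likelihood and number of free parameters of each candidate.
loglik <- c(-3900, -3915, -3950)
k      <- c(120, 69, 40)
n      <- 2000   # assumed number of observations

# Standard definitions of the three criteria.
aic  <- 2 * k - 2 * loglik
aicc <- aic + (2 * k * (k + 1)) / (n - k - 1)   # finite-sample correction
bic  <- k * log(n) - 2 * loglik

ic <- data.frame(k = k, loglik = loglik, AIC = aic, AICc = aicc, BIC = bic)
print(ic)

# Index of the model each criterion would select (lowest value wins).
selected <- c(AIC = which.min(aic), AICc = which.min(aicc), BIC = which.min(bic))
print(selected)
```

With these numbers the ratio n/k is below 40 for the two larger models, which is the situation where the AICc correction matters most. BIC penalizes parameters more heavily than AIC here, so the two criteria need not agree on the selected model.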
If output="PST", a PST, that is, an object of class PSTf. If output="stats", a matrix with the results of the tuning procedure. The selected model is tagged with ***, models with IC < min(IC)+2 are tagged with **, and models with IC < min(IC)+10 are tagged with *.
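The tagging rule can be sketched in base R as follows. This is only an illustration of the thresholds described above, assuming the weakest tier (within 10 of the minimum) is marked with a single asterisk. The first two IC values are the AIC values printed in the example output; the remaining three are hypothetical, and the variable names are not taken from the package.

```r
# IC values for five candidate models; the last three are made up.
ic <- c(7973.99, 7962.51, 7963.80, 7970.00, 7990.00)

best <- min(ic)

# Three tiers: the minimum itself, within 2 of it, within 10 of it.
tag <- ifelse(ic == best, "***",
       ifelse(ic < best + 2, "**",
       ifelse(ic < best + 10, "*", "")))

print(data.frame(IC = ic, tag = tag))
```

The thresholds of 2 and 10 match the usual rule of thumb for information-criterion differences discussed in Burnham and Anderson (2004).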
Alexis Gabadinho
Burnham, K. P. & Anderson, D. R. (2004). Multimodel Inference: Understanding AIC and BIC in Model Selection. Sociological Methods & Research, 33, pp. 261-304.
Gabadinho, A. & Ritschard, G. (2016). Analyzing State Sequences with Probabilistic Suffix Trees: The PST R Package. Journal of Statistical Software, 72(3), pp. 1-39.
## load PST (this also loads TraMineR, which provides actcal and seqdef)
library(PST)
## activity calendar for year 2000
## from the Swiss Household Panel
## see ?actcal
data(actcal)
## selecting individuals aged 20 to 59
actcal <- actcal[actcal$age00>=20 & actcal$age00 <60,]
## defining a sequence object
actcal.lab <- c("> 37 hours", "19-36 hours", "1-18 hours", "no work")
actcal.seq <- seqdef(actcal,13:24,labels=actcal.lab)
## building a PST
actcal.pst <- pstree(actcal.seq, nmin=2, ymin=0.001)
## Cut-offs for 5% and 1% (see ?prune)
C95 <- qchisq(0.95,4-1)/2
C99 <- qchisq(0.99,4-1)/2
## selecting the optimal PST using AIC criterion
actcal.pst.opt <- tune(actcal.pst, gain="G2", C=c(C95,C99))
## plotting the tree
plot(actcal.pst.opt)
Loading required package: TraMineR
TraMineR stable version 2.2-1 (Built: 2020-11-21)
Website: http://traminer.unige.ch
Please type 'citation("TraMineR")' for citation information.
Loading required package: RColorBrewer
PST version 0.94 (Built: 2020-11-22)
Website: http://r-forge.r-project.org/projects/pst
[>] 4 distinct states appear in the data:
1 = A
2 = B
3 = C
4 = D
[>] state coding:
[alphabet] [label] [long label]
1 A A > 37 hours
2 B B 19-36 hours
3 C C 1-18 hours
4 D D no work
[>] 1472 sequences in the data set
[>] min/max sequence length: 12/12
[>] 1472 sequence(s) - min/max length: 12/12
[>] max. depth L=11, nmin=2, ymin=0.001
[L] [nodes]
0 1
1 4
2 16
3 40
4 68
5 79
6 86
7 87
8 79
9 69
10 55
11 39
[>] computing sequence(s) likelihood ... (0.316 secs)
[>] total time: 1.875 secs
[>] model 1: AIC=7973.99 (C=3.91)
[>] model 2: AIC=7962.51 (C=5.67)
[>] model 2 selected : AIC=7962.51 (C=5.67)
[>] 11 nodes, 12 leaves, 69 free parameters
[>] building 'PSTr' representation, max. depth=5... (0.002 secs)
There were 23 warnings (use warnings() to see them)
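The two cut-offs reported in the output above (C=3.91 and C=5.67) can be reproduced with base R alone. The sketch below assumes that the degrees of freedom, written as 4-1 in the example, correspond to the number of states in the alphabet minus one; the variable names are illustrative.

```r
# Number of states in the example alphabet (A, B, C, D).
alphabet_size <- 4
df <- alphabet_size - 1

# Significance levels of 5% and 1%.
alpha <- c(0.05, 0.01)

# Half the chi-square quantile, as in the example's C95 and C99.
cutoffs <- qchisq(1 - alpha, df = df) / 2
names(cutoffs) <- c("C95", "C99")

print(round(cutoffs, 2))
```

A stricter significance level gives a larger cut-off, so C99 prunes more aggressively than C95. In the example this yields the model with the lower AIC.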