PST: Probabilistic Suffix Trees and Variable Length Markov Chains

Provides a framework for analysing state sequences with probabilistic suffix trees (PST), the data structure that stores variable length Markov chains (VLMC). Besides functions for learning and optimizing VLMC models, the PST package includes many additional tools for analysing sequence data with these models: visualization tools, functions for sequence prediction and artificial sequence generation, and functions for context and pattern mining. The package is specifically adapted to the social sciences: VLMC models can be learned from sets of individual sequences that may contain missing values, and case weights are taken into account. The package can also compute the probabilistic divergence between two models and fit segmented VLMC, where sub-models fitted to distinct strata of the learning sample are stored in a single PST. This software results from research work carried out within the framework of the Swiss National Centre of Competence in Research LIVES, which is financed by the Swiss National Science Foundation. The authors are grateful to the Swiss National Science Foundation for its financial support.
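To make the idea of a variable length Markov chain concrete, the following is a minimal conceptual sketch in Python. It is not the package's implementation and does not use its API. The function names `build_counts` and `next_symbol_distribution`, the `min_count` threshold, and the toy sequences are all assumptions made for illustration. The sketch counts which symbol follows every context up to a maximum order, then predicts from the longest suffix of the history that was observed often enough.

```python
from collections import Counter, defaultdict


def build_counts(sequences, max_order):
    """Count next-symbol occurrences after every context of length 0..max_order."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for i, symbol in enumerate(seq):
            # A context is the k symbols immediately preceding position i.
            for k in range(min(i, max_order) + 1):
                context = tuple(seq[i - k:i])
                counts[context][symbol] += 1
    return counts


def next_symbol_distribution(counts, history, max_order, min_count=1):
    """Return the longest usable suffix of `history` and its next-symbol distribution.

    A suffix is usable when it was followed by a symbol at least `min_count`
    times in the training sequences (a hypothetical threshold for this sketch).
    """
    for k in range(min(len(history), max_order), -1, -1):
        context = tuple(history[len(history) - k:])
        total = sum(counts[context].values()) if context in counts else 0
        if total >= min_count:
            return context, {s: c / total for s, c in counts[context].items()}
    return (), {}


# Toy data, invented for this example.
sequences = ["abab", "abba"]
counts = build_counts(sequences, max_order=2)

context, dist = next_symbol_distribution(counts, "ab", max_order=2)
print(context, dist)

# With a stricter threshold, a rarely seen context falls back to a shorter suffix.
short_context, short_dist = next_symbol_distribution(
    counts, "bb", max_order=2, min_count=2
)
print(short_context, short_dist)
```

The memory length therefore varies with the history: a frequent long context is used in full, while a rare one falls back to a shorter suffix, down to the empty context if necessary.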

Author: Alexis Gabadinho [aut, cre, cph]
Date of publication: 2016-11-10 17:55:35
Maintainer: Alexis Gabadinho <alexis.gabadinho@wanadoo.fr>
License: GPL (>= 2)
Version: 0.92

Man pages

cmine: Mining contexts
cplot: Plot single nodes of a probabilistic suffix tree
cprob: Empirical conditional probability distributions of order 'L'
generate: Generate sequences using a probabilistic suffix tree
impute: Impute missing values using a probabilistic suffix tree
logLik: Log-Likelihood of a variable length Markov chain model
nobs: Extract the number of observations to which a VLMC model is...
nodenames: Retrieve the node labels of a PST
pdist: Compute probabilistic divergence between two PST
plot-PSTr: Plot a PST
pmine: PST based pattern mining
ppplot: Plotting a branch of a probabilistic suffix tree
pqplot: Prediction quality plot
predict: Compute the probability of categorical sequences using a...
print: Print method for objects of class 'PSTf' and 'PSTr'
prune: Prune a probabilistic suffix tree
PSTf-class: Flat representation of a probabilistic suffix tree
PSTr-class: Nested representation of a probabilistic suffix tree
pstree: Build a probabilistic suffix tree
query: Retrieve counts or next symbol probability distribution
s1-data: Example sequence data set
SRH-data: Longitudinal data on self rated health
subtree: Extract a subtree from a segmented PST
summary: Summary of variable length Markov chain model
tune: AIC, AICc or BIC based model selection

Files in this package

PST
PST/inst
PST/inst/CITATION
PST/NAMESPACE
PST/NEWS
PST/data
PST/data/SRH.RData
PST/data/s1.rda
PST/R
PST/R/gain.R
PST/R/plotNode.R
PST/R/PST-setlayout.R
PST/R/plotNodeProb.R
PST/R/subtree.R
PST/R/logLik.R
PST/R/seqgbar.R
PST/R/plot.PSTr.R
PST/R/plotEdge.R
PST/R/impute.R
PST/R/AllClass.R
PST/R/PST-flist.R
PST/R/predict.R
PST/R/cprobd-methods.R
PST/R/PST-legend.R
PST/R/plotTree.R
PST/R/as.pstree.R
PST/R/plotNodeLimit.R
PST/R/PSTf-methods.R
PST/R/print-PSTr.R
PST/R/query.R
PST/R/nobs.R
PST/R/pdist.R
PST/R/longest-suffix.R
PST/R/gain-G2.R
PST/R/PSTr-functions.R
PST/R/PSTr-methods.R
PST/R/cmine.R
PST/R/generate.R
PST/R/pstree.R
PST/R/PSTf-functions.R
PST/R/pmine.R
PST/R/AllGeneric.R
PST/R/gain-G1.R
PST/R/ppplot.R
PST/R/prune.R
PST/R/pqplot.R
PST/R/plotProb.R
PST/R/tune.R
PST/R/zzz.R
PST/R/context.R
PST/R/cprob.R
PST/R/cplot.R
PST/MD5
PST/DESCRIPTION
PST/man
PST/man/ppplot.Rd
PST/man/pqplot.Rd
PST/man/tune.Rd
PST/man/subtree.Rd
PST/man/cprob.Rd
PST/man/logLik.Rd
PST/man/nobs.Rd
PST/man/s1-data.Rd
PST/man/predict.Rd
PST/man/cmine.Rd
PST/man/pmine.Rd
PST/man/prune.Rd
PST/man/PSTr-class.Rd
PST/man/PSTf-class.Rd
PST/man/generate.Rd
PST/man/nodenames.Rd
PST/man/impute.Rd
PST/man/query.Rd
PST/man/print.Rd
PST/man/pstree.Rd
PST/man/SRH-data.Rd
PST/man/pdist.Rd
PST/man/cplot.Rd
PST/man/summary.Rd
PST/man/plot-PSTr.Rd