PST: Probabilistic Suffix Trees and Variable Length Markov Chains

Provides a framework for analysing state sequences with probabilistic suffix trees (PST), the construction that stores variable length Markov chains (VLMC). Besides functions for learning and optimizing VLMC models, the PST library includes many additional tools to analyse sequence data with these models: visualization tools, functions for sequence prediction and artificial sequences generation, as well as for context and pattern mining. The package is specifically adapted to the field of social sciences by allowing to learn VLMC models from sets of individual sequences possibly containing missing values, and by accounting for case weights. The library also allows to compute probabilistic divergence between two models, and to fit segmented VLMC, where sub-models fitted to distinct strata of the learning sample are stored in a single PST. This software results from research work executed within the framework of the Swiss National Centre of Competence in Research LIVES, which is financed by the Swiss National Science Foundation. The authors are grateful to the Swiss National Science Foundation for its financial support.

Getting started

Package details

AuthorAlexis Gabadinho [aut, cre, cph]
MaintainerAlexis Gabadinho <>
LicenseGPL (>= 2)
Package repositoryView on CRAN
Installation Install the latest version of this package by entering the following in R:

Try the PST package in your browser

Any scripts or data that you put into this service are public.

PST documentation built on May 2, 2019, 3:43 a.m.