generate: Generate sequences using a probabilistic suffix tree
In PST: Probabilistic Suffix Trees and Variable Length Markov Chains

generate

R Documentation

Generate sequences using a probabilistic suffix tree

Description

Generates artificial state sequences from a fitted probabilistic suffix tree (PST).

Usage

## S4 method for signature 'PSTf'
generate(object, l, n, s1, p1, method, L, cnames)

Arguments

`object`	a probabilistic suffix tree, i.e., an object of class `"PSTf"` as returned by the `pstree`, `prune` or `tune` function.
`l`	integer. Length of the sequence(s) to generate.
`n`	integer. Number of sequences to generate.
`s1`	character. The first state in the sequences. The length of the vector should equal `n`. If specified, the first state in the sequence(s) is not randomly generated but taken from `s1`.
`p1`	numeric. An optional probability vector for generating the first state in the sequence(s). If specified, the first state in the sequence(s) is randomly generated using the probability distribution in `p1` instead of the probability distribution taken from the root node of `object`.
`method`	character. If `method="pmax"`, at each position the state having the highest probability is chosen. If `method="prob"`, at each position the state is randomly generated using the corresponding probability distribution taken from `object`.
`L`	integer. Maximal depth used to extract the probability distributions from the PST object.
`cnames`	character. Optional column (position) names for the returned state sequence object. By default, the names of the sequence object to which the model was fitted are used (slot `data` of the PST).

Details

Because a probabilistic suffix tree (PST) represents a generating model, it can be used to generate artificial sequence data sets. Sequences are built by generating the state at each successive position. The process is similar to sequence prediction (see `predict`), except that the conditional probability distributions retrieved from the PST are used to generate a symbol instead of computing the probability of an existing state. For more details, see Gabadinho & Ritschard (2016).
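The position-by-position process described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: `toy_pst`, `lookup_context` and `generate_toy` are hypothetical names, and the probabilities are made up. The model is a plain list of next-state distributions keyed by context, where `"e"` stands for the root (empty context).

```r
# Hypothetical toy model (made-up probabilities), NOT a PSTf object.
# Keys are contexts; "a-b" means "a followed by b"; "e" is the root.
toy_pst <- list(
  "e"   = c(a = 0.5, b = 0.5),
  "a"   = c(a = 0.2, b = 0.8),
  "b"   = c(a = 0.6, b = 0.4),
  "a-b" = c(a = 0.9, b = 0.1)
)

# Return the distribution attached to the longest stored suffix of `past`,
# looking back at most L states; fall back to the root distribution.
lookup_context <- function(model, past, L) {
  k <- min(length(past), L)
  while (k > 0) {
    key <- paste(tail(past, k), collapse = "-")
    if (!is.null(model[[key]])) return(model[[key]])
    k <- k - 1
  }
  model[["e"]]
}

# Build n sequences of length l, one position at a time.
generate_toy <- function(model, l, n, method = c("prob", "pmax"), L = 2) {
  method <- match.arg(method)
  out <- matrix(NA_character_, nrow = n, ncol = l)
  for (i in seq_len(n)) {
    for (j in seq_len(l)) {
      p <- lookup_context(model, out[i, seq_len(j - 1)], L)
      out[i, j] <- if (method == "pmax") {
        names(p)[which.max(p)]                 # most probable state
      } else {
        sample(names(p), size = 1, prob = p)   # random draw
      }
    }
  }
  out
}

set.seed(1)
gen <- generate_toy(toy_pst, l = 10, n = 5, method = "prob")
det <- generate_toy(toy_pst, l = 6, n = 1, method = "pmax")
```

With `method = "pmax"` the sketch is deterministic, whereas `method = "prob"` depends on the random seed. Lowering `L` shortens the memory used at each step, which mirrors the role of the `L` argument documented above.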

Value

A state sequence object (an object of class `stslist`) containing `n` sequences. This object can be passed as an argument to all the visualization and analysis functions provided by the TraMineR package.

Author(s)

Alexis Gabadinho

References

Gabadinho, A. & Ritschard, G. (2016). Analyzing State Sequences with Probabilistic Suffix Trees: The PST R Package. Journal of Statistical Software, 72(3), pp. 1-39.

Examples

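## Load the PST package
library(PST)

## Build a state sequence object and fit a PST of maximal depth 3
data(s1)
s1.seq <- seqdef(s1)
S1 <- pstree(s1.seq, L=3)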

## Generating 10 sequences
generate(S1, n=10, l=10, method="prob")

## First state is generated with p(a)=0.9 and p(b)=0.1
generate(S1, n=10, l=10, method="prob", p1=c(0.9, 0.1))
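
The remaining arguments (`s1`, `method="pmax"` and `L`) could be exercised along the following lines. This is a sketch that uses only the arguments documented above; the object names `gen.s1`, `gen.pmax` and `gen.L1` are illustrative, and the block assumes the PST package is installed.

```r
library(PST)
data(s1)
s1.seq <- seqdef(s1)
S1 <- pstree(s1.seq, L = 3)

## First state fixed to "a" for every sequence (s1 must have length n)
gen.s1 <- generate(S1, n = 4, l = 10, s1 = rep("a", 4), method = "prob")

## Most probable state chosen at each position, given the first state
gen.pmax <- generate(S1, n = 4, l = 10, s1 = c("a", "a", "b", "b"),
                     method = "pmax")

## Probability distributions extracted from contexts of depth 1 at most
gen.L1 <- generate(S1, n = 4, l = 10, method = "prob", L = 1)
```

Because `method="pmax"` involves no random draw after the first position, sequences that share the same first state are expected to be identical.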

PST documentation built on June 22, 2024, 6:50 p.m.