# seqlogp: Logarithm of the probabilities of state sequences In TraMineR: Trajectory Miner: a Toolbox for Exploring and Rendering Sequences

## Description

Compute the logarithm of the probability of each state sequence obtained from a state transition model. The probability of a sequence is the product of the probabilities of its successive states. Several methods are available to compute a state probability.

## Usage

```r
seqlogp(seqdata, prob="trate", time.varying=TRUE,
        begin="freq", weighted=TRUE)
```

## Arguments

| Argument | Description |
|---|---|
| `seqdata` | The state sequences whose probabilities are to be computed. |
| `prob` | Either the name (`"trate"` or `"freq"`) of the probability model to use to compute the state probabilities, or an `array` specifying the transition probabilities at each position t (see details). |
| `time.varying` | Logical. If `TRUE`, the probabilities (transitions or frequencies) are computed separately for each time point t. |
| `begin` | Model used to compute the probability of the first state. Either `"freq"` to use the observed frequencies in the first period, or a vector specifying the probability of each state of the alphabet. |
| `weighted` | Logical. If `TRUE`, uses the weights specified in `seqdata` when computing the observed transition rates. |

## Details

The sequence likelihood P(s) is defined as the product of the probabilities with which each of its successive observed states is expected to occur at its position. Let s=s_1s_2 ... s_l be a sequence of length l. Then

P(s)=P(s_1, 1) * P(s_2, 2) * ... * P(s_l, l)

with P(s_t,t) the probability to observe state s_t at position t.
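
As a minimal illustration of this product, the sketch below uses made-up position-wise probabilities for a sequence of length 4. The values in `p` are hypothetical and are not derived from any data set.

```r
# Hypothetical position-wise probabilities P(s_t, t) for a sequence of length 4
p <- c(0.5, 0.8, 0.9, 0.25)

# Sequence likelihood P(s): product of the position-wise probabilities
lik <- prod(p)
lik
```

How the individual P(s_t,t) are obtained depends on the chosen probability model.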

The question is how to determine the state probabilities P(s_t,t). Several methods are available and can be set using the `prob` argument.

One commonly used method is to postulate a Markov model, which can be of various orders. `seqlogp` supports probabilities derived from a first-order Markov model: each P(s_t,t), t>1, is set to the transition rate p(s_t|s_(t-1)). This is selected with `prob="trate"`.
The transition rates may be considered constant over time/positions (`time.varying=FALSE`), in which case they are estimated across sequences from the observations at positions t and t-1 for all t together. Time-varying transition rates may also be considered (`time.varying=TRUE`), in which case they are estimated separately for each position, from the observations at positions t and t-1 for each t, yielding an array of transition matrices. Users may also supply their own transition rate array or matrix.
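
The time-constant first-order Markov variant can be sketched in base R as follows. This is an illustration under simplifying assumptions (a toy character matrix `seqs`, no weights, no missing states), not the code used inside `seqlogp`; all object names are hypothetical.

```r
# Toy data (hypothetical): 4 sequences of length 3 over the alphabet {A, B}
seqs <- matrix(c("A", "A", "B",
                 "A", "B", "B",
                 "B", "B", "B",
                 "A", "A", "A"), nrow = 4, byrow = TRUE)
states <- c("A", "B")

# Pool all (s_(t-1), s_t) pairs over positions: time-constant transition rates
from <- factor(as.vector(seqs[, -ncol(seqs)]), levels = states)
to   <- factor(as.vector(seqs[, -1]), levels = states)
trate <- prop.table(table(from, to), margin = 1)  # rows sum to 1

# Probability of the first state: observed frequencies at position 1
first <- sapply(states, function(a) mean(seqs[, 1] == a))

# -log P(s) for each sequence: first-state term plus transition terms
neglogp <- apply(seqs, 1, function(s) {
  p <- first[s[1]]
  for (t in 2:length(s)) p <- c(p, trate[s[t - 1], s[t]])
  -sum(log(p))
})
neglogp
```

Estimating the rates separately for each position would replace the single matrix `trate` with one matrix per t, built from the pairs observed at positions t-1 and t only.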

Another method is to set P(s_t,t) to the frequency of state s_t at position t (`prob="freq"`). In this case, the probability of a sequence does not depend on the transition probabilities. Here again, the frequencies can be computed over all positions together (`time.varying=FALSE`) or separately for each position t (`time.varying=TRUE`). For t=1, P(s_1,1) is set by default to the observed frequency of state s_1 at position 1. Alternatively, the `begin` argument can be used to specify the probability of the first state.
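
The frequency-based model with position-specific frequencies can be sketched in base R in the same spirit. Again, this is only an illustration on a hypothetical, unweighted toy matrix `seqs`, not the internal implementation of `seqlogp`.

```r
# Toy data (hypothetical): 4 sequences of length 3 over the alphabet {A, B}
seqs <- matrix(c("A", "A", "B",
                 "A", "B", "B",
                 "B", "B", "B",
                 "A", "A", "A"), nrow = 4, byrow = TRUE)
states <- c("A", "B")

# freq[a, t]: share of sequences that are in state a at position t
freq <- sapply(seq_len(ncol(seqs)), function(t)
  sapply(states, function(a) mean(seqs[, t] == a)))

# -log P(s): sum of -log of the frequency of each observed state at its position
neglogp <- apply(seqs, 1, function(s)
  -sum(log(freq[cbind(match(s, states), seq_along(s))])))
neglogp
```

Here the first and second sequences get the same value even though they follow different transitions, which reflects that this model ignores transition probabilities.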

Because the likelihood P(s) is generally very small, `seqlogp` returns -log(P(s)). This quantity reaches its minimum, 0, when P(s) is equal to 1.
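
The benefit of working on the log scale can be seen with a small numerical sketch. The probabilities below are hypothetical; the point is only that -log(P(s)) can be accumulated as a sum of -log terms, which stays finite when the raw product underflows.

```r
# Hypothetical sequence of length 200 where every state has probability 0.01
p <- rep(0.01, 200)

# The raw product is far below the smallest representable double
raw <- prod(p)

# -log(P(s)) computed as a sum of -log terms remains finite
neglog <- -sum(log(p))

raw
neglog
```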

## Value

A vector containing the negative logarithm, -log(P(s)), of each sequence probability.

## Author(s)

Matthias Studer and Alexis Gabadinho (with Gilbert Ritschard for the help page)

## Examples

```r
library(TraMineR)

## Creating the sequence objects using weights
data(biofam)
biofam.seq <- seqdef(biofam, 10:25, weights=biofam$wp00tbgs)

## Computing sequence probabilities
biofam.prob <- seqlogp(biofam.seq)

## Comparing the probability of each cohort
cohort <- biofam$birthyr>1940
boxplot(biofam.prob~cohort)
```

TraMineR documentation built on Jan. 30, 2018, 3:01 a.m.