seqtree: Tree structured analysis of a state sequence object. In TraMineR: Trajectory Miner: a Sequence Analysis Toolkit

 seqtree R Documentation

Tree structured analysis of a state sequence object.

Description

Facility for growing a regression tree for a state sequence object.

Usage

```r
seqtree(formula, data = NULL, weighted = TRUE, min.size = 0.05,
  max.depth = 5, R = 1000, pval = 0.01, weight.permutation = "replicate",
  seqdist.args = list(method = "LCS", norm = "auto"), diss = NULL,
  squared = FALSE, first = NULL, minSize, maxdepth, seqdist_arg)
```

Arguments

| Argument | Description |
|---|---|
| `formula` | A formula whose left-hand side is a state sequence object (see `seqdef`) and whose right-hand side specifies the candidate variables for partitioning the set of sequences. |
| `data` | A data frame in which the variables of the formula are searched. |
| `weighted` | Logical. If `TRUE`, use the weights of the state sequence object. |
| `min.size` | Minimum number of cases in a node. A value below 1 is read as a proportion of the cases. |
| `max.depth` | Maximum depth of the tree. |
| `R` | Number of permutations used to assess the significance of a split. |
| `pval` | Maximum p-value allowed for a split. It is given as a proportion: the default `0.01` corresponds to 1 percent. |
| `weight.permutation` | Weight permutation method: `"diss"` (attach the weights to the dissimilarity matrix), `"replicate"` (replicate cases according to the weights), `"rounded-replicate"` (replicate cases according to the rounded weights), `"random-sampling"` (random assignment of covariate profiles to the objects using distributions defined by the weights). |
| `seqdist.args` | List of arguments passed directly to `seqdist`. Only used if `diss = NULL`. |
| `diss` | An optional dissimilarity matrix. If not provided, a dissimilarity matrix is computed using `seqdist` and `seqdist.args`. |
| `squared` | Logical. If `TRUE`, the dissimilarity matrix is squared. |
| `first` | Character. An optional variable name to force the first split. |
| `minSize` | Deprecated. Use `min.size` instead. |
| `maxdepth` | Deprecated. Use `max.depth` instead. |
| `seqdist_arg` | Deprecated. Use `seqdist.args` instead. |
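As an illustration of how `weighted`, `weight.permutation`, `min.size` and `max.depth` fit together, the following is a minimal sketch on the `mvad` data shipped with TraMineR. It assumes that the monthly states are in columns 17 to 86 of `mvad` and that the `weight` column holds the case weights. The object name `seqt.w` and the chosen covariates are illustrative only.

```r
library(TraMineR)

data(mvad)

## Weighted state sequence object (assumption: mvad$weight holds case weights)
mvad.seq <- seqdef(mvad[, 17:86], weights = mvad$weight)

## R = 10 only keeps the sketch fast; it is far too small for real use.
## min.size = 0.05 asks for at least 5 percent of the cases in each node.
seqt.w <- seqtree(mvad.seq ~ male + Grammar + funemp + gcse5eq,
                  data = mvad, weighted = TRUE,
                  weight.permutation = "diss",
                  min.size = 0.05, max.depth = 3,
                  R = 10, pval = 0.1,
                  seqdist.args = list(method = "HAM", norm = "auto"))
print(seqt.w)
```

With `weight.permutation = "diss"` the weights stay attached to the dissimilarity matrix during the permutation tests instead of being turned into replicated cases.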

Details

The function provides a simplified interface for applying `disstree` on state sequence objects.
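To make the relation with `disstree` concrete, the sketch below computes the dissimilarities explicitly and passes them to `disstree`. It only approximates what `seqtree` does with its default `seqdist.args`, since `seqtree` additionally handles the weights and the sequence object for display. The column range 17 to 86 and the covariates are assumptions made for the `mvad` example data.

```r
library(TraMineR)

data(mvad)
mvad.seq <- seqdef(mvad[, 17:86])

## Step 1: dissimilarities, here with the seqtree default settings
mvad.lcs <- seqdist(mvad.seq, method = "LCS", norm = "auto")

## Step 2: grow the tree on the dissimilarity matrix itself.
## In disstree, the left-hand side of the formula is the matrix.
## R = 10 only keeps the sketch fast; it is far too small for real use.
dt <- disstree(mvad.lcs ~ male + Grammar + funemp + gcse5eq,
               data = mvad, R = 10, pval = 0.1)
print(dt)
```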

The `seqtree` objects can be "plotted" with `seqtreedisplay`. A print method is also available which prints the medoid sequence for each terminal node.

Value

A `seqtree` object with the same attributes as `disstree` objects.

The leaf membership is in the first column of the fitted attribute. For example, the leaf memberships for a tree `dt` are in `dt$fitted[,1]`.
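A short sketch of how the leaf membership might be extracted and reused, assuming the `mvad` example data with the monthly states in columns 17 to 86. The names `leaf` and `mvad$leaf` are illustrative.

```r
library(TraMineR)

data(mvad)
mvad.seq <- seqdef(mvad[, 17:86])

## R = 10 only keeps the sketch fast; it is far too small for real use
seqt <- seqtree(mvad.seq ~ male + Grammar + funemp + gcse5eq,
                data = mvad, R = 10, pval = 0.1,
                seqdist.args = list(method = "HAM", norm = "auto"))

## Leaf membership: one value per sequence, identifying its terminal node
leaf <- seqt$fitted[, 1]
table(leaf)

## Keep the membership with the covariates for further analysis
mvad$leaf <- factor(leaf)
```

The resulting factor can then serve as a grouping variable, for example to compare the covariate profiles of the terminal nodes.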

Author(s)

Matthias Studer (with Gilbert Ritschard for the help page)

References

Studer, M., G. Ritschard, A. Gabadinho and N. S. Müller (2011). Discrepancy analysis of state sequences, Sociological Methods and Research, Vol. 40(3), 471-510, doi: 10.1177/0049124111415372.

See Also

`seqtreedisplay`, `disstree`

Examples

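```r
data(mvad)

## Defining a state sequence object
## (reconstructed step: the monthly states are assumed to be in columns 17 to 86)
mvad.seq <- seqdef(mvad[, 17:86])

## Growing a seqtree from Hamming distances:
##   Warning: R = 10 is used here only to save computation time. It is
##   much too small and will produce strongly unstable results.
##   We recommend setting R to at least 1000.
##   To be consistent with this small R value, we set pval = 0.1.
seqt <- seqtree(mvad.seq ~ male + Grammar + funemp + gcse5eq + fmpr + livboth,
    data = mvad, R = 10, pval = 0.1,
    seqdist.args = list(method = "HAM", norm = "auto"))
print(seqt)

## Growing a seqtree from an existing distance matrix
## (reconstructed step: DHD is one possible choice of distance)
mvad.dhd <- seqdist(mvad.seq, method = "DHD")
seqt <- seqtree(mvad.seq ~ male + Grammar + funemp + gcse5eq + fmpr + livboth,
    data = mvad, R = 10, pval = 0.1, diss = mvad.dhd)
print(seqt)

### The following commands only work if GraphViz is properly installed
## Not run:
seqtreedisplay(seqt, type = "d", border = NA)
## End(Not run)
```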