Tree structured analysis of a state sequence object.

Description

Facility for growing a regression tree for a state sequence object.

Usage

seqtree(formula, data = NULL, weighted = TRUE, minSize = 0.05,
     maxdepth = 5, R = 1000, pval = 0.01,
     weight.permutation = "replicate",
     seqdist_arg = list(method = "LCS", norm = TRUE),
     diss = NULL, squared = FALSE, first = NULL)

Arguments

formula

A formula where the left-hand side is a state sequence object (see seqdef) and the right-hand side specifies the candidate variables for partitioning the set of sequences.

weighted

Logical. If TRUE, use the weights of the state sequence object.

data

A data frame in which the variables in the formula are looked up.

minSize

Minimum number of cases in a node. A value less than 1 is interpreted as a proportion of the total number of cases.

maxdepth

Maximum depth of the tree.

R

Number of permutations used to assess the significance of the split.

pval

Maximum p-value allowed for a split. The value is given as a proportion: the default 0.01 corresponds to 1%.

weight.permutation

Weight permutation method: "diss" (attach weights to the dissimilarity matrix), "replicate" (replicate cases according to the weights), "rounded-replicate" (replicate cases according to the rounded weights), or "random-sampling" (randomly assign covariate profiles to the objects using distributions defined by the weights).

seqdist_arg

List of arguments passed directly to seqdist. Only used if diss=NULL.

diss

An optional dissimilarity matrix. If not provided, a dissimilarity matrix is computed using seqdist and seqdist_arg.

squared

Logical. If TRUE, the dissimilarity matrix is squared.

first

Character. An optional variable name to force the first split.

Details

The function provides a simplified interface for applying disstree on state sequence objects.
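As a rough illustration of what the simplified interface saves, the sketch below performs the two steps by hand: it computes a dissimilarity matrix with seqdist and then grows a tree on it with disstree. It assumes the TraMineR package and its mvad data set are available. The object names mvad.lcs and dt and the shortened list of covariates are choices made for this example only. It is not claimed to reproduce the internals of seqtree exactly.

```r
library(TraMineR)

data(mvad)
mvad.seq <- seqdef(mvad[, 17:86])

## Step 1: pairwise dissimilarities, using the same settings as the
## default seqdist_arg shown in Usage (method = "LCS", norm = TRUE).
mvad.lcs <- seqdist(mvad.seq, method = "LCS", norm = TRUE)

## Step 2: grow the tree on the dissimilarity matrix.
## R = 10 only keeps this sketch fast; it is far too small for real use.
dt <- disstree(mvad.lcs ~ male + Grammar + funemp, data = mvad, R = 10)
print(dt)
```

With seqtree, both steps collapse into a single call whose formula has the state sequence object on its left-hand side.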

The seqtree objects can be "plotted" with seqtreedisplay. A print method is also available which prints the medoid sequence for each terminal node.

Value

A seqtree object with the same attributes as disstree objects.

The leaf membership is in the first column of the fitted attribute. For example, the leaf memberships for a tree dt are in dt$fitted[,1].
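A minimal sketch of how the leaf membership might be retrieved and tabulated, assuming the TraMineR package and its mvad data set are available. The names seqt and leaf are chosen for this example only.

```r
library(TraMineR)

data(mvad)
mvad.seq <- seqdef(mvad[, 17:86])

## R = 10 only keeps this sketch fast; it is far too small for real use.
seqt <- seqtree(mvad.seq ~ male + Grammar + funemp + gcse5eq + fmpr + livboth,
    data = mvad, R = 10, seqdist_arg = list(method = "HAM", norm = TRUE))

## Leaf membership: one entry per sequence, taken from the first
## column of the fitted attribute.
leaf <- seqt$fitted[, 1]

## Number of sequences falling in each terminal node
table(leaf)
```

The leaf vector can then be used like any other grouping variable, for instance to cross-tabulate terminal nodes against a covariate.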

Author(s)

Matthias Studer (with Gilbert Ritschard for the help page)

References

Studer, M., G. Ritschard, A. Gabadinho and N. S. Müller (2011). Discrepancy analysis of state sequences, Sociological Methods and Research, Vol. 40(3), 471-510.

See Also

seqtreedisplay, disstree

Examples

data(mvad)

## Defining a state sequence object
mvad.seq <- seqdef(mvad[, 17:86])

## Growing a seqtree from Hamming distances:
##   Warning: R=10 is used here only to save computation time. It is
##   much too small and will produce highly unstable results.
##   We recommend setting R to at least 1000.
seqt <- seqtree(mvad.seq~ male + Grammar + funemp + gcse5eq + fmpr + livboth,
    data=mvad, R=10, seqdist_arg=list(method="HAM", norm=TRUE))
print(seqt)

## Growing a seqtree from an existing distance matrix
mvad.dhd <- seqdist(mvad.seq, method="DHD")
seqt <- seqtree(mvad.seq~ male + Grammar + funemp + gcse5eq + fmpr + livboth,
    data=mvad, R = 10, diss=mvad.dhd)
print(seqt)


### The following commands only work if GraphViz is properly installed
## Not run: 
seqtreedisplay(seqt, type="d", border=NA)
seqtreedisplay(seqt, type="I", sortv=cmdscale(mvad.dhd, k=1))

## End(Not run)
