seqpropclust: Monothetic clustering of state sequences
In WeightedCluster: Clustering of Weighted Data

Description

Monothetic divisive clustering of the data using object properties. For state sequence objects, different sets of properties are automatically extracted.

Usage

seqpropclust(seqdata, diss, properties = c("state", "duration", "spell.age", 
		"spell.dur", "transition", "pattern", "AFtransition", "AFpattern", 
		"Complexity"), other.prop = NULL, prop.only = FALSE, pmin.support = 0.05, 
		max.k = -1, with.missing = TRUE, R = 1, weight.permutation = "diss", 
		min.size = 0.01, max.depth = 5, maxcluster = NULL, ...)
		
wcPropertyClustering(diss, properties, maxcluster = NULL, ...)
dtcut(st, k, labels = TRUE)

Arguments

`seqdata`	State sequence object (see `seqdef`).
`diss`	a dissimilarity matrix or a `dist` object.
`properties`	Character or `data.frame`. In `seqpropclust`, it can be a character vector listing the properties to extract from `seqdata`. It can also be a `data.frame` specifying the properties to use for the clustering.
`other.prop`	`data.frame`. Additional properties to be considered to cluster the sequences.
`prop.only`	Logical. If `TRUE`, the function returns a data.frame containing the extracted properties (without clustering the data).
`pmin.support`	Numeric. Minimum support (as a proportion of sequences). See `seqefsub`.
`max.k`	Numeric. The maximum number of events allowed in a subsequence. See `seqefsub`.
`with.missing`	Logical. If `TRUE`, properties of missing spells are also extracted.
`R`	Number of permutations used to assess the significance of the split. See `disstree`.
`weight.permutation`	Weight permutation method: "diss" (attach weights to the dissimilarity matrix), "replicate" (replicate cases using weights), "rounded-replicate" (replicate cases using rounded weights), "random-sampling" (random assignment of covariate profiles to the objects using distributions defined by the weights). See `disstree`.
`min.size`	Minimum number of cases in a node. Values less than 1 are treated as a proportion. See `disstree`.
`max.depth`	Maximum depth of the tree. See `disstree`.
`maxcluster`	Maximum number of clusters to consider.
`st`	A divisive clustering tree as produced by `seqpropclust`.
`k`	The number of groups to extract.
`labels`	Logical. If `TRUE`, the rules used to assign an object to a cluster are used to label the cluster (instead of a number).
`...`	Arguments passed to/from other methods.

Details

The method implements the DIVCLUS-T algorithm.
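
To give an intuition for what "monothetic divisive" means, the following base R sketch performs a single split: among a set of logical properties, it picks the one whose two groups leave the smallest within-group sum of squares, computed from the dissimilarities alone. This is only an illustration under simplifying assumptions (squared distances, no weights, no permutation test); `group_ss` and `best_monothetic_split` are hypothetical helpers and are not part of WeightedCluster.

```r
## Illustrative sketch only: one monothetic split chosen by explained discrepancy.
## `group_ss` and `best_monothetic_split` are hypothetical helpers.

# Sum of squares of a group computed from pairwise distances only:
# SS = (1 / n) * sum_{i<j} d_ij^2 (equals the inertia for Euclidean distances)
group_ss <- function(d, idx) {
  n <- length(idx)
  if (n < 2) return(0)
  sub <- d[idx, idx, drop = FALSE]
  sum(sub[upper.tri(sub)]^2) / n
}

# Try each logical property as a binary split and keep the one
# with the highest share of explained sum of squares (pseudo-R2)
best_monothetic_split <- function(d, props) {
  d <- as.matrix(d)
  total <- group_ss(d, seq_len(nrow(d)))
  best <- list(property = NA_character_, r2 = -Inf)
  for (p in names(props)) {
    left <- which(props[[p]])
    right <- which(!props[[p]])
    if (length(left) == 0 || length(right) == 0) next  # not a real split
    within <- group_ss(d, left) + group_ss(d, right)
    r2 <- 1 - within / total
    if (r2 > best$r2) best <- list(property = p, r2 = r2)
  }
  best
}

# Toy data: two well-separated groups on one dimension
x <- c(0, 0.1, 0.2, 5, 5.1, 5.2)
d <- dist(x)
props <- data.frame(
  high = x > 1,                   # separates the two groups
  odd  = seq_along(x) %% 2 == 1   # unrelated to the group structure
)
res <- best_monothetic_split(d, props)
res$property
```

Because each split is defined by a single property, every resulting cluster can be described by a short rule. A full divisive procedure would repeat this step within each resulting group; in `seqpropclust`, the arguments `R`, `min.size`, `max.depth` and `maxcluster` additionally control how the tree is grown.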

Value

Returns a `seqpropclust` object, which is (in fact) a `disstree` object. See `disstree`.

References

Studer, M. (2018). Divisive property-based and fuzzy clustering for sequence analysis. In G. Ritschard and M. Studer (Eds.), Sequence Analysis and Related Approaches: Innovative Methods and Applications, Life Course Research and Social Policies. Springer.

Piccarreta R, Billari FC (2007). Clustering work and family trajectories by using a divisive algorithm. Journal of the Royal Statistical Society: Series A (Statistics in Society), 170(4), 1061-1078.

Chavent M, Lechevallier Y, Briant O (2007). DIVCLUS-T: A monothetic divisive hierarchical clustering method. Computational Statistics & Data Analysis, 52(2), 687-701.

Examples

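## Load the package (TraMineR, which provides the mvad data, is attached with it)
library(WeightedCluster)
data(mvad)
mvad.seq <- seqdef(mvad[1:100, 17:86])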

## Compute dissimilarities using the Hamming distance
diss <- seqdist(mvad.seq, method="HAM")

## Cluster the sequences using the "state" and "duration" properties
pclust <- seqpropclust(mvad.seq, diss=diss, maxcluster=5, properties=c("state", "duration"))

## Uncomment the following line to visualize the results
## seqtreedisplay(pclust, type="d", border=NA, showdepth=TRUE)

pclustqual <- as.clustrange(pclust, diss=diss, ncluster=5)
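
The example above does not use `dtcut`. As a hedged sketch, a solution with a given number of groups could be extracted from the tree as follows. It assumes that `dtcut` returns one group label per sequence, which the Value section above does not state explicitly; the object name `clust3` is only illustrative.

```r
## Sketch only: extract a three-group solution from the clustering tree.
## Assumption: dtcut() returns one group label per sequence.
library(WeightedCluster)
data(mvad)
mvad.seq <- seqdef(mvad[1:100, 17:86])
diss <- seqdist(mvad.seq, method="HAM")
pclust <- seqpropclust(mvad.seq, diss=diss, maxcluster=5,
                       properties=c("state", "duration"))

## labels=FALSE requests cluster numbers instead of the splitting rules
clust3 <- dtcut(pclust, k=3, labels=FALSE)

## Size of each group
table(clust3)
```

With `labels=TRUE` (the default), the groups are labelled by the rules that define them instead of by a number.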

WeightedCluster documentation built on April 12, 2025, 9:13 a.m.