fuzzyseqplot: Plot sequences according to a fuzzy clustering.


View source: R/fuzzyfunc.R

Description

This function produces a graphical representation of fuzzy clustering results in which sequences are weighted according to their cluster membership strength.

Usage

fuzzyseqplot(seqdata, group = NULL, membership.threashold = 0, type = "i", 
			members.weighted = TRUE, memb.exp = 1, ...)

Arguments

seqdata

State sequence object created with the seqdef function.

group

A fuzzy partition of the data, either as a membership matrix or as a fanny object.

membership.threashold

Numeric. Minimum membership strength to be included in plots.

type

The type of plot. Available types are "d" for state distribution plots (chronograms), "f" for sequence frequency plots, "i" for selected sequence index plots, "I" for whole set index plots, "ms" for plotting the sequence of modal states, "mt" for mean times plots, "pc" for parallel coordinate plots and "r" for representative sequence plots.

members.weighted

Logical. Should the sequences be weighted by their membership strength in each group before being plotted?

memb.exp

Optional. Fuzziness parameter used in the fanny algorithm.

...

Additional arguments to be passed to seqplot.
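The group argument described above accepts either a membership matrix or a fanny object. As a minimal sketch of how the two relate, the following uses fanny from the cluster package on hypothetical toy data (the points in x are invented for illustration and are not sequence data) and extracts the membership matrix stored in the fitted object.

```r
library(cluster)  # recommended package shipped with standard R installations

set.seed(1)
# Hypothetical toy data: two well-separated groups of 10 points each.
x <- rbind(matrix(rnorm(20, mean = 0), ncol = 2),
           matrix(rnorm(20, mean = 5), ncol = 2))
d <- dist(x)

# Fuzzy clustering on the dissimilarities, with k = 2 clusters.
fclust <- fanny(d, k = 2, diss = TRUE)

# The fanny object stores an n-by-k membership matrix;
# each row holds one observation's membership degrees.
memb <- fclust$membership
row_totals <- rowSums(memb)
```

Each entry of row_totals should be 1 up to rounding, which is the property the weighting scheme in this function relies on.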

Details

The dataset is augmented by repeating the sequence s_i of individual i k times (i.e., once per cluster). We therefore have k sequences for individual i, denoted s_{i1}, ..., s_{ik}. These copies are weighted according to the membership degrees u_{i1}, ..., u_{ik} of individual i. Hence, even though the same sequence is repeated k times, its total weight sums to 1. An additional categorical covariate is created in this augmented dataset that specifies the cluster (ranging from 1 to k) to which each membership degree refers. This weighting strategy allows us to use any tool available for weighted sequence data (see seqplot).
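The augmentation described above can be sketched in base R. This is an illustration under stated assumptions, not the package's internal code: the membership matrix u and the column names id, cluster and weight are hypothetical.

```r
# Hypothetical membership matrix: 3 individuals (rows), k = 2 clusters (columns).
# Each row sums to 1, as in a fuzzy partition.
u <- matrix(c(0.9, 0.1,
              0.5, 0.5,
              0.2, 0.8), ncol = 2, byrow = TRUE)
n <- nrow(u)
k <- ncol(u)

# Repeat each individual once per cluster: rows 1..n refer to cluster 1,
# rows n+1..2n to cluster 2, and so on.
augmented <- data.frame(
  id      = rep(seq_len(n), times = k),          # which individual the copy belongs to
  cluster = factor(rep(seq_len(k), each = n)),   # the added categorical covariate
  weight  = as.vector(u)                         # column-major: u[, 1], then u[, 2]
)

# The k copies of each individual carry a total weight of 1.
total_weight <- tapply(augmented$weight, augmented$id, sum)
```

The id column would index into the original sequence object, so that copy j of individual i carries sequence s_i with weight u_{ij}.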

For index plots, we additionally suggest ordering the sequences according to membership degree by setting sortv="membership" (see example). The most typical sequences, those with a high membership degree, lie at the top of each subfigure, while the bottom shows less characteristic patterns. The plots can be restricted to the sequences with the highest membership degrees using the membership.threashold argument.
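The effect of thresholding and sorting by membership can be illustrated on a small augmented table in base R. This is a sketch only: the data frame and the variable threshold are hypothetical, and the inclusive comparison (>=) is an assumption about how the cut-off is applied, not something the documentation states.

```r
# Hypothetical augmented data: one row per (individual, cluster) pair.
augmented <- data.frame(
  id      = c(1, 2, 3, 1, 2, 3),
  cluster = factor(c(1, 1, 1, 2, 2, 2)),
  weight  = c(0.9, 0.5, 0.2, 0.1, 0.5, 0.8)
)

threshold <- 0.4  # plays the role of membership.threashold

# Keep only rows whose membership reaches the threshold
# (assumption: the comparison is inclusive).
kept <- augmented[augmented$weight >= threshold, ]

# Within each cluster, order rows from strongest to weakest membership.
kept <- kept[order(kept$cluster, -kept$weight), ]
```

With these values, individual 3 is dropped from cluster 1 and individual 1 from cluster 2, while individual 2, with a membership of 0.5 in both clusters, still appears in both.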

References

Studer, M. (2018). Divisive property-based and fuzzy clustering for sequence analysis. In G. Ritschard and M. Studer (Eds.), Sequence Analysis and Related Approaches: Innovative Methods and Applications, Life Course Research and Social Policies.

See Also

See also fanny for fuzzy clustering.

Examples

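	## Load WeightedCluster (attaches TraMineR, which provides mvad, seqdef and seqdist)
	library(WeightedCluster)
	data(mvad)
	## Build a state sequence object from the first 100 individuals
	mvad.seq <- seqdef(mvad[1:100, 17:86])

	## Compute pairwise dissimilarities using the Hamming distance
	diss <- seqdist(mvad.seq, method="HAM")
	## Fuzzy clustering into two groups with fanny (cluster package)
	library(cluster)
	fclust <- fanny(diss, k=2, diss=TRUE)
	
	## State distribution plot weighted by membership
	fuzzyseqplot(mvad.seq, group=fclust, type="d")
	## Index plot of all sequences, sorted by membership degree
	fuzzyseqplot(mvad.seq, group=fclust, type="I", sortv="membership")
	## Sequence frequency plot
	fuzzyseqplot(mvad.seq, group=fclust, type="f")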

Example output

Loading required package: TraMineR

TraMineR stable version 2.0-11.1 (Built: 2019-05-12)
Website: http://traminer.unige.ch
Please type 'citation("TraMineR")' for citation information.

Loading required package: cluster
This is WeightedCluster stable version 1.4 (Built: 2019-05-11)

To get the manuals, please run:
   vignette("WeightedCluster") ## Complete manual in English
   vignette("WeightedClusterFR") ## Complete manual in French
   vignette("WeightedClusterPreview") ## Short preview in English

To cite WeightedCluster in publications please use:
Studer, Matthias (2013). WeightedCluster Library Manual: A practical
   guide to creating typologies of trajectories in the social sciences
   with R. LIVES Working Papers, 24. doi:
   10.12682/lives.2296-1658.2013.24
 [>] 6 distinct states appear in the data: 
     1 = FE
     2 = HE
     3 = employment
     4 = joblessness
     5 = school
     6 = training
 [>] state coding:
       [alphabet]  [label]     [long label] 
     1  FE          FE          FE
     2  HE          HE          HE
     3  employment  employment  employment
     4  joblessness joblessness joblessness
     5  school      school      school
     6  training    training    training
 [>] 100 sequences in the data set
 [>] min/max sequence length: 70/70
 [>] 100 sequences with 6 distinct states
 [>] creating a 'sm' with a single substitution cost of 1
 [>] creating 6x6 substitution-cost matrix using 1 as constant value
 [>] 91 distinct sequences
 [>] min/max sequence length: 70/70
 [>] computing distances using the HAM metric
 [>] elapsed time: 0.042 secs

WeightedCluster documentation built on May 2, 2019, 6:35 a.m.