View source: R/dissmergegroups.R
dissmergegroups | R Documentation |
Merging groups by minimizing loss of partition quality.
dissmergegroups(
  diss,
  group,
  weights = NULL,
  measure = "ASW",
  crit = 0.2,
  ref = "max",
  min.group = 4,
  small = 0.05,
  silent = FALSE
)
diss |
A dissimilarity matrix or a distance object. |
group |
Group membership. Typically, the outcome of a clustering function. |
weights |
Vector of non-negative case weights. |
measure |
Character. Name of quality index. One of those returned by wcClusterQuality. |
crit |
Real in the range [0,1]. Maximal allowed proportion of quality loss. |
ref |
Character. Reference for the proportion crit: one of "initial", "previous", or "max" (default). |
min.group |
Integer. Minimal number of end groups. |
small |
Real. Proportion of the sample size below which a group is considered small. |
silent |
Logical. Should merge steps be displayed during computation? |
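As a rough illustration of how small relates to group sizes, the base R sketch below flags groups whose share of the sample falls below the threshold. The toy labels and weights are hypothetical, and computing the share from the case weights rather than from raw counts is an assumption, not something this page states.

```r
## Toy data (hypothetical): group labels and case weights
group   <- c("A", "A", "A", "B", "B", "C")
weights <- c(10, 12, 8, 9, 10, 1)
small   <- 0.05   # same value as the documented default

## Weighted share of each group in the total sample
## (assumption: shares are based on weights, not on case counts)
share <- tapply(weights, group, sum) / sum(weights)

## Groups that would count as small under this reading
is_small <- share < small
print(round(share, 3))
print(names(which(is_small)))
```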
The procedure is greedy. The function iteratively searches for the pair of groups whose merge minimizes the loss of partition quality. As long as the smallest group is smaller than small, the search is restricted to the pairs formed by that group and one of the other groups. Once all groups are larger than small, the search covers all possible pairs of groups. There are two stopping criteria: the minimum number of groups (min.group) and the maximum allowed quality deterioration (crit). The proportion specified with crit applies to the quality of the initial partition (ref="initial"), to the quality after the previous iteration (ref="previous"), or to the maximal quality achieved so far (ref="max"), the latter being the default. The process stops as soon as either criterion is reached.
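The greedy search described above can be sketched in base R as follows. This is not the package's implementation: asw() and greedy_merge() are hypothetical helpers, the quality index is a hand-written unweighted average silhouette width, the relative loss is assumed to be (reference - candidate) / |reference|, and the sketch assumes that a merge exceeding crit is not performed.

```r
## Unweighted average silhouette width (hypothetical helper, toy use only)
asw <- function(d, g) {
  labs <- unique(g)
  s <- vapply(seq_along(g), function(i) {
    own <- g == g[i]
    if (sum(own) == 1) return(0)                 # convention for singletons
    a <- sum(d[i, own]) / (sum(own) - 1)         # mean distance within own group
    b <- min(vapply(setdiff(labs, g[i]),         # mean distance to nearest other group
                    function(l) mean(d[i, g == l]), numeric(1)))
    (b - a) / max(a, b)
  }, numeric(1))
  mean(s)
}

## Greedy merging sketch (hypothetical; assumes a positive reference quality)
greedy_merge <- function(d, g, crit = 0.2, ref = "max",
                         min.group = 4, small = 0.05) {
  stopifnot(min.group >= 2)
  d <- as.matrix(d)
  g <- as.character(g)
  q_hist <- asw(d, g)                            # quality of the initial partition
  while (length(unique(g)) > min.group) {
    sizes <- table(g) / length(g)
    labs  <- names(sizes)
    pairs <- t(combn(labs, 2))                   # all candidate pairs
    if (min(sizes) < small) {                    # restrict to the smallest group
      sm <- labs[which.min(sizes)]
      pairs <- pairs[pairs[, 1] == sm | pairs[, 2] == sm, , drop = FALSE]
    }
    ## Quality of the partition obtained by each candidate merge
    q_try <- apply(pairs, 1, function(p) {
      g2 <- g
      g2[g2 == p[2]] <- p[1]
      asw(d, g2)
    })
    best  <- which.max(q_try)                    # merge with the least quality loss
    q_ref <- switch(ref,
                    initial  = q_hist[1],
                    previous = q_hist[length(q_hist)],
                    max      = max(q_hist))
    if ((q_ref - q_try[best]) / abs(q_ref) > crit) break
    g[g == pairs[best, 2]] <- pairs[best, 1]
    q_hist <- c(q_hist, q_try[best])
  }
  g
}

## Toy data: four tight groups on a line, the first two close to each other
x <- c(0, 0.1, 0.2, 1.0, 1.1, 1.2, 5, 5.1, 5.2, 9, 9.1, 9.2)
g <- rep(letters[1:4], each = 3)
merged <- greedy_merge(dist(x), g, crit = 0.1, min.group = 2)
print(table(original = g, merged = merged))
```

With a permissive min.group, the stopping point here is driven by crit; raising crit allows further merges, while ref changes the quality value against which each loss is measured.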
Vector of merged group memberships.
Gilbert Ritschard
Ritschard, G., T.F. Liao, and E. Struffolino (2023). Strategies for multidomain sequence analysis in social research. Sociological Methodology, 53(2), 288-322. doi:10.1177/00811750231163833
wcClusterQuality
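library(TraMineR)        ## seqdef, seqdist, seqMD and the biofam data
library(TraMineRextras)  ## dissmergegroups
data(biofam)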
## Building one channel per type of event (children, married, left home)
cases <- 1:40
bf <- as.matrix(biofam[cases, 10:25])
children <- bf == 4 | bf == 5 | bf == 6
married <- bf == 2 | bf == 3 | bf == 6
left <- bf == 1 | bf == 3 | bf == 5 | bf == 6
## Creating sequence objects
child.seq <- seqdef(children, weights = biofam[cases,'wp00tbgs'])
marr.seq <- seqdef(married, weights = biofam[cases,'wp00tbgs'])
left.seq <- seqdef(left, weights = biofam[cases,'wp00tbgs'])
## distances by domain
dchild <- seqdist(child.seq, method="OM", sm="INDELSLOG")
dmarr <- seqdist(marr.seq, method="OM", sm="INDELSLOG")
dleft <- seqdist(left.seq, method="OM", sm="INDELSLOG")
dnames <- c("child","marr","left")
## clustering each domain into 2 groups
child.cl2 <- cutree(hclust(as.dist(dchild)),k=2)
marr.cl2 <- cutree(hclust(as.dist(dmarr)),k=2)
left.cl2 <- cutree(hclust(as.dist(dleft)),k=2)
## Multidomain sequences
MD.seq <- seqMD(list(child.seq,marr.seq,left.seq))
d.expand <- seqdist(MD.seq, method="LCS")
clust.comb <- interaction(child.cl2,marr.cl2,left.cl2)
merged.grp <- dissmergegroups(d.expand, clust.comb,
                              weights=biofam[cases,'wp00tbgs'])
## weighted size of merged groups
xtabs(biofam[cases,'wp00tbgs'] ~ merged.grp)