ClustersList: _Clusters_ utilities

ClustersList    R Documentation

Clusters utilities

Description

Handle conversions between a clusterization and a clusters list, as well as the grouping and merging of clusters

Usage

toClustersList(clusters)

fromClustersList(
  clustersList,
  elemNames = vector(mode = "character"),
  throwOnOverlappingClusters = TRUE
)

groupByClustersList(elemNames, clustersList, throwOnOverlappingClusters = TRUE)

groupByClusters(clusters)

mergeClusters(clusters, names, mergedName = "")

multiMergeClusters(clusters, namesList, mergedNames = NULL)

Arguments

clusters

A named vector or factor that defines the clusters

clustersList

A named list whose elements define the various clusters

elemNames

A list of element names, each to be associated with a cluster

throwOnOverlappingClusters

When TRUE (the default), the functions fromClustersList() and groupByClustersList() will throw an error if the clusters overlap. When FALSE, fromClustersList() will instead return the last cluster to which each element belongs, while groupByClustersList() will return a vector of positions that is longer than the given elemNames
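The two non-throwing behaviours can be pictured with a small base R sketch. This only illustrates the description above and is not COTAN's implementation; every variable name below is made up for the example.

```r
# Illustrative only: overlapping clusters in base R, NOT the COTAN code
elemNames <- c("a", "b", "c")
clustersList <- list("1" = c("a", "b"), "2" = c("b", "c"))  # "b" is in both

members <- unlist(clustersList, use.names = FALSE)
labels <- rep(names(clustersList), lengths(clustersList))

# overlap check: some element is listed more than once
hasOverlap <- anyDuplicated(members) > 0

# "last cluster wins": keep only the final occurrence of each element
keep <- !duplicated(members, fromLast = TRUE)
lastWins <- setNames(labels[keep], members[keep])
lastWins

# positions collected cluster by cluster: "b" appears twice,
# so the result is longer than elemNames
positions <- unlist(lapply(clustersList, match, table = elemNames),
                    use.names = FALSE)
length(positions) > length(elemNames)
```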

names

A list of cluster names to be merged

mergedName

The name of the new merged cluster

namesList

A list of lists of cluster names; the clusters named in each inner list are merged together

mergedNames

The names of the new merged clusters

Details

toClustersList(), given a clusterization, creates a list of clusters (i.e. for each cluster, the elements that compose it)

fromClustersList(), given a list of clusters, returns a clusterization (i.e. a named vector that indicates, for each element, the cluster to which it belongs)
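The two data shapes can be mimicked in base R, which may help when reading the descriptions above. This is a sketch of the idea only, not COTAN's implementation; the helpers toListSketch() and fromListSketch() are hypothetical names introduced for this example.

```r
# Hypothetical base R stand-ins, NOT the COTAN implementation
toListSketch <- function(clusters) {
  # one list entry per cluster, holding the names of its elements
  split(names(clusters), clusters)
}

fromListSketch <- function(clustersList) {
  # repeat each cluster name once per element it contains
  labels <- rep(names(clustersList), lengths(clustersList))
  names(labels) <- unlist(clustersList, use.names = FALSE)
  labels
}

clusters <- c(a = "1", b = "2", c = "1", d = "3")
cl <- toListSketch(clusters)
str(cl)

back <- fromListSketch(cl)
# the round trip gives the same assignment once the original
# element order is restored
identical(back[names(clusters)], clusters)
```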

groupByClusters(), given a clusterization, returns a permutation such that applying it to the input groups the clusters together
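A grouping permutation of this kind can be approximated with base R's order(). The snippet below is an illustration under that assumption, not the package's code; the order of the groups that COTAN produces may differ.

```r
# Illustrative only: a grouping permutation via order(), NOT the COTAN code
clusters <- c(e1 = "2", e2 = "1", e3 = "2", e4 = "1", e5 = "3")

# the second key breaks ties by position, so positions inside
# each cluster stay in increasing order
perm <- order(clusters, seq_along(clusters))
perm

# applying the permutation puts members of the same cluster side by side
clusters[perm]
```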

groupByClustersList(), given the elements' names and a list of clusters, returns a permutation such that applying it to the given names groups the clusters together.
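When the clusters come as a list, the same permutation can be assembled cluster by cluster. The sketch below assumes non-overlapping clusters and places elements that belong to no cluster at the end; it is a base R illustration with made-up variable names, not the COTAN implementation.

```r
# Illustrative only: permutation from a clusters list, NOT the COTAN code
elemNames <- c("a", "b", "c", "d", "e")
clustersList <- list("1" = c("d", "b"), "2" = c("a"))

# positions of each cluster's members inside elemNames, in increasing order;
# sort() also drops the NA produced by names missing from elemNames
perCluster <- lapply(clustersList,
                     function(members) sort(match(members, elemNames)))
grouped <- unlist(perCluster, use.names = FALSE)

# elements not assigned to any cluster are appended as a final group
perm <- c(grouped, setdiff(seq_along(elemNames), grouped))
perm
elemNames[perm]
```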

mergeClusters(), given a clusterization, creates a new one where the given clusters are merged.

multiMergeClusters(), given a clusterization, creates a new one where each of the given sets of clusters is merged.
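Merging amounts to relabelling: every element of the chosen clusters receives the new cluster name. The base R sketch below shows one way to picture this. The helper mergeSketch() and the "_"-joined merged names are assumptions made for the example and do not reflect COTAN's actual defaults.

```r
# Hypothetical stand-in for merging, NOT the COTAN implementation
mergeSketch <- function(clusters, toMerge, mergedName) {
  merged <- as.character(clusters)
  # relabel every element that belongs to one of the clusters to merge
  merged[merged %in% toMerge] <- mergedName
  setNames(merged, names(clusters))
}

clusters <- c(a = "1", b = "2", c = "3", d = "2", e = "4")

# a single merge
one <- mergeSketch(clusters, c("2", "3"), "2_3")
one

# several merges applied one after another; the "_"-joined names
# are an assumption of this sketch
sets <- list(c("1", "4"), c("2", "3"))
multi <- clusters
for (s in sets) {
  multi <- mergeSketch(multi, s, paste(s, collapse = "_"))
}
table(multi)
```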

Value

toClustersList() returns a list of clusters

fromClustersList() returns a clusterization. If the given elemNames contain values not present in the clustersList, those will be marked as "-1"

groupByClusters() and groupByClustersList() return a permutation that groups the clusters together. For each cluster the positions are guaranteed to be in increasing order. Any elements that do not correspond to a cluster are grouped together as the last group

mergeClusters() returns a new clusterization with the wanted clusters being merged. If fewer than 2 cluster names were passed, the function will emit a warning and return the initial clusterization

multiMergeClusters() returns a new clusterization with the wanted clusters being merged by consecutive iterations of mergeClusters() on the given namesList

Examples

## create a clusterization
clusters <- paste0("",sample(7, 100, replace = TRUE))
names(clusters) <- paste0("E_",formatC(1:100,  width = 3, flag = "0"))

## create a clusters list from a clusterization
clustersList <- toClustersList(clusters)
head(clustersList, 1)

## recreate the clusterization from the cluster list
clusters2 <- fromClustersList(clustersList, names(clusters))
all.equal(factor(clusters), clusters2)

cl1Size <- length(clustersList[["1"]])

## establish the permutation that groups clusters together
perm <- groupByClusters(clusters)
!is.unsorted(head(names(clusters)[perm],cl1Size))
head(clusters[perm], cl1Size)

## it is possible to have the list of the element names different
## from the names in the clusters list
selectedNames <- paste0("E_",formatC(11:110,  width = 3, flag = "0"))
perm2 <- groupByClustersList(selectedNames, toClustersList(clusters))
all.equal(perm2[91:100], c(91:100))

## it is possible to merge a few clusters together
clustersMerged <- mergeClusters(clusters, names = c("7", "2"),
                                mergedName = "7__2")
## index the table by cluster name rather than by position
sum(table(clusters)[c("2", "7")]) == table(clustersMerged)[["7__2"]]

## it is also possible to do multiple merges at once!
## Note the default new clusters' names
clustersMerged2 <-
  multiMergeClusters(clusters2, namesList = list(c("2", "7"),
                                                 c("1", "3", "5")))
table(clustersMerged2)



seriph78/COTAN documentation built on Dec. 10, 2024, 3:30 a.m.