validateCluster: Validate the cluster analysis in a projected network based on...

validateCluster    R Documentation

Validate the cluster analysis in a projected network based on additional external measures.

Description

This function calculates the similarity between a clustering result and a provided ground truth, supplied as external features (prior knowledge). It reports external cluster validity measures, including the corrected Rand index (corrected.rand) and the Jaccard similarity. It operates on the output of findCluster: an igraph community object and the corresponding distance matrix.

Usage

validateCluster(community, extra_feature, dist_mat)

Arguments

community

An igraph community object.

extra_feature

A data frame object that shows the group membership of each node based on prior knowledge.

dist_mat

A matrix containing the distances between nodes in the network. This matrix can be retrieved from the output of findCluster (the distance_matrix element, as in the example below).

Value

A list containing the similarity measures between the clustering result and the ground truth given as external features, i.e., the corrected Rand and Jaccard indices.
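This page does not show how the two indices are computed. As an illustration only, the following base-R sketch uses the common pair-counting definitions of the Jaccard index and the corrected (adjusted) Rand index. The helper names pair_counts, jaccard_index and adjusted_rand are hypothetical and are not part of NIMAA; the package may compute its values differently.

```r
# Count node pairs by whether they share a group in each labeling.
# a, b: label vectors of equal length, one entry per node.
pair_counts <- function(a, b) {
  stopifnot(length(a) == length(b))
  tab <- table(a, b)                        # contingency table of the two labelings
  same_both <- sum(choose(tab, 2))          # pairs grouped together in both
  same_a <- sum(choose(rowSums(tab), 2))    # pairs grouped together in a
  same_b <- sum(choose(colSums(tab), 2))    # pairs grouped together in b
  list(n11 = same_both,
       n10 = same_a - same_both,            # together in a only
       n01 = same_b - same_both,            # together in b only
       same_a = same_a, same_b = same_b,
       total = choose(length(a), 2))        # all node pairs
}

# Jaccard index: shared pairs over pairs grouped in at least one labeling.
# Undefined (NaN) when neither labeling groups any pair.
jaccard_index <- function(a, b) {
  p <- pair_counts(a, b)
  p$n11 / (p$n11 + p$n10 + p$n01)
}

# Corrected (adjusted) Rand index: agreement rescaled so that
# chance-level agreement is 0 and identical partitions give 1.
adjusted_rand <- function(a, b) {
  p <- pair_counts(a, b)
  expected <- p$same_a * p$same_b / p$total
  max_index <- (p$same_a + p$same_b) / 2
  (p$n11 - expected) / (max_index - expected)
}

# Toy labelings: same partition, different label names.
truth <- c("g1", "g1", "g2", "g2")
found <- c(1, 1, 2, 2)
jaccard_index(found, truth)
adjusted_rand(found, truth)
```

Both indices depend only on which nodes are grouped together, not on the label names, so the toy call above gives 1 for each. A clustering that cuts across the ground truth can give a negative corrected Rand value.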

Examples

# load the package so that the unqualified calls below resolve
library(NIMAA)

# load part of the beatAML data
beatAML_data <- NIMAA::beatAML[1:10000, ]

# convert to incidence matrix
beatAML_incidence_matrix <- nominalAsBinet(beatAML_data)

# do clustering
cls <- findCluster(beatAML_incidence_matrix,
  part = 1, method = c('infomap','walktrap'),
  normalization = FALSE, rm_weak_edges = TRUE,
  comparison = FALSE)

# generate a random external_feature
external_feature <- data.frame(row.names = cls$infomap$names)
external_feature[, 'membership'] <- paste('group',
  sample(c(1, 2, 3, 4), nrow(external_feature),
    replace = TRUE))

# validate clusters using random external feature
validateCluster(community = cls$walktrap,
  extra_feature = external_feature,
  dist_mat = cls$distance_matrix)
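The example above uses random groups. With real prior knowledge, the data frame would presumably be built the same way, with one row per node and the node names as row names. The sketch below shows one way to line up known groups with a vector of node names in base R. The node names and group labels are hypothetical stand-ins (node_names plays the role of cls$infomap$names), and the assumption that rows should be keyed by node name is taken from the example, not from a documented requirement.

```r
# Hypothetical node names, standing in for cls$infomap$names.
node_names <- c("patient_1", "patient_2", "patient_3", "patient_4")

# Hypothetical prior knowledge, stored in a different order than the nodes.
known_groups <- c(patient_3 = "group B", patient_1 = "group A",
                  patient_4 = "group B", patient_2 = "group A")

# Index by name so that each row lines up with its node.
extra_feature <- data.frame(
  membership = unname(known_groups[node_names]),
  row.names = node_names
)

# A node missing from the prior knowledge would appear as NA; stop early if so.
stopifnot(!anyNA(extra_feature$membership))
```

Indexing by name rather than by position avoids silently mismatching groups when the prior-knowledge table and the network list the nodes in different orders.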

NIMAA documentation built on April 11, 2022, 5:05 p.m.