getCluster_methods: Clustering Network Nodes

View source: R/BioTIP_update_04202022.R

getCluster_methods    R Documentation

Clustering Network Nodes

Description

This function iterates over all states, where each state is a group of samples. For each state, it splits the correlation network generated by the getNetwork function into several sub-networks, called modules. The network nodes are defined by the end user; for transcriptome analysis, they can be the expressed transcripts. The output includes the module IDs and the node IDs in each module.

Usage

getCluster_methods(
  igraphL,
  method = c("rw", "hcm", "km", "pam", "natural"),
  cutoff = NULL
)

Arguments

igraphL

A list of numerical matrices or a list of igraph objects. The list of igraph objects can be the output from the getNetwork function.

method

The clustering method applied to the network nodes. One of 'rw', 'hcm', 'km', 'pam', or 'natural'; the default is random walk ('rw'). The methods are:

  • rw: random walk, using the cluster_walktrap function in the igraph package. 'igraphL' has to be a list of igraph objects.

  • hcm: hierarchical clustering, using the functions hclust and dist with method 'complete'.

  • km and pam: k-medoids (partitioning around medoids, PAM) clustering, using KMedoids.

  • natural: if nodes are disconnected, they may naturally cluster and form sub-networks.
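As an illustration of the 'hcm' option, the following is a minimal base R sketch of complete-linkage clustering with hclust and dist on a single toy matrix. The matrix values and node names are made up, and cutting the tree with cutree(k = 2) is an assumption about how a cutoff of 2 clusters would be applied; it is not taken from the package source.

```r
# Toy 4 x 4 numeric matrix standing in for one state's node-by-node matrix
# (hypothetical values, not BioTIP output).
m <- matrix(c(1.0, 0.9, 0.1, 0.2,
              0.9, 1.0, 0.2, 0.1,
              0.1, 0.2, 1.0, 0.8,
              0.2, 0.1, 0.8, 1.0),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("n", 1:4), paste0("n", 1:4)))

# Complete-linkage hierarchical clustering on Euclidean distances between rows.
hc <- hclust(dist(m), method = "complete")

# Cut the tree into 2 groups (assumed to correspond to cutoff = 2).
modules <- cutree(hc, k = 2)

# Named integer vector: names are node IDs, values are module IDs.
print(modules)
```

In this toy case n1 and n2 fall into one module and n3 and n4 into the other, because the rows within each pair are much closer to each other than to the rows of the other pair.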

cutoff

A numeric value, default is NULL. For each method it means:

  • rw: the number of steps of the random walk; see cluster_walktrap for more detail. If 'cutoff' is not assigned, the default of 4 is used.

  • hcm, km and pam: the number of clusters wanted. No default is assigned.

  • natural: does not use this parameter.
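The rules above can be summarised in a small sketch. resolve_cutoff is a hypothetical helper written for this illustration and is not part of BioTIP; in particular, stopping with an error when 'cutoff' is missing for 'hcm', 'km' or 'pam' is a choice made here, since the page only states that no default is assigned.

```r
# Hypothetical helper (not a BioTIP function): returns the effective cutoff
# for a given method, following the rules listed for the 'cutoff' argument.
resolve_cutoff <- function(method = c("rw", "hcm", "km", "pam", "natural"),
                           cutoff = NULL) {
  method <- match.arg(method)
  switch(method,
    # rw: number of random-walk steps, 4 if not supplied
    rw = if (is.null(cutoff)) 4 else cutoff,
    # natural: the parameter is not used
    natural = NULL,
    # hcm, km, pam: number of clusters, no default
    {
      if (is.null(cutoff)) {
        stop("'cutoff' (number of clusters) must be given for method '",
             method, "'")
      }
      cutoff
    })
}

resolve_cutoff()            # default method 'rw', default cutoff
resolve_cutoff("hcm", 2)    # number of clusters
resolve_cutoff("natural")   # cutoff ignored
```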

Value

When method = 'rw': a list of communities objects from the R package igraph, with the same length as the input object igraphL. These communities objects can be used for visualization by assigning them to the 'mark.groups' parameter of the plot.igraph function of the igraph package.

For all other methods: a list of vectors, with the same length as the input object igraphL. Each vector contains the module IDs for one state; its length is the number of pre-selected transcripts in that state, and its names are the transcript IDs pre-selected by the function sd_selection.
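To show how such a named vector of module IDs might be inspected, here is a minimal base R sketch. The vector membership_state1 and its transcript names are made up for illustration and are not output of getCluster_methods. For a communities object, igraph's membership function returns a comparable vector.

```r
# Hypothetical membership vector for one state: names are transcript IDs,
# values are module IDs (made-up values, not BioTIP output).
membership_state1 <- c(geneA = 1, geneB = 2, geneC = 1, geneD = 3, geneE = 2)

# Transcript IDs grouped by module ID.
nodes_by_module <- split(names(membership_state1), membership_state1)

# Number of transcripts in each module.
module_sizes <- lengths(nodes_by_module)

print(nodes_by_module)
print(module_sizes)
```

With a real result, the same two calls would be applied to one element of the returned list, for example through lapply over all states.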

Author(s)

Zhezhen Wang zhezhen@uchicago.edu

Examples

# three states, each a 3 x 3 matrix; the 6 sampled values are recycled to fill it
test <- list(
  'state1' = matrix(sample(1:10, 6), 3, 3),
  'state2' = matrix(sample(1:10, 6), 3, 3),
  'state3' = matrix(sample(1:10, 6), 3, 3)
)

# assign colnames and rownames to each matrix
for (i in names(test)) {
  colnames(test[[i]]) <- 1:3
  row.names(test[[i]]) <- 1:3
}

# using the 'rw' or 'natural' method
igraphL <- getNetwork(test, fdr = 1)
#[1] "state1:3 nodes"
#[1] "state2:3 nodes"
#[1] "state3:3 nodes"

cl <- getCluster_methods(igraphL)

# using 'km', 'pam' or 'hcm'
cl <- getCluster_methods(test, method = 'pam', cutoff = 2)


xyang2uchicago/BioTIP documentation built on June 30, 2024, 10:14 p.m.