HCD: hierarchical community detection with recursive spectral...

Description Usage Arguments Details Value Author(s) References Examples

View source: R/HCD-Implement.R

Description

Hierarchical community by recursive spectral partitioning. It includes the splitting methods of spectral clustering and sign splitting, as well stopping rules for fixed stopping, non-backtracking matrix checking and edge cross-validation.

Usage

1
HCD(A, method = "SS", stopping = "NB", reg = FALSE, n.min = 25, D = NULL,notree=TRUE)

Arguments

A

adjacency matrix. Can be standard R matrix or dsCMatrix (or other type in package Matrix)

method

splitting method. "SS" (default) for sign splitting, "SC" for spectral clustering

stopping

stopping rule. "NB" (default) for non-backtracking matrix spectrum, "ECV" for edge cross-validation, "Fix"for fixed D layers of partitioning (needs D value)

reg

logic value on whether regularization is needed. By default it is FALSE.Set it to be TRUE will add reguarlization, which help the performance on sparse networks, but it will make the computation slower.

n.min

integer number. The algorithm will stop splitting if the current size is <= 2*n.min.

D

the number of layers to partition, if stopping=="Fix".

notree

logical value on whether the tree and the corresponding similarity will be computed. If TRUE (default), will not produce the data.tree object or the community similarity matrix. Only the cluster label and the tree path strings will be returned. This typically makes the runing faster.

Details

For stopping rules, ECV is nonparametric rank evaluation by cross-validation, a more generally applicable approach without assuming SBM or its variants. ECV is also applicable for weighted networks.So it is believed to be more robust than NB but less effective if the true model is close to BTSBM. However, the ECV is computationally much more intensive.

Notice that the algorithm does not reply on the assumption of the BTSBM. But the estimated probabiilty matrix from the output is based on the BTSBM.

Value

A list of the following objects:

labels

detected community labels of nodes

ncl

number of clusters from the algorithm

cluster.tree

a data.tree object for the binary tree between communities

P

estimated connection probability matrix between n nodes, according to BTSBM

node.bin.sim.mat

binary string similarity between nodes

comm.bin.sim.mat

binary string similarity between communities

tree.path

a list of strings to describe the path from root to each community along the tree

Author(s)

Tianxi Li, Lihua Lei, Sharmodeep Bhattacharyya, Purnamrita Sarkar, Peter Bickel, and Elizaveta Levina.

Maintainer: Tianxi Li <tianxili@virginia.edu>

References

Tianxi Li, Lihua Lei, Sharmodeep Bhattacharyya, Purnamrita Sarkar, Peter Bickel, and Elizaveta Levina. Hierarchical community detection by recursive partitioning. arXiv:1810.01509

Examples

1
2
3
4
dt <- BTSBM(n=1600,d=4,a.seq=0.2^seq(0,4),lambda=50)
A <- dt$A.list[[1]]
# you can try various versions of the algorithm as below: the Fix is fastest and ECV is slowest.
system.time(HCD.result <- HCD(A,method="SC",stopping="Fix",D=4))

HCD documentation built on Aug. 24, 2019, 5:03 p.m.