gsbm_mcgd: Fit a Generalized Stochastic Block Model

View source: R/gsbm_mcgd.R

gsbm_mcgdR Documentation

Fit a Generalized Stochastic Block Model

Description

Given an adjacency matrix with missing observations, the function gsbm_mgcd robustly estimates the probabilities of connections between nodes.

Usage

gsbm_mcgd(
  A,
  lambda1,
  lambda2,
  epsilon = 0.1,
  U = NULL,
  maxit = 100,
  thresh = 1e-06,
  S0 = NULL,
  L0 = NULL,
  R0 = NULL,
  trace.it = FALSE
)

Arguments

A

nxn adjacency matrix

lambda1

regularization parameter for nuclear norm penalty (positive number)

lambda2

regularization parameter for 2,1-norm penalty (positive number)

epsilon

regularization parameter for the L2-norm penalty (positive number, if NULL, default method is applied)

U

lower bound on nuclear norm (positive number, if NULL, default method is applied)

maxit

maximum number of iterations (positive integer, if NULL, default method is applied)

thresh

convergence tolerance (positive number, if NULL, default method is applied)

S0

initial value for the sparse component (if NULL, default method is applied))

L0

initial value for the low-rank component (if NULL, default method is applied)

R0

lower bound on nuclear norm of L0 (positive number, if NULL, default method is applied)

trace.it

whether messages about convergence should be printed (boolean, if NULL, default is FALSE)

Value

The estimate for the nxn matrix of probabilities of connections between nodes. It is given as the sum of a low-rank nxn matrix L, corresponding to connections between inlier nodes, and a column sparse nxn matrix S, corresponding to connections between outlier nodes and the rest of the network. The matrices L and S are such that

E(A) = L - diag(L) + S + S'

where E(A) is the expectation of the adjacency matrix, diag(L) is a nxn diagonal matrix with diagonal entries equal to those of L, and S' means S transposed.

The return value is a list of components

  • A the adjacency matrix.

  • L estimate for the low-rank component.

  • S estimate for the column-sparse component.

  • objective the value of the objective function.

  • R a bound on the nuclear norm of the low-rank component.

  • iter number of iterations between convergence of the objective function.

Examples

# Draw a 50x50 adjacency matrix
# Generalized SBM with 2 communities and 2 outliers
# Create low-rank matrix L
L <- matrix(0,50,50) # low-rank component
L[1:25, 1:25] <- 0.6 # connection probabilities within community 1
L[1:25, 26:48] <- 0.1 # connection probabilities between communities 1 and 2
L[26:48, 1:25] <- 0.1 # connection probabilities between communities 1 and 2
L[26:48, 26:48] <- 0.6 # connection probabilities within community 2

# Create column-sparse matrix S
S <- matrix(0,50,50) # column sparse component
S[49:50,1:48] <- 0.6 # connection probabilties between outliers and inliers

# Draw connections and create the adjacency matrix
undir <- rbinom(n=50*(50-1)/2, size=1, prob=(L+S+t(S))[upper.tri(L+S+t(S))]) # draw edges
A <- matrix(0,50,50)
A[upper.tri(A)] <- undir
A <- (A+t(A))

# Estimate the probabilities of connection
lambda1 <- 7
lambda2 <- 7
res <- gsbm_mcgd(A, lambda1, lambda2)

gsbm documentation built on Sept. 20, 2022, 9:06 a.m.