a_mixer: MIXtures of Erdos-Renyi random graphs


Description

Estimate the parameters, the clusters, as well as the number of clusters q of a (binary) stochastic block model.

Usage

mixer( x, qmin=2, qmax=NULL, method="variational", directed = NULL,
nbiter=10, fpnbiter=5, improve=FALSE, verbose=TRUE)

Arguments

x

an adjacency matrix, a matrix of edges (each column gives the two node indexes defining an edge), or the name of a .spm file (a .spm file describes the network as a sparse matrix).

qmin

minimum number of classes.

qmax

maximum number of classes (if NULL, only q=qmin is considered).

method

strategy used for the estimation: "variational", "classification", or "bayesian"

directed

TRUE/FALSE for a directed/undirected graph. Default is NULL, in which case mixer determines from the input x whether the graph is directed or undirected.

nbiter

maximum number of EM iterations (default: 10).

fpnbiter

maximum number of internal iterations for the E step (default: 5).

improve

selects between improved or basic strategies (default: FALSE).

verbose

display warning messages (default: TRUE).
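The two matrix forms accepted for x can be related with a few lines of base R. The sketch below is only an illustration: the toy edge list and the variable names (edges, adj, is_undirected) are hypothetical, and the symmetry check is a natural way to recognise an undirected graph, not necessarily the check mixer performs internally.

```r
# Hypothetical toy graph on 4 nodes, written as a 2 x m edge matrix:
# each column holds the two node indexes of one edge.
edges <- matrix(c(1, 2,
                  2, 3,
                  3, 4), nrow = 2)

# Build the equivalent undirected adjacency matrix in base R.
n <- max(edges)
adj <- matrix(0, n, n)
adj[cbind(edges[1, ], edges[2, ])] <- 1   # edge i -> j
adj[cbind(edges[2, ], edges[1, ])] <- 1   # symmetrize: j -> i

# A symmetric adjacency matrix encodes an undirected graph.
is_undirected <- isSymmetric(adj)
```

Either edges or adj describes the same graph; for a directed graph the symmetrizing line would be dropped.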

Details

mixer implements inference methods for the MixNet model (sometimes referred to as the Erdős-Rényi mixture model for graphs), which is described in Daudin et al. (2008). Note that the MixNet model is a special case of binary stochastic block models (Nowicki and Snijders, 2001). Inference uncovers clusters of vertices that share homogeneous connection profiles. In particular, the package can be used to look for a specific kind of cluster, namely communities, in which nodes are more likely to connect to nodes of the same community.
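To make the community structure concrete, here is a minimal base R sketch that simulates an undirected graph in which nodes connect with a high probability inside their class and a low probability across classes. It does not use the package; the names lambda and epsilon and all numeric values are assumptions chosen for illustration.

```r
set.seed(1)                      # base R seeding, for reproducibility
n       <- 60                    # number of nodes (illustrative)
alphas  <- c(1/3, 1/3, 1/3)      # class proportions
lambda  <- 0.8                   # within-class connection probability
epsilon <- 0.2                   # between-class connection probability

# Draw a hidden class for each node.
z <- sample(seq_along(alphas), n, replace = TRUE, prob = alphas)

# The connection probability of a pair depends only on the two classes.
p <- ifelse(outer(z, z, "=="), lambda, epsilon)

# Draw the upper triangle, then mirror it to obtain an undirected graph
# without self-loops.
adj <- matrix(0, n, n)
upper <- upper.tri(adj)
adj[upper] <- rbinom(sum(upper), 1, p[upper])
adj <- adj + t(adj)
```

A matrix such as adj has the adjacency form accepted by the x argument.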

MixNet must not be confused with Exponential Random Graph Models for network data (ERGM).

The mixer package implements three different estimation strategies which were developed to deal with directed and undirected graphs:

variational

implements the method described in Daudin et al. (2008). It is the default method.

classification

implements the method described in Zanghi et al. (2008). This method is faster than the variational approach and can handle bigger networks, but it can produce biased estimates.

bayesian

implements the method described in Latouche et al. (2012).

The implementation of the first two methods is an R wrapper around the C++ software package mixnet developed by Vincent Miele (2006).

The mixer routine uses the estimation strategy specified by method and computes a model selection criterion for each value of q (the number of classes) between qmin and qmax. The ICL criterion is used for the variational and classification methods. It corresponds to an asymptotic approximation of the Integrated Classification Likelihood. The other criterion, called ILvb (Integrated Likelihood variational Bayes), is used for the bayesian method. It is based on a variational (non-asymptotic) approximation of the Integrated observed Likelihood.
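The selection step can be sketched in base R on a mock result list. The list below only imitates the shape of the output field documented in the Value section; the criterion values are invented, and the sketch assumes that a larger criterion indicates a better model.

```r
# Mock results shaped like the `output` field; the criterion values are
# invented for illustration only.
qmin <- 2
output <- list(
  list(criterion = -1510.2),  # q = 2
  list(criterion = -1432.7),  # q = 3
  list(criterion = -1458.9)   # q = 4
)

# Assumption: a larger criterion indicates a better model.
crit   <- vapply(output, function(o) o$criterion, numeric(1))
best_i <- which.max(crit)       # position in the list
best_q <- qmin + best_i - 1     # corresponding number of classes
```

Item i of the list corresponds to q = qmin + i - 1, which is why the index is shifted by qmin.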

mixer is a user-friendly package with a reduced number of functions. For R developers working on statistical networks, a more complete set, called mixer-dev, is provided (see below).

Value

mixer returns an object of class mixer. The main attributes of this class are:

nnodes

number of connected nodes.

map

mapping from connected nodes to the whole set of nodes.

edges

edge list.

qmin, qmax

minimum and maximum number of classes.

output

output list of qmax-qmin+1 items. Each item contains the result of the estimation for a given number of classes q. Details of the output field:

output[[i]]$criterion

ICL criterion or ILvb criterion used for model selection (see details section for more).

output[[i]]$alphas

vector of class proportions, whose length is the number of components.

output[[i]]$Pis

class connectivity matrix.

output[[i]]$a

vector of Dirichlet parameters for the (approximated) posterior distribution of the class proportions.

output[[i]]$eta

matrix of Beta parameters for the (approximated) posterior distribution of the connectivity matrix.

output[[i]]$zeta

matrix of Beta parameters for the (approximated) posterior distribution of the connectivity matrix.

output[[i]]$Taus

matrix of posterior probabilities of the hidden color (class) of each node, given the graph structure.
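A matrix of posterior probabilities such as Taus can be turned into a hard clustering by taking the most probable class for each node. The sketch below uses invented values and assumes that rows index classes and columns index nodes; that orientation is not stated here, so check dim() on a real result first.

```r
# Invented posterior probabilities for 3 classes and 5 nodes.
# Assumption: rows index classes and columns index nodes.
Taus <- matrix(c(0.90, 0.05, 0.05,
                 0.10, 0.80, 0.10,
                 0.20, 0.20, 0.60,
                 0.70, 0.20, 0.10,
                 0.05, 0.15, 0.80), nrow = 3)

# Each column should be a probability distribution over classes.
stopifnot(all(abs(colSums(Taus) - 1) < 1e-8))

# Maximum a posteriori class for each node.
classes <- apply(Taus, 2, which.max)

# Empirical class proportions implied by the hard assignment.
prop <- table(factor(classes, levels = seq_len(nrow(Taus)))) / ncol(Taus)
```

The proportions in prop are only a rough counterpart of the alphas field, since they come from hard assignments rather than from the fitted model.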

Author(s)

Christophe Ambroise, Gilles Grasseau, Mark Hoebeke, Pierre Latouche, Vincent Miele, Franck Picard

References

Jean-Jacques Daudin, Franck Picard and Stephane Robin (2008), A mixture model for random graphs. Statistics and Computing, 18, 2, 151-171.

Hugo Zanghi, Christophe Ambroise and Vincent Miele (2008), Fast online graph clustering via Erdős-Rényi mixture. Pattern Recognition, 41, 3592-3599.

Hugo Zanghi, Franck Picard, Vincent Miele and Christophe Ambroise (2010), Strategies for online inference of model-based clustering in large and growing networks. Annals of Applied Statistics, 4, 2, 687-714.

Pierre Latouche, Etienne Birmelé and Christophe Ambroise (2012), Variational Bayesian inference and complexity control for stochastic block models. Statistical Modelling, SAGE Publications, 12, 1, 93-115.

Vincent Miele, MixNet C++ package,
http://www.math-evry.cnrs.fr/logiciels/mixnet.

mixer-dev tool: see http://ssbgroup.fr/mixnet/mixer.html

Examples

graph.affiliation(n=100,c(1/3,1/3,1/3),0.8,0.2)->g
mixer(g$x,qmin=2,qmax=6)->xout
## Not run: plot(xout)

##

graph.affiliation(n=50,c(1/3,1/3,1/3),0.8,0.2)->g
mixer(g$x,qmin=2,qmax=5, method="bayesian")->xout
## Not run: plot(xout)

##

data(blog)

## set the seed to replicate results
setSeed(777)
mixer(x=blog$links,qmin=2,qmax=12)->xout
## Not run: plot(xout)

##

## get best run
m <- getModel(xout)

## get run for q=5
m <- getModel(xout, q=5)

mixer documentation built on Feb. 21, 2018, 1:02 a.m.
