graph.model.selection: Graph model selection

Description Usage Arguments Value References Examples

View source: R/statGraph.R

Description

graph.model.selection selects the graph model that best approximates the observed graph according to the Graph Information Criterion (GIC).

Usage

1
2
3
4
5
6
7
8
graph.model.selection(
  G,
  models = NULL,
  parameters = NULL,
  eps = 0.01,
  bandwidth = "Silverman",
  eigenvalues = NULL
)

Arguments

G

the undirected graph (igraph type) or its adjacency matrix. The adjacency matrix of an unweighted graph contains only 0s and 1s, while the weighted graph may have nonnegative real values that correspond to the weights of the edges.

models

either a vector of strings, a list of functions or a list of arrays describing graph models:

A vector of strings containing some of the following models: "ER" (Erdos-Renyi random graph), "GRG" (geometric random graph), "KR" (k regular random graph), "WS" (Watts-Strogatz model), and "BA" (Barabasi-Albert model).

A list of functions. Each function returns a graph (represented by its adjacency matrix) generated by a graph model and has two arguments: the graph size and the model parameter, in this order.

A list of arrays. Each elememt of the list is a three-dimensional array containing the precomputed spectrum of each model. Let M be a graph model. For each parameter p considered for M, the array of model M contains the eigenvalues of graphs randomly generated by M with parameter p. The position (i,j,k) of the array contains the j-th eigenvalue of the k-th graph that generated by M with the i-th parameter. The attribute 'rownames' of the array corresponds to the parameters converted to string.

If the argument "models" is NULL, then the "ER", "WS", and "BA" models will be considered for the model selection.

parameters

list of numeric vectors. Each vector contains the values that will be considerated for the parameter estimation of the corresponding model. If the user does not provide the argument 'parameters', then default values are used for the predefined models ("ER", "GRG", "KR", "WS", and "BA"). The default vector corresponds to a sequence from

0 to 1 with step 'eps' for the "ER" model (Erdos-Renyi random graph), in which the parameter corresponds to the probability to connect a pair of vertices;

0 to sqrt(2) with step 'eps' for the "GRG" model (geometric random graph), in which the parameter corresponds to the radius used to contruct the geometric graph in a unit square;

0 to 'n' with step 'n*eps' for the "KR" model (k regular random graph), in which the parameter of the model corresponds to the degree 'k' of a regular graph;

0 to 1 with step 'eps' for the "WS" model (Watts-Strogatz model), in which the parameter corresponds to the probability to reconnect a vertex;

and 0 to 3 with step 'eps' for the "BA" model (Barabasi-Albert model), in which the parameter corresponds to the scaling exponent.

eps

precision of the grid (default is 0.01).

bandwidth

string showing which criterion is used to choose the bandwidth during the spectral density estimation. Choose between the following criteria: "Silverman" (default), "Sturges", "bcv", "ucv" and "SJ". "bcv" is an abbreviation of biased cross-validation, while "ucv" means unbiased cross-validation. "SJ" implements the methods of Sheather & Jones (1991) to select the bandwidth using pilot estimation of derivatives.

eigenvalues

optional parameter. It contains the eigenvalues of matrix G. Then, it can be used when the eigenvalues of G were previously computed. If no value is passed, then the eigenvalues of G will be computed by 'graph.model.selection'.

Value

A list containing:

model

the indice(s) or name(s) of the selected model(s), i. e. the model(s) that minimize(s) the Graph Information Criterion (GIC).

estimates

a matrix in which each row corresponds to a model, the column "param" corresponds to the parameter estimate, and the column "GIC" corresponds to the Graph Information Criterion (GIC), i. e. the Kullback-Leibler divergence between the observed graph and the model.

References

Takahashi, D. Y., Sato, J. R., Ferreira, C. E. and Fujita A. (2012) Discriminating Different Classes of Biological Networks by Analyzing the Graph Spectra Distribution. _PLoS ONE_, *7*, e49949. doi:10.1371/journal.pone.0049949.

Silverman, B. W. (1986) _Density Estimation_. London: Chapman and Hall.

Sturges, H. A. The Choice of a Class Interval. _J. Am. Statist. Assoc._, *21*, 65-66.

Sheather, S. J. and Jones, M. C. (1991). A reliable data-based bandwidth selection method for kernel density estimation. _Journal of the Royal Statistical Society series B_, 53, 683-690. http://www.jstor.org/stable/2345597.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
## Example using an igraph object as input data
set.seed(1)
G <- igraph::sample_gnp(n=30, p=0.5)

# Using strings to indicate the graph models
result1 <- graph.model.selection(G, models=c("ER", "WS"), eps=0.5)
result1


## Using functions to describe the graph models
# Erdos-Renyi graph
model1 <- function(n, p) {
  return(igraph::sample_gnp(n, p))
}
# Watts-Strogatz small-world graph
model2 <- function(n, pr, K=8) {
  return(igraph::sample_smallworld(1, n, K, pr))
}
parameters <- list(seq(0.01, 0.99, 0.49), seq(0.01, 0.99, 0.49))
result2 <- graph.model.selection(G, list(model1, model2), parameters)
result2

statGraph documentation built on May 19, 2021, 9:11 a.m.