
Graph structure search and estimation for Gaussian covariance and concentration graph models.

```
searchGGM(data = NULL,
          S = NULL, N = NULL,
          model = c("covariance", "concentration"),
          search = c("step-forw", "step-back", "ga"),
          penalty = c("bic", "ebic", "erdos", "power"),
          beta = NULL,
          start = NULL,
          regularize = FALSE, regHyperPar = NULL,
          ctrlStep = ctrlSTEP(), ctrlGa = ctrlGA(), ctrlIcf = ctrlICF(),
          parallel = FALSE,
          verbose = FALSE, ...)
```

| Argument | Description |
|---|---|
| `data` | A data frame or matrix, where rows correspond to observations and columns to variables. Categorical variables are not allowed. |
| `S` | The sample covariance matrix of the data. If |
| `N` | The number of observations. If |
| `model` | The type of Gaussian graphical model. Default is |
| `search` | The type of structure search algorithm. If |
| `penalty` | The penalty function used to define a criterion for scoring the candidate graph configurations. Default is |
| `beta` | The hyperparameter of the penalty function. See "Details" and |
| `start` | A starting matrix for the estimation algorithm. If |
| `regularize` | A logical argument indicating if Bayesian regularization should be performed. Default to |
| `regHyperPar` | A list of hyperparameters for Bayesian regularization. Only used when |
| `ctrlStep` | A list of control parameters used in the stepwise search; see also |
| `ctrlGa` | A list of control parameters for the genetic algorithm; see also |
| `ctrlIcf` | A list of control parameters employed in the algorithm for estimation of graphical model parameters; see also |
| `parallel` | A logical argument indicating if parallel computation should be used for structure search. If `TRUE`, all the available cores are used. The argument can also be set to an integer value specifying the number of cores to be employed. |
| `verbose` | A logical argument controlling whether the iterations of the structure search and estimation procedure are shown. |
| `...` | Additional internal arguments not to be provided by the user. |

The function performs graph association structure search and maximum penalized likelihood estimation of the optimal Gaussian graphical model given the data provided in input.

A Gaussian covariance graph model is estimated if `model = "covariance"`, while estimation of a Gaussian concentration graph model is performed if `model = "concentration"`. A Gaussian covariance graph model postulates that some variables are marginally independent according to the inferred graph structure. On the other hand, in a Gaussian concentration graph model, variables are conditionally independent given their neighbors in the inferred graph. See also `fitGGM`.
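The difference between the two model types can be seen on a small Markov chain, built the same way as the simulated data in the examples. The following base R sketch is only an illustration and does not call `searchGGM`; the variable names and the numerical tolerance `1e-8` are assumptions made here.

```r
# Chain X_j = X_{j-1} + e_j with Var(X_1) = 1 and Var(e_j) = 0.25
V <- 5
L <- matrix(0, V, V)
L[lower.tri(L, diag = TRUE)] <- 1      # X = L %*% innovations
D <- diag(c(1, rep(0.25, V - 1)))      # innovation variances
Sigma <- L %*% D %*% t(L)              # implied covariance matrix
Omega <- solve(Sigma)                  # implied concentration matrix

# Adjacency matrices read off the zero patterns (tolerance is an assumption)
A_cov <- (abs(Sigma) > 1e-8) * 1
diag(A_cov) <- 0
A_conc <- (abs(Omega) > 1e-8) * 1
diag(A_conc) <- 0

sum(A_cov) / 2    # number of edges in the covariance graph
sum(A_conc) / 2   # number of edges in the concentration graph
```

Every pair of variables in the chain is marginally dependent, so the covariance graph is complete, whereas each variable is conditionally independent of the rest given its immediate neighbors, so the concentration graph contains only the edges between consecutive variables.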

Search for the optimal graph structure and parameter estimation is carried out by maximization of a Gaussian penalized likelihood, given as follows:

$$\text{Covariance:}\quad \underset{\Sigma,\,A}{\arg\max}\; L(X \mid \Sigma, A) - P_\beta(A) \quad \text{with } \Sigma \in C_G(A)$$

$$\text{Concentration:}\quad \underset{\Omega,\,A}{\arg\max}\; L(X \mid \Omega, A) - P_\beta(A) \quad \text{with } \Omega \in C_G(A)$$

where $C_G(A)$ is the collection of sparse positive definite matrices whose zero patterns are given by the graph $G$ represented by the adjacency matrix $A$.

The penalty function $P_\beta(A)$ depends on the structure of graph $G$ through the adjacency matrix $A$ and a parameter $\beta$; see `penalty` on how to specify the penalization term and for further information.
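To make the trade-off between fit and graph complexity concrete, the sketch below scores two candidate covariance graphs in base R. It rests on assumptions not stated on this page: a zero-mean Gaussian log-likelihood written in terms of the sample covariance, and a BIC-type penalty of `log(N)/2` per edge. The helpers `gauss_loglik` and `bic_penalty` are hypothetical names, and the penalty forms actually used by the package are those documented under `penalty`.

```r
# Gaussian log-likelihood of a zero-mean model, via the sample covariance S
gauss_loglik <- function(Sigma, S, N) {
  V <- ncol(S)
  logdet <- as.numeric(determinant(Sigma, logarithm = TRUE)$modulus)
  -N / 2 * (V * log(2 * pi) + logdet + sum(diag(solve(Sigma, S))))
}

# ASSUMPTION: BIC-type penalty, log(N)/2 for each edge in the graph
bic_penalty <- function(A, N) log(N) / 2 * sum(A[upper.tri(A)])

S <- matrix(c(1.0, 0.8, 0.1,
              0.8, 1.0, 0.1,
              0.1, 0.1, 1.0), 3, 3)
N <- 100

A_empty <- matrix(0, 3, 3)   # no edges: all variables marginally independent
A_full  <- 1 - diag(3)       # all edges: unrestricted covariance matrix

# Empty graph: Sigma is diagonal; full graph: Sigma equals S
score_empty <- gauss_loglik(diag(diag(S)), S, N) - bic_penalty(A_empty, N)
score_full  <- gauss_loglik(S, S, N) - bic_penalty(A_full, N)
c(empty = score_empty, full = score_full)
```

Intermediate graphs would require a constrained fit of the covariance matrix, which is the step handled by the estimation algorithms in `fitGGM`.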

For this type of penalized log-likelihood, graph structure search and parameter estimation is a combinatorial maximization problem. For a given candidate structure (i.e. adjacency matrix), the association parameters in the covariance or concentration matrix are estimated using the estimation algorithms implemented in `fitGGM`. Structure search can be carried out using either a greedy forward-stepwise or a greedy backward-stepwise algorithm, by setting `search = "step-forw"` or `search = "step-back"` respectively. Alternatively, a stochastic search via genetic algorithm can be used by setting `search = "ga"`. The procedure for the forward-stepwise search is described in Fop et al. (2018), and the backward-stepwise search is implemented in a similar way; the genetic algorithm procedure relies on the `GA` package. All the structure search methods can be run in parallel on a multi-core machine by setting the argument `parallel = TRUE`.
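The general shape of a greedy forward-stepwise search over edges can be sketched in a few lines of base R. This is a generic illustration, not the package's implementation: `greedy_forward` and `toy_score` are hypothetical names, and the toy score simply rewards edges of a known target graph, whereas a real score would be the penalized log-likelihood of the model fitted on each candidate graph.

```r
# Generic greedy forward search: start from the empty graph and repeatedly
# add the single edge that most improves the score, until none does.
greedy_forward <- function(V, score) {
  A <- matrix(0, V, V)
  best <- score(A)
  edges <- which(upper.tri(A), arr.ind = TRUE)   # all candidate edges (i < j)
  repeat {
    cand <- apply(edges, 1, function(p) {
      if (A[p[1], p[2]] == 1) return(-Inf)       # edge already in the graph
      B <- A
      B[p[1], p[2]] <- B[p[2], p[1]] <- 1
      score(B)
    })
    k <- which.max(cand)
    if (cand[[k]] <= best) break                 # no addition improves the score
    A[edges[k, 1], edges[k, 2]] <- A[edges[k, 2], edges[k, 1]] <- 1
    best <- cand[[k]]
  }
  list(graph = A, score = best)
}

# Toy score (assumption for illustration): +1 for each target edge, -1 otherwise
target <- matrix(0, 4, 4)
target[1, 2] <- target[2, 1] <- 1
target[3, 4] <- target[4, 3] <- 1
toy_score <- function(A) {
  e <- upper.tri(A)
  2 * sum(A[e] * target[e]) - sum(A[e])
}

res <- greedy_forward(4, toy_score)
res$graph
```

A backward-stepwise search follows the same pattern, starting from the complete graph and removing one edge at a time.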

An object of class `'fitGGM'` containing the optimal estimated marginal or conditional independence Gaussian graphical model.

The output is a list containing:

| Component | Description |
|---|---|
| `sigma` | The estimated covariance matrix. |
| `omega` | The estimated concentration (inverse covariance) matrix. |
| `graph` | The adjacency matrix corresponding to the optimal marginal or conditional independence graph. |
| `model` | Estimated model type, whether |
| `loglikPen` | Value of the maximized penalized log-likelihood. |
| `loglik` | Value of the maximized log-likelihood. |
| `nPar` | Number of estimated parameters. |
| `N` | Number of observations. |
| `V` | Number of variables, corresponding to the number of nodes in the graph. |
| `penalty` | The type of penalty on the graph structure. |
| `search` | The search method used for graph structure search. |
| `GA` | An object of class |

Fop, M., Murphy, T.B., and Scrucca, L. (2018). Model-based clustering with sparse covariance matrices. *Statistics and Computing*. To appear.

Scrucca, L. (2017). On some extensions to GA package: Hybrid optimisation, parallelisation and islands evolution. *The R Journal*, 9(1), 187-206.

Scrucca, L. (2013). GA: A package for genetic algorithms in R. *Journal of Statistical Software*, 53(4), 1-37.

```
# load the package that provides searchGGM (assumed here to be mixggm)
library(mixggm)

# fit covariance graph model with default forward-stepwise search
data(mtcars)
x <- mtcars[, c(1, 3:7)]
mod1 <- searchGGM(x, model = "covariance")
mod1
plot(mod1)

# prefer a sparser model
mod2 <- searchGGM(x, model = "covariance", penalty = "ebic")
mod2
plot(mod2)

# fit concentration graph model with backward-stepwise structure search
# with a covariance matrix in input
data(ability.cov)
mod3 <- searchGGM(S = ability.cov$cov, N = ability.cov$n.obs,
                  model = "concentration", search = "step-back")
mod3
mod3$graph   # estimated adjacency matrix
mod3$omega   # estimated concentration matrix
plot(mod3)

## Not run:
# generate data from a Markov model
N <- 1000
V <- 20
dat <- matrix(NA, N, V)
dat[, 1] <- rnorm(N)
for (j in 2:V) dat[, j] <- dat[, j - 1] + rnorm(N, sd = 0.5)
mod4 <- searchGGM(data = dat, model = "concentration")   # recover the model
plot(mod4, what = "adjacency")
## End(Not run)
```
