ggmncv: GGMncv
In GGMncv: Gaussian Graphical Models with Nonconvex Regularization



Gaussian graphical modeling with nonconvex regularization. A thorough survey of these penalties, including simulation studies investigating their properties, is provided in Williams (2020).

ggmncv(
  R,
  n,
  penalty = "atan",
  ic = "bic",
  select = "lambda",
  gamma = NULL,
  lambda = NULL,
  n_lambda = 50,
  lambda_min_ratio = 0.01,
  n_gamma = 50,
  initial = NULL,
  LLA = FALSE,
  unreg = FALSE,
  maxit = 10000,
  thr = 1e-04,
  store = TRUE,
  progress = TRUE,
  ebic_gamma = 0.5,
  penalize_diagonal = TRUE,
  ...
)

`R`	Matrix. A correlation matrix of dimensions p by p.
`n`	Numeric. The sample size used to compute the information criterion.
`penalty`	Character string. Which penalty should be used (defaults to `"atan"`)?
`ic`	Character string. Which information criterion should be used (defaults to `"bic"`)? The options include `aic`, `ebic` (`ebic_gamma` defaults to `0.5`), `ric`, or any of the generalized information criteria provided in Section 5 of Kim et al. (2012). The latter are `gic_1` (i.e., `bic`) to `gic_6` (see 'Details').
`select`	Character string. Which tuning parameter should be selected (defaults to `"lambda"`)? The options include `"lambda"` (the regularization parameter), `"gamma"` (governs the 'shape'), and `"both"`.
`gamma`	Numeric. Hyperparameter for the penalty function. Defaults to 3.7 (`scad`), 2 (`mcp`), 0.5 (`adapt`), and 0.01 with all other penalties. Note that care must be taken when departing from the default values (see the references in 'Note').
`lambda`	Numeric vector. Regularization (or tuning) parameters. The default is `NULL`, which provides default values with `select = "lambda"` and `sqrt(log(p)/n)` with `select = "gamma"`.
`n_lambda`	Numeric. The number of $\lambda$ values to be evaluated. Defaults to 50. This is disregarded if custom values are provided for `lambda`.
`lambda_min_ratio`	Numeric. The smallest value for `lambda`, as a fraction of the upper bound of the regularization/tuning parameter. The default is `0.01`, which mimics the `R` package qgraph. To mimic the `R` package huge, set `lambda_min_ratio = 0.1` and `n_lambda = 10`.
`n_gamma`	Numeric. The number of $\gamma$ values to be evaluated. Defaults to 50. This is disregarded if custom values are provided for `gamma`.
`initial`	A matrix (p by p) or custom function that returns the inverse of the covariance matrix. This is used to compute the penalty derivative. The default is `NULL`, which results in using the inverse of `R` (see 'Note').
`LLA`	Logical. Should the local linear approximation be used (defaults to `FALSE`)?
`unreg`	Logical. Should the models be refitted (or unregularized) with maximum likelihood (defaults to `FALSE`)? Setting to `TRUE` results in the approach of Foygel and Drton (2010), but with the regularization path obtained from nonconvex regularization, as opposed to the $\ell_1$-penalty.
`maxit`	Numeric. The maximum number of iterations for the LLA algorithm (defaults to `1e4`). Note that this can be set to, say, `2` or `3`, which provides two- and three-step estimators without a convergence check.
`thr`	Numeric. Threshold for determining convergence of the LLA algorithm (defaults to `1.0e-4`).
`store`	Logical. Should all of the fitted models be saved (defaults to `TRUE`)?
`progress`	Logical. Should a progress bar be included (defaults to `TRUE`)?
`ebic_gamma`	Numeric. Value for the additional hyper-parameter for the extended Bayesian information criterion (defaults to 0.5, must be between 0 and 1). Setting `ebic_gamma = 0` results in BIC.
`penalize_diagonal`	Logical. Should the diagonal of the inverse covariance matrix be penalized (defaults to `TRUE`)?
`...`	Additional arguments passed to `initial` when a function is provided and ignored otherwise.
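
As a rough illustration of how `n_lambda` and `lambda_min_ratio` interact, the following base R sketch builds a grid of tuning parameters. It assumes, in the spirit of qgraph and huge, that the upper bound is the largest absolute off-diagonal correlation and that the grid is equally spaced on the log scale; the grid actually built inside `ggmncv` may differ. The toy matrix `R_toy` and the other object names are hypothetical.

```r
# toy correlation matrix (hypothetical values, p = 3)
R_toy <- matrix(c(1.0, 0.5, 0.1,
                  0.5, 1.0, 0.3,
                  0.1, 0.3, 1.0), nrow = 3, byrow = TRUE)
n <- 200
p <- ncol(R_toy)

n_lambda <- 50
lambda_min_ratio <- 0.01

# assumed upper bound: largest absolute off-diagonal correlation
lambda_max <- max(abs(R_toy[upper.tri(R_toy)]))

# smallest value, as a fraction of the upper bound
lambda_min <- lambda_min_ratio * lambda_max

# assumed spacing: equally spaced on the log scale
lambda_grid <- exp(seq(log(lambda_min), log(lambda_max),
                       length.out = n_lambda))

# single value documented for select = "gamma"
lambda_fixed <- sqrt(log(p) / n)

range(lambda_grid)
```

A vector built this way could be passed directly as `lambda`, in which case `n_lambda` is disregarded.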

Several of the penalties are (continuous) approximations to the $\ell_0$ penalty, that is, best subset selection. Unlike best subset selection, however, they do not require enumerating all possible models, which makes estimation computationally efficient.

L0 Approximations

Atan: penalty = "atan" (Wang and Zhu, 2016). This is currently the default.
Seamless $\ell_0$: penalty = "selo" (Dicker et al., 2013).
Exponential: penalty = "exp" (Wang et al., 2018).
Log: penalty = "log" (Mazumder et al., 2011).
Sica: penalty = "sica" (Lv and Fan, 2009).

Additional penalties:

SCAD: penalty = "scad" (Fan and Li, 2001).
MCP: penalty = "mcp" (Zhang, 2010).
Adaptive lasso: penalty = "adapt" (Zou, 2006). Defaults to $\gamma = 0.5$. Note that for consistency with the other penalties, $\gamma \rightarrow 0$ provides more penalization and $\gamma = 1$ results in $\ell_1$ regularization.
Lasso: penalty = "lasso" (Tibshirani, 1996).

gamma ($\gamma$):

The gamma argument corresponds to an additional hyperparameter for each penalty. The defaults are set to the recommended values from the respective papers.
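
To make the role of $\gamma$ concrete, the sketch below evaluates the atan penalty in the form given by Wang and Zhu (2016), $\lambda (\gamma + 2/\pi) \arctan(|\theta| / \gamma)$, and compares it with an $\ell_0$-type cost and the lasso. The helper names (`atan_penalty`, `l0_penalty`) are hypothetical and are not functions exported by GGMncv; the exact parameterization used internally is an assumption.

```r
# atan penalty in the form of Wang and Zhu (2016):
#   lambda * (gamma + 2 / pi) * atan(|theta| / gamma)
atan_penalty <- function(theta, lambda, gamma = 0.01) {
  lambda * (gamma + 2 / pi) * atan(abs(theta) / gamma)
}

# L0-type cost: a constant cost of lambda for every nonzero value
l0_penalty <- function(theta, lambda) {
  lambda * (theta != 0)
}

theta <- c(0, 0.05, 0.25, 0.5)
lambda <- 1

# one column per penalty, one row per value of theta
penalties <- cbind(
  l0         = l0_penalty(theta, lambda),
  atan_small = atan_penalty(theta, lambda, gamma = 0.01),
  atan_large = atan_penalty(theta, lambda, gamma = 1),
  lasso      = lambda * abs(theta)
)
rownames(penalties) <- theta

round(penalties, 3)
```

With a small $\gamma$ the atan penalty is nearly constant for nonzero values, which is the sense in which it approximates $\ell_0$; with a larger $\gamma$ the penalty grows more gradually with the size of the coefficient.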

LLA

The local linear approximation (LLA) for nonconvex penalties was described in Fan et al. (2009). This is essentially an iteratively re-weighted (g)lasso. Note that by default LLA = FALSE. This is due to the work of Zou and Li (2008), which suggested that, so long as the starting values are good enough, a one-step estimator is sufficient to obtain an accurate estimate of the conditional dependence structure. In the case of low-dimensional data, the sample-based inverse covariance matrix is used for the starting values. This is expected to work well, assuming that $n$ is sufficiently larger than $p$.
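
The one-step idea can be sketched as follows: the derivative of the penalty, evaluated at the starting values, gives an entry-wise matrix of lasso weights, and a single weighted glasso fit then yields the estimate. The derivative below is that of the atan penalty as parameterized by Wang and Zhu (2016); the helper `atan_deriv` and the toy matrix are hypothetical, and this is an illustration of the technique rather than the code run inside `ggmncv`. The glasso step is only attempted if glassoFast is installed.

```r
# derivative of lambda * (gamma + 2 / pi) * atan(|theta| / gamma)
# with respect to |theta|
atan_deriv <- function(theta, lambda, gamma = 0.01) {
  lambda * gamma * (gamma + 2 / pi) / (gamma^2 + theta^2)
}

# toy correlation matrix (hypothetical values, p = 3)
R_toy <- matrix(c(1.0, 0.5, 0.1,
                  0.5, 1.0, 0.3,
                  0.1, 0.3, 1.0), nrow = 3, byrow = TRUE)

# starting values: inverse of the correlation matrix
Theta_init <- solve(R_toy)

# entry-wise weights: large initial estimates are penalized less
lambda_mat <- atan_deriv(Theta_init, lambda = 0.1)

# one-step estimator: a single weighted glasso fit
if (requireNamespace("glassoFast", quietly = TRUE)) {
  fit_one_step <- glassoFast::glassoFast(S = R_toy, rho = lambda_mat)
  print(round(fit_one_step$wi, 3))
}
```

Recomputing the weights from the fitted precision matrix and refitting would give two- and three-step estimators, which is the loop that LLA = TRUE runs until convergence or until `maxit` is reached.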

Generalized Information Criteria

The following are the available GIC:

$\mathrm{GIC}_1: |\mathbf{E}| \cdot \log(n)$ (ic = "gic_1" or ic = "bic")
$\mathrm{GIC}_2: |\mathbf{E}| \cdot p^{1/3}$ (ic = "gic_2")
$\mathrm{GIC}_3: |\mathbf{E}| \cdot 2 \cdot \log(p)$ (ic = "gic_3" or ic = "ric")
$\mathrm{GIC}_4: |\mathbf{E}| \cdot 2 \cdot \log(p) + \log\big(\log(p)\big)$ (ic = "gic_4")
$\mathrm{GIC}_5: |\mathbf{E}| \cdot \log(p) + \log\big(\log(n)\big) \cdot \log(p)$ (ic = "gic_5")
$\mathrm{GIC}_6: |\mathbf{E}| \cdot \log(n) \cdot \log(p)$ (ic = "gic_6")

Note that $|\mathbf{E}|$ denotes the number of edges (nonzero relations) in the graph, $p$ the number of nodes (columns), and $n$ the number of observations (rows). Further, each can be understood as a penalty term added to negative 2 times the log-likelihood, that is,

$$-2 l_n(\hat{\boldsymbol{\Theta}}) = -2 \Big[\frac{n}{2} \log \det \hat{\boldsymbol{\Theta}} - \mathrm{tr}(\hat{\mathbf{S}}\hat{\boldsymbol{\Theta}})\Big]$$

where $\hat{\boldsymbol{\Theta}}$ is the estimated precision matrix (e.g., for a given $\lambda$ and $\gamma$) and $\hat{\mathbf{S}}$ is the sample-based covariance matrix.
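
For intuition about how strongly the criteria penalize each edge, the sketch below counts the edges in a precision matrix and evaluates the penalty term for four of the criteria (GIC_1, GIC_2, GIC_3, and GIC_6), chosen for brevity. The helpers `count_edges` and `gic_penalty` are hypothetical, not GGMncv functions, and counting each nonzero off-diagonal pair once is an assumption about how $|\mathbf{E}|$ is obtained.

```r
# number of edges: nonzero off-diagonal entries, each pair counted once
count_edges <- function(Theta) {
  sum(Theta[upper.tri(Theta)] != 0)
}

# penalty term added to -2 times the log-likelihood
gic_penalty <- function(edges, n, p,
                        ic = c("gic_1", "gic_2", "gic_3", "gic_6")) {
  ic <- match.arg(ic)
  switch(ic,
         gic_1 = edges * log(n),           # same as "bic"
         gic_2 = edges * p^(1 / 3),
         gic_3 = edges * 2 * log(p),       # same as "ric"
         gic_6 = edges * log(n) * log(p))
}

# toy precision matrix with one zero off-diagonal pair (hypothetical)
Theta_toy <- matrix(c( 2.0, -0.5,  0.0,
                      -0.5,  2.0, -0.4,
                       0.0, -0.4,  2.0), nrow = 3, byrow = TRUE)

edges <- count_edges(Theta_toy)

sapply(c("gic_1", "gic_2", "gic_3", "gic_6"),
       function(type) gic_penalty(edges, n = 100, p = ncol(Theta_toy),
                                  ic = type))
```

Criteria that involve $p$ penalize each edge more heavily as the number of nodes grows, so they tend to select sparser graphs in larger networks.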

An object of class ggmncv, including:

`Theta`	Inverse covariance matrix
`Sigma`	Covariance matrix
`P`	Weighted adjacency matrix
`adj`	Adjacency matrix
`lambda`	Tuning parameter(s)
`fit`	glasso fitted model (a list)

initial

initial not only affects performance (to some degree) but also computational speed. In high dimensions (defined here as p > n), or when p approaches n, the precision matrix can become quite unstable. As a result, with initial = NULL, the algorithm can take a very (very) long time. If this occurs, provide a matrix for initial (e.g., using lw). Alternatively, the penalty can be changed to penalty = "lasso", if desired.
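
The instability when p > n can be seen directly: the sample correlation matrix is rank deficient, so its inverse is not a usable starting point. The sketch below shrinks the correlation matrix toward the identity with a fixed weight before inverting it. The weight `alpha = 0.2` is an arbitrary, hypothetical choice used only for illustration; it is not a data-driven estimate such as Ledoit-Wolf shrinkage.

```r
set.seed(1)

# simulated data with more nodes than observations
n <- 10
p <- 20
Y_sim <- matrix(rnorm(n * p), nrow = n, ncol = p)
R_sim <- cor(Y_sim)

# smallest eigenvalue is numerically zero: R_sim is rank deficient
min_eig_raw <- min(eigen(R_sim, symmetric = TRUE,
                         only.values = TRUE)$values)

# fixed linear shrinkage toward the identity (hypothetical weight)
alpha <- 0.2
R_shrunk <- (1 - alpha) * R_sim + alpha * diag(p)
min_eig_shrunk <- min(eigen(R_shrunk, symmetric = TRUE,
                            only.values = TRUE)$values)

# a well-conditioned inverse that can serve as starting values
Theta_init <- solve(R_shrunk)

c(raw = min_eig_raw, shrunk = min_eig_shrunk)

# Theta_init could then be supplied as, e.g.:
# fit <- ggmncv(R_sim, n = n, initial = Theta_init, progress = FALSE)
```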

ggmncv uses the R package glassoFast under the hood (Sustik and Calderhead, 2012), which is much faster than glasso when there are many nodes.


library(GGMncv)

# data
Y <- GGMncv::ptsd

S <- cor(Y)

# fit model
# note: atan default
fit_atan <- ggmncv(S, n = nrow(Y),
                   progress = FALSE)

# plot
plot(get_graph(fit_atan),
     edge_magnify = 10,
     node_names = colnames(Y))

# lasso
fit_l1 <- ggmncv(S, n = nrow(Y),
                 progress = FALSE,
                 penalty = "lasso")

# plot
plot(get_graph(fit_l1),
     edge_magnify = 10,
     node_names = colnames(Y))


# for these data, we might expect all relations to be positive
# and thus the red edges are spurious. The following re-estimates
# the graph, given all edges positive (sign restriction).

# set negatives to zero (sign restriction)
adj_new <- ifelse( fit_atan$P <= 0, 0, 1)

check_zeros <- TRUE

# track the number of iterations
iter <- 0

# iterate until all positive
while(check_zeros){
  iter <- iter + 1
  fit_new <- constrained(S, adj = adj_new)
  check_zeros <- any(fit_new$wadj < 0)
  adj_new <- ifelse( fit_new$wadj <= 0, 0, 1)
}

# make graph object
new_graph <- list(P = fit_new$wadj,
                  adj = adj_new)
class(new_graph) <- "graph"

plot(new_graph,
     edge_magnify = 10,
     node_names = colnames(Y))