huge.select: Model selection for high-dimensional undirected graph...


View source: R/huge.select.R

Description

Implements regularization parameter selection for high-dimensional undirected graph estimation. The available approaches are the rotation information criterion (ric), the stability approach to regularization selection (stars), and the extended Bayesian information criterion (ebic).

Usage

huge.select(
  est,
  criterion = NULL,
  ebic.gamma = 0.5,
  stars.thresh = 0.1,
  stars.subsample.ratio = NULL,
  rep.num = 20,
  verbose = TRUE
)

Arguments

est

An object with S3 class "huge".

criterion

Model selection criterion. "ric" and "stars" are available for all 3 graph estimation methods. "ebic" is only applicable when est$method = "glasso", that is, when est was produced by huge() with method = "glasso". The default value is "ric".

ebic.gamma

The tuning parameter for ebic. The default value is 0.5. Only applicable when est$method = "glasso" and criterion = "ebic".

stars.thresh

The variability threshold in stars. The default value is 0.1. An alternative value is 0.05. Only applicable when criterion = "stars".

stars.subsample.ratio

The subsampling ratio. The default value is 10*sqrt(n)/n when n>144 and 0.8 when n<=144, where n is the sample size. Only applicable when criterion = "stars".
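
As a rough illustration, the default rule stated above can be written as a small helper. The function name default_subsample_ratio is hypothetical and is not part of the huge package; it only restates the formula in the text (note that 10*sqrt(n)/n is the same as 10/sqrt(n)).

```r
# Hypothetical helper restating the documented default for stars.subsample.ratio.
# n is the sample size (number of rows of the data matrix).
default_subsample_ratio <- function(n) {
  if (n > 144) 10 * sqrt(n) / n else 0.8
}

default_subsample_ratio(100)  # 0.8, since n <= 144
default_subsample_ratio(400)  # 0.5, since 10 * 20 / 400 = 0.5
```

With this rule each subsample contains roughly 10*sqrt(n) observations once n exceeds 144.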

rep.num

The number of subsamplings when criterion = "stars" or rotations when criterion = "ric". The default value is 20. NOT applicable when criterion = "ebic".

verbose

If verbose = FALSE, printing of tracing information is disabled. The default value is TRUE.

Details

The stability approach to regularization selection (stars) is a natural way to select the optimal regularization parameter for all three estimation methods. It selects the optimal graph using the variability across subsamples and tends to overselect edges in Gaussian graphical models. Besides selecting the regularization parameter, stars can also provide an additional estimated graph by merging the corresponding subsampled graphs using the frequency counts.

Because the subsampling procedure in stars may not be very efficient, we also provide the recently developed and highly efficient rotation information criterion (ric). Instead of tuning over a grid by cross-validation or subsampling, ric directly estimates the optimal regularization parameter based on random rotations. ric usually has very good empirical performance but sometimes suffers from underselection. Therefore, users who are sensitive to false negative rates should consider either increasing rep.num or applying stars for model selection.

The extended Bayesian information criterion (ebic) is another competitive approach, but ebic.gamma can only be tuned by experience.
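
The subsampling idea behind stars can be sketched in base R. This is an illustrative sketch only, not the code used by huge.select: it assumes correlation thresholding as the graph estimator, uses the instability measure 2 * theta * (1 - theta) averaged over node pairs, and all variable names (lambdas, counts, opt_index, and so on) are made up for the example.

```r
# Sketch of stars-style selection with correlation thresholding (base R only).
set.seed(1)
n <- 200
d <- 10
x <- matrix(rnorm(n * d), n, d)
x[, 2] <- x[, 1] + 0.5 * rnorm(n)  # one strongly dependent pair of variables

lambdas <- seq(0.9, 0.05, length.out = 10)  # decreasing: sparse to dense graphs
rep_num <- 20
ratio <- if (n > 144) 10 * sqrt(n) / n else 0.8
m <- floor(n * ratio)  # subsample size

# For each threshold, count how often each edge is selected across subsamples
counts <- lapply(lambdas, function(l) matrix(0, d, d))
for (r in seq_len(rep_num)) {
  idx <- sample(n, m)  # subsample rows without replacement
  s <- abs(cor(x[idx, , drop = FALSE]))
  diag(s) <- 0
  for (k in seq_along(lambdas)) {
    counts[[k]] <- counts[[k]] + (s > lambdas[k])
  }
}

# Average instability over the d * (d - 1) / 2 node pairs at each threshold
variability <- sapply(counts, function(cnt) {
  theta <- cnt[upper.tri(cnt)] / rep_num  # edge selection frequencies
  mean(2 * theta * (1 - theta))
})

# Choose the last threshold before variability first reaches stars_thresh
stars_thresh <- 0.1
above <- which(variability >= stars_thresh)
opt_index <- if (length(above) == 0) length(lambdas) else max(above[1] - 1, 1)
opt_lambda <- lambdas[opt_index]
```

The edge frequencies in counts play the role of the merged graph path described above, and variability corresponds to the quantity that is compared against stars.thresh.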

Value

An object with S3 class "select" is returned:

refit

The optimal graph selected from the graph path.

opt.icov

The optimal precision matrix from the path. Only applicable when method = "glasso".

opt.cov

The optimal covariance matrix from the path. Only applicable when method = "glasso" and est$cov is available.

merge

The graph path estimated by merging the subsampling paths. Only applicable when the input criterion = "stars".

variability

The variability along the subsampling paths. Only applicable when the input criterion = "stars".

ebic.scores

Extended BIC scores for regularization parameter selection. Only applicable when criterion = "ebic".

opt.index

The index of the selected regularization parameter. NOT applicable when the input criterion = "ric".

opt.lambda

The selected regularization/thresholding parameter.

opt.sparsity

The sparsity level of "refit".

and anything else included in the input est
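
A minimal sketch of inspecting the returned fields, assuming the huge package is installed. It relies only on functions and fields named on this page; the variable names sim, est, and sel are arbitrary, and the verbose = FALSE arguments are used just to keep the console quiet.

```r
library(huge)

set.seed(123)
sim <- huge.generator(d = 20, graph = "hub", verbose = FALSE)
est <- huge(sim$data, method = "glasso", verbose = FALSE)
sel <- huge.select(est, criterion = "ebic", verbose = FALSE)

sel$opt.index      # position of the selected parameter on the path
sel$opt.lambda     # selected regularization parameter
sel$opt.sparsity   # sparsity level of the selected graph
dim(sel$refit)     # adjacency matrix of the selected graph, 20 x 20 here
dim(sel$opt.icov)  # precision matrix, present because method = "glasso"
```

Fields tied to another criterion, such as merge and variability for "stars", are not expected to be filled in by this call.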

Note

Model selection is NOT available when the input data is a sample covariance matrix.

See Also

huge and huge-package.

Examples

# load the package
library(huge)

# generate data from a hub graph with 20 variables
L = huge.generator(d = 20, graph = "hub")

# estimate graph paths with each of the three methods
out.mb = huge(L$data)
out.ct = huge(L$data, method = "ct")
out.glasso = huge(L$data, method = "glasso")

# model selection using ric
out.select = huge.select(out.mb)
plot(out.select)

# model selection using stars (commented out because subsampling is slower)
# out.select = huge.select(out.ct, criterion = "stars", stars.thresh = 0.05, rep.num = 10)
# plot(out.select)

# model selection using ebic (glasso only)
out.select = huge.select(out.glasso, criterion = "ebic")
plot(out.select)

Example output

Generating data from the multivariate normal distribution with the hub graph structure....done.
Conducting Meinshausen & Buhlmann graph estimation (mb)....done
Conducting the graph estimation via correlation thresholding (ct) ....in progress:5% 
Conducting the graph estimation via correlation thresholding (ct) ....in progress:10% 
Conducting the graph estimation via correlation thresholding (ct) ....in progress:15% 
Conducting the graph estimation via correlation thresholding (ct) ....in progress:20% 
Conducting the graph estimation via correlation thresholding (ct) ....in progress:25% 
Conducting the graph estimation via correlation thresholding (ct) ....in progress:30% 
Conducting the graph estimation via correlation thresholding (ct) ....in progress:35% 
Conducting the graph estimation via correlation thresholding (ct) ....in progress:40% 
Conducting the graph estimation via correlation thresholding (ct) ....in progress:45% 
Conducting the graph estimation via correlation thresholding (ct) ....in progress:50% 
Conducting the graph estimation via correlation thresholding (ct) ....in progress:55% 
Conducting the graph estimation via correlation thresholding (ct) ....in progress:60% 
Conducting the graph estimation via correlation thresholding (ct) ....in progress:65% 
Conducting the graph estimation via correlation thresholding (ct) ....in progress:70% 
Conducting the graph estimation via correlation thresholding (ct) ....in progress:75% 
Conducting the graph estimation via correlation thresholding (ct) ....in progress:80% 
Conducting the graph estimation via correlation thresholding (ct) ....in progress:85% 
Conducting the graph estimation via correlation thresholding (ct) ....in progress:90% 
Conducting the graph estimation via correlation thresholding (ct) ....in progress:95% 
Conducting the graph estimation via correlation thresholding (ct) ....in progress:100% 
Conducting the graph estimation via correlation thresholding (ct)....done.             

Conducting the graphical lasso (glasso) wtih lossless screening....in progress: 9%
Conducting the graphical lasso (glasso) wtih lossless screening....in progress: 19%
Conducting the graphical lasso (glasso) wtih lossless screening....in progress: 30%
Conducting the graphical lasso (glasso) wtih lossless screening....in progress: 40%
Conducting the graphical lasso (glasso) wtih lossless screening....in progress: 50%
Conducting the graphical lasso (glasso) wtih lossless screening....in progress: 60%
Conducting the graphical lasso (glasso) wtih lossless screening....in progress: 70%
Conducting the graphical lasso (glasso) wtih lossless screening....in progress: 80%
Conducting the graphical lasso (glasso) wtih lossless screening....in progress: 90%
Conducting the graphical lasso (glasso)....done.                                          
Conducting rotation information criterion (ric) selection....done
Computing the optimal graph....done
Conducting extended Bayesian information criterion (ebic) selection....done

huge documentation built on July 1, 2021, 1:06 a.m.