agnesLambdaSelection: AGNES regularization parameter selection
In ldstatsHD: Linear Dependence Statistics for High-Dimensional Data

agnesLambdaSelection selects the regularization parameter in graphical models. It chooses the most clustered conditional dependence graph structure, where clusters are defined by the hierarchical algorithm agnes (see Details).

agnesLambdaSelection(obj, way = "direct", nite = 10, subsvec = NULL,
                     eps = 0.05, until = NULL, minNodes = 30, 
                     distF = c("correlation","shortPath"))

`obj`	an object of class `huge`, `camel.tiger` or `wfgl`.
`way`	selection procedure, given as a name that uniquely identifies `"direct"` (default), `"rand.sampling"` (random subsets algorithm) or `"int.sampling"` (intelligent subsets algorithm).
`nite`	vector with the number of iterations used for each lambda (only if `way = "rand.sampling"` or `way = "int.sampling"`).
`subsvec`	vector with the number of subsamples used for each lambda (only if `way = "rand.sampling"` or `way = "int.sampling"`). If `NULL`, argument `minNodes` determines the number of subsamples for all lambdas.
`eps`	acceptance tolerance for subsets of variables.
`until`	the last path used in `obj`. If `NULL`, all paths are used to select lambda.
`minNodes`	minimum number of nodes with connections needed to compute the AGNES coefficient (the coefficient is zero for paths with fewer connected nodes than `minNodes`).
`distF`	distance function used to find the dissimilarity matrix from the graph: name that uniquely identifies `"correlation"` or `"shortPath"`.

The AGNES algorithm finds λ by minimizing the risk function

R_{AGNES}(λ) = -AC(λ)

where AC(λ) is the AGNES coefficient (the agglomerative coefficient returned by the R function agnes). Using AGNES, we select the λ that maximizes the ratio of between-cluster to within-cluster dissimilarities, given the dissimilarity matrix of the graph (see graphCorr and graphDist for possible dissimilarities).
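The selection rule can be illustrated with a minimal sketch that does not use ldstatsHD internals. It assumes three toy graphs standing in for three points on a lambda path, a hypothetical helper `shortPathDiss` (shortest-path dissimilarity, similar in spirit to `distF = "shortPath"`), and a made-up lambda grid. The only real library call is `cluster::agnes`, whose `ac` component is the average over observations of 1 - m(i), where m(i) is the dissimilarity at which observation i is first merged divided by the dissimilarity of the final merge.

```r
library(cluster)  # agnes(); a recommended package shipped with R

# Hypothetical helper: shortest-path dissimilarity from a 0/1 adjacency matrix
# (Floyd-Warshall). Unreachable pairs are capped at the longest finite path + 1;
# that capping rule is an assumption made for this sketch.
shortPathDiss <- function(adj) {
  D <- ifelse(adj != 0, 1, Inf)
  diag(D) <- 0
  for (k in seq_len(nrow(D))) {
    D <- pmin(D, outer(D[, k], D[k, ], "+"))
  }
  D[is.infinite(D)] <- max(D[is.finite(D)]) + 1
  D
}

# Build a symmetric adjacency matrix from a two-column edge list
edgesToAdj <- function(edges, p) {
  adj <- matrix(0, p, p)
  adj[edges] <- 1
  adj[edges[, 2:1]] <- 1
  adj
}

p <- 8
clique <- function(nodes) t(combn(nodes, 2))
graphs <- list(
  dense     = edgesToAdj(rbind(clique(1:4), clique(5:8), cbind(1:4, 5:8)), p),
  clustered = edgesToAdj(rbind(clique(1:4), clique(5:8), c(4, 5)), p),
  chain     = edgesToAdj(cbind(1:7, 2:8), p)
)
lambdas <- c(0.2, 0.4, 0.6)  # made-up grid, one value per toy graph

# AC(lambda) for each graph, then the risk R_AGNES(lambda) = -AC(lambda)
ac <- vapply(graphs,
             function(g) agnes(as.dist(shortPathDiss(g)), diss = TRUE)$ac,
             numeric(1))
risk <- -ac
optLambda <- lambdas[which.min(risk)]  # same as maximizing AC
```

Minimizing `risk` and maximizing `ac` pick the same grid point; the sketch omits the `minNodes` rule, which sets the coefficient to zero for graphs with too few connected nodes.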

A variable subset selection algorithm is available to estimate AC(λ) for very high-dimensional data. It is recommended in order to save memory and computational time, especially way = "int.sampling", which tends to find lambda selections similar to those of the default procedure.
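The idea of estimating the coefficient from subsets of variables can be sketched as follows. This is a simplified stand-in, not the package's `"rand.sampling"` or `"int.sampling"` algorithm: the function `subsetAC`, its arguments `m` and `nite`, the simulated data, and the dissimilarity 1 - |correlation| are all assumptions made for illustration. Only `cluster::agnes` is a real library call.

```r
library(cluster)  # agnes(); a recommended package shipped with R

set.seed(1)
# Toy data: p = 60 variables in three blocks of 20 sharing a block-level factor
p <- 60
block <- rep(1:3, each = 20)
x <- matrix(rnorm(100 * p), 100, p) + matrix(rnorm(100 * 3), 100, 3)[, block]

# Assumed dissimilarity between variables, for illustration only
diss <- 1 - abs(cor(x))

# Coefficient computed on all variables at once
acFull <- agnes(as.dist(diss), diss = TRUE)$ac

# Hypothetical subset estimator: average the coefficient over `nite`
# random subsets of `m` variables, so each agnes call sees an m x m matrix
subsetAC <- function(diss, m, nite) {
  mean(vapply(seq_len(nite), function(i) {
    idx <- sample(nrow(diss), m)
    agnes(as.dist(diss[idx, idx]), diss = TRUE)$ac
  }, numeric(1)))
}
acSub <- subsetAC(diss, m = 30, nite = 10)
```

Each call clusters a 30 x 30 dissimilarity matrix instead of a 60 x 60 one, which is where the memory and time savings come from. Because the agglomerative coefficient tends to grow with the number of objects, subset estimates are best compared across lambdas at a fixed subset size rather than against the full-data value.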

agnesLambdaSelection gives a good recovery of global network characteristics when the true partial correlation matrix is block diagonal.

An object of class lambdaSelection containing the following components:

`opt.lambda`	optimal lambda.
`crit.coef`	coefficients for each lambda given the criterion AGNES.
`criterion`	with value `"AGNES"`.

Caballe, Adria <a.caballe@sms.ed.ac.uk>, Natalia Bochkina and Claus Mayer.

Caballe, A., N. Bochkina, and C. Mayer (2016). Selection of the Regularization Parameter in Graphical Models using Network Characteristics. eprint arXiv:1509.05326, 1-25.

lambdaSelection for other lambda selection approaches and agnes for clustering implementation.

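# example: AGNES-based lambda selection on simulated data
library(ldstatsHD)  # pcorSimulator, agnesLambdaSelection
library(huge)       # huge

# simulate 70 observations from a network with three clusters
EX1         <- pcorSimulator(nobs = 70, nclusters = 3, nnodesxcluster = c(40,30,20), 
                             pattern = "powerLaw")
y           <- EX1$y

# estimate the graph path over a grid of 40 lambda values
Lambda.SEQ  <- seq(.25, 0.70, length.out=40)
out3        <- huge(y, method = "mb", lambda = Lambda.SEQ)

# select lambda using shortest-path dissimilarities and the direct procedure
AG.COEF     <- agnesLambdaSelection(out3, distF = "shortPath", way = "direct")
print(AG.COEF)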