weighted fused graphical lasso

Description

wfgl jointly estimates partial correlation matrices from two multivariate normally distributed datasets using an ADMM-based algorithm. Paired data are supported.

Usage

wfgl(D1, D2, lambda1, lambda2, paired = TRUE, automLambdas = TRUE, 
     sigmaEstimate = "CRmad", pairedEst = "Reg-based-sim", maxiter = 30, 
     tol = 1e-05, nsubset = 10000, weights = c(1, 1), rho = 1, rho.increment = 1, 
     triangleCorrection = TRUE, alphaTri = 0.01, temporalFolders = FALSE, 
     notOnlyLambda2 = TRUE)

Arguments

D1

first population dataset in matrix n_1 \times p form.

D2

second population dataset in matrix n_2 \times p form.

lambda1

tuning parameter for sparsity in the precision matrices (sequence of lambda1 is allowed).

lambda2

tuning parameter for similarity between the precision matrices in the two populations (only one value allowed at a time).

paired

if TRUE, observations in D1 and D2 are assumed to be matched (n_1 must be equal to n_2).

automLambdas

if TRUE, the lambdas are estimated automatically, and lambda1 and lambda2 are interpreted as expected false positive rate levels.

sigmaEstimate

robust method used to estimate the variance of estimated partial correlations: name that uniquely identifies "mad", "IQR" or "CRmad" (default). This measure is used to automatically select the tuning parameter (when automLambdas = TRUE).

pairedEst

type of estimator for the correlation of estimated partial correlation coefficients when paired = TRUE: either "Reg-based" or "Reg-based-sim" (default). This measure is used to weight the similarity penalty lambda2 for different pairs of variables.

maxiter

maximum number of iterations for the ADMM algorithm.

tol

convergence tolerance for the ADMM algorithm.

nsubset

maximum number of estimated partial correlation coefficients (chosen randomly) used to select lambda1 and lambda2 automatically (when automLambdas = TRUE).

weights

weights for the two populations to find the inverse covariance matrices.

rho

regularization parameter used to compute the matrix inverse by eigenvalue decomposition (default of 1).

rho.increment

default of 1.

triangleCorrection

if TRUE the estimated triangle graph structures are tested.

alphaTri

significance level for the tested triangle graph structures.

temporalFolders

if TRUE, temporary files are created and deleted within the procedure. This frees R memory when the dimension is very large (on the order of thousands).

notOnlyLambda2

if FALSE only lambda2 is found automatically.

Details

wfgl uses a weighted-fused graphical lasso maximum likelihood estimator by solving:

\hat{Ω}_{WFGL}^{λ} = \arg\max\limits_{Ω_X,Ω_Y} \left[ ∑_{k=X,Y} \left( \log\det Ω_k - tr(Ω_k S_k) \right) - P_{λ_1,λ_2,V}(Ω_X,Ω_Y) \right],

with

P_{λ_1,λ_2, V}(Ω_X,Ω_Y) = λ_1||Ω_X||_1 + λ_1||Ω_Y||_1 +λ_2∑_{i,j} v_{ij} |Ω_{Y_{ij}}-Ω_{X_{ij}}|,

where λ_1 is the sparsity tuning parameter, λ_2 is the similarity tuning parameter, and V = [v_{ij}] is a p\times p matrix that weights λ_2 for each coefficient of the differential precision matrix. If the datasets are independent (paired = FALSE), v_{ij} = 1 is assumed for all pairs (i,j). Otherwise (paired = TRUE), the weights are estimated to account for the dependence structure between the datasets in the differential network estimation.
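To make the objective concrete, the following base R sketch evaluates the penalized log-likelihood for a given pair of precision matrices. The helper name wfgl_objective is hypothetical and not part of the package, and the sketch assumes that ||Ω||_1 is the element-wise L1 norm including the diagonal, which the text above does not state explicitly.

```r
# Hypothetical helper (not part of the package): evaluates the WFGL objective
# for candidate precision matrices OmegaX and OmegaY.
# Assumption: the L1 norm is element-wise and includes the diagonal.
wfgl_objective <- function(OmegaX, OmegaY, SX, SY, lambda1, lambda2,
                           V = matrix(1, nrow(OmegaX), ncol(OmegaX))) {
  # Gaussian log-likelihood term: log det(Omega) - tr(Omega S)
  loglik <- function(Omega, S) {
    as.numeric(determinant(Omega, logarithm = TRUE)$modulus) -
      sum(diag(Omega %*% S))
  }
  # Sparsity penalty on both matrices plus weighted similarity penalty
  penalty <- lambda1 * sum(abs(OmegaX)) + lambda1 * sum(abs(OmegaY)) +
    lambda2 * sum(V * abs(OmegaY - OmegaX))
  loglik(OmegaX, SX) + loglik(OmegaY, SY) - penalty
}

set.seed(1)
p <- 4
SX <- cov(matrix(rnorm(50 * p), 50, p))
SY <- cov(matrix(rnorm(50 * p), 50, p))

# With identity precision matrices the log-determinants are zero and the
# similarity penalty vanishes, so only the traces and lambda1 terms remain.
obj <- wfgl_objective(diag(p), diag(p), SX, SY, lambda1 = 0.1, lambda2 = 0.2)
```

With V left at its default of all ones, this corresponds to the independent case (paired = FALSE) described above.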

If automLambdas = TRUE, the lambdas are estimated in each iteration by controlling the expected false positive rate (EFPR). This turns the problem of selecting the tuning parameters λ_1 and λ_2 into that of choosing the desired EFPR. If lambda2 is a single value and lambda1 is a vector of several values, the lambda selection approaches implemented in lambdaSelection can also be used.

If triangleCorrection = TRUE, the weakest edges of the estimated triangular motifs are tested further. The reason is that edges completing a triangular graph structure tend to be overestimated by the ADMM algorithm, which relies on regularized inverse procedures.
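As an illustration of what a triangular motif and its weakest edge look like, the base R sketch below lists every triangle in an adjacency matrix and returns the edge with the smallest absolute weight in each. The helper weakest_triangle_edges is hypothetical, and defining "weakest" as the smallest absolute coefficient is an assumption. The sketch does not reproduce the statistical test that wfgl applies to those edges.

```r
# Hypothetical helper (not part of the package).
# A: symmetric 0/1 adjacency matrix (at least 3 nodes);
# W: symmetric matrix of edge weights, e.g. estimated partial correlations.
# Assumption: the "weakest" edge is the one with the smallest |W|.
weakest_triangle_edges <- function(A, W) {
  p <- nrow(A)
  out <- list()
  for (i in seq_len(p - 2)) {
    for (j in (i + 1):(p - 1)) {
      for (k in (j + 1):p) {
        if (A[i, j] != 0 && A[j, k] != 0 && A[i, k] != 0) {
          edges <- rbind(c(i, j), c(j, k), c(i, k))
          strength <- abs(W[edges])  # matrix indexing: one weight per edge
          out[[length(out) + 1]] <- edges[which.min(strength), ]
        }
      }
    }
  }
  do.call(rbind, out)  # one row (node pair) per triangle, NULL if none
}

# Toy graph: triangle 1-2-3 plus a pendant edge 3-4
A <- matrix(0, 4, 4)
A[1, 2] <- A[2, 3] <- A[1, 3] <- A[3, 4] <- 1
A <- A + t(A)
W <- matrix(0, 4, 4)
W[1, 2] <- 0.5; W[2, 3] <- 0.1; W[1, 3] <- 0.8; W[3, 4] <- 0.4
W <- W + t(W)

res <- weakest_triangle_edges(A, W)
```

In this toy graph the only triangle is 1-2-3, and its weakest edge is the one between nodes 2 and 3.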

Value

An object of class wfgl containing the following components:

path

adjacency matrices.

omega

precision matrices.

triangleCorrection

determines if triangle structures are tested.

weakTriangEdges

weakest edges in triangle structures which have been tested.

weakTriangEdgesPval

p-values for the weakest edge in triangle structures.

diff_value

convergence control.

iters

number of iterations used.

corEst

estimated dependence-structure measure used to account for dependence between the datasets.
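Precision matrices and partial correlations are linked by the standard identity ρ_{ij} = -ω_{ij} / \sqrt{ω_{ii} ω_{jj}}. The base R sketch below applies it to a plain matrix. The helper precision_to_pcor is hypothetical, and how the omega component is stored inside a wfgl object is not specified here, so extracting a single matrix from it is left to the reader.

```r
# Hypothetical helper (not part of the package): converts a precision matrix
# into a partial correlation matrix via rho_ij = -omega_ij / sqrt(omega_ii * omega_jj).
precision_to_pcor <- function(Omega) {
  pc <- -cov2cor(Omega)  # cov2cor scales each entry by sqrt(omega_ii * omega_jj)
  diag(pc) <- 1          # by convention the diagonal is set to 1
  pc
}

# Toy tridiagonal precision matrix: variables 1 and 3 are conditionally independent
Omega <- matrix(c( 2, -1,  0,
                  -1,  2, -1,
                   0, -1,  2), nrow = 3, byrow = TRUE)
pcor <- precision_to_pcor(Omega)
```

A zero entry in the precision matrix maps to a zero partial correlation, so the sparsity pattern, and hence the adjacency structure, is preserved.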

Author(s)

Caballe, Adria <a.caballe@sms.ed.ac.uk>, Natalia Bochkina and Claus Mayer.

References

Danaher, P., P. Wang, and D. Witten (2014). The joint graphical lasso for inverse covariance estimation across multiple classes. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 76(2), 373-397.

Boyd, S. (2010). Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers. Foundations and Trends in Machine Learning 3(1), 1-122.

See Also

plot.wfgl for graphical representation.
wfrl for weighted fused regression lasso.

Examples

# Example use of wfgl
# (assumes the ldstatsHD package, which provides wfgl and pcorSimulatorJoint, is installed)
library(ldstatsHD)

# Simulate two datasets (D1 and D2) with 3 clusters of 30 variables each
EX2 <- pcorSimulatorJoint(nobs = 50, nclusters = 3, nnodesxcluster = c(30, 30, 30),
                          pattern = "pow", diffType = "cluster", dataDepend = "diag",
                          low.strength = 0.5, sup.strength = 0.9, pdiff = 0.5, nhubs = 5,
                          degree.hubs = 20, nOtherEdges = 30, alpha = 2.3, plus = 0,
                          prob = 0.05, perturb.clust = 0.2, mu = 0, diagCCtype = "dicot",
                          diagNZ.strength = 0.6, mixProb = 0.5, probSign = 0.7,
                          exactZeroTh = 0.05)

## Not run: fit the paired model; lambda1 and lambda2 act as EFPR levels here
# wfgl1 <- wfgl(EX2$D1, EX2$D2, lambda1 = 0.05, lambda2 = 0.1, paired = TRUE,
#               automLambdas = TRUE, sigmaEstimate = "CRmad", pairedEst = "Reg-based-sim",
#               maxiter = 30)
# print(wfgl1)
