# wfgl: weighted fused graphical lasso

### Description

wfgl jointly estimates partial correlation matrices from two multivariate normally distributed datasets using an ADMM-based algorithm. Paired data are supported.

### Usage

    wfgl(D1, D2, lambda1, lambda2, paired = TRUE, automLambdas = TRUE,
         sigmaEstimate = "CRmad", pairedEst = "Reg-based-sim", maxiter = 30,
         tol = 1e-05, nsubset = 10000, weights = c(1,1), rho = 1,
         rho.increment = 1, triangleCorrection = TRUE, alphaTri = 0.01,
         temporalFolders = FALSE, notOnlyLambda2 = TRUE)

### Arguments

| Argument | Description |
| --- | --- |
| D1 | first population dataset, as an n_1 \times p matrix. |
| D2 | second population dataset, as an n_2 \times p matrix. |
| lambda1 | tuning parameter for sparsity in the precision matrices (a sequence of lambda1 values is allowed). |
| lambda2 | tuning parameter for similarity between the precision matrices of the two populations (only one value allowed at a time). |
| paired | if TRUE, observations in D1 and D2 are assumed to be matched (n_1 must equal n_2). |
| automLambdas | if TRUE, the lambdas are estimated automatically, with lambda1 and lambda2 interpreted as expected false positive rate levels. |
| sigmaEstimate | robust method used to estimate the variance of the estimated partial correlations: a name that uniquely identifies "mad", "IQR" or "CRmad" (default). This measure is used to select the tuning parameters automatically (when automLambdas = TRUE). |
| pairedEst | type of estimator for the correlation of estimated partial correlation coefficients when paired = TRUE: "Reg-based" or "Reg-based-sim" (default). This measure is used to weight the similarity penalization lambda2 for different pairs of variables. |
| maxiter | maximum number of iterations for the ADMM algorithm. |
| tol | convergence tolerance. |
| nsubset | maximum number of estimated partial correlation coefficients (chosen randomly) used to select lambda1 and lambda2 automatically (when automLambdas = TRUE). |
| weights | weights for the two populations, used when finding the inverse covariance matrices. |
| rho | regularization parameter used to compute the matrix inverse by eigenvalue decomposition (default of 1). |
| rho.increment | default of 1. |
| triangleCorrection | if TRUE, the estimated triangle graph structures are tested. |
| alphaTri | significance level for the tested triangle graph structures. |
| temporalFolders | if TRUE, temporary files are created and deleted within the procedure. This frees R memory when the dimension is very large (order of thousands). |
| notOnlyLambda2 | if FALSE, only lambda2 is found automatically. |

### Details

wfgl uses a weighted-fused graphical lasso maximum likelihood estimator by solving:

\hat{Ω}_{WFGL}^{λ} = \arg\max\limits_{Ω_X,Ω_Y} \left[ ∑_{k=X,Y} \left( \log\det Ω_k - \mathrm{tr}(Ω_k S_k) \right) - P_{λ_1,λ_2,V}(Ω_X,Ω_Y) \right],

with

P_{λ_1,λ_2, V}(Ω_X,Ω_Y) = λ_1||Ω_X||_1 + λ_1||Ω_Y||_1 +λ_2∑_{i,j} v_{ij} |Ω_{Y_{ij}}-Ω_{X_{ij}}|,

where λ_1 is the sparsity tuning parameter, λ_2 is the similarity tuning parameter, and V = [v_{ij}] is a p\times p matrix that weights λ_2 for each coefficient of the differential precision matrix. If the datasets are independent (paired = FALSE), v_{ij} = 1 is assumed for all pairs (i,j). Otherwise (paired = TRUE), the weights are estimated to account for the dependence structure between the datasets in the differential network estimation.
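To make the objective concrete, the following base R sketch evaluates the penalty and the penalized log-likelihood for given matrices. The function names `wfgl_penalty` and `wfgl_objective` are hypothetical and are not part of the package. The sketch reads ||Ω||_1 literally as the sum of absolute values of all entries, diagonal included, which is an assumption; the package may treat the diagonal differently.

```r
# Illustrative only: these helpers are NOT part of ldstatsHD.
# Assumption: ||Omega||_1 is the entrywise L1 norm, diagonal included.
wfgl_penalty <- function(omega_x, omega_y, lambda1, lambda2,
                         V = matrix(1, nrow(omega_x), ncol(omega_x))) {
  sparsity   <- lambda1 * (sum(abs(omega_x)) + sum(abs(omega_y)))
  similarity <- lambda2 * sum(V * abs(omega_y - omega_x))
  sparsity + similarity
}

wfgl_objective <- function(omega_x, omega_y, s_x, s_y, lambda1, lambda2,
                           V = matrix(1, nrow(omega_x), ncol(omega_x))) {
  # log det(Omega) - tr(Omega S) for one population
  loglik <- function(omega, s) {
    as.numeric(determinant(omega, logarithm = TRUE)$modulus) -
      sum(diag(omega %*% s))
  }
  loglik(omega_x, s_x) + loglik(omega_y, s_y) -
    wfgl_penalty(omega_x, omega_y, lambda1, lambda2, V)
}

# Toy 2 x 2 example with identity sample covariances
omega_x <- diag(2)
omega_y <- matrix(c(1, 0.5, 0.5, 1), 2, 2)
pen <- wfgl_penalty(omega_x, omega_y, lambda1 = 0.1, lambda2 = 0.2)
obj <- wfgl_objective(omega_x, omega_y, diag(2), diag(2),
                      lambda1 = 0.1, lambda2 = 0.2)
```

With v_{ij} = 1 everywhere, as in the unpaired case, the similarity term reduces to λ_2 times the entrywise L1 norm of Ω_Y - Ω_X.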

When automLambdas = TRUE, the lambdas are estimated in each iteration by controlling the expected false positive rate (EFPR). This turns the problem of selecting the tuning parameters λ_1 and λ_2 into that of choosing the desired EFPR. If lambda2 is a single value and lambda1 is a vector of several values, the lambda selection approaches implemented in lambdaSelection can also be used.

If triangleCorrection = TRUE, the weakest edges of the estimated triangular motifs are further tested. The reason is that edges that complete triangular graph structures tend to be overestimated by the ADMM because it uses regularized inverse procedures.
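As a rough illustration of what "the weakest edge of a triangular motif" means, the base R sketch below lists the triangles in an adjacency matrix and reports, for each, the edge with the smallest absolute partial correlation. The helper `weakest_triangle_edges` is hypothetical; it only identifies candidate edges and does not reproduce the package's test or its p-values. It enumerates all node triples, so it is meant for small graphs only.

```r
# Illustrative only: NOT the package's triangle test.
# adj:  p x p symmetric adjacency matrix (non-zero entry = edge)
# pcor: p x p symmetric matrix of estimated partial correlations
weakest_triangle_edges <- function(adj, pcor) {
  stopifnot(nrow(adj) >= 3, all(dim(adj) == dim(pcor)))
  triples <- combn(nrow(adj), 3)  # each column is a node triple i < j < k
  res <- lapply(seq_len(ncol(triples)), function(col) {
    v <- triples[, col]
    # The three candidate edges of the triple, one (row, column) pair per row
    edges <- rbind(v[c(1, 2)], v[c(1, 3)], v[c(2, 3)])
    if (!all(adj[edges] != 0)) return(NULL)  # not a triangle
    strength <- abs(pcor[edges])
    weakest <- edges[which.min(strength), ]
    data.frame(i = v[1], j = v[2], k = v[3],
               weak_from = weakest[1], weak_to = weakest[2],
               strength = min(strength))
  })
  do.call(rbind, Filter(Negate(is.null), res))
}

# Toy graph: triangle 1-2-3 plus a pendant edge 3-4
pcor <- matrix(0, 4, 4)
pcor[1, 2] <- 0.6; pcor[1, 3] <- -0.5; pcor[2, 3] <- 0.1; pcor[3, 4] <- 0.4
pcor <- pcor + t(pcor)
adj <- (pcor != 0) * 1
tri <- weakest_triangle_edges(adj, pcor)
```

In this toy graph the only triangle is (1, 2, 3), and the edge 2-3 is the one that would be a candidate for further testing.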

### Value

An object of class wfgl containing the following components:

| Component | Description |
| --- | --- |
| path | adjacency matrices. |
| omega | precision matrices. |
| triangleCorrection | determines if triangle structures are tested. |
| weakTriangEdges | weakest edges in triangle structures which have been tested. |
| weakTriangEdgesPval | p-values for the weakest edge in triangle structures. |
| diff_value | convergence control. |
| iters | number of iterations used. |
| corEst | estimated dependence structure measure used in the estimation to account for dependent datasets. |
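Since the function is described in terms of partial correlations while omega holds precision matrices, the standard conversion ρ_{ij} = -ω_{ij} / \sqrt{ω_{ii} ω_{jj}} may be useful. The base R sketch below applies it to a single precision matrix. The helper `precision_to_pcor` is hypothetical, and how the individual matrices are stored inside the omega component is not specified here, so extracting one matrix from a fitted object is left to the reader.

```r
# Illustrative only: convert one precision matrix to partial correlations.
precision_to_pcor <- function(omega) {
  pcor <- -cov2cor(omega)  # -omega_ij / sqrt(omega_ii * omega_jj)
  diag(pcor) <- 1          # diagonal set to 1 by convention
  pcor
}

# Toy tridiagonal precision matrix: nodes 1 and 3 are conditionally independent
omega <- matrix(c( 2, -1,  0,
                  -1,  2, -1,
                   0, -1,  2), 3, 3, byrow = TRUE)
pc <- precision_to_pcor(omega)
```

A zero entry in the precision matrix maps to a zero partial correlation, so the sparsity pattern, and hence the graph, is unchanged by the conversion.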

### Author(s)

Caballe, Adria <a.caballe@sms.ed.ac.uk>, Natalia Bochkina and Claus Mayer.

### References

Danaher, P., P. Wang, and D. Witten (2014). The joint graphical lasso for inverse covariance estimation across multiple classes. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 76(2), 373-397.

Boyd, S. (2010). Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers. Foundations and Trends in Machine Learning 3(1), 1-122.

### See Also

plot.wfgl for graphical representation.
wfrl for weighted fused regression lasso.

### Examples

    # example use of wfgl
    EX2 <- pcorSimulatorJoint(nobs = 50, nclusters = 3, nnodesxcluster = c(30, 30, 30),
                              pattern = "pow", diffType = "cluster", dataDepend = "diag",
                              low.strength = 0.5, sup.strength = 0.9, pdiff = 0.5,
                              nhubs = 5, degree.hubs = 20, nOtherEdges = 30,
                              alpha = 2.3, plus = 0, prob = 0.05, perturb.clust = 0.2,
                              mu = 0, diagCCtype = "dicot", diagNZ.strength = 0.6,
                              mixProb = 0.5, probSign = 0.7, exactZeroTh = 0.05)
    ## not run
    # wfgl1 <- wfgl(EX2$D1, EX2$D2, lambda1 = 0.05, lambda2 = 0.1, paired = TRUE,
    #               automLambdas = TRUE, sigmaEstimate = "CRmad",
    #               pairedEst = "Reg-based-sim", maxiter = 30)
    # print(wfgl1)
