lrpsadmm.path: Compute the LRpS estimator along a path (for a fixed value of...


View source: R/fit_path.R

Description

The penalty for the LRpS estimator is written as λ_1 ||S||_1 + λ_2 Trace(L) in the objective function of lrpsadmm. This can be equivalently rewritten in terms of the regularisation parameters λ and γ as follows

λ γ ||S||_1 + λ (1 - γ) Trace(L),

for γ ∈ (0, 1). This function estimates the path of the estimator for a fixed value of γ (which controls the trade-off between the two penalties) by varying the value of λ. See the documentation of lrpsadmm and references therein for more details.
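As a small illustration of the reparametrisation above, the two forms of the penalty can be converted into one another in a few lines of R. The helper names to_l1_l2 and to_lambda_gamma are hypothetical and are not part of the lrpsadmm package; this is only a sketch of the arithmetic.

```r
# Hypothetical helpers (not part of lrpsadmm): convert between the
# (lambda, gamma) and (lambda_1, lambda_2) parametrisations of the penalty.
to_l1_l2 <- function(lambda, gamma) {
  stopifnot(lambda > 0, gamma > 0, gamma < 1)
  list(l1 = lambda * gamma, l2 = lambda * (1 - gamma))
}

to_lambda_gamma <- function(l1, l2) {
  stopifnot(l1 > 0, l2 > 0)
  # lambda * gamma + lambda * (1 - gamma) = lambda
  lambda <- l1 + l2
  list(lambda = lambda, gamma = l1 / lambda)
}

pen  <- to_l1_l2(lambda = 0.5, gamma = 0.1)
back <- to_lambda_gamma(pen$l1, pen$l2)
```

With gamma fixed, scaling lambda scales both penalties by the same factor, which is why a path over lambda alone is well defined.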

Usage

lrpsadmm.path(Sigma, gamma, lambdas = NULL, lambda.max = NULL,
  lambda.ratio = 1e-04, n.lambdas = 20, max.sparsity = 0.5,
  max.rank = NA, rel_tol = 0.01, abs_tol = 1e-04, max.iter = 2000,
  mu = 1, zeros = NULL, verbose = FALSE, backend = "RcppEigen")

Arguments

Sigma

A p x p matrix. An estimate of the correlation matrix.

gamma

A real between 0 and 1. The value of the tuning parameter gamma in the parametrisation of the penalty described above. This is the trade-off between the sparse and trace penalties.

lambdas

A decreasing sequence of values of lambda. See Details for the default value.

lambda.max

A positive real. Maximum value of lambda. See Details.

lambda.ratio

A real between 0 and 1. The smallest value of lambda is given by lambda.max * lambda.ratio. See Details.

n.lambdas

A positive integer. The number of values of lambda to generate according to a geometric sequence between lambda.max and lambda.max * lambda.ratio. See Details.

max.sparsity

A real between 0 and 1. Abort the computation of the path if S becomes denser than this value.

max.rank

A real between 0 and 1. Abort the computation of the path if the rank of L becomes higher than this value.

rel_tol

rel_tol parameter of the lrpsadmm function.

abs_tol

abs_tol parameter of the lrpsadmm function.

max.iter

max.iter parameter of the lrpsadmm function.

mu

mu parameter of the lrpsadmm function.

zeros

A p x p matrix with entries set to 0 or 1. Wherever its entries are 0, the entries of the estimated S will be forced to 0.

verbose

A boolean. Whether to print the value of lambda, gamma, sparsity of S and rank of L after each fit.

backend

The backend parameter of lrpsadmm. It is one of 'R' or 'RcppEigen'.

Details

The function lrpsadmm is fitted for successive values of λ using warm starts. The sequence of values of λ can be provided directly by the user; it is automatically sorted in decreasing order. By default, a decreasing sequence of n.lambdas values (20 by default) within a reasonable range is selected as follows. We set λ_max = max_{i ≠ j} |Σ_ij| / γ and λ_min = λ_max * lambda.ratio; then n.lambdas values between λ_max and λ_min are taken following a geometric progression.
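The default grid described above can be sketched as follows. The function name default_lambdas is hypothetical, and the log-scale spacing is one common way of building a geometric progression; the package may compute its grid differently.

```r
# Sketch of the default grid described above. default_lambdas is a
# hypothetical name, not a function exported by lrpsadmm.
default_lambdas <- function(Sigma, gamma, lambda.ratio = 1e-04, n.lambdas = 20) {
  off.diag   <- Sigma[row(Sigma) != col(Sigma)]  # drop the diagonal entries
  lambda.max <- max(abs(off.diag)) / gamma
  lambda.min <- lambda.max * lambda.ratio
  # Geometric progression: equally spaced on the log scale, decreasing
  exp(seq(log(lambda.max), log(lambda.min), length.out = n.lambdas))
}

# Small made-up correlation matrix, for illustration only
Sigma <- matrix(c( 1.0, 0.3, -0.6,
                   0.3, 1.0,  0.2,
                  -0.6, 0.2,  1.0), nrow = 3, byrow = TRUE)
lambdas <- default_lambdas(Sigma, gamma = 0.2, lambda.ratio = 0.01, n.lambdas = 5)
```

Here the largest off-diagonal absolute value is 0.6, so the grid runs from 0.6 / 0.2 = 3 down to 3 * 0.01 = 0.03, with a constant ratio between consecutive values.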

Because it does not make much sense to fit this estimator when the sparse estimate S becomes too dense or if the rank of the low-rank estimate L becomes too high, the computation of the path is aborted when the sparsity of S reaches max.sparsity or when the rank of L reaches max.rank.
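The warm-start loop and the stopping rule described above might look roughly like the sketch below. It does not call lrpsadmm: toy_fit is a made-up stand-in whose output merely gets denser and higher-rank as lambda decreases, and fit_path_sketch is a hypothetical name. Whether the real function keeps the fit that triggers the stop, and whether it compares with ">" or ">=", are assumptions here.

```r
# toy_fit is a stand-in for a real fit (it is NOT lrpsadmm): a smaller
# lambda gives a denser S and a higher-rank L. `init` mimics a warm start.
toy_fit <- function(lambda, init = NULL) {
  list(lambda   = lambda,
       sparsity = min(1, 0.04 / lambda),
       rank.L   = min(10, floor(1 / lambda)))
}

fit_path_sketch <- function(lambdas, max.sparsity = 0.5, max.rank = NA) {
  lambdas  <- sort(lambdas, decreasing = TRUE)  # largest penalty first
  path     <- list()
  previous <- NULL
  for (lambda in lambdas) {
    fit <- toy_fit(lambda, init = previous)     # warm start from the last fit
    path[[length(path) + 1]] <- fit
    previous  <- fit
    too.dense <- fit$sparsity >= max.sparsity
    too.high  <- !is.na(max.rank) && fit$rank.L >= max.rank
    if (too.dense || too.high) break            # abort the rest of the path
  }
  path
}

lambdas     <- c(0.05, 1, 0.2, 0.5, 0.1)
path.sparse <- fit_path_sketch(lambdas, max.sparsity = 0.3)
path.rank   <- fit_path_sketch(lambdas, max.sparsity = 1, max.rank = 2)
```

In this toy run the first path stops at lambda = 0.1, where the stand-in sparsity is 0.4, and the second stops at lambda = 0.5, where the stand-in rank reaches 2.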

Value

An object of class lrpsaddmpath. This is essentially a list (see examples). Each element is itself a list with keys:

lambda

Value of lambda used for that fit. Recall that the tuning parameters λ_1, λ_2 are given by λ_1 = lambda * gamma and λ_2 = lambda * (1 - gamma).

gamma

Value of gamma used for that fit.

lambda1

Corresponds to the parameter l1 given as argument to the lrpsadmm function. lambda1 = lambda * gamma.

lambda2

Corresponds to the parameter l2 given as argument to the lrpsadmm function. lambda2 = lambda * (1 - gamma).

number.of.edges

Number of edges in the estimated sparse graphical model.

rank.L

Rank of the estimated low-rank matrix L.

sparsity

Sparsity of the estimated sparse matrix. This is the fraction of entries that are non-zero.

fit

An object of class lrpsadmm. This is the outcome of calling lrpsadmm with tuning parameters l1 = lambda * gamma and l2 = lambda * (1 - gamma). See the documentation of the function lrpsadmm for more information.
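Since the returned object is essentially a list of lists with the keys above, the scalar entries can be gathered into a table for inspection. The sketch below uses a hand-written mock.path with made-up numbers (and without the fit element) rather than a real output; path_summary is a hypothetical helper, not part of the package.

```r
# mock.path imitates the documented structure (a list of lists). The
# numbers are made up for illustration and the `fit` element is omitted.
mock.path <- list(
  list(lambda = 0.9, gamma = 0.1, lambda1 = 0.09, lambda2 = 0.81,
       number.of.edges = 12, rank.L = 1, sparsity = 0.002),
  list(lambda = 0.5, gamma = 0.1, lambda1 = 0.05, lambda2 = 0.45,
       number.of.edges = 40, rank.L = 3, sparsity = 0.008)
)

# Hypothetical helper: one row per fit, keeping only the scalar keys
path_summary <- function(path) {
  keys <- c("lambda", "gamma", "lambda1", "lambda2",
            "number.of.edges", "rank.L", "sparsity")
  rows <- lapply(path, function(el) as.data.frame(el[keys]))
  do.call(rbind, rows)
}

tab <- path_summary(mock.path)
```

On a real path the same helper would simply skip the fit element, because only the named scalar keys are selected.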

Examples

library(lrpsadmm)

set.seed(0)
# Generate data with a well-powered dataset
sim.data <- generate.latent.ggm.data(n=2000, p=100, h=5, outlier.fraction = 0.0,
                                     sparsity = 0.02, sparsity.latent = 0.7)
X <- sim.data$obs.data
Sigma <- cor(X) # Sample correlation matrix


gamma <- 0.1 # Some reasonable value for gamma
# We ask for 30 lambdas, but the sparse graph becomes too dense so the
# computation is stopped.
my.path <- lrpsadmm.path(Sigma = Sigma, gamma = gamma,
                         lambda.ratio = 1e-03, n.lambdas = 30,
                         verbose = TRUE, rel_tol = 1e-04, abs_tol=1e-06)

# This time let us again ask for 30 values,
# but let us narrow down the range by setting lambda.max
# and using a larger lambda.ratio
my.path <- lrpsadmm.path(Sigma = Sigma, gamma = gamma,
                         lambda.max = 0.96, lambda.ratio = 0.1, n.lambdas = 30,
                         verbose = TRUE, rel_tol = 1e-04, abs_tol = 1e-06)

# Plot some basic information about the path
plot(my.path)
# Look at the first graph in the path
plot(my.path[[1]]$fit)
# Because this is simulated data, we know the ground truth
# Let us use it to compute the precision and recall metrics
# along the path
ground.truth <- sim.data$precision.matrix[1:100, 1:100]
# Remove the elements along the diagonal. Keep a matrix of 0s and 1s
ground.truth <- 1 * (( ground.truth - diag(diag(ground.truth)) ) !=0)
# There is a new plot with the precision / recall curve
plot(my.path, ground.truth = ground.truth)

### Let us use a robust estimator of the correlation matrix
# Generate data with 5% of outliers
set.seed(0)
sim.data <- generate.latent.ggm.data(n=2000, p=100, h=5, outlier.fraction = 0.05,
                                     sparsity = 0.02, sparsity.latent = 0.7)
ground.truth <- sim.data$precision.matrix[1:100, 1:100]
# Remove the elements along the diagonal. Keep a matrix of 0s and 1s
ground.truth <- 1 * (( ground.truth - diag(diag(ground.truth)) ) !=0)
X <- sim.data$obs.data
Sigma <- cor(X) # Sample correlation matrix
Sigma.Kendall <- Kendall.correlation.estimator(X) # The robust estimator

# With that many strong outliers, using the sample corr. mat.
# is not going to work well
gamma <- 0.2
my.path <- lrpsadmm.path(Sigma = Sigma, gamma = gamma,
                         lambda.ratio = 1e-02, n.lambdas = 30, verbose = TRUE)
# Use another estimator for the correlation matrix:
my.robust.path <- lrpsadmm.path(Sigma = Sigma.Kendall, gamma = gamma,
                                lambda.ratio = 1e-01, n.lambdas = 30, verbose = TRUE)
# The output of the sample correlation path is poor (in terms of prec/recall)
# This is pretty much noise
plot(my.path, ground.truth)
# The Kendall estimator produces far better results.
# It is not affected by the 5% of outliers
plot(my.robust.path, ground.truth)

benjaminfrot/lrpsadmm documentation built on Oct. 19, 2019, 8:13 a.m.