riPEER: Graph-constrained regression with penalty term being a linear combination of graph-based and ridge penalty terms

Description

Graph-constrained regression with penalty term being a linear combination of graph-based and ridge penalty terms.

See Details for model description and optimization problem formulation.

Usage

riPEER(Q, y, Z, X = NULL, optim.metod = "rootSolve",
  rootSolve.x0 = c(1e-05, 1e-05), rootSolve.Q0.x0 = 1e-05, sbplx.x0 = c(1,
  1), sbplx.lambda.lo = c(10^(-5), 10^(-5)), sbplx.lambda.up = c(1e+06,
  1e+06), compute.boot.CI = FALSE, boot.R = 1000, boot.conf = 0.95,
  boot.set.seed = TRUE, boot.parallel = "multicore", boot.ncpus = 4,
  verbose = TRUE)

Arguments

Q

graph-originated penalty matrix (p \times p); typically a graph Laplacian matrix

y

response values matrix (n \times 1)

Z

design matrix (n \times p) of variables modeled as random effects (to be penalized in regression modeling); assumed to be already standardized

X

design matrix (n \times k) of variables modeled as fixed effects (not penalized in regression modeling); if it does not contain a column of 1s, such a column will be added and treated as the intercept in the model

optim.metod

optimization method used to optimize λ = (λ_Q, λ_R)

  • "rootSolve" (default) - optimizes by finding roots of non-linear equations by the Newton-Raphson method; from rootSolve package

  • "sbplx" - optimizes with the use of Subplex Algorithm: 'Subplex is a variant of Nelder-Mead that uses Nelder-Mead on a sequence of subspaces'; from nloptr package

rootSolve.x0

vector containing initial guesses for λ = (λ_Q, λ_R) used in "rootSolve" algorithm

rootSolve.Q0.x0

vector containing initial guess for λ_R used in "rootSolve" algorithm

sbplx.x0

vector containing initial guesses for λ = (λ_Q, λ_R) used in "sbplx" algorithm

sbplx.lambda.lo

vector containing minimum values of the λ = (λ_Q, λ_R) grid search in the "sbplx" algorithm

sbplx.lambda.up

vector containing maximum values of the λ = (λ_Q, λ_R) grid search in the "sbplx" algorithm

compute.boot.CI

logical whether or not to compute bootstrap confidence intervals for the b regression coefficient estimates

boot.R

number of bootstrap replications used in bootstrap confidence interval computation

boot.conf

confidence level assumed in bootstrap confidence interval computation

boot.set.seed

logical whether or not to set a seed in bootstrap confidence interval computation

boot.parallel

value of the parallel argument of the boot function used in bootstrap confidence interval computation

boot.ncpus

value of the ncpus argument of the boot function used in bootstrap confidence interval computation

verbose

logical whether or not to run in verbose mode (print function execution messages)

Details

Estimates coefficients of the linear model of the form:

y = Xβ + Zb + \varepsilon

where:

  • y - response values matrix (n \times 1),

  • Z - design matrix (n \times p) of variables modeled as random effects (penalized),

  • X - design matrix (n \times k) of variables modeled as fixed effects (not penalized),

  • b - random effects regression coefficients (p \times 1),

  • β - fixed effects regression coefficients (k \times 1),

  • \varepsilon - error term.

The method uses a penalty that is a linear combination of graph-based and ridge penalty terms:

β_{est}, b_{est} = arg min_{β,b} \{ (y - Xβ - Zb)^T(y - Xβ - Zb) + λ_Q b^TQb + λ_R b^Tb \}

where:

  • Q - graph-originated penalty matrix (p \times p); typically a graph Laplacian matrix,

  • λ_Q - regularization parameter for the graph-originated penalty term,

  • λ_R - regularization parameter for the ridge penalty term.

The two regularization parameters, λ_Q and λ_R, are estimated as ML estimators from the equivalent Linear Mixed Model optimization problem formulation (see References).
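
For fixed λ_Q and λ_R, this is the standard correspondence between penalized least squares and best linear unbiased prediction in a Linear Mixed Model (sketched here; the exact formulation used is given in the reference):

y = Xβ + Zb + \varepsilon, with b ~ N(0, σ^2 (λ_Q Q + λ_R I_p)^{-1}) and \varepsilon ~ N(0, σ^2 I_n),

so that λ_Q and λ_R enter the random effects covariance matrix and can be estimated together with σ^2 by ML.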

Bootstrap confidence interval computation is available (not enabled by default).

Value

b.est

vector of b coefficient estimates

beta.est

vector of β coefficient estimates

lambda.Q

λ_Q regularization parameter value

lambda.R

λ_R regularization parameter value

lambda.2

lambda.R/lambda.Q value

boot.CI

data frame with two columns, lower and upper, containing the lower and upper bounds, respectively, of bootstrap confidence intervals for the b regression coefficient estimates

obj.fn.val

optimization problem objective function value

References

Karas, M., Brzyski, D., Dzemidzic, M., Goñi, J., Kareken, D.A., Randolph, T.W., Harezlak, J. (2017). Brain connectivity-informed regularization methods for regression. doi: https://doi.org/10.1101/117945

Examples

library(mdpeer)
library(ggplot2)
set.seed(1234)
n <- 200
p1 <- 10
p2 <- 90
p <- p1 + p2
# Define graph adjacency matrix
A <- matrix(rep(0, p*p), nrow = p, ncol = p)
A[1:p1, 1:p1] <- 1
A[(p1+1):p, (p1+1):p] <- 1
L <- Adj2Lap(A)
# Define Q penalty matrix as the normalized graph Laplacian matrix
Q <- L2L.normalized(L)
# Define Z, X design matrices and outcome y
Z <- matrix(rnorm(n*p), nrow = n, ncol = p)
b.true <- c(rep(1, p1), rep(0, p2))
X <- matrix(rnorm(n*3), nrow = n, ncol = 3)
beta.true <- runif(3)
intercept <- 0
eta <- intercept + Z %*% b.true + X %*% beta.true
R2 <- 0.5
sd.eps <- sqrt(var(eta) * (1 - R2) / R2)
error <- rnorm(n, sd = sd.eps)
y <- eta + error

## Not run: 
riPEER.out <- riPEER(Q, y, Z, X)
plt.df <- data.frame(x = 1:p, y = riPEER.out$b.est)
ggplot(plt.df, aes(x = x, y = y, group = 1)) + geom_line() + labs(title = "b estimates")

## End(Not run)
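
## Not run: 
# riPEER with the Subplex ("sbplx") optimizer instead of the default
# "rootSolve"; a usage sketch only - both optimizers target the same
# ML estimates of (lambda.Q, lambda.R)
riPEER.out <- riPEER(Q, y, Z, X, optim.metod = "sbplx")

## End(Not run)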

## Not run: 
# riPEER with 0.95 bootstrap confidence intervals computation
riPEER.out <- riPEER(Q, y, Z, X, compute.boot.CI = TRUE, boot.R = 500)
plt.df <- data.frame(x = 1:p, 
                     y = riPEER.out$b.est, 
                     lo = riPEER.out$boot.CI[,1], 
                     up =  riPEER.out$boot.CI[,2])
ggplot(plt.df, aes(x = x, y = y, group = 1)) + geom_line() +  
  geom_ribbon(aes(ymin=lo, ymax=up), alpha = 0.3)

## End(Not run)
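
## Not run: 
# Illustration of the objective from Details: closed-form minimizer for *fixed*
# regularization parameters. The lambda values below are hypothetical, chosen
# only to show the normal equations; riPEER itself estimates lambda.Q and
# lambda.R by ML from the equivalent Linear Mixed Model formulation.
lambda.Q <- 1
lambda.R <- 0.1
X1 <- cbind(1, X)                      # intercept column, as riPEER adds one
M  <- cbind(X1, Z)
D  <- matrix(0, ncol(M), ncol(M))      # block-diagonal penalty: zero block for beta
D[(ncol(X1) + 1):ncol(M), (ncol(X1) + 1):ncol(M)] <- lambda.Q * Q + lambda.R * diag(p)
theta    <- solve(t(M) %*% M + D, t(M) %*% y)
beta.hat <- theta[1:ncol(X1)]          # fixed effects estimates
b.hat    <- theta[-(1:ncol(X1))]       # penalized coefficient estimates

## End(Not run)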
