riPEER: Graph-constrained regression with a penalty term being a linear combination of graph-based and ridge penalty terms

In mdpeer: Graph-Constrained Regression with Enhanced Regularization Parameters Selection

Description

Graph-constrained regression with a penalty term that is a linear combination of graph-based and ridge penalty terms.

See Details for model description and optimization problem formulation.

Usage

riPEER(Q, y, Z, X = NULL, optim.metod = "rootSolve",
  rootSolve.x0 = c(1e-05, 1e-05), rootSolve.Q0.x0 = 1e-05,
  sbplx.x0 = c(1, 1), sbplx.lambda.lo = c(10^(-5), 10^(-5)),
  sbplx.lambda.up = c(1e+06, 1e+06), compute.boot.CI = FALSE,
  boot.R = 1000, boot.conf = 0.95, boot.set.seed = TRUE,
  boot.parallel = "multicore", boot.ncpus = 4, verbose = TRUE)

Arguments

Q: graph-originated penalty matrix (p × p); typically: a graph Laplacian matrix

y: response values matrix (n × 1)

Z: design matrix (n × p) modeled as random effects variables (to be penalized in regression modeling); assumed to be already standardized

X: design matrix (n × k) modeled as fixed effects variables (not to be penalized in regression modeling); if it does not contain a column of 1s, such a column will be added and treated as the intercept in the model

optim.metod: optimization method used to estimate λ = (λ_Q, λ_R): "rootSolve" (default) optimizes by finding roots of non-linear equations with the Newton-Raphson method (rootSolve package); "sbplx" optimizes with the Subplex algorithm, 'a variant of Nelder-Mead that uses Nelder-Mead on a sequence of subspaces' (nloptr package)

rootSolve.x0: vector of initial guesses for λ = (λ_Q, λ_R) used in the "rootSolve" algorithm

rootSolve.Q0.x0: initial guess for λ_R used in the "rootSolve" algorithm

sbplx.x0: vector of initial guesses for λ = (λ_Q, λ_R) used in the "sbplx" algorithm

sbplx.lambda.lo: vector of minimum values of the λ = (λ_Q, λ_R) grid search in the "sbplx" algorithm

sbplx.lambda.up: vector of maximum values of the λ = (λ_Q, λ_R) grid search in the "sbplx" algorithm

compute.boot.CI: logical; whether or not to compute bootstrap confidence intervals for the b regression coefficient estimates

boot.R: number of bootstrap replications used in bootstrap confidence interval computation

boot.conf: confidence level assumed in bootstrap confidence interval computation

boot.set.seed: logical; whether or not to set a seed in bootstrap confidence interval computation

boot.parallel: value of the parallel argument of the boot function used in bootstrap confidence interval computation

boot.ncpus: value of the ncpus argument of the boot function used in bootstrap confidence interval computation

verbose: logical; whether or not to print function execution messages

Details

Estimates coefficients of a linear model of the form:

y = Xβ + Zb + \varepsilon

where:

• y - response,

• X - design matrix of variables whose coefficients are not penalized (fixed effects),

• Z - design matrix of variables whose coefficients are penalized (random effects),

• β - regression coefficients, not penalized in the estimation process,

• b - regression coefficients, penalized in the estimation process, for which a prior graph of similarity / graph of connections may be available.

The method uses a penalty that is a linear combination of graph-based and ridge penalty terms:

(β_{est}, b_{est}) = arg\,min_{β,b} \{ (y - Xβ - Zb)^T(y - Xβ - Zb) + λ_Q b^T Q b + λ_R b^T b \}

where:

• Q - a graph-originated penalty matrix; typically: a graph Laplacian matrix,

• λ_Q - regularization parameter for the graph-based penalty term,

• λ_R - regularization parameter for the ridge penalty term.
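For fixed λ_Q and λ_R the objective above is quadratic in b, so the minimizer has a closed form. A minimal sketch in R (illustration only: X is omitted for brevity and the lambda values are assumed fixed; this is not the package's internal code):

# Closed-form penalized least squares solution for fixed lambdas
# (hypothetical values; riPEER estimates lambda.Q and lambda.R itself)
lambda.Q <- 1
lambda.R <- 1
p <- ncol(Z)
b.est <- solve(t(Z) %*% Z + lambda.Q * Q + lambda.R * diag(p), t(Z) %*% y)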

The two regularization parameters, λ_Q and λ_R, are estimated as maximum likelihood (ML) estimators from an equivalent Linear Mixed Model formulation of the optimization problem (see References).
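In outline (this is the standard penalized least squares / mixed model correspondence, stated here for orientation; see the reference for the exact formulation used by the method), the estimates above coincide with the BLUP of b in the model

y = Xβ + Zb + \varepsilon, \;\; b \sim N(0, σ^2 (λ_Q Q + λ_R I_p)^{-1}), \;\; \varepsilon \sim N(0, σ^2 I_n),

and λ_Q, λ_R are obtained by ML estimation of the covariance parameters of this mixed model.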

• The graph-originated penalty term allows imposing similarity between coefficients based on the supplied graph information.

• The ridge-originated penalty term facilitates parameter estimation: it reduces computational issues arising from singularity of the graph-originated penalty matrix and yields plausible results when the graph information is not informative.

Computation of bootstrap confidence intervals is available (disabled by default).
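The intervals are computed with the boot package (the boot.parallel and boot.ncpus arguments are passed to boot). A minimal sketch of a case-resampling scheme, assuming objects Q, y, Z as in the Examples section below (illustration only; the package's internal implementation may differ in details):

library(boot)

# statistic: re-fit riPEER on a resampled data set and return b estimates;
# column 1 of `data` holds the response, the remaining columns hold Z
b.stat <- function(data, idx) {
  d <- data[idx, , drop = FALSE]
  riPEER(Q, d[, 1, drop = FALSE], d[, -1], verbose = FALSE)$b.est
}

boot.out <- boot(data = cbind(y, Z), statistic = b.stat, R = 500,
                 parallel = "multicore", ncpus = 4)
boot.ci(boot.out, conf = 0.95, type = "perc", index = 1)  # CI for 1st coefficient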

Value

b.est: vector of b coefficient estimates

beta.est: vector of β coefficient estimates

lambda.Q: λ_Q regularization parameter value

lambda.R: λ_R regularization parameter value

lambda.2: lambda.R / lambda.Q value

boot.CI: data frame with two columns, lower and upper, containing, respectively, the lower and upper bootstrap confidence interval bounds for the b regression coefficient estimates

obj.fn.val: objective function value of the optimization problem

References

Karas, M., Brzyski, D., Dzemidzic, M., Goñi, J., Kareken, D.A., Randolph, T.W., Harezlak, J. (2017). Brain connectivity-informed regularization methods for regression. bioRxiv. doi: 10.1101/117945

Examples

# Load packages (mdpeer provides Adj2Lap, L2L.normalized, riPEER;
# ggplot2 is used for plotting)
library(mdpeer)
library(ggplot2)

set.seed(1234)
n  <- 200
p1 <- 10
p2 <- 90
p  <- p1 + p2

# Define graph adjacency matrix
A <- matrix(rep(0, p * p), nrow = p, ncol = p)
A[1:p1, 1:p1] <- 1
A[(p1 + 1):p, (p1 + 1):p] <- 1
L <- Adj2Lap(A)

# Define Q penalty matrix as graph Laplacian matrix (normalized)
Q <- L2L.normalized(L)

# Define Z, X design matrices and outcome y
Z <- matrix(rnorm(n * p), nrow = n, ncol = p)
b.true <- c(rep(1, p1), rep(0, p2))
X <- matrix(rnorm(n * 3), nrow = n, ncol = 3)
beta.true <- runif(3)
intercept <- 0
eta <- intercept + Z %*% b.true + X %*% beta.true
R2 <- 0.5
sd.eps <- sqrt(var(eta) * (1 - R2) / R2)
error <- rnorm(n, sd = sd.eps)
y <- eta + error

## Not run:
riPEER.out <- riPEER(Q, y, Z, X)
plt.df <- data.frame(x = 1:p, y = riPEER.out$b.est)
ggplot(plt.df, aes(x = x, y = y, group = 1)) + geom_line() +
  labs(title = "b estimates")
## End(Not run)

## Not run:
# riPEER with 0.95 bootstrap confidence intervals computation
riPEER.out <- riPEER(Q, y, Z, X, compute.boot.CI = TRUE, boot.R = 500)
plt.df <- data.frame(x = 1:p, y = riPEER.out$b.est,
                     lo = riPEER.out$boot.CI[, 1],
                     up = riPEER.out$boot.CI[, 2])
ggplot(plt.df, aes(x = x, y = y, group = 1)) + geom_line() +
  geom_ribbon(aes(ymin = lo, ymax = up), alpha = 0.3)
## End(Not run)
