# picasso: PathwIse CAlibrated Sparse Shooting algOrithm (PICASSO) In picasso: Pathwise Calibrated Sparse Shooting Algorithm

## Description

The function "picasso" implements the user interface.

## Usage

 1 2 3 4 picasso(X, Y, lambda = NULL, nlambda = 100, lambda.min.ratio = 0.05, lambda.min = NULL, family = "gaussian", method="l1", type.gaussian = "naive", gamma = 3, df = NULL, standardize = TRUE, prec = 1e-7, max.ite = 1000, verbose = FALSE) 

## Arguments

 X For sparse linear regression and sparse logistic regression, X is an n by d design matrix where n is the sample size and d is the data dimension. Y Y is the n dimensional response vector. Y is numeric vector for family=gaussian'', or a two-level factor for family=binomial'', or a non-negative integer vector representing counts for family = gaussian''. lambda A sequence of decresing positive values to control the regularization. Typical usage is to leave the input lambda = NULL and have the program compute its own lambda sequence based on nlambda and lambda.min.ratio. Users can also specify a sequence to override this. Default value is from lambda.max to lambda.min.ratio*lambda.max. The default value of lambda.max is the minimum regularization parameter which yields an all-zero estimates. nlambda The number of values used in lambda. Default value is 100. lambda.min.ratio The smallest value for lambda, as a fraction of the uppperbound (MAX) of the regularization parameter. The program can automatically generate lambda as a sequence of length = nlambda starting from MAX to lambda.min.ratio*MAX in log scale. The default value is 0.05.

Caution: logistic and poisson regression can be ill-conditioned if lambda is too small for nonconvex penalty. We suggest the user to avoid using any lambda.min.raito smaller than 0.05 for logistic/poisson regression under nonconvex penalty.

 lambda.min The smallest value for lambda. If $lambda.min.ratio is provided, then it is set to lambda.min.ratio*MAX, where MAX is the uppperbound of the regularization parameter. The default value is 0.1*MAX. family Options for model. Sparse linear regression and sparse multivariate regression is applied if family = "gaussian", sparse logistic regression is applied if family = "binomial" and sparse poisson regression is applied if family = "poisson". The default value is "gaussian". method Options for regularization. Lasso is applied if method = "l1", MCP is applied if method = "mcp" and SCAD Lasso is applied if method = "scad". The default value is "l1". type.gaussian Options for updating residuals in sparse linear regression. The naive update rule is applied if opt = "naive", and the covariance update rule is applied if opt = "covariance". The default value is "naive". gamma The concavity parameter for MCP and SCAD. The default value is 3. df Maximum degree of freedom for the covariance update. The default value is 2*n. standardize Design matrix X will be standardized to have mean zero and unit standard deviation if standardize = TRUE. The default value is TRUE. max.ite The iteration limit. The default value is 1000. prec Stopping precision. The default value is 1e-7. verbose Tracing information is disabled if verbose = FALSE. The default value is FALSE. ## Details For sparse linear regression, \min_{β} {\frac{1}{2n}}|| Y - X β - β_0||_2^2 + λ R(β), where R(β) can be \ell_1 norm, MCP, SCAD regularizers. For sparse logistic regression, \min_{β} {\frac{1}{n}}∑_{i=1}^n (\log(1+e^{x_i^T β+ β_0}) - y_i x_i^T β) + λ R(β), where R(β) can be \ell_1 norm, MCP, and SCAD regularizers. For sparse poisson regression, \min_{β} {\frac{1}{n}}∑_{i=1}^n (e^{x_i^T β + β_0} - y_i (x_i^T β+β_0) + λ R(β), where R(β) can be \ell_1 norm, MCP or SCAD regularizers. ## Value An object with S3 classes "gaussian", "binomial", and "poisson" corresponding to sparse linear regression, sparse logistic regression, and sparse poisson regression respectively is returned:  beta A matrix of regression estimates whose columns correspond to regularization parameters for sparse linear regression and sparse logistic regression. A list of matrices of regression estimation corresponding to regularization parameters for sparse column inverse operator. intercept The value of intercepts corresponding to regularization parameters for sparse linear regression, and sparse logistic regression. Y The value of Y used in the program. X The value of X used in the program. lambda The sequence of regularization parameters lambda used in the program. nlambda The number of values used in lambda. family The family from the input. method The method from the input. path A list of d by d adjacency matrices of estimated graphs as a graph path corresponding to lambda. sparsity The sparsity levels of the graph path for sparse inverse column operator. standardize The standardize from the input. df The degree of freecom (number of nonzero coefficients) along the solution path for sparse linear regression, nd sparse logistic regression. ite A list of vectors where the i-th entries of ite[[1]] and ite[[2]] correspond to the outer iteration and inner iteration of i-th regularization parameter respectively. verbose The verbose from the input. ## Author(s) Jason Ge, Xingguo Li, Mengdi Wang, Tong Zhang, Han Liu and Tuo Zhao Maintainer: Jason Ge <[email protected]> ## References 1. J. Friedman, T. Hastie and H. Hofling and R. Tibshirani. Pathwise coordinate optimization. The Annals of Applied Statistics, 2007. 2. C.H. Zhang. Nearly unbiased variable selection under minimax concave penalty. Annals of Statistics, 2010. 3. J. Fan and R. Li. Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 2001. 4. R. Tibshirani, J. Bien, J. Friedman, T. Hastie, N. Simon, J. Taylor and R. Tibshirani. Strong rules for discarding predictors in lasso-type problems. Journal of the Royal Statistical Society: Series B, 2012. 5. T. Zhao, H. Liu, and T. Zhang. A General Theory of Pathwise Coordinate Optimization. Techinical Report, Princeton Univeristy. ## See Also picasso-package. ## Examples   1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 ################################################################ ## Sparse linear regression ## Generate the design matrix and regression coefficient vector n = 100 # sample number d = 80 # sample dimension c = 0.5 # correlation parameter s = 20 # support size of coefficient set.seed(2016) X = scale(matrix(rnorm(n*d),n,d)+c*rnorm(n))/sqrt(n-1)*sqrt(n) beta = c(runif(s), rep(0, d-s)) ## Generate response using Gaussian noise, and fit sparse linear models noise = rnorm(n) Y = X%*%beta + noise ## l1 regularization solved with naive update fitted.l1.naive = picasso(X, Y, nlambda=100, type.gaussian="naive") ## l1 regularization solved with covariance update fitted.l1.covariance = picasso(X, Y, nlambda=100, type.gaussian="covariance") ## mcp regularization fitted.mcp = picasso(X, Y, nlambda=100, method="mcp") ## scad regularization fitted.scad = picasso(X, Y, nlambda=100, method="scad") ## lambdas used print(fitted.l1.naive$lambda) ## number of nonzero coefficients for each lambda print(fitted.l1.naive$df) ## coefficients and intercept for the i-th lambda i = 30 print(fitted.l1.naive$lambda[i]) print(fitted.l1.naive$beta[,i]) print(fitted.l1.naive$intercept[i]) ## Visualize the solution path plot(fitted.l1.naive) plot(fitted.l1.covariance) plot(fitted.mcp) plot(fitted.scad) ################################################################ ## Sparse logistic regression ## Generate the design matrix and regression coefficient vector n <- 100 # sample number d <- 80 # sample dimension c <- 0.5 # parameter controlling the correlation between columns of X s <- 20 # support size of coefficient set.seed(2016) X <- scale(matrix(rnorm(n*d),n,d)+c*rnorm(n))/sqrt(n-1)*sqrt(n) beta <- c(runif(s), rep(0, d-s)) ## Generate response and fit sparse logistic models p = 1/(1+exp(-X%*%beta)) Y = rbinom(n, rep(1,n), p) ## l1 regularization fitted.l1 = picasso(X, Y, nlambda=100, family="binomial", method="l1") ## mcp regularization fitted.mcp = picasso(X, Y, nlambda=100, family="binomial", method="mcp") ## scad regularization fitted.scad = picasso(X, Y, nlambda=100, family="binomial", method="scad") ## lambdas used print(fitted.l1$lambda) ## number of nonzero coefficients for each lambda print(fitted.l1$df) ## coefficients and intercept for the i-th lambda i = 30 print(fitted.l1$lambda[i]) print(fitted.l1$beta[,i]) print(fitted.l1$intercept[i]) ## Visualize the solution path plot(fitted.l1) ## Estimate of Bernoulli parameters param.l1 = fitted.l1$p ################################################################ ## Sparse poisson regression ## Generate the design matrix and regression coefficient vector n <- 100 # sample number d <- 80 # sample dimension c <- 0.5 # parameter controlling the correlation between columns of X s <- 20 # support size of coefficient set.seed(2016) X <- scale(matrix(rnorm(n*d),n,d)+c*rnorm(n))/sqrt(n-1)*sqrt(n) beta <- c(runif(s), rep(0, d-s))/sqrt(s) ## Generate response and fit sparse poisson models p = X%*%beta+rnorm(n) Y = rpois(n, exp(p)) ## l1 regularization fitted.l1 = picasso(X, Y, nlambda=100, family="poisson", method="l1") ## mcp regularization fitted.mcp = picasso(X, Y, nlambda=100, family="poisson", method="mcp") ## scad regularization fitted.scad = picasso(X, Y, nlambda=100, family="poisson", method="scad") ## lambdas used print(fitted.l1$lambda) ## number of nonzero coefficients for each lambda print(fitted.l1$df) ## coefficients and intercept for the i-th lambda i = 30 print(fitted.l1$lambda[i]) print(fitted.l1$beta[,i]) print(fitted.l1\$intercept[i]) ## Visualize the solution path plot(fitted.l1) 

picasso documentation built on July 21, 2017, 1:02 a.m.