picasso    R Documentation
Description:

picasso fits the regularization path for sparse linear (family =
"gaussian"), logistic (family = "binomial"), Poisson (family =
"poisson"), and sqrt-lasso (family = "sqrtlasso") regression, using
the lasso ("l1"), MCP, or SCAD penalty.
Usage:

picasso(X, Y, lambda = NULL, nlambda = 100, lambda.min.ratio = 0.05,
        family = "gaussian", method = "l1", type.gaussian = "naive",
        gamma = 3, df = NULL, dfmax = NULL, standardize = TRUE,
        intercept = TRUE, prec = 1e-07, max.ite = 1000, verbose = FALSE)
Arguments:

X: Numeric design matrix with n rows (observations) and d columns (variables).

Y: Response vector of length n.

lambda: Optional sequence of decreasing regularization values. If NULL (the default), a sequence of nlambda values is generated automatically.

nlambda: Number of regularization values to generate when lambda is NULL. The default is 100.

lambda.min.ratio: Smallest generated lambda value, expressed as a fraction of lambda.max (the smallest value for which all penalized coefficients are zero). The default is 0.05.

family: Model family. Supported values are "gaussian" (the default), "binomial", "poisson", and "sqrtlasso".

method: Penalty type. Supported values are "l1" (the default), "mcp", and "scad".

type.gaussian: Gaussian solver mode, either "naive" (the default) or "covariance". The covariance update is typically faster when n >> d.

gamma: Concavity parameter for the MCP and SCAD penalties. The default is 3.

df: Reserved for backward compatibility with older Gaussian covariance-update implementations. It is currently not used by the active solver backend.

dfmax: Maximum number of nonzero coefficients for early stopping on the lambda path. When the number of nonzero coefficients exceeds dfmax, fitting along the path stops.

standardize: If TRUE (the default), the design matrix X is standardized before fitting.

intercept: If TRUE (the default), an intercept term is included in the model.

prec: Convergence tolerance. The default is 1e-07.

max.ite: Maximum number of iterations. The default is 1000.

verbose: If TRUE, progress information is printed during fitting. The default is FALSE.
Details:

When lambda is not supplied, picasso constructs a logarithmically
spaced path between \lambda_{\max} and
\lambda_{\min} = \mathrm{lambda.min.ratio} \cdot \lambda_{\max}.
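The path construction can be sketched in a few lines of R. This is a minimal illustration only: it assumes that, for the Gaussian family, \lambda_{\max} is taken as \max_j |x_j^\top (Y - \bar{Y})| / n; the package's internal computation may differ.

```r
## Sketch of the lambda path (assumption: Gaussian-family
## lambda.max = max |X^T (Y - mean(Y))| / n; picasso's internal
## computation may differ).
set.seed(1)
n <- 50; d <- 10
X <- matrix(rnorm(n * d), n, d)
Y <- rnorm(n)

lambda.max <- max(abs(crossprod(X, Y - mean(Y)))) / n
lambda.min.ratio <- 0.05
nlambda <- 100

## logarithmically spaced, decreasing sequence from lambda.max
## down to lambda.min.ratio * lambda.max
lambda <- exp(seq(log(lambda.max),
                  log(lambda.min.ratio * lambda.max),
                  length.out = nlambda))
```

At lambda.max the all-zero coefficient vector is already optimal, so the path starts from the fully sparse solution and relaxes the penalty from there.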
The method solves a penalized optimization problem:
\min_{\beta_0, \beta}\; \mathcal{L}(\beta_0,\beta) + \sum_{j=1}^d p_{\lambda,\gamma}(|\beta_j|),
where p_{\lambda,\gamma} is lasso, MCP, or SCAD.
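For reference, the two nonconvex penalties follow their standard definitions (see the MCP and SCAD references below); writing t = |\beta_j| \ge 0:

p^{\mathrm{MCP}}_{\lambda,\gamma}(t)=\begin{cases}\lambda t-\frac{t^2}{2\gamma}, & t\le\gamma\lambda,\\[2pt] \frac{\gamma\lambda^2}{2}, & t>\gamma\lambda,\end{cases}

p^{\mathrm{SCAD}}_{\lambda,\gamma}(t)=\begin{cases}\lambda t, & t\le\lambda,\\[2pt] \frac{2\gamma\lambda t-t^2-\lambda^2}{2(\gamma-1)}, & \lambda<t\le\gamma\lambda,\\[2pt] \frac{\lambda^2(\gamma+1)}{2}, & t>\gamma\lambda.\end{cases}

Both agree with the lasso penalty \lambda t near zero and flatten out for large t, which reduces the bias that the lasso imposes on large coefficients.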
Loss functions by family are:
"gaussian":
\mathcal{L}(\beta_0,\beta)=\frac{1}{2n}\lVert Y-\beta_0\mathbf{1}-X\beta \rVert_2^2.
"binomial":
\mathcal{L}(\beta_0,\beta)=\frac{1}{n}\sum_{i=1}^n[\log(1+\exp(\beta_0+x_i^\top\beta))-y_i(\beta_0+x_i^\top\beta)].
"poisson":
\mathcal{L}(\beta_0,\beta)=\frac{1}{n}\sum_{i=1}^n[\exp(\beta_0+x_i^\top\beta)-y_i(\beta_0+x_i^\top\beta)].
"sqrtlasso":
\mathcal{L}(\beta_0,\beta)=\frac{1}{\sqrt{n}}\lVert Y-\beta_0\mathbf{1}-X\beta \rVert_2.
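The losses above translate directly into R. The following helper functions are hypothetical (not part of the picasso API) and simply evaluate the Gaussian and binomial losses as written:

```r
## Hypothetical helpers evaluating the Gaussian and binomial losses
## exactly as defined above (not part of the picasso API).
loss.gaussian <- function(b0, beta, X, Y) {
  n <- nrow(X)
  sum((Y - b0 - X %*% beta)^2) / (2 * n)
}

loss.binomial <- function(b0, beta, X, Y) {
  eta <- b0 + X %*% beta
  mean(log(1 + exp(eta)) - Y * eta)
}

set.seed(1)
X  <- matrix(rnorm(20), 10, 2)
Y  <- rnorm(10)
Yb <- rbinom(10, 1, 0.5)

## with all coefficients zero, the Gaussian loss reduces to
## sum(Y^2) / (2n), and the binomial loss to log(2)
loss.gaussian(0, c(0, 0), X, Y)
loss.binomial(0, c(0, 0), X, Yb)
```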
Value:

An object of class "gaussian", "logit", "poisson", or "sqrtlasso",
depending on family. Common components include:

beta: Sparse matrix of fitted coefficients. Each column corresponds to one regularization parameter in the solution path.

intercept: Intercept values along the solution path.

lambda: The regularization sequence used for fitting.

df: Degrees of freedom (number of nonzero coefficients) along the path.

Further components record the number of regularization values, the
penalty type used for fitting, iteration information returned by the
solver, the input verbose and family values, and the elapsed fitting
time.
Author(s):

Jason Ge, Xingguo Li, Haoming Jiang, Mengdi Wang, Tong Zhang, Han Liu and Tuo Zhao

Maintainer: Tuo Zhao <tourzhao@gatech.edu>
References:
1. J. Friedman, T. Hastie, H. Hofling, and R. Tibshirani. Pathwise coordinate optimization. The Annals of Applied Statistics, 2007.
2. C.-H. Zhang. Nearly unbiased variable selection under minimax concave penalty. The Annals of Statistics, 2010.
3. J. Fan and R. Li. Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 2001.
4. R. Tibshirani, J. Bien, J. Friedman, T. Hastie, N. Simon, J. Taylor, and R. Tibshirani. Strong rules for discarding predictors in lasso-type problems. Journal of the Royal Statistical Society: Series B, 2012.
5. J. Ge, X. Li, H. Jiang, M. Wang, T. Zhang, H. Liu, and T. Zhao. PICASSO: A sparse learning library for high-dimensional data analysis in R and Python. arXiv preprint https://arxiv.org/abs/2006.15261.
See Also:

picasso-package.

Examples:
################################################################
## Sparse linear regression
## Generate the design matrix and regression coefficient vector
n = 100 # sample number
d = 80 # sample dimension
c = 0.5 # correlation parameter
s = 20 # support size of coefficient
set.seed(2016)
X = scale(matrix(rnorm(n*d),n,d)+c*rnorm(n))/sqrt(n-1)*sqrt(n)
beta = c(runif(s), rep(0, d-s))
## Generate response using Gaussian noise, and fit sparse linear models
noise = rnorm(n)
Y = X%*%beta + noise
## l1 regularization solved with naive update
fitted.l1.naive = picasso(X, Y, nlambda=100, type.gaussian="naive")
## covariance update (faster when n >> d)
fitted.l1.covariance = picasso(X, Y, nlambda=100, type.gaussian="covariance")
## early stopping: stop when more than 10 nonzero coefficients
fitted.l1.dfmax = picasso(X, Y, nlambda=100, dfmax=10)
## mcp regularization
fitted.mcp = picasso(X, Y, nlambda=100, method="mcp")
## scad regularization
fitted.scad = picasso(X, Y, nlambda=100, method="scad")
## lambdas used
print(fitted.l1.naive$lambda)
## number of nonzero coefficients for each lambda
print(fitted.l1.naive$df)
## coefficients and intercept for the i-th lambda
i = 30
print(fitted.l1.naive$lambda[i])
print(fitted.l1.naive$beta[,i])
print(fitted.l1.naive$intercept[i])
## Visualize the solution path
plot(fitted.l1.naive)
plot(fitted.l1.covariance)
plot(fitted.mcp)
plot(fitted.scad)
################################################################
## Sparse logistic regression
## Generate the design matrix and regression coefficient vector
n <- 100 # sample number
d <- 80 # sample dimension
c <- 0.5 # parameter controlling the correlation between columns of X
s <- 20 # support size of coefficient
set.seed(2016)
X <- scale(matrix(rnorm(n*d),n,d)+c*rnorm(n))/sqrt(n-1)*sqrt(n)
beta <- c(runif(s), rep(0, d-s))
## Generate response and fit sparse logistic models
p = 1/(1+exp(-X%*%beta))
Y = rbinom(n, rep(1,n), p)
## l1 regularization
fitted.l1 = picasso(X, Y, nlambda=100, family="binomial", method="l1")
## mcp regularization
fitted.mcp = picasso(X, Y, nlambda=100, family="binomial", method="mcp")
## scad regularization
fitted.scad = picasso(X, Y, nlambda=100, family="binomial", method="scad")
## lambdas used
print(fitted.l1$lambda)
## number of nonzero coefficients for each lambda
print(fitted.l1$df)
## coefficients and intercept for the i-th lambda
i = 30
print(fitted.l1$lambda[i])
print(fitted.l1$beta[,i])
print(fitted.l1$intercept[i])
## Visualize the solution path
plot(fitted.l1)
## Estimate of Bernoulli parameters
param.l1 = fitted.l1$p
################################################################
## Sparse poisson regression
## Generate the design matrix and regression coefficient vector
n <- 100 # sample number
d <- 80 # sample dimension
c <- 0.5 # parameter controlling the correlation between columns of X
s <- 20 # support size of coefficient
set.seed(2016)
X <- scale(matrix(rnorm(n*d),n,d)+c*rnorm(n))/sqrt(n-1)*sqrt(n)
beta <- c(runif(s), rep(0, d-s))/sqrt(s)
## Generate response and fit sparse poisson models
p = X%*%beta+rnorm(n)
Y = rpois(n, exp(p))
## l1 regularization
fitted.l1 = picasso(X, Y, nlambda=100, family="poisson", method="l1")
## mcp regularization
fitted.mcp = picasso(X, Y, nlambda=100, family="poisson", method="mcp")
## scad regularization
fitted.scad = picasso(X, Y, nlambda=100, family="poisson", method="scad")
## lambdas used
print(fitted.l1$lambda)
## number of nonzero coefficients for each lambda
print(fitted.l1$df)
## coefficients and intercept for the i-th lambda
i = 30
print(fitted.l1$lambda[i])
print(fitted.l1$beta[,i])
print(fitted.l1$intercept[i])
## Visualize the solution path
plot(fitted.l1)