droplasso: Fit a droplasso model
In jpvert/droplasso: Dropout lasso


Fit a dropout lasso (droplasso) model. The regularization path is computed for the lasso component of the penalty at a grid of values for the regularization parameter lambda.

droplasso(x, y, family = c("gaussian", "binomial"), keep_prob = 0.5,
  nlambda = 10, lambda.min.ratio = ifelse(nobs < nvars, 0.01, 1e-04),
  lambda = NULL, init = matrix(0, nrow = ncol(x)), gamma0 = 1,
  decay = 1, n_passes = 1000, minibatch_size = nrow(x))

`x`	Input matrix, of dimension `nobs x nvars`; each row is an observation vector.
`y`	Response variable: a vector of length `nobs` of quantitative values for `family="gaussian"`, or a factor with two levels for `family="binomial"`.
`family`	Response type. `family="gaussian"` (default) for least squares regression, `family="binomial"` for logistic regression.
`keep_prob`	The probability that each element is kept (default: `0.5`)
`nlambda`	The number of `lambda` values (default: `10`)
`lambda.min.ratio`	Smallest value for `lambda`, as a fraction of `lambda.max`, the (data-derived) entry value (i.e. the smallest value for which all coefficients are zero). The default depends on the sample size `nobs` relative to the number of variables `nvars`. If `nobs >= nvars`, the default is `0.0001`, close to zero. If `nobs < nvars`, the default is `0.01`.
`lambda`	The sequence of regularization parameters. By default, `lambda=NULL` lets the function estimate a good sequence by itself.
`init`	Initial model to start optimization (default: zero vector).
`gamma0`	Initial value of the learning rate (default: `1`)
`decay`	Learning rate decay (default: `1`)
`n_passes`	Number of passes over each example of the data on average (default: `1000`)
`minibatch_size`	Batch size (default: `nobs`)

Droplasso estimates a linear model by minimizing an objective function

\min_{w} R(w) + \lambda \|w\|_1

where R(w) is the expected loss when the linear model is applied to a random training example subject to dropout noise, i.e., each coordinate is kept intact with probability keep_prob and set to zero with probability 1 - keep_prob.
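
To make the dropout noise concrete, here is a minimal sketch in base R. The helper `apply_dropout` is hypothetical and not part of the droplasso package; it only illustrates the masking described above. Whether droplasso additionally rescales the kept coordinates (for example by `1/keep_prob`) is not stated here, so the sketch applies no rescaling.

```r
# Hypothetical helper (not part of droplasso): apply dropout noise to a matrix.
# Each entry is kept intact with probability keep_prob and set to zero otherwise.
# Assumption: no rescaling of the kept entries.
apply_dropout <- function(x, keep_prob = 0.5) {
  mask <- matrix(rbinom(length(x), size = 1, prob = keep_prob),
                 nrow = nrow(x), ncol = ncol(x))
  x * mask
}

set.seed(1)
x <- matrix(rnorm(20), nrow = 4)
x_noisy <- apply_dropout(x, keep_prob = 0.5)
# Every entry of x_noisy is either the original value or exactly zero.
```

With `keep_prob = 1` the mask is all ones and the input is returned unchanged, which corresponds to the "no dropout" case.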

Given a prediction u and a true label y, the loss is (u-y)^2 / 2 when family="gaussian", and -y*u + ln( 1+e^u ) when family="binomial" (i.e., the negative log-likelihood of the logistic regression model).
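
The two losses can be written as short R functions. The names `loss_gaussian` and `loss_binomial` are illustrative and are not exported by droplasso. The binomial version assumes the two factor levels are coded as 0 and 1, which is the coding under which the formula above is the logistic negative log-likelihood.

```r
# Hypothetical helpers mirroring the two losses described above.
# u: prediction (linear score); y: true label.
loss_gaussian <- function(u, y) (u - y)^2 / 2

# Assumption: y is coded as 0/1.
# log(1 + exp(u)) is computed as max(u, 0) + log1p(exp(-|u|)) to avoid overflow.
loss_binomial <- function(u, y) -y * u + pmax(u, 0) + log1p(exp(-abs(u)))

loss_gaussian(2, 1)   # 0.5
loss_binomial(0, 1)   # log(2)
```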

The optimization problem is solved with a stochastic proximal gradient descent algorithm, using mini-batches of size minibatch_size, and a learning rate decaying as gamma0/(1+decay*t), where t is the number of mini-batches processed.
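
As a rough illustration of the ingredients named above, the sketch below implements the learning rate schedule and a generic proximal gradient step for an L1 penalty (soft-thresholding). The function names are hypothetical, and this is a textbook form of the update rather than the package's internal code.

```r
# Learning rate after t mini-batches have been processed.
learning_rate <- function(t, gamma0 = 1, decay = 1) gamma0 / (1 + decay * t)

# Proximal operator of thresh * ||w||_1 (soft-thresholding).
soft_threshold <- function(w, thresh) sign(w) * pmax(abs(w) - thresh, 0)

# One generic proximal gradient step (hypothetical helper).
# grad: a stochastic gradient of R at w, computed on one mini-batch.
prox_step <- function(w, grad, t, lambda, gamma0 = 1, decay = 1) {
  gamma <- learning_rate(t, gamma0, decay)
  soft_threshold(w - gamma * grad, gamma * lambda)
}

learning_rate(0:3)                    # 1, 1/2, 1/3, 1/4
soft_threshold(c(-2, 0.05, 1), 0.1)   # -1.9, 0, 0.9
```

The soft-thresholding step is what sets small weights exactly to zero, which is how the lasso component produces sparse models.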

The problem is solved for all regularization parameters provided in the lambda argument. If no lambda argument is provided, the function automatically chooses a decreasing sequence of λ values that starts from the null model and adds features to the model along the regularization path. Optimization for a given λ is warm-started from the solution for the previous λ, so if a sequence is provided in the lambda argument, it is strongly recommended to give the λ values in decreasing order.
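
The sketch below shows one common way to build such a decreasing sequence: a log-spaced grid from `lambda_max` down to `lambda_max * lambda.min.ratio`. This follows the glmnet-style convention and is an assumption; how droplasso actually derives `lambda.max` and spaces the grid is not documented here, so `lambda_max` is taken as a given input and `make_lambda_grid` is a hypothetical helper.

```r
# Hypothetical helper: log-spaced, decreasing lambda grid.
# Assumption: glmnet-style spacing between lambda_max and lambda_max * lambda.min.ratio.
make_lambda_grid <- function(lambda_max, nlambda = 10, lambda.min.ratio = 1e-04) {
  exp(seq(log(lambda_max), log(lambda_max * lambda.min.ratio),
          length.out = nlambda))
}

lambda_grid <- make_lambda_grid(lambda_max = 1, nlambda = 5, lambda.min.ratio = 1e-04)
# lambda_grid: 1, 0.1, 0.01, 0.001, 1e-04

# A user-supplied sequence can be sorted into decreasing order before use.
my_lambda <- sort(c(0.01, 1, 0.1), decreasing = TRUE)   # 1, 0.1, 0.01
```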

An object of class "droplasso", i.e. a list with the following:

`beta`	The `nvars x nlambda` matrix of weights, one column per lambda, one row per variable
`lambda`	The sequence of lambda values for which the weights are given
`nzero`	The number of non-zero coefficients in each model
`call`	The function call

# Create data:
nobs = 100
nvars = 5
x = matrix(rnorm(nobs*nvars),nrow=nobs)
b = c(1,1,0,0,0)
p = 1/(1+exp(-x%*%b))
y = p>0.5
# Fit a lasso model (no dropout)
droplasso(x, y, family="binomial", lambda=0.1, keep_prob=1)
# Fit a dropout model (no lasso)
droplasso(x, y, family="binomial", lambda=0, keep_prob=0.5)
# Fit a dropout lasso model
droplasso(x, y, family="binomial", lambda=0.1, keep_prob=0.5)