Description
dcalasso fits the adaptive lasso for big datasets using multiple linearization methods, including one-step estimation and least-squares approximation. The function can fit the adaptive lasso model either when the dataset is loaded as a whole into data, or when the dataset has been split a priori and saved into multiple .rds files.
The algorithm uses a divide-and-conquer one-step estimator as the initial estimator and a least-squares approximation to the partial likelihood, which reduces the computational cost. It currently supports the adaptive lasso for the Cox proportional hazards model, with or without time-dependent covariates. Ties in the survival data are handled by Efron's method.
The first half of the routine computes an initial, n^(1/2)-consistent estimator. It first obtains a warm start by fitting coxph to the first subset (the first random split of data, or the first file listed in data.rds), and then updates the warm start with iter.os rounds of one-step estimation. Each one-step round loops through the subsets, gathering their score vectors and information matrices. The second half of the routine then shrinks the initial estimator with an adaptive lasso step based on the least-squares approximation (see the sketch below).
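The gathering step admits a compact sketch. The following R code is a minimal illustration of one round of the divide-and-conquer one-step update, not the package's internal implementation; score_fun and info_fun are hypothetical helpers that would return the partial-likelihood score vector and information matrix on one subset.

one_step_update <- function(beta, subsets, score_fun, info_fun) {
  p <- length(beta)
  U <- numeric(p)                  # pooled score vector
  I <- matrix(0, p, p)             # pooled information matrix
  for (s in subsets) {
    U <- U + score_fun(beta, s)    # score contribution of one split
    I <- I + info_fun(beta, s)     # information contribution of one split
  }
  beta + solve(I, U)               # Newton-type one-step update
}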
Usage

dcalasso(formula, family, data, data.rds, weights, subset, na.action,
  offset, lambda, gamma, K, iter.os, ncores)
Arguments

formula: a formula specifying the model. For the Cox model, the outcome should be specified as a Surv(start, stop, status) or Surv(time, status) object from the survival package.

family: for the Cox model, family should be cox.ph() or "cox.ph".

data: a data frame containing all variables.

data.rds: when the dataset is too big to load into RAM as a whole, a character vector of paths to .rds files, each containing one pre-split subset of the data (used in place of data).

weights: prior weights on each observation.

subset: an expression indicating the subset of rows of data to be used in model fitting.

na.action: a function indicating how NAs should be handled.

offset: an offset term with a fixed coefficient of one.

lambda: tuning parameter for the adaptive lasso penalty, penalty = lambda * sum_j |beta_j| / |beta_j,init|^gamma.

gamma: exponent of the adaptive weights in the penalty; see lambda.

K: number of splits of the full dataset. When data.rds is supplied, K is overwritten to the number of .rds files.

iter.os: number of rounds of one-step updates.

ncores: number of cores to use; the iterations over subsets are run in parallel across the cores.
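As an illustration, here is a hedged example call on simulated survival data. The dataset, variable names, and the chosen values of K and iter.os are illustrative only, and the remaining arguments are assumed to have usable defaults.

library(survival)
library(dcalasso)

set.seed(1)
n <- 10000
x <- matrix(rnorm(n * 5), n, 5)
colnames(x) <- paste0("x", 1:5)
time <- rexp(n, rate = exp(x %*% c(1, -1, 0.5, 0, 0)))  # simulated event times
status <- rbinom(n, 1, 0.8)                             # simulated event indicator
dat <- data.frame(time = time, status = status, x)

fit <- dcalasso(Surv(time, status) ~ x1 + x2 + x3 + x4 + x5,
                family = "cox.ph", data = dat, K = 10, iter.os = 2)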
Value

coefficients.pen: the adaptive lasso (penalized) coefficient estimates.

coefficients.unpen: the initial unpenalized estimates.

cov.unpen: variance-covariance matrix of the unpenalized model.

cov.pen: variance-covariance matrix of the penalized model.

BIC: the sequence of BIC values, one per value of lambda.

n.pen: the number used to penalize the degrees of freedom in the BIC.

n: number of rows of the data used.

idx.opt: index of the lambda achieving the optimal BIC.

BIC.opt: the minimal BIC.

family: family object of the model.

lamba.opt: the lambda that minimizes the BIC.

df: degrees of freedom at each lambda.

p: number of covariates.

iter: number of one-step iterations performed.

Terms: terms object of the model.
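Given a fit as in the example above, the documented components could be inspected as follows; this is a sketch assuming the component names listed above (note that the lamba.opt spelling is as documented).

fit$coefficients.pen                 # penalized (adaptive lasso) estimates
fit$coefficients.unpen               # initial unpenalized estimates
fit$lamba.opt                        # BIC-optimal lambda (name as documented)
fit$BIC[fit$idx.opt] == fit$BIC.opt  # the indexed BIC equals the minimal BIC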
Author(s)

Yan Wang <yaw719@mail.harvard.edu>, Tianxi Cai <tcai@hsph.harvard.edu>, Chuan Hong <Chuan_Hong@hms.harvard.edu>
References

Wang, Yan, Chuan Hong, Nathan Palmer, Qian Di, Joel Schwartz, Isaac Kohane, and Tianxi Cai. "A Fast Divide-and-Conquer Sparse Cox Regression." arXiv preprint arXiv:1804.00735 (2018).