| bigSurvSGD.na.omit | R Documentation |
Performs stochastic gradient descent optimisation for large-scale survival models after removing observations with missing values.
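As a minimal sketch of the complete-case filtering implied by the function name, the base R snippet below drops every row that has a missing value. It assumes the preprocessing behaves like stats::na.omit; the internal steps of bigSurvSGD.na.omit are not shown on this page, and the toy data frame is hypothetical.

```r
# Hypothetical toy data with missing values in different columns
toy <- data.frame(
  time   = c(5, 8, NA, 12, 3),
  status = c(1, 0, 1, NA, 1),
  x1     = c(0.2, NA, 1.5, 0.7, -0.3)
)

# Keep only the rows with no missing value in any column
toy_complete <- stats::na.omit(toy)

nrow(toy)           # 5 rows before filtering
nrow(toy_complete)  # 2 rows after filtering (rows 1 and 5)
```

Comparing the row counts before and after filtering is a quick way to see how much data a complete-case analysis discards.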
bigSurvSGD.na.omit(
formula = survival::Surv(time = time, status = status) ~ .,
data,
norm.method = "standardize",
features.mean = NULL,
features.sd = NULL,
opt.method = "AMSGrad",
beta.init = NULL,
beta.type = "averaged",
lr.const = 0.12,
lr.tau = 0.5,
strata.size = 20,
batch.size = 1,
num.epoch = 100,
b1 = 0.9,
b2 = 0.99,
eps = 1e-08,
inference.method = "plugin",
num.boot = 1000,
num.epoch.boot = 100,
boot.method = "SGD",
lr.const.boot = 0.12,
lr.tau.boot = 0.5,
num.sample.strata = 1000,
sig.level = 0.05,
beta0 = 0,
alpha = NULL,
lambda = NULL,
nlambda = 100,
num.strata.lambda = 10,
lambda.scale = 1,
parallel.flag = FALSE,
num.cores = NULL,
bigmemory.flag = FALSE,
num.rows.chunk = 1e+06,
col.names = NULL,
type = "float"
)
formula |
Model formula describing the survival outcome and the set of predictors to include in the optimisation. |
data |
Input data set or connection to a big-memory backed design
matrix that contains the variables referenced in |
norm.method |
Normalisation strategy applied to the feature matrix before optimisation, for example centring or standardising columns. |
features.mean |
Optional pre-computed column means used when normalising the features so that repeated fits can reuse shared statistics. |
features.sd |
Optional pre-computed column standard deviations used in
concert with |
opt.method |
Gradient-based optimisation routine to employ, such as plain SGD or adaptive methods like Adam and AMSGrad (the default). |
beta.init |
Vector of starting values for the regression coefficients supplied when warm-starting the optimisation. |
beta.type |
Indicator controlling how |
lr.const |
Base learning-rate constant used by the stochastic gradient descent routine. |
lr.tau |
Learning-rate decay horizon or damping factor that moderates the step size schedule. |
strata.size |
Number of observations drawn per stratum when building mini-batches for the optimisation loop. |
batch.size |
Total number of observations assembled into each stochastic gradient batch. |
num.epoch |
Number of passes over the training data used during the optimisation. |
b1 |
First exponential moving-average rate used by adaptive methods such as Adam to smooth gradients. |
b2 |
Second exponential moving-average rate used by adaptive methods to smooth squared gradients. |
eps |
Numerical stabilisation constant added to denominators when updating the adaptive moments. |
inference.method |
Inference approach requested after fitting, for example naive asymptotics or bootstrap resampling. |
num.boot |
Number of bootstrap replicates to draw when
|
num.epoch.boot |
Number of optimisation epochs to run within each bootstrap replicate. |
boot.method |
Type of bootstrap scheme to apply, such as ordinary or stratified resampling. |
lr.const.boot |
Learning-rate constant used during bootstrap refits. |
lr.tau.boot |
Learning-rate decay factor applied during bootstrap refits. |
num.sample.strata |
Number of strata sampled without replacement during each bootstrap iteration when stratified resampling is selected. |
sig.level |
Significance level used when constructing confidence intervals or hypothesis tests. |
beta0 |
Optional vector of coefficients under the null hypothesis when performing hypothesis tests. |
alpha |
Elastic-net mixing parameter controlling the relative weight of
|
lambda |
Sequence of regularisation strengths supplied explicitly for penalised estimation. |
nlambda |
Number of automatically generated |
num.strata.lambda |
Number of strata used when tuning |
lambda.scale |
Scale on which the |
parallel.flag |
Logical flag enabling parallel computation of gradients or bootstrap replicates. |
num.cores |
Number of processing cores to use when parallel execution is enabled. |
bigmemory.flag |
Logical flag indicating whether intermediate matrices should be stored using bigmemory backed objects. |
num.rows.chunk |
Row chunk size to use when streaming data from an on-disk matrix representation. |
col.names |
Optional character vector of column names associated with the feature matrix. |
type |
Storage type of the data, such as "float" (the default), used when the data are read from disk through bigmemory. |
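To illustrate roughly what the b1, b2, and eps arguments control, the sketch below runs a generic AMSGrad update on a toy quadratic objective. It is not the package's internal code: amsgrad_sketch, grad_fn, lr, and num_iter are hypothetical names, and a constant step size stands in for whatever schedule lr.const and lr.tau define.

```r
# Generic AMSGrad sketch (illustrative only, not the bigSurvSGD implementation)
amsgrad_sketch <- function(grad_fn, beta_init, lr = 0.12, b1 = 0.9, b2 = 0.99,
                           eps = 1e-8, num_iter = 1000) {
  beta  <- beta_init
  m     <- numeric(length(beta))  # moving average of gradients (rate b1)
  v     <- numeric(length(beta))  # moving average of squared gradients (rate b2)
  v_hat <- numeric(length(beta))  # running maximum of v, the AMSGrad modification
  for (t in seq_len(num_iter)) {
    g     <- grad_fn(beta)
    m     <- b1 * m + (1 - b1) * g
    v     <- b2 * v + (1 - b2) * g^2
    v_hat <- pmax(v_hat, v)
    # eps keeps the denominator away from zero
    beta  <- beta - lr * m / (sqrt(v_hat) + eps)
  }
  beta
}

# Toy objective: f(beta) = 0.5 * sum((beta - target)^2), with gradient beta - target
target   <- c(1, -2)
beta_hat <- amsgrad_sketch(function(b) b - target, beta_init = c(0, 0))
```

Because v_hat never decreases, the effective step size never grows during the run, which is the property that distinguishes AMSGrad from plain Adam.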
A fitted model object storing the learned coefficients, optimisation metadata, and any requested inference summaries. Its components include:
coef: Log hazard ratios. Without inference, a vector of estimated coefficients. With inference, a matrix of estimates and confidence intervals for the coefficients. With penalisation, a matrix with columns corresponding to the lambdas.
coef.exp: Exponentiated version of coef (hazard ratios).
lambda: The lambda value(s) used for penalisation.
alpha: The alpha value used for penalisation.
features.mean: Means of the features, if given or calculated.
features.sd: Standard deviations of the features, if given or calculated.
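The relationship between coef and coef.exp can be sketched with made-up numbers. The values and the column names estimate, lower, and upper below are hypothetical and only mirror the description above; they are not output from the function.

```r
# Hypothetical log hazard ratios, standing in for a `coef` vector
coef_toy <- c(sexe = 0.25, Agediag = -0.10)

# Exponentiating gives hazard ratios, the analogue of `coef.exp`
coef_exp_toy <- exp(coef_toy)

# With inference, estimates and interval bounds sit in a matrix;
# exp() works elementwise, so the bounds move to the hazard-ratio scale too
coef_mat_toy <- cbind(
  estimate = coef_toy,
  lower    = coef_toy - 0.05,
  upper    = coef_toy + 0.05
)
coef_exp_mat_toy <- exp(coef_mat_toy)
```

A hazard ratio above 1 corresponds to a positive coefficient and a ratio below 1 to a negative one.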
See Also bigSurvSGD,
bigscale for constructing normalised design matrices and
partialbigSurvSGDv0 for partial fitting pipelines.
data(micro.censure, package = "bigPLScox")
surv_data <- stats::na.omit(micro.censure[, c("survyear", "DC", "sexe", "Agediag")])
# Increase num.epoch and num.boot for real use
fit <- bigSurvSGD.na.omit(
survival::Surv(survyear, DC) ~ .,
data = surv_data,
norm.method = "standardize",
opt.method = "adam",
batch.size = 16,
num.epoch = 2
)