boosting_core: Boosting core function


View source: R/boosting_function.R

Description

This function allows you to use gradient boosting for variable selection.
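As a rough illustration of how boosting can act as a variable selector, the sketch below runs a generic componentwise gradient boosting loop on squared-error loss in base R. It is not the SurvBoost implementation, which works on survival responses. The function name componentwise_boost and the simulated data are assumptions made for this example. Only the roles of rate and num_iter mirror the arguments documented here.

```r
# Generic componentwise gradient boosting on squared-error loss.
# Illustration only: NOT the SurvBoost algorithm for Surv() responses.
componentwise_boost <- function(X, y, rate = 0.1, num_iter = 500) {
  X <- scale(X)                       # standardize columns so updates are comparable
  beta <- numeric(ncol(X))            # all coefficients start at zero
  resid <- y - mean(y)                # negative gradient of the loss at beta = 0
  col_ss <- colSums(X^2)
  for (m in seq_len(num_iter)) {
    # Least-squares coefficient of the current residual on each single column
    coefs <- as.vector(crossprod(X, resid)) / col_ss
    # Choose the column whose one-variable fit reduces the residual sum of squares most
    j <- which.max(coefs^2 * col_ss)
    # Take a small step on that coefficient only; the others stay unchanged
    beta[j] <- beta[j] + rate * coefs[j]
    resid <- resid - rate * coefs[j] * X[, j]
  }
  beta
}

set.seed(1)
X <- matrix(rnorm(200 * 10), nrow = 200)
y <- as.vector(X[, 1:5] %*% rep(1, 5)) + rnorm(200)
beta_hat <- componentwise_boost(X, y, rate = 0.1, num_iter = 50)
which(beta_hat != 0)   # covariates that were updated at least once
```

Because only one coefficient changes per iteration, covariates that are never chosen keep a coefficient of exactly zero, so stopping early leaves a sparse model.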

Usage

boosting_core(formula, data, rate, num_iter = 500,
  control_method = NULL, control_parameter = NULL,
  censoring_type = "right")

Arguments

formula

a formula object whose response is specified with the Surv function.

data

a data.frame containing all variables specified in the formula.

rate

the desired update rate used in the boosting algorithm.

num_iter

an integer used as the number of iterations of the boosting algorithm. Default value is 500.

control_method

specifies the stopping method. Options are cv, num_selected, likelihood, BIC, and AIC. The default is NULL, which runs a fixed number of iterations as specified by num_iter.

control_parameter

a list with the parameter(s) needed by the chosen control_method. The available parameters are "cv_folds", "early_stop", "EBIC_gamma", "num_select", and "likelihood_tol". For the cv method, "cv_folds" specifies the number of cross-validation folds (default is 10). For the EBIC and AIC methods, "early_stop" is a TRUE/FALSE value that turns early stopping on or off (default is FALSE). The EBIC method takes an additional parameter, "EBIC_gamma", which sets the penalty term and should be a value between 0 and 1. For the num_selected method, "num_select" is the desired number of variables to select and should be an integer. For the likelihood method, "likelihood_tol" is the change in likelihood below which boosting stops (default is 0.001).

censoring_type

the type of censoring. Currently only right censoring ("right") is implemented.
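The pairing of control_method and control_parameter described above can be sketched as a named list of settings. The parameter names come from this page. The numeric values are illustrative, and attaching "EBIC_gamma" to the BIC option is an assumption, since the page refers to both BIC and EBIC. The boosting_core call is guarded so that the snippet still runs when SurvBoost is not installed.

```r
# One control_parameter list per control_method option.
# Values are illustrative only, not recommendations.
stopping_rules <- list(
  cv           = list(cv_folds = 10),
  num_selected = list(num_select = 5),
  likelihood   = list(likelihood_tol = 0.001),
  BIC          = list(early_stop = TRUE, EBIC_gamma = 0.5),  # assumed pairing
  AIC          = list(early_stop = TRUE)
)

# Fit with one of the rules, only if the required packages are available
if (requireNamespace("SurvBoost", quietly = TRUE) &&
    requireNamespace("survival", quietly = TRUE)) {
  library(survival)   # provides Surv() and strata()
  library(SurvBoost)
  data <- simulate_survival_cox(true_beta = c(1, 1, 1, 1, 1, 0, 0, 0, 0, 0))
  # Same model as in the Examples section, built programmatically
  formula <- as.formula(paste("Surv(time, delta) ~ strata(strata_idx) +",
                              paste0("V", 1:10, collapse = " + ")))
  method <- "num_selected"
  fit <- boosting_core(formula, data, rate = 0.1,
                       control_method = method,
                       control_parameter = stopping_rules[[method]])
}
```

Keeping the settings in one list makes it easy to switch stopping rules by changing only the value of method.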

Value

a list containing the vector of coefficients ("beta"), a variable selection matrix holding the coefficients at each iteration ("selection_df"), the number of boosting iterations ("mstop"), and any other stopping criteria that apply to the selected method. With the BIC or AIC method, the information criterion at each iteration is returned as a vector ("Information Criteria"). With cross-validation, the criterion used for stopping is returned as a numeric vector ("cvrisk").
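A short sketch of reading the returned list, using the documented element names "beta" and "mstop". The helper summarize_fit is hypothetical, and toy_fit is a hand-made stand-in with the same element names rather than real boosting_core output, so that the snippet runs without the package.

```r
# Hypothetical helper: summarize a boosting_core()-style result list.
summarize_fit <- function(fit) {
  list(
    selected   = which(fit$beta != 0),  # positions of non-zero coefficients
    n_selected = sum(fit$beta != 0),    # how many variables were selected
    mstop      = fit$mstop              # number of boosting iterations run
  )
}

# Hand-made stand-in with the documented element names (not real output)
toy_fit <- list(beta = c(0.8, 0, 0.3, 0), mstop = 120)
fit_summary <- summarize_fit(toy_fit)
fit_summary$selected
```

With a real fit, the same helper would be called on the object returned by boosting_core, assuming "beta" is a numeric vector as described above.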

Examples

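library(survival)  # provides Surv() and strata()
library(SurvBoost)

# Simulate data in which only the first five of ten covariates have an effect
data <- simulate_survival_cox(true_beta = c(1, 1, 1, 1, 1, 0, 0, 0, 0, 0))
formula <- as.formula("Surv(time, delta) ~ strata(strata_idx) + V1 + V2 +
  V3 + V4 + V5 + V6 + V7 + V8 + V9 + V10")
# Fixed number of boosting iterations
boosting_core(formula, data, rate = 0.1, num_iter = 500)
# Stop once five variables have been selected
boosting_core(formula, data, rate = 0.1, control_method = "num_selected",
              control_parameter = list(num_select = 5))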

SurvBoost documentation built on Sept. 20, 2019, 5:04 p.m.