filter5STAR | R Documentation |
Performs Step 2 of the 5-STAR algorithm: forms a dummy variable matrix for factors, then fits an elastic net (ENET) or random forest (RF) model to determine which covariates to keep for building trees in 5-STAR.
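The dummy-variable step can be illustrated with base R alone. This is a minimal sketch on a made-up data frame; it assumes treatment-contrast coding via model.matrix(), which may differ from the coding filter5STAR uses internally.

```r
# Toy covariate data frame (hypothetical; not shipped with any package)
X <- data.frame(
  age   = c(54, 61, 47, 70),
  sex   = factor(c("F", "M", "M", "F")),
  stage = factor(c("I", "II", "III", "II"))
)

# Expand each factor into 0/1 indicator columns (first level is the
# reference) and drop the intercept column, leaving a numeric matrix
Xmat <- model.matrix(~ ., data = X)[, -1, drop = FALSE]

colnames(Xmat)
# "age" "sexM" "stageII" "stageIII"
```

A numeric matrix of this kind is the input format that penalized regression fitters such as glmnet expect, which is why factors have to be expanded before filtering.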
filter5STAR(yy, X, family = "cox", plot = FALSE, verbose = 0,
filter.hyper = filter_control(method = "ENET", lambdatype = "min",
mixparm = NULL, vimpalpha = 0.05, nfolds = 10, filterseed = 2019, ...),
vars2keep = NULL)
yy |
Response: either a Surv() object for time-to-event data or a 1-column matrix of case/control status for binary data |
X |
Data frame of all possible stratification covariates |
family |
Trait family, current options: "cox", "binomial", or "gaussian" |
plot |
Whether to make plots for filter results (e.g., variable importance plot for RF-based filtering or solution path for ENET-based filtering) |
verbose |
Numeric variable indicating amount of information to print to the terminal (0 = nothing, 1 = notes only, 2+ = notes and intermediate output) |
filter.hyper |
List of control parameters for the filtering step, such as the list built by filter_control() in the usage above |
vars2keep |
List of variable names (matching column names in X) to be passed through the filtering step without penalization. This is ignored in the main 5-STAR algorithm but may be used when the filtering step is run as a standalone function. Currently only used when method = "ENET" |
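A standalone call might look like the sketch below. The data are simulated, and the package name fiveSTAR used in the namespace check is an assumption (this page does not state it); only the arguments and the cov2keep output documented here are relied on. The call is skipped when the packages are not installed.

```r
# Simulated inputs (illustrative only)
set.seed(2019)
n <- 200
X <- data.frame(
  x1  = rnorm(n),
  x2  = rnorm(n),
  grp = factor(sample(c("a", "b", "c"), n, replace = TRUE))
)
time   <- rexp(n, rate = exp(0.5 * X$x1))  # event times depend on x1 only
status <- rbinom(n, 1, 0.8)                # 1 = event observed, 0 = censored

# Assumption: filter5STAR is exported by a package named "fiveSTAR"
if (requireNamespace("fiveSTAR", quietly = TRUE) &&
    requireNamespace("survival", quietly = TRUE)) {
  yy  <- survival::Surv(time, status)
  fit <- fiveSTAR::filter5STAR(yy, X, family = "cox",
                               plot = FALSE, verbose = 0)
  print(fit$cov2keep)  # covariates retained by the filter
}
```

Leaving filter.hyper at its default keeps the ENET settings shown in the usage line.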
cov2keep: List of all covariates kept after filtering step, selected by elastic net or random forest
For method = ENET, additionally outputs:
cvout: vector containing the fraction of null deviance explained, mean cross-validation error, and optimal tuning parameter values (see cv.glmnet and glmnet for more details)
beta: coefficients of the glmnet fit at the tuned parameters
ENETplot: plot containing the deviance over different combinations of the mixing and tuning parameters alpha and lambda, with optimal alpha shown in blue and minimum deviance point for each alpha in black (left panel), and solution path for the best mixing parameter (right panel). Output when plot = TRUE
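The quantities above can be reproduced directly with glmnet. The sketch below is an assumption about the general approach, not the package's actual code: it tunes the mixing parameter alpha over a small hypothetical grid, reuses the same folds for every alpha so the cross-validation errors are comparable, and reads off the values at lambda.min. The fit is skipped when glmnet is not installed.

```r
# Simulated binary-outcome data (illustrative only)
set.seed(2019)
n <- 150
p <- 6
x <- matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("x", 1:p)))
y <- rbinom(n, 1, plogis(1.5 * x[, 1] - x[, 2]))

# Fixed fold assignment shared across all alpha values
foldid <- sample(rep(1:10, length.out = n))
alphas <- c(0.25, 0.5, 0.75, 1)  # hypothetical grid of mixing parameters

if (requireNamespace("glmnet", quietly = TRUE)) {
  fits <- lapply(alphas, function(a) {
    glmnet::cv.glmnet(x, y, family = "binomial", alpha = a, foldid = foldid)
  })

  # Pick the alpha whose best lambda gives the lowest mean CV error
  best  <- which.min(sapply(fits, function(f) min(f$cvm)))
  cvfit <- fits[[best]]

  # Analogue of cvout: deviance fraction, CV error, tuned parameters
  i <- match(cvfit$lambda.min, cvfit$glmnet.fit$lambda)
  cvout <- c(dev.ratio = cvfit$glmnet.fit$dev.ratio[i],
             cv.error  = min(cvfit$cvm),
             alpha     = alphas[best],
             lambda    = cvfit$lambda.min)

  # Analogue of beta and cov2keep: nonzero coefficients at lambda.min
  beta <- as.matrix(coef(cvfit, s = "lambda.min"))[, 1]
  kept <- setdiff(names(beta)[beta != 0], "(Intercept)")
  print(cvout)
  print(kept)
}
```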
For method=RF or RFbest, additionally outputs:
varselectmat: matrix of VIMP confidence intervals, p-values, and selection decision for each variable
VIMPplot: default variable importance CI plot, output if plot = TRUE
varselect2plot: variable selection information from subsample.rfsrc for making customizable VIMP CI plots
forest: rfsrc object