Lrnr_screener_correlation: Correlation Screening Procedures

Lrnr_screener_correlationR Documentation

Correlation Screening Procedures

Description

This learner provides covariate screening procedures by running a test of correlation (Pearson default) with the cor.test function, and then selecting the (1) top ranked variables (default), or (2) the variables with a pvalue lower than some pre-specified threshold.

Format

R6Class object.

Value

Learner object with methods for training and prediction. See Lrnr_base for documentation on learners.

Parameters

method = 'pearson'

Correlation coefficient used for test.

type = c('rank', 'threshold')

Screen covariates by (1) rank (default), which chooses the top num_screen correlated covariates; or (2) threshold, which chooses covariates with a correlation- test- based pvalue lower the threshold and a minimum of min_screen covariates.

num_screen = 5

Number of covariates to select.

pvalue_threshold = 0.1

Maximum p-value threshold. Covariates with a pvalue lower than this threshold will be retained, and at least min_screen most significant covariates will be selected.

min_screen = 2

Minimum number of covariates to select. Used in pvalue_threshold screening procedure.

See Also

Other Learners: Custom_chain, Lrnr_HarmonicReg, Lrnr_arima, Lrnr_bartMachine, Lrnr_base, Lrnr_bayesglm, Lrnr_caret, Lrnr_cv_selector, Lrnr_cv, Lrnr_dbarts, Lrnr_define_interactions, Lrnr_density_discretize, Lrnr_density_hse, Lrnr_density_semiparametric, Lrnr_earth, Lrnr_expSmooth, Lrnr_gam, Lrnr_ga, Lrnr_gbm, Lrnr_glm_fast, Lrnr_glm_semiparametric, Lrnr_glmnet, Lrnr_glmtree, Lrnr_glm, Lrnr_grfcate, Lrnr_grf, Lrnr_gru_keras, Lrnr_gts, Lrnr_h2o_grid, Lrnr_hal9001, Lrnr_haldensify, Lrnr_hts, Lrnr_independent_binomial, Lrnr_lightgbm, Lrnr_lstm_keras, Lrnr_mean, Lrnr_multiple_ts, Lrnr_multivariate, Lrnr_nnet, Lrnr_nnls, Lrnr_optim, Lrnr_pca, Lrnr_pkg_SuperLearner, Lrnr_polspline, Lrnr_pooled_hazards, Lrnr_randomForest, Lrnr_ranger, Lrnr_revere_task, Lrnr_rpart, Lrnr_rugarch, Lrnr_screener_augment, Lrnr_screener_coefs, Lrnr_screener_importance, Lrnr_sl, Lrnr_solnp_density, Lrnr_solnp, Lrnr_stratified, Lrnr_subset_covariates, Lrnr_svm, Lrnr_tsDyn, Lrnr_ts_weights, Lrnr_xgboost, Pipeline, Stack, define_h2o_X(), undocumented_learner

Examples

library(data.table)

# load example data
data(cpp_imputed)
setDT(cpp_imputed)
cpp_imputed[, parity_cat := factor(ifelse(parity < 4, parity, 4))]
covars <- c(
  "apgar1", "apgar5", "parity_cat", "gagebrth", "mage", "meducyrs",
  "sexn"
)
outcome <- "haz"

# create sl3 task
task <- sl3_Task$new(data.table::copy(cpp_imputed),
  covariates = covars,
  outcome = outcome
)

lrnr_glmnet <- make_learner(Lrnr_glmnet)
lrnr_glm <- make_learner(Lrnr_glm)
lrnr_mean <- make_learner(Lrnr_mean)
lrnrs <- make_learner(Stack, lrnr_glm, lrnr_mean)

screen_corP <- make_learner(Lrnr_screener_correlation, type = "threshold")
corP_pipeline <- make_learner(Pipeline, screen_corP, lrnrs)
fit_corP <- corP_pipeline$train(task)
preds_corP_screener <- fit_corP$predict()

tlverse/sl3 documentation built on Nov. 18, 2024, 12:46 a.m.