cforward: Forward Selection Based on C-Index/Concordance


View source: R/cforward.R

Description

Forward Selection Based on C-Index/Concordance
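
For readers unfamiliar with the selection criterion, here is a minimal base R sketch of an unweighted C-index (Harrell's concordance) computed by brute force over pairs. The function name `c_index` and the toy vectors are illustrative assumptions, not part of the package, which relies on concordancefit and supports weights.

```r
# Hypothetical helper (not from cforward): unweighted Harrell's C by brute force.
# time: follow-up time; status: 1 = event, 0 = censored; risk: higher = worse.
c_index <- function(time, status, risk) {
  concordant <- 0
  comparable <- 0
  n <- length(time)
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      # A pair is comparable when subject i has an observed event
      # strictly before subject j's follow-up time.
      if (status[i] == 1 && time[i] < time[j]) {
        comparable <- comparable + 1
        if (risk[i] > risk[j]) {
          concordant <- concordant + 1      # earlier event had the higher risk
        } else if (risk[i] == risk[j]) {
          concordant <- concordant + 0.5    # tied risk scores count as half
        }
      }
    }
  }
  concordant / comparable
}

# Toy data, made up for illustration only
time   <- c(2, 4, 6, 8, 10)
status <- c(1, 1, 0, 1, 0)
risk   <- c(0.9, 0.3, 0.4, 0.5, 0.1)
c_index(time, status, risk)  # 6 of 8 comparable pairs are concordant: 0.75
```

A value of 0.5 corresponds to random ordering and 1 to perfect ordering of risk against survival time.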

Usage

cforward(
  data,
  event_time = "event_time_years",
  event_status = "mortstat",
  weight_column = "WTMEC4YR_norm",
  variables = NULL,
  included_variables = NULL,
  n_folds = 10,
  seed = 1989,
  max_model_size = 50,
  c_threshold = NULL,
  verbose = TRUE,
  cfit_args = list(),
  save_memory = FALSE,
  ...
)

cforward_one(
  data,
  event_time = "event_time_years",
  event_status = "mortstat",
  weight_column = "WTMEC4YR_norm",
  variables,
  included_variables = NULL,
  verbose = TRUE,
  cfit_args = list(),
  save_memory = FALSE,
  ...
)

make_folds(data, event_status = "mortstat", n_folds = 10, verbose = TRUE)
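
As a rough illustration of fold assignment, the sketch below balances folds on event status. That the package stratifies this way is an assumption suggested by the event_status argument of make_folds, not something stated here; `make_folds_sketch` is a hypothetical name and the real implementation may differ.

```r
# Hypothetical helper (not the package's make_folds): assign each row to a fold,
# keeping the number of events and non-events roughly equal across folds.
make_folds_sketch <- function(status, n_folds = 10, seed = 1989) {
  set.seed(seed)  # make the assignment reproducible
  fold <- integer(length(status))
  for (s in unique(status)) {
    idx <- which(status == s)
    # Recycle fold labels 1..n_folds to the group size, then shuffle them
    fold[idx] <- sample(rep(seq_len(n_folds), length.out = length(idx)))
  }
  fold
}

# Toy status vector, made up for illustration: 80 censored, 20 events
status <- rep(c(0, 1), times = c(80, 20))
fold <- make_folds_sketch(status, n_folds = 5)
table(fold, status)  # each fold holds 16 non-events and 4 events
```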

Arguments

data

A data set on which to perform model selection and cross-validation.

event_time

Character vector of length 1 naming the column of event times, passed to Surv.

event_status

Character vector of length 1 naming the column of event status, passed to Surv.

weight_column

Character vector of length 1 naming the column of weights for the model. If no weights are available, set to NULL.

variables

Character vector of variables to perform selection on. Must be in data.

included_variables

Character vector of variables forced into the model. Must be in data.

n_folds

Number of folds for cross-validation. To run on the full data, set to 1.

seed

Seed set before folds are created.

max_model_size

Maximum number of variables in the model. Selection stops once this size is reached. Note that this does not correspond to the number of coefficients, because a categorical variable can contribute more than one coefficient.

c_threshold

Threshold for concordance. If the difference between the best concordance and the current one does not reach this threshold, selection stops.

verbose

Print diagnostic messages.

cfit_args

Arguments passed to concordancefit. If strata is to be passed, set strata_column in this list.

save_memory

Save only a minimal amount of information and discard the fitted models.

...

Additional arguments to pass to coxph
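
The interplay of c_threshold and max_model_size can be sketched as a stopping rule applied to the sequence of per-step best concordances. This is one reading of the argument descriptions above, under the assumption that a step is rejected when its gain over the previous step falls below c_threshold; `n_steps_kept` is a hypothetical helper and the package's exact rule may differ.

```r
# Hypothetical helper (not from cforward): number of selection steps retained
# under a concordance-gain threshold and a cap on model size.
n_steps_kept <- function(best_concordance, c_threshold = NULL,
                         max_model_size = 50) {
  # Never keep more steps than the size cap allows
  n <- min(length(best_concordance), max_model_size)
  if (is.null(c_threshold) || n < 2) {
    return(n)
  }
  # gain[k] is the improvement from step k to step k + 1
  gain <- diff(best_concordance[seq_len(n)])
  too_small <- which(gain < c_threshold)
  if (length(too_small) == 0) {
    return(n)
  }
  # Step k + 1 is rejected at the first insufficient gain, so k steps are kept
  too_small[1]
}

# Toy concordance path, made up for illustration
conc_path <- c(0.70, 0.74, 0.745, 0.75)
n_steps_kept(conc_path, c_threshold = 0.02)      # 2
n_steps_kept(conc_path, c_threshold = NULL)      # 4
n_steps_kept(conc_path, max_model_size = 3)      # 3
```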

Value

A list of lists, each with the elements:

full_concordance

Concordance when fit on the full data

models

Cox model from full data set fit, stripped of large memory elements

cv_concordance

Cross-validated Concordance

included_variables

Variables included in the model, other than those being selected upon

Examples

library(cforward)

# Candidate variables for forward selection; all must be columns of the data
variables = c("gender",
              "age_years_interview", "education_adult")

res = cforward(nhanes_example,
               event_time = "event_time_years",
               event_status = "mortstat",
               weight_column = "WTMEC4YR_norm",
               variables = variables,
               included_variables = NULL,
               n_folds = 5,
               c_threshold = 0.02,
               seed = 1989,
               max_model_size = 50,
               verbose = TRUE)
conc = sapply(res, `[[`, "best_concordance")



res = cforward(nhanes_example,
               event_time = "event_time_years",
               event_status = "mortstat",
               weight_column = "WTMEC4YR_norm",
               variables = variables,
               included_variables = NULL,
               n_folds = 5,
               seed = 1989,
               max_model_size = 50,
               verbose = TRUE)
conc = sapply(res, `[[`, "best_concordance")
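# Keep variables whose addition raised the concordance by more than the
# threshold; the first selected variable is always kept
threshold = 0.01
included_variables = names(conc)[c(1, diff(conc)) > threshold]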

new_variables = c("diabetes", "stroke")
second_level = cforward(nhanes_example,
               event_time = "event_time_years",
               event_status = "mortstat",
               weight_column = "WTMEC4YR_norm",
               variables = new_variables,
               included_variables = included_variables,
               n_folds = 5,
               seed = 1989,
               max_model_size = 50,
               verbose = TRUE)
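second_conc = sapply(second_level, `[[`, "best_concordance")
# Take the selection step with the highest best concordance, then the model
# within that step with the highest cross-validated concordance
result = second_level[[which.max(second_conc)]]
final_model = result$models[[which.max(result$cv_concordance)]]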

cforward documentation built on March 29, 2021, 5:07 p.m.