purposeful_step_2: Purposeful selection step #2

View source: R/purposeful-step-2.R

purposeful_step_2 R Documentation

Purposeful selection step #2

Description

Fit a multivariable model and assess the importance of each covariate, with the goal of arriving at a smaller, reduced model. Start with all covariates identified for inclusion in Step #1. The reduced model includes only covariates whose p-values fall below the 0.05 significance cutoff or that have strong clinical reasons to stay in the model. A partial likelihood ratio test compares the full model with the reduced model. The reduced model is fitted to the same data set as the full model.
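
The procedure described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's internal implementation: it assumes a binary outcome fitted with glm(family = binomial), Wald p-values as the significance measure, and numeric covariates only (so coefficient names match variable names). The variables x1, x2, x3 and the data frame dat are hypothetical.

```r
# Hypothetical simulated data: x1 drives the outcome, x2 and x3 are noise.
set.seed(1)
n <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.2 * x1))
dat <- data.frame(y, x1, x2, x3)

# Full model: every covariate carried over from Step #1.
full_fit <- glm(y ~ x1 + x2 + x3, data = dat, family = binomial)

# Wald p-values for each covariate (intercept row dropped).
p_vals <- coef(summary(full_fit))[-1, "Pr(>|z|)"]

# Keep covariates below the 0.05 cutoff, plus any retained for clinical reasons.
keep_in_mod <- "x3"  # assumption: x3 stays regardless of significance
reduced_vars <- union(names(p_vals)[p_vals < 0.05], keep_in_mod)

# Reduced model, fitted to the same data set as the full model.
reduced_fit <- glm(reformulate(reduced_vars, response = "y"),
                   data = dat, family = binomial)

# Likelihood ratio test comparing the reduced model with the full model.
lrt <- anova(reduced_fit, full_fit, test = "Chisq")
print(lrt)
```

A large p-value in the test suggests the dropped covariates add little to the fit. With factor covariates the coefficient names no longer match the variable names, so the selection step would need a per-variable test instead.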

Usage

purposeful_step_2(
  data,
  outcome,
  predictors,
  keep_in_mod = NULL,
  ref_level = NULL,
  format = FALSE,
  conf_level = 0.95,
  exponentiate = TRUE,
  digits = 2,
  ...
)

Arguments

data

A tibble or data frame with the full data set.

outcome

Character string. The dependent variable (outcome) for logistic regression.

predictors

Character vector. Independent variables (predictors/covariates) for univariable and/or multivariable modelling.

keep_in_mod

Character vector. Variables with strong clinical reasons to stay in the model. These will appear in both the full and reduced model regardless of statistical significance.

ref_level

Character string. The factor level of the outcome variable that corresponds to the true condition (1). If not provided, the default is NULL and the model fit determines the reference level.

format

Display format, reserved for escaping special characters in the output. Currently a placeholder for possible future use. This description originally gave the default as "html", but the usage above sets it to FALSE.

conf_level

The confidence level to use for the confidence interval. Must be strictly greater than 0 and less than 1. Defaults to 0.95, which corresponds to a 95 percent confidence interval.

exponentiate

Logical indicating whether or not to exponentiate the coefficient estimates. This is typical for logistic and multinomial regressions, but a bad idea if there is no log or logit link. Defaults to TRUE.

digits

Integer. Number of decimal places to round to.

...

Additional arguments.
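
To illustrate what conf_level, exponentiate, and digits control, here is a minimal base R sketch. It assumes Wald intervals computed on the log-odds scale; whether the package uses Wald or profile-likelihood intervals is not stated here. The objects status, x, fit, and or_table are hypothetical.

```r
# Hypothetical data: a two-level factor outcome and one numeric covariate.
set.seed(2)
n <- 300
x <- rnorm(n)
status <- factor(rbinom(n, size = 1, prob = plogis(0.8 * x)),
                 labels = c("alive", "dead"))

# For a factor response, glm() treats the first level as failure and
# models the probability of the remaining level ("dead" here).
levels(status)

fit <- glm(status ~ x, family = binomial)

conf_level <- 0.95
z_crit <- qnorm(1 - (1 - conf_level) / 2)

est <- coef(fit)
se <- sqrt(diag(vcov(fit)))

# Wald intervals on the log-odds scale, then exponentiated to odds ratios.
or_table <- data.frame(
  term = names(est),
  odds_ratio = exp(est),
  conf_low = exp(est - z_crit * se),
  conf_high = exp(est + z_crit * se),
  row.names = NULL
)

# Round the numeric columns, mirroring the digits argument.
digits <- 2
or_table[-1] <- round(or_table[-1], digits)
print(or_table)
```

Exponentiating only makes sense here because the binomial family uses a logit link, so each exponentiated coefficient is an odds ratio.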

Value

A list with:

  • Covariates included in the full model

  • Covariates included in the reduced model

  • Model results for the full model

  • Model results for the reduced model

  • Partial likelihood ratio test results

References

Hosmer DW, Lemeshow S (2000) Applied Logistic Regression. John Wiley & Sons, Inc.

Examples

library(dplyr)


#### Sample data set --------------------------------

set.seed(888)
age <- abs(round(rnorm(n = 1000, mean = 67, sd = 14)))
lac <- abs(round(rnorm(n = 1000, mean = 5, sd = 3), 1))
gender <- factor(rbinom(n = 1000, size = 1, prob = 0.6),
                 labels = c("male", "female"))
wbc <- abs(round(rnorm(n = 1000, mean = 10, sd = 3), 1))
hb <- abs(round(rnorm(n = 1000, mean = 120, sd = 40)))
z <- 0.1 * age - 0.02 * hb + lac - 10
pr <- 1 / (1 + exp(-z))
# y is not used below; mort is drawn separately from the same probabilities
y <- rbinom(1000, 1, pr)
mort <- factor(rbinom(1000, 1, pr),
               labels = c("alive", "dead"))
data <- tibble::tibble(age, gender, lac, wbc, hb, mort)

#### Example 1 --------------------------------

purposeful_step_2(data = data,
                  outcome = "mort",
                  predictors = c("age", "gender", "hb", "lac", "wbc"))

#### Example 2 --------------------------------

purposeful_step_2(data = data,
                  outcome = "mort",
                  predictors = c("age", "gender", "hb", "lac"),
                  keep_in_mod = "wbc")



emilelatour/purposeful documentation built on Jan. 6, 2023, 8:04 a.m.