View source: R/check-low-freq.R
check_low_freq | R Documentation |
Low counts in some categories or levels of a variable can cause problems in logistic regression, such as unstable coefficient estimates with very wide confidence intervals. This function helps identify low counts that may be problematic. As a rule of thumb, consider collapsing categories that contain <= 1.0% of your data.
check_low_freq(fit, data, threshold = 0.01)
check_low_freq2(data, outcome, predictors, threshold = 0.01)
fit | An object of class |
data | A tibble or data frame with the full data set. |
threshold | Proportion of the data used as the cutoff for flagging low-frequency categories. Default is 0.01 (1%). |
outcome | Character string. The dependent variable (outcome) for logistic regression. |
predictors | Character vector. Independent variables (predictors/covariates) for univariable and/or multivariable modelling. |
A tibble
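To make the idea of flagging concrete, here is a minimal base-R sketch of the kind of check described above. It is not the package's implementation: `flag_low_freq()` and its output columns are hypothetical names chosen for illustration, and the sketch assumes a level is flagged when its share of non-missing rows is at or below `threshold`.

```r
# Hypothetical helper, not part of the package: report each level of a
# categorical variable with its count, its proportion, and a flag that is
# TRUE when the proportion is at or below `threshold`.
flag_low_freq <- function(x, threshold = 0.01) {
  counts <- table(x, useNA = "no")           # counts per level, NAs dropped
  props <- as.numeric(counts) / sum(counts)  # share of non-missing rows
  data.frame(
    level = names(counts),
    n = as.integer(counts),
    prop = props,
    low = props <= threshold,
    stringsAsFactors = FALSE
  )
}

# `infert` ships with base R (package datasets)
flag_low_freq(infert$education, threshold = 0.05)
```

Running the helper once per predictor gives a rough manual version of the check for categorical variables; continuous predictors such as `bmi` would need to be binned first or skipped.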
## Not run:
library(epiDisplay)
library(dplyr)
dplyr::glimpse(infert)
model0 <- glm(case ~ induced + spontaneous + education,
              family = binomial,
              data = infert)
summary(model0)
check_low_freq(fit = model0,
               data = infert)
check_low_freq(fit = model0,
               data = infert,
               threshold = 0.05)
check_low_freq2(data = infert,
                outcome = "case",
                predictors = c("induced", "spontaneous", "education"),
                threshold = 0.05)
#### Another data set --------------------------------
library(compareGroups)
data(predimed)
dplyr::glimpse(predimed)
predimed <- predimed %>%
  mutate_if(is.double, as.double)
fit <- glm(htn ~ sex + bmi + smoke,
           family = binomial(link = "logit"),
           data = predimed)
check_low_freq(fit = fit,
               data = predimed)
check_low_freq(fit = fit,
               data = predimed,
               threshold = 0.05)
check_low_freq2(data = predimed,
                outcome = "htn",
                predictors = c("sex", "bmi", "smoke"),
                threshold = 0.01)
check_low_freq2(data = predimed,
                outcome = "htn",
                predictors = c("sex", "bmi", "smoke"),
                threshold = 0.05)
## End(Not run)
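Once low-frequency categories have been identified, the rule of thumb in the description suggests collapsing them. The following is one possible base-R sketch of that step, not something the package provides: `collapse_low_freq()`, its `other` argument and the `"Other"` label are hypothetical, and the sketch assumes the same "at or below `threshold`" cutoff.

```r
# Hypothetical helper, not part of the package: merge every level whose
# share of non-missing rows is at or below `threshold` into a single level.
collapse_low_freq <- function(x, threshold = 0.01, other = "Other") {
  x <- as.factor(x)
  props <- table(x) / sum(!is.na(x))    # proportion per level, NAs ignored
  rare <- names(props)[props <= threshold]
  # Assigning the same name to several levels merges them into one
  levels(x)[levels(x) %in% rare] <- other
  x
}

# `infert` ships with base R (package datasets)
education2 <- collapse_low_freq(infert$education, threshold = 0.05)
table(education2)
```

For ordered categories such as education, merging a sparse level into its neighbouring level is often easier to interpret than a generic catch-all label, so treat the relabelling here as a starting point.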