check_recode | R Documentation |
This function was written a few days after a paper in JAMA was retracted because of an error in recoding the treatment variable (https://jamanetwork.com/journals/jama/fullarticle/2752474). It takes a data frame or tibble, fuzzy-matches variable names, and produces crosstables of all matched variables. Visual inspection of these crosstables should reveal any miscoding.
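The idea described above can be sketched in base R. This is only an illustration and not the finalfit implementation: the helper name `sketch_check_recode`, its `max_distance` argument, and the use of `agrep()` for the fuzzy match are assumptions made for the example, and the real function's matching rules and output format may differ.

```r
# Hypothetical helper for illustration only; not part of finalfit.
# Assumes approximate matching of column names with base R's agrep().
sketch_check_recode <- function(df, max_distance = 0.1) {
  vars <- names(df)

  # For each variable name, collect the other names that approximately
  # contain it (e.g. "sex" is found inside "sex_recode").
  pair_list <- lapply(vars, function(v) {
    hits <- agrep(v, vars, max.distance = max_distance, value = TRUE)
    hits <- setdiff(hits, v)  # drop the trivial self-match
    if (length(hits) == 0) return(NULL)
    data.frame(var1 = v, var2 = hits, stringsAsFactors = FALSE)
  })
  pair_list <- Filter(Negate(is.null), pair_list)

  if (length(pair_list) == 0) {
    empty <- data.frame(var1 = character(0), var2 = character(0),
                        stringsAsFactors = FALSE)
    return(list(index = empty, counts = list()))
  }
  index <- do.call(rbind, pair_list)

  # Cross-tabulate each matched pair, keeping missing values visible.
  counts <- Map(function(a, b) {
    table(df[[a]], df[[b]], dnn = c(a, b), useNA = "ifany")
  }, index$var1, index$var2)

  list(index = index, counts = unname(counts))
}

# Toy data with one deliberately miscoded row (row 3: Male recoded as "F").
toy <- data.frame(
  sex        = c("Male", "Female", "Male", "Female"),
  sex_recode = c("M", "F", "F", "F"),
  outcome    = c(1, 0, 1, 0),
  stringsAsFactors = FALSE
)
res <- sketch_check_recode(toy)
res$index        # which variable pairs were matched
res$counts[[1]]  # off-diagonal counts point to the miscoded row
```

In the crosstable, any count in a cell that pairs an original level with an unexpected recoded level is a candidate miscode to inspect by eye.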
check_recode(
.data,
dependent = NULL,
explanatory = NULL,
include_numerics = TRUE,
...
)
.data |
Data frame or tibble. |
dependent |
Optional character vector: name(s) of dependent variable(s). |
explanatory |
Optional character vector: name(s) of explanatory variable(s). |
include_numerics |
Logical. Include numeric variables in function. |
... |
Pass other arguments to |
List of length two. The first element is an index of variable combinations. The second is a nested list of crosstables as tibbles.
library(dplyr)
data(colon_s)
colon_s_small = colon_s %>%
select(-id, -rx, -rx.factor) %>%
mutate(
age.factor2 = forcats::fct_collapse(age.factor,
"<60 years" = c("<40 years", "40-59 years")),
sex.factor2 = forcats::fct_recode(sex.factor,
# Intentional miscode
"F" = "Male",
"M" = "Female")
)
# Check
colon_s_small %>%
check_recode(include_numerics = FALSE)
out = colon_s_small %>%
select(-extent, -extent.factor, -time, -time.years) %>%
check_recode()
out
# Select a tibble and expand
out$counts[[9]]
# Note this variable (node4) appears miscoded in original dataset survival::colon.
# Choose to only include variables that you actually use.
# This uses standard Finalfit grammar.
dependent = "mort_5yr"
explanatory = c("age.factor2", "sex.factor2")
colon_s_small %>%
check_recode(dependent, explanatory)