knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
library(knitr)
library(kableExtra)
library(qacr)
df <- data.frame(sex     = c(1, 2, 1, 2, 2, 2),
                 race    = c("b", "w", "a", "b", "w", "h"),
                 outcome = c("better", "worse", "same", "same", "better", "worse"),
                 Q1      = c(20, 30, 44, 15, 50, 99),
                 Q2      = c(15, 23, 18, 86, 99, 35),
                 age     = c(12, 20, 33, 55, 30, 100),
                 rating  = c(1, 2, 5, 3, 4, 5))
The recodes() function makes it easy to recode one or more variables in your data frame. The format is
newdata <- recodes(olddata, variables, from values, to values)
Consider the following data set. Let's make the following changes.
kbl(df) %>% kable_styling(bootstrap_options = "striped", full_width = F, position = "left")
For sex, set 1 to "Male" and 2 to "Female".
df <- recodes(data=df, vars="sex", from=c(1,2), to=c("Male", "Female"))
kbl(df) %>% kable_styling(bootstrap_options = "striped", full_width = F, position = "left")
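The value-for-value lookup behind a recode like this one can be sketched in base R. This is only an illustration of the idea, not qacr's implementation; `recode_values` is a hypothetical helper name, and the sketch assumes every `from` value maps to the `to` value in the same position.

```r
# Base-R sketch of a value-for-value recode (hypothetical helper, not part of qacr).
recode_values <- function(x, from, to) {
  to  <- rep_len(to, length(from))  # recycle a single `to` value such as NA
  idx <- match(x, from)             # position of each value in `from`; NA if absent
  hit <- !is.na(idx)
  out <- x
  out[hit] <- to[idx[hit]]          # values not listed in `from` are left unchanged
  out
}

sex <- c(1, 2, 1, 2, 2, 2)
sex_labels <- recode_values(sex, from = c(1, 2), to = c("Male", "Female"))
print(sex_labels)
```

Because the helper returns a vector, it would be assigned to a single column, unlike recodes, which takes and returns the whole data frame.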
Recode race to "White" vs. "Other".
df <- recodes(data=df, vars="race", from=c("w", "b", "a", "h"), to=c("White", "Other", "Other", "Other"))
kbl(df) %>% kable_styling(bootstrap_options = "striped", full_width = F, position = "left")
Recode outcome to 1 (better) vs. 0 (not better).
df <- recodes(data=df, vars="outcome", from=c("better", "same", "worse"), to=c(1, 0, 0))
kbl(df) %>% kable_styling(bootstrap_options = "striped", full_width = F, position = "left")
For Q1 and Q2, set values of 86 and 99 to missing.
df <- recodes(data=df, vars=c("Q1", "Q2"), from=c(86, 99), to=NA)
kbl(df) %>% kable_styling(bootstrap_options = "striped", full_width = F, position = "left")
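For comparison, the same "several columns, one scheme" recode can be written in base R. This is a sketch of the equivalent operation rather than of what recodes does internally; the `scores` data frame simply copies the original Q1 and Q2 values from the example data.

```r
# Base-R sketch: set the codes 86 and 99 to missing in several columns at once.
scores <- data.frame(Q1 = c(20, 30, 44, 15, 50, 99),
                     Q2 = c(15, 23, 18, 86, 99, 35))

cols <- c("Q1", "Q2")
scores[cols] <- lapply(scores[cols], function(x) {
  replace(x, x %in% c(86, 99), NA)  # NA wherever the value is one of the codes
})
print(scores)
```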
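For age, set values less than 20 or greater than 90 to missing, values from 20 to 30 to "Younger", values above 30 and up to 50 to "Middle Aged", and values above 50 and up to 90 to "Older".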
You can use expressions in the from argument. When an expression is TRUE, the corresponding to value is applied. We will use the dollar sign ($) to represent the variable (age in this case). The symbols | and & mean OR and AND, respectively.
df <- recodes(data=df, vars="age",
              from=c("$ < 20 | $ > 90", "$ >= 20 & $ <= 30", "$ > 30 & $ <= 50", "$ > 50 & $ <= 90"),
              to=c(NA, "Younger", "Middle Aged", "Older"))
We can also write this as
df <- recodes(data=df, vars="age",
              from=c("$ < 20", "$ <= 30", "$ <= 50", "$ <= 90", "$ > 90"),
              to=c(NA, "Younger", "Middle Aged", "Older", NA))
This works because the criteria are checked from left to right: once the age value for an observation meets a criterion (that is, the expression is TRUE), it is recoded. It isn't changed again by later criteria in the same recodes statement.
kbl(df) %>% kable_styling(bootstrap_options = "striped", full_width = F, position = "left")
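The left-to-right, first-match-wins rule can be illustrated with a short base-R loop. This is a sketch under assumptions, not qacr's actual code: it assumes the `$` placeholder is replaced with the variable name before each condition is evaluated, and it tracks recoded observations in a hypothetical `done` vector.

```r
# Base-R sketch of ordered, first-match-wins recoding (not qacr's implementation).
age <- c(12, 20, 33, 55, 30, 100)

from <- c("$ < 20", "$ <= 30", "$ <= 50", "$ <= 90", "$ > 90")
to   <- c(NA, "Younger", "Middle Aged", "Older", NA)

age_group <- rep(NA_character_, length(age))
done <- rep(FALSE, length(age))  # TRUE once an observation has been recoded

for (i in seq_along(from)) {
  # Replace the `$` placeholder with the vector name, then evaluate the condition.
  cond <- eval(parse(text = gsub("$", "age", from[i], fixed = TRUE)))
  hit  <- cond & !done           # only observations not matched by an earlier rule
  age_group[hit] <- to[i]
  done[hit] <- TRUE
}
print(age_group)
```

Without the `done` guard, an age of 12 would satisfy every condition up to "$ <= 90" and end up labelled "Older", which is why the ordering rule matters for the shorter form.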
Finally, for the rating variable, reverse the scoring so that 1 to 5 becomes 5 to 1.
df <- recodes(data=df, vars="rating", from=1:5, to=5:1)
kbl(df) %>% kable_styling(bootstrap_options = "striped", full_width = F, position = "left")
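As a quick sanity check on reverse scoring, the lookup from 1:5 to 5:1 gives the same result as subtracting each score from the scale minimum plus the scale maximum. The base-R sketch below uses the original rating values from the example data and does not call recodes.

```r
# Base-R sketch: two equivalent ways to reverse a 1-to-5 scale.
rating <- c(1, 2, 5, 3, 4, 5)

reversed_lookup <- (5:1)[match(rating, 1:5)]  # look each score up in the reversed scale
reversed_arith  <- (1 + 5) - rating           # minimum + maximum - score

print(reversed_lookup)
print(all(reversed_lookup == reversed_arith))
```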
Remember that recodes returns a data frame, not a variable.
df <- recodes(data=df, vars="rating", from=1:5, to=5:1)
is correct.
df$rating <- recodes(data=df, vars="rating", from=1:5, to=5:1)
is not.
This allows you to apply the same recoding scheme to more than one variable at a time (e.g., Q1 and Q2 above).
And that's it (APPLAUSE, APPLAUSE)!