knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
library(knitr)
library(kableExtra)
library(qacBase)
df <- data.frame(sex = c(1, 2, 1, 2, 2, 2),
                 race = c("b", "w", "a", "b", "w", "h"),
                 outcome = c("better", "worse", "same", "same", "better", "worse"),
                 Q1 = c(20, 30, 44, 15, 50, 99),
                 Q2 = c(15, 23, 18, 86, 99, 35),
                 age = c(12, 20, 33, 55, 30, 100),
                 rating = c(1, 2, 5, 3, 4, 5))
The recodes() function makes it easy to recode one or more variables in your data frame. The format is
newdata <- recodes(data = olddata, vars = variables, from = from_values, to = to_values)
Consider the following data set. Let's make the following changes.
kbl(df) %>% kable_styling(bootstrap_options = "striped", full_width = F, position = "left")
For sex, set 1 to "Male" and 2 to "Female".
df <- recodes(data=df, vars="sex", from=c(1,2), to=c("Male", "Female"))
kbl(df) %>% kable_styling(bootstrap_options = "striped", full_width = F, position = "left")
Recode race to "White" vs. "Other".
df <- recodes(data=df, vars="race", from=c("w", "b", "a", "h"), to=c("White", "Other", "Other", "Other"))
kbl(df) %>% kable_styling(bootstrap_options = "striped", full_width = F, position = "left")
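For comparison, the same many-to-one mapping can be sketched in base R with a named vector used as a lookup table. This is only an illustration of the from/to idea, not how recodes() is implemented; the names race and race_lookup are hypothetical and chosen for this example.

```r
# Base R sketch: a named vector acts as the from -> to lookup table.
race <- c("b", "w", "a", "b", "w", "h")   # same codes as the original race column

# Names are the "from" codes, elements are the "to" labels (hypothetical name).
race_lookup <- c(w = "White", b = "Other", a = "Other", h = "Other")

# Index the lookup by the old codes, then drop the names it carries along.
race_recoded <- unname(race_lookup[race])
print(race_recoded)
```

The lookup approach handles one variable at a time, whereas recodes() takes the data frame and a vector of variable names.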
Recode outcome to 1 (better) vs. 0 (not better).
df <- recodes(data=df, vars="outcome", from=c("better", "same", "worse"), to=c(1, 0, 0))
kbl(df) %>% kable_styling(bootstrap_options = "striped", full_width = F, position = "left")
For Q1 and Q2, set values of 86 and 99 to missing.
df <- recodes(data=df, vars=c("Q1", "Q2"), from=c(86, 99), to=NA)
kbl(df) %>% kable_styling(bootstrap_options = "striped", full_width = F, position = "left")
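As a cross-check, roughly the same cleanup can be written in base R. The sketch below assumes a small stand-in data frame called survey (a hypothetical name) holding the original Q1 and Q2 values; it is not the qacBase implementation.

```r
# Base R sketch: set sentinel codes to NA in several columns at once.
survey <- data.frame(Q1 = c(20, 30, 44, 15, 50, 99),
                     Q2 = c(15, 23, 18, 86, 99, 35))

missing_codes <- c(86, 99)       # values to treat as missing
target_cols   <- c("Q1", "Q2")   # columns to clean

# Apply the same replacement to every target column.
survey[target_cols] <- lapply(survey[target_cols], function(x) {
  replace(x, x %in% missing_codes, NA)
})
print(survey)
```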
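For age, set values below 20 or above 90 to missing, values from 20 to 30 to "Younger", values above 30 and up to 50 to "Middle Aged", and values above 50 and up to 90 to "Older".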
You can use expressions in your from fields. When an expression is TRUE, the corresponding to value is applied. We will use the dollar sign ($) to represent the variable (age in this case). The symbols | and & mean OR and AND, respectively.
df <- recodes(data=df, vars="age", from=c("$ < 20 | $ > 90", "$ >= 20 & $ <= 30", "$ > 30 & $ <= 50", "$ > 50 & $ <= 90"), to=c(NA, "Younger", "Middle Aged", "Older"))
We can also write this as
df <- recodes(data=df, vars="age", from=c("$ < 20", "$ <= 30", "$ <= 50", "$ <= 90", "$ > 90"), to=c(NA, "Younger", "Middle Aged", "Older", NA))
This works because the criteria are checked from left to right: once the age value for an observation meets a criterion (that is, the expression is TRUE), it is recoded. It isn't changed again by later criteria in the same recodes statement.
kbl(df) %>% kable_styling(bootstrap_options = "striped", full_width = F, position = "left")
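The left-to-right rule can be illustrated with a small base R sketch. The helper recode_first_match() is hypothetical and written only for this example; it assumes the "$" placeholder is swapped for the variable before each expression is evaluated, which may differ from what recodes() does internally.

```r
# Sketch only: mimics "first TRUE criterion wins" in base R.
# recode_first_match() is a hypothetical helper, not part of qacBase.
recode_first_match <- function(x, from, to) {
  out  <- as.character(x)            # unmatched values are kept (as text) in this sketch
  done <- rep(FALSE, length(x))      # TRUE once a value has been recoded
  for (i in seq_along(from)) {
    # Replace the "$" placeholder with the variable name, then evaluate.
    expr <- gsub("$", "x", from[i], fixed = TRUE)
    hit  <- eval(parse(text = expr), list(x = x))
    hit  <- !is.na(hit) & hit & !done   # skip values already recoded
    out[hit]  <- to[i]
    done[hit] <- TRUE
  }
  out
}

age <- c(12, 20, 33, 55, 30, 100)    # same values as the original age column
age_group <- recode_first_match(
  age,
  from = c("$ < 20", "$ <= 30", "$ <= 50", "$ <= 90", "$ > 90"),
  to   = c(NA, "Younger", "Middle Aged", "Older", NA)
)
print(age_group)
```

Because of the done flag, an age of 12 is set to missing by the first criterion and is not relabeled "Younger" by the second, even though 12 <= 30 is also TRUE.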
Finally, for the rating variable, reverse the scoring so that 1 to 5 becomes 5 to 1.
df <- recodes(data=df, vars="rating", from=1:5, to=5:1)
kbl(df) %>% kable_styling(bootstrap_options = "striped", full_width = F, position = "left")
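For a numeric scale, the reversal can also be checked with simple arithmetic: new = (min + max) - old. The sketch below uses base R only and assumes the scale runs from 1 to 5; the variable names are hypothetical.

```r
# Reverse-score a 1-5 scale with arithmetic: new = (min + max) - old.
rating <- c(1, 2, 5, 3, 4, 5)   # same values as the original rating column
scale_min <- 1
scale_max <- 5

rating_reversed <- (scale_min + scale_max) - rating
print(rating_reversed)
```

The arithmetic form only works for evenly spaced numeric scales; an explicit from/to mapping also covers labels and irregular codes.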
Remember that recodes returns a data frame, not a variable.
df <- recodes(data=df, vars="rating", from=1:5, to=5:1)
is correct.
df$rating <- recodes(data=df, vars="rating", from=1:5, to=5:1)
is not.
This allows you to apply the same recoding scheme to more than one variable at a time (e.g., Q1 and Q2 above).
And that's it (APPLAUSE, APPLAUSE)!