knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(knitr)
library(kableExtra)
library(qacr)
df <- data.frame(sex=c(1,2,1,2,2,2),
                 race=c("b", "w", "a", "b", "w", "h"),
                 outcome=c("better", "worse", "same", "same", "better", "worse"),
                 Q1=c(20, 30, 44, 15, 50, 99),
                 Q2=c(15, 23, 18, 86, 99, 35),
                 age=c(12, 20, 33, 55, 30, 100),
                 rating=c(1,2,5,3,4,5))

The recodes() function makes it easy to recode one or more variables in a data frame. The general format is

newdata <- recodes(data=olddata, vars=variables, from=from_values, to=to_values)
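To make the from/to pairing concrete, here is a minimal base R sketch of the same idea for a single vector. The helper recode_values() is hypothetical and written only for illustration; it is not how recodes() is implemented, and it handles plain values only (no expressions).

```r
# Hypothetical helper (base R only), NOT the qacr implementation.
# Each value found in `from` is replaced by the value at the same position in `to`.
recode_values <- function(x, from, to) {
  to  <- rep_len(to, length(from))       # a single `to` value is reused for every `from`
  idx <- match(x, from)                  # position of each element of x in `from` (NA if absent)
  hit <- !is.na(idx)                     # elements that have a replacement
  if (is.character(to)) x <- as.character(x)
  x[hit] <- to[idx[hit]]
  x                                      # values not listed in `from` are left alone
}

recode_values(c(1, 2, 1), from = c(1, 2), to = c("Male", "Female"))
#> [1] "Male"   "Female" "Male"

recode_values(c(20, 99, 86), from = c(86, 99), to = NA)
#> [1] 20 NA NA
```

The first value in from goes with the first value in to, the second with the second, and so on.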

Original dataset

Consider the following data set. Let's make the changes described in the sections below.

kbl(df) %>% kable_styling(bootstrap_options = "striped", full_width = F, position = "left")

sex

For sex, set 1 to "Male" and 2 to "Female".

df <- recodes(data=df, vars="sex", 
               from=c(1,2), to=c("Male", "Female"))
kbl(df) %>% kable_styling(bootstrap_options = "striped", full_width = F, position = "left")

race

Recode race to "White" vs. "Other".

df <- recodes(data=df, vars="race", 
              from=c("w", "b", "a", "h"), 
              to=c("White", "Other", "Other", "Other"))
kbl(df) %>% kable_styling(bootstrap_options = "striped", full_width = F, position = "left")

outcome

Recode outcome to 1 (better) vs. 0 (not better).

df <- recodes(data=df, vars="outcome", 
              from=c("better", "same", "worse"), 
              to=c(1, 0, 0))
kbl(df) %>% kable_styling(bootstrap_options = "striped", full_width = F, position = "left")

Q1 and Q2

For Q1 and Q2 set values of 86 and 99 to missing.

df <- recodes(data=df, vars=c("Q1", "Q2"), 
              from=c(86, 99), to=NA)
kbl(df) %>% kable_styling(bootstrap_options = "striped", full_width = F, position = "left")

age

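For age, set values below 20 or above 90 to missing. Recode 20 to 30 as "Younger", above 30 up to 50 as "Middle Aged", and above 50 up to 90 as "Older".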

You can use expressions in the from argument. When an expression is TRUE, the corresponding to value is applied. The dollar sign ($) stands for the variable being recoded (age in this case). The symbols | and & mean OR and AND, respectively.

df <- recodes(data=df, vars="age", 
              from=c("$ <   20 | $ >  90", 
                     "$ >=  20 & $ <= 30",
                     "$ >   30 & $ <= 50",
                     "$ >   50 & $ <= 90"), 
              to=c(NA, "Younger", "Middle Aged", "Older"))

We can also write this more compactly (as an alternative to the call above, not in addition to it):

df <- recodes(data=df, vars="age", 
              from=c("$ < 20", "$ <= 30", "$ <= 50", "$ <= 90", "$ > 90"), 
              to=c(NA, "Younger", "Middle Aged", "Older", NA))

This works because the criteria are checked from left to right. Once the age value for an observation meets a criterion (the expression is TRUE), it is recoded. It isn't changed again by later criteria in the same recodes() statement.
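The "first TRUE criterion wins" rule can be sketched in base R as follows. This is only an illustration of the logic described above, under the assumption that each criterion is an ordinary function of the variable; the names conditions, labels, and settled are made up for this sketch, and it is not the code recodes() runs internally.

```r
# Base R sketch of left-to-right matching; not qacr's actual code.
age <- c(12, 20, 33, 55, 30, 100)

# Criteria in the same order as the compact recodes() call;
# `x` plays the role of `$`.
conditions <- list(
  function(x) x < 20,
  function(x) x <= 30,
  function(x) x <= 50,
  function(x) x <= 90,
  function(x) x > 90
)
labels <- c(NA, "Younger", "Middle Aged", "Older", NA)

result  <- rep(NA_character_, length(age))
settled <- rep(FALSE, length(age))   # TRUE once a value has been recoded

for (i in seq_along(conditions)) {
  hit <- conditions[[i]](age) & !settled   # skip values that were already recoded
  result[hit]  <- labels[i]
  settled[hit] <- TRUE
}

result
#> [1] NA            "Younger"     "Middle Aged" "Older"       "Younger"     NA
```

An age of 12 satisfies every one of the first four criteria, but only the first (below 20) is applied, so it ends up missing rather than "Older".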

kbl(df) %>% kable_styling(bootstrap_options = "striped", full_width = F, position = "left")

rating

Finally, for the rating variable, reverse the scoring so that 1 to 5 becomes 5 to 1.

df <- recodes(data=df, vars="rating", from=1:5, to=5:1)
kbl(df) %>% kable_styling(bootstrap_options = "striped", full_width = F, position = "left")
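As a quick cross-check, reverse scoring a scale that runs from a known minimum to a known maximum is the same as subtracting each value from (minimum + maximum). A base R sketch using the original rating values (the names lo, hi, and reversed are just for this example):

```r
# Arithmetic cross-check for reverse scoring a 1-to-5 scale.
rating <- c(1, 2, 5, 3, 4, 5)   # original values from the example data
lo <- 1                         # lowest possible score
hi <- 5                         # highest possible score

reversed <- (lo + hi) - rating  # 1 -> 5, 2 -> 4, 3 -> 3, 4 -> 2, 5 -> 1
reversed
#> [1] 5 4 1 3 2 1
```

The from=1:5, to=5:1 form says the same thing without any arithmetic, and it also works for codes that are not evenly spaced numbers.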

Note

Remember that recodes returns a data frame, not a variable.

This allows you to apply the same recoding scheme to more than one variable at a time (e.g., Q1 and Q2 above).
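For comparison, here is roughly what the Q1 and Q2 step looks like in base R. It is only a sketch for illustration, not a replacement for recodes(). Notice that here too the result is stored back into the data frame as a whole, not into a single column.

```r
# Base R comparison: set 86 and 99 to missing in several columns at once.
df2 <- data.frame(Q1 = c(20, 30, 44, 15, 50, 99),
                  Q2 = c(15, 23, 18, 86, 99, 35))

vars <- c("Q1", "Q2")
df2[vars] <- lapply(df2[vars], function(x) replace(x, x %in% c(86, 99), NA))

df2
#>   Q1 Q2
#> 1 20 15
#> 2 30 23
#> 3 44 18
#> 4 15 NA
#> 5 50 NA
#> 6 NA 35
```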

And that's it (APPLAUSE, APPLAUSE)!



Rkabacoff/qacr documentation built on March 20, 2021, 3:03 p.m.