
rec_with_table

Recode with Table

Recoding variables is a common research task that is time-consuming and error-prone. RecWTable complements sjmisc::rec() and other recoding methods with the following features:

Syntax for variable recoding

The recode syntax follows that of sjmisc::rec(), except where noted in this section.

The recode pattern, i.e. which new values should replace the old values, is defined using the rec variable in the variable_details data.frame. This argument has a specific "syntax":

the pairs are obtained from the RecFrom and RecTo columns
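As a minimal sketch of how such pairs could drive a recode, the example below uses base R only. Only the RecFrom and RecTo column names come from the text above; the helper recode_exact() and the sample values are hypothetical and are not part of bllflow.

```r
# Hypothetical details table: RecFrom/RecTo are the documented column names,
# the values are made up for illustration.
variable_details <- data.frame(
  RecFrom = c("1", "2", "3"),
  RecTo   = c("10", "20", "20"),
  stringsAsFactors = FALSE
)

# recode_exact() is an illustrative helper, not a bllflow function.
# It looks up each old value in RecFrom and returns the paired RecTo value.
recode_exact <- function(x, details) {
  idx <- match(as.character(x), details$RecFrom)
  as.numeric(details$RecTo[idx])  # values with no matching pair become NA
}

recoded <- recode_exact(c(1, 2, 3, 4), variable_details)
```

In this sketch a value with no matching RecFrom entry (here, 4) is returned as NA, which is one possible reading of the outOfRange = NA behaviour mentioned in the notes.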

Unlike sjmisc::rec(), intervals can be defined using interval. The default interval is [,), which follows the common math notation in which a closed endpoint is denoted with a square bracket, [ or ], and an open endpoint is denoted with a round bracket, ( or ). A closed interval is an interval that includes all its limit points. For example, [0,1] means greater than or equal to 0 and less than or equal to 1. Under the default interval, "1:2.5=1" recodes any value greater than or equal to 1 and less than 2.5 to the new value 1.
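The interval semantics can be illustrated with a short base R sketch. The helper in_interval() is hypothetical and only mimics the bracket notation described above; it is not the bllflow implementation.

```r
# in_interval() is an illustrative helper, not a bllflow function.
# `interval` uses the bracket notation from the text: "[,)" by default.
in_interval <- function(x, lower, upper, interval = "[,)") {
  # "[" includes the lower limit, "(" excludes it
  lower_ok <- if (substr(interval, 1, 1) == "[") x >= lower else x > lower
  # "]" includes the upper limit, ")" excludes it
  upper_ok <- if (substr(interval, 3, 3) == "]") x <= upper else x < upper
  lower_ok & upper_ok
}

x <- c(0.5, 1, 2, 2.5, 3)

# Equivalent of "1:2.5=1" under the default interval [,):
# 1 and 2 are recoded to 1; 2.5 is excluded because the upper limit is open.
recoded <- ifelse(in_interval(x, 1, 2.5), 1, NA)
```

Passing interval = "[,]" to the same helper would include 2.5 as well.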

[Note from Doug: these descriptors for rev and direct value labels will be modified in our final documentation. They are included here to identify how bllflow differs from sjmisc::rec().]

rev is available in sjmisc, but not in bllflow.

* "rev": "rev" is a special token that reverses the value order.

Direct value labelling is available in sjmisc. In bllflow, value labelling is performed using 'valueLabel' for the corresponding row in the variable_details data.table.

* direct value labelling: Value labels for new values can be assigned inside the recode pattern by writing the value label in square brackets after defining the new value in a recode pair, e.g. rec = "15:30=1 [young aged]; 31:55=2 [middle aged]; 56:max=3 [old aged]"
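To contrast the two approaches, here is a rough sketch of table-driven labelling in base R. The valueLabel, RecFrom and RecTo names come from the text; the table contents reuse the sjmisc example above, and representing labels as factor levels is an assumption made for illustration, since the text does not say how bllflow stores them.

```r
# Hypothetical details rows mirroring the sjmisc example above.
variable_details <- data.frame(
  RecFrom    = c("15:30", "31:55", "56:max"),
  RecTo      = c(1, 2, 3),
  valueLabel = c("young aged", "middle aged", "old aged"),
  stringsAsFactors = FALSE
)

# Values that have already been recoded to 1, 2 or 3.
recoded_age <- c(1, 3, 2, 1)

# Attach the label from the row whose RecTo matches each recoded value.
# Using a factor is an assumption of this sketch.
labelled_age <- factor(
  recoded_age,
  levels = variable_details$RecTo,
  labels = variable_details$valueLabel
)
```

Keeping the label in its own column means the recode pair itself stays free of bracketed text.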


————-

Notes: 1) If a startVariable is not present in the data, should we consider whether it is an intermediary variable, and then transform (recode) that variable? An intermediary variable is a variable that exists as a variable in variableDetails (it is created from variables in the data and then used in a transformation with other variables).

For the initial version, give warnings and errors only. If an intermediary variable is missing: “Error: recoding {variable} requires {start_variable} variable. Recode available in {variable_details}. Suggest first recoding {start_variable} variable, then try again.”

If missing start_variable (altogether): “Error: missing required starting variable(s): {starting_variable}”
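The two error cases in note 1 could be sketched as follows. The function check_start_variable() and the `variable` column used to look up intermediary variables are assumptions made for illustration; only the message wording is taken from the note.

```r
# check_start_variable() is a hypothetical helper, not a bllflow function.
# Assumes variable_details has a `variable` column naming every variable
# that can be created by recoding.
check_start_variable <- function(data, variable, start_variable, variable_details) {
  if (start_variable %in% names(data)) {
    return(invisible(TRUE))
  }
  if (start_variable %in% variable_details$variable) {
    # Intermediary variable: it can be created, but has not been yet.
    stop(sprintf(
      "recoding %s requires %s variable. Recode available in variable_details. Suggest first recoding %s variable, then try again.",
      variable, start_variable, start_variable
    ), call. = FALSE)
  }
  # Not in the data and not defined anywhere in variable_details.
  stop(sprintf("missing required starting variable(s): %s", start_variable),
       call. = FALSE)
}

# Example inputs (made up): bmi is defined in variable_details but absent from the data.
example_data <- data.frame(age = c(25, 40, 61))
example_details <- data.frame(variable = c("bmi_cat", "bmi"), stringsAsFactors = FALSE)

intermediary_msg <- tryCatch(
  check_start_variable(example_data, "bmi_cat", "bmi", example_details),
  error = conditionMessage
)
```

Because stop() adds the "Error:" prefix itself when the condition is printed, the message strings in the sketch omit it.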

2) Check to make sure all possible values are recoded. What should we do if values cannot be recoded? See outOfRange = NA

3) Return an error if any required fields are missing, including: startVariable, type, etc.
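A possible shape for the check in note 3 is sketched below. The helper check_required_fields() is hypothetical, and its default list contains only the two fields named in the note; the full set of required fields is not specified here, so callers would extend `required` as needed.

```r
# check_required_fields() is an illustrative helper, not a bllflow function.
# `required` defaults to the two fields the note names explicitly.
check_required_fields <- function(variable_details,
                                  required = c("startVariable", "type")) {
  missing_fields <- setdiff(required, names(variable_details))
  if (length(missing_fields) > 0) {
    stop(sprintf("missing required field(s): %s",
                 paste(missing_fields, collapse = ", ")),
         call. = FALSE)
  }
  invisible(TRUE)
}

# Example (made up): a details table that lacks the `type` column.
incomplete_details <- data.frame(startVariable = "age", stringsAsFactors = FALSE)
field_msg <- tryCatch(
  check_required_fields(incomplete_details),
  error = conditionMessage
)
```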

4) Log example: “{variable} created from: startVariables. Observations: {n} type: {continuous, factor, etc.} (if continuous:) min: {min}, max: {max}, NA: {n of missing} (if factor:) {n} factors, NA: {n of missing}”
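The log template in note 4 could be produced by something like the sketch below. The function log_recode() is hypothetical. It assumes that "Observations" means the length of the recoded vector including missing values, and it treats any non-factor input as continuous; neither point is settled by the note.

```r
# log_recode() is an illustrative helper, not a bllflow function.
log_recode <- function(variable, start_variables, x) {
  header <- sprintf(
    "%s created from: %s. Observations: %d",
    variable, paste(start_variables, collapse = ", "), length(x)
  )
  n_missing <- sum(is.na(x))
  if (is.factor(x)) {
    detail <- sprintf("type: factor %d factors, NA: %d", nlevels(x), n_missing)
  } else {
    # Anything that is not a factor is reported as continuous in this sketch.
    detail <- sprintf(
      "type: continuous min: %s, max: %s, NA: %d",
      format(min(x, na.rm = TRUE)), format(max(x, na.rm = TRUE)), n_missing
    )
  }
  paste(header, detail)
}

# Example values are made up.
continuous_log <- log_recode("bmi", c("height", "weight"), c(18.5, 30, NA))
factor_log <- log_recode("bmi_cat", "bmi", factor(c("low", "high", NA)))
```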



Big-Life-Lab/bllflow documentation built on Feb. 1, 2023, 12:29 p.m.