Recodes a single variable according to a set of rules. Recommended for numeric variables (single-value or range changes).

ez.recode replaces the original variable with the recoded variable;
ez.recode2 saves the original variable as var_ori, and then recodes var.
See also ez.replace, which is recommended for numeric (single-value change), character, and factor variables.
The data type is kept whenever possible. The value labels attribute of the column is removed (otherwise it could be inconsistent with the recoded values), but the variable label is kept for numeric, character, factor, etc.
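
The difference between the two functions can be sketched in base R. This is only an illustration of the documented outcome, not the package's implementation: the data frame and the rule "1=2" are made up for the example, and the position of the a_ori column in the real result is an assumption.

```r
# Hypothetical data; base R only, ez.recode2 itself is not called here.
df <- data.frame(a = c(1, 2, 3))

# Work on a copy so the input data frame stays untouched.
recoded <- df

# Step 1: keep the original column under the "_ori" suffix.
recoded[["a_ori"]] <- recoded[["a"]]

# Step 2: recode the column itself (here the rule "1=2").
recoded[["a"]][recoded[["a_ori"]] == 1] <- 2

print(recoded)
```

Skipping step 1 gives the ez.recode-style result, where only the recoded column remains.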


ez.recode(df, col, recodes)

ez.recode2(df, col, recodes)



data.frame to be recoded


the name of the variable to be recoded; must be a quoted string, e.g. "a"


Definition of the recoding rules. See details


recodes contains a set of recoding rules separated by ";". There are several different types of recoding rules:

  • The simplest codes one value to another. If we wish to recode 1 into 2, we could use the rule "1=2;".

  • A range of values can be coded to a single value using "1:3=4;". This rule codes all values between 1 and 3, inclusive, into 4. For factors, a value is between two levels if it is between them in the factor ordering. One-sided ranges can be specified using the lo and hi keywords (e.g., "lo:3=0; 4:hi=1"). Several aliases mimic SPSS recode syntax: hi, Hi, and HI all mean the maximum; lo, Lo, and LO all mean the minimum; thru, Thru, and THRU can be used in place of ":"; and "->" can be used in place of "=". If multiple ranges overlap, the later rule prevails: with "1:3=1;3:5=2", the value 3 is finally recoded to 2.

  • Default conditions can be coded using "else". For example, to recode all values >=0 to 1 and all values <0 to missing, use "0:hi=1; else=NA". The "else" token should be the last rule in the recodes string.

  • Variable label attributes (see, for instance, get_label) are preserved if they exist; value label attributes, however, are removed, since they may no longer match the recoded values.

  • The sjmisc_rec function in sjmisc does not work well with decimal (non-integer) numbers (e.g., 3.59).
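
The range, overlap, and "else" semantics described above can be mimicked in base R. The following is a minimal sketch of what the rule string "1:3=1;3:5=2;else=NA" is documented to produce, assuming rules are applied in order so that a later rule overwrites an earlier one; it does not call ez.recode, and the vector x is made up for the example.

```r
# Hypothetical input, including a decimal value and values outside all ranges.
x <- c(0, 1, 2.5, 3, 3.59, 5, 7)

# "else=NA": anything not matched by a rule ends up missing.
out <- rep(NA_real_, length(x))

# "1:3=1": both ends of the range are inclusive.
out[x >= 1 & x <= 3] <- 1

# "3:5=2": applied after the first rule, so the shared value 3 becomes 2.
out[x >= 3 & x <= 5] <- 2

print(data.frame(x = x, recoded = out))
```

With these inputs, 0 and 7 fall through to NA, 1 and 2.5 become 1, and 3, 3.59, and 5 become 2.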

ez.replace is recommended for changing characters and factors.
ez.recode works with characters/factors as well, e.g., ('Gr',"'U1'='U';'U2'='U';'R1'='R';'R2'='R'").
Character to number does not work directly: with ('Gr',"'U1'=2;'U2'=3"), the values 2 and 3 are converted to "2" and "3" (numbers stored as characters).
Number to character works directly, as do char->char and factor->factor.
For factors, there is no need to reset levels (they are reset automatically).
In summary: numeric<->numeric works when the new value is not quoted;
if the new value is a quoted character, then numeric->char, char->char, and factor->factor.
See the example section for more detail.
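
These type outcomes match ordinary vector coercion in base R, which the sketch below illustrates. Whether ez.recode relies on this mechanism internally is an assumption; the vectors are made up for the example, and the final as.numeric() step is one possible way to obtain real numbers after a character-to-number recode.

```r
num <- c(1, 2, 3)
chr <- c("U1", "U2")

# numeric -> numeric: an unquoted new value keeps the vector numeric.
num_to_num <- num
num_to_num[num == 1] <- 2

# numeric -> character: one quoted new value turns the whole vector into character.
num_to_chr <- num
num_to_chr[num == 1] <- "low"

# character -> "number": the vector stays character, so 2 and 3 are stored as "2" and "3".
chr_to_num <- chr
chr_to_num[chr == "U1"] <- 2
chr_to_num[chr == "U2"] <- 3

# An explicit conversion afterwards yields a numeric vector.
as_numbers <- as.numeric(chr_to_num)

str(list(num_to_num, num_to_chr, chr_to_num, as_numbers))
```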


Returns a new data frame; the original one is not changed.


Jerry Zhu, modified from Ian Fellows (pkg Deducer), adapted from code by John Fox (car)

ez.recode(data, "a", "hi = 1")
ez.recode(data, "a", "lo:0 = 0;0:hi = 1;")
ez.recode(data, "b", "lo:0 = 0;0:hi = 1;")
ez.recode(data, "a", "lo:0 = 'low';0:hi = 1;")
         #a was numeric type, now is character type
         #note: for hi=1, the 1 is not even quoted
         #can be quoted hi='1', but it does not matter here
data <- ez.recode(data,"male", "1 = 'Male';FALSE = 'Female';else = NA;")
         #both 1 and TRUE = 'Male' work
         #the last semicolon after NA is not necessary
         #male was initially a logical type, now is a character type

