refactor_list: Produce a lookup for refactor()

Description

refactor_list is a helper function for refactor. It prints the R code required for a 'lookup' to the console, for inclusion in data preparation/cleaning scripts (perhaps after a bit of editing!).

For very large lookups, it might make more sense to pass the lookup to refactor using a file. You can write the lookup to a .csv file by supplying a path/name to the file argument.
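As a rough illustration of what a file-based lookup involves, the base R sketch below writes a two-column lookup to a .csv file, reads it back, and applies it to a factor. It does not call refactor or refactor_list; the column names FROM and TO and the use of match() are assumptions made for the example, not the package's documented file format or implementation.

```r
# Base R sketch only: the FROM/TO column names are assumed for illustration.
x <- factor(c("F", "Female", "m", "M", "Male", ""))

# A two-column lookup: each original value and the value it should become.
lookup <- data.frame(
  FROM = c("", "F", "Female", "m", "M", "Male"),
  TO   = c(NA, "female", "female", "male", "male", "male"),
  stringsAsFactors = FALSE
)

# Write the lookup to a temporary .csv file, where it could be edited by hand.
lookup_path <- tempfile(fileext = ".csv")
write.csv(lookup, lookup_path, row.names = FALSE)

# Read it back as character data so that "F" is not mistaken for FALSE.
saved <- read.csv(lookup_path, colClasses = "character")

# Recode: find each value of x in FROM and take the corresponding TO.
recoded <- factor(saved$TO[match(as.character(x), saved$FROM)])
table(recoded, useNA = "ifany")
```

Keeping the lookup in a file separates the (possibly long) recoding table from the cleaning script itself, which is the motivation given above for very large lookups.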

To make the process less laborious, refactor_list also has a consolidate parameter. If set to TRUE, the 'TO' values in the generated lookup are passed through consolidate_values, which attempts to consolidate factor levels that differ only for small formatting reasons into one. See the consolidate_values documentation for details.
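To give a feel for the kind of clean-up this is aiming at, here is a small base R sketch. The helper normalise_label is hypothetical and is not consolidate_values; it assumes that "small formatting reasons" means differences in case and surrounding whitespace, which may not match the rules the package actually applies.

```r
# Hypothetical stand-in, NOT the package's consolidate_values():
# lower-cases each label and trims leading/trailing whitespace.
normalise_label <- function(v) {
  trimws(tolower(v))
}

from <- c("Male", "male ", " MALE", "Female", "female")

# The 'TO' column collapses labels that differ only in case or spacing.
sketch_lookup <- data.frame(
  FROM = from,
  TO   = normalise_label(from),
  stringsAsFactors = FALSE
)
print(sketch_lookup)
```

Values such as "M" or "Man" would still need to be mapped by hand, which is why the generated lookup is meant to be reviewed and edited before use.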

For a demonstration of how refactor and refactor_list work together, see the package vignette, with:

vignette('brocks')

Usage

refactor_list(x, consolidate = FALSE, file = NULL)

Arguments

x

A factor (or character) variable

consolidate

logical. Should the 'TO' values be passed through consolidate_values in an automated attempt to clean them up?

file

A writable file path. If supplied, the lookup will be written out to a two-column .csv file, rather than printed to the console. The file produced can be passed to the file argument of refactor.

Value

Nothing. Prints to the console/terminal with cat.
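The sketch below shows, in base R, how a function can emit a lookup skeleton with cat while returning nothing useful. print_lookup_sketch is a hypothetical helper written for this illustration; the exact text that refactor_list prints may differ.

```r
# Hypothetical helper, not part of brocks: prints a list-of-pairs skeleton
# in which every distinct value initially maps to itself.
print_lookup_sketch <- function(x) {
  from <- sort(unique(as.character(x)))
  rows <- sprintf('  c("%s", "%s")', from, from)
  cat("new_vals <- list(\n", paste(rows, collapse = ",\n"), "\n)\n", sep = "")
  invisible(NULL)
}

# The printed text is itself R code, so it can be pasted into a script
# and the second element of each pair edited by hand.
print_lookup_sketch(c("M", "F", "M"))

# capture.output() is used here only to inspect what was printed.
out <- capture.output(print_lookup_sketch(c("M", "F", "M")))
```

Because the output goes through cat rather than being returned, assigning the result of such a function to a variable gives you nothing to work with; the printed code is the product.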

Author(s)

Brendan Rocks <rocks.brendan@gmail.com>

See Also

refactor, the function which refactor_list supports

Examples

## Not run: 
# Let's tidy up the gender variable in test_data
data(test_data)
table(test_data$gender)

# Passing the gender variable to refactor_list will generate the R code we
# need to create a lookup for it in our data-cleaning script! Setting
# consolidate to TRUE will do some of the work for us.

refactor_list(test_data$gender, consolidate = TRUE)

# At this point you'd take the code generated and integrate it into your
# script. Here's one I made earlier. We can pass it to refactor, and our
# factor variable is now tidy!

new_vals <- list(
  # FROM      TO
  c("",        NA     ),
  c("<NA>",    NA     ),
  c("F",      "female"),
  c("Female", "female"),
  c("m",      "male"  ),
  c("M",      "male"  ),
  c("Male",   "male"  ),
  c("Man",    "male"  ),
  c("Woman",  "female"),
  c("n/a",     NA     )
)

test_data$gender <- refactor(test_data$gender, new_vals)

## End(Not run)

brendan-r/brocks documentation built on May 13, 2019, 5:08 a.m.