encodefrom: Encode data frame column using external crosswalk file.


View source: R/encodefrom.R

Description

Encode data frame column using external crosswalk file.

Usage

encodefrom(
  .data,
  var,
  cw_file,
  raw,
  clean,
  label,
  delimiter = NULL,
  sheet = NULL,
  case_ignore = TRUE,
  ignore_tibble = FALSE
)

encodefrom_(
  .data,
  var,
  cw_file,
  raw,
  clean,
  label,
  delimiter = NULL,
  sheet = NULL,
  case_ignore = TRUE,
  ignore_tibble = FALSE
)

Arguments

.data

Data frame or tbl_df

var

Column name of vector to be encoded

cw_file

Either a data frame object or a string giving the path to an external crosswalk file. The crosswalk must have columns representing raw (current) vector values, clean (new) vector values, and labels for values. Values in the raw and clean columns must be unique (1:1 match) or an error will be thrown. Acceptable file types include: delimited (.csv, .tsv, or other), R (.rda, .rdata, .rds), or Stata (.dta).
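The 1:1 requirement can be checked before a crosswalk is passed to encodefrom. The following is a minimal base R sketch, not the package's own validation code; the data frames `cw_demo` and `cw_bad` and the helper `is_one_to_one` are hypothetical names introduced for illustration.

```r
# Hypothetical in-memory crosswalk: raw values, clean values, and labels
cw_demo <- data.frame(
  stname = c("Kentucky", "Tennessee", "Virginia"),  # raw (current) values
  stfips = c(21, 47, 51),                           # clean (new) values
  stabbr = c("KY", "TN", "VA"),                     # labels for clean values
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE when neither column contains duplicated values
is_one_to_one <- function(cw, raw, clean) {
  !anyDuplicated(cw[[raw]]) && !anyDuplicated(cw[[clean]])
}

ok <- is_one_to_one(cw_demo, "stname", "stfips")

# A crosswalk that maps two raw values to the same clean value fails the check
cw_bad <- rbind(
  cw_demo,
  data.frame(stname = "Virginia ", stfips = 51, stabbr = "VA",
             stringsAsFactors = FALSE)
)
bad <- is_one_to_one(cw_bad, "stname", "stfips")
```

Running a check like this first makes it easier to find the offending rows than reading the error after the fact.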

raw

Name of column in cw_file that contains values in current vector.

clean

Name of column in cw_file that contains new values for vector.

label

Name of column in cw_file with labels for new values.

delimiter

String delimiter used to parse cw_file. Only necessary if the delimited file is neither comma-separated nor tab-separated (those two are guessed by the function from the file extension).
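As a hedged sketch of when delimiter matters, the code below writes a small pipe-delimited crosswalk to a temporary file and passes its path to encodefrom. The file contents and the names `cw_path`, `df_demo`, and `out` are made up for illustration, and the sketch assumes the crosswalkr package is installed.

```r
library(crosswalkr)

# Write a hypothetical pipe-delimited crosswalk to a temporary file
cw_path <- tempfile(fileext = ".txt")
writeLines(c("stname|stfips|stabbr",
             "Kentucky|21|KY",
             "Tennessee|47|TN",
             "Virginia|51|VA"),
           cw_path)

# Data frame with the column to encode
df_demo <- data.frame(state = c("Kentucky", "Tennessee", "Virginia"),
                      stringsAsFactors = FALSE)

# The ".txt" extension does not indicate the separator, so it is given explicitly
out <- encodefrom(df_demo, state, cw_path, stname, stfips, stabbr,
                  delimiter = "|")
```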

sheet

Specify sheet if cw_file is an Excel file and required sheet isn't the first one.

case_ignore

Ignore case when matching current (raw) vector name with new (clean) column name.

ignore_tibble

Ignore .data status as tbl_df and return vector as a factor rather than labelled vector.

Value

Vector that is either a factor or a labelled vector, depending on the input data and options.
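To show how the two kinds of return value differ, the sketch below builds a factor and a labelled vector by hand from the same clean values and labels. It is an illustration only and does not reproduce encodefrom's internals; the object names are hypothetical, and the haven package is assumed to be installed.

```r
# Illustration only: the two kinds of vector, built by hand
clean_vals <- c(21, 47, 51)
labs <- c("KY", "TN", "VA")

# Factor: integer codes with the labels as levels; the clean values are not kept
f <- factor(clean_vals, levels = clean_vals, labels = labs)

# Labelled vector (haven): keeps the clean values and attaches labels to them
l <- haven::labelled(clean_vals, labels = stats::setNames(clean_vals, labs))

levels(f)          # levels of the factor
attr(l, "labels")  # named vector mapping labels to clean values
```

A labelled vector can later be converted with haven::as_factor or stripped with haven::zap_labels, as in the examples on this page.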

Functions

Examples

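library(crosswalkr)

# example data frame with state names to encode
df <- data.frame(state = c('Kentucky','Tennessee','Virginia'),
                 stfips = c(21,47,51),
                 cenregnm = c('South','South','South'))

# tibble version of the same data
df_tbl <- tibble::as_tibble(df)

# crosswalk data set shipped with crosswalkr
cw <- get(data(stcrosswalk))

# data frame input
df$state2 <- encodefrom(df, state, cw, stname, stfips, stabbr)
# tibble input
df_tbl$state2 <- encodefrom(df_tbl, state, cw, stname, stfips, stabbr)
# tibble input, ignoring tbl_df status so a factor is returned
df_tbl$state3 <- encodefrom(df_tbl, state, cw, stname, stfips, stabbr,
                            ignore_tibble = TRUE)

# convert labelled columns to factors, or strip the labels
haven::as_factor(df_tbl)
haven::zap_labels(df_tbl)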

crosswalkr documentation built on Jan. 8, 2020, 5:07 p.m.