Sanitize to adhere to REDCap character encoding requirements.

Description

Replace non-ASCII characters with legal characters that won't cause problems when writing to a REDCap project.

Usage

redcap_column_sanitize(d, column_names = colnames(d),
  encoding_initial = "latin1", substitution_character = "?")

Arguments

d

The data.frame containing the dataset used to update the REDCap project. Required.

column_names

An array of character values indicating the names of the variables to sanitize. Optional.

encoding_initial

The character value naming the encoding of the input columns before they are sanitized. Optional; defaults to "latin1".

substitution_character

The character value that replaces characters that could not be appropriately matched. Optional; defaults to "?".

Details

Letters like an accented ‘A’ are replaced with a plain ‘A’.

This is a thin wrapper around base::iconv(). The ASCII//TRANSLIT option does the actual transliteration work. As of R 3.1.0, operating systems use similar, but not identical, conversion routines, so the same input can produce slightly different output on different OSes.
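To make the wrapper idea concrete, here is a minimal sketch of how such a function could be written. It is not the package's actual source: `sanitize_sketch` is a hypothetical name, and the loop structure is an assumption. Only base::iconv() and its `from`, `to`, and `sub` arguments are real.

```r
# Hypothetical re-implementation for illustration; not REDCapR's source code.
sanitize_sketch <- function(d, column_names = colnames(d),
                            encoding_initial = "latin1",
                            substitution_character = "?") {
  for (column in column_names) {
    # Transliterate each requested column to ASCII. Characters that cannot
    # be converted are replaced by `substitution_character`.
    d[[column]] <- base::iconv(
      x    = d[[column]],
      from = encoding_initial,
      to   = "ASCII//TRANSLIT",
      sub  = substitution_character
    )
  }
  d
}

dirty <- data.frame(
  id    = 1:3,
  names = c("Ekstr\xf8m", "J\xf6reskog", "bi\xdfchen Z\xfcrcher"),
  stringsAsFactors = FALSE
)

# Sanitize only the text column; the exact replacements may vary by OS.
clean <- sanitize_sketch(dirty, column_names = "names")
```

Restricting `column_names` to the text columns leaves the other columns untouched, which is one reason the argument may be worth setting explicitly.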

Value

A data.frame with the same columns, but whose character values have been sanitized.

Author(s)

Will Beasley

Examples

dirty <- data.frame(id=1:3, names=c("Ekstr\xf8m", "J\xf6reskog", "bi\xdfchen Z\xfcrcher"))
REDCapR::redcap_column_sanitize(dirty)
# Produces the dataset:
#  id            names
#1  1          Ekstr?m
#2  2         Joreskog
#3  3 bisschen Zurcher

# Typical examples are not shown because they require non-ASCII encoding, 
#   which makes the package documentation less portable.