fieldCastingFunctions: Functions for Casting Fields After Export (Post Processing)

fieldCastingFunctions    R Documentation

Description

The functions provided here allow for recasting fields after records have been exported. They generally have a similar interface to the casting strategy of exportRecordsTyped(), though they may not each offer all the same options.

Usage

recastRecords(
  data,
  rcon,
  fields,
  cast = list(),
  suffix = "",
  warn_zero_coded = TRUE
)

castForImport(
  data,
  rcon,
  fields = NULL,
  na = list(),
  validation = list(),
  cast = list(),
  warn_zero_coded = TRUE
)

guessCast(
  data,
  rcon,
  na = isNAorBlank,
  validation,
  cast,
  quiet = FALSE,
  threshold = 0.8
)

guessDate(
  data,
  rcon,
  na = isNAorBlank,
  validation = valRx("^[0-9]{1,4}-(0?[1-9]|1[012])-(0?[1-9]|[12][0-9]|3[01])$"),
  cast = function(x, ...) as.POSIXct(x, format = "%Y-%m-%d"),
  quiet = FALSE,
  threshold = 0.8
)

mChoiceCast(data, rcon, style = "labelled", drop_fields = TRUE)

Arguments

data

data.frame with the data fields to be recoded.

rcon

A redcapConnection object.

fields

character/logical/integerish. A vector for identifying which fields to recode. When logical, the length must match the number of columns in data (i.e., recycling not permitted). A message is printed if any of the indicated fields are not a multiple choice field; no action will be taken on such fields. For this function, yes/no and true/false fields are considered multiple choice fields. Fields of class mChoice are quietly skipped.

cast

A named list of user-specified class casting functions. The same named keys are supported as for the na argument. Each function will be provided the variables (x, field_name, coding) and must return a vector matching the input length. The cast should match the validation; if one is using raw_cast, then validation = skip_validation is likely the desired intent. See fieldValidationAndCasting().

suffix

character(1). An optional suffix to provide if the recoded variables should be returned as new columns. For example, if recoding a field forklift_brand and suffix = "_labeled", the result will have one column with the coded values (forklift_brand) and one column with the labeled values (forklift_brand_labeled).

warn_zero_coded

logical(1). Turn on or off warnings about zero coded fields. Default is TRUE.

na

A named list of user-specified functions to determine if the data is NA. This is useful when the loaded data uses a special code for missing values, e.g., -5 means NA. Keys must correspond to a truncated REDCap field type, i.e. date_, datetime_, datetime_seconds_, time_mm_ss, time_hh_mm_ss, time, float, number, calc, int, integer, select, radio, dropdown, yesno, truefalse, checkbox, form_complete, sql, system. Each function will be provided the variables (x, field_name, coding) and must return a vector of logicals matching the input length. It defaults to isNAorBlank() for all entries.

validation

A named list of user specified validation functions. The same named keys are supported as the na argument. The function will be provided the variables (x, field_name, coding). The function must return a vector of logical matching the input length. Helper functions to construct these are valRx() and valChoice(). Only fields that are not identified as NA will be passed to validation functions.

quiet

logical(1). When TRUE, no messages are printed. Default is FALSE.

threshold

numeric(1). The proportion of non-NA values that must pass validation in order to trigger casting.

style

character. One of "labelled" or "coded". Default is "labelled".

drop_fields

character or NULL. A vector of field names to remove from the data.
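
The na and validation arguments both take functions with the signature (x, field_name, coding) that return a logical vector as long as the input. The following is a minimal base R sketch of two such functions. The names isMissingCode and valWholeNumber, the field name "age", and the sample values are hypothetical and are not part of redcapAPI; the last lines only mimic the documented rule that values flagged as NA are not passed to validation.

```r
# Hypothetical NA detector: treats NA, blank strings, and the code "-5" as missing.
# The (x, field_name, coding) signature follows the argument descriptions above.
isMissingCode <- function(x, field_name, coding) {
  is.na(x) | trimws(as.character(x)) %in% c("", "-5")
}

# Hypothetical validator: accepts whole numbers only.
valWholeNumber <- function(x, field_name, coding) {
  grepl("^-?[0-9]+$", as.character(x))
}

# Made-up raw values for a numeric field.
x <- c("12", "-5", "", "abc", NA)

# Step 1: flag missing values.
missing <- isMissingCode(x, field_name = "age", coding = NULL)

# Step 2: validate only the values that were not flagged as missing.
valid <- rep(NA, length(x))
valid[!missing] <- valWholeNumber(x[!missing], field_name = "age", coding = NULL)

print(data.frame(x = x, missing = missing, valid = valid))
```

Functions like these would presumably be supplied as, for example, na = list(number = isMissingCode) and validation = list(number = valWholeNumber), using the field type keys listed for the na argument.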

Details

recastRecords is a post-processing function motivated initially by the need to switch between codes and labels in multiple choice fields. Field types for which no casting function is specified will be returned with no changes. It will not attempt to validate the content of fields; fields that cannot be successfully cast will be quietly returned as missing values.
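
To make the effect of the suffix argument concrete, here is a rough base R sketch of what switching from codes to labels while keeping both columns amounts to. It does not call recastRecords; the data, the brand labels, and the layout of the coding vector (labels as names, codes as values) are assumptions made for illustration.

```r
# Made-up data with a coded multiple choice field.
data <- data.frame(forklift_brand = c("1", "2", "1", NA),
                   stringsAsFactors = FALSE)

# Hypothetical coding: names are labels, values are codes.
coding <- c("Crown" = "1", "Hyster" = "2")

suffix <- "_labeled"

# Look up the label for each code; codes with no match (including NA) become NA.
labels <- names(coding)[match(data$forklift_brand, coding)]

# A non-empty suffix keeps the coded column and adds the labeled one beside it.
data[[paste0("forklift_brand", suffix)]] <- labels

print(data)
```

With suffix = "" the labeled values would replace the original column instead of being added as a new one.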

castForImport is written with defaults that will return data in a format ready to be imported to a project via importRecords. All fields are returned as character vectors. If any values fail the validation check, a report is returned as an attribute named invalid. This attribute may be retrieved using reviewInvalidRecords(). The invalid values are then set to NA, which will be imported as blanks through the API.

guessCast is a helper function to make a guess at casting uncast columns. It will do a type cast if a validation is met above a threshold ratio of non-NA records. It modifies the existing invalid attribute to reflect the cast. This attribute may be retrieved using reviewInvalidRecords(). guessDate is a special case of guessCast that has defaults set for casting a date field.
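
The threshold rule can be illustrated with a small base R sketch. This is not the package's implementation: the sample values are made up, and whether the comparison is inclusive (>=) or strict (>) is an assumption. The regular expression is the one shown as the guessDate default in Usage.

```r
# Made-up raw, uncast column: mostly dates, one malformed value, two missing.
x <- c("2023-01-15", "2023-02-30x", "2024-12-01", "", NA, "2022-07-04")

date_rx <- "^[0-9]{1,4}-(0?[1-9]|1[012])-(0?[1-9]|[12][0-9]|3[01])$"
threshold <- 0.6

# Missing values are excluded before the ratio is computed.
is_na <- is.na(x) | x == ""

# Proportion of non-missing values that pass validation.
passes <- grepl(date_rx, x[!is_na])
ratio <- mean(passes)

if (ratio >= threshold) {           # assumption: inclusive comparison
  # Cast the values that passed; everything else becomes NA.
  ok <- !is_na
  ok[ok] <- passes
  result <- rep(as.Date(NA), length(x))
  result[ok] <- as.Date(x[ok], format = "%Y-%m-%d")
} else {
  # Below the threshold the column is left untouched.
  result <- x
}

print(ratio)
print(result)
```

Here three of the four non-missing values pass, so the column is cast at a threshold of 0.6 but would be left as is at the default of 0.8.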

mChoiceCast is a helper function that adds the Hmisc::mChoice multiple choice class. It adds a column for a multiple choice checkbox that is cast to the Hmisc::mChoice class. Requires Hmisc to be loaded.

Zero-Coded Check Fields

A zero-coded check field is a field of the REDCap type checkbox that has a coding definition of 0, [label]. When exported, the field name for such a field is [field_name]___0. As in other checkbox fields, the raw data output returns binary values where 0 represents an unchecked box and 1 represents a checked box. For zero-coded checkboxes, then, a value of 1 indicates that 0 was selected.

This coding rarely presents a problem when casting from raw values (as is done in exportRecordsTyped). However, casting from coded or labeled values can be problematic. In this case, it becomes indeterminate from context if the intent of 0 is 'false' or the coded value '0' ('true') ...

The situations in which casting may fail to produce the desired results are:

Code    Label                      Result
0       anything other than "0"    Likely to fail when casting from coded values
0       0                          Likely to fail when casting from coded or labeled values

Because of the potential for miscast data, casting functions will issue a warning any time a zero-coded check field is encountered. A separate warning is issued when a field is cast from coded or labeled values.

When casting from coded or labeled values, it is strongly recommended that the function castCheckForImport() be used. This function permits the user to state explicitly which values should be recognized as checked, avoiding the ambiguity resulting from the coding.
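
The ambiguity described in this section can be reproduced in a few lines of base R. The sketch below does not use redcapAPI. It assumes that a coded export reports a checked box as its code and an unchecked box as an empty string, and the label "None of the above" is hypothetical.

```r
# Assumed coded export of a checkbox option defined as "0, None of the above":
# a checked box appears as its code "0", an unchecked box as "".
coded <- c("0", "", "0", "")

# Naive reading: only "1" or "Checked" count as checked, as with raw 0/1 data.
# Every box is then read as unchecked, which loses the two real selections.
naive <- as.integer(coded %in% c("1", "Checked"))

# Explicit reading: state which values mean "checked" for this field.
checked_values <- c("0", "None of the above")
explicit <- as.integer(coded %in% checked_values)

print(data.frame(coded = coded, naive = naive, explicit = explicit))
```

Stating the checked values explicitly is the same idea that castCheckForImport() applies, which is why it is the recommended route for these fields.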

See Also

Exporting records

exportRecordsTyped(),
exportReportsTyped(),
fieldValidationAndCasting(),
reviewInvalidRecords()

Other Post Processing Functions

splitForms(),
widerRepeated()

Vignettes

vignette("redcapAPI-offline-connection", package = "redcapAPI")
vignette("redcapAPI-casting-data")
vignette("redcapAPI-missing-data-detection")
vignette("redcapAPI-data-validation")

Examples

## Not run: 
# Using recastRecords after export
Records <- 
  exportRecordsTyped(rcon) |>
  recastRecords(rcon, 
                fields = "dropdown_test",
                cast = list(dropdown = castCode))
                
                
# Using castForImport
castForImport(Records, 
              rcon)
              
              
# Using castForImport to recast zero-coded checkbox values
castForImport(Records, 
              rcon, 
              cast = list(checkbox = castCheckForImport(c("0", "Unchecked"))))


# Using guessCast
exportRecordsTyped(rcon,
                   validation=skip_validation,
                   cast = raw_cast) |> 
  guessCast(rcon, 
            validation=valRx("^[0-9]{1,4}-(0?[1-9]|1[012])-(0?[1-9]|[12][0-9]|3[01])$"), 
            cast=as.Date,
            threshold=0.6)
            
            
# Using mChoiceCast
exportRecordsTyped(rcon) |> 
  mChoiceCast(rcon)


## End(Not run)



redcapAPI documentation built on May 29, 2024, 12:18 p.m.