recordsTypedMethods: Export Records or Reports From a Project

View source: R/docsRecordsTypedMethods.R

recordsTypedMethods    R Documentation

Description

These methods export records from a database or from a report. They offer more control over casting fields to R objects than exportRecords().

Usage

exportRecordsTyped(
  rcon,
  fields = NULL,
  drop_fields = NULL,
  forms = NULL,
  records = NULL,
  events = NULL,
  ...
)

exportReportsTyped(rcon, report_id, ...)

## S3 method for class 'redcapApiConnection'
exportRecordsTyped(
  rcon,
  fields = NULL,
  drop_fields = NULL,
  forms = NULL,
  records = NULL,
  events = NULL,
  survey = TRUE,
  dag = FALSE,
  date_begin = NULL,
  date_end = NULL,
  na = list(),
  validation = list(),
  cast = list(),
  assignment = list(label = stripHTMLandUnicode, units = unitsFieldAnnotation),
  filter_empty_rows = TRUE,
  warn_zero_coded = TRUE,
  ...,
  config = list(),
  api_param = list(),
  csv_delimiter = ",",
  batch_size = NULL,
  error_handling = getOption("redcap_error_handling")
)

## S3 method for class 'redcapOfflineConnection'
exportRecordsTyped(
  rcon,
  fields = NULL,
  drop_fields = NULL,
  forms = NULL,
  records = NULL,
  events = NULL,
  na = list(),
  validation = list(),
  cast = list(),
  assignment = list(label = stripHTMLandUnicode, units = unitsFieldAnnotation),
  warn_zero_coded = TRUE,
  ...
)

## S3 method for class 'redcapApiConnection'
exportReportsTyped(
  rcon,
  report_id,
  drop_fields = NULL,
  na = list(),
  validation = list(),
  cast = list(),
  assignment = list(label = stripHTMLandUnicode, units = unitsFieldAnnotation),
  warn_zero_coded = TRUE,
  ...,
  config = list(),
  api_param = list(),
  csv_delimiter = ","
)

Arguments

rcon

A redcapConnection object.

report_id

integerish(1). The ID number of the report to be exported.

fields

character or NULL. Vector of fields to be returned. If NULL, all fields are returned (unless forms is specified).

drop_fields

character or NULL. A vector of field names to remove from the export.

forms

character or NULL. Vector of forms to be returned. If NULL, all forms are returned (unless fields is specified).

records

character or integerish. A vector of study IDs to be returned. If NULL, all subjects are returned.

events

A character vector of events to be returned from a longitudinal database. If NULL, all events are returned. When using a redcapOfflineConnection object, this argument is unvalidated, and only rows that match one of the values given are returned; misspellings may result in unexpected results.

survey

logical(1). When TRUE, the survey identifier field (e.g., redcap_survey_identifier) and survey timestamp fields (e.g., [form_name]_timestamp) will be exported (relevant only when surveys are utilized in the project).

dag

logical(1). When TRUE, the redcap_data_access_group field will be included in the export when data access groups are utilized in the project. This flag is only viable if the user whose token is being used to make the API request is not in a data access group. If the user is in a group, then this flag will revert to its default value. Data Access Groups privilege is required when creating/renaming/deleting DAGs and when importing/exporting user-DAG assignments. Therefore, the default for this flag is FALSE. To export DAG information, set this flag to TRUE.

date_begin

POSIXct(1) or NULL. Ignored if NULL (default). Otherwise, records created or modified after this date will be returned.

date_end

POSIXct(1) or NULL. Ignored if NULL (default). Otherwise, records created or modified before this date will be returned.

na

A named list of user specified functions to determine if the data is NA. This is useful when data is loaded that has coding for NA, e.g. -5 is NA. Keys must correspond to a truncated REDCap field type, i.e. date_, datetime_, datetime_seconds_, time_mm_ss, time_hh_mm_ss, time, float, number, calc, int, integer, select, radio, dropdown, yesno, truefalse, checkbox, form_complete, sql, system. The function will be provided the variables (x, field_name, coding). The function must return a vector of logicals matching the input. It defaults to isNAorBlank() for all entries.

validation

A named list of user specified validation functions. The same named keys are supported as the na argument. The function will be provided the variables (x, field_name, coding). The function must return a vector of logical matching the input length. Helper functions to construct these are valRx() and valChoice(). Only fields that are not identified as NA will be passed to validation functions.

cast

A named list of user specified class casting functions. The same named keys are supported as the na argument. The function will be provided the variables (x, field_name, coding). The function must return a vector matching the input length. The cast should match the validation; if one is using raw_cast, then validation = skip_validation is likely the desired intent. See fieldValidationAndCasting().

assignment

A named list of functions. These functions are provided field_name, label, description, and field_type, and return a list of attributes to assign to the column. Defaults to creating a label attribute from the stripped HTML and UNICODE raw label and scanning for units={"UNITS"} in description.

filter_empty_rows

logical(1). Filter out empty rows post retrieval. Defaults to TRUE.

csv_delimiter

character. One of c(",", "\t", ";", "|", "^"). Designates the delimiter for the CSV file received from the API.

batch_size

integerish(1) or NULL. When NULL, all records are pulled. Otherwise, the records are pulled in batches of this size.

warn_zero_coded

logical(1). Turn on or off warnings about potentially problematic zero coded fields. Defaults to TRUE.

...

Arguments to pass to other methods

error_handling

character(1). One of c("error", "null"). An option for how to handle errors returned by the API. See redcapError().

config

A named list. Additional configuration parameters to pass to httr::POST(). These are appended to any parameters in rcon$config.

api_param

A named list. Additional API parameters to pass into the body of the API call. This allows users to execute calls with options that may not otherwise be supported by redcapAPI.

Details

The 'offline' method operates on the raw (unlabeled) data file downloaded from REDCap along with the data dictionary. This is made available for instances where the API cannot be accessed for some reason (such as waiting for API approval from the REDCap administrator).

When validating data for redcapOfflineConnection objects, links to invalid data forms will not work unless the user provides the url, version, and project_info arguments and, if the project is longitudinal, the events argument. For project_info, the values project_id and is_longitudinal are required. The user may be able to provide as little as project_info = data.frame(project_id = [id], is_longitudinal = [0/1]). The user should be aware that the REDCap User Interface download for events does not include the event ID. To include the event ID, the user must construct a data frame to pass to offlineConnection.

Record Identifier (System) Fields

In all calls, the project's ID fields will be included; there is no option provided to prevent this. Additionally, if the project has a secondary unique field specified, it will also be included. Inclusion of these fields is necessary to support some post-processing functions.

By default, the system fields redcap_event_name, redcap_repeat_instrument, and redcap_repeat_instance are exported (when they are appropriate to the project). These are automatically included by the API. However, if the user omits any of these in fields or designates one in drop_fields, the final result will honor those conditions. Excluding any of these identifiers may cause problems with some post-processing functions that operate on repeating instrument data.

The combination of the project ID field, secondary unique field, and the system fields are what uniquely identify an experimental unit. In nearly all cases, it is desirable to have them all included.
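To see why the system fields matter, the following base R sketch checks row uniqueness on a small, made-up data frame. The column values, the `records` object, and the `key` vector are assumptions for illustration; only the system field names come from the text above.

```r
# Toy export from a hypothetical longitudinal project with a repeating
# instrument; all values are made up.
records <- data.frame(
  record_id = c("1", "1", "1", "2"),
  redcap_event_name = c("baseline", "baseline", "month_1", "baseline"),
  redcap_repeat_instrument = c("medications", "medications", NA, NA),
  redcap_repeat_instance = c(1, 2, NA, NA),
  stringsAsFactors = FALSE
)

# The record ID alone does not identify a row ...
id_alone_has_duplicates <- anyDuplicated(records["record_id"]) > 0

# ... but the ID together with the system fields does
key <- c("record_id", "redcap_event_name",
         "redcap_repeat_instrument", "redcap_repeat_instance")
full_key_is_unique <- anyDuplicated(records[key]) == 0

id_alone_has_duplicates
full_key_is_unique
```

Dropping any of the key columns from an export of this shape would leave rows that can no longer be told apart.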

System fields are cast to labelled values by default. They may be cast to their coded values using the override cast = list(system = castRaw). The fields affected by the system override are redcap_event_name, redcap_repeat_instrument, and redcap_data_access_group.

BioPortal Fields

Text fields that are validation enabled using the BioPortal Ontology service may be cast to labeled values so long as the labels have been cached on the REDCap server. Caching is performed when the field is viewed in a form on the web interface. However, labels are not cached when data are imported via the API. In cases where labels are not cached, the coded value is treated as both the code and the label.

Record Batching

A 'batched' export is one where the export is performed over a series of API calls rather than one large call. For large projects on small servers, this may prevent a single user from tying up the server and forcing others to wait on a larger job. The batched export is performed by first calling the API to export the subject identifier field (the first field in the metadata). The unique IDs are then assigned a batch number with no more than batch_size IDs in any single batch. The batches are exported from the API and stacked together.

In longitudinal projects, batch_size may not necessarily be the number of records exported in each batch. If batch_size is ten and there are four records per patient, each batch will consist of 40 records. Thus, if the user is concerned about tying up the server with a large, longitudinal project, it would be prudent to use a smaller batch size.
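The batch assignment described above can be sketched in base R. This illustrates the idea only and is not the package's internal code; `ids`, `batch_number`, `batches`, and `rows_per_id` are hypothetical names and the IDs are made up.

```r
# Hypothetical record IDs, as they might come back from the first API call
ids <- as.character(1:23)
batch_size <- 10

# Assign a batch number so that no batch holds more than batch_size IDs
batch_number <- ceiling(seq_along(ids) / batch_size)
batches <- split(ids, batch_number)

# Number of IDs in each batch
ids_per_batch <- lengths(batches)
ids_per_batch

# In a longitudinal project with four rows per ID, a full batch of
# 10 IDs corresponds to 40 exported rows
rows_per_id <- 4
ids_per_batch * rows_per_id
```

Because the batch size counts IDs rather than rows, the number of rows returned per call scales with the number of events or repeat instances per subject.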

Inversion of Control

The final product of calling these methods is a data.frame with columns that have been type cast to the most commonly used analysis classes (e.g. factor). This version allows the user to override any step of this process by specifying a different function for each stage of the type casting. The algorithm is as follows:

  1. Detect NAs in returned data (na argument).

  2. Run the validation functions for each field type (validation argument).

  3. Cast the values that are not NA and pass validation (cast argument).
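The three steps can be sketched for a single field in base R. This is a simplified illustration, not the package's implementation; `na_fun`, `validate_fun`, `cast_fun`, the field name `visit_date`, and the sample values are all hypothetical. Each function takes (x, field_name, coding), mirroring the signature documented for the na, validation, and cast arguments.

```r
# Raw values for one hypothetical date field, as character strings
x <- c("2023-01-15", "", "not a date", "2023-02-28", NA)

# Hypothetical stand-ins for the na, validation, and cast functions
na_fun <- function(x, field_name, coding) is.na(x) | x == ""
validate_fun <- function(x, field_name, coding) {
  grepl("^[0-9]{4}-[0-9]{2}-[0-9]{2}$", x)
}
cast_fun <- function(x, field_name, coding) as.Date(x, format = "%Y-%m-%d")

# 1. Detect NAs
is_na <- na_fun(x, "visit_date", NULL)

# 2. Validate only the values that are not NA
is_valid <- rep(FALSE, length(x))
is_valid[!is_na] <- validate_fun(x[!is_na], "visit_date", NULL)

# 3. Cast only the values that are not NA and passed validation;
#    everything else stays NA
result <- rep(as.Date(NA), length(x))
result[is_valid] <- cast_fun(x[is_valid], "visit_date", NULL)
result
```

In this sketch the value "not a date" is neither missing nor valid, so it ends up as NA in the result without ever reaching the cast function.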

It is expected that the na and validation overrides should rarely be used. They are exposed as function parameters to future-proof against possible bugs in the defaults and to accommodate field types that later versions of REDCap may add. I.e., the overrides allow continued use of the library when errors or changes to REDCap occur.

The cast override lets users specify behavior that was previously controlled by an ever-increasing set of flags. E.g., dates=as.Date was an addition to allow dates in the previous version to be overridden if the user wanted to use the Date class. In this version, it would be written as cast = list(date_ = as.Date). See fieldValidationAndCasting() for a full listing of package provided cast functions.
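A user-written cast function only needs to accept (x, field_name, coding) and return a vector of the same length as x. The sketch below defines one for yesno fields; `cast_yesno_logical` and the field name `consented` are hypothetical, and the sketch assumes the values reach the function as the raw strings "1" and "0".

```r
# A hypothetical user-written cast function for yesno fields.
# It follows the documented signature (x, field_name, coding) and
# returns a logical vector of the same length as x.
cast_yesno_logical <- function(x, field_name, coding) {
  x == "1"
}

# Trying it on raw values of the kind a yesno field is assumed to hold
casted <- cast_yesno_logical(c("1", "0", "1"),
                             field_name = "consented",
                             coding = NULL)
casted

# With a live connection, it could then be supplied by field type, e.g.:
# exportRecordsTyped(rcon, cast = list(yesno = cast_yesno_logical))
```

The package already ships cast functions for common cases, so a custom function like this is mainly useful when none of the provided ones fits.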

Value

exportRecordsTyped returns a data frame with the formatted data.

exportReportsTyped returns a data frame with the formatted data.

Functions

  • exportRecordsTyped(): Export records with type casting.

  • exportReportsTyped(): Export reports with type casting.

  • exportRecordsTyped(redcapOfflineConnection): Export records without API access.

Zero-Coded Check Fields

A zero-coded check field is a field of the REDCap type checkbox that has a coding definition of 0, [label]. When exported, the field name for such a field is [field_name]___0. As in other checkbox fields, the raw data output returns binary values where 0 represents an unchecked box and 1 represents a checked box. For zero-coded checkboxes, then, a value of 1 indicates that 0 was selected.

This coding rarely presents a problem when casting from raw values (as is done in exportRecordsTyped). However, casting from coded or labeled values can be problematic. In this case, it becomes indeterminate from context if the intent of 0 is 'false' or the coded value '0' ('true') ...

The situations in which casting may fail to produce the desired results are:

Code   Label                     Result
0      anything other than "0"   Likely to fail when casting from coded values
0      0                         Likely to fail when casting from coded or labeled values

Because of the potential for miscast data, casting functions will issue a warning any time a zero-coded check field is encountered. A separate warning is issued when a field is cast from coded or labeled values.

When casting from coded or labeled values, it is strongly recommended that the function castCheckForImport() be used. This function permits the user to state explicitly which values should be recognized as checked, avoiding the ambiguity resulting from the coding.
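The ambiguity, and the effect of naming the checked values explicitly, can be illustrated in base R. This sketch does not use or reproduce castCheckForImport(); `is_checked` is a hypothetical helper written for this example, and the column `symptoms___0` and the shape of its coded values (code for checked, empty string for unchecked) are assumptions.

```r
# Raw export of a hypothetical zero-coded checkbox column, symptoms___0:
# "1" means the box for code 0 was checked, "0" means it was not.
raw <- c("1", "0", "1")

# Suppose the same column has been cast to coded values, where a checked
# box shows its code ("0") and an unchecked box shows an empty string.
coded <- c("0", "", "0")

# Reading the coded values as if they were raw gets every checked box
# wrong, because "0" is taken to mean unchecked.
naive <- coded == "1"

# Stating explicitly which values count as checked removes the ambiguity.
is_checked <- function(x, checked) x %in% checked
explicit <- is_checked(coded, checked = "0")

# The explicit reading agrees with the raw data; the naive one does not
identical(explicit, raw == "1")
identical(naive, raw == "1")
```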

See Also

Other records exporting functions

exportRecords(),
exportReports(),
exportBulkRecords()

Field validations and casting

fieldValidationAndCasting(),
reviewInvalidRecords()

Post-processing functionality

recastRecords(),
guessCast(),
guessDate(),
castForImport(),
mChoiceCast(),
splitForms(),
widerRepeated()

Vignettes

vignette("redcapAPI-offline-connection")
vignette("redcapAPI-casting-data")
vignette("redcapAPI-missing-data-detection")
vignette("redcapAPI-data-validation")
vignette("redcapAPI-faq")

Examples

## Not run: 
unlockREDCap(connections = c(rcon = "project_alias"), 
             url = "your_redcap_url", 
             keyring = "API_KEYs", 
             envir = globalenv())
             
# Export records with default settings
exportRecordsTyped(rcon)

# Export records with no factors
exportRecordsTyped(rcon, 
                   cast = default_cast_character)
                   
# Export records for specific records
exportRecordsTyped(rcon, 
                   records = 1:3)
                   
# Export records for specific instruments
exportRecordsTyped(rcon, 
                   forms = c("registration", "visit_1", "medications"))
                   
# Export records using filterLogic, an API parameter not provided
# in the exportRecordsTyped function signature
exportRecordsTyped(
  rcon, 
  records = 1:3, 
  api_param = list(filterLogic = "[age_at_enrollment] > 25")
)
                   
                   
                   
# Export a report
exportReportsTyped(rcon, 
                   report_id = 12345)
              
              
# Export records using files downloaded from the user interface
rcon_off <- 
  offlineConnection(
    meta_data = 
      system.file(file.path("extdata/offlineConnectionFiles", 
                            "TestRedcapAPI_DataDictionary.csv"), 
                  package = "redcapAPI"), 
    records = 
      system.file(file.path("extdata/offlineConnectionFiles",
                            "TestRedcapAPI_Records.csv"), 
                  package = "redcapAPI"))

exportRecordsTyped(rcon_off)

## End(Not run)


redcapAPI documentation built on May 29, 2024, 12:18 p.m.