con_contradictions_redcap: Checks user-defined contradictions in study data

View source: R/con_contradictions_redcap.R

con_contradictions_redcap R Documentation

Checks user-defined contradictions in study data

Description

This approach flags a contradiction when an impossible combination of data values is observed for one participant. For example, if the age of a participant is recorded repeatedly, the value of age (unfortunately) cannot decline from one recording to the next. Most contradictions rest on the comparison of two variables.

Note that each value used in a comparison may be possible on its own, while the combination of the two values is considered impossible. The approach does not consider implausible or inadmissible values.
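
The idea can be illustrated outside of the package with a minimal base R sketch. This is not dataquieR's implementation; the data frame and its column names (id, age_baseline, age_followup) are made up for illustration. Each recorded age is admissible on its own; only the combination of the two values for one participant is impossible.

```r
# Hypothetical study data: age recorded at two examinations
study <- data.frame(
  id = 1:4,
  age_baseline = c(45, 60, 33, 71),
  age_followup = c(47, 58, 33, NA)
)

# Contradiction: age at follow-up is lower than age at baseline.
# A comparison involving NA yields NA; such rows are not flagged here.
declined <- study$age_followup < study$age_baseline
study$contradiction <- !is.na(declined) & declined

# IDs of participants with a contradictory value combination
study$id[study$contradiction]
```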

Indicator

Usage

con_contradictions_redcap(
  study_data,
  item_level = "item_level",
  label_col,
  threshold_value,
  meta_data_cross_item = "cross-item_level",
  use_value_labels,
  summarize_categories = FALSE,
  meta_data = item_level,
  cross_item_level,
  `cross-item_level`,
  meta_data_v2
)

Arguments

study_data

data.frame the data frame that contains the measurements

item_level

data.frame the data frame that contains metadata attributes of study data

label_col

variable attribute the name of the column in the metadata with labels of variables

threshold_value

numeric from=0 to=100. A numerical value ranging from 0 to 100.

meta_data_cross_item

data.frame contradiction rules table. Table defining contradictions. See online documentation for its required structure.

use_value_labels

logical Deprecated in favor of DATA_PREPARATION. If set to TRUE, labels can be used in the REDCap syntax to specify contradiction checks for categorical variables. If set to FALSE, contradictions have to be specified using the coded values. If this argument is not set in the function call, it will be set to TRUE if the metadata contains a column VALUE_LABELS which is not empty.

summarize_categories

logical Needs a column CONTRADICTION_TYPE in the meta_data_cross_item. If set, a summary output is generated for the defined categories plus one plot per category. TODO: Not yet controllable by metadata.

meta_data

data.frame old name for item_level

cross_item_level

data.frame alias for meta_data_cross_item

meta_data_v2

character path to workbook like metadata file, see prep_load_workbook_like_file for details. ALL LOADED DATAFRAMES WILL BE PURGED, using prep_purge_data_frame_cache, if you specify meta_data_v2.

`cross-item_level`

data.frame alias for meta_data_cross_item

Details

Algorithm of this implementation:

  • Remove missing codes from the study data (if defined in the metadata)

  • Remove measurements deviating from limits defined in the metadata

  • Assign label to levels of categorical variables (if applicable)

  • Apply contradiction checks (given as REDCap-like rules in a separate metadata table)

  • Identify measurements fulfilling contradiction rules. To this end, two output data frames are generated:

    • on the level of observation to flag each contradictory value combination, and

    • a summary table for each contradiction check.

  • A summary plot illustrating the number of contradictions is generated.
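
The steps above can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not the package's code: the data, the metadata columns (var, missing_code, lower, upper), and the rule name age_declined are hypothetical, and the rule is written as a plain R expression instead of a REDCap-like rule. Label assignment and the summary plot are omitted.

```r
# Hypothetical study data; 99 serves as missing code for both variables
study <- data.frame(
  id = 1:6,
  age_0 = c(45, 60, 99, 33, 150, 52),
  age_1 = c(47, 58, 50, 99, 60, 55)
)

# Hypothetical item-level metadata: missing codes and limits per variable
meta <- data.frame(
  var = c("age_0", "age_1"),
  missing_code = c(99, 99),
  lower = c(18, 18),
  upper = c(110, 110),
  stringsAsFactors = FALSE
)

# Remove missing codes and measurements outside the limits (set to NA)
clean <- study
for (i in seq_len(nrow(meta))) {
  v <- meta$var[i]
  x <- clean[[v]]
  x[!is.na(x) & x == meta$missing_code[i]] <- NA
  x[!is.na(x) & (x < meta$lower[i] | x > meta$upper[i])] <- NA
  clean[[v]] <- x
}

# Contradiction rules as named R expressions (stand-ins for REDCap-like rules)
rules <- list(age_declined = quote(age_1 < age_0))

# Observation level: one flag column per rule
flagged <- clean
for (r in names(rules)) {
  hit <- eval(rules[[r]], envir = clean)
  flagged[[r]] <- !is.na(hit) & hit
}

# Summary level: one row per rule
summary_table <- data.frame(
  rule = names(rules),
  n_contradictions = vapply(names(rules), function(r) sum(flagged[[r]]), integer(1)),
  row.names = NULL
)
summary_table$percent <- 100 * summary_table$n_contradictions / nrow(flagged)
```

In this sketch, values removed in the first two steps become NA, so a rule involving them cannot be evaluated and the observation is not flagged.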

List function.

Value

If summarize_categories is FALSE: A list with:

  • FlaggedStudyData: The first output of the contradiction function is a data frame with the same number of observations as the study data. For each applied check, an additional column is added that flags the observations showing a contradiction under that check.

  • VariableGroupData: The second output summarizes this information into one data frame. This output can be used to provide an executive overview on the amount of contradictions.

  • VariableGroupTable: A subset of VariableGroupData used within the pipeline.

  • SummaryPlot: This output visualizes the summarized information of SummaryData.

If summarize_categories is TRUE, other objects are returned: A list with one element Other, a list with the following entries:

  • One entry per category, named by that category (e.g. "Empirical"), containing a result for contradiction checks within that category only.

  • all_checks: A result as it would have been returned with summarize_categories set to FALSE.

Finally, in the top-level list, a slot SummaryData is returned containing sums per category, and an according ggplot2::ggplot in SummaryPlot.
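
How sums per category can be derived from per-check flags is sketched below in base R. This only illustrates the aggregation idea and does not reproduce the structure returned by the function: the flag columns, the check names, and the category "Logical" are made up for illustration, and how the package counts observations flagged by several checks of one category is not stated here.

```r
# Hypothetical flag columns, one per contradiction check
flags <- data.frame(
  check_1 = c(TRUE, FALSE, FALSE, TRUE),
  check_2 = c(FALSE, FALSE, TRUE, FALSE),
  check_3 = c(TRUE, TRUE, FALSE, FALSE)
)

# Hypothetical rules table assigning each check to a CONTRADICTION_TYPE
rules <- data.frame(
  check = c("check_1", "check_2", "check_3"),
  CONTRADICTION_TYPE = c("Empirical", "Logical", "Empirical"),
  stringsAsFactors = FALSE
)

# Number of flagged observations per check
per_check <- colSums(flags[rules$check])

# Sums per category; an observation flagged by two checks of the
# same category is counted twice in this simple version
per_category <- tapply(per_check, rules$CONTRADICTION_TYPE, sum)
per_category
```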

See Also

Online Documentation for the function

meta_data_cross

Online Documentation for the required cross-item-level metadata


dataquieR documentation built on Jan. 8, 2026, 5:08 p.m.