acc_distributions_prop: Plots and checks for distributions - Proportion

View source: R/acc_distributions.R

acc_distributions_prop    R Documentation

Description

Data quality indicator checks "Unexpected location" and "Unexpected proportion" with histograms.

Indicator

Usage

acc_distributions_prop(
  resp_vars = NULL,
  study_data,
  label_col,
  item_level = "item_level",
  check_param = "proportion",
  plot_ranges = TRUE,
  flip_mode = "noflip",
  meta_data = item_level,
  meta_data_v2
)

Arguments

resp_vars

variable list the names of the measurement variables

study_data

data.frame the data frame that contains the measurements

label_col

variable attribute the name of the column in the metadata with labels of variables

item_level

data.frame the data frame that contains metadata attributes of study data

check_param

enum any | location | proportion. Which type of check should be conducted (if possible): a check on the location of the mean or median value of the study data, a check on proportions of categories, or either of them if the necessary metadata is available.

plot_ranges

logical Should the plot show ranges and results from the data quality checks? (default: TRUE)

flip_mode

enum default | flip | noflip | auto. Should the plot be in default orientation, flipped, not flipped, or auto-flipped? Not all options are always supported. In general, this can be controlled by setting options(dataquieR.flip_mode = ...). If called from dq_report, you can also pass flip_mode to all function calls or set it for individual functions using specific_args.

meta_data

data.frame old name for item_level

meta_data_v2

character path to a workbook-like metadata file; see prep_load_workbook_like_file for details. ALL LOADED DATAFRAMES WILL BE PURGED, using prep_purge_data_frame_cache, if you specify meta_data_v2.
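The flip_mode description refers to a session-wide option. A minimal sketch of setting and restoring it, assuming the option is named dataquieR.flip_mode and is read through base R's options() mechanism (the value "auto" is taken from the enum above; the variable names are illustrative):

```r
# Set the session-wide plot orientation and remember the previous value.
old <- options(dataquieR.flip_mode = "auto")

# Read the option back to confirm what later calls would see.
current <- getOption("dataquieR.flip_mode")

# Restore the previous setting when done.
options(old)
```

Setting the option this way needs no package functions, so it can go at the top of a script before any plots are produced.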

Value

A list with:

  • SummaryTable: data.frame containing data quality checks for "Unexpected location" (FLG_acc_ud_loc) and "Unexpected proportion" (FLG_acc_ud_prop) for each response variable in resp_vars.

  • SummaryData: a data.frame containing data quality checks for "Unexpected location" and/or "Unexpected proportion", formatted for use in a report.

  • SummaryPlotList: list of ggplot2::ggplots for each response variable in resp_vars.

Algorithm of this implementation:

  • If no response variable is defined, select all variables of type float or integer in the study data.

  • Remove missing codes from the study data (if defined in the metadata).

  • Remove measurements deviating from (hard) limits defined in the metadata (if defined).

  • Exclude variables containing only NA or only one unique value (excluding NAs).

  • Perform check for "Unexpected location" if defined in the metadata (needs a LOCATION_METRIC (mean or median) and LOCATION_RANGE (range of expected values for the mean and median, respectively)).

  • Perform check for "Unexpected proportion" if defined in the metadata (needs PROPORTION_RANGE (range of expected values for the proportions of the categories)).

  • Plot histogram(s).
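The two checks in the list above can be sketched in base R. This is an illustration only, not the dataquieR implementation: the helper names check_location and check_proportion, the example vectors, and the ranges are all hypothetical, and applying one range to every category is a simplification of what PROPORTION_RANGE may allow.

```r
# Hypothetical helper: "Unexpected location".
# Flags a variable whose mean or median falls outside the expected range.
check_location <- function(x, metric = c("mean", "median"), range) {
  metric <- match.arg(metric)
  x <- x[!is.na(x)]                       # drop missing values first
  loc <- if (metric == "mean") mean(x) else median(x)
  list(value = loc, flagged = loc < range[1] || loc > range[2])
}

# Hypothetical helper: "Unexpected proportion".
# Flags each category whose share falls outside the expected range.
check_proportion <- function(x, range) {
  x <- x[!is.na(x)]
  props <- prop.table(table(x))           # share of each category
  data.frame(
    category   = names(props),
    proportion = as.numeric(props),
    flagged    = as.logical(props < range[1] | props > range[2])
  )
}

# Made-up measurements with one missing value.
sbp <- c(118, 122, 131, 127, NA, 140, 125)
loc_res <- check_location(sbp, metric = "median", range = c(110, 130))

# Made-up categorical variable with an uneven split.
sex <- c("f", "m", "f", "f", "f", NA, "f", "f", "f", "f", "m")
prop_res <- check_proportion(sex, range = c(0.3, 0.7))
```

In this sketch loc_res$flagged corresponds loosely to FLG_acc_ud_loc and prop_res$flagged to FLG_acc_ud_prop, with the difference that the proportion result here has one row per category.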

dataquieR documentation built on Jan. 8, 2026, 5:08 p.m.