acc_shape_or_scale: Compare observed versus expected distributions
In dataquieR: Data Quality in Epidemiological Research

acc_shape_or_scale

R Documentation

Compare observed versus expected distributions

Description

This implementation contrasts the empirical distribution of a measurement variable against an assumed distribution. The approach is adapted from the idea of rootograms (Tukey 1977), which are also applicable to count data (Kleiber and Zeileis 2016).
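
The rootogram idea can be illustrated with a small base R sketch. It is not dataquieR's internal code: the simulated data, the variable names, and the choice of a Poisson distribution are assumptions made only to show how observed and expected frequencies are compared on the square-root scale for count data.

```r
# Sketch of the rootogram idea for count data (base R only).
# Poisson is used purely for illustration; all names are hypothetical.
set.seed(1)                          # reproducible simulated counts
x <- rpois(500, lambda = 3)          # hypothetical count measurements

lambda_hat <- mean(x)                # Poisson parameter estimated from the data
k <- 0:max(x)                        # count values covered by the observed data

observed <- as.vector(table(factor(x, levels = k)))  # observed frequency per count
expected <- length(x) * dpois(k, lambda_hat)         # expected frequency per count

# A rootogram compares frequencies on the square-root scale, which
# makes deviations in sparsely populated tails easier to see.
rootogram <- data.frame(
  count         = k,
  sqrt_observed = sqrt(observed),
  sqrt_expected = sqrt(expected),
  deviation     = sqrt(expected) - sqrt(observed)
)
print(rootogram)
```

Rows with a large absolute `deviation` mark counts where the assumed distribution fits the data poorly.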

Usage

acc_shape_or_scale(
  resp_vars,
  study_data,
  label_col,
  item_level = "item_level",
  dist_col,
  guess,
  par1,
  par2,
  end_digits,
  flip_mode = "noflip",
  meta_data = item_level,
  meta_data_v2
)

Arguments

`resp_vars`	variable the name of the continuous measurement variable
`study_data`	data.frame the data frame that contains the measurements
`label_col`	variable attribute the name of the column in the metadata with labels of variables
`item_level`	data.frame the data frame that contains metadata attributes of study data
`dist_col`	variable attribute the name of the variable attribute in meta_data that provides the expected distribution of a study variable
`guess`	logical estimate parameters
`par1`	numeric first parameter of the distribution if applicable
`par2`	numeric second parameter of the distribution if applicable
`end_digits`	logical internal use: check for end-digit preferences
`flip_mode`	enum default \| flip \| noflip \| auto. Should the plot be in default orientation, flipped, not flipped, or auto-flipped? Not all options are always supported. In general, this can be controlled by setting `options(dataquieR.flip_mode = ...)`. If called from `dq_report`, you can also pass `flip_mode` to all function calls or set it for specific calls using `specific_args`.
`meta_data`	data.frame old name for `item_level`
`meta_data_v2`	character path to workbook like metadata file, see `prep_load_workbook_like_file` for details. ALL LOADED DATAFRAMES WILL BE PURGED, using `prep_purge_data_frame_cache`, if you specify `meta_data_v2`.
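
As a minimal sketch of the option-based control described for `flip_mode`, the following uses only base R's `options()` and `getOption()`. The value `"flip"` is one of the enum values listed above; that dataquieR reads the option in exactly this way is an assumption taken from the argument description, not something verified here.

```r
# Set the session-wide plot orientation option (base R options mechanism).
# options() returns the previous values, so they can be restored later.
old <- options(dataquieR.flip_mode = "flip")

# Read the option back to confirm what is currently set.
current <- getOption("dataquieR.flip_mode")
print(current)

# Restore the previous setting so later calls are unaffected.
options(old)
```

Passing `flip_mode` directly in the call affects that call only, while the option sets a default for the whole session.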

Value

a list with:

ResultData: data.frame underlying the plot
SummaryPlot: ggplot2::ggplot2 probability distribution plot
SummaryTable: data.frame with the columns Variables and FLG_acc_ud_shape

ALGORITHM OF THIS IMPLEMENTATION:

This implementation is restricted to data of type float or integer.
Missing codes are removed from resp_vars (if defined in the metadata).
The user must specify the column of the metadata that contains the expected probability distribution (currently only normal, uniform, or gamma).
Parameters of each distribution are either estimated from the data or specified by the user.
A histogram-like plot contrasts the empirical distribution with the expected (theoretical) distribution.
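
The steps above can be sketched in base R as follows. This is an illustration under stated assumptions, not dataquieR's implementation: the simulated values, the missing codes `99980` and `99981`, the choice of a normal distribution, and all variable names are hypothetical.

```r
# Sketch of the listed steps in base R; not dataquieR's internal code.
set.seed(42)
missing_codes <- c(99980, 99981)                      # hypothetical missing codes
values <- c(rnorm(300, mean = 120, sd = 15),          # hypothetical measurements
            missing_codes)                            # plus two coded missing values

# Steps 1-2: require numeric data, then drop missing codes and NAs.
stopifnot(is.numeric(values))
clean <- values[!(values %in% missing_codes) & !is.na(values)]

# Steps 3-4: assume a normal distribution and estimate its parameters
# from the data (they could also be supplied by the user instead).
mu_hat    <- mean(clean)
sigma_hat <- sd(clean)

# Step 5: contrast observed bin counts with the counts expected
# under the fitted distribution, using the same histogram bins.
h <- hist(clean, plot = FALSE)
expected <- length(clean) * diff(pnorm(h$breaks, mean = mu_hat, sd = sigma_hat))

comparison <- data.frame(
  bin_mid  = h$mids,
  observed = h$counts,
  expected = expected
)
print(comparison)
```

Bins where `observed` and `expected` differ strongly point to deviations from the assumed distribution, which is the kind of contrast the histogram-like plot makes visible.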
