acc_shape_or_scale: Compare observed versus expected distributions

View source: R/acc_shape_or_scale.R

acc_shape_or_scale    R Documentation

Compare observed versus expected distributions

Description

This implementation contrasts the empirical distribution of a measurement variable against an assumed distribution. The approach is adapted from the idea of rootograms (Tukey 1977), which is also applicable to count data (Kleiber and Zeileis 2016).
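The rootogram idea can be sketched in base R as follows. This is only an illustration under stated assumptions, not the dataquieR implementation: the data are simulated Poisson counts, and the names x, obs, expected and rootogram are made up for this example. Observed and expected frequencies are compared on the square-root scale.

```r
# Sketch of a rootogram for count data; simulated data, base R only.
set.seed(1)
x <- rpois(500, lambda = 3)                      # hypothetical count variable

lambda_hat <- mean(x)                            # ML estimate of the Poisson mean
k <- 0:max(x)                                    # counts covered by the data

obs <- as.vector(table(factor(x, levels = k)))   # observed frequency per count
expected <- length(x) * dpois(k, lambda_hat)     # expected frequency per count

rootogram <- data.frame(
  count = k,
  sqrt_observed = sqrt(obs),
  sqrt_expected = sqrt(expected),
  # hanging-rootogram residual: positive where the model over-predicts
  residual = sqrt(expected) - sqrt(obs)
)
print(rootogram)
```

Residuals close to zero across all counts suggest that the assumed distribution describes the data reasonably well.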

Usage

acc_shape_or_scale(
  resp_vars,
  dist_col,
  guess,
  par1,
  par2,
  end_digits,
  label_col,
  study_data,
  meta_data,
  flip_mode = "noflip"
)

Arguments

resp_vars

variable the name of the continuous measurement variable

dist_col

variable attribute the name of the variable attribute in meta_data that provides the expected distribution of a study variable

guess

logical whether the parameters of the distribution should be estimated from the data

par1

numeric first parameter of the distribution if applicable

par2

numeric second parameter of the distribution if applicable

end_digits

logical internal use. Check for end digit preferences

label_col

variable attribute the name of the column in the metadata with labels of variables

study_data

data.frame the data frame that contains the measurements

meta_data

data.frame the data frame that contains metadata attributes of study data

flip_mode

enum default | flip | noflip | auto. Should the plot be in default orientation, flipped, not flipped or auto-flipped? Not all options are always supported. In general, this can be controlled by setting options(dataquieR.flip_mode = ...). If called from dq_report, you can also pass flip_mode to all function calls or set it for specific calls using specific_args.
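A minimal sketch of setting the flip mode for the whole session, assuming the option is read through base R's options() mechanism. The option name dataquieR.flip_mode and the value "flip" are taken from the argument description above; the variable names old and current are made up for this example.

```r
# Assumption: dataquieR reads this setting via base R's options().
old <- options(dataquieR.flip_mode = "flip")    # set the session-wide default
current <- getOption("dataquieR.flip_mode")     # value now in effect
print(current)
options(old)                                    # restore the previous setting
```

Storing the return value of options() makes it easy to restore the previous setting afterwards.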

Value

a list with:

  • SummaryData: data.frame underlying the plot

  • SummaryPlot: ggplot2 probability distribution plot

  • SummaryTable: data.frame with the columns Variables and GRADING

ALGORITHM OF THIS IMPLEMENTATION:

  • This implementation is restricted to data of type float or integer.

  • Missing codes are removed from resp_vars (if defined in the metadata)

  • The user must specify the metadata column that contains the expected probability distribution (currently only: normal, uniform, gamma)

  • The parameters of each distribution are either estimated from the data or specified by the user

  • A histogram-like plot contrasts the empirical with the expected (theoretical) distribution
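The steps above can be sketched in base R for the normal case. This is a simplified illustration, not the package's code: the data, the missing codes 99980 and 99981, and the binning via pretty() are assumptions made for this example.

```r
# Sketch of the listed steps for a normal distribution; base R only.
set.seed(42)
values <- c(rnorm(300, mean = 120, sd = 15), 99980, 99981, NA)  # hypothetical data
missing_codes <- c(99980, 99981)                # hypothetical missing codes

# Step 1: remove missing codes and NA values
x <- values[!is.na(values) & !(values %in% missing_codes)]

# Step 2: estimate the parameters (alternatively, set them by hand)
par1 <- mean(x)                                 # mean of the normal distribution
par2 <- sd(x)                                   # standard deviation

# Step 3: bin the data and compute the expected count per bin
breaks <- pretty(x, n = 15)
observed <- as.vector(table(cut(x, breaks, include.lowest = TRUE)))
expected <- length(x) * diff(pnorm(breaks, mean = par1, sd = par2))

summary_data <- data.frame(
  lower = breaks[-length(breaks)],
  upper = breaks[-1],
  observed = observed,
  expected = expected
)
print(summary_data)
```

Plotting observed against expected per bin gives a histogram-like comparison similar in spirit to the one described above.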

See Also

Online Documentation


dataquieR documentation built on July 26, 2023, 6:10 p.m.