acc_margins: Estimate marginal means, see emmeans::emmeans

View source: R/acc_margins.R

acc_margins    R Documentation

Estimate marginal means, see emmeans::emmeans

Description

acc_margins performs the calculations for the quality indicator "Unexpected distribution w.r.t. location". It combines descriptive and model-based statistics to investigate differences across the levels of an auxiliary variable.

CAT: Unexpected distribution w.r.t. location

Marginal means

Marginal means rest on model-based results, i.e., whether a marginal mean differs significantly depends on the sample size. Particularly in large studies, small and irrelevant differences may become significant. The contrary holds if the sample size is low.
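The dependence on sample size can be illustrated with a minimal base-R sketch. It is not part of dataquieR: the helper p_value_for_n and the toy data are hypothetical, and a plain lm two-group comparison stands in for the marginal-means model. Both groups have the same shape and differ by a fixed shift of 0.05; only the number of observations per group changes.

```r
# Hypothetical helper, for illustration only (not a dataquieR function).
# Returns the p-value of the group effect for two groups of size n that
# differ by a fixed shift in location.
p_value_for_n <- function(n, shift = 0.05) {
  # n evenly spread standard-normal quantiles: deterministic, no random draws
  base <- qnorm(ppoints(n))
  dat <- data.frame(
    y = c(base, base + shift),
    group = factor(rep(c("A", "B"), each = n))
  )
  fit <- lm(y ~ group, data = dat)
  summary(fit)$coefficients["groupB", "Pr(>|t|)"]
}

p_small <- p_value_for_n(20)     # 20 observations per group
p_large <- p_value_for_n(20000)  # 20000 observations per group

print(c(n_20 = p_small, n_20000 = p_large))
```

With 20 observations per group the shift is far from significant, whereas the identical shift yields a very small p-value with 20000 observations per group.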

Usage

acc_margins(
  resp_vars = NULL,
  group_vars = NULL,
  co_vars = NULL,
  threshold_type = NULL,
  threshold_value,
  min_obs_in_subgroup,
  study_data,
  meta_data,
  label_col
)

Arguments

resp_vars

variable the name of the continuous measurement variable

group_vars

variable list len=1-1. the name of the observer, device or reader variable

co_vars

variable list a vector of covariables, e.g. age and sex for adjustment

threshold_type

enum empirical | user | none. If empirical is chosen, a multiplier of the scale measure is used. If user is chosen, a value of the mean or probability (binary data) has to be defined, see Implementation and use of thresholds. If none is chosen, no thresholds are displayed and no flagging of unusual group levels is applied.

threshold_value

numeric a multiplier or absolute value, see Implementation and use of thresholds

min_obs_in_subgroup

integer from=0. Optional argument if "group_vars" is used. It specifies the minimum number of observations required to include a subgroup (level) of "group_vars" in the analysis. Subgroups with fewer observations are excluded. The default is 5.

study_data

data.frame the data frame that contains the measurements

meta_data

data.frame the data frame that contains metadata attributes of study data

label_col

variable attribute the name of the column in the metadata with labels of variables
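How min_obs_in_subgroup and an empirical threshold might interact can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: the toy data, all object names, and the choice of the standard deviation of the response as the scale measure are assumptions made for this example only.

```r
# Toy data, invented for illustration: one response, one grouping variable.
toy <- data.frame(
  y = c(10, 11, 9, 10, 12,    # examiner A
        10, 11, 9, 10, 11,    # examiner B
        14, 15, 13, 14, 15,   # examiner C
        30, 31),              # examiner D (only 2 observations)
  examiner = rep(c("A", "B", "C", "D"), times = c(5, 5, 5, 2))
)

min_obs <- 5      # plays the role of min_obs_in_subgroup
multiplier <- 1   # plays the role of threshold_value (empirical type)

# Step 1: keep only subgroups with at least min_obs observations.
counts <- table(toy$examiner)
kept <- names(counts)[counts >= min_obs]
toy_kept <- toy[toy$examiner %in% kept, ]

# Step 2: limits = overall mean +/- multiplier * scale measure.
# ASSUMPTION: the scale measure is the standard deviation of the response.
center <- mean(toy_kept$y)
limit <- multiplier * sd(toy_kept$y)

# Step 3: flag group levels whose mean lies outside the limits.
group_means <- tapply(toy_kept$y, toy_kept$examiner, mean)
flagged_levels <- names(group_means)[abs(group_means - center) > limit]

print(kept)
print(flagged_levels)
```

In this toy data, examiner D is dropped before any statistics are computed, so its extreme values do not influence the limits. Of the remaining levels, only C lies outside them.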

Details

Limitations

Selecting the appropriate distribution is complex. Dozens of continuous, discrete or mixed distributions are conceivable in the context of epidemiological data. Their exact exploration is beyond the scope of this data quality approach. The function above uses the helper function util_dist_selection, which distinguishes four cases:

  • continuous data

  • binary data

  • count data with <= 20 categories

  • count data with > 20 categories

Nonetheless, only three different plot types are generated. The fourth case is treated as continuous data. This coarsens the original data, but the approach was chosen for the sake of clarity.
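The four cases and the fallback of the fourth case to continuous data can be sketched as follows. The function classify_response is hypothetical and does not reproduce util_dist_selection: the rules used here (exactly two distinct values means binary, non-negative whole numbers mean count data) are simplifying assumptions for illustration.

```r
# Hypothetical classifier, for illustration only (not util_dist_selection).
classify_response <- function(x, max_count_categories = 20) {
  x <- x[!is.na(x)]
  n_distinct <- length(unique(x))
  # ASSUMPTION: non-negative whole numbers are treated as count data
  is_count <- all(x == round(x)) && all(x >= 0)

  if (n_distinct == 2) {
    case <- "binary"
  } else if (is_count && n_distinct <= max_count_categories) {
    case <- "count_small"
  } else if (is_count) {
    case <- "count_large"
  } else {
    case <- "continuous"
  }

  # Only three plot types: count data with many categories is
  # handled like continuous data.
  plot_type <- if (case == "count_large") "continuous" else case
  list(case = case, plot_type = plot_type)
}

res_binary <- classify_response(c(0, 1, 1, 0, NA))
res_count  <- classify_response(c(0, 1, 2, 3, 3))
res_many   <- classify_response(0:30)
res_cont   <- classify_response(c(1.5, 2.7, 3.1))

print(res_many)
```

A response with 31 distinct whole-number values is recognized as count data with more than 20 categories, but it is assigned the continuous plot type.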

Value

a list with:

  • SummaryTable: data frame underlying the plot

  • SummaryData: data frame

  • SummaryPlot: ggplot2 margins plot

See Also

Online Documentation

Examples

## Not run: 
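# runs spuriously slow on rhub
library(dataquieR) # provides acc_margins() and the LABEL constant used below
# load the example study data and metadata shipped with the package
load(system.file("extdata/study_data.RData", package = "dataquieR"))
load(system.file("extdata/meta_data.RData", package = "dataquieR"))
# marginal means of DBP_0 across the levels of USR_BP_0, adjusted for AGE_0
acc_margins(resp_vars = "DBP_0",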
            study_data = study_data,
            meta_data = meta_data,
            group_vars = "USR_BP_0",
            label_col = LABEL,
            co_vars = "AGE_0")

## End(Not run)

dataquieR documentation built on July 26, 2023, 6:10 p.m.