View source: R/acc_mahalanobis.R
| acc_mahalanobis | R Documentation |
Mahalanobis distances
A standard tool to calculate Mahalanobis distance.
In this approach the squared Mahalanobis distance is calculated for ordinal
variables (treated as continuous) to identify inattentive responses.
It calculates the distance for each observational unit from the sample mean.
The greater the distance, the more atypical the responses.
Indicator
acc_mahalanobis(
variable_group = NULL,
study_data,
item_level = "item_level",
meta_data = item_level,
meta_data_cross_item = "cross-item_level",
label_col = VAR_NAMES,
meta_data_v2,
cross_item_level,
`cross-item_level`,
mahalanobis_threshold =
suppressWarnings(as.numeric(getOption("dataquieR.MAHALANOBIS_THRESHOLD",
dataquieR.MAHALANOBIS_THRESHOLD_default)))
)
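The default expression for `mahalanobis_threshold` reads an R option and coerces it to a number. The following is a minimal sketch of how that lookup behaves in base R. The names `threshold_default` and `resolve_threshold` are hypothetical stand-ins introduced here; the value 0.975 is taken from the algorithm description in this page, not read from the package object `dataquieR.MAHALANOBIS_THRESHOLD_default`.

```r
# Hypothetical stand-in for the package-internal default object
# dataquieR.MAHALANOBIS_THRESHOLD_default (assumed to be 0.975 here).
threshold_default <- 0.975

# Same lookup pattern as in the usage above, wrapped in a helper for reuse.
resolve_threshold <- function() {
  suppressWarnings(as.numeric(
    getOption("dataquieR.MAHALANOBIS_THRESHOLD", threshold_default)
  ))
}

# Unset the option and remember the previous value so it can be restored.
old <- options(dataquieR.MAHALANOBIS_THRESHOLD = NULL)
unset_value <- resolve_threshold()    # falls back to the stand-in default

options(dataquieR.MAHALANOBIS_THRESHOLD = "0.99")
string_value <- resolve_threshold()   # character input is coerced to numeric

options(dataquieR.MAHALANOBIS_THRESHOLD = "not a number")
bad_value <- resolve_threshold()      # coercion fails quietly and yields NA

options(old)                          # restore the previous setting
```

Because of `suppressWarnings()`, a value that cannot be coerced yields `NA` without a warning, so it may be worth checking the option value before relying on it.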
variable_group | variable list the names of the variables used to calculate the |
study_data | data.frame the data frame that contains the measurements |
item_level | data.frame the data frame that contains metadata attributes of study data |
meta_data | data.frame old name for |
meta_data_cross_item | data.frame – Cross-item level metadata |
label_col | variable attribute the name of the column in the metadata containing the labels of the variables |
meta_data_v2 | character path or file name of the workbook like metadata file, see |
cross_item_level | data.frame alias for |
`cross-item_level` | data.frame alias for |
mahalanobis_threshold | numeric the confidence level to use to define |
a list with:
SummaryTable: data.frame underlying the plot
SummaryData: data.frame underlying the plot with speaking column labels
SummaryPlot: ggplot2::ggplot2 Q-Q plot of squared Mahalanobis distances vs. a theoretical chi-squared distribution showing outliers.
FlaggedStudyData: data.frame containing the original data of the variables used to calculate the squared Mahalanobis distances, with an additional column containing the squared Mahalanobis distance and a column called MD_outliers that contains 1 if the observational unit is considered a multivariate outlier.
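To make the shape of these outputs more concrete, here is a rough base-R sketch of a flagged data frame and the coordinates of a chi-squared Q-Q plot. It is not the package's implementation. The simulated `items` data, the column name `MD`, and the use of 0 for non-outliers are assumptions made for illustration; only `MD_outliers` and the value 1 for outliers come from the description above.

```r
set.seed(1)

# Hypothetical study data: 200 respondents, 4 ordinal items scored 1 to 5.
items <- as.data.frame(matrix(sample(1:5, 200 * 4, replace = TRUE), ncol = 4))
names(items) <- paste0("item_", 1:4)

# Squared Mahalanobis distance of each row from the sample mean.
d2 <- stats::mahalanobis(items, center = colMeans(items), cov = stats::cov(items))

# Cutoff: 0.975 quantile of a chi-squared distribution, df = number of items.
cutoff <- stats::qchisq(0.975, df = ncol(items))

# Flagged data: original columns plus the distance and an outlier indicator.
flagged <- items
flagged$MD <- d2                                # column name is an assumption
flagged$MD_outliers <- as.integer(d2 > cutoff)  # 1 = outlier, 0 assumed otherwise

# Q-Q coordinates: sorted observed distances against theoretical quantiles.
qq <- data.frame(
  theoretical = stats::qchisq(stats::ppoints(length(d2)), df = ncol(items)),
  observed = sort(d2)
)
```

Plotting `qq$observed` against `qq$theoretical` gives a Q-Q plot of this kind. Points far above the identity line in the upper tail correspond to the rows flagged in `MD_outliers`.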
Implementation is restricted to variables of type integer
Remove missing codes from the study data (if defined in the metadata)
The covariance matrix is estimated for all variables from variable_group
The Mahalanobis distance of each observation is calculated
MD^2_i = (x_i - \mu)^T \Sigma^{-1} (x_i - \mu)
By default, an observation is considered an outlier if its squared Mahalanobis
distance exceeds the 0.975 quantile of a theoretical chi-square distribution
whose degrees of freedom equal the number of variables used to calculate the
distance (Mayrhofer and Filzmoser, 2023)
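The steps above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's code. The data frame `study`, the missing code 99, and the choice to drop incomplete rows are hypothetical, since the steps above do not say how incomplete observations are handled.

```r
set.seed(42)
n <- 100

# Hypothetical study data: three integer-coded items scored 1 to 5.
study <- data.frame(
  q1 = sample(1:5, n, replace = TRUE),
  q2 = sample(1:5, n, replace = TRUE),
  q3 = sample(1:5, n, replace = TRUE)
)
study$q2[c(3, 17)] <- 99L  # 99 is an assumed missing code

# Step 1: only integer variables are allowed.
stopifnot(all(vapply(study, is.integer, logical(1))))

# Step 2: remove missing codes (assumption: incomplete rows are dropped).
study[study == 99L] <- NA
complete <- study[stats::complete.cases(study), ]

# Step 3: estimate the mean vector and the covariance matrix.
x <- as.matrix(complete)
mu <- colMeans(x)
sigma <- stats::cov(x)

# Step 4: MD^2_i = (x_i - mu)^T Sigma^{-1} (x_i - mu), computed row-wise.
centered <- sweep(x, 2, mu)
md2 <- rowSums((centered %*% solve(sigma)) * centered)

# Step 5: flag observations above the 0.975 chi-square quantile, df = 3.
cutoff <- stats::qchisq(0.975, df = ncol(x))
is_outlier <- md2 > cutoff
```

The explicit matrix formula in step 4 gives the same values as `stats::mahalanobis(x, mu, sigma)`; it is written out here only to mirror the equation.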