residuals_mggmm: Calculation of Residuals for the Multi-Group GMM

View source: R/cellMGGMM.R

residuals_mggmm    R Documentation

Calculation of Residuals for the Multi-Group GMM

Description

This function calculates the cellwise residuals for each observation, based on the fitted parameters of a multi-group Gaussian Mixture Model (GMM) and the cellwise outlyingness pattern given in the matrix 'W'.

Usage

residuals_mggmm(X, groups, Sigma, mu, probs, W, set_to_zero = TRUE)

Arguments

X

A numeric data matrix or data frame with observations in rows and variables in columns.

groups

A vector indicating pre-defined group membership for each observation (length must match 'nrow(X)').

Sigma

A list of estimated covariance matrices.

mu

A list of estimated mean vectors.

probs

A matrix of posterior probabilities for each observation (rows) and group (columns).

W

A binary matrix indicating which entries are considered non-outlying (1 = clean, 0 = outlying). Same dimensions as 'X'.

set_to_zero

A logical value indicating whether residuals of non-outlying cells should be set to zero (default 'TRUE').

Details

A positive residual means that the observed value of the outlying variable is higher than expected given the other observed variables; a negative residual means that it is lower than expected. For non-outlying cells (i.e. where 'W[i, j] == 1'), the residual is set to zero when 'set_to_zero = TRUE'.
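As a rough illustration of the idea described above, the following base R sketch computes a residual for one observation under a single Gaussian component. It is not the package's internal code: the toy values of mu, Sigma, x and w are hypothetical, the mixture over groups (via 'probs') is ignored, and standardizing by the conditional standard deviation is an assumption.

```r
# Hypothetical single-group illustration; not the package's implementation.
mu    <- c(0, 0, 0)
Sigma <- matrix(c(1.0, 0.5, 0.2,
                  0.5, 1.0, 0.3,
                  0.2, 0.3, 1.0), nrow = 3, byrow = TRUE)
x <- c(0.1, 3.0, -0.2)  # one observation
w <- c(1, 0, 1)         # 1 = clean, 0 = outlying (second cell is flagged)

out   <- which(w == 0)
clean <- which(w == 1)

# Regression coefficients of the flagged cell on the clean cells
B <- Sigma[out, clean, drop = FALSE] %*% solve(Sigma[clean, clean, drop = FALSE])

# Conditional mean and variance of the flagged cell given the clean cells
cond_mean <- mu[out] + as.numeric(B %*% (x[clean] - mu[clean]))
cond_var  <- Sigma[out, out, drop = FALSE] - B %*% Sigma[clean, out, drop = FALSE]

# Assumed standardization: divide by the conditional standard deviation.
# Clean cells keep a residual of zero.
residual <- numeric(length(x))
residual[out] <- (x[out] - cond_mean) / sqrt(diag(cond_var))
residual
```

Here the flagged value 3.0 lies well above its conditional mean, so the second entry of residual is positive, while the two clean cells stay at zero.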

Value

A numeric matrix of residuals with the same dimensions as 'X'. Each cell contains the standardized deviation from the model-based conditional expectation, or zero if the cell was not flagged as outlying in 'W' and 'set_to_zero = TRUE'.
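One possible way to read such a matrix is to list the flagged cells together with the sign of their residuals. The sketch below uses small toy stand-ins for 'W' and the returned matrix; the values and the name summary_tbl are hypothetical and only mimic the shapes described above.

```r
# Toy stand-ins with hypothetical values, shaped like 'W' and the returned matrix.
W <- matrix(c(1, 1, 0,
              0, 1, 1), nrow = 2, byrow = TRUE,
            dimnames = list(NULL, c("p", "t", "rsum")))
res <- matrix(c( 0.0, 0, 2.7,
                -3.1, 0, 0.0), nrow = 2, byrow = TRUE,
              dimnames = dimnames(W))

# Row/column positions of the cells flagged as outlying (W == 0)
flagged <- which(W == 0, arr.ind = TRUE)

# One line per flagged cell: where it is, its residual, and the direction
summary_tbl <- data.frame(
  row       = flagged[, "row"],
  variable  = colnames(W)[flagged[, "col"]],
  residual  = res[flagged],
  direction = ifelse(res[flagged] > 0,
                     "higher than expected",
                     "lower than expected")
)
summary_tbl
```

With the matrix returned by residuals_mggmm, the same indexing would apply to 'model$W' and 'res' from the example in the Examples section.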

References

Puchhammer, P., Wilms, I., & Filzmoser, P. (2025). A smooth multi-group Gaussian Mixture Model for cellwise robust covariance estimation. arXiv preprint, doi:10.48550/arXiv.2504.02547.

See Also

cellMGGMM

Examples

library(ssMRCD)

# Load the weatherAUT2021 data set
data("weatherAUT2021")

# Define grid-based groups from longitude/latitude cut points
cut_lon = c(min(weatherAUT2021$lon) - 0.2, 12, 16, max(weatherAUT2021$lon) + 0.2)
cut_lat = c(min(weatherAUT2021$lat) - 0.2, 48, max(weatherAUT2021$lat) + 0.2)
groups = ssMRCD::groups_gridbased(weatherAUT2021$lon, weatherAUT2021$lat, cut_lon, cut_lat)
N = length(unique(groups))  # number of groups

# Fit the multi-group GMM on the selected variables
X = weatherAUT2021[, c("p", "s", "vv", "t", "rsum", "rel")]
model = cellMGGMM(X = X,
                  groups = groups,
                  alpha = 0.5)

# Compute cellwise residuals from the fitted parameters
res = residuals_mggmm(X = X,
                      groups = groups,
                      Sigma = model$Sigma,
                      mu = model$mu,
                      probs = model$probs,
                      W = model$W)

ssMRCD documentation built on Nov. 5, 2025, 7:44 p.m.