Cellwise Robust Multi-Group Gaussian Mixture Model

knitr::opts_chunk$set(
  collapse = TRUE,
  warning = FALSE, 
  fig.dim = c(7, 4.5),
  comment = "#>"
)

This vignette reproduces the weather example described in Puchhammer, Wilms and Filzmoser (2025). The original data are from GeoSphere Austria (2022) and are included in this package.

library(ssMRCD)
library(ggplot2)
library(dplyr)

Data Preparation

The original data from GeoSphere Austria (2022) are pre-cleaned and stored in the data frame weatherAUT2021. Additional information can be found on its help page.

# open the help page with metadata for the data set
?weatherAUT2021
# load the data
data("weatherAUT2021")

# inspect the data
head(weatherAUT2021)

# select variables, station names and number of observations
data = weatherAUT2021 %>% select(p:rel)
stations = weatherAUT2021$name
n = nrow(data)

The predefined groups are based on the underlying geographical landscape of Austria, which consists of Alpine mountains, hills and flatter areas.

# build 5 groups of observations based on spatial proximity and geography
cut_lon = c(min(weatherAUT2021$lon)-0.2, 12, 16, max(weatherAUT2021$lon) + 0.2)
cut_lat = c(min(weatherAUT2021$lat)-0.2, 48, max(weatherAUT2021$lat) + 0.2)
groups = ssMRCD::groups_gridbased(weatherAUT2021$lon, 
                                  weatherAUT2021$lat, 
                                  cut_lon, 
                                  cut_lat)
N = length(unique(groups))
table(groups)
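
To make the grid-based grouping above more concrete, the following is a minimal base R sketch of how points could be assigned to grid cells from cut points. It is an illustration only, not the implementation of groups_gridbased: the coordinates, the fixed outer cut points and the name toy_groups are hypothetical, and the group numbering produced by the package may differ.

```r
# Hypothetical toy coordinates (not the weather data)
lon <- c(10.5, 11.9, 13.0, 15.2, 16.5, 16.8)
lat <- c(47.2, 47.5, 48.3, 47.1, 48.2, 47.9)

# Cut points: 3 longitude intervals and 2 latitude intervals
cut_lon <- c(10, 12, 16, 17)
cut_lat <- c(47, 48, 49)

# Index of the longitude and latitude interval each point falls into
ix <- findInterval(lon, cut_lon)
iy <- findInterval(lat, cut_lat)

# Label each occupied grid cell, then renumber to consecutive integers
cell <- paste(ix, iy, sep = "_")
toy_groups <- as.integer(factor(cell))

# Number of observations per occupied cell
table(toy_groups)
```

In this sketch only occupied cells receive a label, so a 3 x 2 grid can yield fewer than six groups.
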
# calculate MG-GMM
model = cellMGGMM(X = data, groups = groups,
                  nsteps = 100, alpha = 0.5,
                  maxcond = 100)
# mixture probabilities
cat("Pi (in %):\n")
round(model$pi_groups*100, 2)
# percentage of outliers
cat("% Outliers per group and variable:\n")
round(sapply(1:N, function(x) colMeans(1-model$W[groups == x, ]))*100, 2)
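
The outlier percentages above are column means of 1 - W within each group, which suggests that W holds 1 for a cell that is kept and 0 for a cell that is flagged. The following sketch repeats the same computation on a small hand-made matrix so the arithmetic can be checked by eye. The matrix, the variable names v1 to v3 and toy_groups are hypothetical and not taken from the fitted model.

```r
# Hypothetical 6 x 3 weight matrix: 1 = cell kept, 0 = cell flagged
W <- matrix(c(1, 1, 0,
              1, 0, 1,
              1, 1, 1,
              0, 1, 1,
              1, 1, 1,
              1, 0, 0),
            ncol = 3, byrow = TRUE,
            dimnames = list(NULL, c("v1", "v2", "v3")))
toy_groups <- c(1, 1, 1, 2, 2, 2)
group_ids <- sort(unique(toy_groups))

# Share of flagged cells per variable (rows) and group (columns), in percent
pct_out <- sapply(group_ids, function(g) {
  colMeans(1 - W[toy_groups == g, , drop = FALSE])
}) * 100
colnames(pct_out) <- paste0("group", group_ids)

round(pct_out, 2)
```

Using drop = FALSE keeps the subset a matrix even if a group contains a single observation, in which case colMeans would otherwise fail on the resulting vector.
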
# calculate residuals
res = residuals_mggmm(X = data, 
                groups = groups,
                Sigma = model$Sigma,
                mu = model$mu, 
                probs = model$probs,
                W = model$W)

References

GeoSphere Austria (2022): https://data.hub.geosphere.at.

Puchhammer P., Wilms I. and Filzmoser P. (2025): A smooth multi-group Gaussian Mixture Model for cellwise robust covariance estimation. https://doi.org/10.48550/arXiv.2504.02547




ssMRCD documentation built on Nov. 5, 2025, 7:44 p.m.