This repository contains the code used to automatically generate the summary data and figures of the manuscript "Gaussian mixtures in R". Its main purpose is to compare the computational and statistical performances of R packages that estimate Gaussian mixture models (for now, in the univariate case only). Specifically, we compare the packages Rmixmod, mixtools, bgmm, mclust, EMCluster, GMKMcharlie, flexmix and DCEM. Additionally, otrimle is included to estimate parameters in the presence of outliers, and mixsmsn is dedicated to the estimation of skewed GMMs.
# load useful libraries and packages
library(ggplot2)
import::from(magrittr, "%>%", .into = "operators")
import::from(rebmix, .except = c("AIC", "BIC", "split"))
library(mclust)
library(Rmixmod)
# each entry pairs a package label with its wrapper function and extra arguments
relevant_mixture_functions <- list(
  "otrimle" = list(name_fonction = em_otrimle, list_params = list()),
  "mixsmsn" = list(name_fonction = em_mixsmsn, list_params = list()),
  "em R" = list(name_fonction = emnmix, list_params = list()),
  "Rmixmod" = list(name_fonction = em_Rmixmod, list_params = list()),
  "mixtools" = list(name_fonction = em_mixtools, list_params = list()),
  "bgmm" = list(name_fonction = em_bgmm, list_params = list()),
  "mclust" = list(name_fonction = em_mclust, list_params = list(prior = NULL)),
  "EMCluster" = list(name_fonction = em_EMCluster, list_params = list()),
  "GMKMcharlie" = list(name_fonction = em_GMKMcharlie, list_params = list()),
  "flexmix" = list(name_fonction = em_flexmix, list_params = list()),
  "DCEM" = list(name_fonction = em_DCEM, list_params = list())
)
##################################################################
## Compare statistical performances of the packages ##
##################################################################
four_components_statistical_performances <- benchmark_distribution_parameters(
  mixture_functions = relevant_mixture_functions,
  sigma_values = list("high OVL" = rep(2, 4)),
  mean_values = list(c(0, 4, 8, 12)),
  proportions = list("highly unbalanced" = c(0.1, 0.7, 0.1, 0.1)),
  # close the skewness list here so that Nbootstrap and nobservations
  # are passed to the benchmark function rather than into the list
  skewness_values = list("null skewness" = rep(0, 4)),
  Nbootstrap = 200, nobservations = c(2000)
)
#################################################################
## Save results (example with the four components simulation) ##
#################################################################
# save summary scores and distributions of the bootstrap simulations
openxlsx::write.xlsx(four_components_statistical_performances$local_scores, file = "tables/four_components_local_scores.xlsx", asTable = TRUE)
openxlsx::write.xlsx(four_components_statistical_performances$global_scores, file = "tables/four_components_global_scores.xlsx", asTable = TRUE)
openxlsx::write.xlsx(four_components_statistical_performances$distributions, file = "tables/four_components_distributions.xlsx", asTable = TRUE)
# save boxplots associated with the distribution of the estimates
# (uses the benchmark object created above)
unbalanced_overlapping_boxplots <- four_components_statistical_performances$plots$`2000_observations_UR_0.9_skewness_0_OVL_0.08_prop_outliers_0`
ggsave("images/four_components_unbalanced_overlapping_boxplots.pdf", unbalanced_overlapping_boxplots,
  width = 15, height = 14, dpi = 600)
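The `relevant_mixture_functions` list in the script pairs each package label with a wrapper function (`name_fonction`) and a list of extra arguments (`list_params`). As a minimal sketch of how such a registry can be dispatched, the example below calls every registered function on the same sample with `do.call`. The estimators `estimate_moments` and `estimate_robust` are hypothetical stand-ins written for this illustration, not functions of the package, and the way the package itself consumes `list_params` is assumed rather than confirmed.

```r
# Hypothetical toy estimators standing in for the package wrappers.
# Each takes a numeric vector `x` and returns a named list of estimates.
estimate_moments <- function(x, ...) {
  list(mean = mean(x), sd = stats::sd(x))
}
estimate_robust <- function(x, constant = 1.4826, ...) {
  list(mean = stats::median(x), sd = stats::mad(x, constant = constant))
}

# Registry with the same shape as `relevant_mixture_functions`
toy_functions <- list(
  "moments" = list(name_fonction = estimate_moments, list_params = list()),
  "robust" = list(name_fonction = estimate_robust, list_params = list(constant = 1.4826))
)

set.seed(1)
x <- rnorm(500, mean = 3, sd = 2)

# Call every registered estimator on the same sample,
# appending its extra arguments to the common input
estimates <- lapply(toy_functions, function(entry) {
  do.call(entry$name_fonction, c(list(x = x), entry$list_params))
})

# One row per estimator, labelled by the registry names
summary_table <- do.call(rbind, lapply(estimates, as.data.frame))
print(summary_table)
```

Keeping the extra arguments next to the function means a single loop can treat all packages uniformly, which is convenient when the same simulated data must be passed to many estimators.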
To get the most recent version, open R and run:
if(!require(remotes)) install.packages("remotes")
remotes::install_github("bastienchassagnol-servier/RGMMBench")
The package is composed of four scripts. The main script loads the required libraries and runs the auxiliary functions needed to reproduce the figures and tables of the paper Gaussian Mixtures in R. The mixture script lists the functions used to simulate a Gaussian mixture and to estimate its parameters with the EM algorithm. The benchmark script gathers the two functions used to compare the computational and statistical performances of R packages in learning GMMs. Finally, the visualisation script provides three functions: two to compare the computational performances of the packages graphically, and one to draw the boxplots of the bootstrap simulations.
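To make the simulate-then-estimate step concrete, here is a minimal base R sketch of drawing from a univariate Gaussian mixture and fitting it with a plain EM loop. It is not the package's implementation: the function names `simulate_gmm` and `em_gmm`, their arguments, the stopping rule and the starting values are all assumptions chosen for illustration.

```r
# Hypothetical helper: draw n observations from a univariate Gaussian mixture
# with proportions p, means mu and standard deviations sigma
simulate_gmm <- function(n, p, mu, sigma) {
  z <- sample(seq_along(p), size = n, replace = TRUE, prob = p)
  list(x = rnorm(n, mean = mu[z], sd = sigma[z]), z = z)
}

# Hypothetical helper: plain EM for a univariate Gaussian mixture,
# started from the supplied initial values
em_gmm <- function(x, p, mu, sigma, itmax = 500, epsilon = 1e-8) {
  k <- length(p)
  loglik_old <- -Inf
  for (iter in seq_len(itmax)) {
    # E-step: weighted component densities, one column per component
    dens <- sapply(seq_len(k), function(j) p[j] * dnorm(x, mu[j], sigma[j]))
    loglik <- sum(log(rowSums(dens)))
    eta <- dens / rowSums(dens) # posterior membership probabilities
    # M-step: closed-form updates of proportions, means and standard deviations
    nk <- colSums(eta)
    p <- nk / length(x)
    mu <- colSums(eta * x) / nk
    sigma <- sqrt(colSums(eta * outer(x, mu, "-")^2) / nk)
    # stop when the log-likelihood no longer improves noticeably
    if (abs(loglik - loglik_old) < epsilon) break
    loglik_old <- loglik
  }
  list(p = p, mu = mu, sigma = sigma, loglik = loglik, iterations = iter)
}

set.seed(42)
sim <- simulate_gmm(n = 2000, p = c(0.3, 0.7), mu = c(0, 6), sigma = c(1, 1))
fit <- em_gmm(sim$x, p = c(0.5, 0.5), mu = c(-1, 5), sigma = c(2, 2))
print(fit[c("p", "mu", "sigma")])
```

The example uses two well-separated components so that EM recovers the true parameters easily. With strongly overlapping or unbalanced components, such as the four-component setting in the script, the estimates depend much more on the initial values, which is the kind of behaviour the benchmark is meant to quantify.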
ArXiv publication associated with the paper:
Originally developed from a course on mixture models and the use of the EM algorithm for complex maximum likelihood estimation, taught by Gregory Nuel.