library(knitr) knitr::opts_chunk$set( fig.align = "center", fig.height = 5.5, fig.width = 6, warning = FALSE, collapse = TRUE, dev.args = list(pointsize = 10), out.width = "90%", par = TRUE ) knit_hooks$set(par = function(before, options, envir) { if (before && options$fig.show != "none") par(family = "sans", mar = c(4.1,4.1,1.1,1.1), mgp = c(3,1,0), tcl = -0.5) })
library(meteorits)
StMoE (Skew-t Mixture-of-Experts) provides a flexible and robust modelling framework for heterogenous data with possibly skewed, heavy-tailed distributions and corrupted by atypical observations. StMoE consists of a mixture of K skew-t expert regressors network (of degree p) gated by a softmax gating network (of degree q) and is represented by:
alpha's of the softmax net.beta's, scale parameters sigma's, the skewness parameters
lambda's and the degree of freedom parameters nu's. StMoE thus
generalises mixtures of (normal, skew-normal, t, and skew-t) distributions and
mixtures of regressions with these distributions. For example, when $q=0$, we
retrieve mixtures of (skew-t, t-, skew-normal, or normal) regressions, and when
both $p=0$ and $q=0$, it is a mixture of (skew-t, t-, skew-normal, or normal)
distributions. It also reduces to the standard (normal, skew-normal, t, and
skew-t) distribution when we only use a single expert ($K=1$).Model estimation/learning is performed by a dedicated expectation conditional maximization (ECM) algorithm by maximizing the observed data log-likelihood. We provide simulated examples to illustrate the use of the model in model-based clustering of heterogeneous regression data and in fitting non-linear regression functions.
It was written in R Markdown, using the knitr package for production.
See help(package="meteorits") for further details and references provided by
citation("meteorits").
n <- 500 # Size of the sample alphak <- matrix(c(0, 8), ncol = 1) # Parameters of the gating network betak <- matrix(c(0, -2.5, 0, 2.5), ncol = 2) # Regression coefficients of the experts sigmak <- c(0.5, 0.5) # Standard deviations of the experts lambdak <- c(3, 5) # Skewness parameters of the experts nuk <- c(5, 7) # Degrees of freedom of the experts network t densities x <- seq.int(from = -1, to = 1, length.out = n) # Inputs (predictors) # Generate sample of size n sample <- sampleUnivStMoE(alphak = alphak, betak = betak, sigmak = sigmak, lambdak = lambdak, nuk = nuk, x = x) y <- sample$y
K <- 2 # Number of regressors/experts p <- 1 # Order of the polynomial regression (regressors/experts) q <- 1 # Order of the logistic regression (gating network)
n_tries <- 1 max_iter <- 1500 threshold <- 1e-5 verbose <- TRUE verbose_IRLS <- FALSE
stmoe <- emStMoE(X = x, Y = y, K, p, q, n_tries, max_iter, threshold, verbose, verbose_IRLS)
stmoe$summary()
stmoe$plot(what = "meancurve")
stmoe$plot(what = "confregions")
stmoe$plot(what = "clusters")
stmoe$plot(what = "loglikelihood")
library(MASS) data("mcycle") x <- mcycle$times y <- mcycle$accel
K <- 4 # Number of regressors/experts p <- 2 # Order of the polynomial regression (regressors/experts) q <- 1 # Order of the logistic regression (gating network)
n_tries <- 1 max_iter <- 1500 threshold <- 1e-5 verbose <- TRUE verbose_IRLS <- FALSE
stmoe <- emStMoE(X = x, Y = y, K, p, q, n_tries, max_iter, threshold, verbose, verbose_IRLS)
stmoe$summary()
stmoe$plot(what = "meancurve")
stmoe$plot(what = "confregions")
stmoe$plot(what = "clusters")
stmoe$plot(what = "loglikelihood")
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.