library(knitr) knitr::opts_chunk$set( fig.align = "center", fig.height = 5.5, fig.width = 6, warning = FALSE, collapse = TRUE, dev.args = list(pointsize = 10), out.width = "90%", par = TRUE ) knit_hooks$set(par = function(before, options, envir) { if (before && options$fig.show != "none") par(family = "sans", mar = c(4.1,4.1,1.1,1.1), mgp = c(3,1,0), tcl = -0.5) })
library(meteorits)
StMoE (Skew-t Mixture-of-Experts) provides a flexible and robust modelling framework for heterogenous data with possibly skewed, heavy-tailed distributions and corrupted by atypical observations. StMoE consists of a mixture of K skew-t expert regressors network (of degree p) gated by a softmax gating network (of degree q) and is represented by:
alpha
's of the softmax net.beta
's, scale parameters sigma
's, the skewness parameters
lambda
's and the degree of freedom parameters nu
's. StMoE thus
generalises mixtures of (normal, skew-normal, t, and skew-t) distributions and
mixtures of regressions with these distributions. For example, when $q=0$, we
retrieve mixtures of (skew-t, t-, skew-normal, or normal) regressions, and when
both $p=0$ and $q=0$, it is a mixture of (skew-t, t-, skew-normal, or normal)
distributions. It also reduces to the standard (normal, skew-normal, t, and
skew-t) distribution when we only use a single expert ($K=1$).Model estimation/learning is performed by a dedicated expectation conditional maximization (ECM) algorithm by maximizing the observed data log-likelihood. We provide simulated examples to illustrate the use of the model in model-based clustering of heterogeneous regression data and in fitting non-linear regression functions.
It was written in R Markdown, using the knitr package for production.
See help(package="meteorits")
for further details and references provided by
citation("meteorits")
.
n <- 500 # Size of the sample alphak <- matrix(c(0, 8), ncol = 1) # Parameters of the gating network betak <- matrix(c(0, -2.5, 0, 2.5), ncol = 2) # Regression coefficients of the experts sigmak <- c(0.5, 0.5) # Standard deviations of the experts lambdak <- c(3, 5) # Skewness parameters of the experts nuk <- c(5, 7) # Degrees of freedom of the experts network t densities x <- seq.int(from = -1, to = 1, length.out = n) # Inputs (predictors) # Generate sample of size n sample <- sampleUnivStMoE(alphak = alphak, betak = betak, sigmak = sigmak, lambdak = lambdak, nuk = nuk, x = x) y <- sample$y
K <- 2 # Number of regressors/experts p <- 1 # Order of the polynomial regression (regressors/experts) q <- 1 # Order of the logistic regression (gating network)
n_tries <- 1 max_iter <- 1500 threshold <- 1e-5 verbose <- TRUE verbose_IRLS <- FALSE
stmoe <- emStMoE(X = x, Y = y, K, p, q, n_tries, max_iter, threshold, verbose, verbose_IRLS)
stmoe$summary()
stmoe$plot(what = "meancurve")
stmoe$plot(what = "confregions")
stmoe$plot(what = "clusters")
stmoe$plot(what = "loglikelihood")
library(MASS) data("mcycle") x <- mcycle$times y <- mcycle$accel
K <- 4 # Number of regressors/experts p <- 2 # Order of the polynomial regression (regressors/experts) q <- 1 # Order of the logistic regression (gating network)
n_tries <- 1 max_iter <- 1500 threshold <- 1e-5 verbose <- TRUE verbose_IRLS <- FALSE
stmoe <- emStMoE(X = x, Y = y, K, p, q, n_tries, max_iter, threshold, verbose, verbose_IRLS)
stmoe$summary()
stmoe$plot(what = "meancurve")
stmoe$plot(what = "confregions")
stmoe$plot(what = "clusters")
stmoe$plot(what = "loglikelihood")
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.