# A quick tour of SNMoE in meteorits: Mixture-of-Experts Modeling for Complex Non-Normal Distributions

```r
library(knitr)
knitr::opts_chunk$set(
  fig.align = "center",
  fig.height = 5.5,
  fig.width = 6,
  warning = FALSE,
  collapse = TRUE,
  dev.args = list(pointsize = 10),
  out.width = "90%",
  par = TRUE
)
knit_hooks$set(par = function(before, options, envir) {
  if (before && options$fig.show != "none")
    par(family = "sans", mar = c(4.1, 4.1, 1.1, 1.1), mgp = c(3, 1, 0), tcl = -0.5)
})
```
```r
library(meteorits)
```

# Introduction

SNMoE (Skew-Normal Mixtures-of-Experts) provides a flexible modelling framework for heterogeneous data with possibly skewed distributions, generalizing the standard normal mixture-of-experts model. SNMoE consists of a network of K skew-normal expert regressors (polynomial regressions of degree p) gated by a softmax gating network (a logistic regression of degree q), and is represented by:

- The gating network parameters: the `alpha`'s of the softmax network.
- The experts network parameters: the location parameters (regression coefficients) `beta`'s, the scale parameters `sigma`'s, and the skewness parameters `lambda`'s.

SNMoE thus generalises mixtures of (normal, skew-normal) distributions and mixtures of regressions with these distributions. For example, when $q=0$, we retrieve mixtures of (skew-normal or normal) regressions, and when both $p=0$ and $q=0$, it is a mixture of (skew-normal or normal) distributions. It also reduces to the standard (normal, skew-normal) distribution when only a single expert is used ($K=1$).
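
As a minimal sketch of the structure described above, the conditional density of a response given an input can be written in a few lines of base R. This is an illustration under stated assumptions, not the code used inside meteorits: the helper names `dskewnorm`, `gating_probs` and `snmoe_density` are hypothetical, `alpha` is assumed to hold one column of gating coefficients per expert except the last (taken as the reference, with zero coefficients), and `beta` is assumed to hold one column of regression coefficients per expert.

```r
# Skew-normal density with location mu, scale sigma and skewness lambda
# (lambda = 0 gives the normal density).
dskewnorm <- function(y, mu, sigma, lambda) {
  z <- (y - mu) / sigma
  2 / sigma * dnorm(z) * pnorm(lambda * z)
}

# Softmax gating probabilities at a single input x.
# Assumption: alpha is a (q + 1) x (K - 1) matrix; expert K is the reference.
gating_probs <- function(x, alpha) {
  design <- x^(0:(nrow(alpha) - 1))            # 1, x, ..., x^q
  eta <- c(drop(crossprod(alpha, design)), 0)  # linear predictors, 0 for expert K
  w <- exp(eta - max(eta))                     # subtract the max for stability
  w / sum(w)
}

# Conditional density of y given a single input x.
# Assumption: beta is a (p + 1) x K matrix of regression coefficients.
snmoe_density <- function(y, x, alpha, beta, sigma, lambda) {
  pik <- gating_probs(x, alpha)
  mu <- drop(crossprod(beta, x^(0:(nrow(beta) - 1))))  # expert locations at x
  dens <- numeric(length(y))
  for (k in seq_along(pik)) {
    dens <- dens + pik[k] * dskewnorm(y, mu[k], sigma[k], lambda[k])
  }
  dens
}

# Illustrative parameter values for K = 2, p = 1, q = 1
alpha <- matrix(c(0, 8), ncol = 1)
beta <- matrix(c(0, -2.5, 0, 2.5), ncol = 2)
sigma <- c(1, 1)
lambda <- c(3, 5)

probs <- gating_probs(0.3, alpha)
total <- integrate(
  function(y) snmoe_density(y, 0.3, alpha, beta, sigma, lambda),
  -Inf, Inf
)$value
```

Here `probs` sums to one and `total` should be numerically close to one, which is a quick check that the mixture is a proper density. Setting every entry of `lambda` to zero in this sketch gives a normal mixture of experts.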

The model is estimated by a dedicated expectation conditional maximization (ECM) algorithm that maximizes the observed-data log-likelihood. We provide simulated examples to illustrate the use of the model in model-based clustering of heterogeneous regression data and in fitting non-linear regression functions.
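
To make the quantities in that paragraph concrete, the following base R sketch evaluates the observed-data log-likelihood and the posterior expert probabilities (the responsibilities computed in an E-step) for fixed parameter values. It is a hedged illustration only: the responses in `y` are made up, the helper `dskewnorm` is hypothetical, the last expert is assumed to be the gating reference, and the remaining conditional expectations and the CM-steps of the ECM algorithm are not shown.

```r
# Skew-normal density with location mu, scale sigma and skewness lambda
dskewnorm <- function(y, mu, sigma, lambda) {
  z <- (y - mu) / sigma
  2 / sigma * dnorm(z) * pnorm(lambda * z)
}

# Fixed illustrative parameters for K = 2, p = 1, q = 1
alpha <- matrix(c(0, 8), ncol = 1)            # gating coefficients (expert 2 = reference)
beta <- matrix(c(0, -2.5, 0, 2.5), ncol = 2)  # one column of coefficients per expert
sigma <- c(1, 1)
lambda <- c(3, 5)

x <- seq(-1, 1, length.out = 5)
y <- c(2.9, 1.6, 0.4, 1.5, 3.1)               # made-up responses, for illustration only

design <- cbind(1, x)                         # design matrix (1, x), used for both networks

# Gating probabilities pi_k(x_i): row-wise softmax of the linear predictors
eta <- cbind(design %*% alpha, 0)
eta <- eta - apply(eta, 1, max)               # row-wise shift for numerical stability
pik <- exp(eta) / rowSums(exp(eta))

# Expert locations mu_k(x_i) and joint terms pi_k(x_i) * f_k(y_i | x_i)
mu <- design %*% beta
joint <- sapply(1:2, function(k) pik[, k] * dskewnorm(y, mu[, k], sigma[k], lambda[k]))

loglik <- sum(log(rowSums(joint)))            # observed-data log-likelihood
tau <- joint / rowSums(joint)                 # posterior probabilities of each expert
```

Each row of `tau` sums to one; assigning every observation to the expert with the largest posterior probability gives a hard clustering.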

This vignette was written in R Markdown, using the knitr package for production.

See `help(package="meteorits")` for further details and references provided by `citation("meteorits")`.

# Application to a simulated dataset

## Generate sample

```r
n <- 500 # Size of the sample
alphak <- matrix(c(0, 8), ncol = 1) # Parameters of the gating network
betak <- matrix(c(0, -2.5, 0, 2.5), ncol = 2) # Regression coefficients of the experts
lambdak <- c(3, 5) # Skewness parameters of the experts
sigmak <- c(1, 1) # Scale parameters of the experts
x <- seq.int(from = -1, to = 1, length.out = n) # Inputs (predictors)

# Generate sample of size n
sample <- sampleUnivSNMoE(alphak = alphak, betak = betak, sigmak = sigmak,
                          lambdak = lambdak, x = x)
y <- sample$y
```
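
For readers who want to see what such a simulation involves, here is a base R sketch of the generative process: draw an expert label from the gating probabilities, then draw the response from that expert's skew-normal distribution using its standard stochastic representation. It does not reproduce the internals of `sampleUnivSNMoE`; the variable names are hypothetical, and the second expert is assumed to be the gating reference.

```r
set.seed(1)
n <- 500
alpha <- c(0, 8)                              # gating coefficients (expert 2 = reference)
beta <- matrix(c(0, -2.5, 0, 2.5), ncol = 2)  # one column of coefficients per expert
sigma <- c(1, 1)
lambda <- c(3, 5)
x <- seq(-1, 1, length.out = n)

# Step 1: expert labels. With K = 2 the softmax reduces to a logistic function.
p1 <- plogis(alpha[1] + alpha[2] * x)         # probability of expert 1 at each x
z <- ifelse(runif(n) < p1, 1L, 2L)

# Step 2: responses. If U and V are independent standard normals, then
# delta * |U| + sqrt(1 - delta^2) * V is standard skew-normal with
# skewness lambda, where delta = lambda / sqrt(1 + lambda^2).
mu <- cbind(1, x) %*% beta                    # n x K matrix of expert locations
delta <- lambda / sqrt(1 + lambda^2)
u <- abs(rnorm(n))
v <- rnorm(n)
y_sim <- mu[cbind(seq_len(n), z)] +
  sigma[z] * (delta[z] * u + sqrt(1 - delta[z]^2) * v)
```

With these values the gating network switches sharply around `x = 0`, so `z` is mostly 2 for negative inputs and mostly 1 for positive inputs.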

## Set up SNMoE model parameters

```r
K <- 2 # Number of regressors/experts
p <- 1 # Order of the polynomial regression (regressors/experts)
q <- 1 # Order of the logistic regression (gating network)
```

## Set up EM parameters

```r
n_tries <- 1
max_iter <- 1500
threshold <- 1e-6
verbose <- TRUE
verbose_IRLS <- FALSE
```

## Estimation

```r
snmoe <- emSNMoE(X = x, Y = y, K, p, q, n_tries, max_iter,
                 threshold, verbose, verbose_IRLS)
```

## Summary

```r
snmoe$summary()
```

## Plots

### Mean curve

```r
snmoe$plot(what = "meancurve")
```

### Confidence regions

```r
snmoe$plot(what = "confregions")
```

### Clusters

```r
snmoe$plot(what = "clusters")
```

### Log-likelihood

```r
snmoe$plot(what = "loglikelihood")
```

# Application to a real dataset

```r
data("tempanomalies")
x <- tempanomalies$Year
y <- tempanomalies$AnnualAnomaly
```

## Set up SNMoE model parameters

```r
K <- 2 # Number of regressors/experts
p <- 1 # Order of the polynomial regression (regressors/experts)
q <- 1 # Order of the logistic regression (gating network)
```

## Set up EM parameters

```r
n_tries <- 1
max_iter <- 1500
threshold <- 1e-6
verbose <- TRUE
verbose_IRLS <- FALSE
```

## Estimation

```r
snmoe <- emSNMoE(X = x, Y = y, K, p, q, n_tries, max_iter,
                 threshold, verbose, verbose_IRLS)
```

## Summary

```r
snmoe$summary()
```

## Plots

### Mean curve

```r
snmoe$plot(what = "meancurve")
```

### Confidence regions

```r
snmoe$plot(what = "confregions")
```

### Clusters

```r
snmoe$plot(what = "clusters")
```

### Log-likelihood

```r
snmoe$plot(what = "loglikelihood")
```

meteorits documentation built on Jan. 11, 2020, 9:13 a.m.