# Generation of Gaussian Mixture Data for Classification

### Description

Generation of Gaussian mixture data for classification.

### Usage

1 2 3 4 5 6 7 | ```
mixtureData(n, prior, lambda, mu, sigma)
mixtureLabels(data, prior, lambda, mu, sigma)
mixturePosterior(data, prior, lambda, mu, sigma)
mixtureBayesClass(data, prior, lambda, mu, sigma)
``` |

### Arguments

`n` |
Number of observations. |

`prior` |
Vector of class prior probabilities. |

`lambda` |
The conditional probabilities for the
mixture components given the class. Either a vector (if
the same number |

`mu` |
The centers of the mixture components. A list
containing one |

`sigma` |
The covariance matrices of the mixture components. Either one single matrix that is used for each mixture component or a list as long as the number of classes. List elements can be matrices (in case that for all mixture components forming one class the same covariance matrix shall be used) or lists of matrices as long as the number of mixture components in the corresponding class. |

`data` |
A |

### Value

`mixtureData`

returns an object of class
`"locClass"`

, a list with components:

`x` |
(A matrix.) The explanatory variables. |

`y` |
(A factor.) The class labels. |

`mixtureLabels`

returns a factor of class labels.

`mixturePosterior`

returns a matrix of posterior
probabilities.

`mixtureBayesClass`

returns a factor of Bayes
predictions.

### Examples

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 | ```
## Simplest case:
# lambda vector, sigma matrix
# Generate a training and a test set
mu <- list(matrix(c(-1,-1),1), matrix(rep(c(1,4.5,12),2),3))
train <- mixtureData(n = 1000, prior = rep(0.5,2), lambda = list(1, rep(1/3,3)), mu = mu, sigma = 3*diag(2))
test <- mixtureData(n = 1000, prior = rep(0.5,2), lambda = list(1, rep(1/3,3)), mu = mu, sigma = 3*diag(2))
# Generate a grid of points
x.1 <- x.2 <- seq(-7,15,0.1)
grid <- expand.grid(x.1 = x.1, x.2 = x.2)
# Calculate the posterior probablities for all grid points
gridPosterior <- mixturePosterior(grid, prior = rep(0.5,2), lambda = list(1, rep(1/3,3)), mu = mu, sigma = 3*diag(2))
# Draw contour lines of posterior probabilities and plot training observations
contour(x.1, x.2, matrix(gridPosterior[,1], length(x.1)), col = "gray")
points(train$x, col = train$y)
# Calculate Bayes error
ybayes <- mixtureBayesClass(test$x, prior = rep(0.5,2), lambda = list(1, rep(1/3,3)), mu = mu, sigma = 3*diag(2))
mean(ybayes != test$y)
if (require(MASS)) {
# Fit an LDA model and calculate misclassification rate on the test data set
tr <- lda(y ~ ., data = as.data.frame(train))
pred <- predict(tr, as.data.frame(test))
mean(pred$class != test$y)
# Draw decision boundary
gridPred <- predict(tr, grid)
contour(x.1, x.2, matrix(gridPred$posterior[,1], length(x.1)), col = "red", levels = 0.5, add = TRUE)
}
## lambda list, sigma list of lists of matrices
mu <- list()
mu[[1]] <- matrix(c(1,5,1,5),2)
mu[[2]] <- matrix(c(8,11,8,11),2)
lambda <- list(c(0.5, 0.5), c(0.1, 0.9))
sigma <- list()
sigma[[1]] <- diag(2)
sigma[[2]] <- list(diag(2), 3*diag(2))
data <- mixtureData(n = 100, prior = c(0.3, 0.7), lambda, mu, sigma)
plot(data$x, col = data$y)
``` |