# Single Data Source Mixture Model

### Description

Fit a finite mixture model to a single source of data using one of several distributions.

### Usage

```
mixmod(X, K, family=names(LC_FAMILY), prior=NULL, iter.max=LC_ITER_MAX,
       dname=deparse(substitute(X)))

## S3 method for class 'mixmod'
print(x, ...)
```

### Arguments

| Argument | Description |
|---|---|
| `X` | for univariate data, a vector; for multivariate data, a matrix or data frame. Must consist only of numeric values. Each element of the vector, or each row of the matrix or data frame, should represent an independent observation. |
| `K` | the number of components, an integer greater than or equal to 1. |
| `family` | a string, one of the supported distribution family names given in |
| `prior` | prior probability distribution on |
| `iter.max` | the maximum number of iterations for the EM algorithm, by default equal to |
| `dname` | the name of the data. |
| `x` | an object of class |
| `...` | further arguments to |

### Details

In the finite mixture model used here, a hidden categorical random variable *Y*, which can take on values from 1 to some positive integer *K*, generates the distribution of the observed random variable *X*, from which the observed `X` is assumed to be drawn. Specifically, `mixmod` fits a mixture model of the form

*f(x) = sum_k p_k f_k(x)*

where *k = 1, …, K* and each *f_k(.)* is a density function on the sample space of *X*. The *p_k*'s, that is, the component probabilities, sum to 1.
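
As a small illustration of the density above, the following base R sketch evaluates a mixture of univariate normal components. The function name `mixture_density` and the parameter vectors `p`, `mu`, and `sigma` are hypothetical and chosen only for this example; they are not part of the package described here.

```r
# Mixture density f(x) = sum_k p_k f_k(x) with univariate normal components.
# `p`, `mu`, and `sigma` are illustrative parameter vectors of length K.
mixture_density <- function(x, p, mu, sigma) {
  stopifnot(length(p) == length(mu), length(mu) == length(sigma),
            abs(sum(p) - 1) < 1e-8)
  # N-by-K matrix whose k-th column holds p_k * f_k(x)
  weighted <- sapply(seq_along(p),
                     function(k) p[k] * dnorm(x, mu[k], sigma[k]))
  weighted <- matrix(weighted, nrow = length(x))
  rowSums(weighted)
}

p     <- c(0.3, 0.7)   # component probabilities, summing to 1
mu    <- c(-2, 1)      # component means
sigma <- c(1, 0.5)     # component standard deviations

# Density at a few points
dens <- mixture_density(c(-2, 0, 1), p, mu, sigma)

# Because the p_k sum to 1, the mixture integrates to 1 like any density
total <- integrate(mixture_density, -Inf, Inf,
                   p = p, mu = mu, sigma = sigma)$value
```

The normal family is used here only because `dnorm` is available in base R; any other component density could be substituted in the same way.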

The EM algorithm used in model fitting attempts to maximize the Q-value, that is, the expected complete data log-likelihood, for the model. The parameter values which maximize the Q-value also maximize the log-likelihood for the density given above.
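
The following is a minimal sketch of an EM fit for a univariate normal mixture in base R, showing where the log-likelihood and the Q-value come from. It is not the implementation used by `mixmod`: the function names `e_step` and `em_normal_mixture`, the quantile-based starting values, and the stopping tolerance `tol` are all assumptions made for this illustration.

```r
# E-step: posterior component probabilities, log-likelihood, and Q-value
e_step <- function(x, p, mu, sigma) {
  joint <- sapply(seq_along(p),
                  function(k) p[k] * dnorm(x, mu[k], sigma[k]))
  joint <- matrix(joint, nrow = length(x))  # p_k * f_k(x_n), N-by-K
  fx    <- rowSums(joint)                   # mixture density f(x_n)
  post  <- joint / fx                       # posterior P(Y = k | x_n)
  keep  <- post > 0                         # avoid 0 * log(0)
  list(post = post,
       llik = sum(log(fx)),
       # expected complete-data log-likelihood under the posterior
       qval = sum(post[keep] * log(joint[keep])))
}

em_normal_mixture <- function(x, K, iter.max = 200, tol = 1e-8) {
  # Crude deterministic start (an assumption of this sketch)
  p     <- rep(1 / K, K)
  mu    <- as.numeric(quantile(x, (seq_len(K) - 0.5) / K))
  sigma <- rep(sd(x), K)
  e     <- e_step(x, p, mu, sigma)
  llik_trace <- e$llik

  for (iter in seq_len(iter.max)) {
    # M-step: posterior-weighted maximum-likelihood updates
    nk    <- colSums(e$post)
    p     <- nk / length(x)
    mu    <- colSums(e$post * x) / nk
    sigma <- sqrt(colSums(e$post * outer(x, mu, "-")^2) / nk)

    e <- e_step(x, p, mu, sigma)
    llik_trace <- c(llik_trace, e$llik)
    # Stop once the log-likelihood no longer improves noticeably
    if (diff(tail(llik_trace, 2)) < tol) break
  }

  list(p = p, mu = mu, sigma = sigma, iter = iter,
       llik = e$llik, qval = e$qval, llik_trace = llik_trace,
       posterior = e$post,
       assignment = max.col(e$post, ties.method = "first"))
}

# Simulated two-component data
set.seed(1)
x   <- c(rnorm(300, mean = -2, sd = 1), rnorm(700, mean = 3, sd = 1))
fit <- em_normal_mixture(x, K = 2)
```

In this sketch the Q-value never exceeds the log-likelihood, since the two differ by the sum of the posterior probabilities times their logarithms, which is non-positive. The example output in the Examples section shows the same ordering between `qval` and `llik`.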

### Value

A list of class `mixmod`, having the following elements:

| Element | Description |
|---|---|
| `N` | the length of the data, that is, |
| `D` | the width of the data, that is, 1 if |
| `K` | the number of components in the mixture model. |
| `X` | the original data; if |
| `npar` | the total number of parameters in the model. |
| `npar.hidden` | the number of parameters for the hidden component portion of the model. |
| `npar.observed` | the number of parameters for the observed data portion of the model. |
| `iter` | the number of iterations required to fit the model. |
| `params` | the parameters estimated for the model. This is a list with elements |
| `stats` | a vector with named elements corresponding to the number of iterations, log-likelihood, Q-value, and BIC for the estimated parameters. |
| `weights` | a list with the single element |
| `pdfs` | a list with two elements: |
| `posterior` | the |
| `assignment` | the vector of length |
| `iteration.params` | a list of length |
| `iteration.stats` | a data frame of |
| `family` | the name of the distribution family used in the model. See |
| `distn` | the name of the actual distribution used in the model. See |
| `prior` | the value of the |
| `iter.max` | the maximum number of iterations allowed in model fitting. |
| `dname` | the name of the data. |
| `dattr` | attributes of the data, used by model likelihood functions to determine if the data have been scaled or otherwise transformed. |
| `kvec` | a vector of integers from 1 to |
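
The relationship between `npar`, `npar.hidden`, `npar.observed`, and the entries of `stats` can be checked against the output printed in the Examples section. The sketch below assumes, without confirmation from this page, that each multivariate normal component has an unrestricted mean vector and covariance matrix, and that the criteria use the "larger is better" convention of twice the log-likelihood (or Q-value) minus `npar * log(N)`. Under those assumptions the arithmetic matches the printed `bic` and `iclbic` values to within the rounding of the printed inputs.

```r
# Values copied from the example output in the Examples section
N    <- 10244      # observations
D    <- 4          # width of the data
K    <- 3          # components
llik <- -47499.54  # log-likelihood
qval <- -50052.71  # Q-value

# Assumed parameter counts (not stated in the documentation):
npar.hidden   <- K - 1                        # free component probabilities
npar.observed <- K * (D + D * (D + 1) / 2)    # mean + symmetric covariance per component
npar          <- npar.hidden + npar.observed  # 2 + 42 = 44

# Assumed definitions of the penalized criteria
bic    <- 2 * llik - npar * log(N)
iclbic <- 2 * qval - npar * log(N)

round(c(npar = npar, bic = bic, iclbic = iclbic), 2)
```

That the numbers agree is consistent with these assumptions but does not prove them; the package source remains the authority on how the statistics are defined.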

### Author(s)

Daniel Dvorkin

### References

McLachlan, G.J. and Krishnan, T. (2008) *The EM Algorithm and Extensions*, John Wiley & Sons.

### See Also

`LC_FAMILY` for distributions and families; `mdmixmod` for fitting multiple-data mixture models; `reporting` and `likelihood` for model reporting; `rocinfo` for performance evaluation; `convergencePlot` for behavior of the algorithm; `simulation` for simulating from the parameters of a model; packages `mixtools` and `mclust`.

### Examples

```
## Not run: 
data(CiData)
data(CiGene)
fit <- mixmod(CiData$expression, 3)
fit
# Normal mixture model ('mvnorm')
# Data 'CiData$expression' of size 10244-by-4 fitted to 3 components
# Model statistics:
#      iter       llik       qval        bic     iclbic
#     42.00  -47499.54  -50052.71  -95405.40 -100511.73
plot(rocinfo(fit, CiGene$target))
## End(Not run)
```