Fit a finite mixture model to a single source of data using one of several distributions.
mixmod(X, K, family=names(LC_FAMILY), prior=NULL, iter.max=LC_ITER_MAX,
       dname=deparse(substitute(X)))
## S3 method for class 'mixmod'
print(x, ...)
X |
for univariate data, a vector; for multivariate data, a matrix or data frame. Must consist only of numeric values. Each element of the vector, or each row of the matrix or data frame, should represent an independent observation. |
K |
the number of components, an integer greater than or equal to 1. |
family |
a string, one of the supported distribution family names given in |
prior |
prior probability distribution on Y. This feature is under development and its use is not currently recommended. |
iter.max |
the maximum number of iterations for the EM algorithm, by default equal to |
dname |
the name of the data. |
x |
an object of class |
... |
further arguments to |
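The requirements on X above (numeric only, one observation per element or per row) can be checked before fitting. The following is a minimal sketch in base R; `data_shape` is a hypothetical helper written for illustration and is not part of the package.

```r
# Hypothetical helper (not part of the package): validate X and report its shape.
# N is the number of observations, D the number of variables per observation.
data_shape <- function(X) {
    # A data frame is converted to a matrix; non-numeric columns make it non-numeric
    if (is.data.frame(X)) X <- as.matrix(X)
    stopifnot(is.numeric(X))
    if (is.matrix(X)) {
        c(N = nrow(X), D = ncol(X))
    } else {
        c(N = length(X), D = 1L)
    }
}

data_shape(c(1.2, 3.4, 5.6))   # univariate: a vector of 3 observations
data_shape(iris[, 1:4])        # multivariate: 150 rows, 4 numeric columns
```

Passing a data frame that contains a factor or character column (for example the full `iris` data) makes the helper stop with an error, which mirrors the "numeric values only" requirement.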
In the finite mixture model used here, a hidden categorical random variable Y, which takes values from 1 to some positive integer K, determines the distribution of the observed random variable X, from which the observed data X are assumed to be drawn.
Specifically, mixmod fits a mixture model of the form
f(x) = sum_k p_k f_k(x)
where k = 1, …, K and each f_k(.) is a density function on the sample space of X. The p_k's, that is, the component probabilities, sum to 1.
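As an illustration of the density above, the sketch below evaluates f(x) for a two-component univariate normal mixture in base R. The parameter values and the function name `mixture_density` are assumptions made up for this example; they do not come from the package.

```r
# Hypothetical parameters for a two-component normal mixture (illustrative only)
p     <- c(0.3, 0.7)   # component probabilities p_k, summing to 1
mu    <- c(-1, 2)      # component means
sigma <- c(0.5, 1.5)   # component standard deviations

# f(x) = sum_k p_k f_k(x), with each f_k a normal density
mixture_density <- function(x, p, mu, sigma) {
    comp <- sapply(seq_along(p), function(k) p[k] * dnorm(x, mu[k], sigma[k]))
    # Force a length(x)-by-K matrix so that rowSums also works for a single x
    comp <- matrix(comp, nrow = length(x))
    rowSums(comp)
}

x  <- seq(-4, 8, by = 0.5)
fx <- mixture_density(x, p, mu, sigma)

# Because the p_k sum to 1, the mixture integrates to 1 like any density
total <- integrate(mixture_density, -Inf, Inf,
                   p = p, mu = mu, sigma = sigma)$value
```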
The EM algorithm used in model fitting attempts to maximize the Q-value, that is, the expected complete data log-likelihood, for the model. The parameter values which maximize the Q-value also maximize the log-likelihood for the density given above.
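To make the E-step and M-step concrete, here is a minimal EM sketch for a univariate normal mixture in base R. It is not the package's implementation: `em_normal_mixture`, its quantile-based initialisation, and its stopping rule are assumptions chosen for brevity. It records both the log-likelihood and the Q-value at each iteration.

```r
# Minimal EM sketch for a K-component univariate normal mixture.
# Hypothetical helper; NOT the algorithm as implemented by the package.
em_normal_mixture <- function(x, K, iter.max = 200, tol = 1e-8) {
    N <- length(x)
    # Crude initialisation: equal weights, means spread over the data quantiles
    p     <- rep(1 / K, K)
    mu    <- as.numeric(quantile(x, probs = (seq_len(K) - 0.5) / K))
    sigma <- rep(sd(x), K)
    llik  <- numeric(0)
    qval  <- numeric(0)

    for (iter in seq_len(iter.max)) {
        # E-step: N-by-K matrix of p_k f_k(x_n), normalised by row to posteriors
        joint <- matrix(sapply(seq_len(K), function(k)
                               p[k] * dnorm(x, mu[k], sigma[k])), nrow = N)
        fx   <- rowSums(joint)
        post <- joint / fx
        llik[iter] <- sum(log(fx))
        # Q-value: expected complete-data log-likelihood at current parameters
        qval[iter] <- sum(ifelse(post > 0, post * log(joint), 0))

        # M-step: posterior-weighted maximum-likelihood updates
        Nk    <- colSums(post)
        p     <- Nk / N
        mu    <- colSums(post * x) / Nk
        sigma <- sqrt(colSums(post * outer(x, mu, "-")^2) / Nk)

        if (iter > 1 && abs(llik[iter] - llik[iter - 1]) < tol) break
    }
    list(p = p, mu = mu, sigma = sigma, posterior = post,
         llik = llik, qval = qval, iter = iter)
}

# Simulated data from two well-separated components
set.seed(1)
x   <- c(rnorm(300, mean = -2, sd = 1), rnorm(700, mean = 3, sd = 1))
fit <- em_normal_mixture(x, K = 2)
```

In this sketch the log-likelihood trace `fit$llik` never decreases from one iteration to the next, and the Q-value never exceeds the log-likelihood, since the two differ by the entropy of the posterior. Like any EM fit, the result depends on the starting values and is only guaranteed to be a local optimum.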
A list of class mixmod, having the following elements:
N |
the length of the data, that is, |
D |
the width of the data, that is, 1 if |
K |
the number of components in the mixture model. |
X |
the original data; if |
npar |
the total number of parameters in the model. |
npar.hidden |
the number of parameters for the hidden component portion of the model. |
npar.observed |
the number of parameters for the observed data portion of the model. |
iter |
the number of iterations required to fit the model. |
params |
the parameters estimated for the model. This is a list with elements |
stats |
a vector with named elements corresponding to the number of iterations, log-likelihood, Q-value, and BIC for the estimated parameters. |
weights |
a list with the single element |
pdfs |
a list with two elements: |
posterior |
the N-by-K matrix of which the (n,k)th element is the estimated posterior probability that the nth observation was generated by the kth component. Equal to the |
assignment |
the vector of length N of which the nth element is the most probable component to have generated the nth observation. In other words, |
iteration.params |
a list of length |
iteration.stats |
a data frame of |
family |
the name of the distribution family used in the model. See |
distn |
the name of the actual distribution used in the model. See |
prior |
the value of the |
iter.max |
the maximum number of iterations allowed in model fitting. |
dname |
the name of the data. |
dattr |
attributes of the data, used by model likelihood functions to determine if the data have been scaled or otherwise transformed. |
kvec |
a vector of integers from 1 to K. |
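The relationship between the posterior and assignment elements described above can be sketched in base R. The posterior matrix here is invented for illustration; with a fitted model one would use the corresponding elements of the returned list.

```r
# Hypothetical 4-by-3 posterior matrix (N = 4, K = 3); each row sums to 1
posterior <- matrix(c(0.80, 0.15, 0.05,
                      0.10, 0.70, 0.20,
                      0.25, 0.25, 0.50,
                      0.05, 0.90, 0.05),
                    ncol = 3, byrow = TRUE)

# Most probable component for each observation: index of the row maximum
assignment <- apply(posterior, 1, which.max)
assignment
# 1 2 3 2

# Component sizes, using a 1..K index vector so empty components still appear
kvec <- seq_len(ncol(posterior))
table(factor(assignment, levels = kvec))
```

`which.max` returns the first maximum, so ties are resolved in favour of the lower-numbered component in this sketch.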
Daniel Dvorkin
McLachlan, G.J. and Krishnan, T. (2008) The EM Algorithm and Extensions, John Wiley & Sons.
LC_FAMILY for distributions and families; mdmixmod for fitting multiple-data mixture models; reporting and likelihood for model reporting; rocinfo for performance evaluation; convergencePlot for behavior of the algorithm; simulation for simulating from the parameters of a model; packages mixtools and mclust.
## Not run:
data(CiData)
data(CiGene)
fit <- mixmod(CiData$expression, 3)
fit
# Normal mixture model ('mvnorm')
# Data 'CiData$expression' of size 10244-by-4 fitted to 3 components
# Model statistics:
#      iter      llik      qval       bic     iclbic
#     42.00 -47499.54 -50052.71 -95405.40 -100511.73
plot(rocinfo(fit, CiGene$target))
## End(Not run)
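The printed statistics can be cross-checked with a little arithmetic. The sketch below is an inference from the output, not a documented formula: it assumes full-covariance multivariate normal components and the convention "2 * value - npar * log(N)" for both bic and iclbic. Under those assumptions the numbers reproduce to within rounding of the printed inputs.

```r
# Values copied from the printed model statistics
N    <- 10244
llik <- -47499.54
qval <- -50052.71

# Assumption: K = 3 full-covariance normal components in D = 4 dimensions
K <- 3
D <- 4
npar_hidden   <- K - 1                       # free component probabilities
npar_observed <- K * (D + D * (D + 1) / 2)   # means plus covariance entries
npar <- npar_hidden + npar_observed

# Assumed convention: larger (less negative) is better
bic    <- 2 * llik - npar * log(N)
iclbic <- 2 * qval - npar * log(N)
round(c(npar = npar, bic = bic, iclbic = iclbic), 2)
```

Under this sign convention a larger value indicates a better trade-off between fit and complexity, which is the reverse of the more common "-2 * llik + npar * log(N)" form.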