Fits a layered or chained mixture model to a list representing multiple sources of data, using a choice of distributions and number of components for each data source.
mdmixmod(X, K, K0=min(K), topology=LC_TOPOLOGY, family=NULL, prior=NULL,
    prefit=TRUE, iter.max=LC_ITER_MAX, dname=deparse(substitute(X)))

## S3 method for class 'mdmixmod'
print(x, ...)
X |
a list of observed data sources; the elements must be numeric vectors, matrices, or data frames. Each element of |
K |
the vector of numbers of mixture components for the hidden variables corresponding to each observed data source. If |
K0 |
the number of mixture components for the top-level hidden variable. |
topology |
one of the model topologies in |
family |
a vector of names of distribution families to be used in fitting the models for each observed data source; each element of |
prior |
prior probability distribution on Y_0. This feature is under development and its use is not currently recommended. |
prefit |
logical; if |
iter.max |
the maximum number of iterations for the EM algorithm, by default equal to |
dname |
the name of the data. |
x |
an object of class |
... |
further arguments to |
In the layered model, a top-level hidden categorical random variable Y_0, which can take on values from 1 to some positive integer K_0, generates categorical hidden random variables Y_1, …, Y_Z for some positive integer Z. For z = 1,…,Z, each Y_z can take on values from 1 to some positive integer K_z. In the chained model, Y_0 generates Y_1, which in turn generates Y_2, etc., up to Y_{Z-1}, which generates Y_Z.
In both models, the Y_z's generate the observed mixture random variables X_1, …, X_Z, from which the elements of the observed data X are assumed to be drawn. (That is, Z = length(X), the number of list elements in X.) The relationship between each Y_z and X_z is the same as the relationship between Y and X in mixmod.
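The difference between the two topologies can be illustrated by simulating the hidden variables directly. The following is a minimal base-R sketch that does not use any function from this package; the helper names (`draw_child`, `random_trans`), the transition matrices, and the normal component distributions are all hypothetical choices made for illustration.

```r
# Base-R sketch of the two topologies; every name and parameter value here is
# hypothetical and none of this calls the package's own functions.
set.seed(1)
N  <- 500        # number of observations
K0 <- 2          # number of states of the top-level hidden variable Y_0
K  <- c(2, 3)    # numbers of states of Y_1 and Y_2, so Z = 2
Z  <- length(K)

# Draw one child state per parent state from the rows of a transition matrix.
draw_child <- function(parent, trans) {
  vapply(parent, function(k) sample(ncol(trans), 1, prob = trans[k, ]),
         integer(1))
}

# Random row-stochastic matrix with the given dimensions.
random_trans <- function(nr, nc) {
  m <- matrix(runif(nr * nc), nr, nc)
  m / rowSums(m)
}

Y0 <- sample(K0, N, replace = TRUE, prob = c(0.6, 0.4))

# Layered: every Y_z is generated directly by Y_0.
Y_layered <- lapply(seq_len(Z),
                    function(z) draw_child(Y0, random_trans(K0, K[z])))

# Chained: Y_1 is generated by Y_0, and each later Y_z by Y_{z-1}.
Y_chained <- vector("list", Z)
parent   <- Y0
parent_K <- K0
for (z in seq_len(Z)) {
  Y_chained[[z]] <- draw_child(parent, random_trans(parent_K, K[z]))
  parent   <- Y_chained[[z]]
  parent_K <- K[z]
}

# Observed data: X_z is drawn from a component distribution selected by Y_z
# (unit-variance normals with means 0, 3, 6, ... as an arbitrary choice).
X <- lapply(Y_chained, function(y) rnorm(N, mean = 3 * (y - 1)))
```

Here `X` is a list with one element per data source, so `length(X)` equals Z, matching the shape of input described above.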
As in mixmod, the EM algorithm attempts to maximize the Q-value, that is, the expected complete data (hidden and observed variables) log-likelihood.
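For reference, the Q-value can be written out as below. The notation is introduced here for illustration and is not taken from the package: θ is the full parameter set, θ^(t) the current estimate, f_z the component density for data source z, and n indexes observations, which are assumed independent.

```latex
Q(\theta \mid \theta^{(t)})
  = \mathrm{E}\!\left[ \log p(X, Y \mid \theta) \,\middle|\, X, \theta^{(t)} \right]

% Layered topology: every Y_z depends directly on Y_0.
\log p(X, Y \mid \theta)
  = \sum_{n=1}^{N} \Big[ \log P(y_{n0})
    + \sum_{z=1}^{Z} \log P(y_{nz} \mid y_{n0})
    + \sum_{z=1}^{Z} \log f_z(x_{nz} \mid y_{nz}) \Big]

% Chained topology: for z >= 2, replace P(y_{nz} | y_{n0})
% with P(y_{nz} | y_{n,z-1}).
```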
A list of class mdmixmod, a subclass of mixmod, having the following elements:
N |
the length of the data, that is, |
Z |
the size of the data, that is, |
D |
the vector of widths of the data, that is, |
K |
the vector of the numbers of components in the lower-level mixture models. |
K0 |
the number of components in the top-level mixture model, that is, K_0. |
X |
the original data, with data frames converted to matrices. If the elements of |
npar |
the total number of parameters in the model. |
npar.hidden |
the number of parameters for the hidden component portion of the model. |
npar.observed |
the number of parameters for the observed data portion of the model. |
iter |
the number of iterations required to fit the model. |
params |
the parameters estimated for the model. This is a list with elements |
stats |
a vector with named elements corresponding to the number of iterations, log-likelihood, Q-value, and BIC for the estimated parameters. |
weights |
a list with elements |
pdfs |
a list with elements |
posterior |
the N-by-K_0 matrix of which the (n,k_0)th element is the estimated posterior probability that the nth observation (across all data sources) was generated by the k_0th component. Equal to the |
assignment |
the vector of length N of which the nth element is the most probable top-level component to have generated the nth observation. In other words, |
iteration.params |
a list of length |
iteration.stats |
a data frame of |
topology |
the topology of the model. |
family |
the vector of names of the distribution families used in the model. See |
distn |
the vector of names of the actual distributions used in the model. See |
iter.max |
the maximum number of iterations allowed in model fitting. |
dname |
the name of the data. |
dattr |
attributes of the data, used by model likelihood functions to determine if the data have been scaled or otherwise transformed. |
zvec |
the vector of names of |
kvec |
a list of which the zth element is a vector of integers from 1 to K_z. |
k0vec |
a vector of integers from 1 to K_0. |
prior |
the value of the |
marginals |
if |
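The relationship between some of these elements can be sketched in base R. The example below uses a made-up posterior matrix rather than a fitted model, and the way `assignment`, `k0vec`, and `kvec` are built here follows the descriptions above; it is an illustration, not the package's own code.

```r
# Toy posterior matrix for N = 4 observations and K_0 = 2 top-level components.
# The values are invented for illustration; each row sums to 1.
posterior <- matrix(c(0.9, 0.1,
                      0.2, 0.8,
                      0.6, 0.4,
                      0.3, 0.7), ncol = 2, byrow = TRUE)

# Most probable top-level component for each observation.
assignment <- apply(posterior, 1, which.max)

# Index vectors as described: integers from 1 to K_0, and from 1 to each K_z.
K0 <- ncol(posterior)
K  <- c(2, 3, 2)                 # hypothetical lower-level component counts
k0vec <- seq_len(K0)
kvec  <- lapply(K, seq_len)
```

With this toy matrix, `assignment` is `1 2 1 2`, since each entry is the column holding the largest posterior probability in that row.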
Daniel Dvorkin
McLachlan, G.J. and Krishnan, T. (2008) The EM Algorithm and Extensions, John Wiley & Sons.
LC_FAMILY for distributions and families; mixmod for fitting single-data mixture models; reporting and likelihood for model reporting; rocinfo for performance evaluation; convergencePlot for behavior of the algorithm; simulation for simulating from the parameters of a model.
## Not run: 
data(CiData)
data(CiGene)
fit <- mdmixmod(CiData, c(2,3,2), topology="chained",
    family=c("pvii", "norm", "pvii"))
fit
# Chained (PVII, normal, PVII) mixture model ('pvii', 'mvnorm', 'pvii')
# Data 'CiData' of size 10244-by-(1,4,1) fitted to 2 (2,3,2) components
# Model statistics:
#      iter      llik      qval        bic     iclbic
#    377.00 -75859.81 -87065.28 -152310.62 -174721.56
margs <- marginals(fit)
allFits <- c(list(chained=fit), margs)
plot(multiroc(allFits, CiGene$target))

## End(Not run)
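The statistics printed in the example output can be cross-checked by hand. The sketch below assumes the convention BIC = 2 * llik - npar * log(N) and a parameter count of 64. Both are inferred from the printed numbers rather than stated on this page, so treat them as assumptions.

```r
llik <- -75859.81   # log-likelihood, copied from the printed output
N    <- 10244       # number of observations, copied from the printed output
npar <- 64          # assumed parameter count (inferred, not documented here)

# Assumed convention: larger BIC is better, penalty subtracted from 2 * llik.
bic <- 2 * llik - npar * log(N)
round(bic, 2)
```

Under these assumptions the result agrees with the printed `bic` value of -152310.62 to two decimal places.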