This function fits a mixed-effects model for clustered data with cluster-level missing values in the outcome.
Ym | an N by p outcome matrix from N clusters/batches/experiments, where p is the number of samples within each cluster. The first sample within each cluster is assumed to be a reference sample with a different error variance. Missing values are coded as NA.
Xm | a covariate array of dimension N by k by p, where k is the number of covariates.
Zm | a design array for the random effects, of dimension N by h by p, where h is the number of variables with random effects.
gamma | the parameter of the missing-data mechanism. The missingness of the outcome in cluster i depends on the mean of the outcome: the missing probability is modelled as exp(-gamma0 - gamma*mean(y)). The parameter gamma can be estimated by borrowing information across outcomes and finding the common missing-data patterns in the high-dimensional data, for example by estimating the relationship between the observed average value \bar{\mathbf{y}}_{i} and the missing rate, or it can be selected by the log-likelihood profile (see the reference). If gamma = 0, the missingness is ignorable. The parameter gamma0 does not affect the estimation in the EM algorithm and is mostly determined by the missing rate, so it is set to 0 in the estimation here.
maxIter | the maximum number of iterations of the EM algorithm.
tol | the tolerance level for the absolute change in the observed-data log-likelihood function.
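The roles of maxIter and tol can be illustrated with a generic stopping rule. The following is only a sketch in Python, not the package's code: the function `run_until_converged`, the callable `step`, the toy update, and the default values shown are all hypothetical and chosen for illustration.

```python
def run_until_converged(step, loglik0, max_iter=1000, tol=1e-4):
    """Iterate until the absolute change in log-likelihood drops below tol.

    `step` is a hypothetical callable that takes the current log-likelihood
    and returns the value after one more iteration. The defaults for
    max_iter and tol are arbitrary, not the package defaults.
    """
    loglik = loglik0
    history = [loglik]
    for it in range(1, max_iter + 1):
        new_loglik = step(loglik)
        history.append(new_loglik)
        if abs(new_loglik - loglik) < tol:
            return history, it, True      # converged within the budget
        loglik = new_loglik
    return history, max_iter, False       # stopped by the iteration cap


def toy_step(loglik):
    # Toy stand-in for one EM update: the log-likelihood closes half of
    # the remaining gap to -100 at every iteration.
    return loglik + 0.5 * (-100.0 - loglik)


history, n_iter, converged = run_until_converged(
    toy_step, loglik0=-200.0, max_iter=50, tol=1e-3
)
```

A smaller tol demands a flatter log-likelihood before stopping, while maxIter bounds the run time when that flatness is never reached.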
The model consists of two parts, the outcome model and the missing-data model. The outcome model is a mixed-effects model,
\mathbf{y}_{i} = \mathbf{X}_{i}\boldsymbol{\alpha}+\mathbf{Z}_{i}\mathbf{b}_{i}+\mathbf{e}_{i},
where \mathbf{y}_{i} is the outcome vector for the i-th cluster, \mathbf{X}_{i} is the covariate matrix, \boldsymbol{\alpha} is the vector of fixed effects, \mathbf{Z}_{i} is the design matrix for the random effects \mathbf{b}_{i}, and \mathbf{e}_{i} is the error term.
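To make the outcome model concrete, the sketch below draws one cluster from it in plain Python. It is an illustration under stated assumptions, not the package's code: the function `simulate_cluster` and all parameter values are hypothetical, the random effects are taken to be independent (a diagonal covariance), errors are normal, and each row of `X` and `Z` holds one sample, which is the transpose of a k by p or h by p slice of Xm or Zm.

```python
import random


def simulate_cluster(X, Z, alpha, D_sd, sigma0, sigma, rng):
    """Draw y_i = X_i alpha + Z_i b_i + e_i for a single cluster.

    X: p rows of k covariates; Z: p rows of h random-effect design values.
    D_sd: standard deviations of h independent random effects (assumed
          diagonal covariance, for illustration only).
    sigma0: error SD of the first (reference) sample.
    sigma: error SD of the remaining samples.
    """
    b = [rng.gauss(0.0, sd) for sd in D_sd]           # random effects b_i
    y = []
    for j in range(len(X)):
        fixed = sum(x * a for x, a in zip(X[j], alpha))
        rand = sum(z * bj for z, bj in zip(Z[j], b))
        err_sd = sigma0 if j == 0 else sigma          # reference sample differs
        y.append(fixed + rand + rng.gauss(0.0, err_sd))
    return y, b


rng = random.Random(1)
p = 4
X = [[1.0, float(j)] for j in range(p)]   # intercept plus one covariate (k = 2)
Z = [[1.0] for _ in range(p)]             # random intercept only (h = 1)
y, b = simulate_cluster(X, Z, alpha=[2.0, 0.5], D_sd=[1.0],
                        sigma0=0.3, sigma=0.6, rng=rng)
```

Stacking N such vectors row by row gives a matrix with the N by p layout described for Ym.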
The non-ignorable batch-level (or cluster-level) abundance-dependent missing-data model (BADMM) can be written as
\textrm{Pr}\left(M_{i}=1 \mid \mathbf{y}_{i}\right) = \exp\left(-\gamma_{0} - \gamma \bar{\mathbf{y}}_{i}\right),
where M_{i} is the missing indicator for the i-th cluster, and \bar{\mathbf{y}}_{i} is the average of \mathbf{y}_{i}. If M_{i}=1, the outcome of the i-th cluster, \mathbf{y}_{i}, is missing altogether. The estimation of the mixEMM model is implemented via an ECM algorithm. If \gamma \neq 0, i.e., the missingness depends on the outcome, the missing-data mechanism is missing not at random (MNAR); otherwise it is missing completely at random (MCAR) for the current model. The parameter \gamma can be estimated by borrowing information across outcomes and finding the common missing-data patterns in the high-dimensional data, for example by estimating the relationship between the observed average value \bar{\mathbf{y}}_{i} and the missing rate, or it can be selected by the log-likelihood profile (see the reference).
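The cluster-level missingness mechanism can be sketched in a few lines of Python. This is an illustration only: the helper names `missing_probability` and `apply_cluster_missingness` are hypothetical, `None` stands in for NA, and the cap at 1 is an added assumption, since the exponential form exceeds 1 whenever gamma0 + gamma * mean(y) is negative.

```python
import math
import random


def missing_probability(y, gamma, gamma0=0.0):
    """exp(-gamma0 - gamma * mean(y)), capped at 1 (the cap is an assumption)."""
    ybar = sum(y) / len(y)
    return min(1.0, math.exp(-gamma0 - gamma * ybar))


def apply_cluster_missingness(Y, gamma, gamma0=0.0, seed=0):
    """Blank out whole rows of Y (None stands in for NA) under the model."""
    rng = random.Random(seed)
    out = []
    for y in Y:
        if rng.random() < missing_probability(y, gamma, gamma0):
            out.append([None] * len(y))   # M_i = 1: the entire cluster is missing
        else:
            out.append(list(y))           # M_i = 0: the cluster is fully kept
    return out


# Three clusters with increasing average abundance.
Y = [[0.2, 0.4, 0.3], [2.0, 2.5, 1.5], [4.0, 3.5, 4.5]]
probs = [missing_probability(y, gamma=0.5) for y in Y]
masked = apply_cluster_missingness(Y, gamma=0.5)
```

With a positive gamma, clusters with a lower average outcome are more likely to be missing, which is the abundance-dependent pattern the model describes. With gamma = 0 the probability no longer depends on the outcome and is governed by gamma0 alone.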
A list containing
alpha.hat | the estimated fixed effects.
alpha.se | the standard errors of the estimated fixed effects.
sigma0.hat, sigma2.hat | the estimated sample error variances: one for the first (reference) sample and one for the other samples within each cluster/batch.
D | the estimated covariance matrix of the random effects.
RE | the estimated random effects.
loglikelihood | the observed-data log-likelihood values.
Chen, L. S., Wang, J., Wang, X., & Wang, P. (2017). A mixed-effects model for incomplete data from labeling-based quantitative proteomics experiments. The Annals of Applied Statistics, 11(1), 114-138. doi: 10.1214/16-AOAS994