A fast, modified implementation of the Li et al. (2011) EM-like algorithm for estimating the parameters that maximize the GMCM likelihood function.
PseudoEMAlgorithm(x, theta, eps = 1e-04, max.ite = 1000,
  verbose = FALSE, trace.theta = FALSE, meta.special.case = FALSE,
  convergence.criterion = c("absGMCM", "GMCM", "GMM", "Li", "absLi"))
x |
A matrix of observations where rows correspond to features and columns to experiments. |
theta |
A list of parameters formatted as described in
|
eps |
The maximum difference required to achieve convergence. |
max.ite |
The maximum number of iterations. |
verbose |
Logical. Set to |
trace.theta |
Logical. If |
meta.special.case |
Logical. If |
convergence.criterion |
Character. Sets the convergence criterion. If
|
When either "absGMCM" or "absLi" is used, the parameters corresponding to the largest observed likelihood are returned. These are not necessarily the parameters from the last iteration.
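The selection rule described above can be illustrated with a minimal base-R sketch. This is not the package's implementation: the trace values and the names `loglik_trace`, `theta_trace`, and `best` are hypothetical, chosen only to show why the returned estimate may come from an earlier iteration when the likelihood trace is not monotone.

```r
# Hypothetical log-likelihood values over 6 iterations (not produced by GMCM).
# The trace rises and then dips, so the last iterate is not the best one.
loglik_trace <- c(-120.4, -98.7, -91.2, -89.5, -90.1, -90.0)

# Hypothetical parameter estimates, one placeholder list per iteration
theta_trace <- lapply(seq_along(loglik_trace),
                      function(i) list(iteration = i))

# Select the iterate with the largest observed log-likelihood
best <- which.max(loglik_trace)
best_theta <- theta_trace[[best]]

best                  # index of the best iteration
length(loglik_trace)  # index of the last iteration
```

In this toy trace the largest value occurs before the final iteration, so an estimate chosen by maximum likelihood differs from the last one computed.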
A list of 3 or 4 elements is returned, depending on the value of
trace.theta:
theta |
A list containing the final parameter
estimate in the format of |
loglik.tr |
A matrix with different log-likelihood traces in each row. |
kappa |
A matrix
where the (i,j)'th entry is the probability that |
theta.tr |
A list of the parameter estimates obtained at each iteration, in the format of
|
The algorithm is highly sensitive to the starting parameters, which should therefore be chosen carefully.
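One common way to reduce sensitivity to starting values is to run an optimizer from several starting points and keep the best result. The base-R sketch below shows the idea on a toy one-dimensional objective with two local maxima. It does not use the GMCM likelihood; `toy_loglik`, `starts`, and `best_fit` are hypothetical names introduced only for illustration.

```r
# Toy objective with two local maxima; a stand-in for a log-likelihood.
# (Hypothetical; this is not the GMCM likelihood.)
toy_loglik <- function(p) -(p^2 - 1)^2 + 0.3 * p

# Several starting values spread over the parameter range
starts <- c(-2, -0.5, 0.5, 2)

# Run the optimizer from each start; fnscale = -1 makes optim() maximize
fits <- lapply(starts, function(s) {
  optim(par = s, fn = toy_loglik, method = "BFGS",
        control = list(fnscale = -1))
})

# Keep the fit with the largest objective value
values <- vapply(fits, function(f) f$value, numeric(1))
best_fit <- fits[[which.max(values)]]

rbind(start = starts, value = values)
best_fit$par
```

Starts in different basins can converge to different local maxima, so comparing the attained objective values across runs gives a simple check on how much the result depends on initialization.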
Anders Ellern Bilgrau <anders.ellern.bilgrau@gmail.com>
Li, Q., Brown, J. B., Huang, H., & Bickel, P. J. (2011). Measuring reproducibility of high-throughput experiments. The Annals of Applied Statistics, 5(3), 1752-1779. doi:10.1214/11-AOAS466
set.seed(1)
# Choosing the true parameters and simulating data
true.par <- c(0.8, 3, 1.5, 0.4)
data <- SimulateGMCMData(n = 1000, par = true.par, d = 2)
uhat <- Uhat(data$u) # Observed ranks
# Plot of latent and observed data colour coded by the true component
par(mfrow = c(1,2))
plot(data$z, main = "Latent data", cex = 0.6,
xlab = "z (Experiment 1)", ylab = "z (Experiment 2)",
col = c("red","blue")[data$K])
plot(uhat, main = "Observed data", cex = 0.6,
xlab = "u (Experiment 1)", ylab = "u (Experiment 2)",
col = c("red","blue")[data$K])
# Fit the model using the Pseudo EM algorithm
init.par <- c(0.5, 1, 1, 0.5)
res <- GMCM:::PseudoEMAlgorithm(uhat, meta2full(init.par, d = 2),
verbose = TRUE,
convergence.criterion = "absGMCM",
eps = 1e-4,
trace.theta = FALSE,
meta.special.case = TRUE)
# Compute posterior cluster probabilities
IDRs <- get.IDR(uhat, par = full2meta(res$theta))
# Plot the log-likelihood trace and mark the iteration with the largest value
plot(res$loglik.tr[3,], main = "Loglikelihood trace", type = "l",
ylab = "log GMCM likelihood")
abline(v = which.max(res$loglik.tr[3,])) # Chosen MLE
# Plot of observed data colour coded by the MAP estimate
plot(uhat, main = "Clustering\nIDR < 0.05", xlab = "", ylab = "", cex = 0.6,
col = c("Red","Blue")[IDRs$Khat])
# View parameters
rbind(init.par, true.par, estimate = full2meta(res$theta))
# Confusion matrix
table("Khat" = IDRs$Khat, "K" = data$K)