Description Usage Arguments Details Value Examples
View source: R/runHeuristics.R
Use heuristics to select the best-fitting hierarchical Bayesian cluster model, from multiple fitted models. Certain data processing is performed using the best fitting model, to generate a cluster assignment variable.
1 | runHeuristics(fittedModel, minHomozygoteSize)
|
fittedModel |
An object of class |
minHomozygoteSize |
The minimum allowable size of the homzygote clusters. |
This function takes as input multiple fitted Hierarchical Bayesian Cluster models. These models are fitted using MCMC, often using different initial states. This function produces a summary of the fitted models, including cluster assignment variables and component covariance matrices. It also chooses a ‘best’ fitted model.
Selecting the ‘best’ fitted model is complicated. We begin by rejecting models for which any of the following are true:
For more than 5% of the data points, the posterior probability of membership is more than 15%, for BOTH homozygote clusters
Either of the homozygote clusters is smaller than minHomozygoteSize
The difference in the x-coordinate of the two homozygote clusters is smaller than 0.06
More than a quater of the points fall into the outlier cluster
The difference in the centers of the homozygote clusters is small, relative to the covariance matrix of those clusters
If there are any remaining models, then for each model we compute the maximum of the determinants of the covariance matrices of the homozygote clusters. We select the model for this value (the maximum determinant) is smallest.
If there are no remaining models, then the returned object indicates that no suitable model was found.
An object of class magicHeuristicHBC
, containing data about all the fitted models, and an indication of which model is best, according to the heuristics described in this section.
classifications |
A list with an entry for each Markov chain. Each entry is a vector of cluster assignments for the data points. |
clusterMeans |
A list with an entry for each Markov chain. Each entry is a matrix with three rows and two columns. Each row gives the center point of a cluster. The first two rows are the centers of the homozygote clusters, and the third row is the center of the heterozygote cluster. |
covariances |
A list with an entry for each Markov chain. Each entry is a 4 x 2 x 2 matrix, so that |
chainIndex |
The index of the ‘best’ Markov chain. If no chain satisfied the required conditions, then this is set to -1. |
data |
The input data. |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 | data("eightWayExampleData", package="magicCalling")
data <- eightWayExampleData[[1]]
meanY <- mean(data[,2])
startingPoints <- list(
rbind(c(0.5, meanY), c(0.5, meanY)),
rbind(c(0.5, meanY), c(0.5, meanY)),
rbind(c(0.25, meanY), c(0.5, meanY)),
rbind(c(0.25, meanY), c(0.5, meanY)),
rbind(c(0.75, meanY), c(0.5, meanY)),
rbind(c(0.75, meanY), c(0.5, meanY)),
rbind(c(0.8, meanY), c(0.2, meanY)),
rbind(c(0.8, meanY), c(0.2, meanY))
)
result <- fitClusterModel(data, startingPoints, n.iter = 200, D_hom = diag(2)*4, V_hom = cbind(c(0.005, 0), c(0, 0.1))/3, n_hom = 30, D_err = diag(2), V_err = diag(2)*10/3, n_err = 300, V_het = diag(2)*0.025/3, n_het = 1500)
plot(result, chainIndex = 1)
heuristicResults <- runHeuristics(result, minHomozygoteSize = 200)
plot(heuristicResults, chainIndex = heuristicResults$chainIndex)
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.