plotTCorr: Plot test statistics for the correlation matrices

View source: R/GcClusterFunctions.R

plotTCorr R Documentation

Plot test statistics for the correlation matrices

Description

Plot test statistics for the correlation matrices in the pdfs of the finite mixture model. These test statistics are used for posterior predictive checking of the model.

Usage

plotTCorr(combinedChains, obsTestStats)

Arguments

combinedChains

A stanfit object containing multiple Monte Carlo chains. This object is returned by function combineChains, for which the documentation includes a complete description of container combinedChains.

obsTestStats

List containing the test statistics for the observed data. This list is returned by function calcObsTestStats, for which the documentation includes a complete description of container obsTestStats.

Details

The plot of the test statistics appears as a 2x2 matrix. The first and second rows pertain to the first and second pdfs of the finite mixture model. The first column presents comparisons of the correlation matrices. For each pdf, the comparison is a composite of the upper triangle of the correlation matrix that is calculated from the principal components and the lower triangle of the correlation matrix that is the median of its Monte Carlo samples. Corresponding elements in the upper and lower triangles should be almost identical.
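The composite described above can be sketched in a few lines of R. This is only an illustration under stated assumptions, not the code used inside plotTCorr: the objects pcs, corObs, corSamples, and corMedian are hypothetical stand-ins, and the Monte Carlo samples are imitated by bootstrap resampling of simulated data rather than drawn from a stanfit object.

```r
# Hypothetical stand-ins; these are NOT objects from the GcClust package.
set.seed(1)
nPCs <- 4          # number of principal components (assumed)
nSamples <- 500    # number of Monte Carlo samples (assumed)

# "Observed" correlation matrix, calculated from simulated principal components
pcs <- matrix(rnorm(200 * nPCs), ncol = nPCs)
corObs <- cor(pcs)

# Imitation of Monte Carlo samples of the correlation matrix,
# stored as an nSamples x nPCs x nPCs array (bootstrap used as a stand-in)
corSamples <- array(NA_real_, dim = c(nSamples, nPCs, nPCs))
for (i in seq_len(nSamples)) {
  corSamples[i, , ] <- cor(pcs[sample(nrow(pcs), replace = TRUE), ])
}

# Element-wise median over the samples
corMedian <- apply(corSamples, c(2, 3), median)

# Composite: upper triangle from the observed data,
# lower triangle from the median of the samples
composite <- diag(nPCs)
composite[upper.tri(composite)] <- corObs[upper.tri(corObs)]
composite[lower.tri(composite)] <- corMedian[lower.tri(corMedian)]

# Largest discrepancy between corresponding upper and lower elements
maxDiff <- max(abs(composite - t(composite)))
```

A small value of maxDiff corresponds to the visual check described above: elements mirrored across the diagonal should be almost identical.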

The second column presents the posterior predictive p-values for every element in the correlation matrices. To simplify the explanation, consider a p-value for one matrix element. The p-value is calculated from two test statistics: One test statistic is the correlation calculated from the principal components, which are the observed data for the posterior predictive check. This test statistic is designated "T.obs". The other test statistic is the specified matrix element from the Monte Carlo sample of the correlation matrix, which is the replicated data for the posterior predictive check. This test statistic is designated "T.rep". Because there are many Monte Carlo samples, there are many values for T.rep.

The relation between T.obs and the distribution for T.rep is summarized by the posterior predictive p-value. The p-value is defined as the probability that the test statistic for the replicated data could be more extreme than the test statistic for the observed data (Gelman et al., 2014, p. 146). In mathematical terms, pvalue = Pr(T.rep >= T.obs). With this definition, the p-value is close to 1 when T.obs is in the left tail of the distribution for T.rep. This can confuse the interpretation of the p-value, so, for this situation, the mathematical definition is modified slightly: pvalue = Pr(T.rep < T.obs) (Gelman et al., 2014, p. 148). Consequently, the calculated p-value is never greater than 0.5, and it may be interpreted in the standard way.
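For a single matrix element, the calculation might look like the following sketch. The helper calcPValue and the simulated values of T.rep are hypothetical; whether plotTCorr chooses between the two definitions in exactly this way is an assumption, although the result agrees with the description above.

```r
# Hypothetical helper; not a function from the GcClust package.
# tRep: numeric vector of T.rep values (one per Monte Carlo sample)
# tObs: single numeric value of T.obs
calcPValue <- function(tRep, tObs) {
  p <- mean(tRep >= tObs)      # Pr(T.rep >= T.obs)
  if (p > 0.5) {
    # T.obs is in the left tail, so use the modified definition
    p <- mean(tRep < tObs)     # Pr(T.rep < T.obs)
  }
  p
}

# Simulated T.rep values for one element of the correlation matrix
set.seed(2)
tRep <- rnorm(1000, mean = 0.3, sd = 0.05)

pRight <- calcPValue(tRep, tObs = 0.4)   # T.obs in the right tail
pLeft  <- calcPValue(tRep, tObs = 0.2)   # T.obs in the left tail
```

Because the two probabilities sum to 1, the helper returns the smaller of the two, so both pRight and pLeft are small even though the values of T.obs lie in opposite tails.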

For each pdf, the p-values are presented within the upper triangle of a matrix having the same dimension as the correlation matrix. Consequently, the relation between this matrix of p-values and the correlation matrix is readily apparent. The color scale ranges from the smallest calculated p-value to 0.5, which is the largest possible p-value.
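The layout of the p-value matrix can be imitated with base R graphics. This sketch uses made-up p-values and the base function image; the color palette and axis labels are arbitrary choices for illustration and are not claimed to match the appearance of the plot produced by plotTCorr.

```r
# Made-up p-values placed in the upper triangle; the rest of the matrix is NA
set.seed(3)
nPCs <- 4   # dimension of the correlation matrix (assumed)
pValues <- matrix(NA_real_, nrow = nPCs, ncol = nPCs)
pValues[upper.tri(pValues)] <- runif(sum(upper.tri(pValues)), 0.01, 0.5)

# Color scale from the smallest calculated p-value to 0.5
zlim <- c(min(pValues, na.rm = TRUE), 0.5)

# image() maps the rows of z to the x axis, so reverse the rows and transpose
# to show the matrix in its usual orientation (row 1 at the top)
image(x = 1:nPCs, y = 1:nPCs, z = t(pValues[nPCs:1, ]),
      zlim = zlim, col = hcl.colors(20),
      xlab = "Column", ylab = "Row", axes = FALSE)
axis(1, at = 1:nPCs)
axis(2, at = 1:nPCs, labels = nPCs:1)
```

Cells on and below the diagonal are NA and therefore left blank, so each colored cell sits at the same position as the corresponding element in the upper triangle of the correlation matrix.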

References

Gelman, A., Carlin, J.B., Stern, H.S., Dunson, D.B., Vehtari, A., and Rubin, D.B., 2014, Bayesian data analysis (3rd ed.): CRC Press.

Examples

## Not run: 
plotTCorr(combinedChains, obsTestStats)

## End(Not run)


USGS-R/GcClust documentation built on April 17, 2023, 8:08 p.m.