plotTMeanSd: Plot test statistics for the mean and standard deviation...

View source: R/GcClusterFunctions.R

plotTMeanSd    R Documentation

Plot test statistics for the mean and standard deviation vectors

Description

Plot test statistics for the mean and standard deviation vectors in the probability density functions (pdfs) of the finite mixture model. These test statistics are used for posterior predictive checking of the model.

Usage

plotTMeanSd(combinedChains, obsTestStats, intervalPercentage = 95)

Arguments

combinedChains

A stanfit object containing multiple Monte Carlo chains. This object is returned by function combineChains, whose documentation includes a complete description of container combinedChains.

obsTestStats

List containing the test statistics for the observed data (namely, the principal components). This list is returned by function calcObsTestStats, whose documentation includes a complete description of container obsTestStats.

intervalPercentage

Percentage that defines the credible interval for the distributions of the test statistics. Typical values might be 50, 90, or 95.

Details

The plot of the test statistics appears as a 2x2 matrix. The first and second rows pertain, respectively, to the mean and standard deviation vectors. The first and second columns pertain, respectively, to the first and second pdfs in the finite mixture model.

The formats of the four plots in the 2x2 matrix are identical, so only one plot is described. The horizontal axis specifies the elements of the mean or standard deviation vector. The vertical axis specifies the values of those elements. There are two plot symbols for every vector element.

The first plot symbol pertains to the Monte Carlo samples of the vector element. These samples, which are the replicated data for the posterior predictive check, are one test statistic and are designated "T.rep". The distribution of T.rep is summarized by its median and its credible interval, which are represented by a horizontal line and a vertical line, respectively.

The second plot symbol pertains to the value of the vector element that is calculated from the principal components. This value, which is the observed data for the posterior predictive check, is the other test statistic and is designated "T.obs". It is represented by a red dot.
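The summary of T.rep described above can be sketched as follows. This is a minimal illustration, not the package's implementation: it assumes that the Monte Carlo samples for one vector are held in a matrix with one row per sample and one column per vector element, and that the credible interval is equal-tailed. The names summarizeTRep and tRep and the simulated samples are hypothetical; only intervalPercentage comes from the usage above.

```r
# Hypothetical helper (not part of GcClust): summarize the Monte Carlo
# samples of one mean or standard deviation vector.
# tRep: numeric matrix, rows = samples, columns = vector elements.
summarizeTRep <- function(tRep, intervalPercentage = 95) {
  # Assumption: equal-tailed interval, e.g. 95 -> quantiles 0.025 and 0.975
  tailProb <- (1 - intervalPercentage / 100) / 2
  probs <- c(tailProb, 0.5, 1 - tailProb)
  # quantile() is applied to each column; the result is 3 x ncol(tRep)
  q <- apply(tRep, 2, quantile, probs = probs, names = FALSE)
  data.frame(element = seq_len(ncol(tRep)),
             lower = q[1, ], median = q[2, ], upper = q[3, ])
}

# Simulated samples for a vector with three elements (illustration only)
set.seed(1)
tRep <- matrix(rnorm(4000 * 3, mean = rep(c(-1, 0, 2), each = 4000)), ncol = 3)
summarizeTRep(tRep, intervalPercentage = 90)
```

In a plot of this kind, the median column would give the horizontal line and the lower and upper columns would give the ends of the vertical line for each element.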

The relation between T.obs and the distribution of T.rep is summarized by the posterior predictive p-value, which is printed at the top edge of the plot. The p-value is defined as the probability that the test statistic for the replicated data could be more extreme than the test statistic for the observed data (Gelman et al., 2014, p. 146). In mathematical terms, pvalue = Pr(T.rep >= T.obs). Under this definition, the p-value is close to 1 when T.obs is in the left tail of the distribution of T.rep. This can confuse the interpretation of the p-value, so, for this situation, the mathematical definition is modified slightly: pvalue = Pr(T.rep < T.obs) (Gelman et al., 2014, p. 148). Consequently, the calculated p-value is never greater than 0.5, and it may be interpreted in the standard way.
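The two-sided definition of the p-value can be sketched in a few lines. This is an illustration under stated assumptions, not the package's code: the function name calcPValue is hypothetical, the probabilities are estimated as sample proportions, and the switch to the left-tail definition is assumed to happen when Pr(T.rep >= T.obs) exceeds 0.5.

```r
# Hypothetical helper (not part of GcClust): posterior predictive p-value
# for one vector element.
# tRep: numeric vector of Monte Carlo samples; tObs: single observed value.
calcPValue <- function(tRep, tObs) {
  p <- mean(tRep >= tObs)        # estimate of Pr(T.rep >= T.obs)
  if (p > 0.5) {
    # Assumed rule: T.obs is in the left tail, so use Pr(T.rep < T.obs)
    p <- mean(tRep < tObs)
  }
  p
}

tRep <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
calcPValue(tRep, 9)   # 0.2: two of the ten samples are >= 9
calcPValue(tRep, 2)   # 0.1: one of the ten samples is < 2
```

Because the two proportions sum to 1, the value returned by this sketch cannot exceed 0.5.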

References

Gelman, A., Carlin, J.B., Stern, H.S., Dunson, D.B., Vehtari, A., and Rubin, D.B., 2014, Bayesian data analysis (3rd ed.): CRC Press.

Examples

## Not run: 
plotTMeanSd(combinedChains, obsTestStats)

## End(Not run)


USGS-R/GcClust documentation built on April 17, 2023, 8:08 p.m.