summaryDIRECT: Processing Posterior Estimates for Clustering Under DIRECT

summaryDIRECT    R Documentation

Processing Posterior Estimates for Clustering Under DIRECT

Description

Function summaryDIRECT processes posterior estimates in the output files from DIRECT for clustering and parameter estimation.

Usage

summaryDIRECT(data.name, PERM.ADJUST = FALSE)

Arguments

data.name

A character string indicating the prefix of output files.

PERM.ADJUST

Logical. If TRUE, 1 is added to the labels of mixture components so that the labels start from 1 instead of 0. The default is FALSE.

Details

Output files from DIRECT include MCMC samples before relabeling and permuted labels of mixture components after relabeling. Function summaryDIRECT uses permuted labels stored in output file *_mcmc_perms.out to reorganize the MCMC samples stored in other output files *_mcmc_cs.out, *_mcmc_pars.out and *_mcmc_probs.out. It defines each mixture component as a cluster.
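The relabeling step described above can be pictured with a small toy example. This is only a sketch under stated assumptions: the matrices cs and perms0 and the helper relabel_row are hypothetical, and their layout is not the actual format of the DIRECT output files. Adding 1 to the zero-based permutation mirrors the effect described for PERM.ADJUST.

```r
# Toy illustration only: these objects are hypothetical and do not
# reproduce the real layout of the *_mcmc_*.out files.

# cs[t, i]: component label (1-based here) of item i at MCMC iteration t.
cs <- rbind(c(1, 1, 2, 2),
            c(2, 2, 1, 1),   # same partition as above, labels switched
            c(1, 1, 2, 2))

# perms0[t, k]: zero-based relabeled value of component k at iteration t.
perms0 <- rbind(c(0, 1),
                c(1, 0),
                c(0, 1))

# Shift labels to start from 1, as PERM.ADJUST = TRUE is described to do.
perms <- perms0 + 1

# Hypothetical helper: map each label through the permutation of its iteration.
relabel_row <- function(labels, perm) perm[labels]

cs_relabeled <- t(sapply(seq_len(nrow(cs)),
                         function(t) relabel_row(cs[t, ], perms[t, ])))
```

After relabeling, every row of cs_relabeled in this toy example describes the same partition with the same labels, which is what makes averaging over iterations meaningful.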

Value

A list with components:

nitem

The number of items in the data.

nclust

The number of inferred clusters.

top.clust.alloc

A vector of length nitem, each component being the maximum posterior probability of allocating the corresponding item to a cluster.

cluster.sizes

Vector of cluster sizes.

top.clust.labels

An integer vector of labels of inferred clusters. The integers are not necessarily consecutive: an inferred mixture component to which items are allocated only with small posterior probabilities is dropped from the final list of cluster labels.

top2allocations

A data frame containing "first", the most likely allocation; "second", the second most likely allocation; "prob1", the posterior allocation probability associated with "first"; and "prob2", the posterior allocation probability associated with "second".

post.alloc.probs

An nitem-by-nclust matrix of mean posterior allocation probabilities.

post.clust.pars.mean

A matrix of nclust rows. Each row, corresponding to an inferred cluster, contains the posterior mean estimates of cluster-specific parameters.

post.clust.pars.median

A matrix of nclust rows. Each row, corresponding to an inferred cluster, contains the posterior median estimates of cluster-specific parameters.

misc

A list containing two components:

  • post.pars.mean: A matrix, each row of which contains the posterior mean estimates of parameters for a mixture component.

  • post.pars.median: A matrix, each row of which contains the posterior median estimates of parameters for a mixture component.
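To make the relationship between the allocation-related components concrete, the sketch below derives quantities analogous to top2allocations and cluster.sizes from a toy allocation probability matrix. The values in probs are made up for illustration, and the code is an assumption about how such summaries can be computed, not necessarily how summaryDIRECT computes them internally.

```r
# Toy posterior allocation probabilities (hypothetical values):
# rows are items, columns are clusters; each row sums to 1.
probs <- rbind(c(0.70, 0.20, 0.10),
               c(0.10, 0.60, 0.30),
               c(0.05, 0.15, 0.80),
               c(0.55, 0.40, 0.05))

# For each item, order the clusters by decreasing probability.
ord <- t(apply(probs, 1, order, decreasing = TRUE))
idx <- seq_len(nrow(probs))

# Most likely and second most likely allocation, with their probabilities.
top2 <- data.frame(first  = ord[, 1],
                   second = ord[, 2],
                   prob1  = probs[cbind(idx, ord[, 1])],
                   prob2  = probs[cbind(idx, ord[, 2])])

# Cluster sizes when each item is assigned to its most likely cluster.
sizes <- table(factor(top2$first, levels = seq_len(ncol(probs))))
```

In this toy example the column prob1 plays the role described for top.clust.alloc, namely the maximum posterior allocation probability of each item.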

Author(s)

Audrey Q. Fu

References

Fu, A. Q., Russell, S., Bray, S. and Tavaré, S. (2013) Bayesian clustering of replicated time-course gene expression data with weak signals. The Annals of Applied Statistics. 7(3) 1334-1361.

See Also

DIRECT for what output files are produced.

simuDataREM for simulating data under the mixture random-effects model.

Examples

## See example in DIRECT.

DIRECT documentation built on Sept. 8, 2023, 5:45 p.m.