View source: R/restrictedGibbsMergeSplit.R
restrictedGibbsMergeSplit | R Documentation |
Merge-split proposals for conjugate "Chinese Restaurant Process" (CRP) mixture models using restricted Gibbs scans from a uniformly random launch state, as presented in Jain & Neal (2004), with additional functionality for the two parameter CRP prior.
restrictedGibbsMergeSplit( partition, logPosteriorPredictiveDensity = function(i, subset) 0, t = 1, mass = 1, discount = 0, nUpdates = 1L, selectionWeights = NULL )
partition |
A numeric vector of cluster labels representing the current partition. |
logPosteriorPredictiveDensity |
A function taking an index i (as a numeric vector of length one) and a subset of integers subset, and returning the natural logarithm of p( y_i | y_subset ), i.e., that item's contribution to the log integrated likelihood given a subset of the other items. The default value "turns off" the likelihood, resulting in prior simulation (rather than posterior simulation). |
t |
A non-negative integer indicating the number of restricted Gibbs scans to perform for each merge/split proposal. |
mass |
A specification of the mass (concentration) parameter in the CRP
prior. Must be greater than the |
discount |
A numeric value on the interval [0,1) corresponding to the discount parameter in the two parameter CRP prior. |
nUpdates |
An integer giving the number of merge-split proposals before returning. This has the effect of thinning the Markov chain. |
selectionWeights |
A matrix or data frame whose first two columns are the unique pairs of data indices, along with a column of weights representing how likely each pair is to be selected at the beginning of each merge-split update. |
An integer vector giving the updated partition encoded using cluster labels.
The acceptance rate
of the Metropolis-Hastings proposals, i.e. the number accepted proposals
divided by nUpdates
.
Jain, S., & Neal, R. M. (2004). A split-merge Markov chain Monte Carlo procedure for the Dirichlet process mixture model. Journal of computational and Graphical Statistics, 13(1), 158-182.
# Neal (2000) model and data nealData <- c(-1.48, -1.40, -1.16, -1.08, -1.02, 0.14, 0.51, 0.53, 0.78) mkLogPosteriorPredictiveDensity <- function(data = nealData, sigma2 = 0.1^2, mu0 = 0, sigma02 = 1) { function(i, subset) { posteriorVariance <- 1 / ( 1/sigma02 + length(subset)/sigma2 ) posteriorMean <- posteriorVariance * ( mu0/sigma02 + sum(data[subset])/sigma2 ) posteriorPredictiveSD <- sqrt(posteriorVariance + sigma2) dnorm(data[i], posteriorMean, posteriorPredictiveSD, log = TRUE) } } logPostPredict <- mkLogPosteriorPredictiveDensity() nSamples <- 1000L partitions <- matrix(0, nrow = nSamples, ncol = length(nealData)) accept <- 0 for ( i in 2:nSamples ) { ms <- restrictedGibbsMergeSplit(partitions[i-1,], logPostPredict, t = 1, mass = 1.0, nUpdates = 2) partitions[i,] <- ms$partition accept <- accept + ms$accept } accept / nSamples # M-H acceptance rate # convergence and mixing diagnostics nSubsets <- apply(partitions, 1, function(x) length(unique(x))) mean(nSubsets) sum(acf(nSubsets)$acf)-1 # Autocorrelation time entropy <- apply(partitions, 1, partitionEntropy) plot.ts(entropy)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.