semiparamDPE: Semiparametric Density Product Estimator Method

Description Usage Arguments Value References Examples

View source: R/semiparamDPE.R

Description

The function uses the Semiparametric Density Product Estimator method introduced by Neiswanger et al. (see References) to combine the independent subset posterior samples subchains into the set of samples that estimate the posterior density given the full data set. The semiparametric density product estimator method uses kernel smoothing techniques to estimate each subset posterior density; the subposterior densities are then multiplied together to approximate the posterior density based on the full data set.

Usage

1
semiparamDPE(subchain, bandw = rep(1.0, dim(subchain)[1]), anneal = TRUE, shuff = FALSE)

Arguments

subchain

array of subset posterior samples of the dimension c(d,sampT,M). Here d is the dimension of the parameter space, sampT is the number of samples, and M is the number of subposterior datasets.

bandw

bandwidth vector of the length d=dim(subchain)[1]. It is a vector of tuning parameters used in kernel density approximation employed by the semiparametric method. When anneal=TRUE then one of the choices for bandw could be the vector consisting of standard deviations for each of the d parameters. When anneal=FALSE then one of the choices for bandw could be the diagonal of the optimal bandwidth matrix obtained via Silverman's rule of thumb; see Examples. By default bandw=rep(1.0,d).

anneal

logical; if TRUE, the bandwidth bandw (instead of being fixed) is annealed as bandw*i^(-1/(4+d)); here i is the index corresponding to a sample; see References.

shuff

logical; if TRUE, each of the M subsets of d dimensional parameters in subchain is shuffled.

Value

Returns an array of samples of dimension dim=c(d,sampT) representing an estimated (combined) full posterior density.

References

Neiswanger, W., Wang, C., Xing E. (2014) Asymptotically exact, embarrassingly parallel MCMC. arXiv:1311.4780v2.

Silverman, B.W. (1986). Density Estimation for Statistics and Data Analysis. Chapman & Hall/CRC. pp. 7-11.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
d      <- 2     # dimension of the parameter space
sampT  <- 300   # number of subset posterior samples
M      <- 3     # total number of subsets

## simulate Gaussian subposterior samples

theta <- array(NA,c(d,sampT,M))

norm.mean <- c(1.0, 2.0)
norm.sd   <- c(0.5, 1.0)

for (i in 1:d)
  for (s in 1:M)
    theta[i,,s] <- rnorm(sampT, mean=norm.mean[i]+runif(1,-0.01,0.01), sd=norm.sd[i])

## estimate (mean) standard deviations for each parameter across the subsets

norm.var.est <- rep(0,d)

for(i in 1:d)
  for(s in 1:M)
    norm.var.est[i] <- norm.var.est[i] + var(theta[i,,s])

norm.sd.est <- sqrt(norm.var.est/M)


## Compute the diagonal of the optimal bandwidth
## matrix according to Silverman's rule

h_opt1 = (4/(d+2))^(1/(4+d)) * (sampT^(-1/(4+d))) * norm.sd.est

## Combine samples. The bandwidth matrix is fixed:

full.theta1 <- semiparamDPE( subchain = theta, bandw = h_opt1 * 2, anneal = FALSE)

## Compute the diagonal of the optimal bandwidth
## matrix for the method that uses annealing

h_opt2 = (4/(d+2))^(1/(4+d)) * norm.sd.est

## Combine samples. The bandwidth matrix will be annealed:

full.theta2 <- semiparamDPE(subchain = theta, bandw = h_opt2 * 2, anneal = TRUE)

parallelMCMCcombine documentation built on June 23, 2021, 9:06 a.m.