normalizing.constant | R Documentation |
This function returns a vector of coefficients that defines a function f(a_0)
that approximates the normalizing constant for generalized linear models with random a_0
.
The user should input the values returned to glm.random.a0
or power.glm.random.a0
.
normalizing.constant(
grid,
historical,
data.type,
data.link,
prior.beta.var = rep(10, 50),
lower.limits = rep(-100, 50),
upper.limits = rep(100, 50),
slice.widths = rep(1, 50),
nMC = 10000,
nBI = 250
)
grid |
Matrix of potential values for |
historical |
List of historical dataset(s). East historical dataset is stored in a list which constains two named elements:
For binomial data, an additional element
|
data.type |
Character string specifying the type of response. The options are "Bernoulli", "Binomial", "Poisson" and "Exponential". |
data.link |
Character string specifying the link function. The options are "Logistic", "Probit", "Log", "Identity-Positive", "Identity-Probability" and "Complementary Log-Log". Does not apply if |
prior.beta.var |
Vector of variances of the independent normal initial priors on |
lower.limits |
Vector of lower limits for parameters to be used by the slice sampler. The length of the vector should be equal to the total number of parameters, i.e. P+1 where P is the number of covariates. The default is -100 for all parameters (may not be appropriate for all situations). Does not apply if |
upper.limits |
Vector of upper limits for parameters to be used by the slice sampler. The length of the vector should be equal to the total number of parameters, i.e. P+1 where P is the number of covariates. The default is 100 for all parameters (may not be appropriate for all situations). Does not apply if |
slice.widths |
Vector of initial slice widths for parameters to be used by the slice sampler. The length of the vector should be equal to the total number of parameters, i.e. P+1 where P is the number of covariates. The default is 1 for all parameter (may not be appropriate for all situations). Does not apply if |
nMC |
Number of iterations (excluding burn-in samples) for the slice sampler or Gibbs sampler. The default is 10,000. |
nBI |
Number of burn-in samples for the slice sampler or Gibbs sampler. The default is 250. |
This function performs the following steps:
Suppose there are K historical datasets. The user inputs a grid of M rows and K columns of potential values for a_0
. For example, one can choose the vector v = c(0.1, 0.25, 0.5, 0.75, 1)
and use expand.grid(a0_1=v, a0_2=v, a0_3=v)
when K=3
to get a grid with M=5^3=125
rows and 3 columns. If there are more than three historical datasets, the dimension of v
can be reduced
to limit the size of the grid. A large grid will increase runtime.
For each row of a_0
values in the grid, obtain M
samples for \beta
from the power prior associated with the current values of a_0
using the slice sampler.
For each of the M sets of posterior samples, execute the PWK algorithm (Wang et al., 2018) to estimate the log of normalizing constant d_1,...,d_M
for the normalized power prior.
At this point, one has a dataset with outcomes d_1,...,d_M
and predictors corresponding to the rows of the a_0
grid matrix. A polynomial regression is applied to estimate a function d=f(a0)
.
The degree of the polynomial regression is determined by the algorithm to ensure R^2 > 0.99
.
The vector of coefficients from the polynomial regression model is returned by the function, which the user must input into glm.random.a0
or power.glm.random.a0
.
When a row of the grid
contains elements that are close to zero, the resulting power prior will be flat and estimates of normalizing constants may be inaccurate.
Therefore, it is recommended that grid
values should be at least 0.05.
If one encounters the error message "some coefficients are not defined because of singularities",
it could be due to the following factors: number of grid
rows too large or too small, insufficient sample size of the historical data, insufficient number of iterations for the slice sampler,
or near-zero grid
values.
Note that due to computational intensity, the normalizing.constant
function has not been evaluated for accuracy for high dimensional \beta
(e.g., dimension > 10) or high dimensional a_0
(e.g., dimension > 5).
Vector of coefficients for a_0
that defines a function f(a_0)
that approximates the normalizing constant, necessary for functions glm.random.a0
and power.glm.random.a0
.
The length of the vector is equal to 1+K*L where K is the number of historical datasets and L is the degree of the polynomial regression determined by the algorithm.
Wang, Yu-Bo; Chen, Ming-Hui; Kuo, Lynn; Lewis, Paul O. A New Monte Carlo Method for Estimating Marginal Likelihoods. Bayesian Anal. 13 (2018), no. 2, 311–333.
glm.random.a0
and power.glm.random.a0
data.type <- "Bernoulli"
data.link <- "Logistic"
data.size <- 50
# Simulate two historical datasets
p <- 1
set.seed(111)
x1 <- matrix(rnorm(p*data.size),ncol=p,nrow=data.size)
set.seed(222)
x2 <- matrix(rnorm(p*data.size),ncol=p,nrow=data.size)
beta <- c(1,2)
mean1 <- exp(x1*beta)/(1+exp(x1*beta))
mean2 <- exp(x2*beta)/(1+exp(x2*beta))
historical <- list(list(y0=rbinom(data.size,size=1,prob=mean1),x0=x1),
list(y0=rbinom(data.size, size=1, prob=mean2),x0=x2))
# Create grid of possible values of a0 with two columns corresponding to a0_1 and a0_2
g <- c(0.1, 0.25, 0.5, 0.75, 1)
grid <- expand.grid(a0_1=g, a0_2=g)
nMC <- 100 # nMC should be larger in practice
nBI <- 50
result <- normalizing.constant(grid=grid, historical=historical,
data.type=data.type, data.link=data.link,
nMC=nMC, nBI=nBI)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.