BayesSAE: MCMC Draws Based on Area-Level Models

Description Usage Arguments Details Value Author(s) References Examples

Description

This function obtains MCMC draws of specific small area area-level models defined by the sampling model and linking model.

Usage

1
2
3
BayesSAE(formula, innov = "normal", df = NULL, b = NA, spatial = FALSE, tran = 
     "F", prox = NULL, beta.start = NULL, theta.start = NULL, lam.start = runif(1), 
     prior = NULL, mcmc = 5000, burnin = 2500, thin = 5, data)

Arguments

formula

a symbolic description of the model to be fitted (of type y ~ x | z). y is the response variable in the sampling model while x is the design matrix in the linking model, and z is the estimated variance of direct estimation in the sampling model. See Rao (2003) for details about the sampling model and linking model in small area estimation.

innov

distribution of innovations in the sampling model. to be chosen between "normal" and "t".

df

a vector containing degrees of freedom for the t innovation in the sampling model if innov = "t".

b

an optional weights vector to be used in the fitting process. Number of domains must be the same as that of the direct estimators. By default is a vector of ones. See Details.

spatial

a logical variable indicating whether it's a spatial model or not.

tran

the transformation to be taken on the responsive variable in the linking model, to be chosen between "F", "log" or "logit". "F" is the default value and indicates no transformation taken. See Details.

prox

l * 2 matrix defining the neighbourhood matrix. See also Details.

beta.start

initial values for beta's. By default is the coefficients of regression model where the response vector is directly regressed on the design matrix in the linking model.

theta.start

initial values for theta's. By default is the response vector.

lam.start

initial value for lambda in the spatial model. See You and Chapman (2006).

prior

a list of objects specifying priors. See Details.

mcmc

user-specified number of MCMC draws.

burnin

The number of burnin iterations for the sampler. See Gelman (2006).

thin

the thinning interval used in the simulation. See Gelman (2006).

data

an optional data frame, list or environment containing variables in the model.

Details

Let θ_i donates variable of interest for each domain i, x_i the regressors, β the regression coefficient, v_i i.i.d normal innovations. If argument b is specified, the linking model is of the form: θ_i = x_i β + b_i v_i.

If tran = "log", the linking model will be: \log(θ_i) = x_i β + b_i v_i. tran = "logit" means that logit transformation will be taken and the model will be: logit(θ_i) = x_i β + b_i v_i. Both are unmatched area level models. See Rao (2003).

The neighbourhood matrix has the ith diagonal element equal to the number of neighbours of area i, and off-diagonal elements equal to -1 if the corresponding areas are neighbours otherwise 0. See You and Chapman (2006).

The ith tuple in the argument prox indicates that area prox[i, 1] and area prox[i, 2] are neighbours. Duplicated tuples will be omitted. For example, if the first row of prox is (1, 2) while the second is (2, 1), the second row will be deleted. The two elements within each tuple are not supposed to be the same.

Initial values are crucial to MCMC convergence. EBLUP predictors of θ's and β's can provides good starting values for MCMC procedure.

The list prior should include following attributes for basic Fay-Herriot model:

Besides, the prior list should also include attribute sig2.prior to specify priors for σ_i^2 in the You-Chapman model. See You and Chapman (2006). The sig2.prior is also a list contains the components ai and bi. Both ai and bi are vectors whose length are the same as number of domains. Thus, prior for σ_i^2 would be inverse gamma distribution with shape parameter ai[i] and rate parameter bi[i]. Default value of elements in ai and bi are all 0.05.

Value

The function returns a object of class "BayesSAE" containing the following components:

mcmc

an mcmc object that contains the posterior sample. This object can be summarized by functions provided by the coda package

type

character string indicating the type of the model. For instance "UFH" indicates that it's an unmatched Fay-Herriot model while "SYC" refers to spatial You-Chapman model

Y

response vector in the sampling model

X

design matrix in the linking model

Z

variance of direct estimation in the sampling model

lam.rate

the acceptance rate of λ since draws of λ are generated by M-H algorithm

D_avg

average deviance, defined as

-2/n ∑_{i=1}^n p(y|θ^{i})

where θ^{i} donates the ith posterior draw of θ.

D_theta.hat

discrepancy between data and model depending on a point estimator for θ, defined as

-2 p(y|θ^')

where θ^' is the point estimator for θ. Here we calculate the posterior mean as the point estimator

DIC

deviance information criterion, defined as 2D_avg - D_theta.hat. Small DIC value indicates strong out-of-sample predictive power

theta.HB

a vector of length the same as number of domains provides Rao-Blackwell estimators for each area.

call

the original function call

HB

Rao-Blackwellization of theta's

spatial

logical variable indicating whether the model is spatial or not

tran

character indicating the transformation of response variable in the linking model

If it's an unmatched model, θ_i's are generated by M-H algorithm and theta.rate as a vector of length m provides acceptance rate for each θ_i respectively is involved

Author(s)

Chengchun Shi

References

You, Y. and Chapman, B. (2006) Small Area Estimation Using Area Level Models and Estimated Sampling Variances. Survey Methodology, 32: 97-103.

Rao, J. N. K. (2003) Small Area Estimation. John Wiley and Sons.

Gelman, A. and Carlin, J. B. and Stern, H. S. and Rubin, D. B. (2006). Bayesian Data Analysis, CRC Press Company.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
# load data set
data(SAIPE)
m <- length(SAIPE$SACPR)

# basic Fay-Herriort models (FH)
result <- BayesSAE(SACPR~SNAPR+CenPR+CPER|Vardir, data = SAIPE, mcmc = 5000)

# You-Chapman models (YC)
result <- BayesSAE(SACPR~SNAPR+CenPR+CPER|Vardir, data = SAIPE, mcmc = 5000, innov = "t", 
    df = rep(50, m))
	
# spatial model with unknown sampling variance (SYC)
# define the neighbourhood matrix
prox <- cbind(sample(1:51, 50, replace = TRUE), sample(1:51, 50, replace = TRUE))
prox <- prox[prox[,1] != prox[,2], ]
result <- BayesSAE(SACPR~SNAPR+CenPR+CPER|Vardir, data = SAIPE, mcmc = 5000, innov = "t",
    df = rep(50, m), spatial = TRUE, prox = prox)

# Unmatched models (UFH)
result <- BayesSAE(SACPR~SNAPR+CenPR+CPER|Vardir, data = SAIPE, mcmc = 5000, tran = "log")	

BayesSAE documentation built on May 2, 2019, 3:44 a.m.