databeta: Synthetic Data for Small Area Estimation using Spatial Beta...

databeta    R Documentation

Synthetic Data for Small Area Estimation using Spatial Beta Model

Description

A synthetic dataset for testing and demonstrating the saeHB.Spatial.Beta package. The data are generated under a spatial simultaneous autoregressive (SAR) process with a Beta-distributed response, and they account for survey design effects (DEFF).

The data are generated by the following steps:

  1. Generate auxiliary variables x1 \sim N(0, 1) and x2 \sim N(0, 1).

  2. Generate sample sizes n_i \sim U(10, 50) and survey design effects deff_i \sim U(1, 2.5). Calculate the precision parameter for each area: \phi_i = (n_i / deff_i) - 1.

  3. Generate spatial random effects under the SAR model. First, generate independent normal errors u \sim N(0, 1). Then calculate the spatial random effects v = (I - \rho W)^{-1}u, where I is the identity matrix, W is the row-standardized proximity matrix (weight_mat), and the spatial autoregressive parameter \rho is set to 0.70.

  4. Calculate the true mean proportions \mu = \text{logit}^{-1}(X\beta + v), where the regression coefficients are set as \beta_0 = \beta_1 = \beta_2 = 1.

  5. Generate the response variable y \sim \text{Beta}(\mu \phi, (1 - \mu) \phi). Values are strictly bounded between 0 and 1.

  6. The area ID (domain), response variable (y), auxiliary variables (x1, x2), sample size (n_i), and design effect (deff) are combined into a data frame called databeta.
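
The steps above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the script used to build databeta: the seed is arbitrary, the sample sizes are rounded to integers, and the proximity matrix is a hypothetical 6 x 6 rook-contiguity grid standing in for weight_mat. It will therefore not reproduce the values in databeta.

```r
set.seed(123)  # arbitrary seed (assumption); does not reproduce databeta

m <- 36  # number of areas, matching the 36 rows of databeta

# Assumption: areas lie on a 6 x 6 grid with rook contiguity.
# The package's own proximity matrix (weight_mat) may be built differently.
side <- 6
coords <- expand.grid(row = seq_len(side), col = seq_len(side))
W <- matrix(0, m, m)
for (i in seq_len(m)) {
  for (j in seq_len(m)) {
    manhattan <- abs(coords$row[i] - coords$row[j]) +
      abs(coords$col[i] - coords$col[j])
    if (manhattan == 1) W[i, j] <- 1
  }
}
W <- W / rowSums(W)  # row-standardize so each row sums to 1

# Step 1: auxiliary variables
x1 <- rnorm(m)
x2 <- rnorm(m)

# Step 2: sample sizes, design effects, and precision parameters
n_i  <- round(runif(m, 10, 50))  # rounding to integers is an assumption
deff <- runif(m, 1, 2.5)
phi  <- n_i / deff - 1

# Step 3: SAR random effects, v = (I - rho * W)^{-1} u
rho <- 0.7
u   <- rnorm(m)
v   <- as.vector(solve(diag(m) - rho * W, u))

# Step 4: true mean proportions on the inverse-logit scale
beta <- c(1, 1, 1)
X    <- cbind(1, x1, x2)
mu   <- plogis(as.vector(X %*% beta) + v)

# Step 5: Beta-distributed response
y <- rbeta(m, mu * phi, (1 - mu) * phi)

# Step 6: assemble the data frame
sim <- data.frame(domain = seq_len(m), y = y, x1 = x1, x2 = x2,
                  n_i = n_i, deff = deff)
str(sim)
```

Solving the linear system with solve(A, u) avoids forming the inverse of I - \rho W explicitly, which gives the same v with less numerical work.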

Usage

data(databeta)

Format

A data frame with 36 rows and 6 columns:

domain

Area ID/name

y

Direct estimates of the proportion/variable of interest (0 < y < 1)

x1

Auxiliary variable 1 (Normal distribution)

x2

Auxiliary variable 2 (Normal distribution)

n_i

Sample size for each area

deff

Survey design effect for each area
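
As a hedged illustration of how the n_i and deff columns relate to the precision parameter \phi_i = (n_i / deff_i) - 1 used in the generation steps, the sketch below computes \phi_i and a plug-in Beta variance y(1 - y) / (\phi_i + 1). The data frame toy and the column names n_eff, phi, and var_plugin are hypothetical; the values are made up and are not taken from databeta. Using y in place of the true mean \mu is an approximation.

```r
# Hypothetical stand-in rows with the same column names as databeta;
# the values are invented for illustration only.
toy <- data.frame(
  domain = 1:3,
  y      = c(0.20, 0.55, 0.80),
  n_i    = c(12, 30, 48),
  deff   = c(1.2, 2.0, 1.5)
)

# Effective sample size and precision parameter
toy$n_eff <- toy$n_i / toy$deff
toy$phi   <- toy$n_eff - 1

# Plug-in Beta variance: mu * (1 - mu) / (phi + 1), with y used in place of mu
toy$var_plugin <- toy$y * (1 - toy$y) / (toy$phi + 1)

print(toy)
```

Because \phi_i + 1 = n_i / deff_i, this variance equals y(1 - y) deff_i / n_i, so a larger design effect or a smaller sample gives a noisier direct estimate.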


saeHB.Spatial.Beta documentation built on July 1, 2026, 5:07 p.m.