library("knitr") knitr::opts_knit$set(self.contained = FALSE) knitr::opts_chunk$set(tidy = TRUE, collapse=TRUE, comment = "#>", tidy.opts=list(blank=FALSE, width.cutoff=55))
The package ebci
implements robust empirical Bayes confidence intervals
(EBCIs) proposed by @akp20 for inference in a normal means model $Y_i\sim
N(\theta_i, \sigma^2_i)$, $i=1, \dotsc, n$.
Suppose we use an empirical Bayes estimator of $\theta_i$ that shrinks toward the predictor based on the regression of $\theta_i$ onto $X_i$ (equivalently, regression of $Y_i$ onto $X_i$), \begin{equation}\label{Bayes-estimator} \hat{\theta}{i} = X_i'\delta+w{EB, i}(Y_i-X_i'\delta), \end{equation}
where $\delta=E[X_i X_i']^{-1}E[X_i\theta_i]$, $w_{EB, i}=\frac{\mu_{2}}{\mu_{2}+\sigma_{i}^{2}}$ is the degree of shrinkage, and \begin{equation}\label{mu2-independence} \mu_{2}=E[(\theta_i-X_i'\delta)^{2} \mid X_{i}, \sigma_i]. \end{equation} is the second moment of the regression residual. We assume that $\mu_2$ doesn't depend on $\sigma_i$. Under this setup, @morris83 proposes to use the parametric EBCI \begin{equation} \hat{\theta}{i} \pm \frac{z{1-\alpha/2}}{\sqrt{w_{EB, i}}}w_{EB, i}\sigma_i. \end{equation}
The critical value $z_{1-\alpha/2}/\sqrt{w_{EB, i}}$ is larger than the usual
critical value $z_{1-\alpha/2}=$qnorm(1-alpha/2)
if the estimator was unbiased
conditional on $\theta_i$. This CI is justified if we strengthen the assumption
(\ref{mu2-independence}) by making the normality assumption $\theta_i\mid X_{i},
\sigma_{i} \sim N(X_i'\delta, \mu_2)$. However, if the residual
$\theta_i-X_i'\delta$ is not normally distributed, this CI will undercover.
@akp20 derive a robust EBCI that is only uses (\ref{mu2-independence}) and not
the normality assumption. The EBCI takes the form
\begin{equation}\label{robust-ebci}
\hat{\theta}{i} \pm cva{\alpha}(m_{2i}, \infty) w_{EB, i}\sigma_i,
\,\, m_{2i}=(1-1/w_{EB, i})^2\mu_2/\sigma^2_i=\sigma^{2}{i}/\mu{2},
\end{equation}
where the critical value $cva_{\alpha}$ is derived in @akp20. Here $m_{2i}$ is the
second moment of the conditional bias of $\hat{\theta}i$, scaled by the
standard error (normalized bias, henceforth). This critical value imposes a
constraint (\ref{mu2-independence}) on the second moment of $\theta_i$, but no
constraints on higher moments. We can make the critical value smaller by also
imposing a constraint on the kurtosis of $\theta_i$ (or equivalently, the
kurtosis of the normalized bias)
\begin{equation}\label{kappa-independence}
\kappa=E[(\theta_i-X_i'\delta)^{4}
\mid X{i}, \sigma_i]/\mu_{2}^2.
\end{equation}
In analogy to (\ref{mu2-independence}), we assume here that the conditional fourth
moment of $\theta_i-X_i'\delta$ doesn't depend on $(X_i,\sigma_i)$. In this
case, the robust EBCI takes the form
\begin{equation}
\hat{\theta}{i} \pm cva{\alpha}(m_{2i},\kappa) w_{EB, i}\sigma_i.
\end{equation}
These critical values are implemented in the package by the cva
function:
library("ebci") ## If m_2=0, then we get the usual critical value cva(m2=0, kappa=Inf, alpha=0.05)$cv ## Otherwise the critical value is larger: cva(m2=4, kappa=Inf, alpha=0.05)$cv ## Imposing a constraint on kurtosis tightens it cva(m2=4, kappa=3, alpha=0.05)$cv
In practice, the parameters $\delta, \mu_2$, and $\kappa$ are unknown. To implement the EBCI, the package replaces them with consistent estimates, following the baseline implementation in @akp20. We illustrate this in the next section.
Here we illustrate the use of the package using a dataset from @ChHe18ii (CH
hereafter). The dataset is included in the package as the list cz
. Run ?cz
for a full description of the dataset. As in @ChHe18ii, we use precision weights
proportional to the inverse of the squared standard error to compute
$(\delta,\mu_2,\kappa)$.
## As Y_i, use fixed effect estimate theta25 of causal effect of neighborhood ## for children with parents at the 25th percentile of income distribution. The ## standard error for this estimate is se25. As predictors use average outcome ## for permanent residents (stayers), stayer25. Let us use 90% CIs, as in ## Armstrong et al r <- ebci(formula=theta25~stayer25, data=cz, se=se25, weights=1/se25^2, alpha=0.1)
For shrinkage toward the grand mean, or toward zero, use the specification
theta25 ~ 1
, and theta25 ~ 0
, respectively, in the formula
argument of
ebci
.
The return value contains (see ?ebci
for full description)
r
r$delta
estimate
) is obtained by applying a finite-sample correction to an initial
method of moments estimate (uncorrected_estimate
). This correction ensures
that we don't shrink all the way to zero (or past zero) if the
method-of-moments estimate of $\mu_2$ is negative (see @akp20 for details):
r
c(r$mu2, r$kappa)
r$alpha
, and the
matrix of regressors, r$X
.r
names(r$df)
The columns of the data frame refer to:
w_eb
Empirical Bayes shrinkage factor $w_{EB, i}=\mu_2/(\mu_2+\sigma_i^2)$.th_eb
Empirical Bayes estimator $\hat{\theta_i}$ given in (\ref{Bayes-estimator})len_eb
Half-length $cva_{\alpha}(m_2, \kappa)w_i\sigma_i$ of the robust
EBCI, so that the lower endpoint of the EBCIs are given by th_eb-len_eb
, and
the upper endpoint by th_eb+len_eb
. Let us verify this for the first observation:
r
cva(r$df$se[1]^2/r$mu2[1], r$kappa[1], alpha=0.1)$cv*r$df$w_eb[1]*r$df$se[1]
r$df$len_eb[1]
len_pa
Half-length $z_{1-\alpha/2}\sqrt{w_i}\sigma_i$ of the parametric EBCI.w_opt
Shrinkage factor that optimizes the length of the resulting confidence
interval. In other words, instead of using $w_{EB, i}$ in (\ref{robust-ebci}),
we use shrinkage $w_i$ that minimizes $cva_{\alpha}((1-1/w_{EB,
i})^2\mu_2/\sigma^2_i, \kappa) w_{i}\sigma_i$. See @akp20 for details. The
vector is missing here since the default option, wopt=FALSE
, is to skip
computation of the length-optimal CIs to speed up the calculations.th_op
Estimator based on the length-optimal shrinkage factor w_opt
(missing here since the default is wopt=FALSE
)len_op
Half-length $cva_{\alpha}((1-1/w_{EB, i})^2\mu_2/\sigma^2_i, \kappa)
w_{i}\sigma_i$ of the length-optimal EBCI (missing here since we specified
wopt=FALSE
).th_us
The unshrunk estimate $Y_i$, as specified in the formula
argument of
the function ebci
.len_us
Half-length $z_{1-\alpha/2}\sigma_i$ of the CI based on the unshrunk
estimatese
The standard error $\sigma_i$, as specified by the argument se
of the
ebci
function.ncov_pa
maximal non-coverage of the parametric EBCI.Using the data frame, we can give a table summarizing the results. Let us show the results for the CZ in New York:
df <- (cbind(cz[!is.na(cz$se25), ], r$df)) df <- df[df$state=="NY", ] knitr::kable(data.frame(cz=df$czname, unshrunk_estimate=df$theta25, estimate=df$th_eb, lower_ci=df$th_eb-df$len_eb, upper_ci=df$th_eb+df$len_eb), digits=3)
@akp20 present the same information as a figure.
Finally, let us compute some summary statistics as in Table 3 in @akp20. Average half-length of the robust, parametric, and unshrunk CI:
mean(r$df$len_eb) mean(r$df$len_pa) mean(r$df$len_us)
The efficiency of the parametric and unshrunk CI relative to the robust EBCI is given by
mean(r$df$len_eb)/mean(r$df$len_pa) mean(r$df$len_eb)/mean(r$df$len_us)
While the parametric EBCI is shorter on average, it yields CIs that may violate the 90% coverage requirement. In particular, the average maximal non-coverage probability at the estimated value of $(\mu_{2},\kappa)$ is given by
mean(r$df$ncov_pa)
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.