Built using Zelig version r packageVersion("Zelig")

knitr::opts_knit$set(
    stop_on_error = 2L
)
knitr::opts_chunk$set(
    fig.height = 11,
    fig.width = 7
)

options(cite = FALSE)

Bayesian Tobit Regression with tobit.bayes.

Bayesian tobit regression estimates a linear regression model with a censored dependent variable using a Gibbs sampler. The dependent variable may be censored from below and/or from above. For other linear regression models with fully observed dependent variables, see Bayesian regression, maximum likelihood normal regression, or least squares.

Syntax

z.out <- zelig(Y ~ X1 + X2, below = 0, above = Inf,
               model = "tobit.bayes", weights = w, data = mydata)
x.out <- setx(z.out)
s.out <- sim(z.out, x = x.out)

Inputs

zelig() accepts the following arguments to specify how the dependent variable is censored.

Additional Inputs

Use the following arguments to monitor the convergence of the Markov chain:

Use the following parameters to specify the model’s priors:

Zelig users may wish to refer to help(MCMCtobit) for more information.

Examples

rm(list=ls(pattern="\\.out"))
suppressWarnings(suppressMessages(library(Zelig)))
set.seed(1234)

Basic Example

Attaching the sample dataset:

data(tobin)

Estimating linear regression using tobit.bayes:

z.out <- zelig(durable ~ age + quant, model = "tobit.bayes",
               data = tobin, verbose = FALSE)

You can check for convergence before summarizing the estimates with three diagnostic tests. See the section Diagnostics for Zelig Models for examples of the output with interpretation:

z.out$geweke.diag()
z.out$heidel.diag()
z.out$raftery.diag()
summary(z.out)

Setting values for the explanatory variables to their sample averages:

x.out <- setx(z.out)

Simulating quantities of interest from the posterior distribution given x.out.

s.out1 <- sim(z.out, x = x.out)
summary(s.out1)

Simulating First Differences

Set explanatory variables to their default(mean/mode) values, with high (80th percentile) and low (20th percentile) liquidity ratio (quant):

x.high <- setx(z.out, quant = quantile(tobin$quant, prob = 0.8))
x.low <- setx(z.out, quant = quantile(tobin$quant, prob = 0.2))

Estimating the first difference for the effect of high versus low liquidity ratio on duration(durable):

s.out2 <- sim(z.out, x = x.high, x1 = x.low)
summary(s.out2)

Model

Let $Y_i^*$ be the dependent variable which is not directly observed. Instead, we observe $Y_i$ which is defined as following:

$$ Y_i = \left{ \begin{array}{lcl} Y_i^ &\textrm{if} & c_1<Y_i^<c_2 \ c_1 &\textrm{if} & c_1 \ge Y_i^ \ c_2 &\textrm{if} & c_2 \le Y_i^ \end{array}\right. $$

where $c_1$ is the lower bound below which $Y_i^$ is censored, and $c_2$ is the upper bound above which $Y_i^$ is censored.

$$ \begin{aligned} \epsilon_{i} & \sim & \textrm{Normal}(0, \sigma^2)\end{aligned} $$

where $\epsilon_{i}=Y^*_i-\mu_i$.

$$ \begin{aligned} \mu_{i}= x_{i} \beta,\end{aligned} $$

where $x_{i}$ is the vector of $k$ explanatory variables for observation $i$ and $\beta$ is the vector of coefficients.

$$ \begin{aligned} \beta & \sim & \textrm{Normal}k \left( b{0},B_{0}^{-1}\right) \ \sigma^{2} & \sim & \textrm{InverseGamma} \left( \frac{c_0}{2}, \frac{d_0}{2} \right) \end{aligned} $$

where $b_{0}$ is the vector of means for the $k$ explanatory variables, $B_{0}$ is the $k\times k$ precision matrix (the inverse of a variance-covariance matrix), and $c_0/2$ and $d_0/2$ are the shape and scale parameters for $\sigma^{2}$. Note that $\beta$ and $\sigma^2$ are assumed a priori independent.

Quantities of Interest

$$ \begin{aligned} \Phi_1 &=& \Phi\left(\frac{(c_1 - x \beta)}{\sigma}\right) \ \Phi_2 &=& \Phi\left(\frac{(c_2 - x \beta)}{\sigma}\right) \ \phi_1 &=& \phi\left(\frac{(c_1 - x \beta)}{\sigma}\right) \ \phi_2 &=& \phi\left(\frac{(c_2 - x \beta)}{\sigma}\right) \end{aligned} $$

where $\Phi(\cdot)$ is the (cumulative) Normal density function and $\phi(\cdot)$ is the Normal probability density function of the standard normal distribution. Then the expected values are

$$ \begin{aligned} E(Y|x) &=& P(Y^ \le c_1|x) c_1+P(c_1<Y^<c_2|x) E(Y^ \mid c_1<Y^<c_2, x)+P(Y^* \ge c_2) c_2 \ &=& \Phi_{1}c_1 + x \beta(\Phi_{2}-\Phi_{1}) + \sigma (\phi_1 -\phi_2) + (1-\Phi_2) c_2,\end{aligned} $$

$$ \begin{aligned} \text{FD}=E(Y\mid x_{1})-E(Y\mid x).\end{aligned} $$

$$ \begin{aligned} \frac{1}{\sum t_{i}}\sum_{i:t_{i}=1}[Y_{i}(t_{i}=1)-E[Y_{i}(t_{i}=0)]],\end{aligned} $$

where $t_{i}$ is a binary explanatory variable defining the treatment ($t_{i}=1$) and control ($t_{i}=0$) groups.

Output Values

The Zelig object stores fields containing everything needed to rerun the Zelig output, and all the results and simulations as they are generated. In addition to the summary commands demonstrated above, some simply utility functions (known as getters) provide easy access to the raw fields most commonly of use for further investigation.

In the example above z.out$get_coef() returns the estimated coefficients, z.out$get_vcov() returns the estimated covariance matrix, and z.out$get_predict() provides predicted values for all observations in the dataset from the analysis.

See also

Bayesian tobit regression is part of the MCMCpack package by Andrew D. Martin and Kevin M. Quinn. The convergence diagnostics are part of the CODA package by Martyn Plummer, Nicky Best, Kate Cowles, and Karen Vines.

z5 <- ztobitbayes$new()
z5$references()


IQSS/Zelig documentation built on Dec. 11, 2023, 1:51 a.m.