sart: Estimating Tobit Models with Social Interactions
In CDatanet: Econometrics of Network Data


Description

`sart` estimates Tobit models with social interactions, following the framework of Xu and Lee (2015). It supports both complete and incomplete information settings in networks; under incomplete information, the model is defined with rational expectations.

Usage

sart(
  formula,
  Glist,
  starting = NULL,
  Ey0 = NULL,
  optimizer = "fastlbfgs",
  npl.ctr = list(),
  opt.ctr = list(),
  cov = TRUE,
  cinfo = TRUE,
  data
)

Arguments

`formula`	An object of class formula: a symbolic description of the model. The formula should have the form `y ~ x1 + x2 + gx1 + gx2`, where `y` is the endogenous variable and `x1`, `x2`, `gx1`, and `gx2` are control variables. Control variables may include contextual variables, such as peer averages, which can be computed using `peer.avg`.
`Glist`	The network matrix. For networks consisting of multiple subnets, `Glist` can be a list, where the `m`-th element is an `ns*ns` adjacency matrix representing the `m`-th subnet, with `ns` being the number of nodes in that subnet.
`starting`	(Optional) A vector of starting values for `\theta = (\lambda, \Gamma, \sigma)`, where `\lambda` is the peer effect coefficient, `\Gamma` is the vector of control variable coefficients, and `\sigma` is the standard deviation of the error term.
`Ey0`	(Optional) A starting value for `E(y)`.
`optimizer`	The optimization method to use. Choices are `"fastlbfgs"` (the L-BFGS method from the RcppNumerical package), `"nlm"` (the `nlm` function), and `"optim"` (the `optim` function). Additional arguments for these functions, such as `control` and `method`, can be specified through the `opt.ctr` argument.
`npl.ctr`	A list of controls for the NPL (Nested Pseudo-Likelihood) method (refer to the details in `cdnet`).
`opt.ctr`	A list of arguments to be passed to the chosen solver (`fastlbfgs`, `nlm`, or `optim`), such as `maxit`, `eps_f`, `eps_g`, `control`, `method`, etc.
`cov`	A Boolean indicating whether to compute the covariance matrix (`TRUE` or `FALSE`).
`cinfo`	A Boolean indicating whether the information structure is complete (`TRUE`) or incomplete (`FALSE`). Under incomplete information, the model is defined with rational expectations.
`data`	An optional data frame, list, or environment (or object coercible by as.data.frame) containing the variables in the model. If not found in `data`, the variables are taken from `environment(formula)`, typically the environment from which `sart` is called.
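
To make the expected shapes of `Glist` and the contextual variables concrete, here is a minimal base-R sketch. It builds a hypothetical two-subnet network, row-normalizes each adjacency matrix, and computes a peer average by hand as `G %*% x` within each subnet. The names `sizes`, `idx`, and `toy` are illustrative, and the hand-rolled peer average is only assumed to match what `peer.avg` returns for row-normalized matrices.

```r
# Hypothetical toy network: two subnets with 3 and 4 nodes (base R only)
set.seed(1)
sizes <- c(3, 4)
Glist <- lapply(sizes, function(ns) {
  A <- matrix(rbinom(ns * ns, 1, 0.5), ns, ns)  # random directed links
  diag(A) <- 0                                   # no self-links
  rs <- rowSums(A)
  rs[rs == 0] <- 1                               # isolated nodes keep a zero row
  A / rs                                         # row-normalize: row i divided by rs[i]
})

# A covariate stacked in the same order as the subnets
n  <- sum(sizes)
x1 <- rnorm(n)

# Peer average of x1, computed subnet by subnet as G %*% x
# (assumption: this mirrors peer.avg for row-normalized matrices)
idx <- split(seq_len(n), rep(seq_along(sizes), sizes))
gx1 <- unlist(lapply(seq_along(Glist), function(m) {
  as.vector(Glist[[m]] %*% x1[idx[[m]]])
}))

toy <- data.frame(x1 = x1, gx1 = gx1)
```

The rows of `toy` follow the stacking order of the subnets, which is the ordering the formula variables and `Glist` are assumed to share.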

Details

For a complete information model, the outcome y_i is defined as:

\begin{cases} y_i^{\ast} = \lambda \bar{y}_i + \mathbf{z}_i'\Gamma + \epsilon_i, \\ y_i = \max(0, y_i^{\ast}), \end{cases}

where \bar{y}_i is the average of y among peers, \mathbf{z}_i is a vector of control variables, and \epsilon_i \sim N(0, \sigma^2).
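
Under complete information, the outcomes solve a simultaneous system, because each y_i depends on the realized outcomes of peers. The base-R sketch below illustrates this with a simple fixed-point iteration on a single hypothetical network. It is not the package's simulation routine (`simsart`), and the parameter values are illustrative. The iteration is a contraction when the rows of G sum to at most one and |\lambda| < 1, since \max(0, \cdot) is 1-Lipschitz.

```r
# Illustrative setup (all values are assumptions, not package defaults)
set.seed(42)
n      <- 50
lambda <- 0.4                # peer effect, |lambda| < 1
Gamma  <- c(1, -0.5)         # intercept and one slope
sigma  <- 1

# Row-normalized adjacency matrix for a single network
A  <- matrix(rbinom(n * n, 1, 0.1), n, n)
diag(A) <- 0
rs <- rowSums(A); rs[rs == 0] <- 1
G  <- A / rs

Z   <- cbind(1, rnorm(n))    # controls, including the intercept
eps <- rnorm(n, 0, sigma)    # epsilon_i ~ N(0, sigma^2)
zg  <- as.vector(Z %*% Gamma)

# Iterate y <- max(0, lambda * G y + Z Gamma + eps) until it stops changing
y <- rep(0, n)
for (it in 1:500) {
  y_new <- pmax(0, lambda * as.vector(G %*% y) + zg + eps)
  converged <- max(abs(y_new - y)) < 1e-10
  y <- y_new
  if (converged) break
}
```

At convergence, `y` satisfies both lines of the system: the latent index uses the realized peer average `G %*% y`, and the observed outcome is censored at zero.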

In the case of incomplete information models with rational expectations, y_i is defined as:

\begin{cases} y_i^{\ast} = \lambda E(\bar{y}_i) + \mathbf{z}_i'\Gamma + \epsilon_i, \\ y_i = \max(0, y_i^{\ast}). \end{cases}
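
Under incomplete information, agents respond to expected rather than realized peer outcomes, so the expectations themselves must be mutually consistent. For \epsilon \sim N(0, \sigma^2), the censored mean is E[\max(0, \mu + \epsilon)] = \mu \Phi(\mu/\sigma) + \sigma \phi(\mu/\sigma), which gives a fixed-point equation in E(y). The base-R sketch below solves it by simple iteration on a hypothetical network. It is only an illustration of the rational-expectations condition, not a description of how the package computes `Ey` internally, and the helper `tobit_mean` is a name introduced here.

```r
# Illustrative setup (all values are assumptions, not package defaults)
set.seed(7)
n      <- 50
lambda <- 0.4
Gamma  <- c(1, -0.5)
sigma  <- 1

A  <- matrix(rbinom(n * n, 1, 0.1), n, n)
diag(A) <- 0
rs <- rowSums(A); rs[rs == 0] <- 1
G  <- A / rs                          # row-normalized network

Z  <- cbind(1, rnorm(n))
zg <- as.vector(Z %*% Gamma)

# E[max(0, mu + eps)] for eps ~ N(0, sigma^2) (hypothetical helper)
tobit_mean <- function(mu, sigma) {
  mu * pnorm(mu / sigma) + sigma * dnorm(mu / sigma)
}

# Solve Ey = tobit_mean(lambda * G Ey + Z Gamma, sigma) by iteration
Ey <- rep(0, n)
for (it in 1:500) {
  Ey_new <- tobit_mean(lambda * as.vector(G %*% Ey) + zg, sigma)
  converged <- max(abs(Ey_new - Ey)) < 1e-10
  Ey <- Ey_new
  if (converged) break
}
GEy <- as.vector(G %*% Ey)            # expected peer average

# Draw outcomes given the consistent expectations
eps <- rnorm(n, 0, sigma)
y   <- pmax(0, lambda * GEy + zg + eps)
```

The derivative of the censored mean with respect to \mu is \Phi(\mu/\sigma), which lies in (0, 1), so the iteration converges under the same conditions as in the complete information case.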

Value

A list containing:

info: General information about the model.
estimate: The Maximum Likelihood (ML) estimates of the parameters.
Ey: E(y), the expected values of the endogenous variable.
GEy: The average of E(y) among peers.
cov: A list including covariance matrices (if cov = TRUE).
details: Additional outputs returned by the optimizer.

References

Xu, X., & Lee, L. F. (2015). Maximum likelihood estimation of a spatial autoregressive Tobit model. Journal of Econometrics, 188(1), 264-280, doi:10.1016/j.jeconom.2015.05.004.

Examples


# Group sizes
set.seed(123)
M      <- 5 # Number of sub-groups
nvec   <- round(runif(M, 100, 200))
n      <- sum(nvec)

# Parameters
lambda <- 0.4
Gamma  <- c(2, -1.9, 0.8, 1.5, -1.2)
sigma  <- 1.5
theta  <- c(lambda, Gamma, sigma)

# Covariates (X)
X      <- cbind(rnorm(n, 1, 1), rexp(n, 0.4))

# Network creation
G      <- list()

for (m in 1:M) {
  nm           <- nvec[m]
  Gm           <- matrix(0, nm, nm)
  max_d        <- 30
  for (i in 1:nm) {
    tmp        <- sample((1:nm)[-i], sample(0:max_d, 1))
    Gm[i, tmp] <- 1
  }
  rs           <- rowSums(Gm); rs[rs == 0] <- 1
  Gm           <- Gm / rs
  G[[m]]       <- Gm
}

# Data creation
data   <- data.frame(X, peer.avg(G, cbind(x1 = X[, 1], x2 = X[, 2])))
colnames(data) <- c("x1", "x2", "gx1", "gx2")

## Complete information game
ytmp    <- simsart(formula = ~ x1 + x2 + gx1 + gx2, Glist = G, theta = theta, 
                   data = data, cinfo = TRUE)
data$yc <- ytmp$y

## Incomplete information game
ytmp    <- simsart(formula = ~ x1 + x2 + gx1 + gx2, Glist = G, theta = theta, 
                   data = data, cinfo = FALSE)
data$yi <- ytmp$y

# Complete information estimation for yc
outc1   <- sart(formula = yc ~ x1 + x2 + gx1 + gx2, optimizer = "nlm",
                Glist = G, data = data, cinfo = TRUE)
summary(outc1)

# Complete information estimation for yi (yi was simulated under incomplete information)
outc1   <- sart(formula = yi ~ x1 + x2 + gx1 + gx2, optimizer = "nlm",
                Glist = G, data = data, cinfo = TRUE)
summary(outc1)

# Incomplete information estimation for yc (yc was simulated under complete information)
outi1   <- sart(formula = yc ~ x1 + x2 + gx1 + gx2, optimizer = "nlm",
                Glist = G, data = data, cinfo = FALSE)
summary(outi1)

# Incomplete information estimation for yi
outi1   <- sart(formula = yi ~ x1 + x2 + gx1 + gx2, optimizer = "nlm",
                Glist = G, data = data, cinfo = FALSE)
summary(outi1)

CDatanet documentation built on April 3, 2025, 11:07 p.m.