simsart: Simulating Data from Tobit Models with Social Interactions
In CDatanet: Econometrics of Network Data


Description

simsart simulates censored data with social interactions (see Xu and Lee, 2015).

Usage

simsart(formula, Glist, theta, tol = 1e-15, maxit = 500, cinfo = TRUE, data)

Arguments

`formula`	an object of class `formula`: a symbolic description of the model. `formula` must be of the form, for example, `y ~ x1 + x2 + gx1 + gx2`, where `y` is the endogenous vector, and `x1`, `x2`, `gx1`, and `gx2` are control variables. The controls can include contextual variables, i.e., averages among the peers, which can be computed using the function `peer.avg`. The Examples pass a one-sided formula (`~ x1 + x2 + gx1 + gx2`), because `y` is the variable being simulated.
`Glist`	The network matrix. For networks consisting of multiple subnets, `Glist` can be a list of subnets with the `m`-th element being an `ns*ns` adjacency matrix, where `ns` is the number of nodes in the `m`-th subnet.
`theta`	a vector defining the true value of `\theta = (\lambda, \Gamma, \sigma)` (see the model specification in the details).
`tol`	the tolerance value used in the fixed-point iteration method to compute `y`. The process stops if the `\ell_1`-distance between two consecutive values of `y` is less than `tol`.
`maxit`	the maximum number of iterations in the fixed-point iteration method.
`cinfo`	a Boolean indicating whether information is complete (`cinfo = TRUE`) or incomplete (`cinfo = FALSE`). In the case of incomplete information, the model is defined under rational expectations.
`data`	an optional data frame, list, or environment (or object coercible by `as.data.frame` to a data frame) containing the variables in the model. If not found in `data`, the variables are taken from `environment(formula)`, typically the environment from which `simsart` is called.
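
The layout of `theta` can be illustrated with a short base-R sketch. It assumes that the design matrix includes an intercept, which is what the Examples suggest (five entries in `Gamma` for four regressors); the toy data frame `df` and the numeric values are hypothetical, and this is not the package's internal code.

```r
# Hypothetical toy data with two regressors
df <- data.frame(x1 = c(0.5, 1.2, -0.3), x2 = c(2.0, 0.1, 1.4))

# model.matrix() adds an "(Intercept)" column to the one-sided formula
Z  <- model.matrix(~ x1 + x2, data = df)
K  <- ncol(Z)                       # number of coefficients in Gamma (here 3)

# theta = (lambda, Gamma, sigma): assumed ordering, as stated for `theta`
theta <- c(0.4, 2, -1.9, 0.8, 1.5)
stopifnot(length(theta) == K + 2)   # one lambda, K Gammas, one sigma

lambda <- theta[1]
Gamma  <- theta[2:(K + 1)]
sigma  <- theta[K + 2]
```

Under this assumption, a formula with `K - 1` regressors requires a `theta` of length `K + 2`.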

Details

For a complete information model, the outcome y_i is defined as:

\begin{cases} y_i^{\ast} = \lambda \bar{y}_i + \mathbf{z}_i'\Gamma + \epsilon_i, \\ y_i = \max(0, y_i^{\ast}), \end{cases}

where \bar{y}_i is the average of y among peers, \mathbf{z}_i is a vector of control variables, and \epsilon_i \sim N(0, \sigma^2).
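
As a rough illustration of how the complete information outcome can be obtained by fixed-point iteration, the following base-R sketch iterates y = max(0, \lambda G y + Z\Gamma + \epsilon) until the \ell_1-distance between two consecutive iterates falls below a tolerance. The network, the parameter values, and the tolerance are illustrative assumptions; the sketch does not reproduce the internals of `simsart`.

```r
set.seed(1)
n      <- 50
lambda <- 0.4
Gamma  <- c(1, -0.5)   # intercept and one slope (illustrative values)
sigma  <- 1

# Row-normalized random network without self-links
G           <- matrix(rbinom(n * n, 1, 0.1), n, n)
diag(G)     <- 0
rs          <- rowSums(G)
rs[rs == 0] <- 1       # isolated nodes keep a zero row
G           <- G / rs

Z    <- cbind(1, rnorm(n))
eps  <- rnorm(n, 0, sigma)
base <- as.vector(Z %*% Gamma) + eps   # part of y* that does not depend on y

# Fixed-point iteration on y = max(0, lambda * G y + base)
y     <- rep(0, n)
tol   <- 1e-10
maxit <- 500
for (it in seq_len(maxit)) {
  y_new <- pmax(0, lambda * as.vector(G %*% y) + base)
  dist  <- sum(abs(y_new - y))         # l1-distance between iterates
  y     <- y_new
  if (dist < tol) break
}

yst <- lambda * as.vector(G %*% y) + base  # latent variable at the fixed point
```

Because `G` is row-normalized and |\lambda| < 1, the mapping is a contraction, so the loop stops well before `maxit` in this setting.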

In the case of incomplete information models with rational expectations, y_i is defined as:

\begin{cases} y_i^{\ast} = \lambda E(\bar{y}_i) + \mathbf{z}_i'\Gamma + \epsilon_i, \\ y_i = \max(0, y_i^{\ast}). \end{cases}
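
A minimal sketch of the rational expectations case follows. It assumes that E(y_i) is the mean of a normal variable censored at zero, E(y_i) = \Phi(\mu_i/\sigma)\mu_i + \sigma\phi(\mu_i/\sigma) with \mu_i = \lambda E(\bar{y}_i) + \mathbf{z}_i'\Gamma, and solves for the vector E(y) by fixed-point iteration. The network and parameter values are illustrative, and the package may compute these quantities differently.

```r
set.seed(1)
n      <- 50
lambda <- 0.4
Gamma  <- c(1, -0.5)   # intercept and one slope (illustrative values)
sigma  <- 1

# Row-normalized random network without self-links
G           <- matrix(rbinom(n * n, 1, 0.1), n, n)
diag(G)     <- 0
rs          <- rowSums(G)
rs[rs == 0] <- 1
G           <- G / rs

Z  <- cbind(1, rnorm(n))
zg <- as.vector(Z %*% Gamma)

# Mean of max(0, mu + eps) with eps ~ N(0, sigma^2)
tobit_mean <- function(mu, sigma) {
  pnorm(mu / sigma) * mu + sigma * dnorm(mu / sigma)
}

# Fixed-point iteration on E(y) = tobit_mean(lambda * G E(y) + Z Gamma)
Ey    <- rep(0, n)
tol   <- 1e-10
maxit <- 500
for (it in seq_len(maxit)) {
  Ey_new <- tobit_mean(lambda * as.vector(G %*% Ey) + zg, sigma)
  dist   <- sum(abs(Ey_new - Ey))
  Ey     <- Ey_new
  if (dist < tol) break
}

# Outcomes depend on expected, not realized, peer outcomes
GEy <- as.vector(G %*% Ey)
yst <- lambda * GEy + zg + rnorm(n, 0, sigma)
y   <- pmax(0, yst)
```

In contrast to the complete information case, the shocks \epsilon_i enter only after the fixed point is solved, since agents react to expected rather than realized peer outcomes.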

Value

A list consisting of:

yst: y^{\ast}, the latent variable.
y: The observed censored variable.
Ey: E(y), the expected value of y.
Gy: The average of y among peers.
GEy: The average of E(y) among peers.
meff: A list including average and individual marginal effects.
iteration: The number of iterations performed per sub-network in the fixed-point iteration method.
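
To make the peer averages `Gy` and `GEy` concrete, the sketch below computes the average of a stacked outcome vector among peers when the network is stored as a list of row-normalized sub-network matrices. It assumes that observations are ordered by sub-network, in the same order as the list; the toy sizes and values are hypothetical.

```r
set.seed(2)
nvec <- c(3, 4)   # sizes of two toy sub-networks

# One row-normalized adjacency matrix per sub-network
G <- lapply(nvec, function(nm) {
  Gm          <- matrix(rbinom(nm * nm, 1, 0.5), nm, nm)
  diag(Gm)    <- 0
  rs          <- rowSums(Gm)
  rs[rs == 0] <- 1
  Gm / rs
})

# Stacked censored outcome, ordered by sub-network (assumption)
y <- pmax(0, rnorm(sum(nvec), 1, 2))

# Row indices belonging to each sub-network
idx <- split(seq_along(y), rep(seq_along(nvec), nvec))

# Peer average computed block by block, then stacked again
Gy <- unlist(Map(function(Gm, i) as.vector(Gm %*% y[i]), G, idx),
             use.names = FALSE)
```

Applying the same computation to a vector of expected outcomes in place of `y` gives the analogue of `GEy`.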

References

Xu, X., & Lee, L. F. (2015). Maximum likelihood estimation of a spatial autoregressive Tobit model. Journal of Econometrics, 188(1), 264-280, doi:10.1016/j.jeconom.2015.05.004.

Examples


library(CDatanet) # Provides simsart() and peer.avg()

# Define group sizes
set.seed(123)
M      <- 5 # Number of sub-groups
nvec   <- round(runif(M, 100, 200)) # Number of nodes per sub-group
n      <- sum(nvec) # Total number of nodes

# Define parameters
lambda <- 0.4
Gamma  <- c(2, -1.9, 0.8, 1.5, -1.2)
sigma  <- 1.5
theta  <- c(lambda, Gamma, sigma)

# Generate covariates (X)
X      <- cbind(rnorm(n, 1, 1), rexp(n, 0.4))

# Construct network adjacency matrices
G      <- list()
for (m in 1:M) {
  nm           <- nvec[m] # Nodes in sub-group m
  Gm           <- matrix(0, nm, nm) # Initialize adjacency matrix
  max_d        <- 30 # Maximum degree
  for (i in 1:nm) {
    tmp        <- sample((1:nm)[-i], sample(0:max_d, 1)) # Random connections
    Gm[i, tmp] <- 1
  }
  rs           <- rowSums(Gm) # Normalize rows
  rs[rs == 0]  <- 1
  Gm           <- Gm / rs
  G[[m]]       <- Gm
}

# Prepare data
data   <- data.frame(X, peer.avg(G, cbind(x1 = X[, 1], x2 = X[, 2])))
colnames(data) <- c("x1", "x2", "gx1", "gx2") # Add column names

# Complete information game simulation
ytmp    <- simsart(formula = ~ x1 + x2 + gx1 + gx2, 
                   Glist = G, theta = theta, 
                   data = data, cinfo = TRUE)
data$yc <- ytmp$y # Add simulated outcome to the dataset

# Incomplete information game simulation
ytmp    <- simsart(formula = ~ x1 + x2 + gx1 + gx2, 
                   Glist = G, theta = theta, 
                   data = data, cinfo = FALSE)
data$yi <- ytmp$y # Add simulated outcome to the dataset

CDatanet documentation built on April 3, 2025, 11:07 p.m.