simsar: Simulating Data from Linear-in-Mean Models with Social...

View source: R/sar.R

simsar    R Documentation

Simulating Data from Linear-in-Mean Models with Social Interactions

Description

simsar simulates continuous variables under linear-in-mean models with social interactions, following the specifications in Lee (2004) and Lee et al. (2010). The model incorporates peer interactions: an individual’s outcome depends not only on their own characteristics but also on the average outcome of their peers in the network and, when contextual variables are included, on their peers' average characteristics.

Usage

simsar(formula, Glist, theta, cinfo = TRUE, data)

Arguments

formula

A symbolic description of the model, passed as an object of class formula. The formula specifies the endogenous variable and the control variables, for example: y ~ x1 + x2 + gx1 + gx2, where y is the endogenous vector, and x1, x2, gx1, and gx2 are the control variables, which may include contextual variables (peer averages). Because y is the quantity being simulated, the Examples use a one-sided formula, ~ x1 + x2 + gx1 + gx2. Peer averages can be computed using the function peer.avg.

Glist

A list of network adjacency matrices, one per subnet. The m-th element of the list should be an n_m \times n_m matrix, where n_m is the number of nodes in the m-th subnet.

theta

A numeric vector giving the true values of the model parameters, in the order \theta = (\lambda, \Gamma, \sigma). The role of each parameter is defined in the Details section.

cinfo

A Boolean flag indicating whether the information is complete (cinfo = TRUE) or incomplete (cinfo = FALSE). If information is incomplete, the model operates under rational expectations.

data

An optional data frame, list, or environment (or an object coercible by as.data.frame to a data frame) containing the variables in the model. If not provided, the variables are taken from the environment of the function call.
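As an illustration of the Glist argument, here is a minimal base-R sketch of a sanity check for a list of networks. It assumes row-normalized matrices, as built in the Examples, so that each row sums to 1 (or to 0 for a node without peers). The helper check_glist is hypothetical and is not part of CDatanet.

```r
# Hypothetical helper (not part of CDatanet): sanity-check a list of networks.
check_glist <- function(Glist, tol = 1e-8) {
  stopifnot(is.list(Glist))
  for (m in seq_along(Glist)) {
    Gm <- Glist[[m]]
    stopifnot(is.matrix(Gm), nrow(Gm) == ncol(Gm))   # each subnet is square
    rs <- rowSums(Gm)
    # Assumption: rows sum to 1 (peers are averaged) or to 0 (no peers)
    stopifnot(all(abs(rs - 1) < tol | abs(rs) < tol))
  }
  sum(vapply(Glist, nrow, integer(1)))               # total number of nodes
}

# Two small row-normalized subnets
G1 <- matrix(c(0,   1, 0,
               0.5, 0, 0.5,
               0,   0, 0), 3, 3, byrow = TRUE)
G2 <- matrix(c(0, 1,
               1, 0), 2, 2, byrow = TRUE)
n_total <- check_glist(list(G1, G2))
```

The returned total can be compared with the number of rows in data; in the Examples, the covariates have sum(nvec) rows, one per node across all subnets.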

Details

In the complete information model, the outcome y_i for individual i is defined as:

y_i = \lambda \bar{y}_i + \mathbf{z}_i'\Gamma + \epsilon_i,

where \bar{y}_i represents the average outcome y among individual i's peers, \mathbf{z}_i is a vector of control variables, and \epsilon_i \sim N(0, \sigma^2) is the error term. In the case of incomplete information models with rational expectations, the outcome y_i is defined as:

y_i = \lambda E(\bar{y}_i) + \mathbf{z}_i'\Gamma + \epsilon_i,

where E(\bar{y}_i) is the expected average outcome of i's peers, as perceived by individual i.
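The complete information equation can be illustrated with a short base-R sketch for a single subnet. Stacking the equation over individuals gives y = \lambda G y + Z\Gamma + \epsilon, so y solves the linear system (I - \lambda G) y = Z\Gamma + \epsilon. This is only a sketch under stated assumptions (one subnet, a row-normalized G, |\lambda| < 1, an intercept in Z); all object names are illustrative, and it is not a description of how simsar is implemented.

```r
# Base-R sketch of the complete-information model for ONE subnet.
# All names (n, A, G, Z, Gamma, ...) are illustrative, not simsar internals.
set.seed(1)
n      <- 50
lambda <- 0.4
Gamma  <- c(2, -1.9, 0.8)          # intercept and two slopes
sigma  <- 1.5

# Row-normalized adjacency matrix: row i averages over i's peers
A       <- matrix(rbinom(n * n, 1, 0.1), n, n)
diag(A) <- 0
rs      <- rowSums(A)
rs[rs == 0] <- 1                   # nodes without peers keep a zero row
G       <- A / rs

# Controls (with intercept) and normal errors
Z   <- cbind(1, rnorm(n), rexp(n, 0.4))
eps <- rnorm(n, 0, sigma)

# y = lambda * G y + Z Gamma + eps  =>  (I - lambda G) y = Z Gamma + eps
y  <- solve(diag(n) - lambda * G, Z %*% Gamma + eps)
Gy <- G %*% y                      # average outcome among peers

# Residual of the structural equation; should be at rounding-error level
resid_max <- max(abs(y - (lambda * Gy + Z %*% Gamma + eps)))
```

Solving the linear system directly avoids forming the inverse of I - \lambda G explicitly. With a row-normalized G and |\lambda| < 1, that matrix is invertible, so the solution is unique.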

Value

A list containing the following elements:

y

the simulated outcome (a continuous variable).

Gy

the average of y among friends.

References

Lee, L. F. (2004). Asymptotic distributions of quasi-maximum likelihood estimators for spatial autoregressive models. Econometrica, 72(6), 1899-1925, doi:10.1111/j.1468-0262.2004.00558.x.

Lee, L. F., Liu, X., & Lin, X. (2010). Specification and estimation of social interaction models with network structures. The Econometrics Journal, 13(2), 145-176, doi:10.1111/j.1368-423X.2010.00310.x.

See Also

sar, simsart, simcdnet.

Examples


library(CDatanet)

# Group sizes
set.seed(123)
M      <- 5 # Number of sub-groups
nvec   <- round(runif(M, 100, 1000))
n      <- sum(nvec)

# Parameters
lambda <- 0.4
Gamma  <- c(2, -1.9, 0.8, 1.5, -1.2)
sigma  <- 1.5
theta  <- c(lambda, Gamma, sigma)

# X
X      <- cbind(rnorm(n, 1, 1), rexp(n, 0.4))

# Network
G      <- list()

for (m in 1:M) {
  nm           <- nvec[m]
  Gm           <- matrix(0, nm, nm)
  max_d        <- 30
  for (i in 1:nm) {
    tmp        <- sample((1:nm)[-i], sample(0:max_d, 1))
    Gm[i, tmp] <- 1
  }
  rs           <- rowSums(Gm); rs[rs == 0] <- 1
  Gm           <- Gm/rs
  G[[m]]       <- Gm
}

# data
data   <- data.frame(X, peer.avg(G, cbind(x1 = X[,1], x2 =  X[,2])))
colnames(data) <- c("x1", "x2", "gx1", "gx2")

ytmp    <- simsar(formula = ~ x1 + x2 + gx1 + gx2, Glist = G, 
                  theta = theta, data = data) 
y       <- ytmp$y


CDatanet documentation built on April 3, 2025, 11:07 p.m.