horse_race: Compare the explanatory power of parameter-dependent network...
In econet: Estimation of Parameter-Dependent Network Centrality Measures

horse_race

R Documentation

Compare the explanatory power of parameter-dependent network centrality measures with that of standard measures of network centrality.

Description

Compare the explanatory power of parameter-dependent network centrality measures with that of standard measures of network centrality.

Usage

horse_race(
  formula = formula(),
  centralities = c("indegree", "outdegree", "degree", "betweenness", "incloseness",
    "outcloseness", "closeness", "eigenvector"),
  directed = FALSE,
  weighted = FALSE,
  normalization = FALSE,
  data = list(),
  unobservables = list(),
  G = list(),
  model = c("model_A", "model_B"),
  estimation = c("NLLS", "MLE"),
  endogeneity = FALSE,
  first_step = NULL,
  exclusion_restriction = NULL,
  start.val = NULL,
  to_weight = NULL,
  time_fixed_effect = NULL,
  ind_fixed_effect = NULL,
  mle_controls = NULL,
  kappa = NULL,
  delta = NULL
)

Arguments

`formula`	an object of class `formula`: a symbolic description of the model to be fitted. The constant (i.e. intercept) and the autoregressive parameter need not be specified.
`centralities`	at least one of `c("indegree","outdegree","degree", "betweenness", "incloseness", "outcloseness", "closeness", "eigenvector")`.
`directed`	logical. `TRUE` if the social network is directed, `FALSE` otherwise.
`weighted`	logical. `TRUE` if the social network is weighted, `FALSE` otherwise.
`normalization`	Default is `NULL`. Alternatively, it can be set to one of `c("bygraph","bycomponent","bymaxcomponent","bymaxgraph")`. See details.
`data`	an object of class `data.frame` containing the variables in the model. If data are longitudinal, observations must be ordered by time period and then by individual.
`unobservables`	a numeric vector used to obtain an unbiased estimate of the parameter-dependent centrality when the network is endogenous. See details.
`G`	an object of class `Matrix` representing the social network. Row and column names must be specified and match the order of the observations in `data`.
`model`	string. One of `c("model_A","model_B")`. See details.
`estimation`	string. One of `c("NLLS","MLE")`. They select, respectively, a non-linear least squares estimator and a maximum likelihood estimator.
`endogeneity`	logical. Default is `FALSE`. If `TRUE`, a two-step correction procedure is implemented to control for the endogeneity of the network.
`first_step`	Default is `NULL`. If `endogeneity = TRUE`, one of `c("standard","fe", "shortest", "coauthors", "degree")` must be specified. See details.
`exclusion_restriction`	an object of class `Matrix` representing the exogenous matrix used to instrument the endogenous social network, if `unobservables` is non-`NULL`. Row and column names must be specified and match the order of the observations in `data`.
`start.val`	an optional list containing the starting values for the estimations. Object names must match the names provided in `formula`. It is also required to specify the value of both the constant and the decay parameter(s).
`to_weight`	an optional vector of weights to be used in the fitting process to indicate that different observations have different variances. Should be `NULL` or a numeric vector. If non-`NULL`, weighted non-linear least squares (if `estimation = "NLLS"`) or weighted maximum likelihood (if `estimation = "MLE"`) is estimated.
`time_fixed_effect`	an optional string. It indicates the name of the time index used in formula. It is used for models with longitudinal data.
`ind_fixed_effect`	an optional string. Default is `NULL`. It indicates the name of the individual index contained in the data. If provided, individual fixed effects are automatically added to the `formula` of the main equation. If `endogeneity = TRUE`, the field `first_step` is overridden, and automatically set equal to `"fe"`. It is used for models with longitudinal data.
`mle_controls`	a list allowing the user to set upper and lower bounds for control variables in MLE estimation and the variance for the ML estimator. See details.
`kappa`	a normalization level, with default equal to 1, used in MLE estimation.
`delta`	Default is `NULL`. To be used when `estimation = "NLLS"`. It has to be a number between zero (included) and one (excluded). When used, `econet` performs a constrained NLLS estimation. In this case, the estimated peer effect parameter, taken in absolute value, is forced to be higher than zero and lower than the spectral radius of `G`. Specifically, `delta` is a penalizing factor, decreasing the goodness of fit of the NLLS estimation, when the peer effect parameter approaches one of the two bounds. Observe that very high values of `delta` may cause NLLS estimation not to converge.
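
The requirement that the row and column names of `G` follow the order of the observations in `data` can be checked before calling `horse_race`. The following is a minimal sketch in base R and is not part of `econet`: the toy objects `toy_data` and `toy_G`, the identifier column `id`, and the helper `check_alignment` are hypothetical, and `G` is assumed to be a base R matrix. The last line computes the spectral radius of the toy matrix, the quantity mentioned in the description of `delta`.

```r
# Hypothetical toy inputs (not from econet): three individuals and a
# 3 x 3 adjacency matrix whose dimnames carry the individual identifiers.
toy_data <- data.frame(id = c("a", "b", "c"), y = c(1.2, 0.7, 2.1))
toy_G <- matrix(c(0, 1, 1,
                  1, 0, 0,
                  1, 0, 0),
                nrow = 3, byrow = TRUE,
                dimnames = list(c("a", "b", "c"), c("a", "b", "c")))

# Hypothetical helper: TRUE when G is square, named, and ordered like `ids`.
check_alignment <- function(G, ids) {
  ids <- as.character(ids)
  is.matrix(G) &&
    nrow(G) == ncol(G) &&
    !is.null(rownames(G)) && !is.null(colnames(G)) &&
    identical(rownames(G), ids) &&
    identical(colnames(G), ids)
}

aligned <- check_alignment(toy_G, toy_data$id)

# Spectral radius: largest modulus among the eigenvalues of G.
spectral_radius <- max(Mod(eigen(toy_G, only.values = TRUE)$values))
```

If `aligned` is `FALSE`, reordering either `data` or `G` before estimation is one way to restore the correspondence between observations and nodes.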

Details

A number of different normalizations are available to the user:

bygraph and bycomponent are used to divide degree and closeness centrality by n - 1, and betweenness centrality by (n - 1) * (n - 2) if directed = TRUE, or by (n - 1) * (n - 2) / 2 if directed = FALSE. In the former case (i.e. bygraph), n is equal to the number of nodes in the network. In the latter case (i.e. bycomponent), n is equal to the number of nodes of the component in which the node is embedded.
bymaxgraph and bymaxcomponent are used to divide degree, betweenness and closeness centrality by the maximum value of the centrality of the network (bymaxgraph) or component (bymaxcomponent) in which the node is embedded.
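
As an illustration of the two families of normalization, the sketch below applies them to degree centrality on a small undirected toy network, using base R only. It is not the package's internal code: `toy_G` and the variable names are hypothetical, and the whole network forms a single component, so the bygraph and bycomponent scalings coincide here.

```r
# Hypothetical undirected toy network with 4 nodes (a path a - b - c - d).
toy_G <- matrix(0, 4, 4, dimnames = list(letters[1:4], letters[1:4]))
toy_G["a", "b"] <- toy_G["b", "a"] <- 1
toy_G["b", "c"] <- toy_G["c", "b"] <- 1
toy_G["c", "d"] <- toy_G["d", "c"] <- 1

n <- nrow(toy_G)
degree_raw <- rowSums(toy_G)  # unnormalized degree

# "bygraph"-style scaling: divide by n - 1, with n the number of nodes.
degree_bygraph <- degree_raw / (n - 1)

# "bymaxgraph"-style scaling: divide by the largest degree in the network.
degree_bymaxgraph <- degree_raw / max(degree_raw)

# Divisor described above for betweenness when directed = FALSE.
betweenness_divisor <- (n - 1) * (n - 2) / 2
```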

If the network is endogenous, the user is required to run net_dep separately and extract from the resulting object the vector of unobservables necessary for obtaining an unbiased estimate of the parameter-dependent centrality. This vector can be passed through the argument unobservables.
If endogeneity = TRUE, a two-step estimation is implemented to control for network endogeneity. The argument first_step is used to control for the specification of the first-step model, e.g.:

first_step = "standard" is used when agents' connections are predicted by the differences in their characteristics (i.e. those on the right hand side of formula), and an exclusion_restriction: i.e., their connections in a different network.
first_step = "fe" adds individual fixed effects to the standard model, as in Graham (2017).
first_step = "shortest" adds to the standard model the shortest distance between i and j, excluding the link between i and j itself, as in Fafchamps et al (2010).
first_step = "coauthors" adds to the standard model the number of shared connections between i and j, as in Graham (2015).
first_step = "degree" adds to the standard model the difference in the degree centrality of i and j.
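
To make the pair-level regressors listed above more concrete, the following base R sketch builds, for a toy network, the difference in one characteristic, the number of shared connections, and the difference in degree for each pair (i, j). It only illustrates the quantities involved and does not reproduce how `econet` constructs or estimates the first step. The names `toy_G`, `x`, and `first_step_data` are hypothetical, and taking absolute differences is an assumption made for this sketch.

```r
# Hypothetical undirected toy network and one individual characteristic.
ids <- c("a", "b", "c", "d")
toy_G <- matrix(c(0, 1, 1, 0,
                  1, 0, 1, 0,
                  1, 1, 0, 1,
                  0, 0, 1, 0),
                nrow = 4, byrow = TRUE, dimnames = list(ids, ids))
x <- c(a = 1, b = 3, c = 6, d = 10)

# Difference in the characteristic for each pair (absolute value assumed).
diff_x <- abs(outer(x, x, "-"))

# Number of shared connections: entry (i, j) of G %*% G counts the nodes
# linked to both i and j in an unweighted, undirected network.
shared <- toy_G %*% toy_G
diag(shared) <- 0

# Difference in degree centrality between i and j (absolute value assumed).
deg <- rowSums(toy_G)
diff_degree <- abs(outer(deg, deg, "-"))

# Stack the upper-triangular pairs into one data frame, one row per dyad.
dyads <- which(upper.tri(toy_G), arr.ind = TRUE)
first_step_data <- data.frame(
  i = ids[dyads[, "row"]],
  j = ids[dyads[, "col"]],
  link = toy_G[dyads],
  diff_x = diff_x[dyads],
  shared = shared[dyads],
  diff_degree = diff_degree[dyads]
)
```

A data frame of this shape, with one row per dyad and the observed link as the outcome, is the kind of input a link-formation regression would use.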

For additional details, see the vignette (doi:10.18637/jss.v102.i08).

Value

A list of three objects:

A list of estimates, each one setting the decay parameter to zero and adding one of the centralities to the specification of formula. The last estimate adds all the selected centralities to formula and does not restrict the decay parameter to zero.
An object of class data.frame containing the computed centrality measures.
A list of first-step estimations used to correct the effect of centrality measures when the network is endogenous.

References

Battaglini M., V. Leone Sciabolazza, E. Patacchini, S. Peng (2020), "Econet: An R package for the Estimation of parameter-dependent centrality measures", Mimeo.

Examples


# Load data
data("db_cosponsor")
data("G_alumni_111")
db_model_B <- db_cosponsor
G_model_B <- G_cosponsor_111
G_exclusion_restriction <- G_alumni_111
are_factors <- c("party", "gender", "nchair")
db_model_B[are_factors] <- lapply(db_model_B[are_factors], factor)

# Specify formula
f_model_B <- formula("les ~ gender + party + nchair")

# Specify starting values
starting <- c(alpha = 0.214094,
             beta_gender1 = -0.212706,
             beta_party1 = 0.478518,
             beta_nchair1 = 3.09234,
             beta_betweenness = 7.06287e-05,
             phi = 0.344787)

# Fit model
horse_model_B <- horse_race(formula = f_model_B,
              centralities = "betweenness",
              directed = TRUE, weighted = TRUE,
              data = db_model_B, G = G_model_B,
              model = "model_B", estimation = "NLLS",
              start.val = starting)

# Store and print results
summary(horse_model_B)
summary(horse_model_B, centrality = "betweenness")
horse_model_B$centrality

# WARNING: This toy example is provided only for runtime execution.
# Please refer to the previous example for sensible calculations.
data("db_alumni_test")
data("G_model_A_test")
db_model <- db_alumni_test
G_model <- G_model_A_test
f_model <- formula("les ~ dw")
horse_model_test <- horse_race(formula = f_model, centralities = "betweenness",
                            directed = TRUE, weighted = FALSE, normalization = NULL,
                            data = db_model, unobservables = NULL, G = G_model,
                            model = "model_A", estimation = "NLLS",
                            start.val = c(alpha = -0.31055275,
                                          beta_dw = 1.50666982,
                                          beta_betweenness = 0.09666742,
                                          phi = 16.13035695))
summary(horse_model_test)

econet documentation built on Sept. 11, 2024, 6:46 p.m.