horse_race: Compare the explanatory power of parameter-dependent network...


horse_race R Documentation

Compare the explanatory power of parameter-dependent network centrality measures with that of standard measures of network centrality.

Description

Compare the explanatory power of parameter-dependent network centrality measures with that of standard measures of network centrality.

Usage

horse_race(
  formula = formula(),
  centralities = c("indegree", "outdegree", "degree", "betweenness", "incloseness",
    "outcloseness", "closeness", "eigenvector"),
  directed = FALSE,
  weighted = FALSE,
  normalization = FALSE,
  data = list(),
  unobservables = list(),
  G = list(),
  model = c("model_A", "model_B"),
  estimation = c("NLLS", "MLE"),
  endogeneity = FALSE,
  first_step = NULL,
  exclusion_restriction = NULL,
  start.val = NULL,
  to_weight = NULL,
  time_fixed_effect = NULL,
  ind_fixed_effect = NULL,
  mle_controls = NULL,
  kappa = NULL,
  delta = NULL
)

Arguments

formula

an object of class formula: a symbolic description of the model to be fitted. The constant (i.e. intercept) and the autoregressive parameter need not be specified.

centralities

at least one of c("indegree","outdegree","degree", "betweenness", "incloseness", "outcloseness", "closeness", "eigenvector").

directed

logical. TRUE if the social network is directed, FALSE otherwise.

weighted

logical. TRUE if the social network is weighted, FALSE otherwise.

normalization

Default is NULL. Alternatively, it can be set to one of c("bygraph","bycomponent","bymaxcomponent","bymaxgraph"). See details.

data

an object of class data.frame containing the variables in the model. If data are longitudinal, observations must be ordered by time period and then by individual.

unobservables

a numeric vector used to obtain an unbiased estimate of the parameter-dependent centrality when the network is endogenous. See details.

G

an object of class Matrix representing the social network. Row and column names must be specified and match the order of the observations in data.

model

string. One of c("model_A","model_B"). See details.

estimation

string. One of c("NLLS","MLE"), which select a non-linear least squares estimator and a maximum likelihood estimator, respectively.

endogeneity

logical. Default is FALSE. If TRUE, net_dep implements a two-step correction procedure to control for the endogeneity of the network.

first_step

Default is NULL. If endogeneity = TRUE, one of c("standard","fe", "shortest", "coauthors", "degree") must be specified. See details.

exclusion_restriction

an object of class Matrix representing the exogenous matrix used to instrument the endogenous social network, if unobservables is non-NULL. Row and column names must be specified and match the order of the observations in data.

start.val

an optional list containing the starting values for the estimations. Object names must match the names provided in formula. Starting values for both the constant and the decay parameter(s) must also be specified.

to_weight

an optional vector of weights to be used in the fitting process to indicate that different observations have different variances. Should be NULL or a numeric vector. If non-NULL, weighted non-linear least squares (if estimation = "NLLS") or weighted maximum likelihood (if estimation = "MLE") is used.

time_fixed_effect

an optional string. It indicates the name of the time index used in formula. It is used for models with longitudinal data.

ind_fixed_effect

an optional string. Default is NULL. It indicates the name of the individual index contained in the data. If provided, individual fixed effects are automatically added to the formula of the main equation. If endogeneity = TRUE, the field first_step is overridden, and automatically set equal to "fe". It is used for models with longitudinal data.

mle_controls

a list allowing the user to set upper and lower bounds for control variables in MLE estimation and the variance for the ML estimator. See details.

kappa

a normalization level used in MLE estimation, with a default of 1.

delta

Default is NULL. To be used when estimation = "NLLS". It has to be a number between zero (included) and one (excluded). When used, econet performs a constrained NLLS estimation. In this case, the estimated peer effect parameter, taken in absolute value, is forced to be higher than zero and lower than the spectral radius of G. Specifically, delta is a penalizing factor, decreasing the goodness of fit of the NLLS estimation, when the peer effect parameter approaches one of the two bounds. Observe that very high values of delta may cause NLLS estimation not to converge.
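
The requirements stated above for data, G and delta can be checked before calling horse_race. The following is a minimal base R sketch under stated assumptions, not part of econet: the column names id, time and y are hypothetical, G is built with base matrix() for illustration, and the check assumes that the row names of G are individual identifiers.

```r
# Hypothetical longitudinal data; the column names are illustrative only.
panel <- data.frame(
  id   = c("b", "a", "c", "a", "c", "b"),
  time = c(2, 1, 1, 2, 2, 1),
  y    = c(0.4, 1.2, 0.7, 0.9, 1.1, 0.3)
)

# Order observations by time period first, then by individual.
panel <- panel[order(panel$time, panel$id), ]
rownames(panel) <- NULL

# Toy adjacency matrix with row and column names (assumption: names are ids).
ids <- c("a", "b", "c")
G <- matrix(c(0, 1, 1,
              1, 0, 0,
              1, 0, 0),
            nrow = 3, byrow = TRUE, dimnames = list(ids, ids))

# Check that the names of G follow the order of individuals within a period.
names_ok <- identical(rownames(G), colnames(G)) &&
  identical(rownames(G), panel$id[panel$time == 1])

# Spectral radius of G: the largest eigenvalue in absolute value.
spectral_radius <- max(Mod(eigen(G)$values))

print(names_ok)
print(spectral_radius)
```

The spectral radius is the quantity mentioned in the description of delta as a bound on the estimated peer effect parameter.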

Details

A number of different normalizations are available to the user:

  • bygraph and bycomponent are used to divide degree and closeness centrality by n - 1, and betweenness centrality by (n - 1) * (n - 2) if directed = TRUE, or by (n - 1) * (n - 2) / 2 if directed = FALSE. In the former case (i.e. bygraph), n is equal to the number of nodes in the network. In the latter case (i.e. bycomponent), n is equal to the number of nodes of the component in which the node is embedded.

  • bymaxgraph and bymaxcomponent are used to divide degree, betweenness and closeness centrality by the maximum value of the centrality of the network (bymaxgraph) or component (bymaxcomponent) in which the node is embedded.
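
The arithmetic behind these normalizations can be illustrated on a toy network. This is a minimal base R sketch, not the code used inside econet: the graph and all variable names are hypothetical, and because the toy graph is connected, bygraph and bycomponent (and likewise bymaxgraph and bymaxcomponent) would coincide.

```r
# Toy undirected path graph a - b - c - d (hypothetical data).
ids <- c("a", "b", "c", "d")
G <- matrix(c(0, 1, 0, 0,
              1, 0, 1, 0,
              0, 1, 0, 1,
              0, 0, 1, 0),
            nrow = 4, byrow = TRUE, dimnames = list(ids, ids))

n <- nrow(G)          # number of nodes in the network
degree <- rowSums(G)  # raw degree centrality

# "bygraph"-style normalization: divide degree by n - 1.
degree_bygraph <- degree / (n - 1)

# "bymaxgraph"-style normalization: divide by the maximum degree in the network.
degree_bymaxgraph <- degree / max(degree)

# Divisors for betweenness under the "bygraph"-style normalization.
betweenness_div_directed   <- (n - 1) * (n - 2)      # directed = TRUE
betweenness_div_undirected <- (n - 1) * (n - 2) / 2  # directed = FALSE

print(round(cbind(degree, degree_bygraph, degree_bymaxgraph), 3))
```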

If the network is endogenous, the user must run net_dep separately and extract from the resulting object the vector of unobservables needed to obtain an unbiased estimate of the parameter-dependent centrality. This vector can be passed through the argument unobservables.

If endogeneity = TRUE, a two-step estimation is implemented to control for network endogeneity. The argument first_step controls the specification of the first-step model, e.g.:

  • first_step = "standard" is used when agents' connections are predicted by the differences in their characteristics (i.e. those on the right-hand side of formula) and by an exclusion_restriction, i.e. their connections in a different network.

  • first_step = "fe" adds individual fixed effects to the standard model, as in Graham (2017).

  • first_step = "shortest" adds to the standard model the shortest distance between i and j, excluding the link between i and j itself, as in Fafchamps et al. (2010).

  • first_step = "coauthors" adds to the standard model the number of shared connections between i and j, as in Graham (2015).

  • first_step = "degree" adds to the standard model the difference in the degree centrality of i and j.

For additional details, see the vignette (doi:10.18637/jss.v102.i08).
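
To make the first-step regressors described above more concrete, the sketch below builds a few of them by hand for every pair of nodes in a toy network. It is a base R illustration under stated assumptions and not the implementation used by econet: the characteristic x, the object names, and the use of absolute differences are all hypothetical choices.

```r
# Toy undirected network and a hypothetical individual characteristic x.
ids <- c("a", "b", "c", "d")
G <- matrix(c(0, 1, 1, 0,
              1, 0, 1, 0,
              1, 1, 0, 1,
              0, 0, 1, 0),
            nrow = 4, byrow = TRUE, dimnames = list(ids, ids))
x <- c(a = 1, b = 4, c = 2, d = 7)

# All unordered pairs (i, j).
pairs <- t(combn(ids, 2))
i <- pairs[, 1]
j <- pairs[, 2]

# Entry [i, j] of G %*% G counts the connections shared by i and j.
shared <- G %*% G
degree <- rowSums(G)

dyads <- data.frame(
  i = i,
  j = j,
  link        = G[cbind(i, j)],                     # outcome: is there a link?
  diff_x      = unname(abs(x[i] - x[j])),           # difference in characteristics
  shared      = shared[cbind(i, j)],                # number of shared connections
  diff_degree = unname(abs(degree[i] - degree[j]))  # difference in degree
)

print(dyads)
```

In this sketch, diff_x corresponds to the regressor of the "standard" specification, while shared and diff_degree correspond to the additional regressors described for the shared-connections and "degree" specifications.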

Value

A list containing the following objects:

  • A list of estimates, each one setting the decay parameter to zero and adding one of the centralities to the specification of formula. The last element adds all the selected centralities to formula and allows the decay parameter to differ from zero.

  • An object of class data.frame containing the computed centrality measures.

  • A list of first-step estimations used to correct the effect of centrality measures when the network is endogenous.

References

Battaglini M., V. Leone Sciabolazza, E. Patacchini, S. Peng (2020), "Econet: An R package for the Estimation of parameter-dependent centrality measures", Mimeo.

See Also

net_dep

Examples


# Load the package and the data
library(econet)
data("db_cosponsor")
data("G_alumni_111")
db_model_B <- db_cosponsor
G_model_B <- G_cosponsor_111
G_exclusion_restriction <- G_alumni_111
are_factors <- c("party", "gender", "nchair")
db_model_B[are_factors] <- lapply(db_model_B[are_factors], factor)

# Specify formula
f_model_B <- formula("les ~ gender + party + nchair")

# Specify starting values
starting <- c(alpha = 0.214094,
             beta_gender1 = -0.212706,
             beta_party1 = 0.478518,
             beta_nchair1 = 3.09234,
             beta_betweenness = 7.06287e-05,
             phi = 0.344787)

# Fit model
horse_model_B <- horse_race(formula = f_model_B,
              centralities = "betweenness",
              directed = TRUE, weighted = TRUE,
              data = db_model_B, G = G_model_B,
              model = "model_B", estimation = "NLLS",
              start.val = starting)

# Store and print results
summary(horse_model_B)
summary(horse_model_B, centrality = "betweenness")
horse_model_B$centrality

# WARNING: This toy example is provided only to test runtime execution.
# Please refer to the previous example for sensible calculations.
data("db_alumni_test")
data("G_model_A_test")
db_model <- db_alumni_test
G_model <- G_model_A_test
f_model <- formula("les ~ dw")
horse_model_test <- horse_race(formula = f_model, centralities = "betweenness",
                            directed = TRUE, weighted = FALSE, normalization = NULL,
                            data = db_model, unobservables = NULL, G = G_model,
                            model = "model_A", estimation = "NLLS",
                            start.val = c(alpha = -0.31055275,
                                          beta_dw = 1.50666982,
                                          beta_betweenness = 0.09666742,
                                          phi = 16.13035695))
summary(horse_model_test)

econet documentation built on April 28, 2022, 1:07 a.m.