simulate_selfreport_network: Simulate a self-reported network


simulate_selfreport_network    R Documentation

Simulate a self-reported network

Description

This function simulates a self-reported network, along with a 'true' network and a network of resource flows. It first simulates the true network using the simulate_sbm_plus_srm_network() function, and then simulates self-reports and observable resource flows over that true network. This allows the user to investigate how response biases, such as a non-zero false positive rate, affect network properties.

Usage

simulate_selfreport_network(
  N_id = 99,
  B = NULL,
  V = 3,
  groups = NULL,
  sr_mu = c(0, 0),
  sr_sigma = c(1, 1),
  sr_rho = 0.6,
  dr_mu = 0,
  dr_sigma = 1,
  dr_rho = 0.7,
  individual_predictors = NULL,
  dyadic_predictors = NULL,
  individual_effects = NULL,
  dyadic_effects = NULL,
  fpr_effects = NULL,
  rtt_effects = NULL,
  theta_effects = NULL,
  false_positive_rate = c(0.01, 0.02, 0),
  recall_of_true_ties = c(0.8, 0.6, 0.99),
  theta_mean = 0.125,
  fpr_sigma = c(0.3, 0.2, 0),
  rtt_sigma = c(0.5, 0.2, 0),
  theta_sigma = 0.2,
  N_responses = 2,
  N_periods = 12,
  flow_rate = rbeta(12, 3, 30),
  decay_curve = rev(1.5 * exp(-seq(1, 4, length.out = 12))),
  outcome_mode = "bernoulli",
  link_mode = "logit"
)

Arguments

N_id

Number of individuals.

B

List of matrices that hold intercept and offset terms, on the log-odds scale. The first matrix should be 1 x 1 with the value being the intercept term.

V

Number of blocking variables in B.

groups

Dataframe of the block IDs of each individual for each variable in B.

sr_mu

Mean vector for sender and receiver random effects. In most cases, this should be c(0,0).

sr_sigma

Standard deviation vector for sender and receiver random effects. The first element controls node-level variation in out-degree, the second in in-degree.

sr_rho

Correlation of sender-receiver effects (i.e., generalized reciprocity).

dr_mu

Mean of the dyadic random effects. In most cases, this should be 0 (the default).

dr_sigma

Standard deviation for dyadic random effects.

dr_rho

Correlation of dyad effects (i.e., dyadic reciprocity).
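
As an illustration of how sr_mu, sr_sigma, and sr_rho fit together, here is a minimal base-R sketch that draws correlated sender and receiver effects from a bivariate normal. It is an assumption about the general technique, not the package's internal code, and the object names (sr_cov, z, sr) are hypothetical.

```r
set.seed(1)
N_id     <- 99
sr_mu    <- c(0, 0)
sr_sigma <- c(1, 1)
sr_rho   <- 0.6

# Covariance matrix implied by the standard deviations and the correlation
sr_cor <- matrix(c(1, sr_rho, sr_rho, 1), nrow = 2, ncol = 2)
sr_cov <- diag(sr_sigma) %*% sr_cor %*% diag(sr_sigma)

# Draw N_id (sender, receiver) pairs using the Cholesky factor of sr_cov
z  <- matrix(rnorm(N_id * 2), nrow = N_id, ncol = 2)
sr <- sweep(z %*% chol(sr_cov), 2, sr_mu, "+")
colnames(sr) <- c("sender", "receiver")

# The sample correlation should be close to sr_rho
cor(sr[, "sender"], sr[, "receiver"])
```

The same construction, applied to the two directions of a dyad with dr_sigma and dr_rho, would give correlated dyadic effects.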

individual_predictors

An N_id by N_individual_parameters matrix of covariates.

dyadic_predictors

An N_id by N_id by N_dyadic_parameters array of covariates.

individual_effects

A 2 by N_individual_parameters matrix of slopes. The first row gives effects of focal characteristics (on out-degree). The second row gives effects of target characteristics (on in-degree).

dyadic_effects

An N_dyadic_parameters vector of slopes.
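
The shapes of the predictor and effect arguments have to agree with one another. The base-R sketch below builds a matching set and checks the dimensions; the covariate names (age, wealth, kinship) and all numeric values are hypothetical.

```r
set.seed(2)
N_id <- 30

# Two individual-level covariates: an N_id by 2 matrix
individual_predictors <- cbind(age = rnorm(N_id), wealth = rnorm(N_id))

# Row 1: focal (out-degree) slopes; row 2: target (in-degree) slopes
individual_effects <- matrix(c(1.2, 0.0,
                               0.3, 0.5),
                             nrow = 2, byrow = TRUE)

# One dyadic covariate: an N_id by N_id by 1 array
kinship <- matrix(rbinom(N_id * N_id, 1, 0.1), nrow = N_id, ncol = N_id)
dyadic_predictors <- array(kinship, dim = c(N_id, N_id, 1))
dyadic_effects <- c(0.8)

# Dimension checks implied by the argument descriptions
stopifnot(
  nrow(individual_predictors) == N_id,
  nrow(individual_effects) == 2,
  ncol(individual_predictors) == ncol(individual_effects),
  dim(dyadic_predictors)[3] == length(dyadic_effects)
)
```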

fpr_effects

A 3 by N_predictors matrix of slopes. The first row controls the effects of covariates on layer 1 responses, and the second row on layer 2 responses. The third row controls the effects of covariates on false positives in the observation network.

rtt_effects

A 3 by N_predictors matrix of slopes. The first row controls the effects of covariates on layer 1 responses, and the second row on layer 2 responses. The third row controls the effects of covariates on the recall of true ties in the observation network.

theta_effects

An N_predictors vector of slopes controlling the effects of covariates on name duplication from layer 1 to layer 2.

false_positive_rate

The baseline false positive rate. This should be supplied as a 3-vector, with each element controlling the false positive rate for a single layer. Support is on the unit interval. If covariates are centered, this can be loosely thought of as an average false positive rate.

recall_of_true_ties

The baseline recall rate of true ties. This should be supplied as a 3-vector, with each element controlling the true tie recall rate for a single layer. Support is on the unit interval. If covariates are centered, this can be loosely thought of as an average true tie recall rate.

theta_mean

The baseline probability of name duplication from layer 1 to layer 2. Scalar. Support is on the unit interval. If covariates are centered, this can be loosely thought of as an average name duplication probability.

fpr_sigma

Standard deviation 3-vector for false_positive_rate random effects. There should be one value for each layer.

rtt_sigma

Standard deviation 3-vector for recall_of_true_ties random effects. There should be one value for each layer.

theta_sigma

Standard deviation scalar for theta random effects.
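
To give a feel for how a baseline rate and its random-effect standard deviation interact, the sketch below assumes that individual deviations are added to the baseline on the logit scale and then mapped back to a probability. That is an assumption made for illustration; the package's internal parameterization may differ. The name fpr_by_person is hypothetical.

```r
set.seed(3)
N_id <- 99
false_positive_rate <- c(0.01, 0.02, 0)
fpr_sigma           <- c(0.3, 0.2, 0)

# Assumption: normal random effects are added on the logit scale, per layer
fpr_by_person <- sapply(seq_along(false_positive_rate), function(k) {
  plogis(qlogis(false_positive_rate[k]) + rnorm(N_id, mean = 0, sd = fpr_sigma[k]))
})

# One row per individual, one column per layer
dim(fpr_by_person)

# Spread of individual-level rates in each layer
apply(fpr_by_person, 2, range)
```

Under this assumption, a baseline of 0 with a standard deviation of 0 stays at exactly 0 for every individual, which matches the defaults for the third layer.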

N_responses

Number of self-report layers. Use 1 for single-sampled designs and 2 for double-sampled designs.

N_periods

Number of time-periods in which observed transfers are sampled.

flow_rate

A vector of length N_periods. Each element controls the probability that a true tie will result in a transfer in a given time period.

decay_curve

A vector of length N_periods. Each element controls the increment in the log-odds of recalling a true tie at time T, as a function of a transfer occurring in period t.
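
The default flow_rate and decay_curve can be inspected directly in base R. The sketch below evaluates both and, as an illustration only, simulates transfers over a single true tie. The summation into recall_boost is an assumption about how the increments combine, not a statement of the package's internals; transfers and recall_boost are hypothetical names.

```r
set.seed(4)
N_periods   <- 12
flow_rate   <- rbeta(N_periods, 3, 30)
decay_curve <- rev(1.5 * exp(-seq(1, 4, length.out = N_periods)))

# The default curve increases across periods, so the final period
# carries the largest increment
round(decay_curve, 3)

# Transfers over one true tie: one Bernoulli draw per period
transfers <- rbinom(N_periods, size = 1, prob = flow_rate)

# Assumption: each observed transfer adds its period's increment
# to the log-odds of recalling the tie
recall_boost <- sum(transfers * decay_curve)
recall_boost
```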

outcome_mode

Outcome mode: must be "bernoulli".

link_mode

Link mode: can be "logit" or "probit".

Value

A list of data formatted for use in Stan models.

Examples

## Not run: 
library(igraph)
V = 1            # One blocking variable
G = 3            # Three categories in this variable
N_id = 100       # Number of people

clique = sample(1:3, N_id, replace=TRUE)
B = matrix(-8, nrow=G, ncol=G)
diag(B) = -4.5 # Block matrix

B[1,3] = -5.9
B[3,2] = -6.9

A = simulate_selfreport_network(N_id = N_id, B=list(B=B), V=V, 
                         groups=data.frame(clique=factor(clique)),
                         individual_predictors=matrix(rnorm(N_id, 0, 1), nrow=N_id, ncol=1),
                         individual_effects=matrix(c(1.7, 0.3),ncol=1, nrow=2),
                         sr_sigma = c(1.4, 0.8), sr_rho = 0.5,
                         dr_sigma = 1.2, dr_rho = 0.8,
                         false_positive_rate = c(0.00, 0.00, 0.00), 
                         recall_of_true_ties = c(0.7, 0.3, 0.99),
                         theta_mean = 0.0, 
                         fpr_sigma = c(0.0, 0.0, 0.0), 
                         rtt_sigma = c(0.5, 0.2, 0.0),
                         theta_sigma = 0.0,
                         N_responses = 2,
                         N_periods = 1, 
                         flow_rate = rbeta(1, 3, 30),
                         decay_curve = rep(0,1)
                         )

par(mfrow=c(1,3))

# True Network
Net = graph_from_adjacency_matrix(A$true_network, mode = c("directed"))
V(Net)$color = c("turquoise4","gray13", "goldenrod3")[A$group_ids$clique]

plot(Net, edge.arrow.size =0.1, edge.curved = 0.3, vertex.label=NA, vertex.size = 5)


# Reported - out transfers
Net = graph_from_adjacency_matrix(A$reporting_network[,,1], mode = c("directed"))
V(Net)$color = c("turquoise4","gray13", "goldenrod3")[A$group_ids$clique]

plot(Net, edge.arrow.size =0.1, edge.curved = 0.3, vertex.label=NA, vertex.size = 5)


# Reported - in transfers
Net = graph_from_adjacency_matrix(A$reporting_network[,,2], mode = c("directed"))
V(Net)$color = c("turquoise4","gray13", "goldenrod3")[A$group_ids$clique]

plot(Net, edge.arrow.size =0.1, edge.curved = 0.3, vertex.label=NA, vertex.size = 5)

## End(Not run)
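
Beyond plotting, the effect of response bias can be summarized by comparing a reporting layer to the true network. The sketch below uses toy matrices so that it runs without the package; in practice, A$true_network and A$reporting_network[,,1] from the example above would take their place. The helper tie_metrics() is hypothetical and not part of STRAND.

```r
set.seed(5)
N <- 20

# Toy 'true' network with no self-ties
true_network <- matrix(rbinom(N * N, 1, 0.15), nrow = N, ncol = N)
diag(true_network) <- 0

# Toy reports: each true tie is kept with probability 0.7, no false positives
reported <- true_network * matrix(rbinom(N * N, 1, 0.7), nrow = N, ncol = N)

# Hypothetical helper: empirical recall and false positive rate, off-diagonal only
tie_metrics <- function(truth, report) {
  off_diag <- row(truth) != col(truth)
  truth_v  <- truth[off_diag]
  report_v <- report[off_diag]
  c(recall              = sum(report_v == 1 & truth_v == 1) / sum(truth_v == 1),
    false_positive_rate = sum(report_v == 1 & truth_v == 0) / sum(truth_v == 0))
}

tie_metrics(true_network, reported)
```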


ctross/STRAND documentation built on Dec. 15, 2024, 6:02 a.m.