neha: A function to select edges and prepare data for NEHA...

neha R Documentation

A function to select edges and prepare data for NEHA estimation

Description

Infers the edges among nodes from event cascade data and prepares the data and formulas needed to estimate a NEHA model with logistic regression.

Usage

neha(eha_data, node, time, event, cascade, covariates, ncore = 2, negative = F)

Arguments

eha_data

A data frame with one observation for each node at risk of experiencing the event, at each at-risk time point, in each cascade. Each node is assumed to experience the event at most once per cascade.

node

A character string giving the name of the variable that holds the node id.

time

A character string giving the name of the variable that holds the time, as integers.

event

A character string giving the name of the variable that holds the binary 0/1 indicator of event occurrence.

cascade

A character string giving the name of the variable that holds the cascade id.

covariates

A character vector of covariate names to include in the NEHA model, excluding the intercept.

ncore

An integer giving the number of cores to use in parallel computation.

negative

Logical (experimental), indicating whether to go through a phase of negative tie inference.
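
The layout expected for eha_data can be illustrated with a small hand-built data frame. This is only a sketch in base R: the node names, event times, and covariate values are made up, and it assumes that a node leaves the risk set after the time point at which it experiences the event, which is the usual discrete-time event history convention rather than something this page states.

```r
# Toy event-history data: 3 nodes, 4 time points, 2 cascades (all values hypothetical)
full_grid <- expand.grid(
  node = paste0("n", 1:3),
  time = 1:4,
  cascade = 1:2,
  stringsAsFactors = FALSE
)

# Hypothetical event times; NA means the node never experiences the event
event_times <- data.frame(
  node = rep(paste0("n", 1:3), times = 2),
  cascade = rep(1:2, each = 3),
  event_time = c(2, 4, NA, 1, NA, 3),
  stringsAsFactors = FALSE
)

toy <- merge(full_grid, event_times, by = c("node", "cascade"))

# Assumption: keep only at-risk rows, i.e. up to and including the event time
toy <- toy[is.na(toy$event_time) | toy$time <= toy$event_time, ]

# Binary 0/1 indicator of event occurrence
toy$event <- as.integer(!is.na(toy$event_time) & toy$time == toy$event_time)
toy$covariate <- runif(nrow(toy))
toy$event_time <- NULL
toy <- toy[order(toy$cascade, toy$node, toy$time), ]

# Each node should experience the event at most once per cascade
events_per_node <- aggregate(event ~ node + cascade, data = toy, FUN = sum)
stopifnot(all(events_per_node$event <= 1))
```

A data frame shaped like toy would then be passed with node = "node", time = "time", event = "event", cascade = "cascade", and covariates = "covariate".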

Value

A list with five elements.

  • a_est - The estimated value of alpha, the edge effect decay parameter.

  • edges - A character vector giving the names of the inferred edges, in the form 'sender_receiver'.

  • data_for_neha - A dataframe that can be used to find the NEHA estimates using a function for logistic regression.

  • combined_formula - NEHA formula to use if you want a single gamma estimate.

  • separate_formula - NEHA formula to use if you want a separate gamma estimate for each edge.

Examples

library(neha)
## Not run: 
# Simulate data for NEHA
# basic data parameters
cascades <- 50
nodes <- 20
times <- 30
nties <- 25
# generate dataframe
time <- sort(rep(1:times,nodes))
node <- paste("n",as.character(rep(1:nodes,times)),sep="")
intercept <- rep(1,length(time))
covariate <- runif(length(time))-2
data_for_sim <- data.frame(
  time, node, intercept, covariate, stringsAsFactors = FALSE
)

# regression parameters
beta <- cbind(c(-2.5,.25))
rownames(beta) <- c("intercept","covariate")

# generate network effects
possible_ties <- rbind(t(combn(1:nodes,2)),t(combn(1:nodes,2))[,c(2,1)])
possible_ties <- paste(
  paste("n",possible_ties[,1],sep=""),
  paste("n",possible_ties[,2],sep=""),
  sep="_"
)
ties <- sample(possible_ties,nties)
gamma <- cbind(rep(1.5,length(ties)))
rownames(gamma) <- ties

# initiate simulated data object
simulated_data <- NULL

# generate the data one cascade at a time
for (i in seq_len(cascades)) {
  simulated_cascade <-
    simulate_neha_discrete(
      x = data_for_sim,
      node = "node",
      time = "time",
      beta = beta,
      gamma = gamma,
      a = -6
    )
  simulated_cascade <-
    data.frame(simulated_cascade,
               cascade = i,
               stringsAsFactors = FALSE)
  simulated_data <- rbind(simulated_data, simulated_cascade)
}

# infer edges
neha_results <-
 neha(
   simulated_data,
   node = "node",
   time = "time",
   event = "event",
   cascade = "cascade",
   covariates = "covariate",
   ncore = 3
 )

# estimate NEHA logistic regression
neha_estimate <-
  glm(neha_results$combined_formula,
      data = neha_results$data_for_neha,
      family = binomial)
summary(neha_estimate)

## End(Not run)
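
Because the simulation above knows the true ties, one natural follow-up is to compare them with the inferred edges. The sketch below is not part of the package: it uses two hand-written character vectors as stand-ins for ties and neha_results$edges, relies only on the 'sender_receiver' naming described under Value, and assumes node names contain no underscore.

```r
# Hypothetical stand-ins for `ties` and `neha_results$edges`
true_ties <- c("n1_n2", "n3_n4", "n5_n6")
inferred_edges <- c("n1_n2", "n5_n6", "n7_n8")

# Share of inferred edges that are true ties, and share of true ties recovered
precision <- mean(inferred_edges %in% true_ties)
recall <- mean(true_ties %in% inferred_edges)

# Split 'sender_receiver' names into a two-column edge list
edge_list <- do.call(rbind, strsplit(inferred_edges, "_", fixed = TRUE))
colnames(edge_list) <- c("sender", "receiver")
edge_list
```

With the real objects, replacing true_ties with ties and inferred_edges with neha_results$edges should give the same two summaries for the simulated network.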



desmarais-lab/dnehm documentation built on Jan. 17, 2025, 11:57 a.m.