netinf: Infer latent diffusion network


Description

Infer a network of diffusion ties from a set of cascades. Each cascade is defined by pairs of node ids and infection times.

Usage

netinf(cascades, trans_mod = "exponential", n_edges = NULL,
  p_value_cutoff = NULL, params = NULL, quiet = FALSE,
  trees = FALSE)

Arguments

cascades

an object of class cascade containing node and cascade information. See as_cascade_long and as_cascade_wide for details.

trans_mod

character, indicating the choice of model: "exponential", "rayleigh" or "log-normal".

n_edges

integer, number of edges to infer. Leave unspecified if using p_value_cutoff.

p_value_cutoff

numeric, in the interval (0, 1). If specified, one edge is added per iteration until the Vuong test for edge addition reaches the p-value cutoff or the maximum possible number of edges is reached. Leave unspecified if using n_edges to set the number of edges explicitly.

params

numeric, parameters for the diffusion model. If left unspecified, reasonable parameters are inferred from the data. See details for how to specify parameters for the different distributions.

quiet

logical, should progress output be suppressed?

trees

logical, should the inferred cascade trees be returned? Note that this changes the structure of the function output. See section Value for details.
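The two ways of controlling network size described above (n_edges and p_value_cutoff) can be sketched as follows. This is a minimal illustration, assuming the NetworkInference package is installed and using the example cascades data shipped with it. The cutoff of 0.1 is an arbitrary value chosen for illustration, and params = 1 simply mirrors the Examples section.

```r
library(NetworkInference)

# Example cascades shipped with the package (also used in the Examples section).
data(cascades)

# Option 1: fix the number of edges to infer explicitly.
fixed <- netinf(cascades, trans_mod = "exponential", n_edges = 5,
                params = 1, quiet = TRUE)

# Option 2: leave n_edges unspecified and let the Vuong test decide when to
# stop adding edges. The cutoff 0.1 is an arbitrary illustration.
tested <- netinf(cascades, trans_mod = "exponential", p_value_cutoff = 0.1,
                 params = 1, quiet = TRUE)

# Compare how many edges each stopping rule produced.
nrow(fixed)
nrow(tested)
```

Only one of the two arguments is set in each call, as the argument descriptions recommend.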

Details

The algorithm is described in detail in Gomez-Rodriguez et al. (2010). Additional information can be found on the netinf website (http://snap.stanford.edu/netinf/).

For very large data sets, or whenever higher performance is required, a faster pure C++ implementation is available in the Stanford Network Analysis Project (SNAP). The software can be downloaded at http://snap.stanford.edu/netinf/.

Value

Returns the inferred diffusion network as an edgelist in an object of class diffnet and data.frame. The first column contains the sender, the second column the receiver node. The third column contains the improvement in fit from adding the edge that is represented by the row. The output additionally has the following attributes:

If the argument trees is set to TRUE, the output is a list with the first element being the data.frame described above, and the second element being the trees in edge-list form in a single data.frame.
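A short sketch of how the output might be unpacked when trees = TRUE, assuming the NetworkInference package is installed. The list elements are accessed by position because this page describes only their order, and the variable names network and cascade_trees are chosen here for illustration.

```r
library(NetworkInference)

# Example cascades shipped with the package.
data(cascades)

res <- netinf(cascades, trans_mod = "exponential", n_edges = 5,
              params = 1, quiet = TRUE, trees = TRUE)

# First element: the inferred network as an edgelist
# (sender, receiver, improvement in fit, ...).
network <- res[[1]]

# Second element: the inferred cascade trees in edge-list form.
cascade_trees <- res[[2]]

head(network)
head(cascade_trees)
```

With trees = FALSE (the default), the same call returns the edgelist directly rather than a list.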

References

M. Gomez-Rodriguez, J. Leskovec, A. Krause. Inferring Networks of Diffusion and Influence. The 16th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2010.

Examples

# Data already in cascades format:
data(cascades)
out <- netinf(cascades, trans_mod = "exponential", n_edges = 5, params = 1)

# Starting with a data frame in long format:
df <- simulate_rnd_cascades(10, n_nodes = 20)
cascades2 <- as_cascade_long(df, node_names = unique(df$node_name))
out <- netinf(cascades2, trans_mod = "exponential", n_edges = 5, params = 1)

NetworkInference documentation built on May 1, 2019, 9:20 p.m.