generate_net: Simulating networks from preferential attachment and fitness...

generate_net    R Documentation

Simulating networks from preferential attachment and fitness mechanisms

Description

This function generates networks from the General Temporal model, a generative temporal network model that includes many well-known models such as the Erdős–Rényi model, the Barabási-Albert model or the Bianconi-Barabási model as special cases. This function also includes some flexible mechanisms to vary the number of new nodes and new edges at each time-step in order to generate realistic networks.

Usage

generate_net (N                 = 1000   , 
             num_seed           = 2      , 
             multiple_node      = 1      , 
             specific_start     = NULL   ,
             m                  = 1      ,
             prob_m             = FALSE  ,
             increase           = FALSE  , 
             log                = FALSE  , 
             no_new_node_step   = 0      ,
             m_no_new_node_step = m      ,
             custom_PA          = NULL   ,
             mode               = 1      , 
             alpha              = 1      , 
             beta               = 2      , 
             sat_at             = 100    ,
             offset             = 1      ,
             mode_f             = "gamma", 
             s                  = 10       )

Arguments

The parameters can be divided into four groups.

The first group specifies basic properties of the network:

N

Integer. Total number of nodes in the network (including the nodes in the seed graph). Default value is 1000.

num_seed

Integer. The number of nodes of the seed graph (the initial state of the network). The seed graph is a cycle. Default value is 2.

multiple_node

Positive integer. The number of new nodes at each time-step. Default value is 1.

specific_start

Positive integer. If specific_start is specified, then all the time-steps from time-step 1 to specific_start are grouped to become the initial time-step in the final output. This option is useful when we want to create a network with a large initial network that follows a scale-free degree distribution. Default value is NULL.

The second group specifies the number of new edges at each time-step:

m

Positive integer. The number of edges of each new node. Default value is 1.

prob_m

Logical. Indicates whether we fix the number of edges of each new node as a constant, or let it follow a Poisson distribution. If prob_m == TRUE, the number of edges of each new node follows a Poisson distribution. The mean of this distribution depends on the values of increase and log. Default value is FALSE.

increase

Logical. Indicates whether we increase the mean of the Poisson distribution over time. If increase == FALSE, the mean is fixed at m. If increase == TRUE, the way the mean increases depends on the value of log. Default value is FALSE.

log

Logical. Indicates how to increase the mean of the Poisson distribution. If log == TRUE, the mean increases logarithmically with the number of current nodes. If log == FALSE, the mean increases linearly with the number of current nodes. Default value is FALSE.

no_new_node_step

Non-negative integer. The number of time-steps in which no new node is added, while new edges are added between existing nodes. Default value is 0, i.e., new nodes are always added at each time-step.

m_no_new_node_step

Positive integer. The number of new edges in the no-new-node steps. Default value is equal to m. Note that the number of new edges in the no-new-node steps is not affected by the parameters increase or prob_m; this number is always the constant specified by m_no_new_node_step.
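
The way m, prob_m, increase and log combine can be sketched in base R as follows. This is only an illustration, not PAFit's internal code: edge_mean and new_edge_count are hypothetical helper names, and the two growth forms used when increase == TRUE (m times the number of current nodes, and m times its logarithm) are assumptions, since this page only states that the mean grows linearly or logarithmically with the number of current nodes.

```r
# Hypothetical helpers, not part of PAFit.

# Mean of the Poisson distribution for the number of edges of a new node.
# ASSUMPTION: the exact scaling under increase = TRUE is not given on this
# page; m * n_current and m * log(n_current) are placeholders.
edge_mean <- function(m, increase = FALSE, log = FALSE, n_current = 1) {
  if (!increase) return(m)            # mean fixed at m
  if (log) {
    m * base::log(n_current)          # assumed logarithmic growth
  } else {
    m * n_current                     # assumed linear growth
  }
}

# Number of edges attached to one new node.
new_edge_count <- function(m, prob_m = FALSE, increase = FALSE, log = FALSE,
                           n_current = 1) {
  if (!prob_m) return(m)              # constant number of edges
  rpois(1, lambda = edge_mean(m, increase, log, n_current))
}

set.seed(1)
new_edge_count(m = 3)                 # constant case
new_edge_count(m = 3, prob_m = TRUE)  # one Poisson draw with mean 3
```

In this sketch increase and log only matter when prob_m == TRUE, which matches the descriptions above; the no-new-node steps are left out because they always use the constant m_no_new_node_step.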

The third group of parameters specifies the preferential attachment function:

custom_PA

Numeric vector. This is the user-input PA function: A_0, A_1,..., A_K. If custom_PA is specified, then mode is ignored, and we grow the network using the PA function custom_PA. Degrees greater than K will have attachment value A_K. Default value is NULL.

mode

Integer. Indicates the parametric attachment function to be used in generating the network. If mode == 1, the attachment function is A_k = k^\alpha. If mode == 2, the attachment function is A_k = min(k, sat_at)^\alpha. If mode == 3, the attachment function is A_k = \alpha log (k)^\beta + 1. Default value is 1.

alpha

Numeric. If mode == 1, this is the attachment exponent in the attachment function A_k = k^\alpha. If mode == 2, this is the attachment exponent in the attachment function A_k = min(k, sat_at)^\alpha. If mode == 3, this is the \alpha in the attachment function A_k = \alpha log (k)^\beta + 1. Default value is 1.

beta

Numeric. This is the \beta in the attachment function A_k = \alpha log (k)^\beta + 1 (used when mode == 3). Default value is 2.

sat_at

Integer. This is the saturation position sat_at in the attachment function A_k = min(k, sat_at)^\alpha (used when mode == 2). Default value is 100.

offset

Numeric. The attachment value of degree 0. Default value is 1.
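
The attachment functions in this group can be written out in a few lines of base R. The sketch below is an illustration under stated assumptions, not PAFit's internal code: attachment_value is a hypothetical helper name, the mode 3 formula is read as \alpha (log k)^\beta + 1 (the form given under alpha and beta), and degree 0 is assumed to receive the value offset in all three parametric modes.

```r
# Hypothetical helper, not part of PAFit: attachment value A_k for a vector
# of degrees k, following the formulas in this group of parameters.
attachment_value <- function(k, mode = 1, alpha = 1, beta = 2,
                             sat_at = 100, offset = 1, custom_PA = NULL) {
  if (!is.null(custom_PA)) {
    # custom_PA holds A_0, ..., A_K; degrees above K reuse A_K.
    K <- length(custom_PA) - 1
    return(custom_PA[pmin(k, K) + 1])
  }
  A <- numeric(length(k))
  pos <- k > 0
  kp <- k[pos]
  A[pos] <- switch(as.character(mode),
    "1" = kp^alpha,                       # A_k = k^alpha
    "2" = pmin(kp, sat_at)^alpha,         # A_k = min(k, sat_at)^alpha
    "3" = alpha * log(kp)^beta + 1,       # A_k = alpha * (log k)^beta + 1
    stop("mode must be 1, 2 or 3"))
  A[!pos] <- offset                       # ASSUMPTION: degree 0 takes offset
  A
}

attachment_value(0:5)                                # linear PA
attachment_value(0:5, mode = 2, sat_at = 3)          # saturates at degree 3
attachment_value(0:5, custom_PA = c(1, 2, 5))        # degrees > 2 reuse A_2
```

When custom_PA is supplied, the sketch ignores mode and offset, since the vector already contains A_0.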

The final group of parameters specifies the distribution from which node fitnesses are generated:

mode_f

String. Possible values: "gamma", "log_normal" or "power_law". This parameter indicates the true distribution of node fitnesses: "gamma" = gamma distribution, "log_normal" = log-normal distribution, "power_law" = power-law (Pareto) distribution. Default value is "gamma".

s

Non-negative numeric. The inverse variance parameter. The mean of the distribution is kept at 1 and the variance is 1/s (since node fitnesses are only meaningful up to scale). This is achieved by setting the shape and rate parameters of the gamma distribution to s; setting the mean and standard deviation in log-scale of the log-normal distribution to -1/2*log (1/s + 1) and (log (1/s + 1))^{0.5}; and setting the shape and scale parameters of the Pareto distribution to (s+1)^{0.5} + 1 and (s+1)^{0.5}/((s+1)^{0.5} + 1). If s is 0, all node fitnesses \eta are fixed at 1 (i.e., the Barabási-Albert model). Default value is 10.
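
The parameter settings above can be checked numerically. The following base R sketch is not taken from PAFit: fitness_params, fitness_moments and sample_fitness are hypothetical helper names, and the Pareto draws use inverse-transform sampling because base R has no Pareto generator. It plugs the stated parameters into the standard mean and variance formulas of each distribution, which should give 1 and 1/s for any s > 0.

```r
# Hypothetical helpers, not part of PAFit.

# Distribution parameters as described for each value of mode_f.
fitness_params <- function(mode_f, s) {
  switch(mode_f,
    gamma      = list(shape = s, rate = s),
    log_normal = list(meanlog = -0.5 * log(1 / s + 1),
                      sdlog   = sqrt(log(1 / s + 1))),
    power_law  = list(shape = sqrt(s + 1) + 1,
                      scale = sqrt(s + 1) / (sqrt(s + 1) + 1)),
    stop("mode_f must be 'gamma', 'log_normal' or 'power_law'"))
}

# Theoretical mean and variance implied by those parameters (s > 0).
fitness_moments <- function(mode_f, s) {
  p <- fitness_params(mode_f, s)
  switch(mode_f,
    gamma      = c(mean = p$shape / p$rate,
                   var  = p$shape / p$rate^2),
    log_normal = c(mean = exp(p$meanlog + p$sdlog^2 / 2),
                   var  = (exp(p$sdlog^2) - 1) * exp(2 * p$meanlog + p$sdlog^2)),
    power_law  = c(mean = p$shape * p$scale / (p$shape - 1),
                   var  = p$scale^2 * p$shape /
                          ((p$shape - 1)^2 * (p$shape - 2))))
}

# Draw n fitness values; s = 0 fixes every fitness at 1.
sample_fitness <- function(n, mode_f = "gamma", s = 10) {
  if (s == 0) return(rep(1, n))
  p <- fitness_params(mode_f, s)
  switch(mode_f,
    gamma      = rgamma(n, shape = p$shape, rate = p$rate),
    log_normal = rlnorm(n, meanlog = p$meanlog, sdlog = p$sdlog),
    power_law  = p$scale * runif(n)^(-1 / p$shape))  # inverse transform
}

sapply(c("gamma", "log_normal", "power_law"), fitness_moments, s = 10)
```

Larger values of s therefore concentrate the fitnesses around 1, and s = 0 is treated as the limiting case with no fitness variation at all.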

Value

The output is a PAFit_net object, which is a list containing the following four fields:

graph

a three-column matrix, where each row contains information of one edge, in the form of (from_id, to_id, time_stamp). from_id is the id of the source, to_id is the id of the destination.

type

a string indicating whether the network is "directed" or "undirected".

PA

a numeric vector containing the true PA function.

fitness

fitness values of nodes in the network. The name of each value is the ID of the node.
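
To show how these fields might be used, the base R sketch below works on a small hand-made list that mimics the layout described above. The object net_like and its values are made up for illustration and are not output of generate_net; columns are accessed by position because column names are not stated on this page.

```r
# Mock object with the four fields described above (values are invented).
net_like <- list(
  graph = matrix(c(1, 2, 0,    # each row: from_id, to_id, time_stamp
                   2, 1, 0,
                   3, 1, 1,
                   4, 3, 2),
                 ncol = 3, byrow = TRUE),
  type    = "directed",
  PA      = c(1, 1, 2, 3),
  fitness = c("1" = 1, "2" = 1, "3" = 1, "4" = 1)
)

# In-degree of every node, using the names of the fitness vector as node IDs
# so that nodes with no incoming edge are counted as 0.
node_ids  <- names(net_like$fitness)
in_degree <- table(factor(net_like$graph[, 2], levels = node_ids))

# Number of edges recorded at each time-step.
edges_per_step <- table(net_like$graph[, 3])

# Edges present up to and including time-step 1.
early_edges <- net_like$graph[net_like$graph[, 3] <= 1, , drop = FALSE]

in_degree
edges_per_step
nrow(early_edges)
```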

Author(s)

Thong Pham thongphamthe@gmail.com

See Also

For subsequent estimation procedures, see get_statistics.

For simpler functions to generate networks from well-known models, see generate_BA, generate_ER, generate_BB and generate_fit_only.

Examples

library("PAFit")
# Generate a network from the original BA model with alpha = 1, N = 100, m = 1
net <- generate_net(N = 100, m = 1, mode = 1, alpha = 1, s = 0)
str(net)
plot(net)

thongphamthe/PAFit documentation built on March 30, 2024, 4:14 p.m.