A function to estimate the attachment function and node fitness in temporal complex networks

Description

From a PAFitData object, which contains summary statistics computed from the dataset, PAFit estimates the attachment function A_k and node fitness η_i by penalized log-likelihood maximization. It also calculates confidence intervals for A_k and η_i. To estimate the attachment function in isolation (fixing η_i = 1), set only_PA = TRUE. To estimate node fitness in isolation (fixing either A_k = k or A_k = 1), set only_f = TRUE.

Usage

PAFit(data, only_PA = FALSE, only_f = FALSE,
      mode_f = c("Linear_PA", "Constant_PA", "Log_linear"),
      true_A = NULL, true_f = NULL, s = 1, auto_lambda = TRUE,
      r = 0.01, lambda = 1, weight_PA_mode = 1, auto_stop = TRUE,
      stop_cond = 10^-6, iteration = 20, max_iter = 1000,
      debug = FALSE, alpha_start = 1, normalized_f = FALSE,
      interpolate = TRUE, ...)

Arguments

data

An object of class "PAFitData" containing the summary statistics computed from the data by the function GetStatistics.

only_PA

Logical. TRUE means that the attachment function A_k is estimated in isolation (fixing η_i = 1). Default is FALSE.

only_f

Logical. TRUE means that the node fitness η_i is estimated in isolation, with A_k fixed to the form selected by mode_f. Default is FALSE.

mode_f

String. Only effective when only_f == TRUE. If mode_f == "Linear_PA" then A_k = k for k ≥ 1 and A_0 = C. If mode_f == "Constant_PA" then A_k = 1 for all k. If mode_f == "Log_linear" then A_k = k^α for k ≥ 1 and A_0 = C. The values of α and C are estimated by MLE. The default value is "Linear_PA".
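
As a rough illustration of the three forms listed for mode_f, the base R sketch below evaluates A_k for given values of α and C. The helper attachment_values is hypothetical and not part of PAFit. In PAFit itself, α and C are estimated by MLE rather than passed in.

```r
# Hypothetical helper (not a PAFit function): evaluate A_k for k = 0, ..., k_max
# under each of the three mode_f settings. alpha and C are supplied by hand here;
# PAFit estimates them from the data.
attachment_values <- function(k_max,
                              mode_f = c("Linear_PA", "Constant_PA", "Log_linear"),
                              alpha = 1, C = 1) {
  mode_f <- match.arg(mode_f)
  k <- 0:k_max
  A <- switch(mode_f,
    Linear_PA   = ifelse(k >= 1, k, C),        # A_k = k for k >= 1, A_0 = C
    Constant_PA = rep(1, length(k)),           # A_k = 1 for all k
    Log_linear  = ifelse(k >= 1, k^alpha, C)   # A_k = k^alpha for k >= 1, A_0 = C
  )
  data.frame(k = k, A = A)
}

attachment_values(4, "Log_linear", alpha = 0.5, C = 0.1)
```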

true_A

Numeric vector. If true_A is supplied, only the node fitness is estimated.

true_f

Numeric vector. If true_f is supplied, only the PA function is estimated.

s

Numeric. The regularization parameter s for node fitness. Default value is 1.

auto_lambda

Logical. If auto_lambda == TRUE, lambda will be determined automatically from the data. Default is TRUE.

r

Numeric. The regularization parameter r for the PA function. Default value is 0.01.

lambda

Numeric. The strength of the regularization for the PA function. Ignored when auto_lambda == TRUE. lambda == 0 means no regularization for A. Default value is 1.

weight_PA_mode

Integer. Indicates how the regularization terms for A_k are weighted. If weight_PA_mode == 0, the regularization term for A_k is weighted by the total number of edges connected to degree k nodes. If weight_PA_mode == 1, the regularization terms have uniform weights. Default value is 1.

auto_stop

Logical. Indicates whether the algorithm stops automatically. Default is TRUE.

stop_cond

Numeric. If auto_stop == TRUE, the iterative algorithm stops when abs(h(ii) - h(ii + 1)) / (abs(h(ii)) + 1) < stop_cond, where h(ii) is the value of the objective function (posterior probability in log-scale) at iteration ii. We recommend choosing stop_cond no larger than 10^(-(number of digits of h) - 2), so that when the algorithm stops, the increase in posterior probability is less than 1% of the current posterior probability. Default is 10^-6.
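
The stopping rule can be checked by hand with a few lines of base R. This is only a sketch: has_converged and recommended_stop_cond are hypothetical helpers, not PAFit functions, the values of h are made up, and "number of digits of h" is assumed here to mean the number of digits in the integer part of abs(h).

```r
# Hypothetical helper: the relative-change criterion described for stop_cond
has_converged <- function(h_prev, h_next, stop_cond = 10^-6) {
  abs(h_prev - h_next) / (abs(h_prev) + 1) < stop_cond
}

# Hypothetical helper: the recommended upper bound 10^(-(number of digits of h) - 2),
# assuming "digits" counts the integer part of abs(h)
recommended_stop_cond <- function(h) {
  n_digits <- floor(log10(abs(h))) + 1
  10^(-n_digits - 2)
}

# Made-up objective values at three consecutive iterations
h <- c(-12345.678, -12345.600, -12345.599)

has_converged(h[1], h[2])      # relative change is about 6e-6
has_converged(h[2], h[3])      # relative change is about 8e-8
recommended_stop_cond(h[3])    # h has 5 integer digits
```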

iteration

Numeric. The number of iterations. Ignored if auto_stop == TRUE. Default value is 20.

max_iter

Numeric. The maximum number of iterations. Regardless of other settings, the algorithm will stop once the number of iterations reaches this threshold. Default value is 1000.

debug

Logical. If debug == TRUE, the value of the objective function h is printed out at each step. Default is FALSE.

alpha_start

Numeric. The starting value for α when the model A_k = k^α is used. Default value is 1.

normalized_f

Logical. Indicates whether the estimated values of f are normalized after estimation. Default value is FALSE.

interpolate

Logical. Indicates whether missing values of the estimated A_k are filled in by interpolation. The interpolation, if performed, is a linear regression on log-scale. Default value is TRUE.
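
To give a feel for what interpolation by linear regression on log-scale means, the base R sketch below fills missing A_k values from a fit of log A_k against log k. The data are made up and the code is not taken from PAFit; the package's own interpolation may differ in detail, for example in which points enter the regression.

```r
# Made-up estimates of A_k with two missing values
df <- data.frame(k = 1:8,
                 A = c(1.0, 2.1, NA, 3.9, 5.2, NA, 7.1, 7.8))

# Linear regression on log-scale; lm() drops rows with NA by default
fit <- lm(log(A) ~ log(k), data = df)

# Predict on log-scale for the missing degrees, then transform back
missing <- is.na(df$A)
df$A[missing] <- exp(predict(fit, newdata = data.frame(k = df$k[missing])))

df
```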

...

Value

An object of class "PAFit", which is a list. Some important fields of this object:

A

The estimated attachment function

k

The corresponding degree

var_A

Variances of the estimated A

linear_fit

Result of fitting the log-linear model log A_k = α log k to the estimated A_k

alpha

The estimated attachment exponent of the log-linear model A_k = k^α

weight_of_A

The number of A_k values in each bin

var_logA

Variances of log(A)

upper_A

The upper value of the 2-sigma confidence interval of A

lower_A

The lower value of the 2-sigma confidence interval of A

center_k

The logarithmic center of the bins

theta

Attachment value of the bins (before converting back to A_k)

upper_bin

The upper value of the 2-sigma confidence interval of theta

lower_bin

The lower value of the 2-sigma confidence interval of theta

f

The estimated node fitnesses η

var_f

Variances of the estimated node fitnesses

upper_f

The upper value of the 2-sigma confidence interval of η

lower_f

The lower value of the 2-sigma confidence interval of η

objective_value

Values of the objective function h (posterior probability in log-scale) recorded at each iteration

Author(s)

Thong Pham thongpham@thongpham.net

References

1. Pham, T., Sheridan, P. & Shimodaira, H. (2016). Nonparametric Estimation of the Preferential Attachment Function in Complex Networks: Evidence of Deviations from Log Linearity, Proceedings of ECCS 2014, 141-153 (Springer International Publishing) (http://dx.doi.org/10.1007/978-3-319-29228-1_13).

2. Pham, T., Sheridan, P. & Shimodaira, H. (2015). PAFit: A Statistical Method for Measuring Preferential Attachment in Temporal Complex Networks. PLoS ONE 10(9): e0137796. doi:10.1371/journal.pone.0137796 (http://dx.doi.org/10.1371/journal.pone.0137796).

3. Pham, T., Sheridan, P. & Shimodaira, H. (2016). Joint Estimation of Preferential Attachment and Node Fitness in Growing Complex Networks. Scientific Reports 6, Article number: 32558. doi:10.1038/srep32558 (www.nature.com/articles/srep32558).

Examples

library("PAFit")
# Simulate a network, summarize it, then estimate A_k and node fitness jointly
data   <- GenerateNet(N = 100, m = 1, mode = 1, alpha = 1, shape = 5, rate = 5)
stats  <- GetStatistics(data$graph)
result <- PAFit(stats, stop_cond = 10^-3)
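
The only_PA and only_f options described above might be used along the following lines. This is a sketch that reuses the simulated network from the example and relies only on the arguments and result fields documented on this page; it assumes the PAFit package is installed.

```r
library("PAFit")

# Same simulated network as in the example above
data  <- GenerateNet(N = 100, m = 1, mode = 1, alpha = 1, shape = 5, rate = 5)
stats <- GetStatistics(data$graph)

# Attachment function only (node fitness fixed at 1)
result_pa <- PAFit(stats, only_PA = TRUE, stop_cond = 10^-3)
head(result_pa$k)   # degrees
head(result_pa$A)   # estimated attachment function

# Node fitness only, with linear preferential attachment A_k = k
result_f <- PAFit(stats, only_f = TRUE, mode_f = "Linear_PA", stop_cond = 10^-3)
summary(result_f$f) # estimated node fitnesses
```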