only_F_estimate: Estimating node fitnesses in isolation
In PAFit: Generative Mechanism Estimation in Temporal Complex Networks

only_F_estimate

R Documentation

Description

This function estimates node fitnesses \eta_i assuming either A_k = k (i.e. linear preferential attachment) or A_k = 1 (i.e. no preferential attachment). The method has one hyper-parameter, s, which controls the prior of \eta_i. The function first performs cross-validation to select the optimal s, then estimates \eta_i with the full data (Ref. 1).
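
To make the role of s concrete, the sketch below assumes that the prior of \eta_i is a Gamma distribution with shape and rate both equal to s, so that its mean is 1 and its variance is 1/s (this matches the "inverse variance" reading of `s` used in the Examples, but it is stated here as an assumption). It uses only base R; the helper `prior_summary` is hypothetical and not part of PAFit.

```r
# Assumption: the prior of eta_i is Gamma(shape = s, rate = s),
# which has mean 1 and variance 1 / s.
prior_summary <- function(s) {
  c(mean = s / s, variance = s / s^2)
}

set.seed(1)
for (s in c(1, 10, 100)) {
  # Compare the theoretical variance with the variance of simulated draws.
  draws <- rgamma(1e5, shape = s, rate = s)
  cat(sprintf("s = %5.0f  theoretical var = %.4f  sample var = %.4f\n",
              s, prior_summary(s)["variance"], var(draws)))
}
```

Under this assumption, a larger s pulls the fitnesses more strongly towards 1, while a small s lets them vary widely across nodes.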

Usage

only_F_estimate(net_object                             , 
               net_stat    = get_statistics(net_object), 
               p           = 0.75                      ,
               stop_cond   = 10^-8                     , 
               model_A     = "Linear"                  ,
               ...)

Arguments

`net_object`	An object of class `PAFit_net` that contains the network.
`net_stat`	An object of class `PAFit_data` which contains summarized statistics needed in estimation. This object is created by the function `get_statistics`. The default value is `get_statistics(net_object)`.
`p`	Numeric. This is the ratio of the number of new edges in the learning data to that of the full data. The data is then divided into two parts: learning data and testing data based on `p`. The learning data is used to learn the node fitnesses and the testing data is then used in cross-validation. Default value is `0.75`.
`stop_cond`	Numeric. The iterative algorithm stops when `abs(h(ii) - h(ii + 1)) / (abs(h(ii)) + 1) < stop_cond`, where `h(ii)` is the value of the objective function at iteration `ii`. We recommend choosing `stop_cond` to be at most `10^(- number of digits of h - 2)`, so that when the algorithm stops, the increase in posterior probability is less than 1% of the current posterior probability. The default, `10^-8`, is sufficient for most applications.
`model_A`	String. Indicates which attachment function `A_k` we assume: `"Linear"`: We assume `A_k = k`, i.e. the Bianconi-Barabási model (Ref. 2). `"Constant"`: We assume `A_k = 1`, i.e. the Caldarelli model (Ref. 3).
`...`	Other arguments to pass to the underlying algorithm.
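
The stopping rule described for `stop_cond` can be sketched in base R as follows. This is only an illustration of the relative-change criterion, not the package's internal code: the helper `has_converged` and the toy objective sequence `h` are hypothetical.

```r
# Relative-change criterion from the description of `stop_cond`.
has_converged <- function(h_prev, h_next, stop_cond = 10^-8) {
  abs(h_prev - h_next) / (abs(h_prev) + 1) < stop_cond
}

# Toy objective values (made up): h rises geometrically towards 5.
h <- 5 - 0.5^(0:60)

# Walk through the iterations until the criterion is met.
ii <- 1
while (ii < length(h) && !has_converged(h[ii], h[ii + 1])) {
  ii <- ii + 1
}
cat("stopped at iteration", ii, "\n")
```

The `+ 1` in the denominator keeps the ratio well defined when the objective value is close to zero.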

Value

Outputs a Full_PAFit_result object, which is a list containing the following fields:

cv_data: a CV_Data object which contains the cross-validation data. Normally the user does not need to pay attention to this data.
cv_result: a CV_Result object which contains the cross-validation result. Normally the user does not need to pay attention to this data.
estimate_result: this is a PAFit_result object which contains the estimated node fitnesses and their confidence intervals. In particular, the important fields are:
- shape: this is the selected value for the hyper-parameter s.
- g: the number of bins used.
- f: the estimated node fitnesses.
- var_f: the estimated variance of \eta_i.
- upper_f: the estimated upper value of the interval of two standard deviations around \eta_i.
- lower_f: the estimated lower value of the interval of two standard deviations around \eta_i.
- objective_value: values of the objective function over iterations in the final run with the full data.
- diverge_zero: a logical value indicating whether the algorithm diverged in the final run with the full data.
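
As a minimal sketch of how these fields might be read, the code below builds a small stand-in list with the same field names as `estimate_result` and tabulates the estimates with their intervals. All numbers are made up for illustration, the node index is assumed to be the position in `f`, and flagging intervals that exclude 1 assumes the fitnesses are centred on 1; with a real fit, `result` would come from `only_F_estimate`.

```r
# Stand-in for the list returned by only_F_estimate(); values are made up.
result <- list(
  estimate_result = list(
    shape   = 10,
    f       = c(0.62, 1.05, 1.90, 0.98),
    lower_f = c(0.41, 0.80, 1.35, 0.55),
    upper_f = c(0.93, 1.38, 2.67, 1.74)
  )
)

est <- result$estimate_result

# One row per node: estimate and its two-standard-deviation interval.
fitness_table <- data.frame(
  node    = seq_along(est$f),
  f       = est$f,
  lower_f = est$lower_f,
  upper_f = est$upper_f
)

# Flag nodes whose interval excludes 1 (assumed centre of the fitnesses).
fitness_table$differs_from_1 <-
  fitness_table$lower_f > 1 | fitness_table$upper_f < 1

# Show nodes from highest to lowest estimated fitness.
print(fitness_table[order(-fitness_table$f), ])
```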

Author(s)

Thong Pham thongphamthe@gmail.com

References

1. Pham, T., Sheridan, P. & Shimodaira, H. (2016). Joint Estimation of Preferential Attachment and Node Fitness in Growing Complex Networks. Scientific Reports 6, Article number: 32558 (doi:10.1038/srep32558).

2. Bianconi, G. & Barabási, A.-L. (2001). Competition and multiscaling in evolving networks. Europhys. Lett., 54, 436 (doi:10.1209/epl/i2001-00260-6).

3. Caldarelli, G., Capocci, A., De Los Rios, P. & Muñoz, M.A. (2002). Scale-Free Networks from Varying Vertex Intrinsic Fitness. Phys. Rev. Lett., 89, 258702 (doi:10.1103/PhysRevLett.89.258702).

Examples

## Not run: 
  library("PAFit")
  set.seed(1)
  # size of initial network = 100
  # number of new nodes at each time-step = 100
  # Ak = k; inverse variance of the distribution of node fitnesses = 10
  net        <- generate_BB(N        = 1000 , m             = 50 , 
                            num_seed = 100  , multiple_node = 100,
                            s        = 10)
                            
  net_stats  <- get_statistics(net)
  
  # estimate node fitnesses in isolation, assuming Ak = k
  result     <- only_F_estimate(net, net_stats)
 
  # plot the estimated node fitnesses and true node fitnesses
  plot(result, net_stats, true = net$fitness, plot = "true_f")
  
## End(Not run)

PAFit documentation built on June 22, 2024, 11:06 a.m.