only_F_estimate: Estimating node fitnesses in isolation


only_F_estimate    R Documentation


Description

This function estimates node fitnesses \eta_i assuming either A_k = k (i.e. linear preferential attachment) or A_k = 1 (i.e. no preferential attachment). The method has a hyper-parameter s. It first performs cross-validation to select the optimal value of s for the prior of \eta_i, then estimates \eta_i with the full data (Ref. 1).

Usage

only_F_estimate(net_object                             , 
               net_stat    = get_statistics(net_object), 
               p           = 0.75                      ,
               stop_cond   = 10^-8                     , 
               model_A     = "Linear"                  ,
               ...)

Arguments

net_object

An object of class PAFit_net that contains the network.

net_stat

An object of class PAFit_data which contains the summarized statistics needed in estimation. This object is created by the function get_statistics. The default value is get_statistics(net_object).

p

Numeric. This is the ratio of the number of new edges in the learning data to that of the full data. The data is then divided into two parts: learning data and testing data based on p. The learning data is used to learn the node fitnesses and the testing data is then used in cross-validation. Default value is 0.75.
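
As a rough illustration of the split described above, the sketch below picks the time-step at which the cumulative share of new edges first reaches p. The helper split_by_edge_ratio and the input vector new_edges are hypothetical and not part of PAFit; the splitting logic used inside the package may differ.

```r
# Hypothetical helper (not a PAFit function): choose the last time-step of the
# learning data so that it holds at least a fraction p of all new edges.
split_by_edge_ratio <- function(new_edges, p = 0.75) {
  stopifnot(is.numeric(new_edges), all(new_edges >= 0), p > 0, p < 1)
  cum_ratio <- cumsum(new_edges) / sum(new_edges)
  # first time-step whose cumulative ratio reaches p
  t_split <- which(cum_ratio >= p)[1]
  n_steps <- length(new_edges)
  list(
    learning = seq_len(t_split),
    testing  = if (t_split < n_steps) seq(t_split + 1, n_steps) else integer(0),
    ratio    = cum_ratio[t_split]
  )
}

# Toy input: number of new edges observed at each of 8 time-steps
new_edges <- c(10, 10, 10, 10, 10, 10, 10, 10)
parts <- split_by_edge_ratio(new_edges, p = 0.75)
parts$learning  # time-steps used to learn the node fitnesses
parts$testing   # time-steps held out for cross-validation
```

With equal numbers of new edges per time-step, p = 0.75 puts the first six time-steps in the learning data and the last two in the testing data.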

stop_cond

Numeric. The iterative algorithm stops when abs(h(ii) - h(ii + 1)) / (abs(h(ii)) + 1) < stop_cond, where h(ii) is the value of the objective function at iteration ii. We recommend choosing stop_cond to be at most 10^(- number of digits of h - 2), so that when the algorithm stops, the increase in posterior probability is less than 1% of the current posterior probability. Default is 10^-8. This threshold is good enough for most applications.
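
The stopping rule can be written out directly. The following is a minimal sketch of the relative-change test only; the function has_converged and the objective values in h are made up for illustration and are not taken from PAFit.

```r
# Relative-change stopping rule from the description of stop_cond:
# stop when abs(h_prev - h_next) / (abs(h_prev) + 1) < stop_cond
has_converged <- function(h_prev, h_next, stop_cond = 10^-8) {
  abs(h_prev - h_next) / (abs(h_prev) + 1) < stop_cond
}

# Toy objective values over four iterations (illustrative only)
h <- c(-1500.0, -1200.0, -1199.99, -1199.99)
has_converged(h[1], h[2])  # large relative change, so keep iterating
has_converged(h[3], h[4])  # no change between iterations, so stop
```

The "+ 1" in the denominator keeps the ratio well defined when the objective value is close to zero.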

model_A

String. Indicates which attachment function A_k we assume:

  • "Linear": We assume A_k = k, i.e. the Bianconi-Barabási model (Ref. 2).

  • "Constant": We assume A_k = 1, i.e. the Caldarelli model (Ref. 3).
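
To make the two choices concrete, the sketch below computes the probability that each node receives the next edge, taking that probability to be proportional to A_k times \eta_i as in the model of Ref. 1. The helper attach_prob, the degrees, and the fitness values are hypothetical and only serve as an illustration.

```r
# Hypothetical helper (not a PAFit function): attachment probabilities
# proportional to A_k * eta_i under the two supported choices of A_k.
attach_prob <- function(degree, fitness, model_A = c("Linear", "Constant")) {
  model_A <- match.arg(model_A)
  # A_k = k for "Linear", A_k = 1 for "Constant"
  A <- if (model_A == "Linear") degree else rep(1, length(degree))
  w <- A * fitness
  w / sum(w)
}

degree  <- c(1, 2, 5)        # made-up node degrees
fitness <- c(2.0, 1.0, 0.4)  # made-up node fitnesses

attach_prob(degree, fitness, "Linear")    # degree and fitness both matter
attach_prob(degree, fitness, "Constant")  # only fitness matters
```

In this toy input the high-degree node has low fitness, so under "Linear" the three weights coincide, while under "Constant" the fittest node is the most likely to receive the edge.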

...

Other arguments to pass to the underlying algorithm.

Value

Outputs a Full_PAFit_result object, which is a list containing the following fields:

  • cv_data: a CV_Data object which contains the cross-validation data. Normally the user does not need to pay attention to this data.

  • cv_result: a CV_Result object which contains the cross-validation result. Normally the user does not need to pay attention to this data.

  • estimate_result: this is a PAFit_result object which contains the estimated node fitnesses and their confidence intervals. In particular, the important fields are:

    • shape: this is the selected value for the hyper-parameter s.

    • g: the number of bins used.

    • f: the estimated node fitnesses.

    • var_f: the estimated variance of \eta_i.

    • upper_f: the estimated upper value of the interval of two standard deviations around \eta_i.

    • lower_f: the estimated lower value of the interval of two standard deviations around \eta_i.

    • objective_value: values of the objective function over iterations in the final run with the full data.

    • diverge_zero: a logical value indicating whether the algorithm diverged in the final run with the full data.
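
The nested fields listed above can be reached with ordinary list indexing. The sketch below uses a hand-built stand-in list, mock_result, with made-up numbers and node names rather than a real Full_PAFit_result, so it illustrates only the documented field names and not the exact contents returned by the package.

```r
# Stand-in for the returned object; all values here are invented.
mock_result <- list(
  estimate_result = list(
    shape        = 10,
    f            = c(a = 1.20, b = 0.85, c = 1.02),
    lower_f      = c(a = 0.90, b = 0.60, c = 0.75),
    upper_f      = c(a = 1.50, b = 1.10, c = 1.29),
    diverge_zero = FALSE
  )
)

est <- mock_result$estimate_result

# Check the divergence flag before trusting the estimates
if (est$diverge_zero) warning("the final run diverged")

# Collect the estimates and their intervals in one table
fitness_table <- data.frame(
  node  = names(est$f),
  f     = unname(est$f),
  lower = unname(est$lower_f),
  upper = unname(est$upper_f),
  stringsAsFactors = FALSE
)

# Sort nodes from highest to lowest estimated fitness
fitness_table <- fitness_table[order(-fitness_table$f), ]
fitness_table
```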

Author(s)

Thong Pham thongphamthe@gmail.com

References

1. Pham, T., Sheridan, P. & Shimodaira, H. (2016). Joint Estimation of Preferential Attachment and Node Fitness in Growing Complex Networks. Scientific Reports 6, Article number: 32558 (doi:10.1038/srep32558).

2. Bianconi, G. & Barabási, A. (2001). Competition and multiscaling in evolving networks. Europhys. Lett., 54, 436 (doi:10.1209/epl/i2001-00260-6).

3. Caldarelli, G., Capocci, A., De Los Rios, P. & Muñoz, M.A. (2002). Scale-Free Networks from Varying Vertex Intrinsic Fitness. Phys. Rev. Lett., 89, 258702 (doi:10.1103/PhysRevLett.89.258702).

See Also

See get_statistics for how to create the summarized statistics needed in this function.

See joint_estimate for the method to jointly estimate the attachment function A_k and node fitnesses \eta_i.

Examples

## Not run: 
  library("PAFit")
  set.seed(1)
  # size of initial network = 100
  # number of new nodes at each time-step = 100
  # Ak = k; inverse variance of the distribution of node fitnesses = 10
  net        <- generate_BB(N        = 1000 , m             = 50 , 
                            num_seed = 100  , multiple_node = 100,
                            s        = 10)
                            
  net_stats  <- get_statistics(net)
  
  # estimate node fitnesses in isolation, assuming Ak = k
  result     <- only_F_estimate(net, net_stats)
 
  # plot the estimated node fitnesses and true node fitnesses
  plot(result, net_stats, true = net$fitness, plot = "true_f")
  
## End(Not run)

PAFit documentation built on June 22, 2024, 11:06 a.m.