proxfun: Vertex proximity indexes

Description Usage Arguments Details Value References Examples

Description

General function for calculating several types of vertex proximities in a graph.

Usage

1
2
3
4
5
6
7
8
9
proxfun(graph, ...)

## S3 method for class 'igraph'
proxfun(graph, method, v1 = NULL, v2 = v1,
  value = c("matrix", "edgelist", "graph"), ...)

## S3 method for class 'network'
proxfun(graph, method, v1 = NULL, v2 = v1,
  value = c("matrix", "edgelist", "graph"), ...)

Arguments

graph

an object of class igraph or network

...

additional arguments specific for a selected measure

method

single character, the method to be used, see Details

v1, v2

vectors of vertices between which similarity will be calculated. Character vector is interpreted as vertex names. Numeric vector as vertex ids.

value

a character string giving a type of the object that should be returned. This must be one of "matrix", "graph" or "edgelist", with default "matrix".

Details

This function calculates vertex proximities in graph graph with the selected method. The graph has to be undirected and connected. Some of the methods support computation only for selected vertices, which should be more efficient when needed. Supplying vertex IDs or names (if present in the graph) to v1 and v2 will calculate proximities of v1 x v2.

The following methods are available (see vignette("proxfun", package="linkprediction") for more details and formal definitions):

aa

Adamic-Adar index (Adamic and Adar 2001). Additional arguments are passed to igraph::similarity.

act

Average Commute Time (Fouss, Pirotte, Renders, and Saerens 2007)

act_n

Normalized Average Commute Time (Fouss et al. 2007)

cn

Common Neighbours

cos

Cosine similarity (Salton and McGill 1986)

cos_l

cosine similarity on L+ (Fouss et al. 2007)

dist

graph distance

hdi

Hub Depressed Index (Ravasz, Somera, Mongru, Oltvai, and Barabasi 2002)

hpi

Hub Promoted Index (Ravasz et al. 2002)

jaccard

Jaccard coefficient (Jaccard 1912)

katz

Katz index (Katz 1953)

l

L+ directly (Fouss et al. 2007)

lhn_local

Leicht-Holme-Newman Index (Leicht, Holme, and Newman 2006)

lhn_global

Leicht-Holme-Newman Index global version (Leicht et al. 2006)

lp

Local Path Index (Zhou, Lu, and Zhang 2009)

mf

Matrix Forest Index (Chebotarev P. Yu. 1997)

pa

preferential attachment (Barabasi and Albert 1999)

ra

resource allocation (Zhou et al. 2009)

rwr

random walk with restart (Brin and Page 1998). Additional argument alpha (default value 0.3) is the probability that the walk will restart after a step.

sor

sorensen index/dice coefficient (Sorensen 1948)

Value

If value = "matrix" a matrix with length(v1) rows and length(v2) with rownames and colnames equal to v1 and v2 respectively. If value = "edgelist" a data.frame with three columns:

from

ID of a start node of an edge

to

ID of an end node of an edge

value

similarity score for that edge

Edges with similarity score 0 are omitted. If value = "graph" an object of class igraph or network, depending on the class of input graph. Returned graph has the same structure (graph and node attributes, etc.) as the input graph, except for edges - original edges are skipped, and new edges with positive similarity score are added. Edged attribute "weight" indicates similarity score.

References

Adamic L and Adar E (2003). "Friends and Neighbors on the Web." Social Networks, 25, pp. 211-230 doi: 10.1016/S0378-8733(03)00009-1.

Barabasi A and Albert R (1999). "Emergence of Scaling in Random Networks." Science, 286(5439), pp. 509-512.

Brin S and Page L (1998). "The anatomy of a large-scale hypertextual Web search engine ." _Computer Networks and ISDN Systems _, 30(1-7), pp. 107 - 117. Proceedings of the Seventh International World Wide Web Conference .

Chebotarev P. Yu. SEV (1997). "The matrix-forest theorem and measuring relations in small social groups ." _Automation and Remote Control _, 58(9), pp. 1505-1514.

Fouss F, Pirotte A, Renders J and Saerens M (2007). "Random-Walk Computation of Similarities Between Nodes of a Graph with Application to Collaborative Recommendation." IEEE Transactions on Knowledge and Data Engineering, 19(3), pp. 355-369 doi: 10.1109/TKDE.2007.46.

Jaccard P (1912). "The Distribution of the Flora in the Alpine Zone 1" New Phytologist, 11(2), pp. 37-50.

Katz L (1953). "A new status index derived from sociometric analysis." Psychometrika, 18(1), pp. 39-43.

Leicht EA, Holme P and Newman MEJ (2006). "Vertex similarity in networks." Phys. Rev. E, 73(2), pp. 026120 doi: 10.1103/PhysRevE.73.026120.

Ravasz E, Somera AL, Mongru DA, Oltvai ZN and Barabasi A (2002). "Hierarchical Organization of Modularity in Metabolic Networks." Science, 297(5586), pp. 1551-1555.

Salton G and McGill MJ (1986). Introduction to Modern Information Retrieval. McGraw-Hill, Inc., New York, NY, USA.

Sorensen T (1948). "A Method of Establishing Groups of Equal Amplitude in Plant Sociology Based on Similarity of Species Content and Its Application to Analyses of the Vegetation on Danish Commons." Biologiske Skrifter, 5, pp. 1-34.

Zhou T, Lu L and Zhang Y (2009). "Predicting missing links via local information." The European Physical Journal B, 71(4), pp. 623-630 doi: 10.1140/epjb/e2009-00335-8.

Examples

1
2
3
4
5
6
7
8
9
if(requireNamespace("igraph")) {
  g <- igraph::make_graph(~ A -- C:D:E -- B -- F -- G:H -- I)
  
# Adamic-Adar
proxfun(g, method="aa", value="edgelist")
  
# Random Walk with Restart
proxfun(g, method="rwr", value="edgelist")
}

Example output

Loading required namespace: igraph
   from to      value
1     1  1 0.86196238
2     2  1 0.31185997
3     3  1 0.31185997
4     4  1 0.31185997
5     5  1 0.24181711
6     6  1 0.06171445
7     7  1 0.02384112
8     8  1 0.02384112
9     9  1 0.01668879
10    1  2 0.31185997
11    2  2 0.75641140
12    3  2 0.15641140
13    4  2 0.15641140
14    5  2 0.29610259
15    6  2 0.07346959
16    7  2 0.02724700
17    8  2 0.02724700
18    9  2 0.01907290
19    1  3 0.31185997
20    2  3 0.15641140
21    3  3 0.75641140
22    4  3 0.15641140
23    5  3 0.29610259
24    6  3 0.07346959
25    7  3 0.02724700
26    8  3 0.02724700
27    9  3 0.01907290
28    1  4 0.31185997
29    2  4 0.15641140
30    3  4 0.15641140
31    4  4 0.75641140
32    5  4 0.29610259
33    6  4 0.07346959
34    7  4 0.02724700
35    8  4 0.02724700
36    9  4 0.01907290
37    1  5 0.24181711
38    2  5 0.29610259
39    3  5 0.29610259
40    4  5 0.29610259
41    5  5 0.85164744
42    6  5 0.22187815
43    7  5 0.08816350
44    8  5 0.08816350
45    9  5 0.06171445
46    1  6 0.06171445
47    2  6 0.07346959
48    3  6 0.07346959
49    4  6 0.07346959
50    5  6 0.22187815
51    6  6 0.82225961
52    7  6 0.31764996
53    8  6 0.31764996
54    9  6 0.22235497
55    1  7 0.02384112
56    2  7 0.02724700
57    3  7 0.02724700
58    4  7 0.02724700
59    5  7 0.08816350
60    6  7 0.31764996
61    7  7 0.81515495
62    8  7 0.21515495
63    9  7 0.36060847
64    1  8 0.02384112
65    2  8 0.02724700
66    3  8 0.02724700
67    4  8 0.02724700
68    5  8 0.08816350
69    6  8 0.31764996
70    7  8 0.21515495
71    8  8 0.81515495
72    9  8 0.36060847
73    1  9 0.01668879
74    2  9 0.01907290
75    3  9 0.01907290
76    4  9 0.01907290
77    5  9 0.06171445
78    6  9 0.22235497
79    7  9 0.36060847
80    8  9 0.36060847
81    9  9 0.85242593

linkprediction documentation built on May 1, 2019, 9:58 p.m.