birankpy/README.md

API documentation of birankpy

PageRank

pagerank(adj, d=0.85, max_iter=200, tol=0.0001, verbose=False)

Return the PageRank of the nodes in a graph using power iteration. This function takes a sparse matrix as input directly, avoiding the overhead of converting the network to a networkx Graph object and back.

Input:

parameter | type | note
----------|------|-----
adj | scipy.sparse matrix | Adjacency matrix of the graph
d | float | Damping factor
max_iter | int | Maximum number of iterations
tol | float | Error tolerance used to check convergence
verbose | boolean | Whether to print iteration information

Output:

type | note
-----|-----
numpy.ndarray | The PageRank values
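As a rough illustration of what power iteration on a sparse adjacency matrix looks like, here is a minimal sketch. It is not the library's implementation: the name `pagerank_sketch`, the uniform handling of dangling nodes, and the L1 convergence check are all assumptions made for this example. Only the parameter names `adj`, `d`, `max_iter`, and `tol` mirror the signature above.

```python
import numpy as np
import scipy.sparse as sp


def pagerank_sketch(adj, d=0.85, max_iter=200, tol=1e-4):
    """Power-iteration PageRank on a sparse adjacency matrix (illustrative only)."""
    adj = sp.csr_matrix(adj, dtype=float)
    n = adj.shape[0]
    out_deg = np.asarray(adj.sum(axis=1)).ravel()
    # Inverse out-degree; dangling nodes (out-degree 0) keep an all-zero row.
    inv_deg = np.divide(1.0, out_deg, out=np.zeros(n), where=out_deg > 0)
    # Transposed row-normalized matrix: entry (j, i) is the probability of i -> j.
    M = (sp.diags(inv_deg) @ adj).T.tocsr()
    dangling = out_deg == 0

    rank = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        # Assumption: rank held by dangling nodes is spread uniformly over all nodes.
        dangling_mass = rank[dangling].sum() / n
        new_rank = d * (M @ rank + dangling_mass) + (1.0 - d) / n
        converged = np.abs(new_rank - rank).sum() < tol
        rank = new_rank
        if converged:
            break
    return rank


# Directed 3-cycle 0 -> 1 -> 2 -> 0: by symmetry every node gets the same score.
cycle = sp.csr_matrix(np.array([[0, 1, 0], [0, 0, 1], [1, 0, 0]]))
scores = pagerank_sketch(cycle)
```

Because the matrix stays sparse throughout, each iteration costs one sparse matrix-vector product, which is the point of skipping the conversion to a networkx Graph.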

BiRank

birank(W, normalizer='HITS', alpha=0.85, beta=0.85, max_iter=200, tol=0.0001, verbose=False)

Calculate the PageRank of bipartite networks directly. See the paper at https://ieeexplore.ieee.org/abstract/document/7572089/ for details. Different normalizers yield very different results, and more study is needed to decide which one is appropriate.

Input:

parameter | type | note
----------|------|-----
W | scipy.sparse matrix | Adjacency matrix of the bipartite network, with dimension D*P
normalizer | string | Which normalizer to use; see the paper for details
alpha | float | Damping factor for the rows
beta | float | Damping factor for the columns
max_iter | int | Maximum number of iterations
tol | float | Error tolerance used to check convergence
verbose | boolean | Whether to print iteration information

Output:

variable | type | note
----------|------|-----
d | numpy.ndarray | The BiRank values for the rows
p | numpy.ndarray | The BiRank values for the columns
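To make the roles of `alpha`, `beta`, and the normalizer more concrete, the sketch below runs a BiRank-style iteration with the symmetric degree normalization described in the linked paper. This is an assumption-laden illustration rather than the library's code: the function name `birank_sketch`, the uniform query vectors, and the choice of symmetric normalization (instead of the default `'HITS'` option) are all choices made for this example.

```python
import numpy as np
import scipy.sparse as sp


def birank_sketch(W, alpha=0.85, beta=0.85, max_iter=200, tol=1e-4):
    """BiRank-style iteration with symmetric normalization (illustrative only)."""
    W = sp.csr_matrix(W, dtype=float)
    n_rows, n_cols = W.shape
    row_deg = np.asarray(W.sum(axis=1)).ravel()
    col_deg = np.asarray(W.sum(axis=0)).ravel()

    # deg ** -0.5, leaving isolated nodes (degree 0) at scale 0.
    row_scale = np.zeros(n_rows)
    row_scale[row_deg > 0] = row_deg[row_deg > 0] ** -0.5
    col_scale = np.zeros(n_cols)
    col_scale[col_deg > 0] = col_deg[col_deg > 0] ** -0.5
    S = (sp.diags(row_scale) @ W @ sp.diags(col_scale)).tocsr()

    # Assumption: uniform prior (query) vectors on both sides.
    d0 = np.full(n_rows, 1.0 / n_rows)
    p0 = np.full(n_cols, 1.0 / n_cols)
    d, p = d0.copy(), p0.copy()
    for _ in range(max_iter):
        d_new = alpha * (S @ p) + (1.0 - alpha) * d0        # update the rows
        p_new = beta * (S.T @ d_new) + (1.0 - beta) * p0    # update the columns
        err = np.abs(d_new - d).sum() + np.abs(p_new - p).sum()
        d, p = d_new, p_new
        if err < tol:
            break
    return d, p


# 2 row nodes and 3 column nodes; the middle column is linked to both rows.
W_demo = sp.csr_matrix(np.array([[1, 1, 0], [0, 1, 1]]))
d_demo, p_demo = birank_sketch(W_demo)
```

In this toy network the two rows are interchangeable, so they receive equal scores, while the shared middle column outranks the other two columns.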

UnipartiteNetwork

UnipartiteNetwork(self)

Class for handling unipartite networks using scipy's sparse matrices. Designed for large networks, but its functionality is limited.

set_adj_matrix(self, id_df, W, index_col=None)

Set the adjacency matrix of the network.

Input:

parameter | type | note
----------|------|-----
id_df | pandas.DataFrame | The mapping between nodes and indices
W | scipy.sparse matrix | The adjacency matrix of the network; the node order in id_df has to match W
index_col | string | Column name of the index
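The requirement that the node order in `id_df` match `W` is easy to violate, so here is one possible way to build the two together from a small edge list. The column names `source`, `target`, `node`, and `idx` are hypothetical, and the snippet only prepares the inputs; it does not call the library.

```python
import numpy as np
import pandas as pd
import scipy.sparse as sp

# Hypothetical edge list between named nodes.
edges = pd.DataFrame({
    "source": ["a", "b", "c", "a"],
    "target": ["b", "c", "a", "c"],
})

# One row per node; "idx" is the row/column position of that node in W.
nodes = pd.unique(edges[["source", "target"]].to_numpy().ravel())
id_df = pd.DataFrame({"node": nodes, "idx": np.arange(len(nodes))})

# Translate node labels to positions using the same mapping stored in id_df.
lookup = dict(zip(id_df["node"], id_df["idx"]))
rows = edges["source"].map(lookup).to_numpy()
cols = edges["target"].map(lookup).to_numpy()
W = sp.coo_matrix(
    (np.ones(len(edges)), (rows, cols)), shape=(len(nodes), len(nodes))
).tocsr()
```

A frame and matrix built this way share one mapping by construction, so they could then be passed as `id_df` and `W`, with `index_col` naming the `idx` column.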

generate_pagerank(self, **kwargs)

Generate the PageRank values for the network using pagerank(). The parameters are the same as for pagerank().

BipartiteNetwork

BipartiteNetwork(self)

Class for handling bipartite networks using scipy's sparse matrices. Designed for large networks, but its functionality is limited.

load_edgelist(self, edgelist_path, top_col, bottom_col, weight_col='None', sep=',')

Method to load an edgelist.

Inputs:

parameter | type | note
----------|------|-----
edgelist_path | string | The path to the edgelist file
top_col | string | Column of the top nodes
bottom_col | string | Column of the bottom nodes
weight_col | string | Column of the edge weights
sep | string | The separator used in the edgelist file

Suppose the bipartite network has D top nodes and P bottom nodes. The edgelist file should have the format similar to the example:

top | bottom | weight
----|--------|-------
t1 | b1 | 1
t1 | b2 | 1
t2 | b1 | 2
... | ... | ...
tD | bP | 1

The edgelist file needs at least two columns for the top nodes and bottom nodes. An optional column can carry the edge weight. You need to specify the columns in the method parameters. The network is represented by a D*P dimensional matrix.
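The mapping from an edgelist file to a D*P matrix can be sketched as follows. This is only an illustration of the data layout under a few assumptions: the in-memory CSV stands in for a file on disk, the column names are taken from the example table, and `pandas.factorize` is one convenient way to number the nodes, not necessarily what the library does internally.

```python
import io

import pandas as pd
import scipy.sparse as sp

# Stand-in for an edgelist file on disk; column names follow the example table.
csv_text = """top,bottom,weight
t1,b1,1
t1,b2,1
t2,b1,2
"""
df = pd.read_csv(io.StringIO(csv_text), sep=",")

# factorize gives each distinct label an integer code in order of appearance.
top_codes, top_labels = pd.factorize(df["top"])
bottom_codes, bottom_labels = pd.factorize(df["bottom"])

# Rows are the D top nodes, columns are the P bottom nodes.
W = sp.coo_matrix(
    (df["weight"].to_numpy(dtype=float), (top_codes, bottom_codes)),
    shape=(len(top_labels), len(bottom_labels)),
).tocsr()
dense = W.toarray()
```

Here D = 2 and P = 2, and the missing edge between `t2` and `b2` shows up as a zero in the matrix.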

set_edgelist(self, df, top_col, bottom_col, weight_col=None)

Method to set the edgelist.

Inputs:

parameter | type | note
----------|------|-----
df | pandas.DataFrame | The edgelist with at least two columns
top_col | string | Column of the edgelist dataframe for top nodes
bottom_col | string | Column of the edgelist dataframe for bottom nodes
weight_col | string | Column of the edgelist dataframe for edge weights

The edgelist should be represented by a dataframe. The dataframe needs at least two columns for the top nodes and bottom nodes. An optional column can carry the edge weight. You need to specify the columns in the method parameters.

unipartite_projection(self, on)

Project the bipartite network to one side of the nodes to generate a unipartite network.

Input:

parameter | type | note
----------|------|-----
on | string | Name of the column to project the network on

Output:

type | note
-----|-----
UnipartiteNetwork | The projected unipartite network

If projected on top nodes, the resulting adjacency matrix has dimension: D*D. If projected on bottom nodes, the resulting adjacency matrix has dimension: P*P.
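The dimensions above follow from a standard way of projecting a bipartite adjacency matrix, namely multiplying it by its own transpose. Whether the library uses exactly this product, and how it treats the diagonal, is not stated here, so the snippet below should be read as a sketch of the idea only.

```python
import numpy as np
import scipy.sparse as sp

# Bipartite adjacency with D = 2 top nodes (rows) and P = 3 bottom nodes (columns).
W = sp.csr_matrix(np.array([[1, 1, 0],
                            [0, 1, 1]], dtype=float))

# Top nodes are linked when they share a bottom node: result is D x D.
top_projection = W @ W.T
# Bottom nodes are linked when they share a top node: result is P x P.
bottom_projection = W.T @ W
```

With this product, off-diagonal entries count shared neighbors, and the diagonal holds each node's own (squared-weight) degree, which some analyses zero out before ranking.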

generate_birank(self, **kwargs)

Generate the BiRank values for the top and bottom nodes simultaneously using birank(). The parameters are the same as for birank().

Output:

variable | type | note
---------|------|-----
top_df | pandas.DataFrame | BiRank values for the top nodes
bottom_df | pandas.DataFrame | BiRank values for the bottom nodes



BrianAronson/birankr documentation built on Nov. 13, 2021, 1:25 a.m.