dataset_graphsage: GraphSage Datasets
In rdinnager/rspektral: What the Package Does (One Line, Title Case)

Loads one of the datasets (PPI or Reddit) used in Hamilton & Ying (2017).

The PPI dataset (originally Stark et al. (2006)) is intended for inductive node classification. It uses positional gene sets, motif gene sets and immunological signatures as features, and gene ontology sets as labels.

The Reddit dataset is a graph of Reddit posts made in September 2014. The label of each node is the community that the post belongs to. The graph is built by sampling 50 large communities, and two nodes are connected if the same user commented on both posts. Node features are obtained by concatenating the average GloVe CommonCrawl vectors of the title and comments, the post's score, and the number of comments.

The train, test, and validation splits are returned as binary masks.

dataset_graphsage(dataset_name, max_degree = -1L, normalize_features = TRUE)

`dataset_name`	Name of the dataset to load (`'ppi'` or `'reddit'`).
`max_degree`	Integer. If positive, edges are subsampled so that each node has at most the specified degree.
`normalize_features`	If `TRUE`, the node features are normalized.
`return_type`	Format in which to return the data. One of `"list"` or `"tidygraph"`.
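
To make the effect of `max_degree` concrete, here is a minimal base-R sketch of one way to cap node degrees by randomly dropping edges. The helper `subsample_edges()` and the toy star graph are hypothetical illustrations, not part of rspektral, and the package may subsample edges differently internally.

```r
# Hypothetical helper (not part of rspektral): cap each node's degree by
# randomly dropping surplus edges from a symmetric 0/1 adjacency matrix.
subsample_edges <- function(adj, max_degree) {
  if (max_degree <= 0) return(adj)  # non-positive value: keep every edge
  for (i in seq_len(nrow(adj))) {
    nbrs <- which(adj[i, ] != 0)
    surplus <- length(nbrs) - max_degree
    if (surplus > 0) {
      # pick `surplus` neighbours at random and remove those edges
      drop <- nbrs[sample.int(length(nbrs), surplus)]
      adj[i, drop] <- 0
      adj[drop, i] <- 0  # mirror the removal to keep the matrix symmetric
    }
  }
  adj
}

# Toy star graph: node 1 is connected to nodes 2 to 5 (degree 4).
set.seed(1)
adj <- matrix(0, 5, 5)
adj[1, 2:5] <- 1
adj[2:5, 1] <- 1

capped <- subsample_edges(adj, max_degree = 2L)
rowSums(capped)  # degree of each node after subsampling
```

With the default `max_degree = -1L` the sketch returns the matrix unchanged, which mirrors the documented "if positive" condition.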

Either a list with 6 elements (adjacency matrix, node features, labels, and 3 binary masks for the train, validation, and test splits), or a `tbl_graph` object with the node features, labels, and 3 binary masks as node attributes.
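
The binary masks are typically used to select the rows that belong to each split. The sketch below uses small made-up objects in place of the real return value. The variable names (`features`, `labels`, `train_mask`, and so on) are assumptions for illustration only and may not match the element names that `dataset_graphsage()` actually returns.

```r
# Toy stand-ins for the returned objects; the names are illustrative only.
features <- matrix(seq_len(12), nrow = 6, ncol = 2)  # 6 nodes, 2 features
labels <- c("a", "b", "a", "b", "a", "b")
train_mask <- c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE)
val_mask   <- c(FALSE, FALSE, FALSE, TRUE, FALSE, FALSE)
test_mask  <- c(FALSE, FALSE, FALSE, FALSE, TRUE, TRUE)

# as.logical() also handles masks stored as 0/1 numbers.
x_train <- features[as.logical(train_mask), , drop = FALSE]
y_train <- labels[as.logical(train_mask)]
x_test  <- features[as.logical(test_mask), , drop = FALSE]
y_test  <- labels[as.logical(test_mask)]

# Sanity check: no node is assigned to more than one split.
stopifnot(all(train_mask + val_mask + test_mask <= 1))
```

Because each mask has one entry per node, the same indexing works for any per-node object, such as the rows of the feature matrix or the label vector.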

rdinnager/rspektral documentation built on June 12, 2021, 1:26 a.m.