dataConvert: Function to convert data into the format required by the...


View source: R/dataConvert.R

Description

This function takes (1) a matrix representing a network, (2) the cluster membership of each node, (3) the outcome of each node and (4) the entity of each node, and converts them into the format described under Details. The entity is the type of a node: in a bipartite network it is 1 for the first type and 2 for the other; in a unipartite network all nodes have an entity value of 1.

Usage

dataConvert(net_matrix, cluster, outcome, entity)

Arguments

net_matrix

A numeric matrix or data frame storing the network as an adjacency matrix (unipartite network) or biadjacency matrix (bipartite network). Row and column names are recommended.

cluster

A numeric vector containing the cluster membership of each node. Its length must equal the number of nodes in the network stored in 'net_matrix'.

outcome

A numeric vector containing the outcome for each node. Its length should be the same as that of 'cluster'.

entity

A numeric vector representing the entity of each node. Its length should be the same as that of 'cluster' and 'outcome'.

Details

Input:

(1) The input network matrix is an adjacency matrix for a unipartite network or a biadjacency matrix for a bipartite network (for a definition of the adjacency matrix, see Wikipedia: https://en.wikipedia.org/wiki/Adjacency_matrix).

(2) Cluster membership is an integer vector whose length equals the number of nodes in the network. Each element of this vector gives the cluster that the corresponding node belongs to.

(3) The outcome is an integer vector with the same length as the cluster membership vector. For example, in a patient-variable data set where some patients are cases (unhealthy) and some are controls (healthy), the outcome could be set to 1 for cases and 2 for controls. If outcome information is not provided, it is set to all 1s by default.

(4) The entity is an integer vector with the same length as the cluster membership and outcome vectors. In a unipartite network, the entity of every node is set to 1. In a bipartite network, the entity is 1 for one type of node and 2 for the other.
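
The length requirements above can be illustrated with a minimal sketch for a bipartite network. The toy patient-variable matrix, the cluster labels and the ordering of nodes (rows first, then columns) are assumptions made for illustration only; check the package source for the ordering that dataConvert actually expects.

```r
# Hypothetical toy data: 3 patients (rows) by 2 variables (columns).
net_matrix <- matrix(
  c(1, 0,
    1, 1,
    0, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("p1", "p2", "p3"), c("v1", "v2"))
)

# A biadjacency matrix has one node per row plus one node per column.
n_nodes <- nrow(net_matrix) + ncol(net_matrix)

# Assumption: nodes are ordered rows first, then columns.
# Entity is 1 for the row-type nodes and 2 for the column-type nodes.
entity <- rep(c(1, 2), times = c(nrow(net_matrix), ncol(net_matrix)))

# Made-up cluster labels, one per node.
cluster <- c(1, 1, 2, 1, 2)

# No outcome information here, so mirror the documented default of all 1s.
outcome <- rep(1, n_nodes)

# All three vectors must have one element per node.
stopifnot(
  length(cluster) == n_nodes,
  length(outcome) == n_nodes,
  length(entity)  == n_nodes
)
```

For a unipartite network the adjacency matrix is square, so the number of nodes is simply the number of rows and the entity vector is all 1s.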

Output:

The output of the function is a list containing two data frames, 'nodelist' and 'net', which serve as the input for the next 'search' step of the EL algorithm. The 'nodelist' data frame contains the following information: label names for all nodes (these default to numbers if not specified in the input data); initial coordinates of all nodes computed with the Fruchterman-Reingold and Kamada-Kawai layout algorithms; and the outcome, cluster membership and entity of each node. The 'net' data frame stores the network.
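
As a rough sketch of how the returned list might be inspected, the example below builds a small unipartite network and calls the function only if the package is available. The toy network is hypothetical, and the package name 'epl' (taken from the repository name) and the fact that dataConvert is exported are assumptions. The column names of the two data frames are not documented here, so the sketch prints their structure rather than referring to specific columns.

```r
# Hypothetical 4-node unipartite network (symmetric adjacency matrix).
node_names <- paste0("n", 1:4)
net_matrix <- matrix(
  c(0, 1, 1, 0,
    1, 0, 0, 1,
    1, 0, 0, 1,
    0, 1, 1, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(node_names, node_names)
)

cluster <- c(1, 1, 2, 2)  # made-up cluster labels, one per node
outcome <- rep(1, 4)      # no outcome information: all 1s
entity  <- rep(1, 4)      # unipartite network: every node has entity 1

# Assumption: the package is installed under the name 'epl'
# and exports dataConvert().
if (requireNamespace("epl", quietly = TRUE)) {
  converted <- epl::dataConvert(net_matrix, cluster, outcome, entity)
  print(names(converted))   # the documentation names 'nodelist' and 'net'
  str(converted$nodelist)   # labels, coordinates, outcome, cluster, entity
  str(converted$net)        # the stored network
}
```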


UTMB-DIVA-Lab/epl documentation built on July 28, 2019, 5:53 a.m.