View source: R/network_analysis.R
addClusterStats | R Documentation |
Given a list of network objects returned by buildRepSeqNetwork() or generateNetworkObjects(), computes cluster-level network properties, performing clustering first if needed. The list of network objects is returned with the cluster properties added as a data frame.
addClusterStats(
  net,
  cluster_id_name = "cluster_id",
  seq_col = NULL,
  count_col = NULL,
  degree_col = "degree",
  cluster_fun = "fast_greedy",
  overwrite = FALSE,
  verbose = FALSE,
  ...
)
net |
A |
cluster_id_name |
A character string specifying the name of the cluster membership variable
in |
seq_col |
Specifies the column(s) of |
count_col |
Specifies the column of |
degree_col |
Specifies the column of |
cluster_fun |
A character string specifying the clustering algorithm to use when
adding or overwriting the cluster membership variable in
|
overwrite |
Logical. If |
verbose |
Logical. If |
... |
Named optional arguments to the function specified by |
The list net must contain the named elements igraph (of class igraph), adjacency_matrix (a matrix or dgCMatrix encoding edge connections), and node_data (a data.frame containing node metadata), all corresponding to the same network. The lists returned by buildRepSeqNetwork() and generateNetworkObjects() are examples of valid inputs for the net argument.
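The structural requirements on net described above can be checked by hand before calling the function. The following is a minimal sketch in base R; is_valid_net() is a hypothetical helper, not part of the package, and it does not verify that the three elements describe the same network.

```r
# Hypothetical helper (not part of NAIR): checks only the structure
# described in the documentation, not the consistency of the elements.
is_valid_net <- function(net) {
  is.list(net) &&
    all(c("igraph", "adjacency_matrix", "node_data") %in% names(net)) &&
    inherits(net$igraph, "igraph") &&
    (is.matrix(net$adjacency_matrix) ||
       inherits(net$adjacency_matrix, "dgCMatrix")) &&
    is.data.frame(net$node_data)
}

# Mock list for illustration; the "igraph" element is an empty placeholder
# carrying only the class attribute, not a real graph.
mock_net <- list(
  igraph = structure(list(), class = "igraph"),
  adjacency_matrix = matrix(c(0, 1, 1, 0), nrow = 2),
  node_data = data.frame(CloneSeq = c("AAA", "AAT"))
)

is_valid_net(mock_net)
is_valid_net(list(node_data = data.frame()))
```

The first call passes the check and the second fails it, because the igraph and adjacency_matrix elements are missing.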
If the network graph has previously been partitioned into clusters using addClusterMembership() and the user wishes to compute network properties for these clusters, the name of the cluster membership variable in net$node_data should be provided to the cluster_id_name argument.
If the value of cluster_id_name is not the name of a variable in net$node_data, then clustering is performed using addClusterMembership() with the specified value of cluster_fun, and the cluster membership values are written to net$node_data using the value of cluster_id_name as the variable name. If overwrite = TRUE, this is done even if this variable already exists.
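The rule for when clustering is (re)run can be written as a one-line predicate. This is a sketch of the documented behavior only; needs_clustering() is a hypothetical name and is not how the package necessarily implements the check.

```r
# Hypothetical helper (not part of NAIR): TRUE when, per the description
# above, cluster membership would be computed and written to node_data.
needs_clustering <- function(node_data, cluster_id_name, overwrite = FALSE) {
  !(cluster_id_name %in% names(node_data)) || isTRUE(overwrite)
}

# Made-up node data that already holds a membership variable "cluster_id"
node_data <- data.frame(CloneSeq = c("AAA", "AAT"), cluster_id = c(1, 1))

needs_clustering(node_data, "cluster_id")                    # variable exists
needs_clustering(node_data, "cluster_id", overwrite = TRUE)  # forced
needs_clustering(node_data, "cluster_id_louvain")            # variable absent
```

Only the first call evaluates to FALSE: the variable already exists and overwrite is left at its default.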
A modified copy of net, with cluster properties contained in the element cluster_data. This is a data.frame containing one row for each cluster in the network and the following variables:
cluster_id |
The cluster ID number. |
node_count |
The number of nodes in the cluster. |
mean_seq_length |
The mean sequence length in the cluster.
Only present when |
A_mean_seq_length |
The mean first sequence length in the cluster.
Only present when |
B_mean_seq_length |
The mean second sequence length in the cluster.
Only present when |
mean_degree |
The mean network degree in the cluster. |
max_degree |
The maximum network degree in the cluster. |
seq_w_max_degree |
The receptor sequence possessing the maximum degree within the cluster.
Only present when |
A_seq_w_max_degree |
The first sequence of the node possessing the maximum degree within the cluster.
Only present when |
B_seq_w_max_degree |
The second sequence of the node possessing the maximum degree within the cluster.
Only present when |
agg_count |
The aggregate count among all nodes in the cluster (based on the counts in
|
max_count |
The maximum count among all nodes in the cluster (based on the counts in
|
seq_w_max_count |
The receptor sequence possessing the maximum count within the cluster.
Only present when |
A_seq_w_max_count |
The first sequence of the node possessing the maximum count within the cluster.
Only present when |
B_seq_w_max_count |
The second sequence of the node possessing the maximum count within the cluster.
Only present when |
diameter_length |
The longest geodesic distance in the cluster, computed as the length of the
vector returned by |
assortativity |
The assortativity coefficient of the cluster's graph, based on the degree
(minus one) of each node in the cluster (with the degree computed based only
upon the nodes within the cluster). Computed using
|
global_transitivity |
The transitivity (i.e., clustering coefficient) for the cluster's graph, which
estimates the probability that adjacent vertices are connected. Computed using
|
edge_density |
The number of edges in the cluster as a fraction of the maximum possible number
of edges. Computed using |
degree_centrality_index |
The centrality index of the cluster's graph based on within-cluster network degree.
Computed as the |
closeness_centrality_index |
The centrality index of the cluster's graph based on closeness,
i.e., distance to other nodes in the cluster.
Computed using |
eigen_centrality_index |
The centrality index of the cluster's graph based on the eigenvector centrality scores,
i.e., values of the first eigenvector of the adjacency matrix for the cluster.
Computed as the |
eigen_centrality_eigenvalue |
The eigenvalue corresponding to the first eigenvector of the adjacency matrix
for the cluster. Computed as the |
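To make the simpler variables concrete, the sketch below recomputes a few of them by hand in base R from made-up node-level data. It is an illustration of the definitions above, not the package's implementation; the column names CloneSeq and CloneCount follow the toy data used in the examples, and the values are invented. Ties for the maximum are resolved by taking the first matching row, which is an assumption of this sketch.

```r
# Made-up node-level data for two clusters (values are illustrative only)
node_data <- data.frame(
  CloneSeq   = c("AAA", "AAT", "AAG", "CCC", "CCG"),
  CloneCount = c(10, 5, 1, 7, 3),
  degree     = c(2, 1, 1, 1, 1),
  cluster_id = c(1, 1, 1, 2, 2),
  stringsAsFactors = FALSE
)

# Split the rows by cluster and summarise each group into one row
by_cluster <- split(node_data, node_data$cluster_id)
cluster_data <- do.call(rbind, lapply(by_cluster, function(d) {
  data.frame(
    cluster_id       = d$cluster_id[1],
    node_count       = nrow(d),
    mean_seq_length  = mean(nchar(d$CloneSeq)),
    mean_degree      = mean(d$degree),
    max_degree       = max(d$degree),
    seq_w_max_degree = d$CloneSeq[which.max(d$degree)],
    agg_count        = sum(d$CloneCount),
    max_count        = max(d$CloneCount),
    seq_w_max_count  = d$CloneSeq[which.max(d$CloneCount)],
    stringsAsFactors = FALSE
  )
}))
rownames(cluster_data) <- NULL
cluster_data
```

The graph-based variables (diameter, assortativity, transitivity, edge density, centrality indices) depend on the edges within each cluster and cannot be derived from node-level columns alone.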
If net$node_data did not previously contain a variable whose name matches the value of cluster_id_name, then this variable will be present and will contain values for cluster membership, obtained through a call to addClusterMembership() using the clustering algorithm specified by cluster_fun.
If net$node_data did previously contain a variable whose name matches the value of cluster_id_name and overwrite = TRUE, then the values of this variable will be overwritten with new values for cluster membership, obtained as above based on cluster_fun.
If net$node_data did not previously contain a variable whose name matches the value of degree_col, then this variable will be present and will contain values for network degree.
Additionally, if net contains a list named details, then the following elements will be added to net$details, or overwritten if they already exist:
cluster_data_goes_with |
A character string containing the value of |
count_col_for_cluster_data |
A character string containing the value of |
Brian Neal (Brian.Neal@ucsf.edu)
Hai Yang, Jason Cham, Brian Neal, Zenghua Fan, Tao He and Li Zhang. (2023). NAIR: Network Analysis of Immune Repertoire. Frontiers in Immunology, vol. 14. doi: 10.3389/fimmu.2023.1181825
addClusterMembership()
getClusterStats()
labelClusters()
set.seed(42)
toy_data <- simulateToyData()

net <- generateNetworkObjects(
  toy_data, "CloneSeq"
)

net <- addClusterStats(
  net,
  count_col = "CloneCount"
)

head(net$cluster_data)
net$details

# won't change net since net$cluster_data exists
net <- addClusterStats(
  net,
  count_col = "CloneCount",
  cluster_fun = "leiden",
  verbose = TRUE
)

# overwrites values in net$cluster_data
# and cluster membership values in net$node_data$cluster_id
# with values obtained using "cluster_leiden" algorithm
net <- addClusterStats(
  net,
  count_col = "CloneCount",
  cluster_fun = "leiden",
  overwrite = TRUE
)

net$details

# overwrites existing values in net$cluster_data
# with values obtained using "cluster_louvain" algorithm
# saves cluster membership values to net$node_data$cluster_id_louvain
# (net$node_data$cluster_id retains membership values from "cluster_leiden")
net <- addClusterStats(
  net,
  count_col = "CloneCount",
  cluster_fun = "louvain",
  cluster_id_name = "cluster_id_louvain",
  overwrite = TRUE
)

net$details

# perform clustering using "cluster_fast_greedy" algorithm,
# save cluster membership values to net$node_data$cluster_id_greedy
net <- addClusterMembership(
  net,
  cluster_fun = "fast_greedy",
  cluster_id_name = "cluster_id_greedy"
)

# compute cluster properties for the clusters from previous step
# overwrites values in net$cluster_data
net <- addClusterStats(
  net,
  cluster_id_name = "cluster_id_greedy",
  overwrite = TRUE
)

net$details