getClusterStats: Compute Cluster-Level Network Properties
In mlizhangx/Network-Analysis-for-Repertoire-Sequencing-: Network Analysis of Immune Repertoire

getClusterStats

R Documentation

Compute Cluster-Level Network Properties

Description

Given the node-level metadata and adjacency matrix for a network graph that has been partitioned into clusters, computes network properties for the clusters and returns them in a data frame.

addClusterStats() is preferred to getClusterStats() in most situations.

Usage

getClusterStats(
  data,
  adjacency_matrix,
  seq_col = NULL,
  count_col = NULL,
  cluster_id_col = "cluster_id",
  degree_col = NULL,
  cluster_fun = deprecated(),
  verbose = FALSE
)

Arguments

`data`	A data frame containing the node-level metadata for the network, with each row corresponding to a network node.
`adjacency_matrix`	The adjacency matrix for the network.
`seq_col`	Specifies the column(s) of `data` containing the receptor sequences upon whose similarity the network is based. Accepts a character or numeric vector of length 1 or 2, containing either column names or column indices. If provided, the sequence-based cluster-level properties (such as `mean_seq_length` and `seq_w_max_degree`) will be computed.
`count_col`	Specifies the column of `data` containing a measure of abundance (such as clone count or UMI count). Accepts a character string containing the column name or a numeric scalar containing the column index. If provided, the count-based cluster-level properties (such as `agg_count` and `max_count`) will be computed.
`cluster_id_col`	Specifies the column of `data` containing the cluster membership variable that identifies the cluster to which each node belongs. Accepts a character string containing the column name or a numeric scalar containing the column index.
`degree_col`	Specifies the column of `data` containing the network degree of each node. Accepts a character string containing the column name or a numeric scalar containing the column index. If the column does not exist, the network degree will be computed.
`cluster_fun`	Deprecated. Does nothing.
`verbose`	Logical. If `TRUE`, generates messages about the tasks performed and their progress, as well as relevant properties of intermediate outputs. Messages are sent to `stderr()`.

Details

To use getClusterStats(), the network graph must first be partitioned into clusters, which can be done using addClusterMembership(). The name of the cluster membership variable in the node metadata must be provided to the cluster_id_col argument when calling getClusterStats().
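The requirement above can be sketched as follows. This is a minimal illustration, assuming that addClusterMembership() stores the cluster membership in a column named "cluster_id" by default (consistent with the default value of cluster_id_col). The column is then renamed to "my_cluster", a hypothetical name chosen only for this example, so that the name must be passed explicitly to getClusterStats().

```r
library(NAIR)

set.seed(42)
toy_data <- simulateToyData()
net <- generateNetworkObjects(toy_data, "CloneSeq")

# Partition the network graph into clusters.
net <- addClusterMembership(net)

# Assumption: the membership column is named "cluster_id" by default.
id_pos <- match("cluster_id", names(net$node_data))
stopifnot(!is.na(id_pos))

# Rename it to a hypothetical custom name for illustration.
names(net$node_data)[id_pos] <- "my_cluster"

# The custom column name must now be supplied via cluster_id_col.
cluster_data <- getClusterStats(
  net$node_data,
  net$adjacency_matrix,
  cluster_id_col = "my_cluster"
)
```

Because seq_col and count_col are left as NULL here, the sequence-based and count-based variables are not expected in the result.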

Value

A data frame containing one row for each cluster in the network and the following variables:

`cluster_id`	The cluster ID number.
`node_count`	The number of nodes in the cluster.
`mean_seq_length`	The mean sequence length in the cluster. Only present when `length(seq_col) == 1`.
`A_mean_seq_length`	The mean length of the first sequence among the nodes in the cluster. Only present when `length(seq_col) == 2`.
`B_mean_seq_length`	The mean length of the second sequence among the nodes in the cluster. Only present when `length(seq_col) == 2`.
`mean_degree`	The mean network degree in the cluster.
`max_degree`	The maximum network degree in the cluster.
`seq_w_max_degree`	The receptor sequence possessing the maximum degree within the cluster. Only present when `length(seq_col) == 1`.
`A_seq_w_max_degree`	The first sequence of the node possessing the maximum degree within the cluster. Only present when `length(seq_col) == 2`.
`B_seq_w_max_degree`	The second sequence of the node possessing the maximum degree within the cluster. Only present when `length(seq_col) == 2`.
`agg_count`	The aggregate count among all nodes in the cluster (based on the counts in `count_col`).
`max_count`	The maximum count among all nodes in the cluster (based on the counts in `count_col`).
`seq_w_max_count`	The receptor sequence possessing the maximum count within the cluster. Only present when `length(seq_col) == 1`.
`A_seq_w_max_count`	The first sequence of the node possessing the maximum count within the cluster. Only present when `length(seq_col) == 2`.
`B_seq_w_max_count`	The second sequence of the node possessing the maximum count within the cluster. Only present when `length(seq_col) == 2`.
`diameter_length`	The longest geodesic distance in the cluster, computed as the length of the vector returned by `get_diameter()`.
`assortativity`	The assortativity coefficient of the cluster's graph, based on the degree (minus one) of each node in the cluster (with the degree computed based only upon the nodes within the cluster). Computed using `assortativity_degree()`.
`global_transitivity`	The transitivity (i.e., clustering coefficient) for the cluster's graph, which estimates the probability that adjacent vertices are connected. Computed using `transitivity()` with `type = "global"`.
`edge_density`	The number of edges in the cluster as a fraction of the maximum possible number of edges. Computed using `edge_density()`.
`degree_centrality_index`	The centrality index of the cluster's graph based on within-cluster network degree. Computed as the `centralization` element of the output from `centr_degree()`.
`closeness_centrality_index`	The centrality index of the cluster's graph based on closeness, i.e., distance to other nodes in the cluster. Computed using `centralization()`.
`eigen_centrality_index`	The centrality index of the cluster's graph based on the eigenvector centrality scores, i.e., values of the first eigenvector of the adjacency matrix for the cluster. Computed as the `centralization` element of the output from `centr_eigen()`.
`eigen_centrality_eigenvalue`	The eigenvalue corresponding to the first eigenvector of the adjacency matrix for the cluster. Computed as the `value` element of the output from `eigen_centrality()`.
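To make a few of the simpler variables above concrete, the following base R sketch computes node_count, mean_degree, max_degree and edge_density by hand for a toy network. It is an illustration under stated assumptions, not the package's implementation: the toy data are hypothetical, the network degree is taken as the row sums of the full adjacency matrix, and the edge density is taken as the number of within-cluster edges divided by n(n - 1)/2 for a cluster of n nodes (the usual definition for an undirected graph without self-loops).

```r
# Toy undirected network with 5 nodes.
# Edges: 1-2 and 2-3 (a path) plus 4-5 (a single edge).
adjacency_matrix <- matrix(0, nrow = 5, ncol = 5)
edges <- rbind(c(1, 2), c(2, 3), c(4, 5))
adjacency_matrix[edges] <- 1
adjacency_matrix[edges[, 2:1]] <- 1  # make the matrix symmetric

# Hypothetical node metadata with a cluster membership column.
data <- data.frame(
  CloneSeq = c("AAA", "AAT", "ATT", "GGG", "GGC"),
  cluster_id = c(1, 1, 1, 2, 2)
)

# Assumption: network degree is the row sum of the adjacency matrix.
data$degree <- rowSums(adjacency_matrix)

# One row of summary statistics per cluster.
cluster_stats <- do.call(rbind, lapply(sort(unique(data$cluster_id)), function(id) {
  members <- which(data$cluster_id == id)
  n <- length(members)
  # Adjacency matrix restricted to the nodes of this cluster.
  sub_adj <- adjacency_matrix[members, members, drop = FALSE]
  # Count each undirected edge once using the upper triangle.
  n_edges <- sum(sub_adj[upper.tri(sub_adj)])
  data.frame(
    cluster_id = id,
    node_count = n,
    mean_degree = mean(data$degree[members]),
    max_degree = max(data$degree[members]),
    edge_density = n_edges / (n * (n - 1) / 2)
  )
}))

print(cluster_stats)
```

In this toy network, cluster 1 has two of its three possible edges (edge density 2/3), while cluster 2 has its single possible edge (edge density 1). Under this formula a single-node cluster would yield NaN, since it has no possible edges.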

Author(s)

Brian Neal (Brian.Neal@ucsf.edu)

References

Hai Yang, Jason Cham, Brian Neal, Zenghua Fan, Tao He and Li Zhang. (2023). NAIR: Network Analysis of Immune Repertoire. Frontiers in Immunology, vol. 14. doi: 10.3389/fimmu.2023.1181825

Webpage for the NAIR package

Examples

set.seed(42)
toy_data <- simulateToyData()

net <-
  generateNetworkObjects(
    toy_data, "CloneSeq"
  )

net <- addClusterMembership(net)

net$cluster_data <-
  getClusterStats(
    net$node_data,
    net$adjacency_matrix,
    seq_col = "CloneSeq",
    count_col = "CloneCount"
  )
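
Since the Description recommends addClusterStats() in most situations, a comparable call is sketched here. It assumes that addClusterStats() takes the list of network objects as its first argument, accepts a count_col argument with the same meaning as above, and stores its result in the list element cluster_data. Check these details against the documentation for addClusterStats() before relying on them.

```r
library(NAIR)

set.seed(42)
toy_data <- simulateToyData()
net <- generateNetworkObjects(toy_data, "CloneSeq")
net <- addClusterMembership(net)

# Assumption: addClusterStats() modifies and returns the list of
# network objects, adding the cluster-level data frame to it.
net <- addClusterStats(net, count_col = "CloneCount")

# Assumption: the result is stored in the element cluster_data.
head(net$cluster_data)
```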

mlizhangx/Network-Analysis-for-Repertoire-Sequencing- documentation built on Jan. 17, 2025, 12:44 a.m.
