View source: R/public_clusters.R
buildPublicClusterNetwork | R Documentation |
Part of the workflow Searching for Public TCR/BCR Clusters. Intended for use following findPublicClusters().
Given node-level metadata for each sample's filtered clusters, combines the data into a global network and performs network analysis and cluster analysis.
buildPublicClusterNetwork(
## Input ##
file_list,
input_type = "rds",
data_symbols = "ndat",
header = TRUE, sep,
read.args = list(row.names = 1),
seq_col,
## Network Settings ##
drop_isolated_nodes = FALSE,
node_stats = deprecated(),
stats_to_include = deprecated(),
cluster_stats = deprecated(),
## Visualization ##
color_nodes_by = "SampleID",
color_scheme = "turbo",
plot_title = "Global Network of Public Clusters",
## Output ##
output_dir = NULL,
output_name = "PublicClusterNetwork",
verbose = FALSE,
...
)
file_list |
A character vector of file paths, or a list containing
|
input_type |
A character string specifying the file format of the input files. Options are
|
data_symbols |
Used when |
header |
For values of |
sep |
For values of |
read.args |
For values of |
seq_col |
Specifies the column in the node-level metadata that contains the TCR/BCR sequences. Accepts a character string containing the column name or a numeric scalar containing the column index. |
drop_isolated_nodes |
Passed to |
node_stats |
|
stats_to_include |
|
cluster_stats |
|
color_nodes_by |
Passed to |
color_scheme |
Passed to |
plot_title |
Passed to |
output_dir |
Passed to |
output_name |
Passed to |
verbose |
Logical. If |
... |
Other arguments to |
The node-level metadata for the filtered clusters from all samples is combined and the global network is constructed by calling buildNet() with node_stats = TRUE, stats_to_include = "all", cluster_stats = TRUE and cluster_id_name = "ClusterIDPublic".
The computed node-level network properties are renamed to reflect their correspondence to the global network. This is done to distinguish them from the network properties that correspond to the sample-level networks. The names are:
ClusterIDPublic
PublicNetworkDegree
PublicTransitivity
PublicCloseness
PublicCentralityByCloseness
PublicEigenCentrality
PublicCentralityByEigen
PublicBetweenness
PublicCentralityByBetweenness
PublicAuthorityScore
PublicCoreness
PublicPageRank
See the Searching for Public TCR/BCR Clusters article on the package website.
A list of network objects as returned by buildRepSeqNetwork(). The list is returned invisibly.
If the input data contains a combined total of fewer than two rows, or if the global network contains no nodes, then the function returns NULL, invisibly, with a warning.
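Because the function can return NULL, downstream code may want to check the result before using it. The following is a minimal sketch in base R. public_columns() is a hypothetical helper (not part of NAIR), and it assumes that the returned list stores the node-level metadata in an element named node_data, as in the output of buildRepSeqNetwork(). A mock list stands in for a real result so that the snippet runs on its own.

```r
# Hypothetical helper, not part of NAIR: list the node-level columns that
# refer to the global (public) network, or return an empty character vector
# when the function returned NULL.
# Assumption: node metadata is stored in the list element `node_data`.
public_columns <- function(net) {
  if (is.null(net)) {
    return(character(0))
  }
  grep("Public", names(net$node_data), value = TRUE, fixed = TRUE)
}

# Mock result standing in for the output of buildPublicClusterNetwork()
mock_net <- list(
  node_data = data.frame(
    CloneSeq = c("CASSSVETQYF", "CASSSPETQYF"),
    ClusterIDPublic = c(1, 1),
    PublicNetworkDegree = c(1, 1)
  )
)

public_columns(mock_net)  # names of the columns tied to the global network
public_columns(NULL)      # empty character vector
```

With a real result, the same check could be applied to the object returned by buildPublicClusterNetwork(), provided the node_data assumption holds for the installed package version.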
Brian Neal (Brian.Neal@ucsf.edu)
Hai Yang, Jason Cham, Brian Neal, Zenghua Fan, Tao He and Li Zhang. (2023). NAIR: Network Analysis of Immune Repertoire. Frontiers in Immunology, vol. 14. doi: 10.3389/fimmu.2023.1181825
Searching for Public TCR/BCR Clusters article on package website
findPublicClusters()
buildPublicClusterNetworkByRepresentative()
set.seed(42)
## Simulate 30 samples with a mix of public/private sequences ##
samples <- 30
sample_size <- 30 # (seqs per sample)
base_seqs <- c(
  "CASSIEGQLSTDTQYF", "CASSEEGQLSTDTQYF", "CASSSVETQYF",
  "CASSPEGQLSTDTQYF", "RASSLAGNTEAFF", "CASSHRGTDTQYF", "CASDAGVFQPQHF",
  "CASSLTSGYNEQFF", "CASSETGYNEQFF", "CASSLTGGNEQFF", "CASSYLTGYNEQFF",
  "CASSLTGNEQFF", "CASSLNGYNEQFF", "CASSFPWDGYGYTF", "CASTLARQGGELFF",
  "CASTLSRQGGELFF", "CSVELLPTGPLETSYNEQFF", "CSVELLPTGPSETSYNEQFF",
  "CVELLPTGPSETSYNEQFF", "CASLAGGRTQETQYF", "CASRLAGGRTQETQYF",
  "CASSLAGGRTETQYF", "CASSLAGGRTQETQYF", "CASSRLAGGRTQETQYF",
  "CASQYGGGNQPQHF", "CASSLGGGNQPQHF", "CASSNGGGNQPQHF", "CASSYGGGGNQPQHF",
  "CASSYGGGQPQHF", "CASSYKGGNQPQHF", "CASSYTGGGNQPQHF",
  "CAWSSQETQYF", "CASSSPETQYF", "CASSGAYEQYF", "CSVDLGKGNNEQFF"
)
# Relative generation probabilities
pgen <- cbind(
  stats::toeplitz(0.6^(0:(sample_size - 1))),
  matrix(1, nrow = samples, ncol = length(base_seqs) - samples)
)
simulateToyData(
  samples = samples,
  sample_size = sample_size,
  prefix_length = 1,
  prefix_chars = c("", ""),
  prefix_probs = cbind(rep(1, samples), rep(0, samples)),
  affixes = base_seqs,
  affix_probs = pgen,
  num_edits = 0,
  output_dir = tempdir(),
  no_return = TRUE
)
## 1. Find Public Clusters in Each Sample
sample_files <-
  file.path(tempdir(),
            paste0("Sample", 1:samples, ".rds")
  )
findPublicClusters(
  file_list = sample_files,
  input_type = "rds",
  seq_col = "CloneSeq",
  count_col = "CloneCount",
  min_seq_length = NULL,
  drop_matches = NULL,
  top_n_clusters = 3,
  min_node_count = 5,
  min_clone_count = 15000,
  output_dir = tempdir()
)
## 2. Build Global Network of Public Clusters
public_clusters <-
  buildPublicClusterNetwork(
    file_list =
      list.files(
        file.path(tempdir(), "node_meta_data"),
        full.names = TRUE
      ),
    seq_col = "CloneSeq",
    count_col = "CloneCount",
    plot_title = NULL,
    plot_subtitle = NULL,
    print_plots = TRUE
  )