View source: R/network_functions.R
runSSN | R Documentation |
Runs scale-free shared nearest neighbor network (SSN) analysis on a subset of features specified in a Seurat object.
runSSN(
object,
features,
scale_free = T,
robust_pca = F,
data_type = c("pearson", "deviance"),
reprocess_sct = F,
slot = c("scale", "data"),
batch_feature = NULL,
do_scale = F,
do_center = F,
pca_var_explained = 0.9,
weight_by_var = F,
umap_knn = 10,
optimize_resolution = T,
target_purity = 0.8,
step_size = 0.05,
n_workers = 1,
verbose = T
)
object |
Seurat object |
features |
Features to compute the SNN on. If any features are missing from the scaled data, the scaled data is recomputed. |
scale_free |
Logical to enforce scale free topology. Default is T. |
robust_pca |
Logical to run robust PCA (WARNING: computationally intensive, not recommended for large data). Default is F. |
data_type |
Data type to compute SNN on: "pearson" or "deviance". |
reprocess_sct |
If 'data_type' is "pearson", specifies whether SCTransform is run, regardless of whether features are missing from the existing scaled data. Default is F. |
slot |
Slot to use: "scale" or "data". |
batch_feature |
Variables to regress out. Default is NULL. |
do_scale |
Whether to scale data (only if 'slot' = "data") |
do_center |
Whether to center data (only if 'slot' = "data") |
pca_var_explained |
Proportion of variance explained by PCA. Uses the top N principal components that together explain 'pca_var_explained' of the variance. Default is 0.9. |
weight_by_var |
Weight the feature embedding by the variance of each PC |
umap_knn |
Number of neighboring points used in local approximations of the UMAP manifold structure. Larger values preserve more global structure at the cost of detailed local structure. This parameter should generally be in the range 5 to 50. Default is 10. |
optimize_resolution |
Logical specifying whether to identify the optimal clustering resolution. The optimal resolution is identified using the target purity criterion. Default is T. |
target_purity |
Target purity for identifying optimal cluster resolution. Default is 0.8. |
step_size |
Step size between consecutive resolutions to test. Default is 0.05. |
n_workers |
Number of workers for parallel implementation. Default is 1. |
verbose |
Print progress. Default is T. |
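As a rough illustration of the 'pca_var_explained' argument, the sketch below picks the smallest number of principal components whose cumulative variance reaches the target proportion. It uses base R's prcomp on simulated data; the toy matrix and all variable names are assumptions made for illustration, and this is not necessarily how runSSN computes its embedding internally.

```r
# Toy data: 200 rows (standing in for genes) x 10 columns (illustrative only)
set.seed(1)
x <- matrix(rnorm(200 * 10), nrow = 200, ncol = 10)
x[, 2] <- x[, 1] + rnorm(200, sd = 0.1)  # introduce some correlated structure

pca <- prcomp(x, center = TRUE, scale. = TRUE)

# Cumulative proportion of variance explained by successive PCs
var_explained <- cumsum(pca$sdev^2) / sum(pca$sdev^2)

# Smallest N such that the top N PCs reach the target proportion
pca_var_explained <- 0.9
n_pcs <- which(var_explained >= pca_var_explained)[1]

# Keep only those components as the reduced embedding
embedding <- pca$x[, seq_len(n_pcs), drop = FALSE]
```

Lowering the target keeps fewer components, which gives a more compact embedding that captures less of the total variance.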
Cell x Gene Seurat object, with gene-centric UMAP embedding and associated gene programs
Nicholas Mikolajewicz
https://nmikolajewicz.github.io/scMiko/articles/Module_Detection.html
findNetworkFeatures for finding features, SCTransform for gene count normalization and scaling, nullResiduals for deviance calculations, scaleFreeNet for scale-free topology transform.
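This page does not describe how the scale-free topology transform is computed. One common approach, used in soft-thresholding methods for co-expression networks, raises the adjacency matrix to a power and checks how well the resulting degree distribution follows a power law. The sketch below illustrates that general idea in base R on simulated data. The helper scale_free_fit and the toy matrices are hypothetical, and nothing here should be read as the implementation of scaleFreeNet.

```r
# Hypothetical helper: fit log10(p(k)) against log10(k) over binned connectivity.
# A high R^2 with a negative slope suggests an approximately scale-free topology.
scale_free_fit <- function(adj, n_bins = 10) {
  diag(adj) <- 0
  k <- rowSums(adj)                                # connectivity of each node
  bins <- cut(k, breaks = n_bins)
  mean_k <- as.numeric(tapply(k, bins, mean))      # mean connectivity per bin
  p_k <- as.numeric(table(bins)) / length(k)       # fraction of nodes per bin
  keep <- !is.na(mean_k) & p_k > 0                 # drop empty bins
  y <- log10(p_k[keep])
  x <- log10(mean_k[keep])
  fit <- lm(y ~ x)
  c(r_squared = summary(fit)$r.squared, slope = unname(coef(fit)[2]))
}

# Toy similarity matrix: absolute correlations between 300 simulated features
set.seed(42)
expr <- matrix(rnorm(50 * 300), nrow = 50)
sim <- abs(cor(expr))

# Evaluate the fit for a range of candidate powers
powers <- 1:6
fits <- sapply(powers, function(p) scale_free_fit(sim^p))
colnames(fits) <- paste0("power_", powers)
print(round(fits, 3))
```

With purely random data, as here, the fit is not expected to be good at any power. On real expression data one would typically choose the lowest power at which the fit becomes acceptable.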
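# load packages used below (ggplot2 for ggplot/labs, dplyr for %>%)
library(scMiko)
library(ggplot2)
library(dplyr)
# load human gastrulation data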
so.query <- readRDS("../data/demo/so_tyser2021_220621.rds")
# Expression-based feature selection
features_expr <- findNetworkFeatures(object = so.query, method = "expr",
min_pct = 0.5)
# Highly-variable genes
features_hvg <- findNetworkFeatures(object = so.query, method = "hvg",
n_features = 2000)
# run SSN
so.gene <- runSSN(object = so.query ,
features = unique(c(features_hvg, features_expr)),
scale_free = T,
robust_pca = F,
data_type = "pearson",
reprocess_sct = T,
slot = c("scale"),
batch_feature = NULL,
pca_var_explained = 0.9,
optimize_resolution = T,
target_purity = 0.8,
step_size = 0.05,
n_workers = parallel::detectCores(),
verbose = F)
# get network connectivity plot
plt_connectivity <- SSNConnectivity(so.gene, quantile_threshold = 0.85, raster_dpi = 500)
# visualize
plt_connectivity$plot_edge + labs(title = "Network Connectivity")
# specify pruning threshold [0,1] (low values = less pruning, high values = more pruning)
prune.threshold <- 0.1
# get feature-specific connectivities (wi)
df.wi <- pruneSSN(object = so.gene,
graph = "RNA_snn_power",
prune.threshold = prune.threshold,
return.df = T)
# visualize
plt.prune <- df.wi %>%
ggplot(aes(x = wi_l2)) +
geom_histogram(bins = 30) +
geom_vline(xintercept = prune.threshold, linetype = "dashed", color = "tomato") +
labs(x = "Degree (L2 norm)", y = "Count",
title = "Network Pruning",
subtitle = paste0(signif(100*sum(df.wi$wi_l2 <= prune.threshold)/nrow(df.wi), 3),
"% (", sum(df.wi$wi_l2 <= prune.threshold), "/", nrow(df.wi), ") genes pruned" )) +
theme_miko(grid = T)
print(plt.prune)
# get (pruned) gene module list
mod.list <- pruneSSN(object = so.gene, graph = "RNA_snn_power", prune.threshold = prune.threshold)
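To make the pruning step above more concrete, here is a toy sketch of thresholding genes by an L2-norm connectivity score. As in the example above, genes with a score at or below the threshold are treated as pruned. The simulated adjacency matrix, the rescaling of the score to [0, 1] by its maximum, and all variable names are assumptions for illustration. The exact definition of wi_l2 inside pruneSSN is not given on this page.

```r
# Toy symmetric adjacency matrix for 20 simulated genes (illustrative only)
set.seed(7)
n <- 20
adj <- matrix(runif(n * n), n, n)
adj <- (adj + t(adj)) / 2
diag(adj) <- 0
rownames(adj) <- colnames(adj) <- paste0("gene", seq_len(n))

# Make the first three genes weakly connected to everything else
adj[1:3, ] <- adj[1:3, ] * 0.05
adj[, 1:3] <- adj[, 1:3] * 0.05

# L2 norm of each gene's edge weights,
# rescaled to [0, 1] by the maximum (an assumption, see lead-in)
wi_l2 <- sqrt(rowSums(adj^2))
wi_l2 <- wi_l2 / max(wi_l2)

# Genes at or below the threshold are pruned; the rest are retained
prune_threshold <- 0.1
pruned <- names(wi_l2)[wi_l2 <= prune_threshold]
retained <- setdiff(names(wi_l2), pruned)
print(pruned)
```

Raising the threshold prunes more genes, which matches the note above that high values mean more pruning.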