findNetworkFeatures: Identify features for gene program discovery.

View source: R/network_functions.R

findNetworkFeatures    R Documentation

Identify features for gene program discovery.

Description

Identify features for gene program discovery.

Usage

findNetworkFeatures(
  object,
  method = c("expr", "hvg", "deviance"),
  n_features = 2000,
  min_pct = 0.5,
  split_var = "seurat_clusters",
  batch_feature = NULL,
  verbose = T
)

Arguments

object

Seurat object

method

Feature selection method.

  • "expr" - Top expressed genes within 'split_var' groups, together with variable genes.

  • "hvg" - Highly-variable genes (number specified by 'n_features')

  • "deviance" - Deviant genes (number specified by 'n_features')

n_features

Number of features to use when calculating variable or deviant features. Default is 2000.

min_pct

Minimum expression within 'split_var' groups. Ignored if 'method' is not "expr". Default is 0.5.

split_var

Grouping variable within which the 'min_pct' criterion is enforced. Default is 'seurat_clusters'.

batch_feature

Variables to regress out. Ignored if 'method' is not "deviance". Default is NULL.

verbose

Show progress. Default is T.
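The interaction between 'min_pct' and 'split_var' can be illustrated with a small base-R sketch. This is not the scMiko implementation: it assumes that "expression" means a non-zero count and that a gene is retained when it passes the threshold in at least one group. The toy matrix, gene names, and group labels are all hypothetical.

```r
# Toy illustration only; NOT the scMiko implementation.
# Hypothetical count matrix: 4 genes x 6 cells.
counts <- matrix(
  c(1, 2, 0, 0, 0, 0,   # geneA: expressed mostly in group c1
    0, 0, 0, 3, 1, 2,   # geneB: expressed only in group c2
    1, 0, 0, 0, 0, 1,   # geneC: sparsely expressed in both groups
    0, 0, 0, 0, 0, 0),  # geneD: never expressed
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", LETTERS[1:4]), paste0("cell", 1:6))
)

# Hypothetical grouping, playing the role of 'split_var'
groups <- c("c1", "c1", "c1", "c2", "c2", "c2")
min_pct <- 0.5

# Fraction of cells with non-zero counts, per gene (rows) and group (columns)
pct_by_group <- sapply(split(seq_len(ncol(counts)), groups), function(idx) {
  rowMeans(counts[, idx, drop = FALSE] > 0)
})

# Assumed rule: keep genes that reach min_pct in at least one group
keep <- rownames(counts)[apply(pct_by_group >= min_pct, 1, any)]
print(keep)
```

Under these assumptions, a gene confined to a single cluster is still selected, whereas a gene expressed thinly across all clusters is not.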

Value

Vector of selected features.

Author(s)

Nicholas Mikolajewicz

References

https://nmikolajewicz.github.io/scMiko/articles/Module_Detection.html

See Also

runSSN

Examples


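# load packages used in the examples below
library(scMiko)
library(ggplot2)  # labs(), ggplot(), geom_*()
library(dplyr)    # %>%

# load human gastrulation data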
so.query <- readRDS("../data/demo/so_tyser2021_220621.rds")

# Expression-based feature selection
features_expr <- findNetworkFeatures(object = so.query, method = "expr",
                                     min_pct = 0.5)

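# Highly-variable genes
features_hvg <- findNetworkFeatures(object = so.query, method = "hvg",
                                    n_features = 2000)

# Deviant genes (defines 'features_dev', which is used in the runSSN call below)
features_dev <- findNetworkFeatures(object = so.query, method = "deviance",
                                    n_features = 2000)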

# run SSN
so.gene <- runSSN(object = so.query ,
     features = unique(c(features_hvg, features_dev)),
     scale_free = T,
     robust_pca = F,
     data_type = "pearson",
     reprocess_sct = T,
     slot = c("scale"),
     batch_feature = NULL,
     pca_var_explained = 0.9,
     optimize_resolution = T,
     target_purity = 0.8,
     step_size =  0.05,
     n_workers = parallel::detectCores(),
     verbose = F)

# get network connectivity plot
plt_connectivity <- SSNConnectivity(so.gene, quantile_threshold = 0.85, raster_dpi = 500)

# visualize
plt_connectivity$plot_edge + labs(title = "Network Connectivity")


# specify pruning threshold [0,1] (low values = less pruning, high values = more pruning)
prune.threshold <- 0.1

# get feature-specific connectivities (wi)
df.wi   <- pruneSSN(object = so.gene,
                    graph = "RNA_snn_power",
                    prune.threshold = prune.threshold,
                    return.df = T)

# visualize
plt.prune <- df.wi %>%
  ggplot(aes(x = wi_l2)) +
  geom_histogram(bins = 30) +
  geom_vline(xintercept = prune.threshold, linetype = "dashed", color = "tomato") +
  labs(x = "Degree (L2 norm)", y = "Count",
       title = "Network Pruning",
       subtitle = paste0(signif(100*sum(df.wi$wi_l2 <=  prune.threshold)/nrow(df.wi), 3),
                         "% (", sum(df.wi$wi_l2 <=  prune.threshold), "/", nrow(df.wi), ") genes pruned" )) +
  theme_miko(grid = T)

print(plt.prune)

# get (pruned) gene module list
mod.list   <- pruneSSN(object = so.gene, graph = "RNA_snn_power", prune.threshold = prune.threshold)
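
The pruning summary shown in the plot subtitle above can be reproduced on toy data. The sketch below assumes, as the subtitle implies, that genes with 'wi_l2' at or below 'prune.threshold' are the ones counted as pruned. The data frame 'df.toy', its 'gene' column, and the scores are hypothetical stand-ins for the output of pruneSSN(..., return.df = T).

```r
# Hypothetical connectivity scores; real values would come from pruneSSN(..., return.df = T)
df.toy <- data.frame(
  gene  = c("g1", "g2", "g3", "g4", "g5"),
  wi_l2 = c(0.02, 0.10, 0.35, 0.08, 0.60)
)

prune.threshold <- 0.1

# Genes at or below the threshold are treated as pruned (assumption)
is.pruned  <- df.toy$wi_l2 <= prune.threshold
n.pruned   <- sum(is.pruned)
pct.pruned <- signif(100 * n.pruned / nrow(df.toy), 3)

# Genes that would remain after pruning
retained <- df.toy$gene[!is.pruned]

cat(pct.pruned, "% (", n.pruned, "/", nrow(df.toy), ") genes pruned\n", sep = "")
```

Raising 'prune.threshold' moves the dashed line in the histogram to the right and increases the pruned share.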


NMikolajewicz/scMiko documentation built on June 28, 2023, 1:41 p.m.