run_tfidf: Run TF-IDF on single-cell data

View source: R/run_tfidf.R

run_tfidf R Documentation

Run TF-IDF on single-cell data

Description

Run TF-IDF on single-cell data

Run Term Frequency-Inverse Document Frequency (TF-IDF) analysis on sample metadata to characterise each cluster.
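
For orientation, the following is a minimal base-R sketch of the textbook TF-IDF calculation, treating each cluster as a "document" and each label token as a "term". The toy metadata and all variable names are hypothetical, and the package's internal implementation may differ (for example in weighting or normalisation).

```r
# Toy cell metadata: one label and one cluster per cell (hypothetical values).
meta <- data.frame(
  cluster = c("0", "0", "0", "1", "1", "1"),
  label   = c("excitatory_neuron", "excitatory_neuron", "inhibitory_neuron",
              "astrocyte", "astrocyte_reactive", "neuron_like"),
  stringsAsFactors = FALSE
)

# Split each label into terms and record the cluster each term came from.
tokens <- strsplit(meta$label, split = "[.]|[_]|[-]")
terms <- data.frame(
  cluster = rep(meta$cluster, lengths(tokens)),
  term    = unlist(tokens),
  stringsAsFactors = FALSE
)

# Term counts per cluster (rows = clusters, columns = terms).
counts <- unclass(table(terms$cluster, terms$term))

# TF: fraction of a cluster's terms accounted for by each term.
tf <- counts / rowSums(counts)

# IDF: log(number of clusters / number of clusters containing the term).
idf <- log(nrow(counts) / colSums(counts > 0))

# TF-IDF: multiply each term column by that term's IDF.
tf_idf <- sweep(tf, 2, idf, "*")

# Highest-scoring term for each cluster.
top_terms <- apply(tf_idf, 1, function(x) names(x)[which.max(x)])
print(top_terms)
```

In this toy example "neuron" appears in both clusters, so its IDF is zero and it does not help distinguish either cluster, which is the behaviour TF-IDF is designed to produce.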

Usage

run_tfidf(
  obj = NULL,
  reduction = "UMAP",
  label_var = "label",
  cluster_var = "seurat_clusters",
  replace_regex = "[.]|[_]|[-]",
  terms_per_cluster = 3,
  force_new = FALSE,
  return_all_results = FALSE,
  verbose = TRUE
)

Arguments

obj

Single-cell data object.

reduction

Name of the reduction to use (case insensitive).

label_var

Which cell metadata column to input to NLP analysis.

cluster_var

Which cell metadata column to use to identify which cluster each cell is assigned to.

replace_regex

Regular expression matching the characters by which to split label_var into terms (i.e. tokens) for NLP enrichment analysis. The default splits on periods, underscores and hyphens.

terms_per_cluster

The maximum number of words to return per cluster.

force_new

If NLP results are already detected in the metadata, set force_new=TRUE to replace them with new results.

return_all_results

Whether to return just the obj with updated metadata (FALSE), or all intermediate results (TRUE).

verbose

Whether to print messages.
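
As a rough illustration of what the default replace_regex pattern matches, the sketch below splits two made-up labels with base R's strsplit(). The labels are hypothetical, and this is only an approximation of the tokenisation step, not necessarily how the package applies the pattern internally.

```r
# Default pattern from the Usage section: period, underscore or hyphen.
replace_regex <- "[.]|[_]|[-]"

# Hypothetical cell labels, as they might appear in label_var.
labels <- c("Excitatory_neurons.L2-3", "Oligodendrocyte_precursor")

# Split each label wherever the pattern matches.
tokens <- strsplit(labels, split = replace_regex)
print(tokens)
```

Each label yields one character vector of terms; a custom pattern (for example one that also splits on spaces) can be supplied in the same way.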

Examples

 
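library(scNLP)

# Load the pseudo_seurat example data
data("pseudo_seurat")
# Characterise each cluster using terms from the "celltype" labels
obj2 <- run_tfidf(obj = pseudo_seurat,
                  cluster_var = "cluster",
                  label_var = "celltype")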

neurogenomics/scNLP documentation built on Oct. 8, 2024, 5:30 p.m.