runScAnnotation: runScAnnotation
In wguo-research/scCancer: A package for automated processing of single cell RNA-seq data in cancer

runScAnnotation

R Documentation

runScAnnotation

Description

Performs cell and gene quality control according to the results of 'runScStatistics'. Uses the R package Seurat to perform basic operations (normalization, log-transformation, highly variable gene identification, removal of unwanted variance, scaling, centering, dimension reduction, clustering, and differential expression analysis). Also performs several cancer-specific analyses: cancer micro-environmental cell type classification, cell malignancy estimation, cell cycle analysis, cell stemness analysis, gene set signature analysis, expression program identification, and so on.

Usage

runScAnnotation(
  dataPath,
  statPath,
  savePath = NULL,
  authorName = NULL,
  sampleName = "sc",
  bool.filter.cell = T,
  bool.filter.gene = T,
  anno.filter = c("mitochondrial", "ribosome", "dissociation"),
  nCell.min = 3,
  bgPercent.max = 1,
  bool.rmContamination = F,
  vars.add.meta = c("mito.percent", "ribo.percent", "diss.percent"),
  vars.to.regress = c("nCount_RNA", "mito.percent", "ribo.percent"),
  pc.use = 30,
  resolution = 0.8,
  clusterStashName = "default",
  show.features = NULL,
  bool.add.features = T,
  bool.runDiffExpr = T,
  n.markers = 5,
  species = "human",
  genome = "hg19",
  hg.mm.mix = F,
  bool.runDoublet = T,
  doublet.method = "bcds",
  bool.runCellClassify = T,
  ct.templates = NULL,
  coor.names = c("tSNE_1", "tSNE_2"),
  bool.runMalignancy = T,
  cnv.ref.data = NULL,
  cnv.referAdjMat = NULL,
  cutoff = 0.1,
  p.value.cutoff = 0.5,
  bool.intraTumor = T,
  bool.runCellCycle = T,
  bool.runStemness = T,
  bool.runGeneSets = T,
  geneSets = NULL,
  geneSet.method = "average",
  bool.runExprProgram = T,
  nmf.rank = 50,
  bool.runInteraction = T,
  genReport = T
)
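As a rough illustration of the signature above, the following sketch collects a few arguments and calls the function only when the package and input folder are available. The paths and the sample name are hypothetical placeholders, not values from the package, and 'runScStatistics' is assumed to have already been run into `statPath`.

```r
# Hypothetical inputs: replace with your own Cell Ranger output folder and
# the folder holding the 'runScStatistics' results.
anno.args <- list(
  dataPath       = "data/sample1",      # hypothetical path
  statPath       = "results/sample1",   # hypothetical path
  savePath       = "results/sample1",   # hypothetical path
  sampleName     = "sample1",           # hypothetical label
  species        = "human",
  genome         = "hg19",
  geneSet.method = "average"
)

# Guard the call so the sketch does nothing when the package or data are missing.
can.run <- requireNamespace("scCancer", quietly = TRUE) &&
  dir.exists(anno.args$dataPath)

if (can.run) {
  # All arguments not listed above keep the defaults shown in the usage.
  anno.results <- do.call(scCancer::runScAnnotation, anno.args)
}
```

Keeping the arguments in a list makes it easy to reuse the same settings across several samples by changing only the paths and `sampleName`.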

Arguments

`dataPath`	A path containing the Cell Ranger processed data. This path generally contains the folders 'filtered_feature_bc_matrix' and 'raw_feature_bc_matrix'.
`statPath`	A path containing the result files of the 'runScStatistics' step.
`savePath`	A path where the result files are saved. If NULL, 'statPath' is used instead.
`authorName`	A character string giving the author's name, which is shown in the report.
`sampleName`	A character string giving a label for this sample.
`bool.filter.cell`	A logical value indicating whether to filter the cells according to the QC of 'scStatistics'.
`bool.filter.gene`	A logical value indicating whether to filter the genes according to the QC of 'scStatistics'.
`anno.filter`	A vector indicating the types of genes to be filtered. Must be a subset of c("mitochondrial", "ribosome", "dissociation") (the default), or NULL.
`nCell.min`	An integer used to filter genes. The default is 3. Genes expressed in fewer cells than this threshold will be filtered out.
`bgPercent.max`	A float number used to filter genes. The default is 1 (no filtering). Genes with a background percentage larger than this threshold will be filtered out.
`bool.rmContamination`	A logical value indicating whether to remove ambient RNA contamination based on 'SoupX'.
`vars.add.meta`	A vector indicating the variables to be added to Seurat object's meta.data. The default is c("mito.percent", "ribo.percent", "diss.percent").
`vars.to.regress`	A vector indicating the variables to regress out in R package Seurat. The default is c("nCount_RNA", "mito.percent", "ribo.percent").
`pc.use`	An integer number indicating the number of PCs to use as input features. The default is 30.
`resolution`	A float number used in function 'FindClusters' in Seurat. The default is 0.8.
`clusterStashName`	A character string used as the name of the cluster identities. The default is "default".
`show.features`	A list or vector of genes to be plotted in 'markerPlot'.
`bool.add.features`	A logical value indicating whether to add default features to 'show.features' or not.
`bool.runDiffExpr`	A logical value indicating whether to perform differential expression analysis.
`n.markers`	An integer indicating the number of differentially expressed genes shown in the plot. The default is 5.
`species`	A character string indicating which species the sample belongs to. Only "human" (default) or "mouse" are allowed.
`genome`	A character string indicating the version of the reference gene annotation information. This information is mainly used to infer the CNV profile and estimate malignancy. Only "hg19" (default) or "hg38" are allowed for the "human" species, and only "mm10" is allowed for the "mouse" species.
`hg.mm.mix`	A logical value indicating whether the sample is a mix of human cells and mouse cells (such as a PDX sample). If TRUE, the arguments 'hg.mm.thres' and 'mix.anno' should be set to corresponding values.
`bool.runDoublet`	A logical value indicating whether to estimate doublet scores.
`doublet.method`	The method used to estimate doublet scores. The default is "bcds". "cxds" (co-expression based doublet scoring) and "bcds" (binary classification based doublet scoring) are allowed. These methods are from the R package "scds".
`bool.runCellClassify`	A logical value indicating whether to predict common cell types. The default is TRUE.
`ct.templates`	A list of vectors of several cell type templates. The default is NULL and the templates prepared in this package will be used.
`coor.names`	A vector indicating the names of the two-dimensional coordinates used in visualization.
`bool.runMalignancy`	A logical value indicating whether to estimate malignancy.
`cnv.ref.data`	A gene-by-cell expression matrix, used as the normal reference when estimating malignancy. The default is NULL, in which case an expression matrix of immune cells (for human) or bone marrow cells (for mouse) will be used.
`cnv.referAdjMat`	An adjacency matrix for the normal reference data. The larger the value, the closer the cell pair is. The default is NULL, in which case an SNN matrix of the default ref.data will be used.
`cutoff`	A threshold used in the CNV inference.
`p.value.cutoff`	A threshold used to decide whether the bimodal distribution of the malignancy score is significant.
`bool.intraTumor`	A logical value indicating whether to use the identified tumor clusters to perform the subsequent intra-tumor heterogeneity analyses.
`bool.runCellCycle`	A logical value indicating whether to estimate cell cycle scores.
`bool.runStemness`	A logical value indicating whether to estimate stemness scores.
`bool.runGeneSets`	A logical value indicating whether to estimate gene sets signature scores.
`geneSets`	A list of gene sets to be analyzed. The default is NULL and 50 hallmark gene sets from MSigDB will be used.
`geneSet.method`	The method used to calculate gene set scores. Currently, only "average" and "GSVA" are allowed.
`bool.runExprProgram`	A logical value indicating whether to run non-negative matrix factorization (NMF) to identify expression programs.
`nmf.rank`	An integer of decomposition rank used in NMF.
`bool.runInteraction`	A logical value indicating whether to run cell set ligand-receptor interaction analysis.
`genReport`	A logical value indicating whether to generate a .html/.md report (setting this to TRUE is recommended).
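Several arguments above accept only specific values. A minimal sketch of a pre-flight check for those documented constraints could look like the following. The helper `checkAnnotationArgs` is hypothetical and not part of scCancer; it mirrors only the rules stated in this argument list and makes no claim about how the package validates its inputs internally.

```r
# Hypothetical helper (not part of scCancer): checks a few constraints
# documented in the argument list before a long analysis is started.
checkAnnotationArgs <- function(species = "human",
                                genome = "hg19",
                                anno.filter = c("mitochondrial", "ribosome",
                                                "dissociation"),
                                doublet.method = "bcds",
                                geneSet.method = "average",
                                statPath = NULL,
                                savePath = NULL) {
  problems <- character(0)

  # Allowed genome versions per species, as documented for 'genome'
  allowed.genomes <- list(human = c("hg19", "hg38"), mouse = "mm10")
  if (!species %in% names(allowed.genomes)) {
    problems <- c(problems,
                  sprintf("species '%s' is not 'human' or 'mouse'", species))
  } else if (!genome %in% allowed.genomes[[species]]) {
    problems <- c(problems,
                  sprintf("genome '%s' is not allowed for species '%s'",
                          genome, species))
  }

  # 'anno.filter' must be a subset of the three gene types, or NULL
  allowed.filters <- c("mitochondrial", "ribosome", "dissociation")
  bad.filters <- setdiff(anno.filter, allowed.filters)
  if (length(bad.filters) > 0) {
    problems <- c(problems,
                  sprintf("unknown anno.filter value(s): %s",
                          paste(bad.filters, collapse = ", ")))
  }

  if (!doublet.method %in% c("cxds", "bcds")) {
    problems <- c(problems, "doublet.method must be 'cxds' or 'bcds'")
  }
  if (!geneSet.method %in% c("average", "GSVA")) {
    problems <- c(problems, "geneSet.method must be 'average' or 'GSVA'")
  }

  # Documented fallback: a NULL 'savePath' means 'statPath' is used
  if (is.null(savePath)) {
    savePath <- statPath
  }

  list(ok = length(problems) == 0, problems = problems, savePath = savePath)
}

# Example: a mouse sample paired with a human genome version is flagged
check <- checkAnnotationArgs(species = "mouse", genome = "hg19")
print(check$problems)
```

Returning the collected problems instead of stopping at the first one lets all mismatches be fixed in a single pass.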

Value

A list of results containing the useful objects generated by the function.

wguo-research/scCancer documentation built on May 26, 2024, 9:12 p.m.
