RunSeurat: Run Seurat Pipeline

View source: R/RunSeurat.R

Description

This function runs the steps of the Seurat pipeline on 10X data or an existing Seurat object, as controlled by the arguments below.

Usage

RunSeurat(
  data.dir = getwd(),
  object = NULL,
  output.dir = getwd(),
  min.features = 200,
  min.cells = 3,
  project.name = "project_name",
  mt.pattern = "^mt-",
  max.percent.mt = 15,
  max.features = NULL,
  max.nCount = NULL,
  sctransform = FALSE,
  logtransform = TRUE,
  no.plot = FALSE,
  vars.to.regress = c("percent.mt", "nCount_RNA"),
  ndims = 50,
  dims = 1:35,
  resolution = seq(0.1, 1, 0.1),
  cellcycle = TRUE,
  genes.FeaturePlot = NULL,
  genes.DotPlot = NULL,
  find.all.markers = TRUE,
  only.pos = TRUE,
  min.pct = 0.25,
  logfc.threshold = 0.25,
  test.use = "MAST",
  save.rds = TRUE,
  integrated.assay = FALSE,
  tf.activity = FALSE,
  species = "mouse",
  dorothea.confidence = c("A", "B", "C"),
  ...
)
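
Two of the defaults in the signature are ordinary R expressions rather than single values. As a quick base R illustration (not part of the package), this is what `resolution` and `dims` evaluate to:

```r
# Default clustering resolutions: 0.1, 0.2, ..., 1.0
resolution <- seq(0.1, 1, 0.1)

# Default dimensions used as input: 1 through 35
dims <- 1:35

length(resolution)  # number of resolution values in the default
range(dims)         # first and last dimension in the default
```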

Arguments

data.dir

Directory containing the matrix.mtx, genes.tsv (or features.tsv), and barcodes.tsv files provided by 10X. A vector or named vector can be given in order to load several data directories. If a named vector is given, the cell barcode names will be prefixed with the name.

object

A Seurat object

output.dir

Path to the destination folder of saved files

min.features

Include cells where at least this many features are detected.

min.cells

Include features detected in at least this many cells. Will subset the counts matrix as well. To reintroduce excluded features, create a new object with a lower cutoff.

project.name

Name of the project/object used for titles in plots

mt.pattern

Regex pattern matching mitochondrial gene names ('^MT-' for human or '^mt-' for mouse)

max.percent.mt

Maximum percentage of mitochondrial counts allowed per cell (default set to 15)

max.features

Maximum number of genes per cell (if NULL, defaults to the 99th percentile)

max.nCount

Maximum number of reads per cell (if NULL, defaults to the 99th percentile)

sctransform

If set, use SCTransform normalization

logtransform

Run the default log-normalization from Seurat

no.plot

If set, run the pipeline without saving the plots

vars.to.regress

Variables to regress out in a second non-regularized linear regression, for example percent.mt. Default is c("percent.mt", "nCount_RNA").

ndims

Number of dimensions to plot standard deviation for

dims

Which dimensions to use as input features, used only if features is NULL

resolution

Value(s) of the resolution parameter. Use a value above (below) 1.0 if you want to obtain a larger (smaller) number of communities. The default is a sequence from 0.1 to 1 in steps of 0.1.

cellcycle

Run CellCycle Scoring from Seurat

genes.FeaturePlot

A list of genes for Seurat FeaturePlot

genes.DotPlot

A list of genes for Seurat DotPlot

find.all.markers

Run Differential Gene Expression

only.pos

Only return positive markers (TRUE by default)

min.pct

Only test genes that are detected in a minimum fraction of min.pct cells in either of the two populations. Meant to speed up the function by not testing genes that are very infrequently expressed. Default is 0.25.

logfc.threshold

Limit testing to genes which show, on average, at least X-fold difference (log-scale) between the two groups of cells. Default is 0.25. Increasing logfc.threshold speeds up the function, but can miss weaker signals.

test.use

Denotes which test to use (default in RunSeurat: "MAST"). Available options are:

  • "wilcox" : Identifies differentially expressed genes between two groups of cells using a Wilcoxon Rank Sum test

  • "bimod" : Likelihood-ratio test for single cell gene expression (McDavid et al., Bioinformatics, 2013)

  • "roc" : Identifies 'markers' of gene expression using ROC analysis. For each gene, evaluates (using AUC) a classifier built on that gene alone, to classify between two groups of cells. An AUC value of 1 means that expression values for this gene alone can perfectly classify the two groupings (i.e. each of the cells in cells.1 exhibits a higher level than each of the cells in cells.2). An AUC value of 0 also means there is perfect classification, but in the other direction. A value of 0.5 implies that the gene has no predictive power to classify the two groups. Returns a 'predictive power' (abs(AUC-0.5) * 2) ranked matrix of putative differentially expressed genes.

  • "t" : Identify differentially expressed genes between two groups of cells using the Student's t-test.

  • "negbinom" : Identifies differentially expressed genes between two groups of cells using a negative binomial generalized linear model. Use only for UMI-based datasets

  • "poisson" : Identifies differentially expressed genes between two groups of cells using a Poisson generalized linear model. Use only for UMI-based datasets

  • "LR" : Uses a logistic regression framework to determine differentially expressed genes. Constructs a logistic regression model predicting group membership based on each feature individually and compares this to a null model with a likelihood ratio test.

  • "MAST" : Identifies differentially expressed genes between two groups of cells using a hurdle model tailored to scRNA-seq data. Utilizes the MAST package to run the DE testing.

  • "DESeq2" : Identifies differentially expressed genes between two groups of cells based on a model using DESeq2, which uses a negative binomial distribution (Love et al., Genome Biology, 2014). This test does not support pre-filtering of genes based on average difference (or percent detection rate) between cell groups. However, genes may be pre-filtered based on their minimum detection rate (min.pct) across both cell groups. To use this method, please install DESeq2, using the instructions at https://bioconductor.org/packages/release/bioc/html/DESeq2.html
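
The 'predictive power' formula given for "roc" can be checked on toy numbers. The sketch below is base R only; `auc_pairwise` and `predictive_power` are illustrative helper names, not functions from Seurat or this package, and the pairwise AUC is a textbook computation rather than Seurat's implementation.

```r
# Predictive power as described for test.use = "roc": abs(AUC - 0.5) * 2
predictive_power <- function(auc) abs(auc - 0.5) * 2

# AUC by pairwise comparison (illustrative helper): the fraction of
# (cells.1, cells.2) pairs where the cells.1 value is higher,
# counting ties as one half.
auc_pairwise <- function(x1, x2) {
  cmp <- outer(x1, x2, FUN = function(a, b) (a > b) + 0.5 * (a == b))
  mean(cmp)
}

cells.1 <- c(5, 6, 7)  # toy expression values, group 1
cells.2 <- c(1, 2, 3)  # toy expression values, group 2

# Every cells.1 value exceeds every cells.2 value, so the AUC is 1.
auc <- auc_pairwise(cells.1, cells.2)

# AUC of 1 and AUC of 0 both give full predictive power; 0.5 gives none.
power <- predictive_power(c(auc, 0, 0.5))
```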

save.rds

Save final Seurat object in RDS format (default set to TRUE)

integrated.assay

If set, run the pipeline on the integrated assay

tf.activity

Run Seurat pipeline on TF activity

species

Species of the data: "mouse" or "human"

dorothea.confidence

Confidence levels of regulons

...

Arguments passed to as.sparse
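
The QC arguments above (mt.pattern, max.percent.mt, max.features) can be illustrated on toy data. This is a minimal base R sketch, not the package's code: the counts are made up, the `<=` comparison is an assumption about how the threshold is applied, and the 99th percentile is assumed to be computed as `quantile(x, 0.99)`.

```r
# Toy gene names and a toy counts matrix (genes x cells); values are made up.
genes <- c("mt-Nd1", "mt-Co1", "Actb", "Gapdh")
counts <- matrix(c(1, 1, 4, 4,
                   5, 5, 5, 5,
                   0, 1, 9, 10),
                 nrow = 4,
                 dimnames = list(genes, c("cell1", "cell2", "cell3")))

# Genes matching the default mouse pattern; "^MT-" would be used for human.
is.mt <- grepl("^mt-", genes)

# Percentage of each cell's counts that come from mitochondrial genes.
percent.mt <- 100 * colSums(counts[is.mt, , drop = FALSE]) / colSums(counts)

# Cells passing a max.percent.mt = 15 filter (assumed to be inclusive).
keep <- percent.mt <= 15

# A 99th-percentile cutoff on toy per-cell feature counts, as an
# assumed reading of the "99-quantile" default for max.features.
nFeature <- c(500, 800, 1200, 3000, 9000)
cutoff <- quantile(nFeature, 0.99)
```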

Value

A processed Seurat Object along with QC and visualization plots
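
A call might look like the sketch below. It assumes the package is installed under the name scEasyPip (taken from the footer of this page) and exports RunSeurat; the sample names and directory paths are placeholders, so the call is guarded and only runs when both the package and the directories exist.

```r
# Arguments for a hypothetical two-sample mouse run; paths are placeholders.
run.args <- list(
  data.dir = c(ctrl = "path/to/ctrl", treated = "path/to/treated"),
  output.dir = tempdir(),
  project.name = "demo",
  mt.pattern = "^mt-",
  max.percent.mt = 15,
  resolution = c(0.4, 0.8),
  test.use = "wilcox"
)

# Only run when the package is installed and the placeholder paths exist.
if (requireNamespace("scEasyPip", quietly = TRUE) &&
    all(dir.exists(run.args$data.dir))) {
  seu <- do.call(scEasyPip::RunSeurat, run.args)
}
```

Because data.dir is a named vector here, the cell barcodes from each directory would be prefixed with "ctrl" or "treated", as described under data.dir.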


Theob0t/scEasyPip documentation built on Dec. 18, 2021, 4:10 p.m.