RunscMC: Seurat wrapper for scMC


View source: R/modeling.R

Description

Runs the scMC algorithm using Seurat pipelines.

Usage

RunscMC(
  object.list,
  resolution = NULL,
  method = c("matrix", "igraph"),
  algorithm = 4,
  resRange = NULL,
  nDims.consensus = 30,
  clustering.method = c("hierarchical", "community"),
  graph.name = NULL,
  quantile.cutoff = 0.75,
  similarity.cutoff = 0.6,
  new.assay.name = NULL,
  nDims.scMC = 40,
  lambda = 1,
  integrationFeatures.method = c("joint", "individual"),
  selection.method = c("vst", "mean.var.plot"),
  nfeatures = 2000,
  mean.cutoff = c(0.01, 5),
  dispersion.cutoff = c(0.25, Inf),
  nDims.pca = 40,
  force.pca = TRUE,
  nDims.knn = 40,
  k.param = 20,
  prune.SNN = 1/15,
  features = NULL,
  test.use = "wilcox",
  only.pos = TRUE,
  min.pct = 0.25,
  logfc.threshold = 0.25,
  add.cell.ids = NULL,
  assay = "RNA",
  ...
)
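
A minimal sketch of a call, using only argument names from the signature above. It assumes the scMC and Seurat packages are installed and that `object.list` is already a list of Seurat objects; the helper name `run_scmc_sketch` is hypothetical and the chosen values simply repeat the documented defaults.

```r
# Hypothetical helper (not part of scMC): wraps RunscMC with a few
# arguments spelled out explicitly. All other arguments keep their defaults.
run_scmc_sketch <- function(object.list) {
  if (!requireNamespace("scMC", quietly = TRUE)) {
    stop("The scMC package is required for this sketch.")
  }
  scMC::RunscMC(
    object.list,
    resolution = NULL,        # NULL: resolution is inferred by the function
    similarity.cutoff = 0.6,  # default from the signature
    nDims.scMC = 40,          # default from the signature
    assay = "RNA"
  )
}
```

Per the Value section, the call would return a single Seurat object, e.g. `combined <- run_scmc_sketch(object.list)`.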

Arguments

object.list

a list of Seurat objects, one per dataset

Parameters in identifyClusters

resolution

the resolution in the Leiden algorithm; if it is NULL, the optimal resolution will be inferred from the eigenspectrum

method

Method for running leiden (defaults to matrix which is fast for small datasets). Enable method = "igraph" to avoid casting large data to a dense matrix.

algorithm

Algorithm for modularity optimization (1 = original Louvain algorithm; 2 = Louvain algorithm with multilevel refinement; 3 = SLM algorithm; 4 = Leiden algorithm). Leiden requires the leidenalg Python package.

resRange

the range of resolution values in the Leiden algorithm; if it is NULL, the default range of resolution is 0.1 to 0.5

nDims.consensus

the number of singular values to estimate from the consensus matrix.

clustering.method

method for performing clustering on the consensus matrix from a range of resolutions

graph.name

Name of graph to use for the clustering algorithm

Parameters in identifyConfidentCells

quantile.cutoff

quantile cutoff (default = 0.75)

Parameters in learnTechnicalVariation

similarity.cutoff

a thresholding parameter (T) determining whether cell clusters are shared across different datasets based on their similarity. If T is too small, biological variation may be removed. If T is too large, technical variation may not be removed.

Parameters in integrateData

new.assay.name

Name for the new assay containing the integrated data

nDims.scMC

number of dimensions to compute in the scMC integrated space

lambda

the tuning parameter, non-negative.

Parameters in identifyIntegrationFeatures

integrationFeatures.method

"joint" or "individual". "joint": identifies integration features from the concatenated data matrix. "individual": ranks features by the number of datasets in which they are deemed variable, breaking ties by the median variable feature rank across datasets, and returns the top-scoring features by this ranking.
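
The "individual" ranking described above can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: the toy feature lists and the helper `rank_features_individual` are hypothetical.

```r
# Toy input: variable features per dataset, ordered most variable first.
variable.features <- list(
  d1 = c("A", "B", "C"),
  d2 = c("B", "A", "D"),
  d3 = c("B", "C", "E")
)

# Hypothetical helper: rank by number of datasets, ties by median rank.
rank_features_individual <- function(feature.lists, nfeatures) {
  all.features <- unique(unlist(feature.lists))
  # Number of datasets in which each feature is variable
  n.datasets <- sapply(all.features, function(f) {
    sum(sapply(feature.lists, function(x) f %in% x))
  })
  # Median rank across the datasets where the feature appears
  median.rank <- sapply(all.features, function(f) {
    median(unlist(lapply(feature.lists, function(x) which(x == f))))
  })
  # More datasets first; lower median rank breaks ties
  ord <- order(-n.datasets, median.rank)
  head(all.features[ord], nfeatures)
}

top.features <- rank_features_individual(variable.features, nfeatures = 3)
```

Here "B" is variable in all three datasets and comes first; "A" and "C" are each variable in two, and "A" wins the tie on median rank.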

selection.method

The method used to choose the top variable features:

vst: First, fits a line to the relationship of log(variance) and log(mean) using local polynomial regression (loess). Then standardizes the feature values using the observed mean and expected variance (given by the fitted line). Feature variance is then calculated on the standardized values after clipping to a maximum (see clip.max parameter).

mean.var.plot (mvp): First, uses a function to calculate average expression (mean.function) and dispersion (dispersion.function) for each feature. Next, divides features into num.bin (default 20) bins based on their average expression, and calculates z-scores for dispersion within each bin. The purpose of this is to identify variable features while controlling for the strong relationship between variability and average expression.
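
The binning step of mean.var.plot can be sketched in base R as follows. The data are simulated, and applying mean.cutoff and dispersion.cutoff to the mean and to the within-bin z-score is an assumption about how the cutoffs are used, not something this page states.

```r
set.seed(1)
n.features <- 200
# Simulated per-feature summaries; real values would come from the data.
feature.mean <- runif(n.features, 0, 5)
feature.dispersion <- rexp(n.features) + feature.mean

# Divide features into num.bin bins by average expression
num.bin <- 20
bins <- cut(feature.mean, breaks = num.bin)

# z-score the dispersion within each bin (0 when a bin has < 2 features)
disp.scaled <- ave(feature.dispersion, bins, FUN = function(x) {
  if (length(x) < 2 || sd(x) == 0) return(rep(0, length(x)))
  (x - mean(x)) / sd(x)
})

# Assumed use of the cutoffs from the signature above
mean.cutoff <- c(0.01, 5)
dispersion.cutoff <- c(0.25, Inf)
is.variable <- feature.mean > mean.cutoff[1] & feature.mean < mean.cutoff[2] &
  disp.scaled > dispersion.cutoff[1] & disp.scaled < dispersion.cutoff[2]
```

Scaling within bins means a feature is flagged only if it is more dispersed than other features with similar average expression.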

nfeatures

Number of features to select as top variable features; only used when selection.method is set to 'vst'

mean.cutoff

A two-length numeric vector with low- and high-cutoffs for feature means

dispersion.cutoff

A two-length numeric vector with low- and high-cutoffs for feature dispersions

Parameters in identifyNeighbors

nDims.pca

the number of dimensions to use for running PCA

force.pca

Set force.pca = FALSE to skip the PCA calculation. Default = TRUE will calculate PCA.

nDims.knn

the number of dimensions to use for building SNN

k.param

Defines k for the k-nearest neighbor algorithm

prune.SNN

Sets the cutoff for acceptable Jaccard index when computing the neighborhood overlap for the SNN construction. Any edges with values less than or equal to this will be set to 0 and removed from the SNN graph. Essentially sets the stringency of pruning (0: no pruning, 1: prune everything).
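
To make the pruning rule concrete, here is a small base R sketch that computes the Jaccard overlap between toy k-nearest-neighbor sets and zeroes out weak edges. The neighbor sets are made up, and the cutoff of 0.25 is deliberately larger than the default 1/15 so that pruning is visible on four cells.

```r
# Toy kNN sets (k = 3, each cell includes itself); values are hypothetical.
knn <- list(
  c1 = c("c1", "c2", "c3"),
  c2 = c("c2", "c1", "c3"),
  c3 = c("c3", "c1", "c4"),
  c4 = c("c4", "c5", "c6")
)

# Jaccard index: shared neighbors divided by all neighbors of either cell
jaccard <- function(a, b) length(intersect(a, b)) / length(union(a, b))

cells <- names(knn)
snn <- outer(seq_along(cells), seq_along(cells),
             Vectorize(function(i, j) jaccard(knn[[i]], knn[[j]])))
dimnames(snn) <- list(cells, cells)

# Edges at or below the cutoff are set to 0 (removed from the graph)
prune.cutoff <- 0.25  # illustrative; the documented default is 1/15
snn[snn <= prune.cutoff] <- 0
```

In this toy graph the c3-c4 edge (Jaccard 0.2) is removed, while c1-c3 (0.5) and c1-c2 (1) are kept.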

Parameters in identifyMarkers

features

features used to perform statistical test

test.use

which test to use

only.pos

Only return positive markers

min.pct

Threshold of the percent of cells enriched in one cluster

logfc.threshold

Threshold of Log Fold Change

add.cell.ids

A character vector of length(object.list) when merging multiple objects. Prepends the corresponding value to each object's cell names.
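
The effect of add.cell.ids can be illustrated in base R. The barcodes and ids below are made up, and the underscore separator is an assumption based on how Seurat's merge prefixes cell names.

```r
# Two datasets that share a barcode ("AAAC"); names are hypothetical.
cell.names <- list(c("AAAC", "AAAG"), c("AAAC", "TTTG"))
add.cell.ids <- c("ctrl", "stim")  # one id per object

# Prefix each object's cell names with its id
prefixed <- Map(function(id, cells) paste(id, cells, sep = "_"),
                add.cell.ids, cell.names)
merged.names <- unlist(prefixed, use.names = FALSE)
```

Prefixing keeps cell names unique after merging even when the same barcode occurs in more than one dataset.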

assay

Assay to use

...

other parameters passed to Seurat functions

Value

A Seurat object with the integrated space from scMC


amsszlh/scMC documentation built on Jan. 2, 2021, 1:51 p.m.