treecor_harmony: Harmony Integration

View source: R/treecor_harmony.R

treecor_harmony    R Documentation

Harmony Integration

Description

This function was developed using 'Seurat v3.2.2'. It takes a raw count gene expression matrix and a sample metadata data frame, and performs Harmony integration.

Usage

treecor_harmony(
  count,
  sample_meta,
  output_dir,
  cell_meta = NULL,
  num_PCs = 20,
  num_harmony = 20,
  num_features = 2000,
  min_cells = 0,
  min_features = 0,
  pct_mito_cutoff = 20,
  exclude_genes = NULL,
  vars_to_regress = c("sample"),
  resolution = 0.5,
  verbose = T
)

Arguments

count

A raw count gene expression matrix with genes in rows and cells in columns. Each cell barcode must use ':' to separate the sample name from the barcode (i.e. "sample:barcode").

sample_meta

Sample metadata. Must contain a column named 'sample'.

output_dir

Output directory

cell_meta

A data frame that contains both a 'barcode' column (cell barcode) and a 'sample' column (the sample each cell belongs to). By default, the sample information is read from the cell barcode, which is expected in "sample:barcode" format. If your barcodes are not in this format, you should specify this parameter.

num_PCs

Number of PCs used in integration (by default: 20)

num_harmony

Number of Harmony embedding dimensions used in integration (by default: 20)

num_features

Number of features used in integration (by default: 2000)

min_cells

Include features detected in at least this many cells (by default: 0). Same as 'min.cells' parameter in 'CreateSeuratObject()' function from 'Seurat' package.

min_features

Include cells where at least this many features are detected (by default: 0). Same as 'min.features' parameter in 'CreateSeuratObject()' function from 'Seurat' package.

pct_mito_cutoff

Include cells whose percentage of mitochondrial counts is less than this value (by default: 20). Ranges from 0 to 100. Used as a QC metric to subset the count matrix. Genes whose names start with 'MT-' are treated as mitochondrial genes.

exclude_genes

Additional genes to be excluded from integration. Will subset the count matrix.

vars_to_regress

Variables to be regressed out during Harmony integration (by default: 'sample'). Same as 'group.by.vars' in 'RunHarmony()' function from 'harmony' package.

resolution

Clustering resolution (by default: 0.5). A higher value yields more cell clusters; a lower value yields fewer.

verbose

Whether to show progress messages (by default: TRUE)
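
The filtering arguments above ('min_cells', 'min_features', 'pct_mito_cutoff' and 'exclude_genes') can be illustrated on a toy matrix in base R. This is only a sketch of what the thresholds mean, not the package's implementation: the toy data and every variable name are made up for illustration, the filtering inside 'treecor_harmony()' is presumably delegated to 'Seurat', and the order in which the filters are applied there may differ.

```r
# Toy count matrix: 4 genes x 3 cells (values are invented for illustration).
toy_count <- matrix(
  c(8, 2, 5,
    1, 0, 5,
    0, 0, 0,
    1, 8, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    c("GENE1", "GENE2", "GENE3", "MT-CO1"),
    c("s1:AAAC", "s1:AAAG", "s2:AAAC")
  )
)

# Thresholds named after the documented arguments (values chosen for the toy data).
min_cells       <- 1
min_features    <- 2
pct_mito_cutoff <- 20
exclude_genes   <- c("GENE2")

# Percentage of counts per cell that come from genes starting with 'MT-'.
mito_genes <- grepl("^MT-", rownames(toy_count))
pct_mito   <- 100 * colSums(toy_count[mito_genes, , drop = FALSE]) / colSums(toy_count)

# Number of detected features per cell and number of cells per feature.
n_features_per_cell <- colSums(toy_count > 0)
n_cells_per_gene    <- rowSums(toy_count > 0)

# Keep cells with enough detected features and a mitochondrial percentage below the cutoff.
keep_cells <- n_features_per_cell >= min_features & pct_mito < pct_mito_cutoff

# Keep genes detected in enough cells that are not explicitly excluded.
keep_genes <- n_cells_per_gene >= min_cells &
  !(rownames(toy_count) %in% exclude_genes)

filtered <- toy_count[keep_genes, keep_cells, drop = FALSE]
print(filtered)
```

In this toy case the second cell is dropped for its high mitochondrial percentage, 'GENE3' is dropped because no cell expresses it, and 'GENE2' is dropped because it is listed in 'exclude_genes'.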

Value

A Seurat object

Author(s)

Boyang Zhang <bzhang34@jhu.edu>, Hongkai Ji

Examples

# default setting
treecor_harmony(count, sample_meta, output_dir)
# additionally regress out 'study' ID
treecor_harmony(count, sample_meta, output_dir, vars_to_regress = c('sample', 'study'))
# increase clustering resolution (yields more refined cell clusters)
treecor_harmony(count, sample_meta, output_dir, resolution = 0.8)
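
The examples above assume that 'count', 'sample_meta' and 'output_dir' already exist. A minimal sketch of preparing such inputs in base R might look as follows. The barcodes, sample names and object names are hypothetical, and splitting on the first ':' is an assumption about how the "sample:barcode" format is read. The final call is left commented out because it requires the package and real data.

```r
# Hypothetical barcodes that do NOT follow the "sample:barcode" format.
barcodes <- c("AAAC-1", "AAAG-1", "AAAC-2")
samples  <- c("s1", "s1", "s2")

# Toy raw count matrix: genes in rows, cells in columns.
count <- matrix(
  c(3, 0, 1,
    0, 2, 4),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("GENE1", "GENE2"), barcodes)
)

# Option 1: rename the columns so that they follow "sample:barcode".
count_renamed <- count
colnames(count_renamed) <- paste(samples, barcodes, sep = ":")

# Option 2: keep the barcodes as they are and describe them in 'cell_meta'.
cell_meta <- data.frame(
  barcode = barcodes,
  sample  = samples,
  stringsAsFactors = FALSE
)

# Sample metadata must contain a column named 'sample'.
sample_meta <- data.frame(
  sample = unique(samples),
  stringsAsFactors = FALSE
)

# Sanity checks: every cell should map to a sample listed in 'sample_meta'.
parsed_sample <- sub(":.*$", "", colnames(count_renamed))
stopifnot(all(parsed_sample %in% sample_meta$sample))
stopifnot(all(cell_meta$sample %in% sample_meta$sample))

# With the package installed, the inputs could then be passed on, e.g.:
# treecor_harmony(count, sample_meta, output_dir = tempdir(), cell_meta = cell_meta)
```

Either option should be enough on its own: use the renamed matrix without 'cell_meta', or the original matrix together with 'cell_meta'.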

byzhang23/TreeCorTreat documentation built on May 7, 2024, 8:37 a.m.