get_pcs: Get PCs
In marcalva/diem: Debris-Containing Droplet Identification using EM

get_pcs

R Documentation

Get PCs

Description

Run PCA on the test set and return the top n_pcs PCs. The PCs are used as the features for the initial k-means clustering. Only droplets with at least min_genes genes detected are used in the PCA, and thus in the initialization. The counts for the test set are count-normalized to the median and log-transformed. The top n_var_genes variable genes are then identified with the function get_var_genes, and PCA is run on the normalized counts for these variable genes only.
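The steps described above can be sketched in base R. This is a rough illustration on simulated counts, not the package's implementation: the simulated matrix, the variable names, the exact form of the median normalization (scaling each droplet to the median total count, then `log1p`), and the loess fit of log variance on log mean are all assumptions, and `prcomp` stands in for the irlba PCA that get_pcs uses.

```r
# Hypothetical stand-in data: 300 genes x 100 droplets with varying depth.
set.seed(1)
n_genes <- 300
n_drop <- 100
gene_rate <- rgamma(n_genes, shape = 0.5, rate = 0.5)
depth <- runif(n_drop, min = 0.2, max = 2)
counts <- matrix(rpois(n_genes * n_drop, lambda = outer(gene_rate, depth)),
                 nrow = n_genes,
                 dimnames = list(paste0("gene", seq_len(n_genes)),
                                 paste0("drop", seq_len(n_drop))))

# Illustrative parameter values (smaller than the defaults to fit the toy data).
min_genes <- 50
n_var_genes <- 100
lss <- 0.3
n_pcs <- 10

# 1. Keep droplets with at least min_genes genes detected.
keep <- colSums(counts > 0) >= min_genes
x <- counts[, keep, drop = FALSE]

# 2. Scale each droplet to the median total count, then log-transform
#    (assumed form of "count-normalized to the median").
totals <- colSums(x)
lognorm <- log1p(sweep(x, 2, totals / median(totals), "/"))

# 3. Rank genes by variance after regressing out mean expression with loess.
gene_mean <- rowMeans(lognorm)
gene_var <- apply(lognorm, 1, var)
ok <- gene_mean > 0 & gene_var > 0
df <- data.frame(lv = log(gene_var[ok]), lm = log(gene_mean[ok]))
fit <- loess(lv ~ lm, data = df, span = lss)
res <- residuals(fit)
names(res) <- rownames(lognorm)[ok]
n_top <- min(n_var_genes, length(res))
var_genes <- names(sort(res, decreasing = TRUE))[seq_len(n_top)]

# 4. PCA on the variable genes only (droplets as rows); keep the top n_pcs.
pca <- prcomp(t(lognorm[var_genes, , drop = FALSE]), center = TRUE,
              rank. = n_pcs)
pcs <- pca$x
dim(pcs)
```

The resulting `pcs` matrix has one row per retained droplet and one column per PC, which is the shape a k-means initialization would take as its feature matrix.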

Usage

get_pcs(x, droplets.use = NULL, min_genes = 200, n_var_genes = 2000,
  lss = 0.3, threads = 1, n_pcs = 30, seedn = 1)

Arguments

`x`	An SCE object.
`droplets.use`	Specify droplets to calculate PCs for.
`min_genes`	Calculate PCs from droplets with at least this many genes detected.
`n_var_genes`	Number of top variable genes to use for PCA.
`lss`	The span parameter of the loess regression, passed as the `span` argument of the function `loess`. The loess regression is used to regress out the effect of mean expression on variance.
`threads`	Number of threads for parallel execution. Default is 1.
`n_pcs`	Number of PCs to return.
`seedn`	The seed to set for irlba PCA calculation. It is set to 1 for reproducibility but can be set to NULL for a random initialization.
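The effect of a fixed versus NULL seed can be illustrated with a generic pattern. The helper `random_start` below is hypothetical and is not part of diem or irlba; it only assumes that a non-NULL `seedn` is applied with `set.seed` before a random starting vector is drawn.

```r
# Hypothetical helper: draw a random starting vector, optionally seeded.
random_start <- function(n, seedn = 1) {
  # A non-NULL seed fixes the random number stream; NULL leaves it as is.
  if (!is.null(seedn)) set.seed(seedn)
  rnorm(n)
}

# The same seed gives the same starting vector on every call.
a <- random_start(5, seedn = 1)
b <- random_start(5, seedn = 1)
same_with_seed <- identical(a, b)

# With seedn = NULL, successive calls draw different starting vectors.
c1 <- random_start(5, seedn = NULL)
c2 <- random_start(5, seedn = NULL)
same_without_seed <- identical(c1, c2)
```

Under this assumption, repeated runs with the default seed start the PCA computation from the same point, while `seedn = NULL` lets the start vary from run to run.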

Value

An SCE object containing the PCs.

Examples



# Get PCs with default parameters
sce <- get_pcs(sce)

# Calculate PCs from droplets that have at least 150 genes
# detected
sce <- get_pcs(sce, min_genes = 150)

# Using top 3,000 variable genes
sce <- get_pcs(sce, n_var_genes = 3000)

# Use top 50 PCs for initialization
sce <- get_pcs(sce, n_pcs = 50)

# Return PCs from random irlba initializations (each call uses a
# different random start)
sce <- get_pcs(sce, seedn = NULL)
sce <- get_pcs(sce, seedn = NULL)
sce <- get_pcs(sce, seedn = NULL)

marcalva/diem documentation built on Jan. 1, 2023, 2:33 a.m.