filterTrajFeaturesByFF: Filter features by Fano Factor


Description

Filters trajectory features that exhibit a significantly high Fano factor (index of dispersion), taking their average expression levels into account.

Usage

filterTrajFeaturesByFF(
  sce,
  threshold = 1.7,
  min_expr = 0,
  design = NULL,
  show_plot = TRUE
)

Arguments

sce

A SingleCellExperiment object

threshold

A Z-score cutoff (default: 1.7)

min_expr

Minimum average expression of a feature to be considered (default: 0)

design

A numeric matrix describing the factors that should be blocked (default: NULL)

show_plot

Indicates if plot should be shown (default: TRUE)

Details

To identify the most variable features, an unsupervised strategy is applied that controls for the relationship between a feature's average expression intensity and its expression variability. Features are placed into 20 bins based on their mean expression. For each bin, the distribution of Fano factors (a windowed version of the index of dispersion, IOD = variance / mean) is computed and standardized (Z-score(x) = (x - mean(x)) / sd(x)). Features with a Z-score greater than threshold remain labeled as trajectory features in the SingleCellExperiment object. The parameter min_expr defines the minimum average expression level a feature must have to be considered by this filter. Please note that spike-in controls are ignored and are not listed as trajectory features.
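The binning and standardization described above can be sketched in base R. This is a minimal illustration, not the CellTrails implementation: the toy matrix `x`, the use of equal-size rank-based bins, and all variable names are assumptions made for the example, and the min_expr and spike-in handling are omitted.

```r
# Illustrative sketch only; NOT the CellTrails implementation.
set.seed(1)

# Hypothetical toy data: 200 features x 50 samples, strictly positive values
x <- matrix(rexp(200 * 50, rate = 0.5), nrow = 200)
rownames(x) <- paste0("feature", seq_len(nrow(x)))

# Per-feature mean and Fano factor (variance / mean)
mu <- rowMeans(x)
ff <- apply(x, 1, var) / mu

# Place features into 20 bins by mean expression
# (equal-size bins via ranks are an assumption of this sketch)
bins <- cut(rank(mu, ties.method = "first"), breaks = 20, labels = FALSE)

# Standardize the Fano factor within each bin: (x - mean(x)) / sd(x)
z <- ave(ff, bins, FUN = function(v) (v - mean(v)) / sd(v))

# Keep features whose Z-score exceeds the cutoff
threshold <- 1.7
selected <- rownames(x)[z > threshold]
```

Standardizing within bins means a feature is compared only against features of similar mean expression, which is what removes the mean-variance dependence from the selection.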

To account for systematic bias in the expression data (e.g., cell cycle effects), a design matrix can be provided for the learning process. It should list the factors that should be blocked and their values per sample. It is suggested to construct a design matrix with model.matrix.
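As a hedged sketch of how such a design matrix might be built with model.matrix: the `phase` annotation below is hypothetical, and whether the function expects an intercept column is not stated on this page, so the formula is an assumption.

```r
# Hypothetical per-sample annotation: one cell cycle phase label per sample
set.seed(1101)
phase <- factor(sample(c("G1", "S", "G2M"), size = 100, replace = TRUE))

# One row per sample; an intercept column plus indicator columns
# for the non-reference phase levels
design <- model.matrix(~ phase)
dim(design)

# Assuming `sce` holds the same 100 samples in the same order, the matrix
# could then be passed through the `design` argument:
# trajFeatureNames(sce) <- filterTrajFeaturesByFF(sce, threshold = 1.7, design = design)
```

The rows of the design matrix must correspond to the samples (columns) of the SingleCellExperiment object, since the factor values are given per sample.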

Value

A character vector of feature names that pass the filter

Author(s)

Daniel C. Ellwanger

See Also

trajFeatureNames, model.matrix

Examples

# Load the package (attaches SingleCellExperiment as a dependency)
library(CellTrails)

# Simulate example data
set.seed(1101)
dat <- simulate_exprs(n_features=15000, n_samples=100)

# Create container
alist <- list(logcounts=dat)
sce <- SingleCellExperiment(assays=alist)

# Filter incrementally
trajFeatureNames(sce) <- filterTrajFeaturesByDL(sce, threshold=2)
trajFeatureNames(sce) <- filterTrajFeaturesByCOV(sce, threshold=0.5)
trajFeatureNames(sce) <- filterTrajFeaturesByFF(sce, threshold=1.7)

# Number of features
length(trajFeatureNames(sce)) # filtered
nrow(sce) # total

elldc/CellTrails documentation built on May 16, 2020, 4:40 a.m.