define_top_genes: Define the reference window using the most highly expressed...


View source: R/binning.R

Description

Define the group of features in the dataset that will be used as the reference (the top window), by specifying either a number of features or an expression threshold.

Usage

define_top_genes(dataset, window_size = NULL, mean_expression = NULL,
  min_expression = NULL)

Arguments

dataset

A data frame with features as rows and cells as columns, plus a column holding the mean expression value of each gene. Usually the output of calculate_cvs.

window_size

Number of features in the defined top window. The recommended value is 100 features.

mean_expression

A number. Genes with a mean expression across cells higher than the value will be selected. Ignored if window_size is defined.

min_expression

A number. Genes with a minimum expression across all cells higher than the value will be selected. Ignored if window_size or mean_expression is defined.

Details

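There are three selection methods available:

window_size: features are ranked by mean expression across cells, and the top window_size features are selected.

mean_expression: features with a mean expression across cells higher than the given value are selected.

min_expression: features with a minimum expression across all cells higher than the given value are selected.

If more than one argument is defined, window_size takes precedence over mean_expression, which takes precedence over min_expression.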

In general, it is advisable to avoid top windows larger than 250 features (100 features is the recommended value). Larger windows lead to excessively long computation times and can reduce the quality of the analysis, because the top window should only include a subset of reliable values.
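The selection rules described in this section can be sketched in base R. This is only an illustration of the logic, not the scFeatureFilter implementation: the helper name sketch_top_window, the column name mean, and the list element names top and rest are all assumptions made for this example.

```r
# Illustrative sketch only: NOT the scFeatureFilter implementation.
# `sketch_top_window`, the `mean` column and the `top`/`rest` names are assumptions.
sketch_top_window <- function(dataset, window_size = NULL,
                              mean_expression = NULL, min_expression = NULL) {
    # Per-cell expression columns: every numeric column except `mean`.
    is_num <- vapply(dataset, is.numeric, logical(1))
    cell_cols <- setdiff(names(dataset)[is_num], "mean")

    if (!is.null(window_size)) {
        # Rank features by mean expression and keep the first `window_size`.
        ranked <- order(dataset$mean, decreasing = TRUE)
        in_top <- seq_len(nrow(dataset)) %in% head(ranked, window_size)
    } else if (!is.null(mean_expression)) {
        # Only reached when window_size is not defined.
        in_top <- dataset$mean > mean_expression
    } else if (!is.null(min_expression)) {
        # The lowest value across all cells must exceed the threshold.
        in_top <- apply(dataset[cell_cols], 1, min) > min_expression
    } else {
        stop("Provide window_size, mean_expression or min_expression.")
    }

    # Two data frames: the top window and the rest of the features.
    list(top = dataset[in_top, , drop = FALSE],
         rest = dataset[!in_top, , drop = FALSE])
}

# Toy data: 4 genes by 3 cells, with the mean expression added as a column.
expMat <- matrix(c(1, 1, 1,  1, 2, 3,  0, 1, 2,  0, 0, 2),
                 ncol = 3, byrow = TRUE,
                 dimnames = list(paste0("gene", 1:4), paste0("cell", 1:3)))
toy <- data.frame(gene = rownames(expMat), mean = rowMeans(expMat), expMat)

by_size <- sketch_top_window(toy, window_size = 1)
by_mean <- sketch_top_window(toy, mean_expression = 1.5)
by_min  <- sketch_top_window(toy, min_expression = 0.5)
```

On this toy data, gene2 is the only feature with a mean expression above 1.5, while gene1 and gene2 are the only features whose expression stays above 0.5 in every cell. The if/else chain mirrors the precedence stated in the Arguments section: a threshold is ignored whenever an earlier argument is defined.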

Value

A list with two elements, both data frames: the defined top window, and the rest of the genes.

Examples

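library(magrittr)        # provides the %>% pipe
library(scFeatureFilter) # provides calculate_cvs() and define_top_genes()

# Toy expression matrix: 4 genes (rows) by 3 cells (columns)
expMat <- matrix(
    c(1, 1, 1,
      1, 2, 3,
      0, 1, 2,
      0, 0, 2),
    ncol = 3, byrow = TRUE, dimnames = list(paste("gene", 1:4), paste("cell", 1:3))
)

# Top window defined by size: 2 features
calculate_cvs(expMat) %>%
    define_top_genes(window_size = 2)

# Top window defined by threshold: mean expression above 1.5
calculate_cvs(expMat) %>%
    define_top_genes(mean_expression = 1.5)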

scFeatureFilter documentation built on Nov. 8, 2020, 7:49 p.m.