selectFeatures: Find the most informative features (genes/transcripts) for...


Description

This is a modification of the M3Drop method. Instead of fitting a Michaelis-Menten model to the log expression-dropout relation, we fit a linear model to the log(expression) versus log(dropout) distribution across genes. After the fit, the features with the top N residuals of the linear model are selected as the most informative.
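The idea can be sketched in base R on simulated counts. This is an illustration only, not the package's implementation: the simulated matrix, the dropout filter, the log2 transforms, and the variable names are all assumptions made for the example.

```r
# Illustrative sketch on simulated data (not the scmap source code).
set.seed(42)
n_genes <- 200
n_cells <- 50

# Simulate Poisson counts with a gene-specific mean, so dropout rates vary by gene.
mu <- rexp(n_genes, rate = 0.5)
counts <- matrix(rpois(n_genes * n_cells, lambda = rep(mu, times = n_cells)),
                 nrow = n_genes)

# Dropout rate per gene: percentage of cells with a zero count.
dropout <- rowMeans(counts == 0) * 100

# Assumption: drop genes with 0% or 100% dropout, where the log is uninformative.
keep <- dropout > 0 & dropout < 100

log_dropout <- log2(dropout[keep])
log_expr <- log2(rowMeans(counts[keep, , drop = FALSE]) + 1)

# Linear fit of log(dropout) on log(expression).
fit <- lm(log_dropout ~ log_expr)
res <- as.numeric(residuals(fit))

# Select the genes with the largest residuals.
n_features <- 20
k <- min(n_features, length(res))
top_idx <- which(keep)[order(res, decreasing = TRUE)[seq_len(k)]]
```

Here `top_idx` holds the row indices of the genes whose dropout is highest relative to what the linear fit predicts from their expression.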

Usage

selectFeatures(object, n_features = 500, suppress_plot = TRUE)

selectFeatures.SingleCellExperiment(object, n_features, suppress_plot)

## S4 method for signature 'SingleCellExperiment'
selectFeatures(object, n_features = 500,
  suppress_plot = TRUE)

Arguments

object

an object of SingleCellExperiment class

n_features

number of the features to be selected

suppress_plot

logical; if TRUE (the default), no plot is drawn. If FALSE, the log(expression) versus log(dropout) distribution is plotted for all genes, with the selected features highlighted in red.

Details

Please note that the feature_symbol column of rowData(object) must be present in the input object and must not contain any duplicated feature names. This column defines the feature names used during projection. Feature symbols in the reference dataset must correspond to the feature symbols in the projection dataset, otherwise the mapping will not work!
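Both requirements can be checked before calling selectFeatures. The following base R sketch uses two made-up symbol vectors (`ref_symbols` and `proj_symbols` are hypothetical stand-ins for the feature_symbol columns of a reference and a projection dataset):

```r
# Hypothetical feature symbols; in practice these would come from
# rowData(reference)$feature_symbol and rowData(projection)$feature_symbol.
ref_symbols  <- c("SOX2", "POU5F1", "NANOG", "SOX2", "GATA6")
proj_symbols <- c("SOX2", "NANOG", "GATA6", "CDX2")

# Symbols that occur more than once in the reference.
dup <- unique(ref_symbols[duplicated(ref_symbols)])

# Keep the first occurrence of each symbol.
ref_unique <- ref_symbols[!duplicated(ref_symbols)]

# Symbols present in both datasets; only these can be matched during projection.
shared <- intersect(ref_unique, proj_symbols)
```

If `shared` is empty or very small, the two datasets most likely use different naming schemes (for example gene symbols versus Ensembl identifiers).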

Value

an object of SingleCellExperiment class with a new column, scmap_features, in rowData(object). It can be accessed with as.data.frame(rowData(object))$scmap_features.
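A minimal sketch of reading the column back, assuming scmap_features is a logical vector with one entry per feature. The toy object below sets the column by hand purely to illustrate the access pattern; it is not produced by selectFeatures.

```r
library(SingleCellExperiment)

# Toy object: 5 genes x 4 cells, with scmap_features filled in manually
# (assumption: the real column is logical, TRUE for selected features).
set.seed(1)
m <- matrix(rpois(20, lambda = 5), nrow = 5,
            dimnames = list(paste0("gene", 1:5), paste0("cell", 1:4)))
toy <- SingleCellExperiment(assays = list(counts = m))
rowData(toy)$scmap_features <- c(TRUE, FALSE, TRUE, FALSE, FALSE)

# Names of the selected features.
selected <- rownames(toy)[rowData(toy)$scmap_features]

# Number of selected features.
n_selected <- sum(rowData(toy)$scmap_features)
```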

Examples

library(SingleCellExperiment)
# scmap provides selectFeatures and the example datasets yan and ann
library(scmap)
sce <- SingleCellExperiment(assays = list(normcounts = as.matrix(yan)), colData = ann)
# this is needed to calculate dropout rate for feature selection
# important: normcounts have the same zeros as raw counts (fpkm)
counts(sce) <- normcounts(sce)
logcounts(sce) <- log2(normcounts(sce) + 1)
# use gene names as feature symbols
rowData(sce)$feature_symbol <- rownames(sce)
# remove features with duplicated names
sce <- sce[!duplicated(rownames(sce)), ]
sce <- selectFeatures(sce)

hemberg-lab/scmap documentation built on Nov. 29, 2020, 1:06 p.m.