scandal_preprocess: Scandal object preprocessing

View source: R/preprocess.R

Description

Preprocesses a ScandalDataSet object: splits the dataset into objects representing the specific samples, filters out low-quality cells and lowly expressed genes, and log-transforms the expression data.

Usage

scandal_preprocess(
  object,
  forced_genes_set = NULL,
  use_housekeeping_filter = FALSE,
  verbose = FALSE
)

Arguments

object

a ScandalDataSet object (the underlying object).

forced_genes_set

a vector of genes that should be included in the final processed object even if their expression is low. The exception is forced genes whose absolute count equals zero, which are still filtered out. Default is NULL.

use_housekeeping_filter

whether cells with low expression of house-keeping genes should be filtered out. Default is FALSE.

verbose

controls whether this function prints messages; when FALSE, all messages from this function are suppressed. Default is FALSE.
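
The forced_genes_set rule can be illustrated with a small base R sketch. This is not the package's implementation: the helper name keep_with_forced, the passes_filter vector and the toy matrix are hypothetical, and "absolute count equals zero" is assumed here to mean that a gene's counts sum to zero across all cells.

```r
# Hypothetical helper: decide which genes to keep when some genes are forced.
# counts        - gene-by-cell count matrix with gene names as row names
# passes_filter - logical vector, TRUE for genes that passed the expression filter
keep_with_forced <- function(counts, passes_filter, forced_genes_set = NULL) {
  forced  <- rownames(counts) %in% forced_genes_set  # all FALSE when the set is NULL
  nonzero <- rowSums(counts) > 0                     # assumption: zero total count = "absolute count of zero"
  passes_filter | (forced & nonzero)
}

# Toy data (made up): geneB is lowly expressed, geneC is never detected.
toy_counts <- matrix(
  c(9, 8, 7,
    1, 0, 0,
    0, 0, 0,
    0, 1, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC", "geneD"), paste0("cell", 1:3))
)
passes <- c(TRUE, FALSE, FALSE, FALSE)

kept <- keep_with_forced(toy_counts, passes, forced_genes_set = c("geneB", "geneC"))
```

In this sketch geneB is retained because it is forced and has a non-zero count, while geneC is dropped even though it is forced, because it was never detected.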

Details

This function is the entry point for preprocessing a ScandalDataSet object to allow downstream analysis.

The main steps of preprocessing are as follows:

  1. Create a ScandalDataSet object for each individual (samples are identified by the names of the preproc_config_list components).

  2. Filter out low-quality cells (cells with low complexity): for each cell (column), count the genes with a count greater than zero, then remove the cells whose gene count falls outside the complexity cutoff range configured for the specific sample in preproc_config_list.

  3. Optionally, filter out cells with low expression of house-keeping genes, i.e. genes that are normally highly expressed in most cells (for example, genes that encode ribosomal proteins). Cells whose mean expression of HK genes is less than the housekeeping cutoff configured for the specific sample in preproc_config_list are removed.

  4. Filter out lowly expressed genes, i.e. genes with log2 mean expression less than the expression cutoff configured for the specific sample in preproc_config_list.

  5. Log-transform the expression data.

  6. Add the resulting ScandalDataSet object to the childNodes slot of the underlying object.

  7. Preprocess the underlying object in the same way as described above, except for cell filtering (the cells that passed quality control are aggregated from the childNodes objects), to allow plotting all cells together in downstream analysis.
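
Steps 2 to 5 can be sketched on a plain count matrix in base R. This is a minimal illustration under stated assumptions, not the package's code: the function name preprocess_sketch, its arguments, the inclusive cutoff range, the pseudocount of 1 and the log2(x + 1) transform are all hypothetical choices, and the housekeeping mean is computed on raw counts for simplicity.

```r
# Hypothetical sketch of steps 2-5 for a single sample.
# counts: gene-by-cell matrix (rows are genes, columns are cells).
preprocess_sketch <- function(counts, complexity_range, expr_cutoff,
                              hk_genes = NULL, hk_cutoff = NULL) {
  # Step 2: complexity is the number of genes with count > 0 in each cell.
  complexity <- colSums(counts > 0)
  keep_cells <- complexity >= complexity_range[1] &
    complexity <= complexity_range[2]       # assumption: range is inclusive
  x <- counts[, keep_cells, drop = FALSE]

  # Step 3 (optional): drop cells with low mean house-keeping expression.
  if (!is.null(hk_genes)) {
    hk_mean <- colMeans(x[hk_genes, , drop = FALSE])
    x <- x[, hk_mean >= hk_cutoff, drop = FALSE]
  }

  # Step 4: drop genes whose log2 mean expression is below the cutoff.
  keep_genes <- log2(rowMeans(x) + 1) >= expr_cutoff  # assumption: pseudocount of 1
  x <- x[keep_genes, , drop = FALSE]

  # Step 5: log-transform the remaining data (assumed form of the transform).
  log2(x + 1)
}

# Toy data (made up): cell2 and cell4 each express a single gene.
counts <- matrix(
  c(5, 0, 3, 0,
    2, 0, 4, 0,
    0, 0, 1, 0,
    8, 1, 9, 0,
    0, 0, 0, 1),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("gene", 1:5), paste0("cell", 1:4))
)

processed <- preprocess_sketch(
  counts,
  complexity_range = c(2, 4),
  expr_cutoff = 1,
  hk_genes = c("gene1", "gene4"),
  hk_cutoff = 1
)
```

In the real function the cutoffs come from the per-sample entries of preproc_config_list rather than from function arguments, and the same filtering is then repeated on the underlying object using the cells aggregated from the childNodes objects.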

Value

A processed ScandalDataSet object ready for analysis.

Author(s)

Avishay Spitzer


dravishays/scandal documentation built on Jan. 8, 2020, 1:30 p.m.