pre_proc_data: Pre-process Data


Description

This function pre-processes the data so that SparseDC can be applied. SparseDC requires data that have been normalized for sequencing depth, log-transformed and centralized on a gene-by-gene basis. For sequencing depth normalization, we recommend that users apply one of the many methods developed for normalizing scRNA-Seq data before using SparseDC and then set norm = FALSE. With norm = TRUE, this function instead normalizes the data by dividing by the total number of reads. With log = TRUE, it log-transforms the data by applying log(x + 1) to each of the data sets. By far the most important pre-processing step for SparseDC is the centralization of the data. Centralized data are a core component of the SparseDC algorithm and are necessary both for accurate clustering of the cells and for identifying marker genes. We therefore recommend that all users centralize their data using this function and that only experienced users set center = FALSE.
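The three steps described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's source: `sketch_pre_proc` is a hypothetical helper, and the choice to center each gene on its mean over the cells of both conditions pooled together is an assumption that the description does not spell out.

```r
# Hypothetical helper for illustration only; NOT the SparseDC implementation.
sketch_pre_proc <- function(dat1, dat2, norm = TRUE, log = TRUE, center = TRUE) {
  dat1 <- as.matrix(dat1)
  dat2 <- as.matrix(dat2)
  if (norm) {
    # Divide each column (cell) by its total number of reads.
    dat1 <- sweep(dat1, 2, colSums(dat1), "/")
    dat2 <- sweep(dat2, 2, colSums(dat2), "/")
  }
  if (log) {
    # log1p(x) computes log(x + 1).
    dat1 <- log1p(dat1)
    dat2 <- log1p(dat2)
  }
  if (center) {
    # Assumption: each gene (row) is centered on its mean over all cells
    # from both conditions pooled together.
    gene_means <- rowMeans(cbind(dat1, dat2))
    dat1 <- dat1 - gene_means
    dat2 <- dat2 - gene_means
  }
  list(dat1, dat2)
}

# Toy count matrices: 4 genes (rows), 5 and 3 cells (columns).
counts_A <- matrix(1:20, nrow = 4)
counts_B <- matrix(c(3, 0, 5, 2, 1, 4, 0, 6, 2, 2, 7, 1), nrow = 4)

out <- sketch_pre_proc(counts_A, counts_B)

# After centering, every gene has a pooled mean of (numerically) zero.
pooled_means <- rowMeans(cbind(out[[1]], out[[2]]))
```

Under this assumption the two conditions share a common per-gene baseline, so differences between conditions remain visible in the centered values.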

Usage

pre_proc_data(dat1, dat2, norm = TRUE, log = TRUE, center = TRUE)

Arguments

dat1

The data for the first condition with samples (cells) as columns and features (genes) as rows.

dat2

The data for the second condition with samples (cells) as columns and features (genes) as rows.

norm

This parameter controls whether the data are normalized for sequencing depth by dividing each column by the total number of reads for that sample. We recommend that users apply one of the many methods for normalizing scRNA-Seq data and set this to FALSE. The default value is TRUE.

log

This parameter controls whether the data is transformed using log(x + 1). The default value is TRUE.

center

This parameter controls whether the data are centered on a gene-by-gene basis. We recommend that all users center their data prior to applying SparseDC and that only experienced users set this to FALSE. The default value is TRUE.
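As an illustration of the recommended workflow with norm = FALSE, the sketch below normalizes toy counts outside the function and then leaves only the log transform and centering to pre_proc_data. The counts-per-million scaling in the hypothetical helper `cpm_sketch` is merely a stand-in for a dedicated scRNA-Seq normalization method, and the toy matrices are invented for the example.

```r
# Stand-in normalization (assumption for illustration): counts per million.
# A dedicated scRNA-Seq normalization method would normally be used instead.
cpm_sketch <- function(counts) {
  sweep(counts, 2, colSums(counts), "/") * 1e6
}

# Toy count matrices: genes as rows, cells as columns.
counts_A <- matrix(1:20, nrow = 4)
counts_B <- matrix(c(3, 0, 5, 2, 1, 4, 0, 6, 2, 2, 7, 1), nrow = 4)

norm_A <- cpm_sketch(counts_A)
norm_B <- cpm_sketch(counts_B)

# The data are already normalized, so skip the built-in normalization.
# The call is guarded so the sketch also runs where SparseDC is not installed.
if (requireNamespace("SparseDC", quietly = TRUE)) {
  pre_data <- SparseDC::pre_proc_data(norm_A, norm_B, norm = FALSE,
                                      log = TRUE, center = TRUE)
}
```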

Value

This function returns a list containing the two pre-processed datasets.

Examples

set.seed(10)
# Select small dataset for example
data_test <- data_biase[1:100,]
# Split data into condition A and B
data_A <- data_test[ , which(condition_biase == "A")]
data_B <- data_test[ , which(condition_biase == "B")]
# Pre-process the data
pre_data <- pre_proc_data(data_A, data_B, norm = FALSE, log = TRUE,
                          center = TRUE)
# Extract Data
pdata_A <- pre_data[[1]]
pdata_B <- pre_data[[2]]

SparseDC documentation built on May 2, 2019, 9:29 a.m.