data_prepare: Data Prepare

data_prepare    R Documentation

Description

Prepares the data for further analysis

Usage

data_prepare(
  file,
  most.variables = 0,
  lower = 0,
  upper = 0,
  normalize = TRUE,
  write = FALSE,
  verbose = TRUE,
  plot = FALSE
)

Arguments

file

a string for the scRNAseq data file

most.variables

a number N; if set, the function keeps only the N most variable genes (see Details)

lower

a number in [0,1], low quantile threshold

upper

a number in [0,1], high quantile threshold

normalize

a logical, if TRUE, then computes 99th percentile normalization

write

a logical, if TRUE, then writes the output text files described in Details

verbose

a logical

plot

a logical

Details

'file' is the path to the file containing the read or UMI count matrix the user wants to analyze.

'most.variables' can be set to N to select the N most variable genes. This option lets the user run the clustering step faster on a reduced matrix (N genes x number of cells).
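The selection described above can be sketched in base R as follows. This is only an illustration on a made-up toy matrix, assuming that "most variable" means the largest variance across cells; the package may rank genes by a different variability measure. The names `counts`, `n_keep`, `gene_var`, `top_genes` and `reduced` are hypothetical and are not part of SingleCellSignalR.

```r
# Toy count matrix: 5 genes (rows) x 4 cells (columns); values are made up.
counts <- matrix(
  c(1, 1,  1, 1,
    0, 5,  0, 9,
    2, 3,  2, 3,
    7, 7,  8, 7,
    0, 0, 10, 0),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("gene", 1:5), paste0("cell", 1:4))
)

n_keep <- 2  # hypothetical N, playing the role of 'most.variables'

# Variance of each gene across cells (assumption: variance is the criterion).
gene_var <- apply(counts, 1, var)

# Names of the N genes with the largest variance.
top_genes <- names(sort(gene_var, decreasing = TRUE))[seq_len(n_keep)]

# Reduced matrix: N genes x number of cells.
reduced <- counts[top_genes, , drop = FALSE]
dim(reduced)
```

The reduced matrix keeps every cell and only drops rows, which is why later steps that work cell by cell, such as clustering, become cheaper.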

'lower' and 'upper' are used to remove the genes whose average counts are outliers. The values of these arguments are fractions of the total number of genes and hence must be between 0 and 1. Namely, if 'lower = 0.05', then the function removes the 5% of genes with the lowest average counts, and if 'upper = 0.05', then it removes the 5% of genes with the highest average counts.
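A minimal base R sketch of this kind of thresholding is shown below. It assumes that genes are ranked by their mean count across cells and that the given fractions of genes are dropped from each end of the ranking; how the package handles ties and rounding is not stated here. All object names are hypothetical.

```r
# Toy matrix: 10 genes x 3 cells; gene i has count i in every cell.
counts <- matrix(
  rep(1:10, times = 3), nrow = 10,
  dimnames = list(paste0("gene", 1:10), paste0("cell", 1:3))
)

lower <- 0.2  # drop the 20% of genes with the lowest average counts
upper <- 0.1  # drop the 10% of genes with the highest average counts

avg <- rowMeans(counts)
n <- nrow(counts)

# Number of genes to drop at each end (assumption: round to nearest integer).
n_low <- round(lower * n)
n_high <- round(upper * n)

# Gene indices ordered by increasing average count, trimmed at both ends.
ord <- order(avg)
keep_idx <- ord[seq.int(n_low + 1, length.out = n - n_low - n_high)]

# Keep the surviving genes in their original row order.
filtered <- counts[sort(keep_idx), , drop = FALSE]
rownames(filtered)
```

With these toy values the two lowest genes and the single highest gene are removed, leaving seven rows.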

If 'normalize' is FALSE, then the function skips the 99th percentile normalization and the log transformation.
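To make the two skipped steps concrete, here is one plausible form of a 99th percentile normalization followed by a log transformation. It is a sketch under stated assumptions, not a copy of the package code: each cell is scaled by its 99th percentile relative to the median of those percentiles, and the transformation is log(1 + x). The exact scaling and log base used by data_prepare are not given in this page.

```r
# Toy matrix: 4 genes x 2 cells; cellB has exactly twice the depth of cellA.
counts <- matrix(
  c(0, 2, 4, 10,
    0, 4, 8, 20),
  nrow = 4,
  dimnames = list(paste0("gene", 1:4), c("cellA", "cellB"))
)

# 99th percentile of the counts in each cell (column).
q99 <- apply(counts, 2, quantile, probs = 0.99)

# Scale each cell by its percentile relative to the median percentile
# (assumption: this is the normalization factor).
scaled <- sweep(counts, 2, q99 / median(q99), "/")

# Log transformation; log1p(x) computes log(1 + x), so zeros stay zero.
normalized <- log1p(scaled)
normalized
```

Because cellB is a scaled copy of cellA, both columns coincide after normalization, which is the effect such a step is meant to have on differences in sequencing depth.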

If 'write' is TRUE, then the function writes two text files: one for the normalized and gene-thresholded read counts table, and one for the genes that passed the lower and upper thresholds. Note that the length of the genes vector written in the *genes.txt* file is equal to the number of rows of the table of read counts written in the *data.txt* file.
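The relationship between the two files can be illustrated with base R alone. The sketch below writes a small made-up table and its gene names to a temporary directory and reads them back. The file names, the tab separator and the one-gene-per-line layout are assumptions for illustration; the actual names and format used by the package may differ.

```r
# Made-up normalized table: 3 genes x 2 cells, genes as row names.
tab <- data.frame(
  cell1 = c(0.5, 1.2, 0.0),
  cell2 = c(0.0, 2.3, 1.1),
  row.names = c("geneA", "geneB", "geneC")
)

# Hypothetical output paths in a temporary directory.
out_dir <- tempdir()
data_path <- file.path(out_dir, "example_data.txt")
genes_path <- file.path(out_dir, "example_genes.txt")

# One file for the table, one for the gene names (one gene per line).
write.table(tab, file = data_path, sep = "\t", quote = FALSE)
writeLines(rownames(tab), genes_path)

# Read both back; the header has one field fewer than the data rows,
# so read.table uses the first column as row names.
tab_back <- read.table(data_path, header = TRUE, sep = "\t")
genes_back <- readLines(genes_path)

length(genes_back) == nrow(tab_back)
```

Checking that the gene list and the table rows agree after reading the files back is a cheap sanity test before passing them to downstream steps.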

Value

The function returns a data frame of filtered and/or normalized data with genes as row names.

Examples

file <- system.file("scRNAseq_dataset.txt",package = "SingleCellSignalR")
data <- data_prepare(file = file)

SCA-IRCM/SingleCellSignalR documentation built on Dec. 11, 2022, 2:30 p.m.