Preprocess the raw single-cell data.
preprocess(
  data,
  clusternum = NULL,
  takelog = TRUE,
  logbase = 2,
  pseudocount = 1,
  minexpr_value = 1,
  minexpr_percent = 0.5,
  cvcutoff = 1
)
data |
The raw single-cell data, given as a numeric matrix or data.frame. Rows represent genes/features and columns represent single cells. |
clusternum |
The number of clusters to form, typically 5 percent of the total number of genes. Clustering is performed after all transformation and trimming steps. If NULL, no clustering is performed. |
takelog |
Logical value indicating whether to take the logarithm of the raw data. |
logbase |
Numeric value specifying the base of the logarithm. |
pseudocount |
Numeric value to be added to the raw data before taking the logarithm. |
minexpr_value |
Numeric value specifying the minimum expression cutoff. If takelog is TRUE, the cutoff is applied to the log-transformed values. |
minexpr_percent |
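Numeric value specifying the minimum proportion of highly expressed cells (cells with an expression value greater than minexpr_value) required for a gene/feature to be retained. The value is a fraction, so the default of 0.5 corresponds to half of the cells. |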
cvcutoff |
Numeric value specifying the minimum coefficient of variation required for a gene/feature to be retained. |
This function first takes the logarithm of the raw data (if takelog is TRUE) and then filters out genes/features that are lowly expressed in too many cells. It also filters out genes/features with a low coefficient of variation, which indicates that the gene/feature does not carry much information. With the default settings, the function takes log2 of the raw data after adding a pseudocount of 1. It then retains the genes/features for which at least half of the cells have expression values greater than 1 and the coefficient of variation across all cells is at least 1.
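The filtering described above can be sketched in a few lines of base R. This is an illustrative re-implementation, not the package source: the name preprocess_sketch is hypothetical, the clustering step controlled by clusternum is left out, and it assumes that the coefficient of variation (standard deviation divided by mean) is computed on the transformed values and that the cutoffs are inclusive, following the "at least" wording above.

```r
# Hypothetical sketch of the transformation and filtering steps (no clustering).
# Not the package implementation; inclusive cutoffs are an assumption.
preprocess_sketch <- function(data, takelog = TRUE, logbase = 2,
                              pseudocount = 1, minexpr_value = 1,
                              minexpr_percent = 0.5, cvcutoff = 1) {
  x <- as.matrix(data)

  # Optional log transformation after adding the pseudocount
  if (takelog) {
    x <- log(x + pseudocount, base = logbase)
  }

  # Fraction of cells in which each gene exceeds the expression cutoff
  frac_expressed <- rowMeans(x > minexpr_value)

  # Coefficient of variation of each gene across cells (sd / mean)
  cv <- apply(x, 1, sd) / rowMeans(x)

  # Keep genes that pass both filters; undefined CVs (e.g. zero mean) fail
  keep <- frac_expressed >= minexpr_percent & cv >= cvcutoff
  keep[is.na(keep)] <- FALSE

  x[keep, , drop = FALSE]
}

# Toy data: three genes (rows) measured in four cells (columns)
toy <- rbind(
  g1 = c(0, 0, 0, 0),        # never expressed
  g2 = c(3, 3, 3, 3),        # expressed everywhere, but no variation
  g3 = c(0, 0, 1023, 1023)   # expressed in half of the cells, highly variable
)
res <- preprocess_sketch(toy)
```

In this toy example only g3 passes both filters: g1 is never expressed above the cutoff, and g2 has a coefficient of variation of 0.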
Matrix or data frame with the same format as the input dataset.
Zhicheng Ji, Hongkai Ji <zji4@zji4.edu>
data(lpsdata)
procdata <- preprocess(lpsdata)