View source: R/data_preprocess.R
exp_preprocess | R Documentation |
Preprocess expression data for network reconstruction
exp_preprocess(
exp,
NA_rm = TRUE,
replaceby = 0,
Zk_filtering = TRUE,
zk = -2,
cor_method = "spearman",
remove_nonexpressed = TRUE,
method = "median",
min_exp = 1,
min_percentage_samples = 0.25,
remove_confounders = TRUE,
variance_filter = FALSE,
n = NULL,
percentile = NULL,
vstransform = FALSE
)
exp |
A gene expression data frame with genes in row names and samples in column names or a 'SummarizedExperiment' object. |
NA_rm |
Logical. It specifies whether missing values (NAs) in the expression data frame should be replaced or not. The replacement value is set by replaceby. Default = TRUE. |
replaceby |
If NA_rm is TRUE, what to use instead of NAs. One of 0 or 'mean'. Default is 0. |
Zk_filtering |
Logical. It specifies whether to filter outlying samples by Zk or not. Default: TRUE. |
zk |
If Zk_filtering is TRUE, the standardized connectivity threshold. Samples below this threshold will be considered outliers. Default is -2. |
cor_method |
If Zk_filtering is TRUE, the correlation method to use. One of 'spearman', 'bicor', or 'pearson'. Default is 'spearman'. |
remove_nonexpressed |
Logical. It specifies whether non-expressed genes should be removed or not. Default is TRUE. |
method |
If remove_nonexpressed is TRUE, the criterion to filter non-expressed genes out. One of "mean", "median", "percentage", or "allsamples". Default is 'median'. |
min_exp |
If method is 'mean', 'median', or 'allsamples', the minimum value for a gene to be considered expressed. If method is 'percentage', the minimum value each gene must have in at least the proportion of samples given by min_percentage_samples to be considered expressed. Default is 1. |
min_percentage_samples |
If method is 'percentage', expressed genes must have expression >= min_exp in at least this proportion of samples. Values must range from 0 to 1. Default = 0.25. |
remove_confounders |
Logical. If TRUE, it removes principal components that add noise to the data. Default is TRUE. |
variance_filter |
Logical. If TRUE, it will filter genes by variance. Default is FALSE. |
n |
If variance_filter is TRUE, the number of most variable genes to keep. |
percentile |
If variance_filter is TRUE, the percentage of most variable genes to keep. |
vstransform |
Logical indicating whether a variance-stabilizing transformation should be applied to the data. This parameter can only be set to TRUE if exp is a matrix of raw read counts. Default is FALSE. |
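The four criteria accepted by method can be illustrated with a small base R sketch. This is not the package's implementation: is_expressed is a hypothetical helper, the toy data frame is made up, and using >= for the 'mean', 'median', and 'allsamples' comparisons is an assumption (the documentation only states >= explicitly for 'percentage').

```r
# Hypothetical helper mirroring the documented `method` options.
# Returns a named logical vector: TRUE for genes considered expressed.
is_expressed <- function(exp, method = "median", min_exp = 1,
                         min_percentage_samples = 0.25) {
    exp <- as.matrix(exp)
    switch(method,
        # Per-gene mean across samples must reach min_exp (assumed >=)
        mean = rowMeans(exp) >= min_exp,
        # Per-gene median across samples must reach min_exp (assumed >=)
        median = apply(exp, 1, median) >= min_exp,
        # Fraction of samples with expression >= min_exp must reach the cutoff
        percentage = rowMeans(exp >= min_exp) >= min_percentage_samples,
        # Every sample must have expression >= min_exp (assumed >=)
        allsamples = rowSums(exp >= min_exp) == ncol(exp),
        stop("unknown method: ", method)
    )
}

# Toy expression data: genes in rows, samples in columns
exp <- data.frame(
    S1 = c(5, 0, 2, 0),
    S2 = c(6, 0, 0, 0),
    S3 = c(7, 0, 0, 9),
    S4 = c(8, 3, 0, 0),
    row.names = c("geneA", "geneB", "geneC", "geneD")
)

# Names of the genes kept under each criterion, with default thresholds
kept <- sapply(
    c("mean", "median", "percentage", "allsamples"),
    function(m) rownames(exp)[is_expressed(exp, method = m)],
    simplify = FALSE
)
print(kept)
```

On this toy data the criteria disagree: a gene with one very high sample passes 'mean' but not 'median', which is one reason to pick the criterion with the data's distribution in mind.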
Processed gene expression data frame with gene IDs in row names and sample names in column names, or a 'SummarizedExperiment' object.
Fabricio Almeida-Silva
Love, M. I., Huber, W., & Anders, S. (2014). Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biology, 15(12), 1-21.
varianceStabilizingTransformation
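# Load the example dataset
data(zma.se)
# Preprocess with default settings, keeping the 1000 most variable genes
exp <- exp_preprocess(zma.se, variance_filter = TRUE, n = 1000)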
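The sample outlier filter controlled by Zk_filtering, zk, and cor_method can also be sketched in base R. This is a minimal illustration under stated assumptions, not the package's code: zk_outliers is a hypothetical helper, the simulated data are made up, and the signed adjacency ((1 + r) / 2)^2 between samples is an assumed definition of connectivity. Base R's cor() supports 'spearman' and 'pearson' but not 'bicor', so that option is left out here.

```r
# Hypothetical helper: flag samples whose standardized connectivity (Zk)
# falls below the threshold `zk`.
zk_outliers <- function(exp, zk = -2, cor_method = "spearman") {
    # Sample-by-sample correlation (samples are columns)
    r <- cor(exp, method = cor_method)
    # Assumed signed adjacency between samples, in [0, 1]
    adj <- ((1 + r) / 2)^2
    # Connectivity of each sample, excluding its self-adjacency of 1
    k <- rowSums(adj) - 1
    # Standardize connectivity across samples
    z <- (k - mean(k)) / sd(k)
    names(z)[z < zk]
}

# Simulated data: nine samples share a common signal, the tenth is pure noise
set.seed(1)
signal <- rnorm(200)
sim <- sapply(1:9, function(i) signal + rnorm(200, sd = 0.2))
sim <- cbind(sim, rnorm(200))
dimnames(sim) <- list(paste0("gene", 1:200), paste0("S", 1:10))

# Identify outlying samples and drop them from the matrix
outliers <- zk_outliers(sim)
filtered <- sim[, !colnames(sim) %in% outliers, drop = FALSE]
print(outliers)
print(dim(filtered))
```

A more negative zk makes the filter more permissive, since a sample has to be further below the average connectivity before it is discarded.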