View source: R/Download_Preprocess.R
Pre-processes gene expression data from TCGA.
Preprocess_GeneExpression(CancerSite, MAdirectories,
  MissingValueThresholdGene = 0.3, MissingValueThresholdSample = 0.1)
CancerSite
    character of length 1 with the TCGA cancer code.
MAdirectories
    character vector with the directories containing the downloaded data. It can be the object returned by the Download_GeneExpression function.
MissingValueThresholdGene
    threshold for missing values per gene. Genes with a proportion of NAs greater than this threshold are removed. Default is 0.3.
MissingValueThresholdSample
    threshold for missing values per sample. Samples with a proportion of NAs greater than this threshold are removed. Default is 0.1.
Pre-processing includes removing samples and genes with too many NAs, imputing the remaining NAs, and performing batch correction.
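To make the two missing-value thresholds concrete, here is a minimal sketch of threshold-based filtering on a toy matrix. It is not the package's internal code: the helper filter_missing, the toy data, and the order of the two steps (genes first, then samples) are assumptions made for illustration only.

```r
# Hypothetical helper, not part of MethylMix: drops genes (rows) and then
# samples (columns) whose fraction of NAs exceeds the given thresholds.
filter_missing <- function(x, gene_threshold = 0.3, sample_threshold = 0.1) {
  # Fraction of missing values per gene; keep genes at or below the threshold.
  gene_na <- rowMeans(is.na(x))
  x <- x[gene_na <= gene_threshold, , drop = FALSE]
  # Fraction of missing values per sample, computed on the remaining genes.
  sample_na <- colMeans(is.na(x))
  x[, sample_na <= sample_threshold, drop = FALSE]
}

# Toy expression matrix (made-up values): 12 genes in rows, 10 samples in columns.
set.seed(1)
expr <- matrix(rnorm(120), nrow = 12, ncol = 10,
               dimnames = list(paste0("gene", 1:12), paste0("sample", 1:10)))
expr["gene2", 1:4] <- NA                      # 4/10 missing: above 0.3
expr[c("gene1", "gene3"), "sample10"] <- NA   # 2/11 missing after gene2 is dropped: above 0.1
expr["gene4", "sample5"] <- NA                # 1/11 missing: below 0.1

filtered <- filter_missing(expr)
dim(filtered)          # genes and samples that survive the filter
sum(is.na(filtered))   # NAs left over, which an imputation step would fill
```

In this toy case gene2 and sample10 are dropped, while the single NA in sample5 survives filtering and would be handled by imputation.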
List with the pre-processed data matrix for cancer and normal samples.
## Not run:
# Optional: register a cluster to run in parallel
library(doParallel)
cl <- makeCluster(5)
registerDoParallel(cl)
# Gene expression data for ovarian cancer
cancerSite <- "OV"
targetDirectory <- paste0(getwd(), "/")
# Downloading gene expression data
GEdirectories <- Download_GeneExpression(cancerSite, targetDirectory, TRUE)
# Processing gene expression data
GEProcessedData <- Preprocess_GeneExpression(cancerSite, GEdirectories)
# Saving gene expression processed data
saveRDS(GEProcessedData, file = paste0(targetDirectory, "GE_", cancerSite, "_Processed.rds"))
stopCluster(cl)
## End(Not run)