Description Usage Arguments Details Value Examples
View source: R/Download_Preprocess.R
Pre-processes DNA methylation data from TCGA.
1 2 | Preprocess_DNAmethylation(CancerSite, METdirectories,
MissingValueThreshold = 0.2)
|
CancerSite |
character of length 1 with TCGA cancer code. |
METdirectories |
character vector with directories with the downloaded data. It can be the object returned by the Download_DNAmethylation function. |
MissingValueThreshold |
threshold for removing samples or genes with missing values. |
Pre-process includes eliminating samples and genes with too many NAs, imputing NAs, and doing Batch correction. If there is both 27k and 450k data, and both data sets have more than 50 samples, we combine the data sets, by reducing the 450k data to the probes present in the 27k data, and bath correction is performed again to the combined data set. If there are samples with both 27k and 450k data, the 450k data is used and the 27k data is discarded, before the step mentioned above. If the 27k or the 450k data does not have more than 50 samples, we use the one with the greatest number of samples, we do not combine the data sets.
List with the pre-processed data matrix for cancer and normal samples.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 | ## Not run:
# Optional register cluster to run in parallel
library(doParallel)
cl <- makeCluster(5)
registerDoParallel(cl)
# Methylation data for ovarian cancer
cancerSite <- "OV"
targetDirectory <- paste0(getwd(), "/")
# Downloading methylation data
METdirectories <- Download_DNAmethylation(cancerSite, targetDirectory, TRUE)
# Processing methylation data
METProcessedData <- Preprocess_DNAmethylation(cancerSite, METdirectories)
# Saving methylation processed data
saveRDS(METProcessedData, file = paste0(targetDirectory, "MET_", cancerSite, "_Processed.rds"))
# Clustering methylation data
res <- ClusterProbes(METProcessedData[[1]], METProcessedData[[2]])
# Saving methylation clustered data
toSave <- list(METcancer = res[[1]], METnormal = res[[2]], ProbeMapping = res$ProbeMapping)
saveRDS(toSave, file = paste0(targetDirectory, "MET_", cancerSite, "_Clustered.rds"))
stopCluster(cl)
## End(Not run)
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.