getTCGA: Import TCGA data into R and create the appropriate R object


View source: R/portal.R

Description

This function is the main user-level function in the tcgaR package. It downloads methylation and expression data files from the TCGA portal and creates the corresponding R objects via the minfi package.

Usage

getTCGA(cancer, datatype = c("methylation", "expression"), platform = c("450k", "27k"), idat=FALSE, idatDir=NULL, verbose=FALSE, n.samples=NULL, return=TRUE)

Arguments

cancer

String indicating the TCGA cancer type to import, for example "coad" as in the example below.

datatype

String indicating what data type should be imported. Should be either methylation or expression.

platform

String indicating which methylation platform should be used. Should be either 450k or 27k.

idat

Should the IDAT files be downloaded and saved on the disk?

idatDir

Directory in which the IDAT files will be saved if idat=TRUE.

verbose

Should the function be verbose?

n.samples

Maximum number of samples downloaded for the cancer type. Mostly used for testing.

return

Should an R object be returned?
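
As a minimal sketch of how the vector defaults for datatype and platform in the Usage line are typically resolved in R, the helper below uses match.arg(), which picks the first element when nothing is supplied and rejects values outside the listed choices. The function resolve_choices is hypothetical and not part of tcgaR; whether getTCGA itself relies on match.arg() is an assumption.

```r
# Hypothetical helper, not part of tcgaR: resolves the two choice arguments
# the way match.arg() does for any R function with vector defaults.
resolve_choices <- function(datatype = c("methylation", "expression"),
                            platform = c("450k", "27k")) {
  datatype <- match.arg(datatype)  # first element when left at the default
  platform <- match.arg(platform)
  list(datatype = datatype, platform = platform)
}

# Nothing supplied: the first element of each default vector is used.
defaults <- resolve_choices()

# Explicit, valid choices are passed through.
explicit <- resolve_choices(datatype = "expression", platform = "27k")

# A value outside the listed choices raises an error.
invalid <- tryCatch(resolve_choices(platform = "850k"),
                    error = function(e) "error")
```

Under this assumption, calling the function with only a cancer type would request 450k methylation data.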

Details

This function implements functional normalization preprocessing for Illumina methylation microarrays. Functional normalization extends the idea of quantile normalization by adjusting for known covariates that measure unwanted variation. For the 450k array, the first k principal components of the internal control probes matrix play the role of the covariates adjusting for technical variation. The number k of principal components can be set by the argument nPCs. By default nPCs is set to 2, a value that has been shown to perform consistently well across different datasets. This parameter should only be modified by expert users. (Note that nPCs and sex do not appear among the getTCGA arguments listed under Usage.)

The normalization procedure is applied to the Meth and Unmeth intensities separately, and to type I and type II signals separately. For the probes on the X and Y chromosomes, males and females are normalized separately using the gender information provided in the sex argument. For the Y chromosome, standard quantile normalization is used because the small number of probes makes functional normalization unstable. If sex is unspecified (NULL), a guess is made by the getSex function using copy number information.

Note that this algorithm does not rely on any assumption and can therefore be applied in cases where global changes are expected, such as cancer-normal comparisons or tissue differences.
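
To make the idea of functional normalization more concrete, here is a toy base-R sketch: per-sample quantiles are regressed on the first two principal components of a control-probe matrix, and the fitted covariate effect is removed before mapping values back to their original positions. It is a simplified illustration on simulated data, not the minfi implementation; all object names (signal, controls, adj_q, and so on) are made up for the example, and details such as separate handling of probe types and sex chromosomes are omitted.

```r
set.seed(1)
n_probes  <- 200
n_samples <- 6

# Simulated intensities (probes x samples) and control probes (controls x samples).
signal   <- matrix(rexp(n_probes * n_samples), n_probes, n_samples)
controls <- matrix(rnorm(20 * n_samples), 20, n_samples)

# Covariates: first two principal components of the control probes.
# prcomp() centers the data, so each PC has mean zero across samples.
pcs <- prcomp(t(controls), center = TRUE)$x[, 1:2, drop = FALSE]

# Empirical quantiles: sorted values of each sample (quantile levels x samples).
q <- apply(signal, 2, sort)

# Least-squares fit of every quantile level on an intercept plus the PCs.
design <- cbind(1, pcs)
fit <- solve(crossprod(design), crossprod(design, t(q)))  # 3 x n_probes

# Remove only the PC effect; the mean quantile function and residuals remain.
adj_q <- q - t(pcs %*% fit[-1, , drop = FALSE])

# Map each original value to the adjusted quantile of the same rank.
normalized <- signal
for (j in seq_len(n_samples)) {
  normalized[, j] <- adj_q[rank(signal[, j], ties.method = "first"), j]
}
```

With no covariates at all, every sample would be mapped to the same mean quantile function, which is ordinary quantile normalization; the covariates let the procedure remove only the variation explained by the control probes.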

Value

If return is TRUE, an object of class RGChannelSet for 450k array data, and an object of class MethylSet for 27k array data. If idat is TRUE, the raw files are saved to the disk.
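
The mapping from platform to returned class stated above can be written down as a small lookup, which may be handy when checking a result with inherits(). The helper expected_class is hypothetical and not part of tcgaR; it only restates the documented return types.

```r
# Hypothetical helper, not part of tcgaR: class documented for each platform.
expected_class <- function(platform) {
  switch(platform,
         "450k" = "RGChannelSet",
         "27k"  = "MethylSet",
         stop("platform should be either \"450k\" or \"27k\""))
}

class_450k <- expected_class("450k")
class_27k  <- expected_class("27k")
```

Given an object obj returned for the 27k platform, a check along the lines of inherits(obj, expected_class("27k")) would then confirm the documented class.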

Author(s)

Jean-Philippe Fortin <jfortin@jhsph.edu>

Examples

## Not run: 
  obj <- getTCGA("coad", datatype="methylation", platform="27k", n.samples=10)

## End(Not run)

Jfortin1/tcgaR documentation built on May 7, 2019, 10:40 a.m.