PC.correct: Correct a big.matrix by principal components


View source: R/bigpca.R

Description

Principal components (PCs) can capture bias (when common variance represents bias), so PC correction is a way to remove such bias from a dataset. Using the first 'n' PCs from an analysis performed with big.PCA(), this function transforms the original matrix by regressing it onto the 'n' principal components (and optionally sex) and returning the residuals. The result is returned as a big.matrix object, so objects larger than available RAM can be processed, and multiple processors can be used for greater speed on large datasets.
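The regression step described above can be pictured on an ordinary in-memory matrix. The following is a minimal base R sketch, not the package's implementation: the simulated data, the variable names and the use of prcomp() and lm.fit() are assumptions made for illustration, whereas PC.correct() itself works on a big.matrix and takes its components from big.PCA().

```r
set.seed(1)
n.var <- 50; n.samp <- 20

# Simulated data: variables in rows, samples in columns (illustrative only)
mat <- matrix(rnorm(n.var * n.samp), nrow = n.var)

# Add a shared bias that affects the second half of the samples
batch <- rep(c(0, 1), each = n.samp / 2)
mat <- mat + outer(runif(n.var), batch)

# Sample-level scores on the first 3 PCs (a samples x 3 matrix)
pcs <- prcomp(t(mat), center = TRUE, scale. = FALSE)$x[, 1:3]

# Regress every variable on the PCs (plus an intercept) in one call
# and keep the residuals as the corrected values
fit <- lm.fit(x = cbind(1, pcs), y = t(mat))
corrected <- t(fit$residuals)
```

Each row of corrected has mean zero and is uncorrelated with the three PCs used, which is the sense in which the common variance has been removed.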

Usage

PC.correct(pca.result, bigMat, dir = getwd(), num.pcs = 9, n.cores = 1,
  pref = "corrected", big.cor.fn = NULL, write = FALSE,
  sample.info = NULL, correct.sex = FALSE, add.int = FALSE,
  preserve.median = FALSE, tracker = TRUE, verbose = TRUE,
  delete.existing = getOption("deleteFileBacked"))

Arguments

pca.result

result returned by 'big.PCA()', or a list with 2 elements containing the principal components and the eigenvalues respectively (or SVD equivalents). Alternatively, can be the name of an R binary file containing such an object.

bigMat

a big.matrix whose samples (columns) correspond exactly to those submitted to the PCA prior to correction

dir

directory containing the big.matrix backing file

num.pcs

number of principal components (or SVD components) to correct for

n.cores

number of cores to use in parallel for processing

pref

prefix to add to the file name of the resulting corrected matrix backing file

big.cor.fn

instead of using 'pref', directly specify the desired file name

write

whether to write the result to a file-backed big.matrix, or simply to return a pointer to the resulting corrected big.matrix

sample.info

if using 'correct.sex=TRUE' then this object should be a dataframe containing the sex of each sample, with sample names as rownames

correct.sex

if sample.info is a dataframe containing a column named 'gender' or 'sex' (case insensitive), then add a sex covariate to the PC correction linear model

add.int

logical, whether to maintain the pre-correction means of each variable, i.e., after correction add the mean back onto the residuals, which will otherwise have mean zero for each variable.

preserve.median

logical, if add.int=TRUE, then setting this parameter to TRUE will preserve the median of the original data, instead of the mean. This is because after PC-correction the skew may change.

tracker

logical, whether to display a progress bar

verbose

logical, whether to display preview of pre- and post- corrected matrix

delete.existing

logical, whether to automatically delete file-backed matrices (if they exist) before rewriting. This is needed because, since an update on 20th October 2015, bigmemory does not allow an existing file-backed matrix to be overwritten. To set this to always be TRUE or FALSE, use options(deleteFileBacked = TRUE) or options(deleteFileBacked = FALSE).
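The effect of 'correct.sex', 'add.int' and 'preserve.median' can be pictured with a small base R sketch. Everything here is an assumption made for illustration: the simulated data, the numeric coding of sex, and the helper restore.centre() are hypothetical and are not part of bigpca, and the package may compute the adjustment differently.

```r
set.seed(42)
n.var <- 30; n.samp <- 16

# Skewed illustrative data: variables in rows, named samples in columns
orig <- matrix(rexp(n.var * n.samp), nrow = n.var,
               dimnames = list(NULL, paste0("s", seq_len(n.samp))))

# Hypothetical sample.info: a 'sex' column, sample names as rownames
sample.info <- data.frame(sex = rep(c(1, 2), n.samp / 2),
                          row.names = colnames(orig))

# Design matrix: intercept, first 2 PCs, and sex matched by sample name
pcs <- prcomp(t(orig))$x[, 1:2]
design <- cbind(1, pcs, sex = sample.info[colnames(orig), "sex"])
resid <- t(lm.fit(design, t(orig))$residuals)

# Hypothetical helper: shift each variable's residuals so that their
# mean (or median) matches that of the original data
restore.centre <- function(orig, resid, use.median = FALSE) {
  centre <- if (use.median) function(m) apply(m, 1, median) else rowMeans
  resid + (centre(orig) - centre(resid))
}

with.mean   <- restore.centre(orig, resid)                     # like add.int
with.median <- restore.centre(orig, resid, use.median = TRUE)  # like preserve.median
```

For skewed data such as this, the two results differ by a per-variable constant, which is why the choice between mean and median matters only when the corrected distribution is asymmetric.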

Value

A big.matrix of the same dimensions as the original, corrected for n PCs and, optionally, a sex covariate

Author(s)

Nicholas Cooper

See Also

big.PCA

Examples

library(bigpca)
orig.dir <- getwd(); setwd(tempdir())  # move to temporary dir
# remove backing files left over from a previous run
if (file.exists("testMyBig.bck")) { unlink(c("testMyBig.bck", "testMyBig.dsc")) }
## simulate correlated data and store it as a file-backed big.matrix ##
mat2 <- sim.cor(500, 200, genr = function(n) { (runif(n) / 2 + 0.5) })
bmat2 <- as.big.matrix(mat2, backingfile = "testMyBig.bck",
  descriptorfile = "testMyBig.dsc", backingpath = getwd())
## calculate PCA ##
result2 <- big.PCA(bmat2, thin = FALSE)
## correct using the default num.pcs, on one core and then on two ##
corrected <- PC.correct(result2, bmat2)
corrected2 <- PC.correct(result2, bmat2, n.cores = 2)
c1 <- get.big.matrix(corrected); c2 <- get.big.matrix(corrected2)
all.equal(as.matrix(c1), as.matrix(c2))  # compare the two runs
rm(bmat2)
unlink(c("testMyBig.bck", "testMyBig.dsc"))
setwd(orig.dir)  # reset working dir to original

bigpca documentation built on Nov. 22, 2017, 1:02 a.m.