Principal components (PCs) can capture bias (when common variance represents bias), so PC correction is a way to remove such bias from a dataset. Using the first 'n' PCs from an analysis performed with big.PCA(), this function transforms the original matrix by regressing it onto the 'n' principal components (and optionally gender) and returning the residuals. The result is returned as a big.matrix object, so that objects larger than available RAM can be processed, and multiple processors can be utilised for greater speed on large datasets.
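The regression step described above can be illustrated on a small in-memory matrix with base R only. This is a minimal sketch under stated assumptions, not the package's internal code: the simulated data, the object names (`X`, `batch`, `pcs`, `corrected`) and the use of `svd()` and `lm.fit()` are all chosen for illustration, and variables are assumed to be in rows with samples in columns.

```r
# Illustrative base-R sketch of PC correction (NOT the package internals).
set.seed(1)
n.var <- 50; n.samp <- 20

# Simulated data: variables in rows, samples in columns.
batch <- rnorm(n.samp)                 # hypothetical shared "bias" signal
loadings <- rnorm(n.var)
X <- outer(loadings, batch) +
  matrix(rnorm(n.var * n.samp, sd = 0.1), n.var, n.samp)

# PCA via SVD of the row-centred matrix; columns of sv$v are sample-level PCs.
Xc <- X - rowMeans(X)
sv <- svd(Xc)
num.pcs <- 2
pcs <- sv$v[, seq_len(num.pcs), drop = FALSE]   # n.samp x num.pcs

# Regress every variable (row) on the PCs in one call and keep the residuals.
fit <- lm.fit(x = cbind(1, pcs), y = t(X))      # y is n.samp x n.var
corrected <- t(fit$residuals)                   # back to variables x samples

# The residuals are orthogonal to the PCs that were removed (value near zero).
max(abs(corrected %*% pcs))
```

Because the design matrix includes an intercept, each row of `corrected` has mean zero; restoring the original means is a separate step.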
pca.result | result returned by 'big.PCA()', or a list with 2 elements containing the principal components and the eigenvalues respectively (or SVD equivalents). Alternatively, the name of an R binary file containing such an object.
bigMat | a big.matrix whose samples (columns) correspond exactly to those submitted to the PCA prior to correction
dir | directory containing the big.matrix backing file
num.pcs | number of principal components (or SVD components) to correct for
n.cores | number of cores to use in parallel for processing
pref | prefix to add to the file name of the resulting corrected matrix backing file
big.cor.fn | the desired file name, specified directly instead of using 'pref'
write | whether to write the result to a file-backed big.matrix or simply return a pointer to the resulting corrected big.matrix
sample.info | if using 'correct.sex=TRUE', this object should be a data frame containing the sex of each sample, with sample names as rownames
correct.sex | if sample.info is a data frame containing a column named 'gender' or 'sex' (case insensitive), then add a sex covariate to the PC correction linear model
add.int | logical, whether to maintain the pre-correction mean of each variable, i.e., after correction, add the mean back onto the residuals, which will otherwise have mean zero for each variable
preserve.median | logical, if add.int=TRUE, then setting this parameter to TRUE will preserve the median of the original data instead of the mean. This option exists because the skew may change after PC correction.
tracker | logical, whether to display a progress bar
verbose | logical, whether to display a preview of the pre- and post-correction matrix
delete.existing | logical, whether to automatically delete file-backed matrices (if they exist) before rewriting. This is needed because, since an update on 20th October 2015, bigmemory no longer allows an existing file-backed matrix to be overwritten. To set this always TRUE or FALSE, use options(deleteFileBacked)
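The effect of the 'correct.sex', 'add.int' and 'preserve.median' arguments can be sketched in base R as follows. This is an illustration under assumptions rather than the function's actual implementation: the simulated matrix, the `sex` factor, the helper `row.med()` and, in particular, the median-shifting step are hypothetical choices made for this example.

```r
# Illustrative base-R sketch (NOT the package internals).
set.seed(2)
n.var <- 30; n.samp <- 40
X <- matrix(rexp(n.var * n.samp), n.var, n.samp)   # skewed data, variables in rows
sex <- factor(rep(c("F", "M"), each = n.samp / 2)) # hypothetical sample sexes
pcs <- svd(X - rowMeans(X))$v[, 1:2, drop = FALSE] # first two sample-level PCs

# Analogue of correct.sex=TRUE: add a sex indicator to the design matrix.
design <- cbind(1, pcs, as.numeric(sex == "M"))
res <- t(lm.fit(design, t(X))$residuals)           # each row now has mean ~0

# Analogue of add.int=TRUE: put the original row means back.
with.mean <- res + rowMeans(X)

# Analogue of preserve.median=TRUE (ASSUMED approach): shift each row so that
# its median matches the median of the original row.
row.med <- function(m) apply(m, 1, median)
with.median <- res - row.med(res) + row.med(X)
```

With skewed data such as this, the two variants differ by a constant per row, which is why a separate switch for the median can be useful.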
A big.matrix of the same dimensions as the original, corrected for n PCs and an optional covariate (sex).
Nicholas Cooper
orig.dir <- getwd(); setwd(tempdir()); # move to temporary dir
if(file.exists("testMyBig.bck")) { unlink(c("testMyBig.bck","testMyBig.dsc")) }
mat2 <- sim.cor(500,200,genr=function(n){ (runif(n)/2+.5) })
bmat2 <- as.big.matrix(mat2,backingfile="testMyBig.bck",
descriptorfile="testMyBig.dsc", backingpath = getwd())
## calculate PCA ##
result2 <- big.PCA(bmat2,thin=FALSE)
corrected <- PC.correct(result2,bmat2)
corrected2 <- PC.correct(result2,bmat2,n.cores=2)
c1 <- get.big.matrix(corrected) ; c2 <- get.big.matrix(corrected2)
all.equal(as.matrix(c1),as.matrix(c2))
rm(bmat2)
unlink(c("testMyBig.bck","testMyBig.dsc"))
setwd(orig.dir) # reset working dir to original