View source: R/gPCA.batchdetect.R
Tests for batch effects in an n \times p data set, with the batch vector given by batch, using the δ statistic resulting from guided principal components analysis (gPCA).
gPCA.batchdetect(x, batch, filt = NULL, nperm = 1000, center = FALSE, scaleY = FALSE,
                 seed = NULL)
x |
an n x p matrix of data, where n denotes the number of observations and p denotes the number of features (e.g. probes, genes, SNPs). |
batch |
a length n vector that indicates batch (group or class) for each observation. |
filt |
(optional) the number of features to retain after applying a variance filter. If NULL, no filter is applied. Filtering can significantly reduce the processing time in the case of very large data sets. |
nperm |
the number of permutations to perform for the permutation test, default is 1000. |
center |
(logical) Is your data |
scaleY |
(logical) Do you want to scale the |
seed |
the seed number for |
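The variance filter controlled by filt can be pictured as keeping the filt highest-variance columns of x. The following is a minimal base-R sketch under that assumption; the helper name variance_filter is hypothetical and not part of the package, whose internal implementation may differ (for example, in how ties are handled).

```r
# Hypothetical helper (not part of gPCA): keep the `filt` columns of `x`
# with the largest sample variance; return `x` unchanged when filt is NULL.
variance_filter <- function(x, filt = NULL) {
  if (is.null(filt) || filt >= ncol(x)) return(x)
  feature_var <- apply(x, 2, var)                       # per-feature variance
  keep <- order(feature_var, decreasing = TRUE)[seq_len(filt)]
  x[, sort(keep), drop = FALSE]                         # preserve column order
}

set.seed(1)
x <- matrix(rnorm(10 * 6), nrow = 10)   # 10 observations, 6 features
x[, 3] <- x[, 3] * 10                   # make feature 3 clearly the most variable
x_filt <- variance_filter(x, filt = 2)
dim(x_filt)
```

Because only the retained columns enter the SVD, a smaller filt reduces the cost of every permutation, which is where most of the run time goes.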
Guided principal components analysis (gPCA) is an extension of principal components analysis (PCA) that guides the singular value decomposition (SVD) of PCA by applying SVD to \mathbf{Y}'\mathbf{X}, where \mathbf{Y} is an n \times b batch indicator matrix whose (i, k) entry is one when observation i (i=1,…,n) is in batch k (k=1,…,b) and zero otherwise.
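To make the construction concrete, here is a minimal base-R sketch that builds the indicator matrix, applies SVD to Y'X, and forms a δ-like ratio. It assumes x is column-centered first, and it takes δ to be the variance of the first guided principal component divided by the variance of the first unguided principal component, which is our reading of Reese et al. All variable names are illustrative, and this is not the package's own code.

```r
set.seed(1)
n <- 20; p <- 50
batch <- rep(1:2, each = n / 2)                   # two batches of 10 observations
x <- matrix(rnorm(n * p), nrow = n, ncol = p)
x[batch == 2, 1:10] <- x[batch == 2, 1:10] + 2    # simulated batch shift in 10 features

# Column-center the data (assumption: gPCA is applied to centered data)
xc <- scale(x, center = TRUE, scale = FALSE)

# n x b indicator matrix: Y[i, k] = 1 if observation i is in batch k, else 0
Y <- outer(batch, sort(unique(batch)), "==") * 1

# Guided SVD on Y'X versus ordinary (unguided) SVD on X
sv_g <- svd(t(Y) %*% xc)
sv_u <- svd(xc)

# First principal component scores under each set of loadings
pc_g1 <- xc %*% sv_g$v[, 1]
pc_u1 <- xc %*% sv_u$v[, 1]

# Ratio of variances; values near 1 suggest the leading direction of
# variation coincides with the batch structure
delta <- as.numeric(var(pc_g1) / var(pc_u1))
delta
```

Because the first unguided principal component maximizes variance over all unit-length loading vectors, this ratio cannot exceed 1.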
A call to gPCA.batchdetect() returns the test statistic δ and a one-sided p-value, along with the values of δ_p from the permutation test. The δ_p values can be visualized as the permutation distribution of the test statistic using the gDist function. For more information on gPCA, please see Reese et al. (listed under References).
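The permutation test can be sketched in base R as follows. This is an illustration under stated assumptions rather than the package's implementation: compute_delta is a hypothetical helper, δ is taken as the ratio of first-component variances (guided over unguided), and the one-sided p-value is taken as the proportion of permuted δ_p values that exceed the observed δ.

```r
# Hypothetical helper: delta = var(first guided PC) / var(first unguided PC)
compute_delta <- function(xc, batch) {
  Y <- outer(batch, sort(unique(batch)), "==") * 1   # n x b indicator matrix
  v_g <- svd(t(Y) %*% xc)$v[, 1]                     # first guided loading vector
  v_u <- svd(xc)$v[, 1]                              # first unguided loading vector
  as.numeric(var(xc %*% v_g) / var(xc %*% v_u))
}

set.seed(42)
n <- 20; p <- 30
batch <- rep(1:2, each = n / 2)
x <- matrix(rnorm(n * p), nrow = n)
x[batch == 2, ] <- x[batch == 2, ] + 1.5             # simulated batch shift
xc <- scale(x, center = TRUE, scale = FALSE)         # column-center the data

delta <- compute_delta(xc, batch)

# Permute the batch labels to obtain the null distribution of delta
nperm <- 200
delta_p <- replicate(nperm, compute_delta(xc, sample(batch)))

# One-sided p-value: share of permuted statistics exceeding the observed one
p_val <- mean(delta_p > delta)
```

A histogram of delta_p with a vertical line at delta gives the same kind of picture that gDist is described as producing.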
delta |
test statistic δ from gPCA. |
p.val |
p-value associated with δ resulting from gPCA. |
delta.p |
the values of δ_p from the permutation test. |
batch |
returns your length n batch vector. |
filt |
returns the number of features the variance filter retained. |
n |
the number of observations |
p |
the number of features |
b |
the number of batches |
PCu |
principal component matrix from unguided PCA. |
PCg |
principal component matrix from gPCA. |
varPCu1 |
the proportion of the total variance associated with the first principal component of unguided PCA. |
varPCg1 |
the proportion of the total variance associated with the first principal component of gPCA. |
cumulative.var.u |
length n vector of the cumulative variance of the i=1,…,n principal components from unguided PCA. |
cumulative.var.g |
length b vector of the cumulative variance of the k=1,…,b principal components from gPCA. |
Sarah Reese reesese@vcu.edu
Reese, S. E., Archer, K. J., Therneau, T. M., Atkinson, E. J., Vachon, C. M., de Andrade, M., Kocher, J. A., and Eckel-Passow, J. E. A new statistic for identifying batch effects in high-throughput genomic data that uses guided principal components analysis. Bioinformatics, (in review).
gDist, PCplot, CumulativeVarPlot