Calculate the sample eigenvectors using the specified SNP loadings.
snpgdsPCASampLoading(loadobj, gdsobj, sample.id=NULL, num.thread=1, verbose=TRUE)
loadobj | the object returned by snpgdsPCASNPLoading
gdsobj | a GDS file object
sample.id | a vector of sample ids specifying the selected samples; if NULL, all samples are used
num.thread | the number of CPU cores used
verbose | if TRUE, show information
The samples specified in sample.id are usually different from the samples used to calculate the SNP loadings.
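As a rough illustration of the idea, the base R sketch below projects samples onto loadings derived from a different set of samples. It uses simulated genotypes and a plain SVD rather than SNPRelate. The names geno, train, and project are hypothetical, and the centering used here is an assumption that will not match the package's internal normalization.

```r
# Toy sketch of projecting samples onto SNP loadings (base R only).
set.seed(1)
n <- 20   # number of samples
p <- 50   # number of SNPs
k <- 3    # number of eigenvectors kept

# simulated genotype matrix coded 0/1/2, samples in rows
geno <- matrix(rbinom(n * p, 2, 0.3), nrow = n)

# "training" samples used to derive the loadings
train <- 1:15
mu <- colMeans(geno[train, ])
Xc <- sweep(geno[train, ], 2, mu)   # center each SNP on the training mean

# SVD: columns of sv$u are sample eigenvectors, columns of sv$v are SNP loadings
sv <- svd(Xc, nu = k, nv = k)

# project any samples onto the loadings, reusing the training means
project <- function(g) {
  sweep(g, 2, mu) %*% sv$v %*% diag(1 / sv$d[1:k], k)
}

# projecting the training samples recovers their own eigenvectors
reproj <- project(geno[train, , drop = FALSE])
max_abs_diff <- max(abs(reproj - sv$u))

# samples not used to derive the loadings get coordinates in the same space
newproj <- project(geno[-train, , drop = FALSE])
```

Reusing the training means when centering the new samples keeps both sets of coordinates in the same space, which is why re-projecting the training samples reproduces their eigenvectors up to floating-point error.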
Returns a snpgdsPCAClass object, which is a list with the following components:
sample.id | the sample ids used in the analysis
snp.id | the SNP ids used in the analysis
eigenval | eigenvalues
eigenvect | eigenvectors, “# of samples” x “eigen.cnt”
TraceXTX | the trace of the genetic covariance matrix
Bayesian | whether Bayesian normalization is used
Xiuwen Zheng
Patterson N, Price AL, Reich D (2006) Population structure and eigenanalysis. PLoS Genetics 2:e190.
Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D (2006) Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet. 38, 904-909.
Zhu, X., Li, S., Cooper, R. S., and Elston, R. C. (2008). A unified association analysis approach for family and unrelated samples correcting for stratification. Am J Hum Genet, 82(2), 352-365.
snpgdsPCA, snpgdsPCACorr, snpgdsPCASNPLoading
# open an example dataset (HapMap)
genofile <- openfn.gds(snpgdsExampleFileName())
sample.id <- read.gdsn(index.gdsn(genofile, "sample.id"))
PCARV <- snpgdsPCA(genofile, eigen.cnt=8)
SnpLoad <- snpgdsPCASNPLoading(PCARV, genofile)
# calculate sample eigenvectors from SNP loadings
SL <- snpgdsPCASampLoading(SnpLoad, genofile, sample.id=sample.id[1:100])
diff <- PCARV$eigenvect[1:100,] - SL$eigenvect
summary(c(diff))
# ~ ZERO
# close the genotype file
closefn.gds(genofile)
Loading required package: gdsfmt
SNPRelate -- supported by Streaming SIMD Extensions 2 (SSE2)
Hint: it is suggested to call `snpgdsOpen' to open a SNP GDS file instead of `openfn.gds'.
Principal Component Analysis (PCA) on genotypes:
Excluding 365 SNPs on non-autosomes
Excluding 1 SNP (monomorphic: TRUE, MAF: NaN, missing rate: NaN)
Working space: 279 samples, 8,722 SNPs
using 1 (CPU) core
PCA: the sum of all selected genotypes (0,1,2) = 2446510
CPU capabilities: Double-Precision SSE2
Thu Dec 7 09:10:20 2017 (internal increment: 408)
[..................................................] 0%, ETC: ---
[==================================================] 100%, completed in 0s
Thu Dec 7 09:10:20 2017 Begin (eigenvalues and eigenvectors)
Thu Dec 7 09:10:20 2017 Done.
Hint: it is suggested to call `snpgdsOpen' to open a SNP GDS file instead of `openfn.gds'.
SNP loading:
Working space: 279 samples, 8722 SNPs
using 1 (CPU) core
using the top 8 eigenvectors
SNP Loading: the sum of all selected genotypes (0,1,2) = 2446510
Thu Dec 7 09:10:20 2017 (internal increment: 3288)
[..................................................] 0%, ETC: ---
[==================================================] 100%, completed in 0s
Thu Dec 7 09:10:20 2017 Done.
Hint: it is suggested to call `snpgdsOpen' to open a SNP GDS file instead of `openfn.gds'.
Sample loading:
Working space: 100 samples, 8722 SNPs
using 1 (CPU) core
using the top 8 eigenvectors
Sample Loading: the sum of all selected genotypes (0,1,2) = 878146
Thu Dec 7 09:10:21 2017 (internal increment: 9172)
[..................................................] 0%, ETC: ---
[==================================================] 100%, completed in 0s
Thu Dec 7 09:10:21 2017 Done.
Min. 1st Qu. Median Mean 3rd Qu. Max.
-8.882e-16 -6.939e-17 -1.735e-18 2.873e-17 6.700e-17 3.553e-15