Calculates the correlations between eigenvectors and SNP genotypes.
pcaobj |
a |
gdsobj |
an object of class |
snp.id |
a vector of SNP ids specifying the selected SNPs; if NULL, all SNPs are used |
eig.which |
a vector of integers specifying which eigenvectors to use |
num.thread |
the number of (CPU) cores used; if |
with.id |
if |
outgds |
|
verbose |
if TRUE, show information |
If an output file name is specified via outgds, "sample.id", "snp.id" and
"correlation" will be stored in the GDS file. The GDS node "correlation" is a
matrix of correlation coefficients, stored in the packed real number format
("packedreal16", preserving 4 digits; 0.0001 is the smallest number greater
than zero, see add.gdsn).
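The precision limit described above can be illustrated in base R. The following is only a sketch: it emulates 16-bit packed storage by rounding to a step of 0.0001, which is an assumption made for illustration and not the actual gdsfmt encoding. The variable names (`step`, `packed`, `restored`) are hypothetical.

```r
# Emulate storing correlation coefficients as 16-bit integer codes with a
# step of 0.0001 (assumed here for illustration; gdsfmt does the real work).
set.seed(1)
x <- runif(1000, min = -1, max = 1)     # stand-in correlation coefficients
step <- 1e-4                            # smallest representable increment
packed <- as.integer(round(x / step))   # integer codes, within the 16-bit range
restored <- packed * step               # values as they would be read back
max_err <- max(abs(restored - x))       # rounding error, at most half a step
print(max_err)
```

Under this assumption the round-trip error is bounded by half a step (0.00005), which is why values read back from the GDS file can differ slightly from the in-memory matrix.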
Returns a list if outgds=NULL:
sample.id |
the sample ids used in the analysis |
snp.id |
the SNP ids used in the analysis |
snpcorr |
a matrix of correlation coefficients, "# of eigenvectors" x "# of SNPs" |
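To make the shape and meaning of such a matrix concrete, here is a conceptual sketch in base R on simulated genotypes. It is not the package's implementation: it uses `prcomp` scores as a stand-in for the sample eigenvectors (they are proportional, and correlation is scale-invariant), and the simulated data contain no missing genotypes. All object names are hypothetical.

```r
set.seed(42)
n_samp <- 50
maf <- runif(20, min = 0.1, max = 0.5)               # simulated allele frequencies
# genotypes coded 0/1/2, samples in rows, SNPs in columns
geno <- sapply(maf, function(p) rbinom(n_samp, size = 2, prob = p))
geno <- geno[, apply(geno, 2, sd) > 0, drop = FALSE]  # drop monomorphic SNPs

# PCA on standardized genotypes; the score columns stand in for eigenvectors
pc <- prcomp(geno, center = TRUE, scale. = TRUE)
eig <- pc$x[, 1:4]

# one row per eigenvector, one column per SNP
snpcorr <- cor(eig, geno)
print(dim(snpcorr))
```

Each entry is the Pearson correlation between one eigenvector and the genotype vector of one SNP, so all values lie in [-1, 1].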
Xiuwen Zheng
Patterson N, Price AL, Reich D (2006) Population structure and eigenanalysis. PLoS Genetics 2:e190.
snpgdsPCA, snpgdsPCASampLoading, snpgdsPCASNPLoading
# open an example dataset (HapMap)
genofile <- snpgdsOpen(snpgdsExampleFileName())
# get chromosome index
chr <- read.gdsn(index.gdsn(genofile, "snp.chromosome"))
pca <- snpgdsPCA(genofile)
cr <- snpgdsPCACorr(pca, genofile, eig.which=1:4)
plot(abs(cr$snpcorr[3,]), xlab="SNP Index", ylab="PC 3", col=chr)
# output to a gds file if limited memory
snpgdsPCACorr(pca, genofile, eig.which=1:4, outgds="test.gds")
(f <- openfn.gds("test.gds"))
m <- read.gdsn(index.gdsn(f, "correlation"))
closefn.gds(f)
# check
summary(c(m - cr$snpcorr)) # differences should be < 1e-4
# close the file
snpgdsClose(genofile)
# delete the temporary file
unlink("test.gds", force=TRUE)
Loading required package: gdsfmt
SNPRelate -- supported by Streaming SIMD Extensions 2 (SSE2)
Principal Component Analysis (PCA) on genotypes:
Excluding 365 SNPs on non-autosomes
Excluding 1 SNP (monomorphic: TRUE, MAF: NaN, missing rate: NaN)
# of samples: 279
# of SNPs: 8,722
using 1 thread
# of principal components: 32
PCA: the sum of all selected genotypes (0,1,2) = 2446510
CPU capabilities: Double-Precision SSE2
Wed Feb 17 17:09:36 2021 (internal increment: 408)
[..................................................] 0%, ETC: ---
[==================================================] 100%, completed, 0s
Wed Feb 17 17:09:36 2021 Begin (eigenvalues and eigenvectors)
Wed Feb 17 17:09:36 2021 Done.
SNP Correlation:
# of samples: 279
# of SNPs: 9,088
using 1 thread
Correlation: the sum of all selected genotypes (0,1,2) = 2553065
Wed Feb 17 17:09:36 2021 (internal increment: 3288)
[..................................................] 0%, ETC: ---
[==================================================] 100%, completed, 0s
Wed Feb 17 17:09:36 2021 Done.
SNP Correlation:
# of samples: 279
# of SNPs: 9,088
using 1 thread
Creating 'test.gds' ...
Correlation: the sum of all selected genotypes (0,1,2) = 2553065
Wed Feb 17 17:09:36 2021
[..................................................] 0%, ETC: ---
[==================================================] 100%, completed, 0s
Wed Feb 17 17:09:36 2021 Done.
File: /work/tmp/test.gds (66.4K)
+ [ ]
|--+ sample.id { Str8 279 LZMA_ra(30.5%), 701B }
|--+ snp.id { Int32 9088 LZMA_ra(10.1%), 3.6K }
\--+ correlation { PackedReal16 4x9088 LZMA_ra(86.4%), 61.3K }
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
-5e-05 -2e-05 0e+00 0e+00 3e-05 5e-05 32