snpgdsPCASNPLoading: SNP loadings in principal component analysis



Description

Calculates the SNP loadings in principal component analysis (PCA).

Usage

snpgdsPCASNPLoading(pcaobj, gdsobj, num.thread=1, verbose=TRUE)

Arguments

pcaobj

the snpgdsPCAClass object returned from the function snpgdsPCA

gdsobj

a GDS file object (gds.class)

num.thread

the number of CPU cores used

verbose

if TRUE, show information

Details

Calculate the SNP loadings (or SNP eigenvectors) from the principal component analysis conducted in snpgdsPCA.
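To illustrate what "SNP loadings" means here, the following is a minimal base R sketch on simulated genotypes. It is not the SNPRelate implementation: the simulated data, the variable names, the allele-frequency normalization, and the unit-length scaling of the loadings are all assumptions made for illustration, and the scaling used by snpgdsPCASNPLoading may differ.

```r
set.seed(1)
n_samp <- 20   # hypothetical number of samples
n_snp  <- 50   # hypothetical number of SNPs
k      <- 4    # number of eigenvectors to keep

# Simulate genotypes coded 0/1/2 (samples in rows, SNPs in columns)
p <- runif(n_snp, 0.1, 0.9)
G <- sapply(p, function(f) rbinom(n_samp, 2, f))

# Drop monomorphic SNPs, which cannot be normalized
G <- G[, apply(G, 2, var) > 0, drop = FALSE]

# Normalize each SNP by its estimated allele frequency (assumed convention)
phat <- colMeans(G) / 2
X <- sweep(G, 2, 2 * phat, "-")
X <- sweep(X, 2, sqrt(phat * (1 - phat)), "/")

# Eigen-decomposition of the sample-by-sample covariance matrix
eig <- eigen(tcrossprod(X) / ncol(X), symmetric = TRUE)
V <- eig$vectors[, 1:k, drop = FALSE]   # sample eigenvectors

# SNP loadings: project the normalized genotypes onto the sample
# eigenvectors, then scale each row to unit length
L <- crossprod(V, X)
L <- L / sqrt(rowSums(L^2))

dim(L)   # one row per eigenvector, one column per retained SNP
```

With this scaling the rows of L are the right singular vectors of the normalized genotype matrix, which is why they are also called SNP eigenvectors.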

Value

Returns a snpgdsPCASNPLoading object, which is a list with the following components:

sample.id

the sample ids used in the analysis

snp.id

the SNP ids used in the analysis

eigenval

eigenvalues

snploading

the SNP loadings, or SNP eigenvectors

TraceXTX

the trace of the genetic covariance matrix

Bayesian

whether Bayesian normalization was used

avgfreq

the allele frequency used in snpgdsPCA

scale

internal parameter
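As a hedged sketch of how the components above might be used, the following lists the SNPs with the largest absolute loading on the first eigenvector. It assumes that SNPRelate is installed and uses its bundled example file; the variable names (pca, load, top, top_snps) are illustrative only.

```r
library(SNPRelate)

# Open the example SNP GDS file shipped with the package
genofile <- snpgdsOpen(snpgdsExampleFileName())

pca  <- snpgdsPCA(genofile, eigen.cnt=8, verbose=FALSE)
load <- snpgdsPCASNPLoading(pca, genofile, verbose=FALSE)

# snploading has one row per eigenvector and one column per SNP,
# so its columns line up with snp.id
stopifnot(ncol(load$snploading) == length(load$snp.id))

# The ten SNPs with the largest absolute loading on the first eigenvector
top <- order(abs(load$snploading[1, ]), decreasing=TRUE)[1:10]
top_snps <- data.frame(snp.id  = load$snp.id[top],
                       loading = load$snploading[1, top])
print(top_snps)

snpgdsClose(genofile)
```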

Author(s)

Xiuwen Zheng

References

Patterson N, Price AL, Reich D (2006) Population structure and eigenanalysis. PLoS Genetics 2:e190.

Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D (2006) Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet 38:904-909.

Zhu X, Li S, Cooper RS, Elston RC (2008) A unified association analysis approach for family and unrelated samples correcting for stratification. Am J Hum Genet 82(2):352-365.

See Also

snpgdsPCA, snpgdsPCASampLoading, snpgdsPCACorr

Examples

# open an example dataset (HapMap)
genofile <- openfn.gds(snpgdsExampleFileName())

PCARV <- snpgdsPCA(genofile, eigen.cnt=8)
SnpLoad <- snpgdsPCASNPLoading(PCARV, genofile)

names(SnpLoad)
# [1] "sample.id"  "snp.id"     "eigenval"   "snploading" "TraceXTX"
# [6] "Bayesian"   "avgfreq"    "scale"
dim(SnpLoad$snploading)
# [1]     8 8722

plot(SnpLoad$snploading[1,], type="h", ylab="PC 1")

# close the genotype file
closefn.gds(genofile)

Example output

Loading required package: gdsfmt
SNPRelate -- supported by Streaming SIMD Extensions 2 (SSE2)
Hint: it is suggested to call `snpgdsOpen' to open a SNP GDS file instead of `openfn.gds'.
Principal Component Analysis (PCA) on genotypes:
Excluding 365 SNPs on non-autosomes
Excluding 1 SNP (monomorphic: TRUE, MAF: NaN, missing rate: NaN)
Working space: 279 samples, 8,722 SNPs
    using 1 (CPU) core
PCA:    the sum of all selected genotypes (0,1,2) = 2446510
CPU capabilities: Double-Precision SSE2
Wed Feb 14 12:59:37 2018    (internal increment: 408)

[..................................................]  0%, ETC: ---    
[==================================================] 100%, completed in 0s
Wed Feb 14 12:59:37 2018    Begin (eigenvalues and eigenvectors)
Wed Feb 14 12:59:37 2018    Done.
Hint: it is suggested to call `snpgdsOpen' to open a SNP GDS file instead of `openfn.gds'.
SNP loading:
Working space: 279 samples, 8722 SNPs
    using 1 (CPU) core
    using the top 8 eigenvectors
SNP Loading:    the sum of all selected genotypes (0,1,2) = 2446510
Wed Feb 14 12:59:37 2018    (internal increment: 3288)

[..................................................]  0%, ETC: ---    
[==================================================] 100%, completed in 0s
Wed Feb 14 12:59:37 2018    Done.
[1] "sample.id"  "snp.id"     "eigenval"   "snploading" "TraceXTX"  
[6] "Bayesian"   "avgfreq"    "scale"     
[1]    8 8722

SNPRelate documentation built on May 2, 2019, 4:56 p.m.