snpgdsPCASNPLoading: SNP loadings in principal component analysis



Description

Calculates the SNP loadings from a principal component analysis (PCA).

Usage

snpgdsPCASNPLoading(pcaobj, gdsobj, num.thread=1L, verbose=TRUE)

Arguments

pcaobj

a snpgdsPCAClass object returned from the function snpgdsPCA or a snpgdsEigMixClass from snpgdsEIGMIX

gdsobj

an object of class SNPGDSFileClass, a SNP GDS file

num.thread

the number of (CPU) cores used; if NA, detect the number of cores automatically

verbose

if TRUE, print progress information

Details

Calculates the SNP loadings (also called SNP eigenvectors) from the principal component analysis performed by snpgdsPCA or snpgdsEIGMIX.
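
As a rough illustration of what a SNP loading is, the following base R sketch uses simulated genotypes rather than a GDS file. It assumes the genotype normalization of Patterson et al. (2006) and unit-length loadings. The variable names (geno, X, loadings) are made up for this example, and the scaling that snpgdsPCASNPLoading applies internally may differ.

```r
# Toy data: 6 samples x 10 SNPs, genotypes coded 0/1/2 (simulated, not from SNPRelate)
set.seed(1)
n <- 6
m <- 10
geno <- matrix(rbinom(n * m, size = 2, prob = 0.3), nrow = n, ncol = m)

# Drop monomorphic SNPs, then center each SNP by twice its allele frequency
# and scale by sqrt(p * (1 - p))
p <- colMeans(geno) / 2
keep <- p > 0 & p < 1
geno <- geno[, keep, drop = FALSE]
p <- p[keep]
X <- sweep(geno, 2, 2 * p, "-")
X <- sweep(X, 2, sqrt(p * (1 - p)), "/")

# Eigendecomposition in sample space (an n x n matrix)
eig <- eigen(tcrossprod(X), symmetric = TRUE)
k <- 2
U <- eig$vectors[, 1:k, drop = FALSE]   # sample eigenvectors
lambda <- eig$values[1:k]               # matching eigenvalues

# Project the sample eigenvectors back onto the SNPs.
# Result: one row per component, one column per SNP.
loadings <- t(crossprod(X, U)) / sqrt(lambda)

dim(loadings)
rowSums(loadings^2)
```

Under these assumptions each row of loadings has unit length and, up to sign, matches the corresponding right singular vector from svd(X).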

Value

If pcaobj is a snpgdsPCAClass object, returns a snpgdsPCASNPLoading object, which is a list with the following components:

sample.id

the sample ids used in the analysis

snp.id

the SNP ids used in the analysis

eigenval

eigenvalues

snploading

SNP loadings, or SNP eigenvectors

TraceXTX

the trace of the genetic covariance matrix

Bayesian

whether Bayesian normalization is used

avgfreq

twice the allele frequency, as used in snpgdsPCA

scale

internal parameter

If pcaobj is a snpgdsEigMixClass object, returns a snpgdsEigMixSNPLoadingClass object, which is a list with the following components:

sample.id

the sample ids used in the analysis

snp.id

the SNP ids used in the analysis

eigenval

eigenvalues

snploading

SNP loadings, or SNP eigenvectors

afreq

allele frequency
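
The EIGMIX case can be sketched in the same way as the PCA case. This is a minimal example, assuming the SNPRelate example dataset is available; the variable names eig and EigLoad are arbitrary. It assumes that snploading has one row per component and one column per SNP, as it does for snpgdsPCA results.

```r
library(SNPRelate)

# open the example dataset shipped with SNPRelate
genofile <- snpgdsOpen(snpgdsExampleFileName())

# run EIGMIX instead of snpgdsPCA, keeping the top 8 eigenvectors
eig <- snpgdsEIGMIX(genofile, eigen.cnt=8L, verbose=FALSE)

# SNP loadings for the EIGMIX result
EigLoad <- snpgdsPCASNPLoading(eig, genofile, verbose=FALSE)

class(EigLoad)
names(EigLoad)
dim(EigLoad$snploading)

# close the genotype file
snpgdsClose(genofile)
```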

Author(s)

Xiuwen Zheng

References

Patterson N, Price AL, Reich D (2006) Population structure and eigenanalysis. PLoS Genetics 2:e190.

Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D (2006) Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet 38:904-909.

Zhu X, Li S, Cooper RS, Elston RC (2008) A unified association analysis approach for family and unrelated samples correcting for stratification. Am J Hum Genet 82(2):352-365.

See Also

snpgdsPCA, snpgdsEIGMIX, snpgdsPCASampLoading, snpgdsPCACorr

Examples

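# load the package
library(SNPRelate)

# open an example dataset (HapMap)
genofile <- snpgdsOpen(snpgdsExampleFileName())

# run PCA, keeping the top 8 eigenvectors
PCARV <- snpgdsPCA(genofile, eigen.cnt=8)
# calculate the SNP loadings for those eigenvectors
SnpLoad <- snpgdsPCASNPLoading(PCARV, genofile)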

names(SnpLoad)
# [1] "sample.id"  "snp.id"     "eigenval"   "snploading" "TraceXTX"
# [6] "Bayesian"   "avgfreq"    "scale"
dim(SnpLoad$snploading)
# [1]     8 8722

plot(SnpLoad$snploading[1,], type="h", ylab="PC 1")

# close the genotype file
snpgdsClose(genofile)

Example output

Loading required package: gdsfmt
SNPRelate -- supported by Streaming SIMD Extensions 2 (SSE2)
Principal Component Analysis (PCA) on genotypes:
Excluding 365 SNPs on non-autosomes
Excluding 1 SNP (monomorphic: TRUE, MAF: NaN, missing rate: NaN)
    # of samples: 279
    # of SNPs: 8,722
    using 1 thread
    # of principal components: 8
PCA:    the sum of all selected genotypes (0,1,2) = 2446510
CPU capabilities: Double-Precision SSE2
Fri Jun 18 11:24:25 2021    (internal increment: 408)

[..................................................]  0%, ETC: ---        
[==================================================] 100%, completed, 0s
Fri Jun 18 11:24:25 2021    Begin (eigenvalues and eigenvectors)
Fri Jun 18 11:24:25 2021    Done.
SNP Loading:
    # of samples: 279
    # of SNPs: 8,722
    using 1 thread
    using the top 8 eigenvectors
SNP Loading:    the sum of all selected genotypes (0,1,2) = 2446510
Fri Jun 18 11:24:25 2021    (internal increment: 3288)

[..................................................]  0%, ETC: ---        
[==================================================] 100%, completed, 0s
Fri Jun 18 11:24:25 2021    Done.
[1] "sample.id"  "snp.id"     "eigenval"   "snploading" "TraceXTX"  
[6] "Bayesian"   "avgfreq"    "scale"     
[1]    8 8722

SNPRelate documentation built on Nov. 8, 2020, 5:31 p.m.