snpgdsPCASampLoading: Project individuals onto existing principal component axes

View source: R/PCA.R

snpgdsPCASampLoading    R Documentation

Project individuals onto existing principal component axes

Description

Calculates the sample eigenvectors using the specified SNP loadings.

Usage

snpgdsPCASampLoading(loadobj, gdsobj, sample.id=NULL, num.thread=1L,
    verbose=TRUE)

Arguments

loadobj

a snpgdsPCASNPLoadingClass or snpgdsEigMixSNPLoadingClass object returned from snpgdsPCASNPLoading

gdsobj

an object of class SNPGDSFileClass, a SNP GDS file

sample.id

a vector of sample IDs specifying the selected samples; if NULL, all samples are used

num.thread

the number of CPU cores used

verbose

if TRUE, show information

Details

The samples given in sample.id are usually different from the samples used to calculate the SNP loadings.
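The idea of projecting samples onto existing axes can be illustrated with generic linear algebra. The following is a minimal base-R sketch on simulated genotypes, not SNPRelate's internal implementation; the package's own scaling conventions may differ. All names (geno, afreq, normalize, proj) are hypothetical. It assumes SNP loadings are the right singular vectors of the normalized genotype matrix, and checks that projecting the reference samples themselves reproduces their eigenvectors.

```r
set.seed(1)
n <- 20   # number of samples (assumed)
m <- 50   # number of SNPs (assumed)

# simulated genotypes coded 0/1/2, samples in rows and SNPs in columns
p_true <- runif(m, 0.1, 0.9)
geno <- sapply(p_true, function(p) rbinom(n, 2, p))

# allele frequencies estimated from the reference samples;
# drop monomorphic SNPs to avoid dividing by zero
afreq <- colMeans(geno) / 2
keep <- afreq > 0 & afreq < 1
geno <- geno[, keep, drop = FALSE]
afreq <- afreq[keep]

# center by 2p and scale by sqrt(p(1-p)), using reference frequencies
normalize <- function(g, p) {
    centered <- sweep(g, 2, 2 * p, "-")
    sweep(centered, 2, sqrt(p * (1 - p)), "/")
}

X <- normalize(geno, afreq)
s <- svd(X)
k <- 4
U <- s$u[, 1:k]   # sample eigenvectors
V <- s$v[, 1:k]   # SNP loadings (right singular vectors)
d <- s$d[1:k]     # singular values

# project the first five samples onto the existing axes
X_new <- normalize(geno[1:5, , drop = FALSE], afreq)
proj <- X_new %*% V %*% diag(1 / d)

# projecting reference samples should recover their eigenvectors
max_abs_diff <- max(abs(proj - U[1:5, ]))
```

New samples would be normalized with the reference allele frequencies in the same way before being multiplied by the loadings.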

Value

Returns a snpgdsPCAClass object, which is a list with the following components:

sample.id

the sample ids used in the analysis

snp.id

the SNP ids used in the analysis

eigenval

eigenvalues

eigenvect

eigenvectors, “# of samples” x “eigen.cnt”

TraceXTX

the trace of the genetic covariance matrix

Bayesian

whether Bayesian normalization is used

Alternatively, returns a snpgdsEigMixClass object, which is a list with the following components:

sample.id

the sample ids used in the analysis

snp.id

the SNP ids used in the analysis

eigenval

eigenvalues

eigenvect

eigenvectors, “# of samples” x “eigen.cnt”

afreq

allele frequencies

Author(s)

Xiuwen Zheng

References

Patterson N, Price AL, Reich D (2006) Population structure and eigenanalysis. PLoS Genetics 2:e190.

Zhu X, Li S, Cooper RS, Elston RC (2008) A unified association analysis approach for family and unrelated samples correcting for stratification. Am J Hum Genet 82(2):352-365.

See Also

snpgdsPCA, snpgdsPCACorr, snpgdsPCASNPLoading

Examples

# load the package and open an example dataset (HapMap)
library(SNPRelate)
genofile <- snpgdsOpen(snpgdsExampleFileName())

sample.id <- read.gdsn(index.gdsn(genofile, "sample.id"))

# first PCA
pca <- snpgdsPCA(genofile, eigen.cnt=8)
snp_load <- snpgdsPCASNPLoading(pca, genofile)

# calculate sample eigenvectors from SNP loadings
samp_load <- snpgdsPCASampLoading(snp_load, genofile, sample.id=sample.id[1:100])

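# compare with the eigenvectors from the original PCA
# (named 'd' to avoid masking base::diff)
d <- pca$eigenvect[1:100,] - samp_load$eigenvect
summary(c(d))
# the differences should be approximately zero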


# combine eigenvectors
allpca <- list(
    sample.id = c(pca$sample.id, samp_load$sample.id),
    snp.id = pca$snp.id,
    eigenval = c(pca$eigenval, samp_load$eigenval),
    eigenvect = rbind(pca$eigenvect, samp_load$eigenvect),
    varprop = c(pca$varprop, samp_load$varprop),
    TraceXTX = pca$TraceXTX
)
class(allpca) <- "snpgdsPCAClass"
allpca


# close the genotype file
snpgdsClose(genofile)
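
Since sample.id usually names samples that were not used to calculate the SNP loadings, a workflow with separate reference and study sets may be a useful complement to the example above. The following is a sketch under stated assumptions: the split at sample 200 is arbitrary and purely illustrative, and the names ref.id, study.id, pca_ref and proj are hypothetical. The block is wrapped in a requireNamespace() check so that it is skipped when SNPRelate is not installed.

```r
if (requireNamespace("SNPRelate", quietly = TRUE)) {
    library(SNPRelate)

    genofile <- snpgdsOpen(snpgdsExampleFileName())
    sample.id <- read.gdsn(index.gdsn(genofile, "sample.id"))

    # arbitrary illustrative split into reference and study samples
    ref.id <- sample.id[1:200]
    study.id <- sample.id[201:length(sample.id)]

    # PCA on the reference samples only
    pca_ref <- snpgdsPCA(genofile, sample.id = ref.id, eigen.cnt = 8,
        verbose = FALSE)

    # SNP loadings derived from the reference PCA
    snp_load <- snpgdsPCASNPLoading(pca_ref, genofile, verbose = FALSE)

    # project the study samples onto the reference axes
    proj <- snpgdsPCASampLoading(snp_load, genofile, sample.id = study.id,
        verbose = FALSE)

    # one row per study sample, one column per retained eigenvector
    print(dim(proj$eigenvect))

    snpgdsClose(genofile)
}
```

The projected coordinates in proj$eigenvect can then be compared with pca_ref$eigenvect, for example by plotting the first two columns of each.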

zhengxwen/SNPRelate documentation built on Nov. 19, 2024, 1:02 p.m.