eigstrat: Perform Eigenstrat marker PCA




Perform Eigenstrat marker PCA


eigstrat(geno, snps = "rows", nvecs = 25)



geno: A matrix of SNP data encoded in minor-allele dosage format:
    2 = homozygous for minor allele
    1 = heterozygous
    0 = homozygous for major allele
    NA = missing data


snps: Either "rows" to indicate that SNPs are in rows and individuals are in columns, or "cols" to indicate that SNPs are in columns and individuals are in rows


nvecs: Either "all" to return all eigenvectors, or an integer n to return the first n eigenvectors
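As an illustration of the expected input, the sketch below simulates a small dosage matrix with SNPs in rows and individuals in columns. The dimensions, allele frequencies, and object names are made up for the example, and the call to eigstrat() assumes the function is exported by an installed bwardr package, so it is guarded and only runs if that package is available.

```r
set.seed(42)

n_snps <- 100  # hypothetical number of markers
n_ind  <- 10   # hypothetical number of individuals

# One minor-allele frequency per SNP; values are recycled down the columns,
# so every entry in row i is drawn with frequency maf[i].
maf  <- runif(n_snps, min = 0.05, max = 0.5)
geno <- matrix(rbinom(n_snps * n_ind, size = 2, prob = maf),
               nrow = n_snps, ncol = n_ind,
               dimnames = list(paste0("snp", seq_len(n_snps)),
                               paste0("ind", seq_len(n_ind))))

# Sprinkle in some missing calls, coded as NA.
geno[sample(length(geno), 20)] <- NA

# Assumption: bwardr is installed and exports eigstrat().
if (requireNamespace("bwardr", quietly = TRUE)) {
  res <- bwardr::eigstrat(geno, snps = "rows", nvecs = 5)
  str(res)
}
```

If the same data were stored with individuals in rows, the documented alternative would be to pass the transposed matrix with snps = "cols".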


This function performs principal component analysis on a matrix of SNP data using the method of EIGENSTRAT (Price et al., 2006; doi:10.1038/ng1847). Note that while the user can select the number of eigenvectors to return, the function always returns all of the eigenvalues.
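For readers unfamiliar with the method, the following is a minimal base-R sketch of the normalization and decomposition described by Price et al. (2006). It is not the source of eigstrat(); the helper name eigenstrat_sketch, its list component names, and the simulated data are all hypothetical, and it assumes SNPs are in rows.

```r
# Hypothetical helper illustrating the EIGENSTRAT approach; not the package code.
eigenstrat_sketch <- function(geno, nvecs = 2) {
  # geno: SNPs in rows, individuals in columns, coded 0/1/2/NA
  n_called <- rowSums(!is.na(geno))       # non-missing calls per SNP
  mu <- rowMeans(geno, na.rm = TRUE)      # per-SNP mean dosage

  # Allele-frequency estimate with a pseudo-count, which keeps p strictly
  # between 0 and 1 even for monomorphic SNPs.
  p <- (1 + rowSums(geno, na.rm = TRUE)) / (2 + 2 * n_called)

  # Center and scale each SNP (row); missing entries become 0 afterwards.
  x <- (geno - mu) / sqrt(p * (1 - p))
  x[is.na(x)] <- 0

  # Covariance between individuals (columns), then eigendecomposition.
  psi <- cov(x)
  e <- eigen(psi, symmetric = TRUE)

  list(vectors = e$vectors[, seq_len(nvecs), drop = FALSE],
       values  = e$values)
}

# Small simulated example: 200 SNPs, 20 individuals, a few missing calls.
set.seed(1)
sim <- matrix(rbinom(200 * 20, size = 2, prob = 0.3), nrow = 200, ncol = 20)
sim[sample(length(sim), 30)] <- NA
pcs <- eigenstrat_sketch(sim, nvecs = 3)
```

As in the function documented here, the sketch truncates only the eigenvectors and returns every eigenvalue.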

The rate-limiting step in this function is the calculation of a covariance matrix. This can be sped up by using the covar() function from the 'coop' package. The function checks whether coop is installed and uses it if so; otherwise it falls back on the base cov() function. Note that the performance of the coop package depends on the BLAS library used on the system. On Linux systems (Ubuntu in this example), OpenBLAS can be installed and selected from a terminal:

sudo apt install libopenblas-dev
sudo update-alternatives --config libblas.so.3

On Windows and Mac, Microsoft R Open may be used, as it includes the Intel MKL library.
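The fallback between coop and base R described above could be written along the following lines. This is a sketch rather than the package's actual code; the wrapper name cov_fast and the test matrix are hypothetical.

```r
# Hypothetical wrapper: use coop::covar() when coop is installed,
# otherwise fall back on the base covariance function.
cov_fast <- function(x) {
  if (requireNamespace("coop", quietly = TRUE)) {
    coop::covar(x)
  } else {
    stats::cov(x)
  }
}

set.seed(7)
x <- matrix(rnorm(50 * 8), nrow = 50, ncol = 8)  # made-up numeric data
cv <- cov_fast(x)
dim(cv)
```

Either branch returns the covariance between the columns of x, so the choice affects only speed, not the result.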

For large datasets, it is probably better to use a higher-performance PCA implementation, such as the SNPRelate package, or a dedicated tool such as Eigensoft, GCTA, or PLINK.


A list with components


Data frame containing the eigenvectors of the input genotypic matrix. The number of output eigenvectors is specified by the nvecs argument


Vector of eigenvalues of the input genotypic matrix
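Because the full set of eigenvalues is always returned, one common use is to compute the proportion of variance attributed to each component. The sketch below uses a made-up eigenvalue vector in place of real output, since the names of the returned list components are not shown here.

```r
# Made-up eigenvalues standing in for the vector returned by the function.
eigenvalues <- c(4.2, 2.1, 1.3, 0.9, 0.5)

prop_var <- eigenvalues / sum(eigenvalues)  # share of variance per component
cum_var  <- cumsum(prop_var)                # running total across components

var_table <- data.frame(PC = seq_along(eigenvalues),
                        prop_var = prop_var,
                        cum_var = cum_var)
print(var_table)
```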

etnite/bwardr documentation built on April 14, 2021, 7:04 p.m.