eigstrat: Perform Eigenstrat marker PCA

View source: R/eigstrat.R

eigstrat R Documentation

Perform Eigenstrat marker PCA

Description

Perform Eigenstrat marker PCA

Usage

eigstrat(geno, snps = "rows", nvecs = 25)

Arguments

geno

A matrix of SNP data encoded in minor-allele dosage format:

2 = homozygous for minor allele
1 = heterozygous
0 = homozygous for major allele
NA = missing data

snps

Either "rows" to indicate that SNPs are in rows and individuals are in columns, or "cols" to indicate that SNPs are in columns and individuals are in rows.

nvecs

Either "all" to return all eigenvectors, or an integer n to return the first n eigenvectors.
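As a rough illustration of the expected input, the sketch below simulates a small dosage matrix with SNPs in rows and then calls eigstrat() in both orientations. The simulated data and all object names are hypothetical. The call is guarded because it assumes the bwardr package is installed and exports eigstrat() with the signature shown under Usage.

```r
# Simulate a hypothetical 100-SNP x 12-individual dosage matrix (values 0, 1, 2)
set.seed(42)
n_snps <- 100
n_ind <- 12
geno <- matrix(
  rbinom(n_snps * n_ind, size = 2, prob = 0.25),
  nrow = n_snps, ncol = n_ind,
  dimnames = list(paste0("snp", seq_len(n_snps)), paste0("ind", seq_len(n_ind)))
)

# Mark 20 randomly chosen genotype calls as missing
geno[sample(length(geno), 20)] <- NA

# Assumption: bwardr is installed and exports eigstrat()
if (requireNamespace("bwardr", quietly = TRUE)) {
  # SNPs in rows, individuals in columns
  pca_rows <- bwardr::eigstrat(geno, snps = "rows", nvecs = 5)
  # Same data transposed: SNPs in columns, individuals in rows
  pca_cols <- bwardr::eigstrat(t(geno), snps = "cols", nvecs = 5)
}
```

Transposing the matrix and switching the snps argument should describe the same data set, so the two calls are expected to be equivalent.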

Details

This function performs principal component analysis on a matrix of SNP data using the method of EIGENSTRAT (Price et al., 2006; doi:10.1038/ng1847). Note that while the user can select the number of eigenvectors to return, the function always returns all of the eigenvalues.
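To make the general approach concrete, here is a minimal base-R sketch of an EIGENSTRAT-style PCA. It is not the source of eigstrat() and may differ from it in detail. The function name eigenstrat_sketch, the simple allele-frequency estimate (row mean divided by 2), the removal of monomorphic SNPs, and the handling of missing calls (set to zero after centering) are all assumptions made for illustration.

```r
# Hypothetical helper: EIGENSTRAT-style PCA on a SNPs-in-rows dosage matrix
eigenstrat_sketch <- function(geno, nvecs = 25) {
  mu <- rowMeans(geno, na.rm = TRUE)  # mean dosage per SNP
  p <- mu / 2                         # assumed allele-frequency estimate
  keep <- !is.na(p) & p > 0 & p < 1   # drop monomorphic or all-missing SNPs
  g <- geno[keep, , drop = FALSE]

  # Center each SNP and scale by sqrt(p * (1 - p)); the vectors recycle down
  # the columns, so element i is applied to row (SNP) i
  x <- (g - mu[keep]) / sqrt(p[keep] * (1 - p[keep]))
  x[is.na(x)] <- 0                    # missing calls contribute nothing

  # Covariance between individuals (columns), then eigendecomposition
  e <- eigen(cov(x), symmetric = TRUE)

  if (identical(nvecs, "all")) nvecs <- ncol(e$vectors)
  nvecs <- min(nvecs, ncol(e$vectors))
  list(
    eigvecs = as.data.frame(e$vectors[, seq_len(nvecs), drop = FALSE]),
    eigvals = e$values                # all eigenvalues, regardless of nvecs
  )
}

# Small simulated example: 200 SNPs, 20 individuals, some missing calls
set.seed(1)
sim <- matrix(rbinom(200 * 20, size = 2, prob = 0.3), nrow = 200)
sim[sample(length(sim), 50)] <- NA
res <- eigenstrat_sketch(sim, nvecs = 5)
```

The sketch mirrors the documented return shape: a data frame with nvecs eigenvectors and a vector holding every eigenvalue.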

The rate-limiting step in this function is the calculation of a covariance matrix. This can be sped up by using the covar() function from the 'coop' package. The function checks whether coop is installed and uses it if so; otherwise it falls back on the base cov() function. Note that the performance of the coop package depends on the BLAS library used on the system. On Linux systems (Ubuntu in this example), OpenBLAS can be installed and selected for use from a terminal:

sudo apt install libopenblas-dev
sudo update-alternatives --config libblas.so.3

On Windows and Mac, Microsoft R Open may be used, as it includes the Intel MKL library.
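The fallback between coop and base R described above could be written along the following lines. This is a sketch of the pattern, not the package's actual code; the wrapper name cov_fast is hypothetical.

```r
# Hypothetical wrapper: use coop::covar() when coop is available,
# otherwise fall back on the base cov() function
cov_fast <- function(x) {
  if (requireNamespace("coop", quietly = TRUE)) {
    coop::covar(x)
  } else {
    stats::cov(x)
  }
}

# Both branches should give the same covariance matrix
set.seed(7)
x <- matrix(rnorm(50), nrow = 10, ncol = 5)
cv <- cov_fast(x)
```

In recent versions of R, sessionInfo() lists the BLAS and LAPACK libraries in use, which is a quick way to confirm that a switch to OpenBLAS took effect.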

For large datasets, a higher-performance PCA implementation is likely a better choice, such as the SNPRelate package or standalone tools such as Eigensoft, GCTA, or PLINK.

Value

A list with components

eigvecs

Data frame containing the eigenvectors of the input genotypic matrix. The number of output eigenvectors is specified by the nvecs argument

eigvals

Vector of eigenvalues of the input genotypic matrix
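Because all eigenvalues are returned even when only a few eigenvectors are requested, the proportion of variance explained by each component can be computed directly from eigvals. The sketch below uses a hand-made list with the documented structure; the numbers are made up for illustration and do not come from a real analysis.

```r
# Hypothetical result with the documented structure (values are invented)
res_mock <- list(
  eigvecs = data.frame(V1 = c(0.5, -0.5, 0.5, -0.5, 0),
                       V2 = c(0.5, 0.5, -0.5, -0.5, 0)),
  eigvals = c(4, 2, 1, 0.5, 0.5)
)

# Proportion of variance explained by each component, and the running total
pve <- res_mock$eigvals / sum(res_mock$eigvals)
cum_pve <- cumsum(pve)

# Summary table for the components whose eigenvectors were returned
n_returned <- ncol(res_mock$eigvecs)
pve_table <- data.frame(
  component = seq_len(n_returned),
  pve = pve[seq_len(n_returned)],
  cumulative = cum_pve[seq_len(n_returned)]
)
```

The first two columns of eigvecs can then be plotted against each other, for example with plot(res_mock$eigvecs$V1, res_mock$eigvecs$V2), to inspect population structure.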


etnite/bwardr documentation built on Jan. 6, 2023, 7:12 a.m.