plotPCA4plink: Population outlier detection

View source: R/genotypeQC.R

plotPCA4plink    R Documentation

Population outlier detection

Description

Principal component analysis (PCA) is performed on the genotype data to detect population outliers, and the first two PCs are plotted for visualization.

Usage

plotPCA4plink(
  gcta,
  inputPrefix,
  nThread = 20,
  outputPC4subjFile,
  outputPCplotFile
)

Arguments

gcta

the GCTA executable, located either in the current working directory or somewhere in the command path.

inputPrefix

the prefix of the input PLINK binary files.

nThread

the number of threads used for computation. The default is 20.

outputPC4subjFile

the plain text file that stores all subject IDs and their corresponding values on the first two principal components.

outputPCplotFile

the plot file for visualizing the first two principal components of all investigated subjects.

Details

Before population outlier detection, it is advisable to perform QC on the genotype data. Only autosomal genotypes are used for the principal component analysis.
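
As a rough illustration of the autosome-only PCA described above, the sketch below builds two GCTA command lines: one that computes a genetic relationship matrix from autosomal SNPs, and one that extracts the leading PCs from it. It is a sketch under stated assumptions, not necessarily what plotPCA4plink runs internally. The helper name buildGctaPcaCmds and its nPC argument are hypothetical; --bfile, --autosome, --make-grm, --grm, --pca, --out and --thread-num are standard GCTA options.

```r
## Hypothetical helper (not part of Gimpute): assembles GCTA command lines
## for an autosome-only PCA on PLINK binary files.
buildGctaPcaCmds <- function(gcta, inputPrefix, nThread = 20, nPC = 2) {
  ## Step 1: genetic relationship matrix (GRM) from autosomal SNPs only.
  grmCmd <- paste(gcta, "--bfile", inputPrefix, "--autosome", "--make-grm",
                  "--out", inputPrefix, "--thread-num", nThread)
  ## Step 2: PCA on the GRM, keeping the first nPC components.
  pcaCmd <- paste(gcta, "--grm", inputPrefix, "--pca", nPC,
                  "--out", inputPrefix)
  c(grm = grmCmd, pca = pcaCmd)
}

cmds <- buildGctaPcaCmds("gcta64", "QCdata")
print(cmds)
## To execute them (requires GCTA to be installed):
## for (cmd in cmds) system(cmd)
```

Building the commands as strings first makes it easy to inspect them before anything is executed.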

Value

The output plain text file and plot file storing the first two principal components of the study subjects.
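
The following sketch shows one way the text output might be inspected afterwards to flag candidate outliers. It assumes, without confirmation from the package, that the file is whitespace-delimited with four columns (family ID, individual ID, PC1, PC2) and no header. The data are synthetic, and the cutoff of 0.1 on PC1 is an arbitrary illustrative value.

```r
set.seed(1)
## Synthetic stand-in for outputPC4subjFile (assumed layout: FID IID PC1 PC2).
pcs <- data.frame(FID = paste0("F", 1:50),
                  IID = paste0("I", 1:50),
                  PC1 = c(rnorm(49, sd = 0.01), 0.5),  # last subject is extreme
                  PC2 = rnorm(50, sd = 0.01))
tmpFile <- tempfile(fileext = ".txt")
write.table(pcs, tmpFile, quote = FALSE, row.names = FALSE, col.names = FALSE)

## Read the file back, as one would with the real output file.
subjPCs <- read.table(tmpFile, col.names = c("FID", "IID", "PC1", "PC2"),
                      stringsAsFactors = FALSE)

## Flag subjects whose PC1 value exceeds an illustrative cutoff.
cutoff <- 0.1
outliers <- subjPCs[abs(subjPCs$PC1) > cutoff, ]
print(outliers$IID)
```

In practice the cutoff would be chosen by looking at the PC plot rather than fixed in advance.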

Author(s)

Junfang Chen

Examples

 
## In the current working directory
bedFile <- system.file("extdata", "QCdata.bed", package="Gimpute")
bimFile <- system.file("extdata", "QCdata.bim", package="Gimpute") 
famFile <- system.file("extdata", "QCdata.fam", package="Gimpute")
## Copy the example PLINK files into the current working directory
file.copy(c(bedFile, bimFile, famFile), ".")
inputPrefix <- "QCdata" 
outputPC4subjFile <- "2_13_eigenvalAfterQC.txt"
outputPCplotFile <- "2_13_eigenvalAfterQC.png" ## png format
## Not run: Requires an executable program GCTA, e.g.
## gcta <- "/home/tools/gcta64"
## plotPCA4plink(gcta, inputPrefix, nThread=20, 
##               outputPC4subjFile, outputPCplotFile)

transbioZI/Gimpute documentation built on April 10, 2022, 4:20 a.m.