genoQC: Quality control for genotype data

Description

Perform quality control on the genotype data.

Usage

genoQC(
  plink,
  inputPrefix,
  snpMissCutOffpre = 0.05,
  sampleMissCutOff = 0.02,
  Fhet = 0.2,
  cutoffSubject,
  cutoffSNP,
  snpMissCutOffpost = 0.02,
  snpMissDifCutOff = 0.02,
  femaleChrXmissCutoff = 0.05,
  pval4autoCtl = 1e-06,
  pval4femaleXctl = 1e-06,
  outputPrefix,
  keepInterFile = TRUE
)

Arguments

plink

the PLINK executable, located either in the current working directory or somewhere on the command path.

inputPrefix

the prefix of the input PLINK binary files.

snpMissCutOffpre

the cutoff of the missingness for removing SNPs before subject removal. The default is 0.05.

sampleMissCutOff

the cutoff of the missingness for removing subjects/instances. The default is 0.02.

Fhet

the cutoff of the autosomal heterozygosity deviation. The default is 0.2.

cutoffSubject

the Mendel error cutoff for subjects: families (subjects) with more than this proportion of Mendel errors, considering all SNPs, are removed. The documented default is 0.05, although the usage above lists no default value, so supply it explicitly.

cutoffSNP

the Mendel error cutoff for SNPs: SNPs with a Mendel error rate above this value (i.e. based on the number of trios/duos) are excluded. The documented default is 0.1, although the usage above lists no default value, so supply it explicitly.

snpMissCutOffpost

the cutoff of the missingness for removing SNPs after subject removal. The default is 0.02.

snpMissDifCutOff

the cutoff of the difference in missingness between cases and controls. The default is 0.02.

femaleChrXmissCutoff

the cutoff of the missingness in female chromosome X SNPs. The default is 0.05.

pval4autoCtl

the p-value cutoff for the HWE test in either control or case subjects. Only autosomal SNPs are considered. The default is 0.000001.

pval4femaleXctl

the p-value cutoff for the HWE test in female control subjects. Only chromosome X SNPs are considered. The default is 0.000001.

outputPrefix

the prefix of the output PLINK binary files after QC.

keepInterFile

a logical value indicating whether the intermediate processed files should be kept. The default is TRUE.
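
As an illustration of how the Fhet cutoff acts on subjects, the sketch below applies the rule |Fhet| >= 0.2 to a toy table. The data frame is invented for this example and only mimics the FID, IID and F columns of a PLINK heterozygosity report; it is not how genoQC is necessarily implemented internally.

```r
# Toy stand-in for a heterozygosity report (values are made up).
het <- data.frame(
    FID = c("F1", "F2", "F3", "F4"),
    IID = c("I1", "I2", "I3", "I4"),
    F = c(0.01, -0.25, 0.2, 0.19),
    stringsAsFactors = FALSE
)

Fhet <- 0.2  # same value as the documented default

# Subjects at or beyond the cutoff in either direction are flagged.
isOutlier <- abs(het$F) >= Fhet
flagged <- het[isOutlier, c("FID", "IID")]
kept <- het[!isOutlier, c("FID", "IID")]

print(flagged$IID)
print(kept$IID)
```

Taking the absolute value means that both an excess and a deficit of heterozygosity lead to removal.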

Details

The original PLINK files are processed by the following steps, shown here with the default thresholds:

1. Set all heterozygous alleles of SNPs on male chrX as missing.
2. Keep SNPs with missingness < 0.05 (before sample removal).
3. Keep subjects with missingness < 0.02.
4. Remove subjects with |Fhet| >= 0.2.
5. Reset paternal and maternal codes.
6. Keep SNPs with missingness < 0.02 (after sample removal).
7. Remove SNPs whose missingness differs by >= 0.02 between cases and controls.
8. Remove subjects or SNPs with Mendel errors (family-based data only).
9. Remove chrX SNPs with missingness >= 0.05 in females (skipped if there is no chrX data).
10. Remove autosomal SNPs with HWE p < 1e-06 in controls.
11. Remove chrX SNPs with HWE p < 1e-06 in female controls (skipped if there is no chrX data).
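
To make the threshold steps more concrete, here is a minimal sketch of how three of them (SNP missingness, subject missingness, autosomal HWE) could map onto standard PLINK 1.9 flags. The helper `buildQcCommands` and the intermediate file prefixes are hypothetical and introduced only for illustration; the commands that genoQC actually issues may differ. The function only builds command strings and does not run PLINK.

```r
# Hypothetical helper: builds (but does not run) PLINK command strings.
buildQcCommands <- function(plink, inputPrefix, outputPrefix,
                            snpMissCutOff = 0.05,
                            sampleMissCutOff = 0.02,
                            hweCutOff = 1e-06) {
    # Intermediate prefixes are invented for this illustration.
    step1 <- paste0(outputPrefix, "_snpMiss")
    step2 <- paste0(outputPrefix, "_sampleMiss")
    c(
        # --geno removes SNPs whose missing rate exceeds the cutoff
        snpMiss = paste(plink, "--bfile", inputPrefix,
                        "--geno", snpMissCutOff,
                        "--make-bed --out", step1),
        # --mind removes subjects whose missing rate exceeds the cutoff
        sampleMiss = paste(plink, "--bfile", step1,
                           "--mind", sampleMissCutOff,
                           "--make-bed --out", step2),
        # --hwe removes SNPs with an HWE p-value below the cutoff
        hwe = paste(plink, "--bfile", step2,
                    "--autosome --hwe",
                    format(hweCutOff, scientific = TRUE),
                    "--make-bed --out", outputPrefix)
    )
}

cmds <- buildQcCommands("plink", "genoUpdatedData", "qcDone")
print(unname(cmds))
```

Each string could then be passed to `system()` on a machine where PLINK is installed. Chaining the output prefix of one step into the input of the next mirrors the sequential order of the steps listed above.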

Value

The output PLINK binary files after QC.
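
Since the result is written to disk as a .bed/.bim/.fam trio, a quick existence check after the run can be useful. The helper `plinkFilesPresent` below is hypothetical and not part of Gimpute; the demonstration uses empty placeholder files in a temporary directory rather than real genotype data.

```r
# Hypothetical helper: reports which of the .bed/.bim/.fam files exist.
plinkFilesPresent <- function(prefix, dir = ".") {
    files <- file.path(dir, paste0(prefix, c(".bed", ".bim", ".fam")))
    present <- file.exists(files)
    names(present) <- basename(files)
    present
}

# Demonstration with empty placeholder files (the .fam file is left out).
tmpDir <- tempfile("qcOut")
dir.create(tmpDir)
invisible(file.create(file.path(tmpDir, c("demo.bed", "demo.bim"))))

status <- plinkFilesPresent("demo", dir = tmpDir)
print(status)
```

In practice the function would be called with the same `outputPrefix` that was passed to genoQC, and `all(status)` would indicate a complete trio.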

Author(s)

Junfang Chen

References

Schizophrenia Working Group of the Psychiatric Genomics Consortium (2014). Biological insights from 108 schizophrenia-associated genetic loci. Nature 511(7510): 421-427.

Examples

 
## In the current working directory
bedFile <- system.file("extdata", "genoUpdatedData.bed", package="Gimpute")
bimFile <- system.file("extdata", "genoUpdatedData.bim", package="Gimpute")
famFile <- system.file("extdata", "genoUpdatedData.fam", package="Gimpute")
## Copy the three example PLINK files into the working directory
file.copy(c(bedFile, bimFile, famFile), ".")
inputPrefix <- "genoUpdatedData"
outputPrefix <- "2_13_removedSnpHweFemaleX"
## Not run: Requires an executable program PLINK, e.g.
## plink <- "/home/tools/plink"
## genoQC(plink, inputPrefix,
##        snpMissCutOffpre=0.05,
##        sampleMissCutOff=0.02,
##        Fhet=0.2,
##        cutoffSubject=0.05,  # documented default, passed explicitly
##        cutoffSNP=0.1,       # documented default, passed explicitly
##        snpMissCutOffpost=0.02,
##        snpMissDifCutOff=0.02,
##        femaleChrXmissCutoff=0.05,
##        pval4autoCtl=0.000001,
##        pval4femaleXctl=0.000001,
##        outputPrefix=outputPrefix, keepInterFile=TRUE)

transbioZI/Gimpute documentation built on April 10, 2022, 4:20 a.m.