qc: Quality control


Description

Run quality control (QC) checks on a GUESSFM run.

Usage

qc(object, data)

## S4 method for signature 'ppnsnp,missing'
qc(object)

## S4 method for signature 'snpmod,SnpMatrix'
qc(object, data)

## S4 method for signature 'list,ANY'
qc(object)

Arguments

object

snpmod object or object returned by pp.nsnp

data

SnpMatrix data for LD calculation if object is a snpmod

Details

With all genetic data, we use some QC measures to determine "bad" SNPs. The qc() functions in GUESSFM attempt to flag features that experience suggests are related to spurious differential SNP calls between cases and controls.

The function pp.nsnp generates a posterior distribution for the number of SNPs in a model. We expect this posterior distribution to have some right skew (as does the binomial or beta binomial prior) and be unimodal. Experience suggests that a posterior that does not have these properties may have favoured models with "bad" SNPs. Running qc on the object returned by pp.nsnp will flag these issues.
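To make the two shape properties concrete, here is a minimal base R sketch of how unimodality and right skew could be checked for a discrete posterior over model size. It is an illustration only, not the code GUESSFM uses: the helper name check_nsnp_posterior, the assumption that the vector starts at model size 0, and the flagging rule are all hypothetical.

```r
# Hypothetical helper, NOT part of GUESSFM: summarise the shape of a
# discrete posterior over the number of SNPs in a model.
# Assumption: pp[1] is the posterior probability of size 0, pp[2] of size 1, ...
check_nsnp_posterior <- function(pp) {
  pp <- pp / sum(pp)                     # normalise defensively
  size <- seq_along(pp) - 1              # model sizes 0, 1, 2, ...

  # Count modes: pad both ends with a value below any probability, take the
  # sign of successive differences, drop flat steps, and count every place
  # where the curve switches from rising to falling.
  d <- sign(diff(c(-1, pp, -1)))
  d <- d[d != 0]
  n_modes <- sum(d[-length(d)] > 0 & d[-1] < 0)

  # Standardised skewness from the first three moments of the distribution.
  m  <- sum(size * pp)
  m2 <- sum((size - m)^2 * pp)
  m3 <- sum((size - m)^3 * pp)
  skewness <- if (m2 > 0) m3 / m2^1.5 else 0

  # Illustrative rule: flag anything that is multimodal or not right skewed.
  data.frame(n_modes = n_modes, skewness = skewness,
             flag = n_modes > 1 || skewness <= 0)
}

# A well-behaved posterior (one mode, long right tail) ...
check_nsnp_posterior(c(0.05, 0.30, 0.35, 0.20, 0.07, 0.03))
# ... and a bimodal one that this rule would flag.
check_nsnp_posterior(c(0.05, 0.30, 0.10, 0.05, 0.30, 0.20))
```

The measures and thresholds that qc actually reports may differ; the sketch only shows what "unimodal with some right skew" means in practice.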

You can also call qc directly on a snpmod object. This may take a little longer, and attempts to estimate the maximum r squared between SNPs in any model. GUESS has a prior which should enforce that highly correlated SNPs are not both placed in a model. Sometimes it may be that two correlated SNPs are indeed required to model a trait, but experience with imputed data suggests that when a majority of models above a given size contain highly correlated SNPs, there is a problem with differential genotype calling which requires further investigation.
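The quantity described here, the maximum r squared between any pair of SNPs in a model, can be illustrated with a short base R sketch. This is a stand-in under stated assumptions rather than GUESSFM's implementation: it uses stats::cor on a plain 0/1/2 genotype matrix instead of LD computed from a SnpMatrix, and the helper max_r2, the SNP names and the toy genotypes are all hypothetical.

```r
# Hypothetical helper, NOT part of GUESSFM: largest pairwise r squared among
# the SNPs in one model.
# Assumptions: `geno` is a numeric matrix of 0/1/2 genotypes with SNP names as
# column names; `snps` is a character vector naming the SNPs in the model.
max_r2 <- function(snps, geno) {
  if (length(snps) < 2) return(NA_real_)   # no SNP pairs to compare
  r2 <- cor(geno[, snps, drop = FALSE], use = "pairwise.complete.obs")^2
  max(r2[upper.tri(r2)])                   # off-diagonal entries only
}

# Deterministic toy genotypes for 90 samples.
g_a <- rep(c(0, 1, 2), times = 30)
g_b <- g_a
g_b[1:2] <- c(1, 0)                        # near copy of g_a: very high r squared
g_c <- rep(c(0, 1, 2), each = 30)          # balanced against g_a: r squared of 0
geno <- cbind(rsA = g_a, rsB = g_b, rsC = g_c)

# Three toy models, summarised as model, size and max r squared.
models <- list(c("rsA", "rsC"), c("rsA", "rsB"), "rsC")
data.frame(
  model  = vapply(models, paste, character(1), collapse = ", "),
  size   = lengths(models),
  max.r2 = vapply(models, max_r2, numeric(1), geno = geno)
)
```

In this toy table, the model containing both rsA and rsB is the kind that would warrant a closer look if such models dominated at a given size.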

Value

Either a data.frame of the traits in the pp.nsnp object together with QC measures, or a data.frame of models with their size and maximum r squared.

Author(s)

Chris Wallace



chr1swallace/GUESSFM documentation built on May 13, 2019, 6:17 p.m.