runGprimeAnalysis: Identify QTL using a smoothed G statistic
In bmansfeld/QTLseqr: QTL mapping using Bulk Segregant Analysis of Next Generation Sequencing data.

Description Usage Arguments Value Examples

A wrapper for all the functions that perform the full G prime analysis to identify QTL. The following steps are performed:
1) Genome-wide G statistics are calculated by getG.
G is defined by the equation:

G = 2 * ∑ n_i * ln(obs(n_i)/exp(n_i))

Where for each SNP, n_i from i = 1 to 4 corresponds to the reference and alternate allele depths for each bulk, as described in the following table:

Allele	High Bulk	Low Bulk
Reference	n_1	n_2
Alternate	n_3	n_4

...and obs(n_i) are the observed allele depths as described in the data frame. getG calculates the G statistic using expected values assuming read depth is equal for all alleles in both bulks:

exp(n_1) = ((n_1 + n_2)*(n_1 + n_3))/(n_1 + n_2 + n_3 + n_4)

exp(n_2) = ((n_2 + n_1)*(n_2 + n_4))/(n_1 + n_2 + n_3 + n_4)

exp(n_3) = ((n_3 + n_1)*(n_3 + n_4))/(n_1 + n_2 + n_3 + n_4)

exp(n_4) = ((n_4 + n_2)*(n_4 + n_3))/(n_1 + n_2 + n_3 + n_4)

2) G' - A tricube-smoothed G statistic is predicted by local regression within each chromosome using tricubeStat. This works as a weighted average across neighboring SNPs that accounts for Linkage disequilibrium (LD) while minizing noise attributed to SNP calling errors. G values for neighboring SNPs within the window are weighted by physical distance from the focal SNP.

3) P-values are estimated based using the non-parametric method described by Magwene et al. 2011 with the function getPvals. Breifly, using the natural log of Gprime a median absolute deviation (MAD) is calculated. The Gprime set is trimmed to exclude outlier regions (i.e. QTL) based on Hampel's rule. An alternate method for filtering out QTL is proposed using absolute delta SNP indeces greater than 0.1 to filter out potential QTL. An estimation of the mode of the trimmed set is calculated using the mlv function from the package modeest. Finally, the mean and variance of the set are estimated using the median and mode and p-values are estimated from a log normal distribution.

4) Negative Log10- and Benjamini-Hochberg adjusted p-values are calculated using p.adjust

1 2	runGprimeAnalysis(SNPset, windowSize = 1e+06, outlierFilter = "deltaSNP", filterThreshold = 0.1, ...)

`SNPset`	Data frame SNP set containing previously filtered SNPs
`windowSize`	the window size (in base pairs) bracketing each SNP for which to calculate the statitics. Magwene et. al recommend a window size of ~25 cM, but also recommend optionally trying several window sizes to test if peaks are over- or undersmoothed.
`outlierFilter`	one of either "deltaSNP" or "Hampel". Method for filtering outlier (ie QTL) regions for p-value estimation
`filterThreshold`	The absolute delta SNP index to use to filter out putative QTL (default = 0.1)
`...`	Other arguments passed to `locfit` and subsequently to `locfit.raw`() (or the lfproc). Usefull in cases where you get "out of vertex space warnings"; Set the maxk higher than the default 100. See `locfit.raw`(). But if you are getting that warning you should seriously consider increasing your window size.

The supplied SNP set tibble after G' analysis. Includes five new columns:

G - The G statistic for each SNP
Gprime - The tricube smoothed G statistic based on the supplied window size
pvalue - the pvalue at each SNP calculatd by non-parametric estimation
negLog10Pval - the -Log10(pvalue) supplied for quick plotting
qvalue - the Benajamini-Hochberg adjusted p-value

1	df_filt <- runGprimeAnalysis(df_filt,windowSize = 2e6,outlierFilter = "deltaSNP")

bmansfeld/QTLseqr documentation built on Jan. 24, 2020, 3:56 p.m.

bmansfeld/QTLseqr index

README.md

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

bmansfeld/QTLseqr
QTL mapping using Bulk Segregant Analysis of Next Generation Sequencing data.

runGprimeAnalysis: Identify QTL using a smoothed G statistic
In bmansfeld/QTLseqr: QTL mapping using Bulk Segregant Analysis of Next Generation Sequencing data.

Description

Usage

Arguments

Value

Examples

Related to runGprimeAnalysis in bmansfeld/QTLseqr...

R Package Documentation

Browse R Packages

We want your feedback!

bmansfeld/QTLseqr QTL mapping using Bulk Segregant Analysis of Next Generation Sequencing data.

runGprimeAnalysis: Identify QTL using a smoothed G statistic In bmansfeld/QTLseqr: QTL mapping using Bulk Segregant Analysis of Next Generation Sequencing data.

Description

Usage

Arguments

Value

Examples

Related to runGprimeAnalysis in bmansfeld/QTLseqr...

R Package Documentation

Browse R Packages

We want your feedback!

bmansfeld/QTLseqr
QTL mapping using Bulk Segregant Analysis of Next Generation Sequencing data.

runGprimeAnalysis: Identify QTL using a smoothed G statistic
In bmansfeld/QTLseqr: QTL mapping using Bulk Segregant Analysis of Next Generation Sequencing data.