qq.chisq: Quantile-quantile plot for chi-squared tests

Package: NikNakk/snpStats (SnpMatrix and XSnpMatrix classes and methods)

Description

This function plots ranked observed chi-squared test statistics against the corresponding expected order statistics. It also estimates an inflation (or deflation) factor, lambda, by the ratio of the trimmed means of observed and expected values. This is useful for inspecting the results of whole-genome association studies for overdispersion due to population substructure and other sources of bias or confounding.

Usage

```
qq.chisq(x, df=1, x.max,
    main="QQ plot",
    sub=paste("Expected distribution: chi-squared (",df," df)", sep=""),
    xlab="Expected", ylab="Observed",
    conc=c(0.025, 0.975), overdisp=FALSE, trim=0.5,
    slope.one=FALSE, slope.lambda=FALSE, pvals=FALSE,
    thin=c(0.25,50), oor.pch=24, col.shade="gray", ...)
```

Arguments

`x`: A vector of observed chi-squared test values.

`df`: The degrees of freedom for the tests.

`x.max`: If present, truncate the observed value (Y) axis at `abs(x.max)`. If `x.max` is negative, the y-axis will extend to `abs(x.max)` even if the observed data do not.

`main`: The main heading.

`sub`: The subheading.

`xlab`: x-axis label (default "Expected").

`ylab`: y-axis label (default "Observed").

`conc`: Lower and upper probability bounds for the concentration band for the plot. Set this to `NA` to suppress the band.

`overdisp`: If `TRUE`, an overdispersion factor, lambda, will be estimated and used in calculating the concentration band.

`trim`: Quantile point for trimmed mean calculations for estimation of lambda. Default is to trim at the median.

`slope.one`: Is a line of slope one to be superimposed?

`slope.lambda`: Is a line of slope lambda to be superimposed?

`pvals`: Are P-values to be indicated on an axis drawn on the right-hand side of the plot?

`thin`: A pair of numbers indicating how points will be thinned before plotting (see Details). If `NA`, no thinning will be carried out.

`oor.pch`: Observed values greater than `x.max` are plotted at `x.max`. This argument sets the plotting symbol to be used for out-of-range observations.

`col.shade`: The colour with which the concentration band will be filled.

`...`: Further graphical parameter settings to be passed to `points()`.

Details

To reduce plotting time and the size of plot files, the smallest observed and expected points are thinned so that only a reduced number of (approximately equally spaced) points are plotted. The precise behaviour is controlled by the parameter `thin`, whose value should be a pair of numbers. The first number must lie between 0 and 1 and sets the proportion of the X axis over which thinning is to be applied. The second number should be an integer and sets the maximum number of points to be plotted in this section.
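The thinning rule can be illustrated with a short base R sketch. This is not the package's own code: the helper `thin_indices()` is hypothetical, and the plotting positions `(1:n)/(n+1)` used for the expected values are an assumption made for illustration.

```r
# Hypothetical helper: which of n ordered points would be kept after thinning.
thin_indices <- function(n, df = 1, thin = c(0.25, 50)) {
  # Expected order statistics (plotting positions are an assumption)
  expd <- qchisq((1:n) / (n + 1), df = df)
  # Right-hand edge of the section of the X axis to be thinned
  cutoff <- thin[1] * max(expd)
  n_low <- sum(expd <= cutoff)
  # Nothing to do if the section already holds few enough points
  if (n_low <= thin[2]) return(seq_len(n))
  # Equally spaced target positions along the thinned section
  targets <- seq(0, cutoff, length.out = thin[2] + 1)[-1]
  # For each target, the last point whose expected value does not exceed it
  low <- unique(findInterval(targets, expd))
  low <- low[low >= 1]
  # All points beyond the thinned section are kept
  high <- which(expd > cutoff)
  sort(unique(c(low, high)))
}

idx <- thin_indices(10000)
length(idx)   # far fewer than 10000 points remain to be plotted
```

With the default `thin=c(0.25, 50)`, at most 50 points are kept in the leftmost quarter of the X axis, where the bulk of the test statistics lie, while every point in the upper tail is retained.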

The "concentration band" for the plot is shown in grey. This region is defined by upper and lower probability bounds for each order statistic. The default is to use the 2.5% and 97.5% bounds. Note that this is not a simultaneous confidence region: the bounds apply to each order statistic separately, so the probability that the plot will stray outside the band at some point exceeds 5%.
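A minimal sketch of how pointwise bounds of this kind can be computed in base R follows. It relies on the standard result that the i-th order statistic of n uniform variates has a Beta(i, n-i+1) distribution; whether the package computes its band in exactly this way is an assumption, as are the plotting positions used for the expected values. The short simulation at the end illustrates why the band is not a simultaneous region.

```r
n <- 1000
df <- 1
conc <- c(0.025, 0.975)
i <- seq_len(n)

# Expected order statistics (plotting positions are an assumption)
expd <- qchisq(i / (n + 1), df = df)

# Pointwise bounds: Beta quantiles for the i-th uniform order statistic,
# mapped onto the chi-squared scale
lower <- qchisq(qbeta(conc[1], i, n - i + 1), df = df)
upper <- qchisq(qbeta(conc[2], i, n - i + 1), df = df)

# Fraction of null samples whose QQ plot leaves the band somewhere
set.seed(42)
strays <- replicate(500, {
  obs <- sort(rchisq(n, df = df))
  any(obs < lower | obs > upper)
})
frac_outside <- mean(strays)
frac_outside   # well above the pointwise 5%
```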

When required, the dispersion factor is estimated by the ratio of the observed trimmed mean to its expected value under the chi-squared assumption.
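The calculation can be sketched in base R as below. The function `estimate_lambda()` is a hypothetical illustration rather than the package's code; it assumes plotting positions `(1:n)/(n+1)` for the expected values and takes the trimmed mean over the smallest `trim` fraction of the ordered statistics.

```r
# Hypothetical illustration of a trimmed-mean ratio estimate of lambda.
estimate_lambda <- function(x, df = 1, trim = 0.5) {
  obs <- sort(x, na.last = NA)               # ordered observed statistics
  n <- length(obs)
  expd <- qchisq((1:n) / (n + 1), df = df)   # assumed plotting positions
  keep <- seq_len(round(n * trim))           # values below the trim quantile
  mean(obs[keep]) / mean(expd[keep])
}

set.seed(1)
null_stats <- rchisq(10000, df = 1)
lambda_null <- estimate_lambda(null_stats)            # close to 1
lambda_inflated <- estimate_lambda(1.2 * null_stats)  # inflated by a factor of 1.2
```

Trimming at the median means the estimate uses only the lower half of the test statistics, so a handful of genuinely associated SNPs in the upper tail do not distort it.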

Value

The function returns the number of tests, the number of values omitted from the plot (greater than `x.max`), and the estimated dispersion factor, lambda.

Note

All tests must have the same number of degrees of freedom. If this is not the case, I suggest transforming to p-values and then plotting -2log(p) as chi-squared on 2 df.
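The suggested transformation can be sketched as follows, using simulated tests with mixed degrees of freedom (the data and variable names are illustrative only). Under the null hypothesis p is uniform, so -2log(p) follows a chi-squared distribution on 2 df.

```r
set.seed(7)
n_tests <- 2000

# Simulated null test statistics with 1, 2 or 3 degrees of freedom
df_mixed <- sample(1:3, n_tests, replace = TRUE)
stats <- rchisq(n_tests, df = df_mixed)

# Convert each statistic to a p-value using its own degrees of freedom
p <- pchisq(stats, df = df_mixed, lower.tail = FALSE)

# Transform to a common scale: chi-squared on 2 df under the null
x2 <- -2 * log(p)

# With snpStats loaded, x2 could then be plotted with qq.chisq(x2, df = 2)
```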

Author(s)

David Clayton

References

Devlin, B. and Roeder, K. (1999) Genomic control for association studies. Biometrics, 55:997-1004

See Also

`single.snp.tests`, `snp.lhs.tests`, `snp.rhs.tests`

Examples

```
## See the example for the single.snp.tests() function
```