Test for normality of residuals

Description

One assumption when performing APV is that the residuals from the regressions are normally distributed. anota assesses this by comparing the Q-Q plots of the residuals to envelopes derived by sampling from the normal distribution.

Usage

anotaResidOutlierTest(anotaQcObj=NULL, confInt=0.01, iter=5,
generateSingleGraph=FALSE, nGraphs=200, generateSummaryGraph=TRUE,
residFitPlot=TRUE, useProgBar=TRUE)

Arguments

anotaQcObj

The object returned by anotaPerformQc.

confInt

Controls how many samples from the normal distribution will be used to generate the envelope to which the residuals are compared. Default is 0.01 which will generate 99 samples from the normal distribution to compare to the actual residuals.

iter

How many times should the analysis be performed? Default is 5 meaning that 5 sets of samples (each with the size controlled by confInt) will be generated. Notice that the summary plotting is only performed for the last set but the percentage of outliers for each iteration can be found in the output object.

generateSingleGraph

The analysis is performed per identifier and plots can be generated for each identifier. However, due to the high number of identifiers, a large number of plots will typically be generated. Default is FALSE.

nGraphs

If generateSingleGraph is set to TRUE, nGraphs controls for how many identifiers such single gene graphs will be generated.

generateSummaryGraph

The function can generate a summary graph that shows the envelopes generated by sampling from the normal distribution compared to the obtained values for all genes. Default is TRUE, thus the graph is generated but only from the last iteration.

residFitPlot

Generates a plot of the fitted values against the residuals. Default is TRUE (the plot is generated).

useProgBar

Should the progress bar be shown? Default is TRUE (the progress bar is shown).

Details

The anotaResidOutlierTest function assesses whether the residuals from the per-identifier linear regressions of translationally active mRNA level ~ cytosolic mRNA level + phenoType are normally distributed. anota generates normal Q-Q plots of the residuals. If the residuals are normally distributed, the data quantiles will form a straight diagonal line from bottom left to top right. Because there are typically relatively few data points, anota calculates "envelopes" based on a set of samplings from the normal distribution using the same number of data points as for the true data (Venables and Ripley 1999). To enable a comparison, both the actual and the sampled data are centered (mean=0) and scaled (sd=1). The data (both true and sampled) are then sorted, and the true sample is compared to the envelopes of the sampled data at each sort position. The result is presented as a Q-Q plot of the true data in which the envelopes of the sampled data are indicated. With 99 samplings, we expect 1/100 of the values to fall outside the envelopes obtained from the samplings. It is thus possible to assess whether approximately the expected number of outlier residuals is obtained. The result is presented as both a graphical output and an output object.
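The procedure described above can be illustrated with a minimal base R sketch for a single set of residuals. This is not anota's implementation: the function name envelope_check, its arguments, and the choice of the per-position minimum and maximum as the envelope are assumptions made for illustration, and anota's exact envelope definition may differ.

```r
# Illustrative sketch only; envelope_check and n_samp are hypothetical names,
# not part of anota.
set.seed(1)

envelope_check <- function(resid, n_samp = 99) {
  n <- length(resid)
  # Center (mean = 0), scale (sd = 1) and sort the true residuals
  obs <- sort(as.numeric(scale(resid)))
  # Draw n_samp normal samples of the same size; center, scale and sort each.
  # The result is an n x n_samp matrix (one column per sampling).
  sims <- replicate(n_samp, sort(as.numeric(scale(rnorm(n)))))
  # Assumed envelope: min and max of the sampled data at each sort position
  lower <- apply(sims, 1, min)
  upper <- apply(sims, 1, max)
  # Flag sort positions where the true data fall outside the envelope
  outlier <- obs < lower | obs > upper
  list(observed = obs, lower = lower, upper = upper, outlier = outlier)
}

# Simulated residuals stand in for the residuals of one identifier
res <- envelope_check(rnorm(12))
mean(res$outlier)  # fraction of sort positions outside the envelope
```

Plotting res$observed against qnorm(ppoints(12)) and overlaying res$lower and res$upper would give a Q-Q plot with envelopes of the kind described above.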

Value

anotaResidOutlierTest generates a graphical output ("ANOTA_residual_distribution_summary.pdf") showing the Q-Q plots from all genes as well as the envelopes from the sampled data. The obtained percentage of outliers is shown at each rank position and for all positions combined. Optionally, when generateSingleGraph is set to TRUE, the function also generates individual plots (stored as "ANOTA_residual_distributions_single.pdf") for n genes (set by nGraphs). When residFitPlot is set to TRUE, an output comparing the fitted values to the residuals is generated (stored as "ANOTA_residuals_vs_fitted.jpeg"). An output list object with the following slots is also generated:

confInt

The selected confInt (see function arguments).

inputResiduals

The residuals used.

rnormIter

The number of sampled data sets.

outlierMatrixLog

A logical matrix describing which residuals were outliers in the last iteration of the analysis.

meanOutlierPerIteration

The fraction of outliers per iteration.

obtainedComparedToExpected

The ratio of the obtained number of outlier residuals compared to the expected number of outliers given the selected confInt.

nExpected

Number of expected outlier residuals.

nObtained

Number of obtained outlier residuals.
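How the summary slots relate to each other can be sketched with a simulated logical matrix. This is a hedged illustration only: the matrix orientation (rows as identifiers, columns as rank positions), the variable names, and the formula for the expected count (total number of residuals multiplied by confInt) are assumptions, not confirmed details of anota.

```r
# Hypothetical stand-in for outlierMatrixLog:
# 1000 identifiers (rows) x 12 rank positions (columns), about 1% flagged
set.seed(2)
outlier_mat <- matrix(runif(1000 * 12) < 0.01, nrow = 1000)
conf_int <- 0.01

n_obtained <- sum(outlier_mat)                  # obtained outlier residuals
n_expected <- length(outlier_mat) * conf_int    # assumed expected count
obtained_vs_expected <- n_obtained / n_expected # ratio close to 1 is expected

# Fraction of outliers at each rank position and overall
per_position <- colMeans(outlier_mat)
overall <- mean(outlier_mat)
```

A ratio well above 1 would indicate more outlier residuals than expected under normality.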

Author(s)

Ola Larsson ola.larsson@ki.se, Nahum Sonenberg nahum.sonenberg@mcgill.ca, Robert Nadon robert.nadon@mcgill.ca

Source

Venables, W.N. and Ripley, B.D. (1999). Modern Applied Statistics with S-PLUS. Springer.

See Also

anotaPerformQc, anotaGetSigGenes, anotaPlotSigGenes

Examples

## See example for anotaPlotSigGenes