lmPlot: Several diagnostic plots for checking p-value influencers


View source: R/lmInfl.R


Seven different plot types that visualize p-value influencers.

1. lmPlot: plots the linear regression, marks the influencer(s) in red and displays trend lines for the full and leave-one-out (LOO) data set (black and red, respectively).
2. pvalPlot: plots the p-values for each LOO data point and displays the values as a full model/LOO model plot, together with the alpha border as defined in lmInfl.
3. inflPlot: plots dfbeta for slope, dffits, covratio, cooks.distance, leverage (hatvalues) and studentized residuals (rstudent) against the Δp-value. This allows changes in these six measures to be compared with the corresponding drop or rise in p-value. The plots include vertical boundaries for threshold values as defined in the literature under 'References'.
4. slsePlot: plots all LOO-slopes and their standard errors together with the corresponding original model values and a t-value border calculated as Q_t(1 - \alpha/2, n - 2), the 1 - \alpha/2 quantile of the t-distribution with n - 2 degrees of freedom. Removing a point that lies to the right of this border results in a significant LOO model; removing a point to the left results in a non-significant one.
5. threshPlot: plots the output of lmThresh, i.e. the regression plot including confidence/prediction intervals, as well as for each response value y_i the region in which the model is significant (green). This is tested for either i) y_i that are shifted into this region (newobs = FALSE in lmThresh) or ii) when a new observation y2_i is added (newobs = TRUE in lmThresh). In the latter case, it is informative if this region resides within the prediction interval (dashed line), indicating that a future additional measurement at x_i might reverse the significance statement.
6. multPlot: plots the output of lmMult as a point cloud of p-values for each number of removed samples from 1 to max, with n combinations each. All combinations for which the sample removal resulted in a significance reversal are colored in red; the percentages of these are given on top of the plot.
7. stabPlot: for single (to be selected) response values from the output of lmThresh, this function displays the region of significance reversal within the surrounding prediction interval. The probability that either shifting the response value (if lmThresh(..., newobs = FALSE)) or including a future (measurement) point (if lmThresh(..., newobs = TRUE)) reverses the significance is shown as the integral between the "end of significance region" (eosr) and the nearest prediction interval boundary.
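
The leave-one-out quantities that lmPlot, pvalPlot and slsePlot display can be approximated in base R. The following is only a minimal sketch, not the reverseR implementation: the simulated data and all object names (d, loo, t_border, reversed) are assumptions made for illustration, and n in the t-value border is taken to be the size of the full data set.

```r
# Base-R sketch of leave-one-out (LOO) slopes, standard errors and p-values.
# Data and object names are illustrative assumptions, not reverseR output.
set.seed(123)
d <- data.frame(x = 1:20)
d$y <- 5 + 0.08 * d$x + rnorm(20)
alpha <- 0.05
n <- nrow(d)

# Full model and the p-value of its slope
full <- lm(y ~ x, data = d)
p_full <- summary(full)$coefficients["x", "Pr(>|t|)"]

# Refit the model n times, each time leaving out one observation
loo <- t(sapply(seq_len(n), function(i) {
  fit <- lm(y ~ x, data = d[-i, ])
  summary(fit)$coefficients["x", c("Estimate", "Std. Error", "Pr(>|t|)")]
}))
colnames(loo) <- c("slope", "se", "p")

# A point is a p-value influencer if its removal flips the significance statement
reversed <- (loo[, "p"] <= alpha) != (p_full <= alpha)

# t-value border as given in the text: Q_t(1 - alpha/2, n - 2)
# (assumption: n is the number of observations in the full data set)
t_border <- qt(1 - alpha / 2, df = n - 2)

# Simple visual check in the spirit of pvalPlot: LOO p-values with the alpha border
plot(seq_len(n), loo[, "p"], log = "y", xlab = "removed point", ylab = "LOO p-value",
     col = ifelse(reversed, "red", "black"), pch = 16)
abline(h = c(alpha, p_full), lty = c(2, 1))
```

Here which(reversed) lists the indices of the points whose removal changes the significance statement at the chosen alpha.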

NOTE: The visual display should always be supplemented with the corresponding stability analysis.
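
The idea behind the multPlot point cloud (removing 1 to max samples in n random combinations and recording the resulting p-values) can be sketched in base R as follows. This is a rough illustration under stated assumptions, not the lmMult algorithm itself: max_rm, n_comb, slope_p and the simulated data are hypothetical names and values.

```r
# Base-R sketch of multiple-sample removal and significance reversal.
# max_rm, n_comb and slope_p are hypothetical names chosen for this example.
set.seed(123)
d <- data.frame(x = 1:20)
d$y <- 5 + 0.08 * d$x + rnorm(20)
alpha <- 0.05
max_rm <- 3    # remove 1, 2 and 3 points
n_comb <- 200  # random combinations per removal size

# p-value of the slope for a given data set
slope_p <- function(dat) {
  summary(lm(y ~ x, data = dat))$coefficients["x", "Pr(>|t|)"]
}
sig_full <- slope_p(d) <= alpha

# For each removal size, draw random combinations and record the p-value
res <- do.call(rbind, lapply(seq_len(max_rm), function(m) {
  p <- replicate(n_comb, slope_p(d[-sample(nrow(d), m), ]))
  data.frame(removed = m, p = p)
}))

# Flag combinations whose removal reverses the significance statement
res$reversed <- (res$p <= alpha) != sig_full

# Percentage of reversals per removal size
pct <- 100 * tapply(res$reversed, res$removed, mean)
```

Plotting res$p against res$removed, with the reversed combinations in red, gives a display similar in spirit to the one described for multPlot.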


lmPlot(infl, ...) 
pvalPlot(infl, ...) 
inflPlot(infl, ...)
slsePlot(infl, ...)
threshPlot(thresh, bands = FALSE, ...)
multPlot(mult, log = FALSE, ...)
stabPlot(stab, which = NULL, ...)



infl: an object obtained from lmInfl.

thresh: an object obtained from lmThresh.

stab: an object obtained from using stability on an lmThresh output.

bands: logical. If TRUE, plots the confidence and prediction bands.

mult: an object obtained from lmMult.

log: logical. Should the p-values be displayed on a logarithmic y-axis?

which: which response value should be shown in stabPlot?

...: other plotting parameters.


The corresponding plot.


Cut-off values for the different influence measures are the following:
dfbeta slope: |\Delta\beta_{1,i}| > 2/\sqrt{n}
dffits: |\mathrm{dffits}_i| > 2\sqrt{2/n}
covratio: |\mathrm{covr}_i - 1| > 3k/n
Cook's D: D_i > F(0.5, k, n - k); 4/(n - k)
leverage: h_{ii} > 2k/n
studentized residual: t_i > t(0.975, n - k - 1)
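
As a rough illustration, the cut-offs listed above can be applied to the standard influence measures from base R's stats package. This is a sketch under assumptions, not the code used by inflPlot: k is taken to be the number of model coefficients, the scaled dfbetas() is used for the slope, and the absolute value of the studentized residual is compared against the t-quantile.

```r
# Base-R sketch: influence measures and the cut-offs from the Note.
# Assumptions: k = number of coefficients; scaled dfbetas() for the slope;
# abs() on rstudent (the Note lists t_i without absolute value).
set.seed(123)
d <- data.frame(x = 1:20)
d$y <- 5 + 0.08 * d$x + rnorm(20)
fit <- lm(y ~ x, data = d)
n <- nrow(d)
k <- length(coef(fit))

infl <- data.frame(
  dfbeta_slope = dfbetas(fit)[, "x"],
  dffits       = dffits(fit),
  covratio     = covratio(fit),
  cooksD       = cooks.distance(fit),
  leverage     = hatvalues(fit),
  rstudent     = rstudent(fit)
)

# TRUE marks an observation that exceeds the respective cut-off
flags <- data.frame(
  dfbeta_slope = abs(infl$dfbeta_slope) > 2 / sqrt(n),
  dffits       = abs(infl$dffits) > 2 * sqrt(2 / n),
  covratio     = abs(infl$covratio - 1) > 3 * k / n,
  cooksD       = infl$cooksD > qf(0.5, k, n - k),
  leverage     = infl$leverage > 2 * k / n,
  rstudent     = abs(infl$rstudent) > qt(0.975, n - k - 1)
)

# Number of flagged observations per measure
colSums(flags)
```

Only the first of the two Cook's D cut-offs is used here; 4/(n - k) can be substituted for qf(0.5, k, n - k) in the same line.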


Andrej-Nikolai Spiess


Linear Regression Diagnostics. Welsch RE & Kuh E. NBER Computer Research Center (2017).

Applied Regression Analysis: A Research Tool. Rawlings JO, Pantula SG, Dickey DA. Springer; 2nd Corrected ed. 1998. Corr. 2nd printing 2001.

Applied Regression Analysis and Generalized Linear Models. Fox J. SAGE Publishing, 3rd ed, 2016.


## See Examples in 'lmInfl', 'lmThresh' and 'lmMult'.
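
For orientation, a typical call sequence might look like the sketch below. It assumes that the reverseR package is installed and that lmInfl and lmThresh accept a fitted lm object with their default settings; the simulated data and the object names (LM1, res, thresh) are made up for this example. The examples in 'lmInfl', 'lmThresh' and 'lmMult' remain the authoritative reference.

```r
# Hedged usage sketch; assumes reverseR is installed and that lmInfl() and
# lmThresh() can be called on an 'lm' object with default arguments.
set.seed(123)
a <- 1:20
b <- 5 + 0.08 * a + rnorm(20, 0, 1)
LM1 <- lm(b ~ a)

if (requireNamespace("reverseR", quietly = TRUE)) {
  library(reverseR)

  # Plots based on the leave-one-out analysis
  res <- lmInfl(LM1)
  lmPlot(res)
  pvalPlot(res)
  inflPlot(res)
  slsePlot(res)

  # Significance regions, with confidence and prediction bands
  thresh <- lmThresh(LM1)
  threshPlot(thresh, bands = TRUE)
}
```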

anspiess/reverseR documentation built on May 30, 2018, 11:20 a.m.