stability | R Documentation |
This function calculates stability values for LOO (lmInfl
), and response value shifting/addition (lmThresh
).
stability(x, pval = FALSE, ...)
x |
a result of either |
pval |
logical. If |
... |
other parameters, not yet implemented. |
For results of lmInfl
:
A [0, 1]-bounded stability measure S = 1-\frac{n}{N}, with n = number of influencers (significance reversers) and N = total number of response values.
For results of lmThresh
:
A [0, 1]-bounded stability measure S = 1-\frac{n}{N}, with n = number of response values where one of the ends of the significance region is within the prediction interval and N = total number of response values.
If pval = TRUE
, the exact p-value is calculated in the following manner:
1) Mean square error (MSE) and prediction standard error (se) are calculated from the linear model:
\mathrm{MSE} = ∑_{i=1}^n \frac{(y_i - \hat{y}_i)^2}{n-2} \quad\quad \mathrm{se}_i = √{\mathrm{MSE} \cdot ≤ft(1 + \frac{1}{n} + \frac{(x_i - \bar{x}_i)^2}{∑_{i=1}^n (x_i - \bar{x}_i)^2}\right)}
2) Upper and lower prediction intervals boundaries are calculated for each \hat{y}_i:
\hat{y}_i \pm Q_t(α/2, n-2) \cdot \rm{se}_i
The prediction interval around \hat{y}_i is a scaled/shifted t-distribution with density function
P_{tss}(y, n-2) = \frac{1}{\rm{se}_i} \cdot P_t≤ft(\frac{y - \hat{y}_i}{\rm{se}_i}, n-2\right)
, where P_t is the density function of the central, unit-variance t-distribution.
3) The probability of either shifting the response value (if lmThresh(..., newobs = FALSE)
) or including a future response value y_{2i} (if lmThresh(..., newobs = TRUE)
) to reverse the significance of the linear model is calculated as the integral between the end of the significance region (eosr) and the upper/lower α/2, 1-α/2 prediction interval:
P(\mathrm{reverse}) = \int_{\mathrm{eosr}}^{1-α/2} P_{tss}(y, n-2)dy \quad \mathrm{or} \quad \int_{α/2}^{\mathrm{eosr}} P_{tss}(y, n-2)dy
The stability value.
Andrej-Nikolai Spiess
## See examples in 'lmInfl' and 'lmThresh'. ## The implemented strategy of calculating the ## probability of significance reversal, as explained above ## and compared to 'stabPlot'. set.seed(125) a <- 1:20 b <- 5 + 0.08 * a + rnorm(length(a), 0, 1) LM1 <- lm(b ~ a) res1 <- lmThresh(LM1, newobs = TRUE) st1 <- stability(res1, pval = TRUE) ## Let's check that the prediction interval encompasses 95%: dt.scaled <- function(x, df, mu, s) 1/s * dt((x - mu)/s, df) integrate(dt.scaled, lower = st1$stats[1, "lower"], st1$stats[1, "upper"], df = 18, mu = st1$stats[1, "fitted"], s = st1$stats[1, "se"]) ## => 0.95 with absolute error < 8.4e-09 ## This is the interval between "end of significance region" and upper ## prediction boundary: integrate(dt.scaled, lower = st1$stats[1, "eosr.2"], st1$stats[1, "upper"], df = 18, mu = st1$stats[1, "fitted"], s = st1$stats[1, "se"]) ## => 0.09264124 with absolute error < 1e-15 ## We can recheck this value by P(B) - P(A): pt.scaled <- function(x, df, mu, s) pt((x - mu)/s, df) pA <- pt.scaled(x = st1$stats[1, "eosr.2"], df = 18, mu = st1$stats[1, "fitted"], s = st1$stats[1, "se"]) 0.975 - pA ## => 0.09264124 as above
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.