Returns diagnostic measures for a binary regression model by covariate pattern
x |
A regression model with class |
... |
Additional arguments
which can be passed to:
|
byCov |
Return values by covariate pattern, rather than by individual observation. |
A data.table, with rows sorted by dBhat.
If byCov==TRUE, there is one row per covariate pattern
with at least one observation.
The initial columns give the predictor variables 1 ... p.
Subsequent columns are labelled as follows:
y |
The actual number of observations with y=1 in the model data. |
P |
Predicted probability of y=1 for this covariate pattern, according to the model.
|
n |
Number of observations with these covariates.
|
yhat |
The predicted number of observations having
a response of y=1, according to the model.
yhat[i] = n[i] * P[i] |
h |
Leverage, the diagonal of the hat matrix used to generate the model:
  H = V^0.5 X (X'VX)^-1 X'V^0.5
Here ^-1 is the inverse and ' is the transpose of a matrix.
X is the design matrix and V is a diagonal matrix with elements:
  v[i][i] = n[i] * P[i] * (1 - P[i])
The term (X'VX)^-1 is the estimated covariance matrix of Bhat.
For 0.1 < P[i] < 0.9:
  h[i] ~= x[i] - mean(x)
That is, leverage is approximately equal to the distance of
the covariate pattern i from the mean mean(x).
|
Pr |
The Pearson residual, a measure of influence. This is:
  Pr[i] = (y[i] - ybar) / SD[y]
where ybar and SD[y] refer to the mean and standard deviation of a binomial distribution:
  E(y=1) = ybar = yhat = nP
  SD[y] = (nP(1 - P))^0.5
Thus:
  Pr[i] = (y[i] - n[i]P[i]) / (n[i]P[i](1 - P[i]))^0.5 |
dr |
The deviance residual, a measure of influence:
  dr[i] = sign(y[i] - yhat[i]) * d[i]^0.5
d[i] is the contribution of observation i to the model deviance.
In logistic regression this is:
  y[i] = 1 --> dr[i] = (2 * (log(1 + e^f(x)) - f(x)))^0.5
  y[i] = 0 --> dr[i] = -(2 * log(1 + e^f(x)))^0.5
where f(x) is the linear function of the predictors 1 ... p:
  f(x) = B[0] + B[1] * x[1][i] + ... + B[p] * x[p][i]
By covariate pattern this is also:
  dr[i] = sign(y[i] - yhat[i]) * [2 * (y[i] * log(y[i] / (n[i] * P[i])) + (n[i] - y[i]) * log((n[i] - y[i]) / (n[i] * (1 - P[i]))))]^0.5
To avoid evaluating log(0) when y[i] = 0:
  y[i] = 0 --> dr[i] = -(2 * n[i] * | log(1 - P[i]) |)^0.5
Similarly, when y[i] = n[i]:
  y[i] = n[i] --> dr[i] = (2 * n[i] * | log(P[i]) |)^0.5
The above equations are used when calculating dr[i] by covariate group. |
sPr |
The standardized Pearson residual.
sPr[i] = Pr[i] / (1 - h[i])^0.5 |
sdr |
The standardized deviance residual.
sdr[i] = dr[i] / (1 - h[i])^0.5 |
dChisq |
The change in the Pearson chi-square statistic with observation i removed. Given by:
  dChisq[i] = sPr[i]^2 = Pr[i]^2 / (1 - h[i])
where sPr[i] is the standardized Pearson residual,
Pr[i] is the Pearson residual and h[i] is the leverage.
|
dDev |
The change in the deviance statistic
D = SUM dr[i]^2
with observation i excluded.
dDev[i] = sdr[i]^2 = dr[i]^2 / (1 - h[i]) |
dBhat |
The change in Bhat
with observation i excluded.
dBhat[i] = h[i] * sPr[i]^2 / (1 - h[i])
where sPr[i] is the standardized Pearson residual.
|
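The formulas above can be traced by hand in base R. The following is a minimal sketch, not the implementation used by this function: the grouped data (x, n, y) and the helper devTerm are hypothetical, made up for illustration, and the first pattern has y = 0 so that the special case for the deviance residual is exercised. The column names simply mirror the labels listed above.

```r
## Hypothetical grouped data: one row per covariate pattern.
x <- c(0, 1, 2, 3, 4)      # single predictor
n <- c(10, 12, 8, 15, 9)   # observations per pattern
y <- c(0, 3, 4, 11, 8)     # observations with y=1 per pattern
fit <- glm(cbind(y, n - y) ~ x, family = binomial)

P    <- unname(fitted(fit))  # predicted probability per pattern
yhat <- n * P                # predicted number with y=1

## Leverage: diagonal of H = V^0.5 X (X'VX)^-1 X'V^0.5
X  <- model.matrix(fit)
v  <- n * P * (1 - P)        # diagonal elements of V
Vh <- diag(sqrt(v))
H  <- Vh %*% X %*% solve(t(X) %*% diag(v) %*% X) %*% t(X) %*% Vh
h  <- diag(H)

## Pearson residual
Pr <- (y - yhat) / sqrt(v)

## Deviance residual. A term with an observed count of zero contributes 0,
## which reproduces the y[i] = 0 and y[i] = n[i] special cases.
## devTerm is a hypothetical helper, not part of any package.
devTerm <- function(obs, expd) ifelse(obs == 0, 0, obs * log(obs / expd))
d  <- 2 * (devTerm(y, yhat) + devTerm(n - y, n - yhat))
dr <- sign(y - yhat) * sqrt(pmax(d, 0))

## Standardized residuals and change statistics
sPr    <- Pr / sqrt(1 - h)
sdr    <- dr / sqrt(1 - h)
dChisq <- sPr^2
dDev   <- sdr^2
dBhat  <- h * sPr^2 / (1 - h)

round(data.frame(x, y, P, n, yhat, h, Pr, dr, sPr, sdr, dChisq, dDev, dBhat), 3)
```

As a cross-check, h, Pr and dr computed this way should agree closely with hatvalues(fit), residuals(fit, type="pearson") and residuals(fit, type="deviance") from the stats package.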
By default, values for the statistics are calculated by
covariate pattern.
Different values may be obtained if calculated for each individual
observation (e.g. rows in a data.frame).
Generally, the values calculated by covariate pattern are preferred,
particularly where the number of observations in a group is >5.
In this case Pearson's chi-squared and the deviance statistic
should follow a chi-squared distribution with i - p degrees of freedom.
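The difference between the two approaches can be seen with a small base R sketch. It assumes hypothetical individual-level data (ind) that is collapsed into covariate patterns (grp) by hand; it does not call this function. The fitted coefficients are the same either way, but the residuals, and hence the Pearson chi-squared and deviance statistics, are not.

```r
## Hypothetical individual-level data: one row per observation.
ind <- data.frame(
    x = rep(c(0, 1, 2, 3), times = c(6, 8, 7, 9)),
    y = c(0, 0, 0, 0, 1, 0,
          0, 1, 0, 0, 1, 0, 1, 0,
          1, 0, 1, 1, 0, 1, 1,
          1, 1, 1, 0, 1, 1, 1, 1, 1))

## Collapse to one row per covariate pattern.
grp <- data.frame(
    x = sort(unique(ind$x)),
    y = as.vector(tapply(ind$y, ind$x, sum)),     # number with y=1
    n = as.vector(tapply(ind$y, ind$x, length)))  # number of observations

fitInd <- glm(y ~ x, family = binomial, data = ind)
fitGrp <- glm(cbind(y, n - y) ~ x, family = binomial, data = grp)

## Same coefficients from both fits
rbind(individual = coef(fitInd), grouped = coef(fitGrp))

## Different number of residuals, and different summary statistics
data.frame(
    fit      = c("individual", "grouped"),
    nResid   = c(length(residuals(fitInd)), length(residuals(fitGrp))),
    chiSq    = c(sum(residuals(fitInd, type = "pearson")^2),
                 sum(residuals(fitGrp, type = "pearson")^2)),
    deviance = c(deviance(fitInd), deviance(fitGrp)),
    df       = c(df.residual(fitInd), df.residual(fitGrp)))
```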
## H&L 2nd ed. Table 5.8. Page 182.
## Pattern nos. 31, 477, 468
data(uis)
uis <- within(uis, {
    NDRGFP1 <- 10 / (NDRGTX + 1)
    NDRGFP2 <- NDRGFP1 * log((NDRGTX + 1) / 10)
})
(d1 <- dx(g1 <- glm(DFREE ~ AGE + NDRGFP1 + NDRGFP2 + IVHX +
                    RACE + TREAT + SITE +
                    AGE:NDRGFP1 + RACE:SITE,
                    family=binomial, data=uis)))
d1[519:521, ]