Description Usage Arguments Value Note See Also Examples
Diagnostics for binomial regression
Returns diagnostic measures for a binary regression model by covariate pattern
1 2 3 4 
x 
A regression model with class 
... 
Additional arguments
which can be passed to:

byCov 
Return values by covariate pattern, rather than by individual observation. 
A data.table
, with rows sorted by dBhat.
If byCov==TRUE
, there is one row per covariate pattern
with at least one observation.
The initial columns give the predictor variables 1 ... p.
Subsequent columns are labelled as follows:
y 
The actual number of observations with y=1 in the model data. 
P 
Probability of this covariate pattern.

n 
Number of observations with these covariates.

yhat 
The predicted number of observations having
a response of y=1, according to the model.
yhat[i] = n[i] * P[i] 
h 
Leverage, the diagonal of the hat matrix used to generate the model: H = V^0.5 X (X'VX)^1 X'V^0.5 Here ^1 is the inverse and
' is the transpose of a matrix.
v[i][i] = n[i]P[i] * (1  P[i]) Leverage H is also the estimated covariance matrix of
Bhat.
h[i] = x[i]  mean(x), 0.1 < P[i] < 0.9 That is, leverage is approximately equal to the distance of
the covariate pattern i from the mean mean(x).

Pr 
The Pearson residual, a measure of influence. This is: Pr[i] = (y[i]  ybar) / SD[y] where ybar and SD[y] refer
to the mean and standard deviation of a binomial distribution.
E(y=1) = ybar = yhat = nP and SE[y] = (nP(1P))^0.5 Thus: Pr[i] = (y[i]  n[i]P[i]) / (n[i]P[i](1  P[i])^0.5) 
dr 
The deviance residual, a measure of influence: dr[i] = sign(y[i]  yhat[i]) * d[i]^0.5 d[i] is the contribution of observation i
to the model deviance.
In logistic regression this is: y[i] = 1 > dr[i] = (2 * log (1 + e^f(x)  f(x)))^0.5 y[i] = 0 > dr[i] = (2 * log (1 + e^f(x))) where f(x) is the linear function of the predictors 1 ... p: f(x) = B[0] + B[1][i] * x[1][i] + ... + B[p][i] * x[p][i] this is also: dr[i] = sign(y[i]  yhat[i]) [2 * (y[i] * log(y[i] / n[i] * p[i])) + (n[i]  y[i]) * log((n[i]  y[i]) / (n[i] * (1  p[i])))]^0.5 To avoid the problem of division by zero: y[i] = 0 > dr[i] = (2 * n[i] *  log(1  P[i]) )^0.5 Similarly to avoid log(Inf): y[i] = n[i] > dr[i] = (2 * n[i] *  log(P[i]) )^0.5 The above equations are used when calculating dr[i] by covariate group. 
sPr 
The standardized Pearson residual.
sPr[i] = Pr[i] / (1  h[i])^0.5 
sdr 
The standardized deviance residual.
sdr[i] = dr[i] / (1  h[i])^0.5 
dChisq 
The change in the Pearson chisquare statistic with observation i removed. Given by: dChi^2 = sPr[i]^2 = Pr[i]^2 / (1  h[i]) where sPr[i] is the standardized Pearson residual,
Pr[i] is the Pearson residual and
h[i] is the leverage.

dDev 
The change in the deviance statistic
D = SUM dr[i]
with observation i excluded.
dDev[i] = sdr[i]^2 = dr[i]^2 / (1  h[i]) 
dBhat 
The change in Bhat
with observation i excluded.
dBhat = h[i] * sPr[i]^2 / (1  h[i]) where sPR[i] is the standardized Pearson residual.

By default, values for the statistics are calculated by
covariate pattern.
Different values may be obtained if
calculated for each individual
obervation (e.g. rows in a data.frame
).
Generally, the values calculated by
covariate pattern are preferred,
particularly where the number of observations in a group is >5.
In this case Pearsons chisquared and the deviance statistic
should follow a chisquared distribution with i  p degrees of freedom.
1 2 3 4 5 6 7 8 9 10 11 12  ## H&L 2nd ed. Table 5.8. Page 182.
## Pattern nos. 31, 477, 468
data(uis)
uis < within(uis, {
NDRGFP1 < 10 / (NDRGTX + 1)
NDRGFP2 < NDRGFP1 * log((NDRGTX + 1) / 10)
})
(d1 < dx(g1 < glm(DFREE ~ AGE + NDRGFP1 + NDRGFP2 + IVHX +
RACE + TREAT + SITE +
AGE:NDRGFP1 + RACE:SITE,
family=binomial, data=uis)))
d1[519:521, ]

Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.