Sample size for a given coefficient and events per covariate for model
x |
A regression model with class |
... |
Not used. |
alpha |
Significance level (alpha) for the null-hypothesis significance test. |
beta |
Power (beta) for the null-hypothesis significance test. |
coeff |
Name of coefficient (variable) in the model to be tested. |
std |
Standardize the coefficient?
z[x] = (x[i] - xbar) / SD[x] |
alternative |
The default, |
OR |
Odds ratio. The size of the change in the probability. |
Px0 |
The probability that x=0.
|
Gives the sample size necessary to demonstrate that the coefficient
for the given predictor in the model is equal to its given value
rather than equal to zero (or, if OR is supplied,
the sample size needed to detect such a change in probability).
Also gives the number of events per predictor,
taking the number of events as the smaller of the counts of
outcome y=0 and outcome y=1.
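As a rough illustration of the events-per-predictor idea, the sketch below computes it by hand on simulated data. The data, the variable names and the choice to count every non-intercept coefficient as one predictor are assumptions made for this example; the function documented here may count predictors differently.

```r
## Simulated data for illustration only; not taken from the package.
set.seed(1)
d <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
d$y <- rbinom(200, size = 1, prob = plogis(-1 + 0.5 * d$x1))
fit <- glm(y ~ x1 + x2, data = d, family = binomial)

## Number of events: the count of the less frequent outcome.
events <- min(table(fit$y))
## Number of predictors (assumption): coefficients excluding the intercept.
n_pred <- length(coef(fit)) - 1
epc <- events / n_pred
epc
```
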
For a continuous coefficient, the calculation uses
Bhat, the estimated coefficient from the model,
the quantity delta:
delta = (1 + (1 + Bhat^2) * exp(1.25 * Bhat^2)) / (1 + exp(-0.25 * Bhat^2))
and P[0], the probability calculated from the intercept term
B[0] from the logistic model
glm(x$y ~ coeff, family=binomial)
as
P[0] = exp(B[0]) / (1 + exp(B[0]))
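The conversion from the intercept to P[0] is the inverse logit. A minimal sketch on simulated data (the data and object names are made up for this example) shows the hand-written expression next to the built-in plogis():

```r
## Simulated data for illustration only.
set.seed(2)
x <- rnorm(300)
y <- rbinom(300, size = 1, prob = plogis(-1.5 + 0.4 * x))
fit <- glm(y ~ x, family = binomial)

## Intercept term B[0] from the single-predictor logistic model.
B0 <- unname(coef(fit)["(Intercept)"])
## Probability at x = 0, written out as in the text.
P0 <- exp(B0) / (1 + exp(B0))
## plogis() is the inverse logit in base R and should agree.
all.equal(P0, plogis(B0))
```
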
For a model with one predictor, the calculation is:
n = (1 + 2 * P[0] * delta) * (z[1-alpha] + z[beta] * exp(-0.25 * Bhat^2))^2 / (P[0] * Bhat^2)
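The single-predictor calculation can be sketched in a few lines of R. This is not the package's code: ss_continuous() is a hypothetical helper, and it assumes the form of the formula as I understand it from Hsieh (1989), namely n = (1 + 2 * P[0] * delta) * (z[1-alpha] + z[beta] * exp(-0.25 * Bhat^2))^2 / (P[0] * Bhat^2), with one-sided normal quantiles. The argument called power here plays the role of beta above.

```r
## Hypothetical helper, not part of the package.
## Assumes the Hsieh (1989) form of the formula and a one-sided test.
ss_continuous <- function(Bhat, P0, alpha = 0.05, power = 0.8) {
  delta <- (1 + (1 + Bhat^2) * exp(1.25 * Bhat^2)) /
    (1 + exp(-0.25 * Bhat^2))
  z_alpha <- qnorm(1 - alpha)  # z[1-alpha]
  z_beta <- qnorm(power)       # z[beta], with beta given as the power
  (1 + 2 * P0 * delta) *
    (z_alpha + z_beta * exp(-0.25 * Bhat^2))^2 / (P0 * Bhat^2)
}

## Example: log odds ratio of 1.5 per standard deviation, P[0] = 0.1.
ss_continuous(Bhat = log(1.5), P0 = 0.1)
```

Under these assumptions a larger coefficient needs fewer observations and a higher power needs more, which is a quick sanity check on any implementation.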
For a multivariable model, the value is adjusted by R^2, the squared
multiple correlation of coeff
with the other predictors in the model:
n[m] = n / (1 - R^2)
For a binomial coefficient, the calculation uses P[0], the probability given the null hypothesis, P[a], the probability given the alternative hypothesis, and the average probability Pbar = (P[0] + P[a]) / 2. The calculation is:
n = (z[1-alpha] * (2 * Pbar * (1 - Pbar))^0.5 + z[beta] * (P[0] * (1 - P[0]) + P[a] * (1 - P[a]))^0.5)^2 / (P[a] - P[0])^2
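A minimal sketch of the two-probability calculation follows. ss_binary() is a hypothetical helper written for this example, not the package's implementation; it assumes each square root applies to the whole variance term and that one-sided normal quantiles are used, with power standing in for beta.

```r
## Hypothetical helper, not part of the package.
ss_binary <- function(P0, Pa, alpha = 0.05, power = 0.8) {
  Pbar <- (P0 + Pa) / 2        # average of the two probabilities
  z_alpha <- qnorm(1 - alpha)  # z[1-alpha]
  z_beta <- qnorm(power)       # z[beta], with beta given as the power
  (z_alpha * sqrt(2 * Pbar * (1 - Pbar)) +
     z_beta * sqrt(P0 * (1 - P0) + Pa * (1 - Pa)))^2 / (Pa - P0)^2
}

## Example: detect a change in probability from 0.2 to 0.3.
ss_binary(P0 = 0.2, Pa = 0.3)
```

The expression is symmetric in the two probabilities, and the required size falls quickly as they move apart.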
An alternative given by Whittemore (1981) uses Phat = P(x=0).
The lead term in the equation below is used to correct for
large values of Phat:
n = (1 + 2P[0]) * (z[1-alpha]sqrt(1/Phat + 1/(1+Phat)) + z[beta]sqrt(1/Phat + 1/(Phat exp(Bhat))))^2 / (P[0]Bhat)^2
As above, these can be adjusted in the multivariable case:
n[m] = n / (1 - R^2)
In this case, R^2 is the squared Pearson correlation between coeff
and the fitted values from a logistic regression with coeff
as the response
and the other predictors as covariates.
The calculation uses Pbar, the mean probability (mean of the
fitted values from the model):
R^2 = (sum((y[i] - Pbar) * (P[i] - Pbar)))^2 / (sum((y[i] - Pbar)^2) * sum((P[i] - Pbar)^2))
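The R^2 calculation and the resulting adjustment can be sketched on simulated data. Everything named here (treat, x2, x3, aux, and the unadjusted size of 500) is invented for the example; the sketch only assumes that the binary predictor of interest is regressed on the other predictors with glm().

```r
## Simulated data for illustration only.
set.seed(3)
n <- 250
x2 <- rnorm(n)
x3 <- rnorm(n)
## Binary predictor of interest, correlated with x2.
treat <- rbinom(n, size = 1, prob = plogis(0.8 * x2))

## Auxiliary logistic regression: predictor of interest as the response.
aux <- glm(treat ~ x2 + x3, family = binomial)
P <- fitted(aux)
Pbar <- mean(P)

## R^2 written out term by term.
R2 <- sum((treat - Pbar) * (P - Pbar))^2 /
  (sum((treat - Pbar)^2) * sum((P - Pbar)^2))

## With an intercept in the model, mean(P) matches mean(treat),
## so this agrees with the squared Pearson correlation.
all.equal(R2, cor(treat, P)^2, tolerance = 1e-6)

## Inflate a hypothetical unadjusted sample size.
n_unadjusted <- 500
n_adjusted <- n_unadjusted / (1 - R2)
n_adjusted
```
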
A list of:
ss |
Sample size required to show that the coefficient for the predictor is as given in the model rather than the alternative value (by default 0). |
epc |
Events per covariate; should be >10 to make meaningful statements about the coefficients obtained. |
The returned list has the additional class of "ss.glm".
The print method for this class does not show the attributes.
Whittemore AS (1981). Sample Size for Logistic Regression with Small Response Probability. Journal of the American Statistical Association. 76(373):27-32. doi: 10.2307/2287036 Also available at JSTOR at https://www.jstor.org/stable/2287036
Hsieh FY (1989). Sample size tables for logistic regression. Statistics in Medicine. 8(7):795-802. doi: 10.1002/sim.4780080704 Also available at statpower (free).
Fleiss J (2003). Statistical methods for rates and proportions. 3rd ed. John Wiley, New York. doi: 10.1002/0471445428 Also available at Google books (free preview).
Peduzzi P, Concato J, Kemper E, Holford T R, Feinstein A R (1996). A simulation study of the number of events per variable in logistic regression analysis. Journal of Clinical Epidemiology. 49(12):1373-79. doi: 10.1016/S0895-4356(96)00236-3
## H&L 2nd ed. Section 8.5.
## Results here are slightly different from the text due to rounding.
data(uis)
with(uis, prop.table(table(DFREE, TREAT), 2))
(g1 <- glm(DFREE ~ TREAT, data=uis, family=binomial))
ss(g1, coeff="TREATlong")
## Pages 340 - 341.
ss(g1, coeff="TREATlong", OR=1.5, Px0=0.5)
## standardize
uis <- within(uis, {
    AGES <- (AGE - 32) / 6
    NDRGTXS <- (NDRGTX - 5) / 5
})
## H&L 2nd ed. Section 8.5. Page 343.
## results slightly different due to rounding
g1 <- glm(DFREE ~ AGES, data=uis, family=binomial)
ss(g1, coeff="AGES", std=FALSE, OR=1.5)
## H&L 2nd ed. Section 8.5. Table 8.37. Page 344.
summary(g1 <- glm(DFREE ~ AGES + NDRGTXS + IVHX + RACE + TREAT,
                  data=uis, family=binomial))
## H&L 2nd ed. Section 8.5. Page 345.
## results slightly different due to rounding
ss(g1, coeff="AGES", std=FALSE, OR=1.5)
ss(g1, coeff="TREATlong", std=FALSE, OR=1.5)