svyvif | R Documentation |
Compute a variance inflation factor (VIF) for fixed-effects linear and generalized linear regression models fitted to data collected from one- and two-stage complex survey designs.
svyvif(mobj, X, w, stvar=NULL, clvar=NULL)
mobj |
model object produced by |
X |
n \times p matrix of real-valued covariates used in fitting a linear regression; n = number of observations, p = number of covariates in model, excluding the intercept. A column of 1's for an intercept should not be included. |
w |
n-vector of survey weights used in fitting the model. No missing values are allowed. |
stvar |
field in |
clvar |
field in |
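The requirements on X and w above (no intercept column, one weight per row, no missing weights) can be checked before calling svyvif. The following is a minimal base R sketch under stated assumptions: it uses the built-in mtcars data rather than survey data, and the weight column wt_svy is made up purely for illustration.

```r
# Toy data: mtcars ships with base R. It is NOT survey data.
dat <- mtcars
set.seed(1)
dat$wt_svy <- runif(nrow(dat), 1, 5)  # hypothetical weights, for illustration only

# model.matrix() expands factors into dummy columns and adds an intercept column
X_full <- model.matrix(~ hp + wt + as.factor(cyl), data = dat)
X <- X_full[, -1, drop = FALSE]       # drop the column of 1's

w <- dat$wt_svy

# Sanity checks matching the argument descriptions
stopifnot(
  !any(colnames(X) == "(Intercept)"), # no intercept column in X
  nrow(X) == length(w),               # one weight per observation
  !anyNA(w)                           # no missing weights
)
dim(X)
```

Using drop = FALSE keeps X a matrix even when the model has a single covariate.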
svyvif computes a variance inflation factor (VIF) appropriate for linear models and some generalized linear models (GLMs) fitted from complex survey data (see Liao & Valliant 2012). A VIF measures the inflation of the variance of a slope estimate caused by nonorthogonality of the predictors, over and above what the variance would be with orthogonality (Theil 1971; Belsley, Kuh, and Welsch 1980). The standard VIF equals 1/(1 - R^2_k), where R_k is the multiple correlation of the k^{th} column of X regressed on the remaining columns. The complex sample value of the VIF for a linear model consists of the standard VIF multiplied by two adjustments, denoted in the output as zeta and varrho. The VIF for a GLM is similar (Liao 2010, chap. 5). There is no widely agreed-upon cutoff value for identifying high values of a VIF, although 10 is a common suggestion.
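The standard VIF formula, 1/(1 - R^2_k), can be illustrated with a short base R sketch. This is the textbook, unweighted calculation on the built-in mtcars data. It is not the internal code of svyvif and it ignores survey weights, strata, and clusters. The helper name std_vif is hypothetical.

```r
# Textbook (non-survey) VIF: regress each column of X on the remaining
# columns (with an intercept) and compute 1 / (1 - R^2_k).
X <- as.matrix(mtcars[, c("disp", "hp", "wt")])

std_vif <- function(X) {
  out <- sapply(seq_len(ncol(X)), function(k) {
    fit <- lm(X[, k] ~ X[, -k, drop = FALSE])  # k-th column on the others
    r2  <- summary(fit)$r.squared
    1 / (1 - r2)
  })
  names(out) <- colnames(X)
  out
}

vif <- std_vif(X)
vif

# Cross-check: the same values appear on the diagonal of the inverse
# of the correlation matrix of X.
all.equal(unname(vif), unname(diag(solve(cor(X)))))
```

A VIF of 1 corresponds to a column that is uncorrelated with the others; larger values indicate stronger collinearity.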
p \times 6 matrix with columns:
svy.vif |
complex sample VIF |
reg.vif |
standard VIF, 1/(1 - R^2_k), that omits the factors, |
zeta |
1st multiplicative adjustment to |
varrho |
2nd multiplicative adjustment to |
zeta.x.varrho |
product of the two adjustments to |
R.square |
R-square in the regression of the k^{th} x on the other x's, including the intercept |
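For a linear model, the relationship among the first five columns can be written compactly. This only restates the description of the output given here and is not a derivation; the subscript k, indexing the rows of the returned matrix (one per covariate), is notation assumed for illustration.

```latex
\[
\text{svy.vif}_k \;=\; \text{zeta}_k \times \text{varrho}_k \times \text{reg.vif}_k ,
\qquad
\text{zeta.x.varrho}_k \;=\; \text{zeta}_k \times \text{varrho}_k ,
\qquad k = 1, \dots, p .
\]
```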
Richard Valliant
Belsley, D.A., Kuh, E. and Welsch, R.E. (1980). Regression Diagnostics: Identifying Influential Data and Sources of Collinearity. New York: Wiley-Interscience.
Liao, D. (2010). Collinearity Diagnostics for Complex Survey Data. PhD thesis, University of Maryland. http://hdl.handle.net/1903/10881.
Liao, D. and Valliant, R. (2012). Variance inflation factors in the analysis of complex survey data. Survey Methodology, 38, 53-62.
Theil, H. (1971). Principles of Econometrics. New York: John Wiley & Sons, Inc.
Lumley, T. (2010). Complex Surveys. New York: John Wiley & Sons.
Lumley, T. (2018). survey: analysis of complex survey samples. R package version 3.34.
Vmat
require(survey)
require(PracTools)   # provides svyvif and the nhanes2007 data set
data(nhanes2007)
X1 <- nhanes2007[order(nhanes2007$SDMVSTRA, nhanes2007$SDMVPSU),]
    # eliminate cases with missing values
    # (logical indexing also works when no rows are incomplete)
X2 <- X1[complete.cases(X1),]
nhanes.dsgn <- svydesign(ids = ~SDMVPSU,
                         strata = ~SDMVSTRA,
                         weights = ~WTDRD1, nest=TRUE, data=X2)

    # linear model
m1 <- svyglm(BMXWT ~ RIDAGEYR + as.factor(RIDRETH1) + DR1TKCAL
             + DR1TTFAT + DR1TMFAT, design=nhanes.dsgn)
summary(m1)
    # construct X matrix using model.matrix from stats package
X3 <- model.matrix(~ RIDAGEYR + as.factor(RIDRETH1) + DR1TKCAL + DR1TTFAT + DR1TMFAT,
                   data = data.frame(X2))
    # remove col of 1's for intercept with X3[,-1]
svyvif(mobj=m1, X=X3[,-1], w = X2$WTDRD1, stvar=NULL, clvar=NULL)

    # Logistic model
X2$obese <- X2$BMXBMI >= 30
    # rebuild the design so that it includes the new obese variable
nhanes.dsgn <- svydesign(ids = ~SDMVPSU,
                         strata = ~SDMVSTRA,
                         weights = ~WTDRD1, nest=TRUE, data=X2)
m2 <- svyglm(obese ~ RIDAGEYR + as.factor(RIDRETH1) + DR1TKCAL
             + DR1TTFAT + DR1TMFAT, design=nhanes.dsgn, family="quasibinomial")
summary(m2)
svyvif(mobj=m2, X=X3[,-1], w = X2$WTDRD1, stvar = "SDMVSTRA", clvar = "SDMVPSU")