svycollinear | R Documentation |
Compute condition indexes and variance decompositions for diagnosing collinearity in fixed effects, linear regression models fitted with data collected from one- and two-stage complex survey designs.
svycollinear(mod, intcpt=TRUE, w, Vcov, sc=TRUE, svyglm.obj, rnd=3, fuzz=0.3)
mod |
Either (i) an n \times p matrix of real-valued covariates used in fitting a linear regression; n = number of observations, p = number of covariates in model, excluding the intercept; the matrix |
intcpt |
|
w |
n-vector of survey weights used in fitting the model. No missing values are allowed. |
Vcov |
Variance-covariance matrix of the estimated slopes in the regression model; component |
sc |
|
svyglm.obj |
Is |
rnd |
Round the output to |
fuzz |
Replace any variance decomposition proportions that are less than |
svycollinear computes condition indexes and variance decomposition proportions for diagnosing collinearity in a linear model fitted from complex survey data, as discussed in Liao and Valliant (2012). All measures are based on \widetilde{\mathbf{X}} = \mathbf{W}^{1/2}\mathbf{X}, where \mathbf{W} is the diagonal matrix of survey weights and \mathbf{X} is the n \times p matrix of covariates. In a full-rank model with p covariates, there are p condition indexes, each defined as the ratio of the maximum singular value of \widetilde{\mathbf{X}} to one of its p singular values; the largest condition index is the ratio of the maximum singular value to the minimum. Before computing condition indexes, as recommended by Belsley (1991), the columns are normalized by their individual Euclidean norms, \sqrt{\tilde{\mathbf{x}}^T\tilde{\mathbf{x}}}, so that each column has unit length. The columns are not centered around their means because centering can obscure near-dependencies between the intercept and other covariates (Belsley 1984).
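The weighting, scaling, and condition-index steps described above can be sketched in base R. This is only an illustration on simulated data, not the internal code of svycollinear; the variable names (x1, x2, Xt, Xs, cond_index) and the simulated weights are assumptions made for the example.

```r
# Minimal sketch (base R only); data and names are illustrative assumptions.
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.01)         # nearly collinear with x1
X  <- cbind(intercept = 1, x1 = x1, x2 = x2)
w  <- runif(n, 1, 5)                    # hypothetical survey weights

# X-tilde = W^{1/2} X: multiply each row by the square root of its weight
Xt <- sqrt(w) * X

# Normalize each column to unit Euclidean length, without centering
Xs <- sweep(Xt, 2, sqrt(colSums(Xt^2)), "/")

# Condition indexes: largest singular value divided by each singular value
d <- svd(Xs)$d
cond_index <- max(d) / d
cond_index
```

Because x2 is almost a copy of x1, the last condition index in this sketch is far above the others, which is the pattern the diagnostic is meant to expose.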
Variance decompositions are for the variance of each estimated regression coefficient and are based on a singular value decomposition of the variance formula. Proportions of the model variance, Var_M(\hat{\mathbf{β}}_k), associated with each column of \widetilde{\mathbf{X}} are displayed in an output matrix described below.
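As a rough illustration of the idea, the sketch below computes the classical variance decomposition proportions of Belsley, Kuh and Welsch (1980) from the singular value decomposition of the weighted, scaled matrix. It is a simplification: it does not use Vcov or the survey-adjusted decomposition of Liao and Valliant (2012), so its proportions are always nonnegative and will generally differ from what svycollinear reports. All names and the simulated data are assumptions.

```r
# Classical (non-survey) decomposition, shown only to illustrate the layout.
set.seed(2)
n  <- 200
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.01)         # nearly collinear with x1
X  <- cbind(intercept = 1, x1 = x1, x2 = x2)
w  <- runif(n, 1, 5)                    # hypothetical survey weights

Xt <- sqrt(w) * X                                  # W^{1/2} X
Xs <- sweep(Xt, 2, sqrt(colSums(Xt^2)), "/")       # unit-length columns

s <- svd(Xs)
# phi[k, j] = v_kj^2 / mu_j^2: part of Var(beta_k) tied to singular value j
phi <- sweep(s$v^2, 2, s$d^2, "/")
# Scale each coefficient's row to sum to 1, then transpose so that rows
# index singular values and columns index coefficients
Pi <- t(phi / rowSums(phi))
colnames(Pi) <- colnames(X)

round(cbind(cond.index = max(s$d) / s$d, Pi), 3)
```

In the printed matrix, the row with the largest condition index carries nearly all of the variance for x1 and x2, while the intercept is largely unaffected.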
p \times (p+1) data frame, \mathbf{Π}. The first column gives the condition indexes of \widetilde{\mathbf{X}}. Values of 10 or more are usually taken as a potential signal of collinearity among two or more columns of \widetilde{\mathbf{X}}. The remaining columns give the proportions (within columns) of the variance of each estimated regression coefficient associated with a singular value decomposition into p terms. Columns 2, …, p+1 will each approximately sum to 1. Note that some ‘proportions’ can be negative due to the nature of the variance decomposition. If two or more proportions in a given row of \mathbf{Π} are relatively large and the condition index in the first column of that row is also large, then near dependencies among the covariates associated with those proportions are influencing the regression coefficient estimates.
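The reading rule above can be applied mechanically. The sketch below uses a made-up numeric data frame with the layout described here (it is not real svycollinear output), the cutoff of 10 for condition indexes mentioned in the text, and an assumed cutoff of 0.5 for a "relatively large" proportion; that second cutoff is a choice made for this example only.

```r
# Hypothetical output: first column holds condition indexes, the remaining
# columns hold variance decomposition proportions for each coefficient.
Pi <- data.frame(
  cond.index = c(1.0, 3.2, 45.7),
  intercept  = c(0.02, 0.95, 0.03),
  x1         = c(0.00, 0.01, 0.99),
  x2         = c(0.00, 0.02, 0.98)
)

ci_cut   <- 10     # cutoff for condition indexes mentioned in the text
prop_cut <- 0.5    # assumed cutoff for a "relatively large" proportion

props <- as.matrix(Pi[, -1])

# Flag rows with a large condition index and at least two large proportions
flagged <- Pi[[1]] >= ci_cut & rowSums(props > prop_cut) >= 2

# For each flagged row, list the covariates involved in the near dependency
involved <- lapply(which(flagged),
                   function(i) colnames(props)[props[i, ] > prop_cut])
involved
```

Here only the third row is flagged, and it points to x1 and x2 as the covariates involved in the near dependency.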
Richard Valliant
Belsley, D.A., Kuh, E. and Welsch, R.E. (1980). Regression Diagnostics: Identifying Influential Data and Sources of Collinearity. New York: Wiley-Interscience.
Belsley, D.A. (1984). Demeaning conditioning diagnostics through centering. The American Statistician, 38(2), 73-77.
Belsley, D.A. (1991). Conditioning Diagnostics, Collinearity, and Weak Data in Regression. New York: John Wiley & Sons, Inc.
Liao, D. and Valliant, R. (2012). Condition indexes and variance decompositions for diagnosing collinearity in linear model analysis of survey data. Survey Methodology, 38, 189-202.
Lumley, T. (2010). Complex Surveys. New York: John Wiley & Sons.
Lumley, T. (2021). survey: analysis of complex survey samples. R package version 4.1-1.
svyvif
require(survey)
    # example from svyglm help page
data(api)
dstrat <- svydesign(id=~1, strata=~stype, weights=~pw, data=apistrat, fpc=~fpc)
m1 <- svyglm(api00~ell+meals+mobility, design=dstrat)
    # send model object from svyglm
CI.out <- svycollinear(mod = m1, w=apistrat$pw, Vcov=vcov(m1), sc=TRUE,
                       svyglm.obj=TRUE, rnd=3, fuzz=0.3)
    # send model matrix from svyglm
svycollinear(mod = m1$model, w=apistrat$pw, Vcov=vcov(m1), sc=TRUE,
             svyglm.obj=TRUE, rnd=3, fuzz=0.3)

    # use model.matrix to create matrix of covariates in model
data(nhanes2007)
newPSU <- paste(nhanes2007$SDMVSTRA, nhanes2007$SDMVPSU, sep=".")
nhanes.dsgn <- svydesign(ids = ~newPSU, strata = NULL, weights = ~WTDRD1,
                         data=nhanes2007)
m1 <- svyglm(BMXWT ~ RIDAGEYR + as.factor(RIDRETH1) + DR1TKCAL + DR1TTFAT +
             DR1TMFAT + DR1TSFAT + DR1TPFAT, design=nhanes.dsgn)
X <- model.matrix(~ RIDAGEYR + as.factor(RIDRETH1) + DR1TKCAL + DR1TTFAT +
                  DR1TMFAT + DR1TSFAT + DR1TPFAT, data = data.frame(nhanes2007))
CI.out <- svycollinear(mod = X, w=nhanes2007$WTDRD1, Vcov=vcov(m1), sc=TRUE,
                       svyglm.obj=FALSE, rnd=2, fuzz=0.3)