multiCol: Collinearity detection in a linear regression model


Description

The function collects all the measures available in the package multiColl for detecting worrying multicollinearity.

Usage

multiCol(X, dummy = F, pos = NULL)

Arguments

X

A numeric design matrix that should contain more than one regressor (intercept included).

dummy

A logical value that indicates if there are dummy variables in the design matrix X. By default dummy=F.

pos

A numeric vector that indicates the position of the dummy variables, if these exist, in the design matrix X. By default pos=NULL.

Value

If X contains two independent variables (intercept included), see the SLM function.

If X contains more than two independent variables (intercept included):

CV

Coefficients of variation of the quantitative variables in X.

Prop

Proportion of ones in the dummy variables, reported as a percentage.

R

Correlation matrix of the quantitative variables in X.

detR

Determinant of the correlation matrix of the quantitative variables in X.

VIF

Variance Inflation Factors of the quantitative variables in X.

CN

Condition Number of X.

ki

Stewart's index of the quantitative variables in X.
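To make some of these measures concrete, the following base R sketch computes a coefficient of variation, the determinant of a correlation matrix, and variance inflation factors by hand. It is only an illustration under stated assumptions: `cv_pop` is a hypothetical helper (not part of the package) that assumes the population standard deviation, and the VIFs are taken as the diagonal of the inverse correlation matrix, a standard identity that is not necessarily how the package computes them. The correlation matrix is copied from the first example output further down this page.

```r
# Hypothetical helper (not from the package): coefficient of variation
# using the population standard deviation (divisor n); this is an assumption.
cv_pop <- function(x) {
  sqrt(mean((x - mean(x))^2)) / abs(mean(x))
}

# Small illustrative vector: mean 5, population standard deviation 2.
x <- c(2, 4, 4, 4, 5, 5, 7, 9)
cv_value <- cv_pop(x)

# Correlation matrix of income and relprice, copied from the example output.
R <- matrix(c(1,         0.1788467,
              0.1788467, 1),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("income", "relprice"),
                            c("income", "relprice")))

det_R <- det(R)         # determinant of the correlation matrix
vif   <- diag(solve(R)) # VIFs as the diagonal of the inverse of R

print(cv_value)
print(det_R)
print(vif)
```

With only two quantitative variables, both VIFs equal 1 / det(R), which is why the example output reports the same VIF for income and relprice.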

Note

For more detail, see the help of the functions in See Also.

Author(s)

R. Salmerón (romansg@ugr.es) and C. García (cbgarcia@ugr.es).

References

L. R. Klein and A. S. Goldberger (1964). An economic model of the United States, 1929-1952. North Holland Publishing Company, Amsterdam.

H. Theil (1971). Principles of Econometrics. John Wiley & Sons, New York.

See Also

SLM, CV, PROPs, RdetR, VIF, CN, ki.

Examples

# Henri Theil's textile consumption data modified
data(theil)
head(theil)
cte = array(1,length(theil[,2]))
theil.X = cbind(cte,theil[,-(1:2)])
multiCol(theil.X, TRUE, pos = 4)

# Klein and Goldberger data on consumption and wage income
data(KG)
head(KG)
cte = array(1,length(KG[,1]))
KG.X = cbind(cte,KG[,-1])
multiCol(KG.X)

# Simulated data: intercept plus one quantitative regressor
x1 = array(1,25)
x2 = rnorm(25,100,1)
x = cbind(x1,x2)
head(x)
multiCol(x)

# Simulated data: intercept plus one dummy regressor
x1 = array(1,25)
x2 = sample(cbind(array(1,25),array(0,25)),25)
x = cbind(x1,x2)
head(x)
multiCol(x, TRUE)

Example output

      obs consume income relprice twentys
[1,] 1923    99.2   96.7    101.0       1
[2,] 1924    99.0   98.1    100.1       1
[3,] 1925   100.0  100.0    100.0       1
[4,] 1926   111.6  104.9     90.6       1
[5,] 1927   122.2  104.9     86.5       1
[6,] 1928   117.6  109.5     89.7       1
$`Coeficients of Variation`
[1] 0.04993766 0.21441845

$`Proportion of ones in the dummys variable`
[1] 41.17647

$`R and det(R)`
$`R and det(R)`$`Correlation matrix`
            income  relprice
income   1.0000000 0.1788467
relprice 0.1788467 1.0000000

$`R and det(R)`$`Correlation matrix's determinant`
[1] 0.9680139


$`Variance Inflation Factors`
  income relprice 
1.033043 1.033043 

$CN
$CN$`Condition Number without intercept`
[1] 24.15423

$CN$`Condition Number with intercept`
[1] 53.39671

$CN$`Increase (in percentage)`
[1] 54.76458


$ki
$ki$`Stewart index`
[1] 403.20963 415.28266  23.50258

$ki$`Proportion of essential collinearity in i-th independent variable (without intercept)`
[1] 0.2487566 4.3954455

$ki$`Proportion of non-essential collinearity in i-th independent variable (without intercept)`
[1] 99.75124 95.60455


  consumption wage.income non.farm.income farm.income
1        62.8       43.41           17.10        3.96
2        65.0       46.44           18.65        5.48
3        63.9       44.35           17.09        4.37
4        67.5       47.82           19.28        4.51
5        71.3       51.02           23.24        4.88
6        76.6       58.71           28.11        6.37
$`Coeficients of Variation`
[1] 0.2660921 0.2503487 0.2867863

$`Proportion of ones in the dummys variable`
[1] "At least one qualitative independent variable are needed (excluding the intercept)"

$`R and det(R)`
$`R and det(R)`$`Correlation matrix`
                wage.income non.farm.income farm.income
wage.income       1.0000000       0.9431118   0.8106989
non.farm.income   0.9431118       1.0000000   0.7371272
farm.income       0.8106989       0.7371272   1.0000000

$`R and det(R)`$`Correlation matrix's determinant`
[1] 0.03713592


$`Variance Inflation Factors`
    wage.income non.farm.income     farm.income 
      12.296544        9.230073        2.976638 

$CN
$CN$`Condition Number without intercept`
[1] 30.2987

$CN$`Condition Number with intercept`
[1] 35.88644

$CN$`Increase (in percentage)`
[1] 15.57062


$ki
$ki$`Stewart index`
[1]  17.86327 185.96422 156.50013  39.16836

$ki$`Proportion of essential collinearity in i-th independent variable (without intercept)`
[1] 6.612317 5.897805 7.599598

$ki$`Proportion of non-essential collinearity in i-th independent variable (without intercept)`
[1] 93.38768 94.10219 92.40040


     x1        x2
[1,]  1 102.53548
[2,]  1 101.20446
[3,]  1  99.75182
[4,]  1 100.68300
[5,]  1  98.74066
[6,]  1 100.46400
$`Coeficient of Variation`
[1] 0.009759725

$`Variance Inflation Factor`
[1] 1

$`Condition Number`
[1] 204.9287

$`Stewart index`
[1] 10499.44 10499.44

     x1 x2
[1,]  1  1
[2,]  1  1
[3,]  1  1
[4,]  1  1
[5,]  1  0
[6,]  1  0
$`Proportion of ones in the dummy variable`
[1] 44

$`Condition Number`
[1] 2.222711
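
The condition number in this last output can be checked by hand. The sketch below assumes that each column of X is scaled to unit length before the singular values are compared; that scaling is an assumption about what the function does internally, and `cn_unit_length` is a hypothetical helper, not a function of the package. Because the dummy above was drawn at random, the sketch uses a fixed dummy with the same proportion of ones (11 out of 25, that is, 44 percent).

```r
# Hypothetical helper (not from the package): condition number of X after
# scaling every column to unit Euclidean length.
cn_unit_length <- function(X) {
  X <- as.matrix(X)
  X_scaled <- sweep(X, 2, sqrt(colSums(X^2)), "/")  # unit-length columns
  sv <- svd(X_scaled)$d                             # singular values
  max(sv) / min(sv)
}

n <- 25
d <- c(rep(1, 11), rep(0, 14))     # 11 ones out of 25 observations
X <- cbind(cte = rep(1, n), d = d) # intercept plus dummy

cn <- cn_unit_length(X)
print(round(cn, 6))
```

Under this assumption the result depends only on the proportion of ones in the dummy, not on their order, so it agrees with the value printed above.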

multiColl documentation built on May 2, 2019, 4:53 p.m.