qlm: Quick Linear Regression

Description

Fits a linear model after dropping independent variables according to a specified p-value threshold and Variance Inflation Factor (VIF) level. It automates the following manual steps:
1. Check the VIF first and remove the independent variables whose VIF is above the specified level.
2. Then check the p-values of the remaining independent variables and remove those above the significance level.
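
The two steps above can be sketched with base R alone. This is a minimal illustration, not the package's source: quick_select and vif_manual are hypothetical names, the VIF is computed from its textbook definition rather than through the car package, and removing one variable at a time is an assumption, since this page does not say whether qlm() drops variables singly or all at once. The sketch also assumes that all predictors are numeric.

```r
# Hypothetical helper (not part of quickregression): VIF of each predictor,
# computed as 1 / (1 - R^2) from regressing it on the other predictors.
vif_manual <- function(data, predictors) {
  sapply(predictors, function(p) {
    others <- setdiff(predictors, p)
    r2 <- summary(lm(reformulate(others, response = p), data = data))$r.squared
    1 / (1 - r2)
  })
}

# Hypothetical sketch of the two-step reduction; assumes numeric predictors.
quick_select <- function(data, response, signifi = 0.05, vifl = 5) {
  keep <- setdiff(names(data), response)

  # Step 1: drop the predictor with the largest VIF until none is above vifl.
  while (length(keep) > 1) {
    v <- vif_manual(data, keep)
    if (max(v) <= vifl) break
    keep <- setdiff(keep, names(v)[which.max(v)])
  }

  # Step 2: drop the predictor with the largest p-value until none is above signifi.
  while (length(keep) > 0) {
    fit <- lm(reformulate(keep, response = response), data = data)
    coefs <- summary(fit)$coefficients[-1, , drop = FALSE]
    p <- setNames(coefs[, "Pr(>|t|)"], rownames(coefs))
    if (max(p) <= signifi) break
    keep <- setdiff(keep, names(p)[which.max(p)])
  }

  # Fall back to an intercept-only model if every predictor was removed.
  if (length(keep) == 0) keep <- "1"
  lm(reformulate(keep, response = response), data = data)
}

a <- mtcars[, c("mpg", "disp", "hp", "drat", "wt", "qsec")]
fit <- quick_select(a, "mpg")
summary(fit)
```

Unlike qlm(), this sketch takes the dependent variable name as a quoted string, which keeps the code free of non-standard evaluation.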

The user can set the significance level and the VIF level as arguments.

Please note: the function automates the manual steps above, hence the name quick regression.
It calls the existing lm() function as is, so it does not change or speed up the core lm() fitting.
The existing arguments of lm() can also be supplied.

It is especially handy for preparing linear models on small data sets.

Usage

qlm(data, V_dependent, signifi = 0.05, vifl = 5, subset = NULL,
  weights = NULL, na.action = NULL, method = "qr", model = TRUE,
  x = FALSE, y = FALSE, qr = TRUE, singular.ok = TRUE,
  contrasts = NULL, offset = NULL)

Arguments

data

The data set, given by name (e.g. bank).

V_dependent

The name of the dependent variable. Double quotes are not needed.

signifi

The significance level for p-values in the lm model (e.g. 0.05, 0.01). Defaults to 0.05.

vifl

The variance inflation factor (VIF) level. Defaults to 5.
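
For reference, the textbook definition of the VIF for predictor j is given below, where R_j^2 is the R-squared obtained by regressing predictor j on all other predictors. This is the standard formula, stated here as background; how qlm() computes the value internally is not shown on this page.

```latex
\mathrm{VIF}_j = \frac{1}{1 - R_j^2}
```

Under this definition, a level of 5 corresponds to R_j^2 = 0.8, meaning that 80% of the variance of predictor j is explained by the other predictors.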

subset

Existing lm() function argument, an optional vector specifying a subset of observations to be used in the fitting process.

weights

Existing lm() function argument, an optional vector of weights to be used in the fitting process. Should be NULL or a numeric vector. If non-NULL, weighted least squares is used with weights weights (that is, minimizing sum(w*e^2)); otherwise ordinary least squares is used.
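
As a small illustration of what minimizing sum(w*e^2) means, the sketch below calls lm() directly rather than qlm(). The weights are arbitrary values chosen only for the example. It checks that a weighted fit gives the same coefficients as ordinary least squares on rows scaled by sqrt(w).

```r
# Arbitrary positive weights, for illustration only.
w <- mtcars$cyl

# Weighted least squares through the weights argument of lm().
fit_w <- lm(mpg ~ wt + qsec, data = mtcars, weights = w)

# Ordinary least squares on rows scaled by sqrt(w) minimizes the same sum.
X <- cbind(1, mtcars$wt, mtcars$qsec) * sqrt(w)
y <- mtcars$mpg * sqrt(w)
fit_s <- lm.fit(X, y)

# Compare the two coefficient vectors, ignoring their names.
same_coef <- all.equal(unname(coef(fit_w)), unname(fit_s$coefficients))
same_coef
```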

na.action

Existing lm() function argument, a function which indicates what should happen when the data contain NAs. The default is set by the na.action setting of options, and is na.fail if that is unset. The ‘factory-fresh’ default is na.omit. Another possible value is NULL, no action. Value na.exclude can be useful.
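
The difference between na.omit and na.exclude shows up in the residuals. The sketch below uses lm() directly on a copy of mtcars with two values set to NA purely for illustration: both fits use the same 30 complete rows, but na.exclude pads the residuals back to the original number of rows.

```r
# Copy two columns of mtcars and blank out two predictor values.
d <- mtcars[, c("mpg", "wt")]
d$wt[c(2, 5)] <- NA

fit_omit <- lm(mpg ~ wt, data = d, na.action = na.omit)
fit_excl <- lm(mpg ~ wt, data = d, na.action = na.exclude)

# na.omit drops the incomplete rows from the residuals.
n_omit <- length(residuals(fit_omit))

# na.exclude keeps one residual per original row, with NA for dropped rows.
n_excl <- length(residuals(fit_excl))
n_na <- sum(is.na(residuals(fit_excl)))
```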

method

Existing lm() function argument, the method to be used; for fitting, currently only method = "qr" is supported; method = "model.frame" returns the model frame (the same as with model = TRUE).

model, x, y, qr

Existing lm() function argument, logicals. If TRUE the corresponding components of the fit (the model frame, the model matrix, the response, the QR decomposition) are returned.

singular.ok

Existing lm() function argument, logical. If FALSE (the default in S but not in R) a singular fit is an error.

contrasts

Existing lm() function argument, an optional list.

offset

Existing lm() function argument, this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be NULL or a numeric vector of length equal to the number of cases. One or more offset terms can be included in the formula instead or as well, and if more than one are specified their sum is used.
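
To make the role of an offset concrete, the sketch below uses lm() directly. The offset values are hypothetical and serve only as an example of a component with a known coefficient. Fitting with the offset gives the same coefficients as subtracting it from the response beforehand.

```r
# Hypothetical known component: qsec enters with a fixed coefficient of 0.5.
off <- 0.5 * mtcars$qsec

# The offset is added to the linear predictor with its coefficient fixed at 1.
fit_off <- lm(mpg ~ wt, data = mtcars, offset = off)

# Equivalent fit: subtract the known component from the response first.
fit_sub <- lm(I(mpg - off) ~ wt, data = mtcars)

same_coef_off <- all.equal(unname(coef(fit_off)), unname(coef(fit_sub)))
same_coef_off
```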

Examples

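  library(quickregression)
  # Keep mpg plus five candidate predictors: disp, hp, drat, wt, qsec
  a <- mtcars[, c(1, 3, 4, 5, 6, 7)]
  # Default thresholds: signifi = 0.05, vifl = 5
  b <- qlm(a, mpg)
  summary(b)
  # Looser p-value threshold
  b <- qlm(a, mpg, signifi = 0.20)
  summary(b)
  # Looser p-value threshold and looser VIF level
  b <- qlm(a, mpg, signifi = 0.20, vifl = 20)
  summary(b)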

Example output

Loading required package: car
Loading required package: carData
[1] "Variable removed due to high VIF"
[1] "disp"

[1] "Variable removed due to high P value"
[1] "hp"   "drat"

Call:
lm(formula = as.formula(a5), data = data, subset = subset, weights = weights, 
    na.action = na.action, method = method, model = model, x = x, 
    y = y, qr = qr, singular.ok = singular.ok, contrasts = contrasts, 
    offset = offset)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.3962 -2.1431 -0.2129  1.4915  5.7486 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  19.7462     5.2521   3.760 0.000765 ***
wt           -5.0480     0.4840 -10.430 2.52e-11 ***
qsec          0.9292     0.2650   3.506 0.001500 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.596 on 29 degrees of freedom
Multiple R-squared:  0.8264,	Adjusted R-squared:  0.8144 
F-statistic: 69.03 on 2 and 29 DF,  p-value: 9.395e-12

[1] "Variable removed due to high VIF"
[1] "disp"

[1] "Variable removed due to high P value"
[1] "hp"

Call:
lm(formula = as.formula(a5), data = data, subset = subset, weights = weights, 
    na.action = na.action, method = method, model = model, x = x, 
    y = y, qr = qr, singular.ok = singular.ok, contrasts = contrasts, 
    offset = offset)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.1152 -1.8273 -0.2696  1.0502  5.5010 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  11.3945     8.0689   1.412  0.16892    
drat          1.6561     1.2269   1.350  0.18789    
wt           -4.3978     0.6781  -6.485 5.01e-07 ***
qsec          0.9462     0.2616   3.616  0.00116 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.56 on 28 degrees of freedom
Multiple R-squared:  0.837,	Adjusted R-squared:  0.8196 
F-statistic: 47.93 on 3 and 28 DF,  p-value: 3.723e-11

[1] "Variable removed due to high VIF"
NULL

[1] "Variable removed due to high P value"
[1] "disp" "hp"  

Call:
lm(formula = as.formula(a5), data = data, subset = subset, weights = weights, 
    na.action = na.action, method = method, model = model, x = x, 
    y = y, qr = qr, singular.ok = singular.ok, contrasts = contrasts, 
    offset = offset)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.1152 -1.8273 -0.2696  1.0502  5.5010 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  11.3945     8.0689   1.412  0.16892    
drat          1.6561     1.2269   1.350  0.18789    
wt           -4.3978     0.6781  -6.485 5.01e-07 ***
qsec          0.9462     0.2616   3.616  0.00116 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.56 on 28 degrees of freedom
Multiple R-squared:  0.837,	Adjusted R-squared:  0.8196 
F-statistic: 47.93 on 3 and 28 DF,  p-value: 3.723e-11

quickregression documentation built on May 2, 2019, 8:55 a.m.