knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(linearRegression)

This introduction of 'linearReg' function will use the 'mtcars' data set built into R, which has 32 observations and 11 variables.

head(mtcars)

Set up before implementation of 'linearReg' function:

Since we need a matrix of the explanatory variables, we will extract the explanatory variables of the data set into a matrix and the response variable as a vector. In this example, the explanatory variables of interest will be 'cyl' (number of cylinder), and 'wt' (weight of vehicle) and the responding variable will be 'mpg' (Miles/(US) gallon).

##Extract X-> select cyl (column 2) and wt (column 6) variables
X=as.matrix(mtcars[c("cyl", "wt")])

##Extract y-> mpg (column 1)
y=as.vector(mtcars["mpg"])

Using the function 'linearReg':

The model of interest in this example is: mpg = B0 + B1cyl + B2wt

linearReg(y,X)

Comparison of the 'lm()' function summary to 'linearReg':

The summary of the 'lm()' function is very similar to 'linearReg' with a few exceptions:

  1. 'linearReg' includes 95% Confidence Intervals, Variance Inflation Factor (VIF), and the input is a matrix and vector

  2. 'lm()' includes Residual Standard Error and F-statistic information and the input is a model formula

The summary of the 'lm()' function in the stats package for comparison:

#original 'lm' function
model=lm(mpg~cyl+wt, data=mtcars)
summary(model)

The values that are shared within both function's outputs appear to be executed with the same amount of precision. The 'summary(lm())' function has the same output values as 'linearReg' only the values of the former are rounded to the ten-thousandth decimal point.

Comparison of the VIF from 'linearReg' to the VIF of 'lm()' using 'vif' function in the car package:

#VIF of 'linearReg'
linearReg(y,X)$VIF

#VIF of lm
car::vif(model)

The above demonstrates that the 'linearReg' function accurately estimates the variance inflation factors of the covariates in comparison to the 'vif' function.

To compare the efficiency between 'linearReg' and 'summary(lm())':

system.time(linearReg(y,X))
system.time(summary(lm(mpg~cyl+wt, data=mtcars)))

bench::mark(linearReg(y,X))
bench::mark(summary(lm(mpg~cyl+wt, data=mtcars)))

The 'linearReg' function and 'summary(lm())' function appear to have similar system time elapses indicating similar efficiency. Benchmarking the functions with 'bench::mark()' further justifies that the efficiency is comparable between the two functions.

As stated above, 'linearReg' and 'summary(lm())' functions are very similar but have some differences in outputs which can potentially contribute to small discrepancies when benchmarking the efficiency of these functions. Again, these differences include: 'linearReg' additionally outputs 95% confidence intervals and variance inflation factors while 'summary(lm())' additionally outputs residual standard error and F-statistic information.



pangoria/linearRegression documentation built on Dec. 22, 2021, 6:39 a.m.