knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
The LinearRegression package aims to provide almost every piece of information needed for linear regression, except diagnostics. The LR( ) function mimics the output from lm( ), summary(lm( )), confint( ), hatvalues(lm( )), rstandard( ), and rstudent( ). The ANOVA( ) function mimics the output from anova(lm( )) function for sequential sums of squares and output from car::Anova(lm( ), type="III") function for partial sums of squares.
library(LinearRegression)
LR( ) function.LR(mpg ~ cyl + wt, mtcars) LR(mpg ~ cyl + wt, mtcars, include.intercept = FALSE)
You can either fit the linear regression model with or without the intercept. The LR( ) function does not explicitly return any result until you extract the desired output with $.
model = LR(mpg ~ cyl + wt + qsec, mtcars) names(model)
The LR( ) function is very versatile, it contains lots of information. We can choose from above to obtain the desired output. Let's demonstrate via a few examples and compare the result with the original r function to see if they yields the same results.
model$coeff_summary model.r = lm(mpg ~ cyl + wt + qsec, mtcars) summary(model.r)$coefficients all.equal(as.numeric(model$coeff_summary), as.numeric(summary(model.r)$coefficients))
Since we are only interested in whether LR( ) and lm( ) yield the same numeric results, I coerced the both outputs to "numeric" class. Through all.equal( ), we observed that the outputs are indeed equal. What about the its speed? We would next compare their speed via bench::mark( ).
library(bench) plot(mark(as.numeric(model$coeff_summary), as.numeric(summary(model.r)$coefficients)))
Using this naive built-in data set mtcars, LR( ) function perform better than lm( ) function. One of the main reason why LR( ) outperforms the built-in r function is that there is no need to call for another function for coefficients summary.
Next, let's demonstrate two more example with LR( ) and compare it with the original r function.
to.predict parameterMyPredict = LR(cyl ~ mpg + wt, mtcars, to.predict = matrix(c(mean(mtcars$mpg), 18, mean(mtcars$wt), 4), 2, 2))$predicted print(MyPredict) RPredict = predict(lm(cyl~mpg+wt,mtcars), newdata=data.frame(mpg=c(mean(mtcars$mpg),18), wt = c(mean(mtcars$wt),4))) print(RPredict) all.equal(as.numeric(MyPredict), as.numeric(RPredict)) plot(mark(as.numeric(MyPredict), as.numeric(RPredict)))
Both function yields the same predicted values;however, the built-in r function performs slightly better than LR( ) in terms of speed.
include.intercept parametermodel2 = LR(mpg ~ cyl + wt + qsec, mtcars, include.intercept = FALSE) model.r2 = lm(mpg ~ -1 + cyl + wt + qsec, mtcars) model2$CI confint(model.r2) all.equal(as.numeric(model2$CI), as.numeric(confint(model.r2))) plot(mark(as.numeric(model2$CI), as.numeric(confint(model.r2))))
Again, same numeric outputs from both functions. However, LR( ) outperforms the built-in r function because we do not need to call another function to compute the confidence interval of the fitted model.
ANOVA( ) function.We could choose either to obtain an ANOVA table for sequential sums of squares or partial sums of squares.
Note: ANOVA( ) automatically fit the model with an intercept.
ANOVA(mpg ~ cyl + wt + disp, mtcars, type = "Sequential") anova(lm(mpg ~ cyl + wt + disp, mtcars)) MyNumeric = as.numeric(unlist(ANOVA(mpg ~ cyl + wt, mtcars, type = "Sequential"))) RNumeric = as.numeric(unlist(anova(lm(mpg ~ cyl + wt, mtcars)))) all.equal(MyNumeric, RNumeric) plot(mark(as.numeric(unlist(ANOVA(mpg ~ cyl + wt, mtcars, type = "Sequential"))), as.numeric(unlist(anova(lm(mpg ~ cyl + wt, mtcars))))))
From above, we have shown that ANOVA( ) and built-in function anova( ) output the same numeric results via all.equal( ). ANOVA( ) actually outperforms anova( ) in terms of speed.
require("car") ANOVA(mpg ~ cyl + wt + disp, mtcars, type = "Partial") Anova(lm(mpg ~ cyl + wt + disp, mtcars), type = "III") MyNumeric = as.numeric(unlist(ANOVA(mpg ~ cyl + wt + disp, mtcars, type = "Partial"))) RNumeric = as.numeric(unlist(Anova(lm(mpg ~ cyl + wt + disp, mtcars), type = "III"))) all.equal(MyNumeric, RNumeric) plot(mark(as.numeric(unlist(ANOVA(mpg ~ cyl + wt + disp, mtcars, type = "Partial"))), as.numeric(unlist(Anova(lm(mpg ~ cyl + wt + disp, mtcars), type = "III")))))
Again, we have shown that ANOVA( ) and original r function Anova( ) in cars package output the same numeric results via all.equal( ). ANOVA( ) is more efficient than anova( ) in terms of speed.
Tips: Since ANOVA( ) outputs a data.frame, we can indexing or selecting desired element with all means of data.frame such as ANOVA(mpg ~ cyl + wt + disp, mtcars, type = "Sequential")["Sum Sq"].
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.