knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
library(stats) library(datasets) library(zmjianglm)
The main function mylm
is trying to do the linear regression procedures including estimation and inference and get some important values. mylm
combines lm
and summary.lm
in pacakge stats
together. Users can choose to print or not to print the output and can also choose either a simple version or a complete version to print. The simple version is similar to the output of lm
and the complete version is similar to the output of summary.lm
. The function would return a list of elements which is almost the same as summary.lm
no matter what printing style is chosen.
And the function predictlm
is written for get predicted value in a new data set based on the fitted model.
The example data set is mtcars
, which is included in package datasets
. More information could be viewed by:
?mtcars
mylm
To use the function mylm
:
mylm(mpg~wt,mtcars,"nothing")
To print out the simple version of output:
mylm(mpg~wt,mtcars,"simple")
To print out the complete version of output:
mylm(mpg~wt,mtcars,"summary")
Do multiple linear regression:
mylm(mpg~wt+drat,mtcars,"summary")
Add categorical covariate:
mylm(mpg~wt+drat+factor(cyl),mtcars,"summary")
Add interaction term:
mylm(mpg~wt+drat+I(wt*drat),mtcars,"summary")
Do transformation on covariate:
mylm(mpg~wt+drat+I(wt^2),mtcars,"summary")
predictlm
To use function predictlm
n <- nrow(mtcars) train_n <- floor(n*0.7) traindata <- mtcars[1:train_n, ]#the original data set testdata <- mtcars[(train_n+1):n, ]#the new data set mylm1 <- mylm(mpg~wt,traindata,"nothing") predicted_value <- predictlm(mylm1,testdata)
The correctness is tested by testthat
. All the tests are passed. Basically, I compared the result of summary.lm
and mylm
as well as the result of predict.lm
and predictlm
. The example of correctness testing is as the following:
#mylm summary(lm(mpg~wt+drat,mtcars)) cat("--------","\n") mylm(mpg~wt+drat,mtcars,"summary") cat("--------","\n") all.equal(unname(mylm(mpg~wt+drat,mtcars,"nothing")$coefficients[,1]),unname(lm(mpg~wt+drat,mtcars)$coefficients), tolerance = 1e-6)
#predictlm n <- nrow(mtcars) train_n <- floor(n*0.7) traindata <- mtcars[1:train_n, ]#the original data set testdata <- mtcars[(train_n+1):n, ]#the new data set mylm1 <- mylm(mpg~wt,traindata,"nothing") lm1 <- lm(mpg~wt,traindata) all.equal(sum(predictlm(mylm1,testdata)-predict(lm1,testdata)), 0, tolerance = 1e-6)
I compare the efficiency between mylm
and summary.lm
. mylm
has a lower efficiency.
system.time(for(i in 1:100) mylm(mpg~wt+drat,mtcars,"nothing")) system.time(for(i in 1:100) summary(lm(mpg~wt+drat,mtcars)))
I also compare the efficiency between predictlm
and predict.lm
. The efficiency of predictlm
is a bit lower than predict.lm
.
n <- nrow(mtcars) train_n <- floor(n*0.7) traindata <- mtcars[1:train_n, ]#the original data set testdata <- mtcars[(train_n+1):n, ]#the new data set system.time(for(i in 1:100) predictlm(mylm(mpg~wt+drat,mtcars,"nothing"),testdata)) system.time(for(i in 1:100) predict(lm(mpg~wt+drat,traindata),testdata))
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.