knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width = 6, fig.height = 4 )
library(lm.hw4) library(bench)
The lm.hw4 package aims to estimate and help interpret the linear relations between predictor variables and response variables:
It estimates the parameters of the linear regression model.
It summarizes the properties of fitted models.
It computes analysis of variance (or deviance) tables for fitted model objects.
This document introduces you to lm.hw4's basic set of tools.
To explore the usage of lm.hw4 package, we use simulation data.
n = 10; p = 3; q = 2; x = matrix(rnorm(n * p), n, p) # no intercept y1 = rnorm(n) y2 = matrix(rnorm(n * q), n, q)
lm_fit()lm_fit() is used to fit linear regression model between a response vector/matrix Y and a predictor matrix X. The function return a list of response values.
Before fitting the model, you have to decide whether a coefficient term should be added to the model. If yes, add 'add.intercept = TRUE' to the function statement.
There are two methods supported, "inv" and "qr".
In the following example, we fit the linear regression model without adding an intercept term using method "qr".
lm_fit(x, y1, add.intercept = FALSE, method = "qr")
lm_summary.lm_summary is used to calculate properties of a fitted model. This function only support cases when the response variable is a vector or a single column matrix and the method used to fit the model is "qr". The function return a list of response values.
If you want to calculate the correlation matrix of the coefficients, add 'correlation = TRUE' to the function statement.
z = lm_fit(x, y1, add.intercept = FALSE, method = "qr") lm_summary(z, correlation = TRUE)
lm_anova.lm_anova is used to computes analysis of variance tables for fitted model objects. This function only support cases when the response variable is a vector or a single column matrix and the method used to fit the model is "qr". The function return a list of response values.
lm_anova(z)
# Simulation Data n = 1000; p = 10; x = matrix(rnorm(n * p), n, p) # no intercept y = rnorm(n)
Since the methods for estimating the parameters are different, there might be a slightly difference (around 1e-15) between the results of lm.hw4 functions and the original R functions.
Correctness is tested through the unit tests. In unit tests, 3 cases are included: simulation data of small sample size, simulation data of big sample size and real data. Each test case compares the output values of lm.hw4 functions and the original R function. All test cases are passed.
lm_fit v.s. lm.fitresult = bench::mark(lm_fit(x,y, method = "inv"), lm_fit(x,y, method = "qr"), lm.fit(x,y), check = FALSE, filter_gc = FALSE, min_time = 3) print(result) plot(result)
From the above results, we can see that the original R function is the most effective method because it use "qr" method and rewrite "qr()" function with Fortan. Compare lm_fit() function using two methods, the implementations using "qr" method are 1 times faster than the implementations using "inv" method because calculating a inverse matrix with high dimensions is time consuming. To improve the effectness of the lm_fit() function, the "qr()" function should be rewrite in C++.
lm_summary v.s. summary.lmz1 = lm_fit(x,y, method = "qr") z2 = lm(y~0+x) result = bench::mark(lm_summary(z1),summary.lm(z2), check = FALSE, filter_gc = FALSE, min_time = 3) print(result) plot(result)
From the above results, we can see that lm_summary() function is faster than the original R function.
lm_anova v.s. anovaz1 = lm_fit(x,y, method = "qr") z2 = lm(y~0+x) result = bench::mark(lm_anova(z1),anova(z2), check = FALSE, filter_gc = FALSE, min_time = 3) print(result) plot(result)
From the above results, we can see that lm_summary() function is faster than the original R function anova. The reason is that the original R function anova is not only designed for linear regression cases, which means that it has to transform and manipulate the linear regression result.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.