testPkg is an R package for fitting linear regression models using the least squares method.
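For a linear regression model of the form
$$Y = \beta_0 + \beta_1 X_1 + \dots + \beta_p X_p + \varepsilon,$$
where Y is the response variable and the X's are predictors, the estimated equation is
$$\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X_1 + \dots + \hat{\beta}_p X_p,$$
based on the observed samples of the response Y and the predictors X.
The least squares method estimates the coefficients above by minimizing the residual sum of squares, so that
$$\hat{\beta} = \arg\min_{\beta} \sum_{i=1}^{n} \left(y_i - \beta_0 - \beta_1 x_{i1} - \dots - \beta_p x_{ip}\right)^2.$$
The package uses RcppArmadillo for the matrix operations during estimation, for efficiency.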
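To make the minimization concrete, here is a minimal base R sketch that solves the normal equations $(X^\top X)\hat{\beta} = X^\top y$ for a small data set and checks the result against lm(). It illustrates the least squares method in general; it is not the package's RcppArmadillo implementation, and the variable names (X, beta_hat, rss) are chosen only for this example.

```r
# Sketch: ordinary least squares via the normal equations (base R only).
y  <- c(23, 24, 26, 37, 38, 25, 36, 40)
x1 <- c(1, 2, 3, 4, 5, 6, 7, 8)
x2 <- c(23, 32, 34, 20, 24, 56, 34, 24)

# Design matrix with a leading column of ones for the intercept.
X <- cbind(Intercept = 1, x1 = x1, x2 = x2)

# Solve (X'X) beta = X'y directly rather than forming the inverse explicitly.
beta_hat <- solve(crossprod(X), crossprod(X, y))

# Fitted values and the residual sum of squares that least squares minimizes.
fitted_vals <- drop(X %*% beta_hat)
rss <- sum((y - fitted_vals)^2)

# Reference fit from base R's lm() for comparison.
ref <- lm(y ~ x1 + x2)
coef_match <- all.equal(unname(drop(beta_hat)), unname(coef(ref)))
```

If the two fits agree, coef_match is TRUE and rss equals the sum of squared residuals reported by lm().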
Use the devtools package to install testPkg from GitHub.
library(devtools)
devtools::install_github("XuelinGu/LinearRegression", build_vignettes = TRUE)
library(testPkg)
The package has one main function, lr(). To run it, pass a formula with the response variable on the left of ~ and the predictor variables on the right. The samples can be vectors, matrices, or data frames.
y = c(23, 24, 26, 37, 38, 25, 36, 40)
x1 = c(1, 2, 3, 4, 5, 6, 7, 8)
x2 = c(23, 32, 34, 20, 24, 56, 34, 24)
x3 = c("M", "F", "F", "U", "M", "F", "U", "M")
x4 = mtcars$hp ### using "mtcars" dataframe from R
result = lr(y ~ x1 + x2 + x3)
result2 = lr(mtcars$mpg ~ cyl + disp + x4, data = mtcars)
lr() returns a list of estimates and inference results: "Coefficients", "F test and R square", "Fitted values", "Residuals", "Sum of Squares", "X inverse matrix", and "Coefficients Variance".
In addition, lr() has 4 optional arguments for more flexible use:
data: the name of the data frame that the variables come from.
coding: "reference" (default) or "means", which selects cell reference coding or cell means coding when a categorical X is included. The former takes the first group of X to appear as the reference and keeps the intercept term, while the latter removes the intercept from the model.
intercept: TRUE (default) or FALSE, which indicates whether the model contains an intercept term.
reference: 1 (default, which takes the first group of X to appear as the reference) or any group value of the included categorical X, chosen as the reference group.
result1 = lr(y ~ x1 + x2 + x3) ## defaults: coding = "reference", intercept = TRUE, reference = 1
result2 = lr(y ~ x1 + x2 + x3, coding = "means") ## use cell means coding
result3 = lr(y ~ x1 + x2 + x3, intercept = FALSE) ## eliminate the intercept term
result4 = lr(y ~ x1 + x2 + x3, reference = "F") ## specify "F" as the reference group
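The two coding schemes and the choice of reference group can be illustrated with base R's model.matrix(). This is only a sketch of the general idea, not of how lr() builds its design matrix internally. Note that base R uses the first level in sorted order as the reference, so the factor levels are set with unique() here to mimic the "first appeared group" default described above. The names g and g2 are chosen for this example.

```r
# Sketch: reference coding vs. cell means coding for a categorical predictor.
y  <- c(23, 24, 26, 37, 38, 25, 36, 40)
x3 <- c("M", "F", "F", "U", "M", "F", "U", "M")

# Order the levels by first appearance, so "M" becomes the reference group.
g <- factor(x3, levels = unique(x3))

# Reference coding: an intercept plus one dummy column per non-reference group.
X_ref <- model.matrix(~ g)
colnames(X_ref)    # "(Intercept)" "gF" "gU"

# Cell means coding: no intercept, one indicator column per group.
X_means <- model.matrix(~ g - 1)
colnames(X_means)  # "gM" "gF" "gU"

# Choosing a different reference group, here "F".
g2 <- relevel(g, ref = "F")
X_ref_f <- model.matrix(~ g2)
colnames(X_ref_f)  # "(Intercept)" "g2M" "g2U"

# With cell means coding and no other predictors, each coefficient
# is simply the mean of y within that group.
cell_means  <- coef(lm(y ~ g - 1))
group_means <- tapply(y, g, mean)
```

Under reference coding the dummy coefficients are differences from the reference group, which is why changing the reference changes the coefficients but not the fitted values.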
For full details on the usage and characteristics of this package and lr(), see the vignettes:
browseVignettes("testPkg")
or check the help page:
?lr
First, lr() produces the same results as lm(). Second, for both small and large samples without categorical variables, lr() is faster and uses less memory than the commonly used lm(). When categorical predictors are included, lr() is slightly slower than lm(). However, it still uses less memory than lm(), and it lets you choose any group as the reference.
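One way to check claims like these on your own data is a small timing harness. The sketch below uses only base R and compares lm() with a hypothetical stand-in fitter, fit_normal_eq(), which is defined here for illustration and is not part of testPkg. The helper time_fit() and the simulated data are likewise assumptions of this example.

```r
# Sketch: compare correctness and elapsed time of two fitting functions.
set.seed(1)
n <- 5000
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- 1 + 2 * dat$x1 - 0.5 * dat$x2 + rnorm(n)

# Hypothetical stand-in fitter: solves the normal equations in base R.
fit_normal_eq <- function(formula, data) {
  mf <- model.frame(formula, data)
  X  <- model.matrix(formula, mf)
  y  <- model.response(mf)
  drop(solve(crossprod(X), crossprod(X, y)))
}

# Total elapsed seconds for `reps` repeated fits of the same model.
time_fit <- function(fit, reps = 20) {
  system.time(
    for (i in seq_len(reps)) fit(y ~ x1 + x2, data = dat)
  )[["elapsed"]]
}

# Correctness: both fitters should give the same coefficients.
same_coefs <- all.equal(
  unname(fit_normal_eq(y ~ x1 + x2, dat)),
  unname(coef(lm(y ~ x1 + x2, data = dat)))
)

# Timing: elapsed seconds per fitter (values vary by machine).
timings <- c(lm = time_fit(lm), normal_eq = time_fit(fit_normal_eq))
```

With testPkg installed, lr() could be timed the same way by passing a wrapper such as function(formula, data) lr(formula, data = data) to time_fit().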