lm.Bios625.Package.2021

Description:

This is a toy package for an assignment in Bios 625. The package contains a single function, lm_s(), which mimics R's lm() function.

The lm_s() function fits a linear model to a given data set. It automatically prints a summary of the coefficients and the ANOVA table of the model.

The function may not handle interaction terms correctly and cannot properly fit a model without an intercept.
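As a rough illustration of what fitting a linear model with an intercept involves, here is a minimal sketch of ordinary least squares via the normal equations. It is not the package's implementation: `ols_sketch` is a hypothetical helper written only for this example, and it returns just the estimates and standard errors rather than the full summary and ANOVA table.

```r
# Hypothetical helper, not part of the package: ordinary least squares
# via the normal equations, for a formula that includes an intercept.
ols_sketch <- function(formula, data) {
  mf <- model.frame(formula, data)        # evaluate the formula against the data
  X  <- model.matrix(attr(mf, "terms"), mf)  # design matrix, intercept column included
  y  <- model.response(mf)                # response vector

  XtX  <- crossprod(X)                    # X'X
  beta <- solve(XtX, crossprod(X, y))     # solves (X'X) beta = X'y
  res  <- drop(y - X %*% beta)            # residuals
  df   <- nrow(X) - ncol(X)               # residual degrees of freedom
  se   <- sqrt(diag(solve(XtX)) * sum(res^2) / df)

  cbind(Estimate = drop(beta), `Std. Error` = se)
}

fit <- ols_sketch(mpg ~ hp + wt, data = mtcars)
ref <- coef(summary(lm(mpg ~ hp + wt, data = mtcars)))

# Compare the sketch against the first two columns of lm()'s coefficient table.
all.equal(fit, ref[, 1:2])
```

The `all.equal()` call compares the values up to numerical tolerance, which is a more reliable check than reading two printed tables side by side.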

Usage:

data(mtcars)
d <- mtcars
m <- lm_s(mpg ~ hp + wt, data = d)

Correctness:

We can compare the result of lm_s() above with that of lm() below. They are identical.

n <- lm(mpg ~ hp + wt, data = d)
summary(n)
anova(n)

Efficiency:

The lm_s() function is far less efficient than R's lm(), even though most of its operations have been vectorized.

Benchmark using the profvis package:

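library(profvis)

profvis({
    for (i in 1:1000) {
        # simulate 1000 observations with two predictors
        x <- rnorm(1000)
        y <- rnorm(1000)
        z <- rnorm(1000)
        m <- lm(y ~ x + z)        # base R reference
        n1 <- lm_s(y ~ x + z)     # first implementation
        n2 <- lm_s2(y ~ x + z)    # more compact version
    }
})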

lm_s is the first implementation; lm_s2 is a more compact version that is more heavily vectorized and does not print automatically. Both functions are hundreds of times slower than lm() and also consume much more memory.

The most time- and memory-consuming step is the calculation involving the hat matrix, as it is an n × n matrix.
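With n = 1000 as in the benchmark, an n × n matrix of doubles holds 10^6 entries, roughly 8 MB per copy. The sketch below shows one common way to avoid forming it, assuming the hat matrix is only needed for fitted values and leverages: a thin QR factorisation of the design matrix. This is an illustration of the general technique, not a description of how lm_s() or lm_s2() work internally, and all variable names here are made up for the example.

```r
set.seed(1)
n <- 1000
X <- cbind(1, rnorm(n), rnorm(n))   # design matrix with an intercept column
y <- rnorm(n)

# Explicit hat matrix H = X (X'X)^{-1} X': stores n * n doubles.
H <- X %*% solve(crossprod(X)) %*% t(X)

# Thin QR factorisation X = QR gives H = QQ' for a full-rank X,
# so only the n * p matrix Q has to be stored.
Q <- qr.Q(qr(X))
fitted_qr   <- drop(Q %*% crossprod(Q, y))  # same quantity as H %*% y
leverage_qr <- rowSums(Q^2)                 # same quantity as diag(H)

# Check both routes agree up to numerical tolerance.
all.equal(fitted_qr, drop(H %*% y))
all.equal(leverage_qr, diag(H))

# Compare memory footprints of the two representations.
print(object.size(H), units = "MB")
print(object.size(Q), units = "MB")
```

Here Q has n rows and only as many columns as there are coefficients, so its size grows linearly in n rather than quadratically.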



yt-pan/lm.Bios625.Package.2021 documentation built on Dec. 23, 2021, 8:18 p.m.