```r
knitr::opts_chunk$set(echo = TRUE)
library(lmcoef)
```
Assume we have a vector of length n for the response variable, `y`, and an n × m matrix of independent variables, `x`:
```r
y = c(78.5, 74.3, 104.3, 87.6, 95.9, 109.2, 102.7, 72.5, 93.1, 115.9, 83.8, 113.3, 109.4)
x = cbind(X1 = c(7, 1, 11, 11, 7, 11, 3, 1, 2, 21, 1, 11, 10),
          X2 = c(26, 29, 56, 31, 52, 55, 71, 31, 54, 47, 40, 66, 68),
          X3 = c(6, 15, 8, 8, 6, 9, 17, 22, 18, 4, 23, 9, 8),
          X4 = c(60, 52, 20, 47, 33, 22, 6, 44, 22, 26, 34, 12, 12))
data = data.frame(cbind(y, x))
```
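The vignette does not show `lm_coef`'s internals. As a point of reference, the ordinary-least-squares fit for this data can be sketched in base R alone; the QR-based solve below is an assumption about the approach (it is what `lm` itself uses), not `lm_coef`'s actual code:

```r
# Hypothetical sketch: solve the least-squares problem directly with a QR
# decomposition, the same numerical approach lm() uses internally.
y <- c(78.5, 74.3, 104.3, 87.6, 95.9, 109.2, 102.7, 72.5, 93.1, 115.9, 83.8, 113.3, 109.4)
x <- cbind(X1 = c(7, 1, 11, 11, 7, 11, 3, 1, 2, 21, 1, 11, 10),
           X2 = c(26, 29, 56, 31, 52, 55, 71, 31, 54, 47, 40, 66, 68),
           X3 = c(6, 15, 8, 8, 6, 9, 17, 22, 18, 4, 23, 9, 8),
           X4 = c(60, 52, 20, 47, 33, 22, 6, 44, 22, 26, 34, 12, 12))

X <- cbind(Intercept = 1, x)   # design matrix with an intercept column
beta_hat <- qr.solve(X, y)     # least-squares solution via QR
all.equal(as.numeric(beta_hat), as.numeric(coef(lm(y ~ x))))
```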
To use the function `lm_coef`:

```r
rst = lm_coef(y, x)
```
To compare the estimated coefficients from this new implementation with those from the original `lm` function:

```r
lm_est = summary(lm(y ~ ., data))$coefficients
lm_coef_est = rst$coefficients
row.names(lm_est) <- NULL
colnames(lm_est) <- NULL
row.names(lm_coef_est) <- NULL
colnames(lm_coef_est) <- NULL
all.equal(lm_est, lm_coef_est)
```
Since an `anova` object must have column names, to compare the ANOVA table from this new implementation with that from the original `lm` function we simply inspect the two outputs and see that they agree:

```r
lm_anova = anova(lm(y ~ ., data))
lm_coef_anova = rst$anova
lm_anova
lm_coef_anova
```
To compare the residuals of this new implementation with those of the original `lm` function:

```r
lm_res = as.vector(summary(lm(y ~ x))$residuals)
lm_coef_res = as.vector(rst$residuals)
all.equal(lm_res, lm_coef_res)
```
To compare the fitted values of this new implementation with those of the original `lm` function (note that `summary.lm` has no `fitted.values` component, so they are taken from the `lm` object directly):

```r
lm_fit = as.vector(lm(y ~ x)$fitted.values)
lm_coef_fit = as.vector(rst$fitted.values)
all.equal(lm_fit, lm_coef_fit)
```
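As a cross-check independent of the package, fitted values and residuals are just the projection of `y` onto the column space of the design matrix and its complement. A base-R sketch on the same data (an illustration, not `lm_coef`'s actual implementation):

```r
# Sketch: recover fitted values and residuals from the QR decomposition
# of the design matrix. Data repeated from the beginning of the vignette.
y <- c(78.5, 74.3, 104.3, 87.6, 95.9, 109.2, 102.7, 72.5, 93.1, 115.9, 83.8, 113.3, 109.4)
x <- cbind(X1 = c(7, 1, 11, 11, 7, 11, 3, 1, 2, 21, 1, 11, 10),
           X2 = c(26, 29, 56, 31, 52, 55, 71, 31, 54, 47, 40, 66, 68),
           X3 = c(6, 15, 8, 8, 6, 9, 17, 22, 18, 4, 23, 9, 8),
           X4 = c(60, 52, 20, 47, 33, 22, 6, 44, 22, 26, 34, 12, 12))

qrX <- qr(cbind(1, x))            # QR of the design matrix with intercept
fit_manual <- qr.fitted(qrX, y)   # projection of y onto col(X)
res_manual <- qr.resid(qrX, y)    # y minus its projection
all.equal(fit_manual, unname(fitted(lm(y ~ x))))
all.equal(res_manual, unname(residuals(lm(y ~ x))))
```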
We compare the speed of these two implementations; the new implementation is slower than the original function:

```r
library(microbenchmark)
microbenchmark(summary(lm(y ~ x))$coefficients, lm_coef(y, x)$coefficients)
microbenchmark(anova(lm(y ~ x)), lm_coef(y, x)$anova)
microbenchmark(summary(lm(y ~ x))$residuals, lm_coef(y, x)$residuals)
microbenchmark(lm(y ~ x)$fitted.values, lm_coef(y, x)$fitted.values)
```
`cpr_coef` is an integrated function that performs single-step estimation of constrained polynomial regression. Currently, no function like this is included in any R package. The conventional implementation is based on the `lm` function and requires extra calculation before `lm` can finally be called, so the new implementation saves considerable time when the user faces a situation like the following example.
As an example, assume we want to fit a polynomial regression with power (`p`) of 3, and we have a vector of outcome data `y`:

```r
p = 3
y = c(0.5, 1, 1.5)
```
Suppose the following constrained model is proposed: for $i = 1,2,\dots,n$ and $\epsilon_i \sim N(0,\sigma^2)$,

$$Y_i = \beta_0 + \sum_{j=1}^p \beta_j X_i^j + \epsilon_i$$

where the coefficients satisfy $\beta_j = \beta_{j-1} / j$ for $j = 2,\dots,p$, and the predictor is $X_i = i / n$. Under this constraint,
the function `cpr_coef` can be applied directly:

```r
rst = cpr_coef(p, y)
```
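One way to see why a single-step fit is possible: the constraint $\beta_j = \beta_{j-1}/j$ implies $\beta_j = \beta_1 / j!$, so the model collapses to a simple regression of `y` on $g(X_i) = \sum_{j=1}^p X_i^j / j!$. A minimal base-R sketch of this reparameterization (an illustration, not `cpr_coef`'s actual code):

```r
# Hypothetical sketch of the reparameterized fit for p = 3, y = c(0.5, 1, 1.5).
p <- 3
y <- c(0.5, 1, 1.5)
n <- length(y)
x <- (1:n) / n                                             # X_i = i / n
g <- rowSums(sapply(1:p, function(j) x^j / factorial(j)))  # g(X_i)
fit <- lm(y ~ g)                                           # one simple regression
b <- unname(coef(fit))
beta <- c(b[1], b[2] / factorial(1:p))  # beta_0, then beta_j = beta_1 / j!
beta
# The constraint holds by construction: beta[3] == beta[2] / 2, beta[4] == beta[3] / 3.
```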
To compare the estimated coefficients from this new implementation with the conventional `lm`-based approach (the calls stripping row and column names from the plain coefficient vector have been dropped as no-ops):

```r
# Using the lm function
lm_res <- function(p, y) {
  n = length(y)
  x = (1:n) / n
  x_final = x / p
  ## apply Horner's rule to compute the polynomial ##
  if (p > 1) {
    for (j in seq(p - 1, 1, by = -1)) {
      x_final = x_final + 1
      x_final = x / j * x_final
    }
  }
  b1 = summary(lm(y ~ x_final))$coefficients[2, 1]
  b0 = summary(lm(y ~ x_final))$coefficients[1, 1]
  beta_n = c(b1)
  for (k in 1:(p - 1)) {
    beta_n = append(beta_n, beta_n[k] / (k + 1))
  }
  beta = c(b0, beta_n)
  return(beta)
}

lm_est = lm_res(p, y)
cpr_coef_est = as.vector(cpr_coef(p, y)$coefficients)
all.equal(lm_est, cpr_coef_est)
```
Comparing the speed of these two implementations shows that the `cpr_coef` function saves considerable time:

```r
library(microbenchmark)
microbenchmark(lm_res(p, y), cpr_coef(p, y)$coefficients)
```