The lab4ab package provides the main function to compute a linear regression when a set of data and the variables to investigate are given. Compared to the classic linear regression function (lm()) linreg() helps to calculate all the most significative elements of a linear regression in just one step just giving the observations (contained in a dataset) and the formula. lab4ab also provides a plotting function and a summary function just as lm() does.

Linear regression function: linreg()

linreg$new(data, formula) needs two arguments:

Applying the function to those objects the dependent variable is stored in a vector Y and the independent variables in a matrix X. At the same time all the following statistics are calculated:

  1. Regression coefficients (beta_hat) : $\hat{\beta}= (X^TX)^{-1}X^Ty$

  2. The fitted values (fitted) : $\hat{y}= X\hat{\beta}$

  3. The residuals (res) : $\hat{e}= y-X\hat{\beta}$

  4. The degree of freedom (df) : $df=n-p$ , where $n$ is the number of observations and $p$ is the number of parameters in the model

  5. The residual variance (res_var) : $\hat{\sigma}^2=\frac{e^Te}{df}$

  6. The variance of the regression coefficients (reg_var) : $\hat{Var}(\hat{\beta})=\hat{\sigma}^2(X^TX)^{-1}) $

  7. The t-values for each coefficient (t_value) :$t_\beta=\frac{\hat{\beta}}{\sqrt{Var(\hat{\beta})}}$

  8. The p-values for each coefficient (p_value) : $P(|>t|)$ the probability of each coefficient to be equal to 0

linreg is a class, all the calculations have been stored in it, so to access to one of the previews statistics it may be used $ as it follows:

data <- iris
formula <- Petal.Length ~ Sepal.Width + Sepal.Length

#my_linear_reg <- linreg$new(formula, data)

#residuals <- my_linear_reg$res()

#residuals 

The package provides also three other methods which return some of the statistics showed before:

The printing method: print()

The $print() shows exactly the same data of lm() function: it provides the coefficients found in the linear regression and their names.

data <- iris

formula <- Petal.Length ~ Sepal.Width + Sepal.Length

#my_linear_reg <- linreg$new(formula, data)

#my_linear_reg$print()

Plotting

In lab4ab package there is also a method which contains two graphs: one representing the residuals vs the fitted values, the second one a scaled version of the first one, the residuals have been standardized by the formula: $$\sqrt{ \frac{|mean(res)-res|}{\sqrt{{Var(res)}}}}$$ . The line represents the trend of the observations in the graphs. The smooth has been realized with a local polynomial regression fitting with a span (parameter which controls the degree of smoothing) of 1.5. It looks very close to the one obtained by plot(lm()). The methid $plot() plots, using ggplot2 package that has to be installed before using lab4ab. The graphs respect some of the graphical criteria of Linkoping University.

#my_linear_reg$plot()

$summary()

Last method in the package is summary(). It's aim is to return a similar printout as printed for lm(). The statistics of this output are:

#my_linear_reg$summary()

Examples

data <- iris

formula <- Petal.Length ~ Species

#my_linear_reg <- linreg$new(formula, data)
knitr::kable(head(iris,10))
#my_linear_reg$print()
#my_linear_reg$summary()


brbatv/lab4 documentation built on May 3, 2019, 2:56 p.m.