library(bench)
library(stats)
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(hw4)

To use the function lm2 and print_lm2:

lm2 is a function that is designed to perform linear model regressions. print_lm2 is a function that is designed to print the output from lm2. A data set containing information regarding red wine quality named "wine" is included in this package for examples and tutorials.

Call the function lm2 and use print_lm2 to print output
Note: na.action defaults to omit if call is excluded

print_lm2(lm2(formula = quality ~ volatile.acidity + citric.acid + residual.sugar, data = wine))

Use na.action impute

# note slight difference from previous example: we've imputed missing values with column means
print_lm2(lm2(formula = quality ~ volatile.acidity + citric.acid + residual.sugar, data = wine,
    na.action='impute'))

Use na.action fail

# we expect an error as the dataset wine contains missing variables
# lm2(formula = quality ~ volatile.acidity + citric.acid + residual.sugar, data = wine, 
    #na.action = 'fail')

Additional information exported
Above, we can see that the formula and data calls are required. There is additional information exported "invisibly". To access this information, including residuals, degrees of freedom, and rank etc, we must index our saved output using the $:

# save the output
saved_output <- lm2(formula = quality ~ volatile.acidity + citric.acid + residual.sugar, 
                    data = wine) 

saved_output$rank

Two ways to include interaction terms

1) Using * includes main effects and interaction

print_lm2(lm2(formula = quality ~ volatile.acidity * citric.acid , data = wine))
  1. Using : includes only the interaction and omits main effects
print_lm2(lm2(formula = quality ~ volatile.acidity : citric.acid , data = wine))

Benchmark lm2 against lm:

In this section, we will compare the outputs of lm2 against current method lm. To begin, we will save the outputs from both lm2 and lm functions.

quality_lm2 <- lm2(formula = quality ~ volatile.acidity + citric.acid + residual.sugar, 
                   data = wine)

quality_lm <- lm(formula = quality ~ volatile.acidity + citric.acid + residual.sugar, data = wine)

Below, we use all.equal and mark functions from base R and bench package, respectively, to compare outputs from lm2 and lm.

Compare: Coefficients

# Note default na.action matches lm function
all.equal(quality_lm2$coefficients, quality_lm$coefficients) 
bench::mark(quality_lm2$coefficients, quality_lm$coefficients) 

Compare: Residuals

all.equal(quality_lm2$residuals, quality_lm$residuals) 
bench::mark(quality_lm2$residuals, quality_lm$residuals)

Compare: Rank

all.equal(quality_lm2$rank, quality_lm$rank) 
bench::mark(quality_lm2$rank, quality_lm$rank)

Compare: Fitted Values

all.equal(quality_lm2$fitted.values, quality_lm$fitted.values) 
bench::mark(quality_lm2$fitted.values, quality_lm$fitted.values) 

Compare: Model

all.equal(as.matrix(quality_lm$model), as.matrix(quality_lm2$model))
bench::mark(as.matrix(quality_lm$model), as.matrix(quality_lm2$model))

To use the function summarylm2() and print_summarylm2:

summarylm2 is a function that is designed to take an output of function lm2 and deliver additional information that may be useful when executing a linear regression model. For these examples, we will occasionally use the output from an earlier example using lm2 named quality_lm2.

Two ways to call summarylm2

summarylm2(quality_lm2)
summarylm2(lm2(formula = quality ~ volatile.acidity + citric.acid + residual.sugar, 
                   data = wine))

Print Output

print_lm2(summarylm2(quality_lm2))

Additional information exported
Above, we can see that it is necessary to use summarylm2 in conjunction with lm2. Similar to before, there is additional information exported "invisibly". To access this information, including residual table (in quartiles), coefficients, standard error, t-values, and p-values etc, we must index our saved output using the $:

# save the output
saved_output <- summarylm2(lm2(formula = quality ~ volatile.acidity + citric.acid + residual.sugar, data = wine))

saved_output$coefficients

Benchmark summarylm2 against summary:

In this section, we will compare the outputs of summarylm2 used in conjunction with lm2 against current method, summary used in conjunction to lm. To begin, we will save the outputs from both lm2 and lm functions.

quality_summarylm2 <- summarylm2(quality_lm2)

quality_summarylm <- summary(quality_lm)

Compare: Coefficients

all.equal(as.matrix(quality_summarylm2$coefficients), as.matrix(quality_summarylm$coefficients)) 
bench::mark(as.matrix(quality_summarylm2$coefficients), as.matrix(quality_summarylm$coefficients))

Compare: R Squared

all.equal(as.matrix(quality_summarylm2$r.squared), as.matrix(quality_summarylm$r.squared)) 
bench::mark(as.matrix(quality_summarylm2$r.squared), as.matrix(quality_summarylm$r.squared))

Compare: Adjusted R Squared

all.equal(quality_summarylm2$adj.r.squared, quality_summarylm$adj.r.squared)
bench::mark(quality_summarylm2$adj.r.squared, quality_summarylm$adj.r.squared)

Compare: Unscaled Covariance

all.equal(quality_summarylm2$cov.unscaled, quality_summarylm$cov.unscaled)
bench::mark(quality_summarylm2$cov.unscaled, quality_summarylm$cov.unscaled)

Compare: F statistics and P values

all.equal(quality_summarylm2$fstatistic, quality_summarylm$fstatistic)
bench::mark(quality_summarylm2$fstatistic, quality_summarylm$fstatistic)


gnbosma/hw4 documentation built on March 19, 2021, 1:47 p.m.