source('_format.R')

Chapter 10 of the text discusses fitting regression models. Thankfully, R's support for such models is very good. Again, we must be careful to set the contrasts of unordered factors to "contr.sum" as follows:

options(contrasts=c(unordered='contr.sum', ordered='contr.poly'))

Example 10.1

The example investigates the relationship of reaction temperature and catalyst feed rate to the viscosity of a polymer. First, let's do the regression through matrix operations. Define the $X$ matrix using the model.matrix command:

X <- model.matrix(~ReactionTemperature + CatalystFeedRate, data=Table10.2)
print(X)

\noindent The response $y$ is just the Viscosity column:

y <- Table10.2$Viscosity
print(y)

\noindent Now we solve $(X^\mathrm{T} X)^{-1} X^\mathrm{T} y$. We can find the inverse $(X^\mathrm{T} X)^{-1}$ with solve(t(X) %*% X) and then multiply by $X^\mathrm{T} y$, but it's numerically more stable to supply $X^\mathrm{T} y$ as the second argument to solve:

beta.hat <- solve(t(X) %*% X, t(X) %*% y)
print(beta.hat)

\noindent Now, to do this using lm is far simpler:

model <- lm(Viscosity ~ ReactionTemperature + CatalystFeedRate, data=Table10.2)
print(summary(model))

Table 10.3 gives many types of residuals and diagnostic measures, many of which we'll create below:

Table10.3 <- data.frame(
    'y'=Table10.2$Viscosity,
    'Predicted'=predict(model),
    'Residual'=resid(model),
    'hii'=hatvalues(model),
    'StudentizedResids'=rstudent(model),
    'CooksD'=cooks.distance(model)
)
kable(Table10.3) %>% kable_styling()

Example 10.6

Consider the viscosity data from Table 10.2. If we wish to test $H_0: \beta_2 = 0$ versus $H_1: \beta_2 \ne 0$ by partial $F$-test we can use the anova function:

model01 <- lm(Viscosity ~ ReactionTemperature, data=Table10.2)
model012 <- lm(Viscosity ~ ReactionTemperature + CatalystFeedRate, data=Table10.2)
print(anova(model01, model012))

Example 10.7

Parameter confidence intervals can be retrieved with the confint function:

model <- lm(Viscosity ~ ReactionTemperature + CatalystFeedRate, data=Table10.2)
confint(model)

\noindent We can also get other intervals. For example, the prediction interval for the response over ReactionTemperature at CatalystFeedRate set to 10.5 is given below:

x1 <- seq(from=80, to=100, length.out=100)
x2 <- rep(10.5, 100)
y <-predict(model, newdata=data.frame('ReactionTemperature'=x1, 'CatalystFeedRate'=x2), interval="confidence") 
plot(x1, y[,1], type='l', ylim=c(2248, 2434), main='Prediction Interval', xlab='Reaction Temperature', ylab='Viscosity')
lines(x1, y[,2], lty=2)
lines(x1, y[,3], lty=2)


ehassler/MontgomeryDAE documentation built on March 20, 2021, 7:08 p.m.