source('_format.R')
When factors are used in linear models there is some flexibility in how the factor is encoded in the model matrix. By default, R uses contr.treatment to encode unordered factors and contr.poly to encode ordered factors. If a factor has $k$ levels then contr.treatment encodes the last $k-1$ levels as indicator variables taking the value $0$ when the level is not present and $1$ when it is. This causes the first level of the factor to be absorbed into the intercept term of the model, and the remaining coefficients represent differences from that baseline. If instead we want the levels of the factor encoded so that their coefficients sum to zero, we have to use contr.sum. The following sets the default for unordered factors to contr.sum and leaves the ordered-factor behavior at its default:
options(contrasts=c('unordered'='contr.sum', 'ordered'='contr.poly'))
\noindent Note that this setting only affects the effect estimates --- things like ANOVA table sums of squares shouldn't be affected.
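To see the two encodings side by side, we can print the contrast matrices for a four-level factor (a quick illustration, not part of the text's analysis):
contr.treatment(4)  # indicators for levels 2-4; level 1 is the baseline
contr.sum(4)        # each column sums to zero; the level-4 effect is implied by the others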
The plasma etching experiment of Table 3.1 relates the RF power to the depth of the etching. At each of the four power levels there were five observations. The structure of the data.frame is given below:
str(Table3.1)
\noindent Fitting the ANOVA model we find that we can reject the null hypothesis that all of the power levels have the same average etch amount:
summary(aov(Observation ~ Power, data=Table3.1))
\noindent Compare this result to Table 3.4 in Example 3.1. To get the effect estimates in Example 3.3 we have to do a little extra work:
model <- lm(Observation ~ Power, data=Table3.1)
summary(model)
\noindent Since the coefficients for the Power factor sum to zero, the Power4 effect is just the negative of the sum of the other three coefficients.
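We can evaluate it directly from the fitted model:
-sum(coef(model)[2:4])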
To check the model adequacy we can generate the normal probability plot of the residuals from the model.
qqnorm(resid(model))
qqline(resid(model), col='orange')
\noindent The normal probability plot doesn't show any strong signs of non-normality. Next, let's examine the residual-by-run-order plot to look for patterns that would indicate a lack of independence and the predicted-versus-residual plot to check for equality of variance:
op <- par(mfrow=c(1, 2))
plot(resid(model), main='Residual vs. Run Order')
plot(x=predict(model), y=resid(model), main='Predicted vs. Residual')
par(op)
\noindent Neither plot seems to show strong signs of trouble. We can also perform a formal test for equality of variance. Bartlett's test, described in Section 3.4, tests $$ H_0: \sigma_1^2 = \sigma_2^2 = \cdots = \sigma_a^2 \quad\text{vs.}\quad H_A: \text{the above does not hold for at least one}~\sigma_i^2~. $$ This is easy to do in R:
bartlett.test(Observation ~ Power, data=Table3.1)
\noindent Bartlett's test fails to reject the null hypothesis. We find no strong evidence that the model is an inadequate description of the data.
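If we were concerned about Bartlett's sensitivity to departures from normality, an alternative (not used in the text) is the modified Levene test, available as leveneTest in the car package, assuming that package is installed:
library(car)
leveneTest(Observation ~ Power, data=Table3.1)  # centers on group medians by default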
In Example 3.5 an engineer investigates whether there is a difference between four methods of estimating flood flow frequency. Here the estimation method is coded as 1, 2, 3, or 4, so it's important to check that this variable is stored as a factor so that R doesn't treat method 4 as numerically twice method 2:
str(Example3.5)
\noindent Having verified that EstimationMethod is a factor, let's construct the ANOVA table as in Table 3.8:
model <- aov(Observation ~ EstimationMethod, data=Example3.5)
summary(model)
\noindent The following diagnostic plot of predicted-versus-residual suggests that the model is not appropriate:
plot(predict(model), resid(model), main='Predicted vs. Residual')
As suggested in the text, we apply a $y^\star = \sqrt{y}$ transform and refit the model, achieving Table 3.10 in the text:
y <- sqrt(Example3.5$Observation)
model <- aov(y ~ EstimationMethod, data=Example3.5)
summary(model)
\noindent The new predicted versus residual plot looks much better:
plot(predict(model), resid(model), main='Predicted vs. Residual')
Even though the conclusions drawn from the untransformed data are the same as with the transformed data, having validated our model assumptions means we can put more trust in our results.
Section 3.5.1 suggests fitting a regression model to the etch data of Table 3.1. To do this, we first convert Power to a numeric vector x. Next, we fit the linear and quadratic models. In the quadratic model, I(x^2) tells R to square x before including it in the model matrix. If we left off the I(), the ^ would be interpreted as a formula operator rather than arithmetic, and x^2 would collapse to just another x term and effectively be ignored.
x <- as.numeric(as.character(Table3.1$Power))
linear.model <- lm(Observation ~ x, data=Table3.1)
quadratic.model <- lm(Observation ~ x + I(x^2), data=Table3.1)
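\noindent To see the point about I() concretely, we can compare the model matrices that the two formulas generate (a quick illustration, not part of the text's analysis):
head(model.matrix(~ x + x^2))     # the ^ is a formula operator here, so this is just ~ x
head(model.matrix(~ x + I(x^2)))  # I() protects the arithmetic, so the squared term is kept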
\noindent Next let's plot the linear versus the quadratic fits:
op <- par(mfrow=c(1,2))
plot(x, Table3.1$Observation, main='Linear Model')
abline(a=coef(linear.model)[1], b=coef(linear.model)[2], col='orange')
plot(x, Table3.1$Observation, main='Quadratic Model')
u <- seq(from=160, to=220, length.out=100)
v <- cbind(rep(1, length(u)), u, u^2) %*% coef(quadratic.model)
lines(u, v, col='orange')
par(op)
\noindent Visual inspection suggests that the quadratic model is a better fit for the data. This model lets us estimate mean etch depth at power levels between 160 and 220, which would not be possible using the original ANOVA model.
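For example, the mean etch depth at a hypothetical power setting of 200 W (a level not in the original design) could be estimated with predict():
predict(quadratic.model, newdata=data.frame(x=200), interval='confidence')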
Recall the ANOVA model for the plasma etching experiment:
model <- aov(Observation ~ Power, data=Table3.1)
summary(model)
\noindent One way to calculate individual contrasts is with the emmeans package [@emmeans]. Here we'll create an emmeans object and then derive the contrasts from it:
library(emmeans)
emmodel <- emmeans(model, ~ Power)
contrast(emmodel, list(
  'C1'=c(1, -1, 0, 0),
  'C2'=c(1, 1, -1, -1),
  'C3'=c(0, 0, 1, -1)
))
\noindent These contrasts match those given in Example 3.6. We can also do all pairwise comparisons of the Power factor using the Tukey adjustment of Example 3.7:
pairs(emmodel, adjust="tukey")
Both Fisher's LSD and Dunnett's test can be carried out by comparing the estimated differences above to the appropriate critical values.
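Alternatively, emmeans can produce both directly (this goes slightly beyond the text): unadjusted pairwise comparisons correspond to Fisher's LSD, and a treatment-versus-control contrast gives Dunnett-style comparisons --- here we assume the first power level plays the role of the control:
pairs(emmodel, adjust='none')                   # Fisher's LSD: no multiplicity adjustment
contrast(emmodel, method='trt.vs.ctrl', ref=1)  # each level vs. the (assumed) control level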
As in Chapter 2 we will use the pwr package to calculate power and sample size. For the single-factor ANOVA we need an effect size $f$ to use in the power calculation. Suppose there are $a$ levels, each with $n$ samples, for a total of $N = an$ observations. Let $\mu_i$ be the mean of level $i$, $\mu$ the overall mean (the mean of the level means), and $\sigma$ the common standard deviation. The effect size $f$ is
$$
f = \frac{1}{\sigma} \sqrt{\frac{1}{a} \sum_{i=1}^{a} (\mu_i - \mu)^2}~.
$$
\noindent In Example 3.1, the standard deviation is 25 with prospective means 575, 600, 650, and 675. Thus, if we have three samples at each mean and our test is to have $\alpha =0.01$ then the power would be:
library(pwr)
w <- c(575, 600, 650, 675)
a <- length(w)
sigma <- 25
f <- sqrt((1/a) * sum((w - mean(w))^2)) / sigma
pwr.anova.test(k=4, f=f, n=3, sig.level=0.01)
\noindent so about 75%. We can construct the power curve easily:
library(pwr)
n <- seq(from=2, to=8, length.out=50)
y <- pwr.anova.test(k=4, f=f, sig.level=0.01, n=n)$power
plot(n, y, type='l', main='Power for Example 3.1', xlab='Sample Size', ylab='Power')
\noindent According to the plot, reaching 90% power requires a sample size of about 4 in each group (16 observations overall). We can also compute the required sample size for 90% power directly:
pwr.anova.test(k=4, f=f, sig.level=0.01, power=0.9)
\noindent Rounding up, we'd require $n=4$ samples at each level of Power.
Consider the experiment described in Section 3.8.1 where three different types of chocolate were given to subjects and then the antioxidant capacity of their blood plasma was measured. A boxplot of the data is presented below:
boxplot(Observation ~ Factor, data=Table3.12)
\noindent Next, we fit an ANOVA model and look for the general significance test:
summary(aov(Observation ~ Factor, data=Table3.12))
\noindent This indicates that the chocolate concoctions had differing mean effects on antioxidant levels. To estimate the individual effects we use a linear model without an intercept:
means.model <- lm(Observation ~ Factor - 1, data=Table3.12)
print(summary(means.model))
\noindent Next, we generate confidence intervals for each chocolate level:
confint(means.model)
\noindent Next, we'll look at all pairwise comparisons (without making any adjustment for multiple comparisons):
pairwise.model <- pairs(emmeans(aov(Observation ~ Factor, data=Table3.12), ~ Factor),
                        adjust='none')
print(pairwise.model)
\noindent We follow the point estimates with confidence intervals for the pairwise differences:
confint(pairwise.model)
\noindent This is strong evidence that DC produces a larger effect than DC+MK and MC. Finally, let's produce the diagnostic plots to ensure our model is adequate.
op <- par(mfrow=c(1,2))
qqnorm(resid(means.model), main='Normal Probability Plot')
qqline(resid(means.model), col='orange')
plot(x=predict(means.model), y=resid(means.model), main='Fitted vs. Residual')
par(op)
\noindent Both diagnostic plots fail to show any strong evidence of deviation from our modelling assumptions. This provides us with enough information to draw the same conclusion as the text --- dark chocolate "DC" results in a higher mean blood antioxidant capacity than the other two chocolates.
The end-aisle experiment of Section 3.8.2 compares three different display designs in terms of the percent increase in sales. Here the design is encoded using the numbers 1, 2, and 3, so we must be careful that it is recorded as a factor:
str(Table3.14)
\noindent A visual inspection of the distributions of sales increases suggests that DisplayDesign has an effect:
boxplot(Observations ~ DisplayDesign, data=Table3.14)
\noindent Let's start by running the ANOVA general $F$-test:
model <- aov(Observations ~ DisplayDesign, data=Table3.14)
summary(model)
\noindent We are able to reject the null hypothesis that all of the treatments have equal means. Next, we estimate the mean for each DisplayDesign. We can use the regular lm function if we switch to the cell means model:
means.model <- lm(Observations ~ DisplayDesign - 1, data=Table3.14)
summary(means.model)
\noindent Then, we can construct confidence intervals for the means with confint:
confint(means.model)
Alternatively, we could use the emmeans package to do this from the effects model. We indicate that we want the means with respect to DisplayDesign:
ls.model <- emmeans(model, ~ DisplayDesign)
print(ls.model)
\noindent Next, we construct the confidence intervals:
confint(ls.model)
\noindent Both methods are presented because the cell means model is often easier, but the emmeans package will be of more use as we go forward. Note that, looking at the 95% confidence intervals, design 3's interval appears disjoint from those of designs 1 and 2. Examining all pairwise comparisons, it appears that design 3 produces the largest mean increase in sales:
pairwise.model <- pairs(ls.model, adjust='none')
print(pairwise.model)
\noindent We can also construct confidence intervals on the pairwise differences.
confint(pairwise.model)
\noindent Next, let's examine the diagnostic plots:
op <- par(mfrow=c(1,2))
qqnorm(resid(means.model), main='Normal Probability Plot')
qqline(resid(means.model), col='orange')
plot(x=predict(means.model), y=resid(means.model), main='Fitted vs. Residual')
par(op)
\noindent The fitted-versus-residual plot may show a slight increase in variance, but there is no strong evidence of model misspecification.
The aluminum smelting experiment of Section 3.8.3 compares four ratio control algorithms governing the introduction of alumina into a cell, with a response related to cell voltage recorded. Here we model the standard deviation of cell voltage to determine which algorithm is most consistent. First, let's check that RatioControlAlgorithm is a factor:
str(Table3.16)
\noindent As suggested by the text, we apply the negative log transform to StDevVoltage before fitting our model:
y <- -log(Table3.16$StDevVoltage)
summary(aov(y ~ RatioControlAlgorithm, data=Table3.16))
\noindent A quick visual inspection using box plots strongly suggests that algorithm 3 produces the most noise while the others all appear about the same (remember that because of the negative log transform, noisier algorithms correspond to smaller values of y):
boxplot(y ~ RatioControlAlgorithm, data=Table3.16)
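\noindent To back up the visual impression we could, for example, follow up with Tukey pairwise comparisons on the transformed response (a check not performed in the text):
TukeyHSD(aov(y ~ RatioControlAlgorithm, data=Table3.16))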
For the loom experiment of Section 3.9 we wish to treat the looms as a sample from a larger population, and hence use a random effects model. For balanced designs we can get the ANOVA quantities using an Error term with the normal aov function:
model <- aov(Strength ~ Error(Looms), data=Example3.10)
summary(model)
\noindent With this approach we would have to construct the $F$-statistic, $p$-value, and variance component estimates by hand from the mean squares; a rough sketch of that calculation is given below.
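The sketch assumes the balanced design of the text, with a = 4 looms and n = 4 observations per loom; the two mean squares are placeholders to be read off the summary output above:
# By-hand random effects calculations (a sketch; a and n are assumed, and the
# mean squares must be filled in from the aov summary above).
a <- 4; n <- 4
ms.looms <- NA   # 'Mean Sq' from the Looms stratum
ms.error <- NA   # 'Mean Sq' from the Within stratum
F0 <- ms.looms / ms.error                         # tests H0: sigma_tau^2 = 0
p.value <- pf(F0, df1=a - 1, df2=a * (n - 1), lower.tail=FALSE)
sigma2.tau <- (ms.looms - ms.error) / n           # between-loom variance component
sigma2 <- ms.error                                # within-loom variance component
c(F0=F0, p.value=p.value, sigma2.tau=sigma2.tau, sigma2=sigma2)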
\noindent The other method for fitting this model uses the lme4 package [@lme4] together with the lmerTest package [@lmerTest], which are designed for random and mixed effects models. In lme4, a random intercept for each level of loom is indicated with (1|Looms) in the formula as follows:
library(lme4)
library(lmerTest)
model <- lmer(Strength ~ (1|Looms), data=Example3.10, REML=TRUE)
print(summary(model))
print(rand(model))
\noindent This matches the JMP output of Table 3.20. We can generate confidence intervals for the variance components easily:
confint(model, oldNames=FALSE)
\noindent Above, the confidence intervals for the standard deviations differ from JMP in that these are based on the profile likelihood.
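If we preferred simulation-based intervals, confint() for lmer models also supports a parametric bootstrap (this is slower; nsim controls the number of simulations):
set.seed(1)
confint(model, method='boot', nsim=1000)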
Section 3.11 describes the non-parametric Kruskal-Wallis test. This test doesn't assume normality and is very easy to use in R. Here, we run the test against the plasma etch experiment data of Table 3.1:
kruskal.test(Observation ~ Power, data=Table3.1)
\noindent We find a significant difference between the treatments.
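As a possible follow-up (not performed in the text), pairwise Wilcoxon rank-sum tests with a multiplicity adjustment can suggest which power levels differ:
pairwise.wilcox.test(Table3.1$Observation, Table3.1$Power, p.adjust.method='holm')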