library(nlme) library(ggplot2)
## Read in ascorbic data set ## load owprm package library(owprm) data(ascorbic) ## summary(ascorbic) ## str(ascorbic) ##
Repeated measures data arise in a wide array of different research environments. In this context, ...
The term 'repeated' is used here to describe measurements which are made on the same characteristic on the same observational unit but on more than one occasion. (Crowder and Hand, 1990, p. 1)
Because the measurements for a particular individual, subject, patient or observational unit are repeated, they are not independent within the individual. Therefore, special care must be taken in the statistical analysis of such data.
We analyze some repeated measures data taken from the textbook by Crowder and Hand (1990), Example 3.3 on page 32. Twelve hospital patients underwent a dietary regime treatment. Over the course of seven occasions, measurements were taken on ascorbic acid for each patient. The seven occasions were divided into three treatment phases - twice before, thrice during, and twice after the treatment regime. The seven occasions occurred at weeks 1, 2, 6, 10, 14, 15, & 16. The ascorbic acid measurements are made on the same patient (subject) over time, and therefore, are not statistically independent within the individual patient.
For this problem, the model may be written as
$$ y_{ijk} = \pi_{i} + \Phi_{j} + o_{k(j)} + \pi\Phi_{ij} + \pi o_{ik(j)} $$
where $y_{ijk}$ is $ijk^{th}$ response for the $i^{th}$ patient on the $k^{th}$ occasion in the $j^{th}$ phase, $\pi_{i}$ is the $i^{th}$ patient, $\Phi_{j}$ is the $j^{th}$ phase, $o_{k(j)}$ is the $k^{th}$ occasion nested within the $j^{th}$ phase, and $\pi o_{ik(j)}$ is the interaction between the $i^{th}$ patient and the $k^{th}$ occasion nested within the $j^{th}$ phase. In this analysis, the patients (patient
), occasions (occ
) and phases (Phase
) are considered as factors in the statistical setup. In particular, the occ
variable is a factor variable, not a numeric variable. Therefore, this statistical model is fully-saturated. It contains no error term. That is, there is no $\epsilon_{ijk}$ represented in the model. The error term for testing the hypothesis of interest in the anova
is constructed in a special way, as described in the sequel.
The patient, patient
, represented as $\pi_{i}$, is a random factor. A patient
is randomly selected from a potentially large population of patients. However, Phase
(Phase) and occ
(Occasion) are fixed factors. They are measured or controlled by the investigator and are comprised of a relatively low number of finite levels. In the model, any interaction between a fixed factor and a random factor is a random factor.
In order to test for the hypothesis of whether some relative upward or downward shift occurs among Phase
's, it is necessary to construct the $F$ statistic in a non-traditional way. To obtain the correct $F$ statistic, it is necessary to generate the expected mean-squares from the structural model. The expected mean-squares can be derived from different algorithms, such as the Cornfield-Tukey algorithm, as outlined, for example, in Winer (1971).
Figure 1 shows a plot of the data from Crowder and Hand (1990). This plot is constructed from a groupedData
data frame from package nlme
(Pinheiro \& Bates, 2000). The data appear to show a rise at week 6 and a fall (drop) at week 15.
## names(ascorbic) ## with(ascorbic, levels(Phase))
## ascorbic.gD.df <- groupedData(ascorbic.acid ~ week | patient, data = ascorbic) ## plot ascorbic.gD.df plot(ascorbic.gD.df, outer = ~ 1, key = FALSE, ylab = 'Ascorbic Acid', xlab = 'Week', aspect = 0.4, main = 'Reaction of Patients to Rx') ##
Figure 2 shows an alternative plot of the ascorbic
data, emphasizing that the occasions on the $x$-axis are to be considered as a factor dimension as opposed to a numeric dimension. This plot is constructed from the ggplot2
package (Wickham, 2009).
## ## ***** Get ggplot ## ## Create ggplot2 plot object library(ggplot2) ## occ.p0 <- ggplot(ascorbic, aes(occ, ascorbic.acid)) ## set.seed(11) ## for jittered points to stay fixed occ.p1 <- occ.p0 + geom_boxplot(stat = 'boxplot', outlier.shape = 3) + labs(y = 'Ascorbic Acid', x = 'Occasion', title = 'Ascorbic Acid Responses for 12 Patients Across 7 Occasions in 3 Phases') ## occ.p1 ## basic boxplot ## set.seed(11) ## for jittered points to stay fixed occ.p2 <- occ.p1 + geom_point(position = position_jitter(width = 0.2), aes(colour = patient)) ## occ.p2 # add points color-coded by patient ## set.seed(11) ## for jittered points to stay fixed occ.p3 <- occ.p2 + geom_vline(xintercept = c(2.5, 5.5), col = 'blue', lwd = 1.2, linetype = 'longdash') ## occ.p3 # add vertical lines to separate Phases ## set.seed(11) ## for jittered points to stay fixed occ.p4 <- occ.p3 + annotate('text', x = 1.5, y = 1.6, label = 'pre Rx') + annotate('text', x = 4, y = 0.3, label = 'Rx') + annotate('text', x = 4, y = 0.2, label = '(treatment)') + annotate('text', x = 6.5, y = 1.6, label = 'post Rx') ## occ.p4 # label Phases ## set.seed(11) ## for jittered points to stay fixed occ.p5 <- occ.p4 + annotate('text', x = 1.5, y = 1.7, label = 'Phase 1') + annotate('text', x = 4, y = 0.4, label = 'Phase 2') + annotate('text', x = 6.5, y = 1.7, label = 'Phase 3') occ.p5 # further label Phases ## ## ***** End ggplot ##
On the Occasions axis, the responses are jittered to avoid over-plotting. Legends for the patients, patient
, are included in the plot. For these data, we have three phases, Phase
, which are designated as pre
, Rx
, and post
to correspond to the respones before, during, and after the treatment regimen. With respect to the spread or variability of the data, it appears that the ascorbic acid responses are reasonably homogeneous throughout the seven occasions.
In these analyses, we are interested in detecting Phase
shifts. The responses appear to be relatively low in the first phase, pre
. They are followed by an increase in the second phase, Rx
and then by a drop in the third phase, post
.
In this problem, there is one hypothesis of interest. It is motivated by the question: Is there a difference in the mean levels of responses across the three Phase
s under consideration, pre
, Rx
, and post
? The criterion variable is ascorbic acid, i.e., ascorbic.acid
in the ascorbic
data frame. Here is an abbreviated anova
(analysis of variance) summary table for this analysis.
## aovObj <- aov(ascorbic.acid ~ patient*Phase*occ, data = ascorbic) ## (alternatively) aovObj <- aov(ascorbic.acid ~ patient + Phase + Phase:occ + patient:Phase + patient:Phase:occ, data = ascorbic) ## aovObj ## To avoid having warning messages sent to console ## suppressWarnings(owprm(aovObj)) owprm_aovObj <- suppressWarnings(owprm(aovObj)) owprm_aovObj$'Summary Table of aov object' ## summary(owprm_aovObj) ## display p value in prettier form ## options(digits = 4, scipen = 2) ## owprm_aovObj ## summary(owprm_aovObj) ##
The correct test for determining whether there is an effect due to Phase
is constructed by taking the mean square for Phase
(2.9805) and dividing it by the mean square for patient:Phase
(0.1090), and entering the $F$ distribution with the appropriate degrees of freedom, shown here. The printing of the probability value to four significant digits is deliberate.
## options(digits = 4, scipen = 2) ## owprm_aovObj summary(owprm_aovObj) ##
## Reset options to original digits and scipen options(digits = 7, scipen = 0) ##
owprm
PackageWe examine/analyze this problem with the use of the owprm
package. First load the owprm
package.
library(owprm) data(ascorbic) str(ascorbic)
ascorbic.acid
is the criterion variable of interest. We fit a fully-saturated analysis of variance (aov
) object with the interaction of three factor
variables, patient
, Phase
and occ
. Next we display the aovObj
aovObj <- aov(ascorbic.acid ~ patient*Phase*occ, data = ascorbic) aovObj
Before proceeding further, use the summary
method on the aovObj
.
summary(aovObj)
We next apply the owprm()
function to the aovObj
. Because the aovObj
is fully saturated, a warning is printed to the screen.
owprm_aovObj <- owprm(aovObj)
The warning messages can be avoided by executing the suppressWarnings()
command. Now display the owprm_aovObj
object.
owprm_aovObj <- suppressWarnings(owprm(aovObj)) owprm_aovObj
The owprm_aovObj
object is a list containing two components: (1) Anova Summary Table for Occ w/in Phases Rep Meas Analy
, and (2) Summary of Saturated ANOVA Object
. The Anova Summary Table for Occ w/in Phases Rep Meas Analy
shows the correct $F$ statistic under the null hypothesis of no Phase
shifts, the numerator and denominator degrees of freedom, and the probability value. The Anova Summary Table for Occ w/in Phases Rep Meas Analy
simply shows the anova
summary table for the fully saturated aov
fitted object. No probability values are shown because there is no "error" term in the fully saturated model.
In practice, we are generally most interest in the first Anova Summary Table for Occ w/in Phases Rep Meas Analy
component. This conveniently provided by the summary
method applied to the owprm_aovObj
object.
summary(owprm_aovObj)
This result may be displayed in a "prettier" form with the following code.
options(digits = 4, scipen = 2) summary(owprm_aovObj)
Alvord WG and Carchedi N (2015) 'Occasions Within Phases Repeated Measures Analysis with Derivation of Expected Mean Squares by the Cornfield-Tukey Algorithm', https://rpubs.com/ga4247/53980
Cornfield J and Tukey JW (1956) 'Average values of mean squares in factorials' Annals of Mathematical Statistics, 27, 907-949.
Crowder MJ and Hand DJ (1990) Analysis of Repeated Measures, Chapman and Hall, London.
Del Prete GQ, ..., Alvord WG, ... Lifson JD (2015) 'Elevated plasma viral loads in romidepsin treated SIV-infected rhesus macaques on suppressive combination antiretroviral therapy.' Antimicrobial Agents and Chemotherapy, 12/2015; DOI:10.1128/AAC.02625-15.
Del Prete GQ, ..., Alvord WG, ... Lifson JD (2014) 'Effect of SAHA administration on the residual virus pool in a model of combination antiretroviral therapy-mediated suppression in SIVmac239-infected Indian rhesus macaques.' Antimicrobial Agents and Chemotherapy, 58(11), 6790-806.
Pinheiro jC and Bates DM (2000) Mixed-Effects Models in S and S-Plus, Springer, New York.
Pinheiro J, Bates D, DebRoy S, Sarkar D and R
Core Team (2015). nlme: Linear and Nonlinear
Mixed Effects Models. R package version 3.1-122,
Winer BJ (1971) Statistical Principles in Experimental Design, 2nd ed., McGraw-Hill.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.