# granovagg.1w: Elemental Graphic Display for One-Way ANOVA

In granovaGG: Graphical Analysis of Variance Using ggplot2

## Description

Displays data for a one-way analysis of variance, that is, for unstructured groups. The graphic also helps show how the data play out in the context of the basic one-way model and how the F statistic is generated for the data at hand. It may be called 'elemental' or 'natural' because it is built upon the central question that drives one-way ANOVA (see Details below).

## Usage

```r
granovagg.1w(data, group = NULL, h.rng = 1, v.rng = 1, jj = NULL, dg = 2,
  resid = FALSE, print.squares = TRUE, xlab = "default_x_label",
  ylab = "default_y_label", main = "default_granova_title",
  plot.theme = "theme_granova_1w", ...)
```

## Arguments

| Argument | Description |
|---|---|
| `data` | Data frame or vector. If a data frame, its two or more columns are taken to be groups of equal size (in which case `group` is `NULL`). If `data` is a vector, `group` must be a vector, perhaps a factor, that indicates groups (unequal group sizes are allowed with this option). |
| `group` | Group indicator, generally a factor when `data` is a vector. |
| `h.rng` | Numeric; controls the horizontal spread of groups, default = 1. |
| `v.rng` | Numeric; controls the vertical spread of points, default = 1. |
| `jj` | Numeric; sets the horizontal jittering level of points. `jj` is passed as the `amount` parameter to `jitter`. When `jj = NULL` (the default), the degree of jitter takes on a sensible value. In addition, if pairs of ordered means are close to one another and `jj = NULL`, the degree of jitter defaults to the smallest difference between two adjacent contrasts. |
| `dg` | Numeric; sets the number of decimal places in the output display, default = 2. |
| `resid` | Logical; displays the marginal distribution of residuals (as a 'rug') on the right side (with respect to the grand mean), default = FALSE. |
| `print.squares` | Logical; displays graphical squares for visualizing the F statistic as a ratio of MS-between to MS-within, default = TRUE. |
| `xlab` | Character; horizontal axis label, can be supplied by the user. Default = `"default_x_label"`, which leads to a generic x-axis label ("Contrast coefficients based on group means"). |
| `ylab` | Character; vertical axis label, can be supplied by the user. Default = `"default_y_label"`, which leads to a generic y-axis label ("Dependent variable (response)"). |
| `main` | Character; main label at the top of the graphic, can be supplied by the user. Default = `"default_granova_title"`, which prints a generic title for the graphic. |
| `plot.theme` | Argument indicating a ggplot2 theme to apply to the graphic; defaults to a customized theme created for the one-way graphic (`"theme_granova_1w"`). |
| `...` | Optional arguments to/from other functions. |
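The two input forms described for `data` and `group` can be related with base R's `stack()`. The following is a minimal sketch, not taken from the package: the data frame `wide`, its toy values, and the name `long` are made up for illustration, and the `granovagg.1w()` calls are left commented out so the block runs without granovaGG installed.

```r
# Toy data, invented for illustration: two groups of equal size, one per column.
wide <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))

# stack() turns the columns into one vector of values plus a factor
# naming the column each value came from.
long <- stack(wide)
print(long)

# According to the argument descriptions, both calls below should describe
# the same two groups (assumption: not verified against the package source).
# granovagg.1w(wide)
# granovagg.1w(long$values, group = long$ind)
```

The vector-plus-`group` form is the more general of the two, since it also covers groups of unequal size.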

## Details

The one-way ANOVA graphic shows how the comparison of unstructured groups, viz. their means, entails a particular linear combination (L.C.) of the group means. In particular, we use the fact that the numerator of the one-way F statistic, the mean square between (MS.B), is a linear combination of the group means. Each weight in the L.C. (one for each group) is principally a function of the difference between the group's mean and the grand mean, viz. (M_j - M..), where M_j denotes the jth group's mean and M.. denotes the grand mean. The L.C. can be written as a sum of products of the form MS.B = Sum((1/df.B) n_j (M_j - M..) M_j) for j = 1, ..., J, where the n_j are the group sizes and df.B is the between-groups degrees of freedom. The denominator of the F statistic, MS.W (mean square within), can be described as a 'scaling factor': it is just the (weighted) average of the variances of the J groups.

The differences (M_j - M..) are themselves the 'effects' in the analysis. When the group means (vertical axis) are plotted against the effects (horizontal axis), a straight line necessarily results. Group means are plotted as triangles along this line. Once the means have been plotted, the (jittered) data points for each group are displayed along the vertical axis at that group's contrast. Since the group means are just the fitted values in one-way ANOVA, and the deviations of the scores within groups are the residuals (subsetted by groups), the graphic can be seen as showing fitted vs. residual values for the line that shows the locus of ordered group means, from the smallest (on the left) to the largest (on the right). If desired, the aggregate of all such residuals can be plotted (as a rug plot) on the right margin of the graphic, centered on the grand mean (the large green dot in the 'middle').

The use of effects to locate groups this way yields what we term an 'elemental' graphic because it is based on the central question that drives one-way ANOVA.
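The linear-combination form of MS.B can be checked numerically. The sketch below uses only base R and the built-in `chickwts` data; all variable names (`n_j`, `M_j`, `lc_weights`, and so on) are chosen here for illustration and are not granovaGG internals.

```r
# Response and grouping factor from the built-in chickwts data.
y <- chickwts$weight
g <- chickwts$feed

n_j     <- tapply(y, g, length)  # group sizes
M_j     <- tapply(y, g, mean)    # group means
M_all   <- mean(y)               # grand mean (M..)
effects <- M_j - M_all           # the 'effects' (contrasts)

J    <- length(M_j)
df_B <- J - 1                    # between-groups degrees of freedom
df_W <- length(y) - J            # within-groups degrees of freedom

# MS.B as a linear combination of the group means:
# one weight per group, (1/df.B) * n_j * (M_j - M..).
lc_weights <- n_j * effects / df_B
MS_B_lc    <- sum(lc_weights * M_j)

# The textbook form, for comparison: sum of n_j * effect^2, over df.B.
MS_B_ss <- sum(n_j * effects^2) / df_B

# MS.W as a weighted average of the group variances,
# with weights (n_j - 1) / df.W.
MS_W <- sum((n_j - 1) * tapply(y, g, var)) / df_W

F_stat <- MS_B_lc / MS_W
print(c(MS.B = MS_B_lc, MS.W = MS_W, F = F_stat))
```

The two expressions for MS.B agree because the weights `n_j * (M_j - M..)` sum to zero, so multiplying them by `M_j` or by `(M_j - M..)` gives the same total. The resulting F matches the value reported by `anova(lm(weight ~ feed, data = chickwts))`.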

Note that groups need not have the same size, nor do the data need to reflect any particular distributional characteristics. Finally, the gray bars (one for each group) at the bottom of the graphic show the relative sizes of the group standard deviations with reference to the 'average' group s.d. (more precisely, the square root of MS.W). This 'average' corresponds to the thin white line that runs horizontally across these bars.
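The comparison that the gray bars convey can be approximated in base R by dividing each group's standard deviation by the square root of MS.W. This is only a sketch of the underlying quantities, using the built-in `chickwts` data and illustrative names (`sd_j`, `pooled_sd`, `relative_sd`); how the package actually scales the bars is not shown here.

```r
y <- chickwts$weight
g <- chickwts$feed

n_j  <- tapply(y, g, length)  # group sizes
sd_j <- tapply(y, g, sd)      # group standard deviations

# Square root of MS.W: the 'average' group s.d.
df_W      <- length(y) - length(sd_j)
pooled_sd <- sqrt(sum((n_j - 1) * sd_j^2) / df_W)

# Values above 1 mean a group varies more than 'average'; below 1, less.
relative_sd <- sd_j / pooled_sd
print(round(sort(relative_sd), 2))
```

Ratios far from 1 suggest unequal group variances, the kind of pattern the Examples note for the untransformed poison data.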

## Value

Returns a plot object of class `ggplot`. The function also provides printed output including by-group statistical summaries and information about groups that might be overplotted (if applicable):

| Output | Description |
|---|---|
| `group` | Group names |
| `group.mean` | Means for each group |
| `trimmed.mean` | 20% trimmed group means |
| `contrast` | Contrasts (group main effects) |
| `variance` | Variances |
| `standard.deviation` | Standard deviations |
| `group.size` | Group sizes |
| overplotting information | Information about groups that, due to their close means, may be overplotted |
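The by-group summary can be approximated with base R, which may help in reading the printed columns. The sketch below is a reconstruction under assumptions, not the package's own code: it takes `trimmed.mean` to be `mean(x, trim = 0.2)` and `contrast` to be the group mean minus the grand mean of all observations. The helper `by_group` and the name `summary_tbl` are hypothetical.

```r
y <- chickwts$weight
g <- chickwts$feed

# Hypothetical helper: apply a function within each group,
# returning a plain vector in the order of levels(g).
by_group <- function(f, ...) as.vector(tapply(y, g, f, ...))

summary_tbl <- data.frame(
  group              = levels(g),
  group.mean         = by_group(mean),
  trimmed.mean       = by_group(mean, trim = 0.2),  # 20% trimmed mean (assumed)
  contrast           = by_group(mean) - mean(y),    # effect: M_j - M..
  variance           = by_group(var),
  standard.deviation = by_group(sd),
  group.size         = by_group(length)
)

# Order rows by group mean, as in the printed output.
summary_tbl <- summary_tbl[order(summary_tbl$group.mean), ]

# Print a copy rounded to two decimals (the default of `dg`).
rounded <- summary_tbl
rounded[-1] <- round(rounded[-1], 2)
print(rounded, row.names = FALSE)
```

For `chickwts`, these values agree with the summary shown in the example output (for instance, horsebean has a trimmed mean of 154.33 and a contrast of -101.11).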

## Author(s)

Brian A. Danielak brian@briandk.com
Robert M. Pruzek RMPruzek@yahoo.com

with contributions by:
William E. J. Doane wil@drdoane.com
James E. Helmreich James.Helmreich@Marist.edu
Jason Bryer jason@bryer.org

## References

Hoaglin, D., Mosteller, F. and Tukey, J. (Eds.) (1991). Fundamentals of Exploratory Analysis of Variance. Wiley.

Wickham, H. (2009). ggplot2: Elegant Graphics for Data Analysis. New York: Springer.

Wilkinson, L. (1999). The Grammar of Graphics. Statistics and Computing. New York: Springer.

## See Also

`granovagg.contr`, `granovagg.ds`, `granovaGG`

## Examples

```r
library(granovaGG)

data(arousal)
# Drug A
granovagg.1w(arousal[, 1:2], h.rng = 1.6, v.rng = 0.5)

###
data(anorexia)
wt.gain <- anorexia[, 3] - anorexia[, 2]
granovagg.1w(wt.gain, group = anorexia[, 1])

###
data(poison)
## Note violation of constant variance across groups in following graphic.
granovagg.1w(poison$SurvTime, group = poison$Group, ylab = "Survival Time")

## RateSurvTime = SurvTime^-1
granovagg.1w(poison$RateSurvTime, group = poison$Group,
             ylab = "Survival Rate = Inverse of Survival Time")

## Nonparametric version: RateSurvTime ranked and rescaled
## to be comparable to RateSurvTime;
## note labels as well as residual (rug) plot below.
granovagg.1w(poison$RankRateSurvTime, group = poison$Group,
             ylab = "Ranked and Centered Survival Rates",
             main = "One-way ANOVA display, poison data (ignoring 2-way set-up)",
             res = TRUE)  # `res` partially matches the `resid` argument

###
data(chickwts)
?chickwts # An explanation of the chickwts dataset
with(chickwts, granovagg.1w(weight, group = feed)) # Modeling weight as explained by feed type
```

### Example output

```
Loading required package: ggplot2

By-group summary statistics for your input data (ordered by group means)
    group group.mean trimmed.mean contrast variance standard.deviation group.size
1 Placebo      20.43        20.30    -1.92     5.83               2.41         10
2  Drug.A      24.27        24.45     1.92     7.89               2.81         10

Below is a t-test summary of your input data

Two Sample t-test

data:  unstacked.data[, 1] and unstacked.data[, 2]
t = -3.2786, df = 18, p-value = 0.004174
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-6.300681 -1.379319
sample estimates:
mean of x mean of y
20.43     24.27

By-group summary statistics for your input data (ordered by group means)
group group.mean trimmed.mean contrast variance standard.deviation group.size
2  Cont      -0.45        -1.16    -3.21    63.82               7.99         26
1   CBT       3.01         1.80     0.24    53.41               7.31         29
3    FT       7.26         7.91     4.50    51.23               7.16         17

Below is a linear model summary of your input data

Call:
lm(formula = score ~ group, data = owp$data)

Residuals:
Min      1Q  Median      3Q     Max
-12.565  -4.543  -1.007   3.846  17.893

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept)    3.007      1.398   2.151   0.0350 *
groupCont     -3.457      2.033  -1.700   0.0936 .
groupFT        4.258      2.300   1.852   0.0684 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 7.528 on 69 degrees of freedom
Multiple R-squared:  0.1358,	Adjusted R-squared:  0.1108
F-statistic: 5.422 on 2 and 69 DF,  p-value: 0.006499

By-group summary statistics for your input data (ordered by group means)
   group group.mean trimmed.mean contrast variance standard.deviation group.size
3      3       0.21         0.21    -0.27     0.00               0.02          4
9      9       0.24         0.24    -0.24     0.00               0.01          4
2      2       0.32         0.32    -0.16     0.01               0.08          4
12    12       0.32         0.32    -0.15     0.00               0.03          4
6      6       0.34         0.34    -0.14     0.00               0.05          4
8      8       0.38         0.38    -0.10     0.00               0.06          4
1      1       0.41         0.41    -0.07     0.00               0.07          4
7      7       0.57         0.57     0.09     0.02               0.16          4
10    10       0.61         0.61     0.13     0.01               0.11          4
11    11       0.67         0.67     0.19     0.07               0.27          4
5      5       0.82         0.82     0.34     0.11               0.34          4
4      4       0.88         0.88     0.40     0.03               0.16          4

The following groups are likely to be overplotted
group group.mean contrast
2      2       0.32    -0.16
12    12       0.32    -0.15
6      6       0.34    -0.14

Below is a linear model summary of your input data

Call:
lm(formula = score ~ group, data = owp$data)

Residuals:
Min       1Q   Median       3Q      Max
-0.32500 -0.04875  0.00500  0.04312  0.42500

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.41250    0.07457   5.532 2.94e-06 ***
group2      -0.09250    0.10546  -0.877 0.386230
group3      -0.20250    0.10546  -1.920 0.062781 .
group4       0.46750    0.10546   4.433 8.37e-05 ***
group5       0.40250    0.10546   3.817 0.000513 ***
group6      -0.07750    0.10546  -0.735 0.467163
group7       0.15500    0.10546   1.470 0.150304
group8      -0.03750    0.10546  -0.356 0.724219
group9      -0.17750    0.10546  -1.683 0.101000
group10      0.19750    0.10546   1.873 0.069235 .
group11      0.25500    0.10546   2.418 0.020791 *
group12     -0.08750    0.10546  -0.830 0.412164
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1491 on 36 degrees of freedom
Multiple R-squared:  0.7335,	Adjusted R-squared:  0.6521
F-statistic:  9.01 on 11 and 36 DF,  p-value: 1.986e-07

By-group summary statistics for your input data (ordered by group means)
   group group.mean trimmed.mean contrast variance standard.deviation group.size
4      4       1.16         1.16    -1.46     0.04               0.20          4
5      5       1.39         1.39    -1.23     0.31               0.55          4
10    10       1.69         1.69    -0.93     0.13               0.36          4
11    11       1.70         1.70    -0.92     0.49               0.70          4
7      7       1.86         1.86    -0.76     0.24               0.49          4
1      1       2.49         2.49    -0.14     0.25               0.50          4
8      8       2.71         2.71     0.09     0.17               0.42          4
6      6       3.03         3.03     0.41     0.18               0.42          4
12    12       3.09         3.09     0.47     0.06               0.24          4
2      2       3.27         3.27     0.65     0.68               0.82          4
9      9       4.26         4.26     1.64     0.06               0.23          4
3      3       4.80         4.80     2.18     0.28               0.53          4

The following groups are likely to be overplotted
group group.mean contrast
10    10       1.69    -0.93
11    11       1.70    -0.92
6      6       3.03     0.41
12    12       3.09     0.47

Below is a linear model summary of your input data

Call:
lm(formula = score ~ group, data = owp$data)

Residuals:
Min       1Q   Median       3Q      Max
-0.76848 -0.29639 -0.06915  0.25455  1.07933

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept)   2.4869     0.2450  10.151 4.16e-12 ***
group2        0.7816     0.3465   2.256 0.030247 *
group3        2.3158     0.3465   6.684 8.56e-08 ***
group4       -1.3234     0.3465  -3.820 0.000508 ***
group5       -1.0935     0.3465  -3.156 0.003226 **
group6        0.5421     0.3465   1.565 0.126414
group7       -0.6242     0.3465  -1.801 0.080010 .
group8        0.2270     0.3465   0.655 0.516468
group9        1.7781     0.3465   5.132 1.00e-05 ***
group10      -0.7972     0.3465  -2.301 0.027299 *
group11      -0.7853     0.3465  -2.267 0.029517 *
group12       0.6049     0.3465   1.746 0.089344 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.49 on 36 degrees of freedom
Multiple R-squared:  0.8681,	Adjusted R-squared:  0.8277
F-statistic: 21.53 on 11 and 36 DF,  p-value: 1.289e-12

By-group summary statistics for your input data (ordered by group means)
   group group.mean trimmed.mean contrast variance standard.deviation group.size
4      4       1.11         1.11    -1.38     0.03               0.18          4
5      5       1.36         1.36    -1.13     0.28               0.53          4
10    10       1.67         1.67    -0.82     0.10               0.31          4
11    11       1.69         1.69    -0.80     0.50               0.71          4
7      7       1.82         1.82    -0.67     0.24               0.49          4
1      1       2.39         2.39    -0.10     0.30               0.55          4
8      8       2.72         2.72     0.23     0.19               0.44          4
6      6       3.04         3.04     0.55     0.18               0.42          4
12    12       3.09         3.09     0.61     0.05               0.22          4
2      2       3.15         3.15     0.66     0.39               0.62          4
9      9       3.78         3.78     1.29     0.03               0.16          4
3      3       4.04         4.04     1.55     0.03               0.16          4

The following groups are likely to be overplotted
group group.mean contrast
10    10       1.67    -0.82
11    11       1.69    -0.80
6      6       3.04     0.55
12    12       3.09     0.61
2      2       3.15     0.66

Below is a linear model summary of your input data

Call:
lm(formula = score ~ group, data = owp$data)

Residuals:
Min      1Q  Median      3Q     Max
-0.7375 -0.2900 -0.0375  0.2606  0.9225

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept)   2.3925     0.2195  10.899 5.93e-13 ***
group2        0.7550     0.3105   2.432 0.020121 *
group3        1.6425     0.3105   5.291 6.16e-06 ***
group4       -1.2825     0.3105  -4.131 0.000205 ***
group5       -1.0300     0.3105  -3.318 0.002083 **
group6        0.6475     0.3105   2.086 0.044157 *
group7       -0.5775     0.3105  -1.860 0.071043 .
group8        0.3250     0.3105   1.047 0.302141
group9        1.3900     0.3105   4.477 7.33e-05 ***
group10      -0.7225     0.3105  -2.327 0.025691 *
group11      -0.7050     0.3105  -2.271 0.029235 *
group12       0.7025     0.3105   2.263 0.029775 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.439 on 36 degrees of freedom
Multiple R-squared:  0.8542,	Adjusted R-squared:  0.8097
F-statistic: 19.18 on 11 and 36 DF,  p-value: 7.233e-12

chickwts               package:datasets                R Documentation

Chicken Weights by Feed Type

Description:

An experiment was conducted to measure and compare the
effectiveness of various feed supplements on the growth rate of
chickens.

Usage:

chickwts

Format:

A data frame with 71 observations on the following 2 variables.

'weight' a numeric variable giving the chick weight.

'feed' a factor giving the feed type.

Details:

Newly hatched chicks were randomly allocated into six groups, and
each group was given a different feed supplement.  Their weights
in grams after six weeks are given along with feed types.

Source:

Anonymous (1948) _Biometrika_, *35*, 214.

References:

McNeil, D. R. (1977) _Interactive Data Analysis_.  New York:
Wiley.

Examples:

require(stats); require(graphics)
boxplot(weight ~ feed, data = chickwts, col = "lightgray",
varwidth = TRUE, notch = TRUE, main = "chickwt data",
ylab = "Weight at six weeks (gm)")
anova(fm1 <- lm(weight ~ feed, data = chickwts))
opar <- par(mfrow = c(2, 2), oma = c(0, 0, 1.1, 0),
mar = c(4.1, 4.1, 2.1, 1.1))
plot(fm1)
par(opar)

By-group summary statistics for your input data (ordered by group means)
      group group.mean trimmed.mean contrast variance standard.deviation group.size
2 horsebean     160.20       154.33  -101.11  1491.96              38.63         10
3   linseed     218.75       219.50   -42.56  2728.57              52.24         12
5   soybean     246.43       246.50   -14.88  2929.96              54.13         14
4  meatmeal     276.91       280.43    15.60  4212.09              64.90         11
1    casein     323.58       331.38    62.27  4151.72              64.43         12
6 sunflower     328.92       326.38    67.61  2384.99              48.84         12

Below is a linear model summary of your input data

Call:
lm(formula = score ~ group, data = owp$data)

Residuals:
Min       1Q   Median       3Q      Max
-123.909  -34.413    1.571   38.170  103.091

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept)     323.583     15.834  20.436  < 2e-16 ***
grouphorsebean -163.383     23.485  -6.957 2.07e-09 ***
grouplinseed   -104.833     22.393  -4.682 1.49e-05 ***
groupmeatmeal   -46.674     22.896  -2.039 0.045567 *
groupsoybean    -77.155     21.578  -3.576 0.000665 ***
groupsunflower    5.333     22.393   0.238 0.812495
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 54.85 on 65 degrees of freedom
Multiple R-squared:  0.5417,	Adjusted R-squared:  0.5064
F-statistic: 15.36 on 5 and 65 DF,  p-value: 5.936e-10
```

granovaGG documentation built on May 2, 2019, 2:09 a.m.