
View source: R/onewayanova.R


Computes a one-way analysis of variance with post hoc tests.


  subset = NULL,
  weights = NULL,
  compare = "Pairwise",
  correction = "Tukey Range",
  alternative = "Two-sided", = FALSE,
  missing = "Exclude cases with missing data",
  show.labels = TRUE, = NULL, = NULL,
  p.cutoff = 0.05,
  seed = 1223,
  return.all = FALSE,



The outcome variable.


The factor representing the groups.


An optional vector specifying a subset of observations to be used in the fitting process, or the name of a variable in data. It may not be an expression. subset may not


An optional vector of sampling weights, or the name of a variable in data. It may not be an expression.


One of "To mean", "Pairwise", "To first" (which implements Dunnett's C when combined with correction = "Tukey Range"), or "All".
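As a rough illustration of how the choice of comparisons changes the size of the family being tested, the sketch below enumerates the contrasts for the "Pairwise" and "To first" options. The group labels and object names are hypothetical, and this is not the package's internal code.

```r
# Hypothetical group labels; not taken from the package.
groups <- c("A", "B", "C", "D")

# "Pairwise": every unordered pair of groups.
pairwise <- t(combn(groups, 2))

# "To first": every other group against the first (reference) group.
to.first <- cbind(groups[1], groups[-1])

nrow(pairwise)  # number of pairwise comparisons
nrow(to.first)  # number of comparisons against the first group
```

With four groups, "Pairwise" produces six comparisons and "To first" produces three, so any multiple comparison adjustment has fewer tests to account for in the second case.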


The multiple comparison adjustment method: "Tukey Range", "None", "False Discovery Rate", "Benjamini & Yekutieli", "Bonferroni", "Free Combinations" (Westfall et al. 1999), "Hochberg", "Holm", "Hommel", "Single-step" (Bretz et al. 2011), "Shaffer", or "Westfall".
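Several of these adjustments are also available in base R through p.adjust(), which can help build intuition for how they differ. The sketch below uses hypothetical p-values and base R's own method names. It assumes that "False Discovery Rate" corresponds to Benjamini and Hochberg (1995), and it does not cover the methods that depend on the correlation between tests ("Tukey Range", "Single-step", "Free Combinations", "Shaffer", "Westfall").

```r
# Hypothetical unadjusted p-values from six pairwise comparisons.
p <- c(0.001, 0.008, 0.020, 0.041, 0.120, 0.450)

# p.adjust() is in the base 'stats' package. The method names below are
# base R's, not the labels used by the 'correction' argument.
adjusted <- sapply(
  c(Bonferroni = "bonferroni", Holm = "holm", Hochberg = "hochberg",
    Hommel = "hommel", FDR = "BH", BY = "BY"),
  function(m) p.adjust(p, method = m)
)

# One row per comparison, one column per adjustment method.
round(adjusted, 3)
```

Holm is never more conservative than Bonferroni, and Hochberg is never more conservative than Holm, which the columns of the resulting matrix make easy to check.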


The alternative hypothesis: "Two-sided", "Greater", or "Less". The main application of this is when compare is set to "To first" (e.g., if testing a new product, where the purpose is to work out if the new product is superior to an existing product, "Greater" would be chosen).

If TRUE, computes standard errors that are robust to violations of the assumption of constant variance for linear and Poisson models, using the HC3 modification of White's (1980) estimator (Long and Ervin, 2000). This parameter is ignored if weights are applied (as weights already employ a sandwich estimator). Other options are FALSE and "FALSE"No, which do the same thing, and "hc0", "hc1", "hc2", "hc4".
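To make the HC3 estimator concrete, the following is a minimal base R sketch of the sandwich formula from Long and Ervin (2000) applied to a one-way linear model. The data and object names are hypothetical, and the package may compute these standard errors through a different route.

```r
# Hypothetical data: three groups with unequal variances.
set.seed(1)
g <- factor(rep(c("A", "B", "C"), times = c(10, 15, 20)))
y <- rnorm(length(g), mean = c(0, 0.5, 1)[g], sd = c(1, 2, 4)[g])

fit <- lm(y ~ g)
X <- model.matrix(fit)  # design matrix
e <- residuals(fit)     # ordinary residuals
h <- hatvalues(fit)     # leverages

# HC3 sandwich: (X'X)^-1 X' diag(e^2 / (1 - h)^2) X (X'X)^-1
bread <- solve(crossprod(X))
meat <- crossprod(X * (e / (1 - h)))
vcov.hc3 <- bread %*% meat %*% bread

# Classical versus HC3 standard errors for each coefficient.
cbind(classical = sqrt(diag(vcov(fit))), hc3 = sqrt(diag(vcov.hc3)))
```

The classical standard errors pool the residual variance across groups, while the HC3 version lets each observation contribute its own squared residual, inflated by its leverage.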


How missing data is to be treated in the ANOVA. Options: "Error if missing data", "Exclude cases with missing data", and "Imputation (replace missing values with estimates)".


Shows the variable labels, as opposed to the names, in the outputs, where a variable's label is an attribute (e.g., attr(foo, "label")).
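The label attribute mentioned above can be set and read with base R. In this small sketch the helper display.name() is hypothetical and only illustrates falling back to a name when no label is present; it is not a function from the package.

```r
# Hypothetical variable; 'foo' mirrors the name used in the description above.
foo <- c(3.1, 4.7, 5.2)
attr(foo, "label") <- "Satisfaction score"

# Hypothetical helper: use the label if there is one, otherwise the name.
display.name <- function(x, name) {
  label <- attr(x, "label", exact = TRUE)
  if (is.null(label)) name else label
}

display.name(foo, "foo")         # uses the label attribute
display.name(c(1, 2, 3), "bar")  # no label, so falls back to the name
```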

The name of the outcome variable. Only used when show.labels is FALSE. Defaults to the actual variable name, which does not work so well if the function is being called by another function.

The name of the predictor variable. Only used when show.labels is FALSE. Defaults to the actual variable name, which does not work so well if the function is being called by another function.


The alpha level to be used in testing.


The random number seed used when evaluating the multivariate t-distribution.


If TRUE, returns all the internal computations in the output object. If FALSE, returns just the information required to print the output.


Other parameters to be passed to wrapped functions.


When 'Tukey Range' is selected, p-values are computed using t-tests, with a correction for the family-wise error rate such that the p-values are correct for the largest range of values being compared (i.e., the difference between the smallest and largest means). This is a single-step test. The method of calculation is valid for both balanced and unbalanced samples (Bretz et al. 2011), and consequently the results for unbalanced samples may differ from those that appear in most software and books (which instead employ an approximation when the samples are unbalanced).
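The link between a pairwise t-statistic and a range-based p-value can be sketched in base R for a balanced sample, where the classical studentized range calculation applies exactly. The data and object names below are hypothetical, and the calculation shown is base R's ptukey() and TukeyHSD(), not the package's own method, which may give different results for unbalanced samples as noted above.

```r
# Hypothetical balanced data: three groups of eight observations.
set.seed(1223)
g <- factor(rep(c("A", "B", "C"), each = 8))
y <- rnorm(24, mean = c(10, 11, 13)[g])

fit <- aov(y ~ g)
means <- tapply(y, g, mean)
n <- tapply(y, g, length)
mse <- deviance(fit) / df.residual(fit)  # pooled residual variance

# t-statistic for B versus A, using the pooled variance.
t.stat <- (means["B"] - means["A"]) / sqrt(mse * (1 / n["B"] + 1 / n["A"]))

# Studentized range p-value: the range statistic is |t| * sqrt(2).
p.tukey <- ptukey(abs(t.stat) * sqrt(2), nmeans = 3,
                  df = df.residual(fit), lower.tail = FALSE)

p.tukey
TukeyHSD(fit)$g["B-A", "p adj"]  # base R's value for the same comparison
```

The two printed values agree, which shows that the adjustment amounts to referring an ordinary t-statistic to the distribution of the largest standardized difference among the group means.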

When missing = "Imputation (replace missing values with estimates)", all selected outcome and predictor variables are included in the imputation, along with all, excluding cases that are excluded via subset or have invalid weights, but including cases with missing values of the outcome variable. Then, cases with missing values in the outcome variable are excluded from the analysis (von Hippel 2007). See Imputation.


Bretz, Frank, Torsten Hothorn and Peter Westfall (2011), Multiple Comparisons Using R, CRC Press, Boca Raton.

Benjamini, Y., and Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society Series B 57, 289-300.

Benjamini, Y., and Yekutieli, D. (2001). The control of the false discovery rate in multiple testing under dependency. Annals of Statistics 29, 1165-1188.

Holm, S. (1979). A simple sequentially rejective multiple test procedure. Scandinavian Journal of Statistics 6, 65-70.

Hochberg, Y. (1988). A sharper Bonferroni procedure for multiple tests of significance. Biometrika 75, 800-803.

Hommel, G. (1988). A stagewise rejective multiple test procedure based on a modified Bonferroni test. Biometrika 75, 383-386.

Hothorn, Torsten, Frank Bretz and Peter Westfall (2008), Simultaneous Inference in General Parametric Models. Biometrical Journal, 50(3), 346-363.

Long, J. S. and Ervin, L. H. (2000). Using heteroscedasticity consistent standard errors in the linear regression model. The American Statistician, 54(3): 217-224.

Shaffer, Juliet P. (1986), Modified sequentially rejective multiple test procedures. Journal of the American Statistical Association, 81, 826-831.

Shaffer, Juliet P. (1995). Multiple hypothesis testing. Annual Review of Psychology 46, 561-576.

Sarkar, S. (1998). Some probability inequalities for ordered MTP2 random variables: a proof of Simes conjecture. Annals of Statistics 26, 494-504.

Sarkar, S., and Chang, C. K. (1997). Simes' method for multiple hypothesis testing with positively dependent test statistics. Journal of the American Statistical Association 92, 1601-1608.

Tukey, John (1949). "Comparing Individual Means in the Analysis of Variance". Biometrics. 5 (2): 99-114.

Peter H. Westfall (1997), Multiple testing of general contrasts using logical constraints and correlations. Journal of the American Statistical Association, 92, 299-306.

P. H. Westfall, R. D. Tobias, D. Rom, R. D. Wolfinger, Y. Hochberg (1999). Multiple Comparisons and Multiple Tests Using the SAS System. Cary, NC: SAS Institute Inc.

von Hippel, Paul T. 2007. "Regression With Missing Y's: An Improved Strategy for Analyzing Multiply Imputed Data." Sociological Methodology 37:83-117.

Wright, S. P. (1992). Adjusted P-values for simultaneous inference. Biometrics 48, 1005-1013.

White, H. (1980), A heteroskedastic-consistent covariance matrix estimator and a direct test of heteroskedasticity. Econometrica, 48, 817-838.

Displayr/flipAnalysisOfVariance documentation built on Aug. 11, 2021, 12:58 a.m.