View source: R/anova.manyany.R
anova.manyany  R Documentation 
Compute an analysis of deviance table for many univariate model fits. Slowly!
## S3 method for class 'manyany' anova(object, ..., nBoot=99, p.uni="none", block=object1$block, nCores=1, bootID=NULL, replace=TRUE) ## S3 method for class 'anova.manyany' print(x, ...)
object 
of class 
... 
other generic 
nBoot 
the number of Bootstrap iterations, default is 
p.uni 
whether to calculate univariate test statistics and their Pvalues. 
block 
a factor specifying the sampling level to be resampled. Default is resampling rows (if composition=TRUE in the manyany command, this means resampling rows of data as originally sent to manyany). 
nCores 
Number of cores to use for computations (for parallel computing). 
bootID 
A userentered matrix of indices for which observations to use in which resample. Bootstrap
resamples in rows, observations in columns. When specified, overwrites 
replace 
whether to sample with or without replacement, as in the 
x 

The anova.manyany
function returns a table summarising the statistical significance
of a fitted manyany model under the alternative hypothesis (object2
) as compared
to a fit under the null hypothesis (object
). Typically the alternative model
is nested in the null although it doesn't need to be (but consider seriously if what
you are doing makes sense if they are not nested).
This function is quite computationally intensive, and a little fussy  it is an early version we hope to improve on. Feedback welcome!
This function behaves a lot like anova.manyglm
, the most conspicuous differences
being in flexibility and computation time. Since this function is based on manyany
,
it offers much greater flexibility in terms of types of models that can be fitted (most
fixed effects model with predict
and family
arguments could be accommodated).
For information on the different types of data that can be modelled using manyany, see
manyany
.
However this flexibility comes at considerable cost in terms of computation time, and the
default nBoot
has been set to 99 to reflect this (although rerunning at 999 is
recommended). Other more cosmetic differences from anova.manyglm
are that
two and only two models can be supplied as input here; adjusted univariate Pvalues
are not yet implemented; and the range of test statistics and resampling algorithms is
more limited. All test statistics constructed here are sumoflikelihood ratio statistics
as in Warton et al (2012), and the resampling method used here is the PITtrap (short
for 'probability integral transform residual bootstrap', Warton et al 2017).
To check model assumptions, use plot.manyany
.
The block
argument allows for block resampling, such that valid inferences can
be made across independent blocks of correlated sets of observations.
For example, if data have multiple rows of records for each site, e.g. multispecies
data with entries for different species on different rows, you can use your site ID
variable as the block argument to resample sites, for valid crosssite inferences despite
withinsite species correlation. Well, valid assuming sites are independent. You could
do similarly for a repeated measures design to make inferences robust to temporal autocorrelation.
Note that block
needs to be balanced, e.g. equal number of species entries for
each site (i.e. include rows for zero abundances too).
The anova.manyany
function is designed specifically for highdimensional data
(that, is when the number of variables p is not small compared to the number of observations
N). In such instances a correlation matrix is computationally intensive to estimate
and is numerically unstable, so by default the test statistic is calculated assuming
independence of variables. Note however that the resampling scheme used ensures that
the Pvalues are approximately correct even when the independence assumption is not
satisfied.
Rather than stopping after testing for multivariate effects, it is often of interest
to find out which response variables express significant effects. Univariate statistics
are required to answer this question, and these are reported if requested. Setting p.uni="unadjusted"
returns resamplingbased univariate Pvalues for all effects as well as the multivariate
Pvalues, if composition=FALSE
. There are currently no univariate Pvalue options
when composition=TRUE
(it's not entirely clear how such Pvalues should be obtained)
and if univariate P's are of interest why not rerun the model with composition=FALSE
.
A current limitation of the function is that composition
needs to be set to
the same value in each manyany object being compared  it is not currently possible
to compare models with and without a compositional term in them.
stat 
the observed value of the test statistic. 
p 
the Pvalue as estimated from 
stat.i 
the values of the test statistic in each of the 
p.i 
the Pvalue in each of the 
p.uni 
the 
If p.uni="unadjusted"
the output list also contains
uni.test 
a table showing the test statistics of the univariate tests. 
uni.p 
a table showing the pvalues of the univariate tests. 
statj.i 
a matrix of values of the univariate test statistics in each of the 
The comparison between two or more models by anova.manyglm
will only be valid if they are fitted to the same dataset. This may be a problem if there are missing values and R's default of na.action = na.omit
is used.
David Warton <David.Warton@unsw.edu.au>.
Warton D. I., Wright S., and Wang, Y. (2012). Distancebased multivariate analyses confound location and dispersion effects. Methods in Ecology and Evolution, 3(1), 89101.
Warton D. I., Thibaut L., Wang Y. A. (2017). The PITtrap  A "modelfree" bootstrap procedure for inference about regression models with discrete, multivariate responses. PLoS One, 12(7), e0181790.
manyany
, anova.manyglm
.
## Try fitting Tikus Islands data with Tweedie models with power parameter 1.5, ## to test for compositional effect: data(tikus) coral < as.matrix(tikus$abund[1:20,]) sumSpp = apply(coral>0,2,sum) coral < coral[,sumSpp>6] ## cutting to just species with seven(!) or more presences to cut ## computation time. Maybe rerun with less (e.g. 4 or more presences) if curious and patient. coralX < tikus$x[1:20,] require(tweedie) require(statmod) ftTimeRep < manyany(coral ~ time+rep, "glm", data=coralX, family=tweedie(var.power=1.5, link.power=0), var.power=1.5, composition=TRUE) ftRep < manyany(coral ~ rep, "glm", data=coralX, family=tweedie(var.power=1.5, link.power=0), var.power=1.5, composition=TRUE) anova(ftRep,ftTimeRep,nBoot=9) #this takes a few seconds to run even for just 9 resamples ## This should be rerun for nBoot=999, which could a few minutes... ## Not run: library(ordinal) ## First construct an ordinal dataset: ## Not run: spidOrd = spider$abund ## Not run: spidOrd[spider$abund>1 & spider$abund<=10]=2 ## Not run: spidOrd[spider$abund>10]=3 ## Now fit a model using the clm function: ## Not run: manyOrd=manyany(spidOrd~bare.sand+fallen.leaves,"clm",data=spider$x) ## Test to see if fallen.leaves needs to be there: ## Not run: manyOrd0=manyany(spidOrd~bare.sand,"clm",data=spider$x) ## Not run: anova(manyOrd0,manyOrd,nBoot=19)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.