Fitting Many Univariate Models to Multivariate Abundance Data
Description
manyany is used to fit many univariate models (GLMs, GAMs, or otherwise) to high-dimensional data, such as multivariate abundance data in ecology. This is the base model-fitting function; see plot.manyany for assumption checking and anova.manyany for significance testing.
Usage
Arguments
fn 
a character string giving the name of the function for the univariate model to be applied. e.g. "glm". 
yMat 
a matrix of response variables, e.g. multivariate abundances. 
formula 
an object of class 
data 
a data frame containing predictor variables (a matrix is also acceptable). This is REQUIRED and needs to have more than one variable in it (even if only one is used in the model). 
family 
a description of the error distribution function to be used
in the model, either as a character string, a 
composition 
logical. FALSE (default) fits a separate model to each species. TRUE fits a single model to all variables, including site as a row effect, such that all other terms model relative abundance (compositional effects). 
block 
a factor specifying the sampling level to be resampled. Default is resampling rows (if composition=TRUE in the manyany command, this means resampling rows of data as originally sent to manyany). 
get.what 
what to return from each model fit: "details" (default) includes predicted values and residuals in the output, "models" also returns the fitted objects for each model, "none" returns just the log-likelihood (mostly for internal use). 
var.power 
the power parameter, if using the Tweedie distribution. 
na.action 
Default set to 
... 
further arguments passed to the fitting function. 
x 
an object of class "manyany", usually a result of a call to manyany. 
digits 
how many digits to include in the printed anova table. 
Details
manyany can be used to fit the specified model type to many variables simultaneously, a generalisation of manyglm. It should be able to handle any fixed-effects modelling function that has predict and logLik methods and that accepts a family argument, provided that the family is on our list (currently 'gaussian', 'poisson', 'binomial', 'negative.binomial' and 'tweedie'; models for ordinal data are also accepted if using the clm function from the ordinal package). Models for manyany are specified symbolically; see, for example, the Details sections of lm and formula.
Unlike manyglm, this function accepts family functions as arguments instead of just character strings, giving greater flexibility. For example, you can use family=binomial(link="cloglog") to fit a model using the complementary log-log link, rather than being restricted to the default logit link.
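To make the difference between a family string and a family object concrete, here is a minimal sketch that uses base R's glm directly rather than manyany. The simulated data and the names x, Y, fam and fits are made up for illustration. It fits one GLM per response column, which is the general idea behind fitting many univariate models, though not necessarily how manyany does it internally.

```r
# Simulated presence/absence data: 40 sites, 3 hypothetical species
set.seed(1)
n <- 40
x <- rnorm(n)
Y <- sapply(c(-0.5, 0, 0.5), function(b0) {
  rbinom(n, 1, 1 - exp(-exp(b0 + 0.8 * x)))  # inverse cloglog link
})
colnames(Y) <- c("sp1", "sp2", "sp3")

# A family object carries its own link function;
# the bare string "binomial" would imply the default logit link
fam <- binomial(link = "cloglog")

# Fit one univariate GLM per response column
fits <- lapply(colnames(Y), function(sp) glm(Y[, sp] ~ x, family = fam))
names(fits) <- colnames(Y)

# Collect the log-likelihood of each fit
logL <- sapply(fits, function(m) as.numeric(logLik(m)))
print(fam$link)
print(round(logL, 2))
```

Passing the family as an object means the link travels with it, so the same fitting loop works unchanged for any supported link.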
A data argument is required, and it must be a data frame containing more than one variable. It need not contain the matrix of response variables, which is specified separately as yMat.
Setting composition=TRUE enables compositional analyses, where predictors are used to model relative abundance rather than mean abundance. This is achieved by vectorising the response matrix and fitting a single model across all variables, with a row effect to account for differences in relative abundance across rows. The default composition=FALSE just fits a separate model for each variable.
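The vectorise-and-add-a-row-effect idea can be sketched in base R as follows. This is an illustration under assumptions, not manyany's internal code: the simulated counts, the long-format column names (y, site, species, x) and the choice of a Poisson glm are all hypothetical.

```r
# Simulated counts: 20 sites, 3 hypothetical species, one predictor
set.seed(2)
n <- 20
p <- 3
x <- rnorm(n)
Y <- matrix(rpois(n * p, lambda = 5), n, p,
            dimnames = list(NULL, c("sp1", "sp2", "sp3")))

# Vectorise the response matrix column by column into long format
long <- data.frame(
  y       = as.vector(Y),
  site    = factor(rep(seq_len(n), times = p)),
  species = factor(rep(colnames(Y), each = n)),
  x       = rep(x, times = p)
)

# site is the row effect (it absorbs total abundance in each row);
# species:x then describes how relative abundance changes with x.
# One species:x coefficient is aliased with site and is reported as NA,
# reflecting that only relative effects are identifiable.
fit <- glm(y ~ site + species + species:x, family = poisson, data = long)
print(coef(fit)[grep(":x", names(coef(fit)))])
```

Because x is constant within a site, its overall effect cannot be separated from the site term; only the between-species differences in slope are estimated, which is what makes the analysis compositional.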
Value
manyany returns an object inheriting from "manyany".
The function anova (i.e. anova.manyany) will produce a significance test comparing two manyany objects.
Currently there is no summary resampling function for objects of this class.
The generic accessor functions fitted.values, residuals, logLik, AIC and plot can be used to extract various useful features of the value returned by manyany.
An object of class "manyany" is a list containing at least the following components:
logL 
a vector of log-likelihood terms for each response variable in the fitted model. 
fitted.values 
the matrix of fitted mean values, obtained by transforming the linear predictors by the inverse of the link function. 
residuals 
the matrix of probability integral transform (PIT) residuals. If the fitted model is a good fit, these will be approximately standard uniformly distributed. 
linear.predictor 
the linear fit on link scale. But for ordinal models fitted using 
family 
a vector of 
call 
the matched call. 
model 
the 
terms 
a list of 
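For readers unfamiliar with the PIT residuals stored in the residuals component, the sketch below shows one common construction for count data, the randomized probability integral transform, using a base R Poisson glm. Whether manyany computes its residuals in exactly this way is an assumption; the simulated data and the names lower, upper, u and z are illustrative only.

```r
# Simulated Poisson counts with one predictor
set.seed(3)
n <- 200
x <- rnorm(n)
y <- rpois(n, exp(0.5 + 0.3 * x))
fit <- glm(y ~ x, family = poisson)
mu <- fitted(fit)

# Randomized PIT: for discrete data, draw uniformly between F(y - 1) and F(y)
lower <- ppois(y - 1, mu)
upper <- ppois(y, mu)
u <- lower + runif(n) * (upper - lower)

# Under a well-fitting model, u is approximately standard uniform;
# mapping to the normal scale gives Dunn-Smyth residuals
z <- qnorm(u)
print(summary(u))
```

A histogram or quantile plot of u against the uniform distribution (or of z against the normal) is then a simple check of the distributional assumptions.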
Author(s)
David Warton <David.Warton@unsw.edu.au>.
References
Warton D. I., Wright S., and Wang, Y. (2012). Distance-based multivariate analyses confound location and dispersion effects. Methods in Ecology and Evolution, 3(1), 89-101.
See Also
anova.manyany, residuals.manyany, plot.manyany.
Examples
library(mvabund)
data(spider)
abund <- spider$abund
X <- as.matrix(spider$x)
## To fit a log-linear model assuming counts are negative binomial, via manyglm:
spidNB <- manyany("manyglm",abund,data=X,abund~X,family="negative.binomial")
logLik(spidNB) # a number of generic functions are applicable to manyany objects
## To fit a glm with complementary log-log link to presence/absence data:
PAdat = pmin(as.matrix(abund),1) # constructing presence/absence dataset
spidPA <- manyany("glm",PAdat,data=X,PAdat~X,family=binomial("cloglog"))
plot(spidPA)
# There are some wild values in there for the Pardmont variable (residuals >5 or < -8).
# The Pardmont model didn't converge, coefficients are a bit crazy:
coef(spidPA)
# Can try again using the glm2 package to fit the models, this fixes things up:
# library(glm2)
# spidPA2 <- manyany("glm",PAdat,data=X,PAdat~X,family=binomial("cloglog"),method="glm.fit2")
# plot(spidPA2) # looks much better.
## To simultaneously fit models to ordinal data using the ordinal package:
# library(ordinal)
## First construct an ordinal dataset:
# spidOrd = abund
# spidOrd[abund>1 & abund<=10]=2
# spidOrd[abund>10]=3
# for(iVar in 1:dim(spidOrd)[2])
# spidOrd[,iVar]=factor(spidOrd[,iVar])
##Now fit a model using the clm function:
# manyOrd=manyany("clm",spidOrd,abund~bare.sand+fallen.leaves,data=X)
# plot(manyOrd)
