knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.align='center'
  )

This vignette takes you through the main functions in mvabund to help you get started! We recommend reading the manuscript associated with the package and taking a look at Other Resources in our README.

First things first

Let's load the package and get our hands on the Tasmania data set to look at the effects of disturbance treatment on invertebrate abundances. Note that the Tasmania data set is a list object. We will only look at the copepods data frame for this walk-through. It can be accessed with Tasmania$copepods, or by calling attach(), which makes the contents of Tasmania searchable by name.

library(mvabund)

data(Tasmania) 
attach(Tasmania)

skimr::skim(copepods) # Great function to get an overview of the data
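If you would rather not attach() the list, here is a minimal sketch of the $ route mentioned above. The object name copepods_df is ours, chosen only for illustration.

```r
library(mvabund)

data(Tasmania)

# Look at the top-level components of the list without attaching it
str(Tasmania, max.level = 1)

# Pull out the copepod abundances directly (copepods_df is a hypothetical name)
copepods_df <- Tasmania$copepods
dim(copepods_df)   # rows are samples, columns are taxa
```

The rest of this vignette assumes Tasmania has been attached, so copepods, treatment, block and tr.block can be referred to by name.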

Visualise the multivariate data

We first need to turn our data into an mvabund object so that the functions in this package can work with the data.

copepod_abund <- mvabund(copepods)
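A quick way to confirm the conversion worked is to check the class of the new object. This is a small sketch that rebuilds the object from Tasmania$copepods so that it runs on its own.

```r
library(mvabund)

data(Tasmania)
copepod_abund <- mvabund(Tasmania$copepods)

# is.mvabund() returns TRUE for objects created by mvabund()
is.mvabund(copepod_abund)

# The dimensions are unchanged: one row per sample, one column per taxon
dim(copepod_abund)
```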

Now let's take a look at the abundance of each species across our treatment sites (Disturbed vs. Undisturbed). Observations were collected using a spatially blocked design where researchers took four samples at each block (two per treatment). We can set the colour (col) of the points to represent the four sampling blocks.

plot(copepod_abund~treatment, col = block)

Fitting Predictive Models

It was hypothesised that the abundance of Ameira and Ectinosoma was reduced in Disturbed sites, whereas the abundance of Mictyricola may have increased. Let's test this hypothesis using the manyglm() function. This function fits a separate generalised linear model to each species, using the same set of predictors. We specified family = "negative.binomial" because count data are often overdispersed and tend to be well described by a negative binomial distribution. Other distributions are available too! See ?manyglm

cope.nb <- manyglm(copepods ~ treatment*block, family = "negative.binomial")
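To illustrate swapping in another distribution, here is a sketch that fits the same model with family = "poisson". The object names abund and cope.pois are ours, and we are not suggesting Poisson is a better choice for these data. It is only a comparison point for the residual checks that follow.

```r
library(mvabund)

data(Tasmania)
abund     <- mvabund(Tasmania$copepods)
treatment <- Tasmania$treatment
block     <- Tasmania$block

# Same predictors as cope.nb, but assuming variance equal to the mean
cope.pois <- manyglm(abund ~ treatment * block, family = "poisson")

# Residual plot; a fan shape here would hint at overdispersion
plot(cope.pois)
```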

Checking Model Assumptions

Before we look at the model output, we should check the model residuals. We want to see little pattern in the residuals, as this suggests that our choice of a negative binomial distribution is appropriate.

plot(cope.nb)

Now, let's check the mean-variance relationship. We want to see whether the mean-variance relationship of our data matches that of a negative binomial distribution, which is quadratic rather than linear. The meanvar.plot() function plots the sample variance against the sample mean for each species within each level of the factor tr.block. A quadratic relationship seems appropriate for our sample means and variances.

meanvar.plot(copepods~tr.block, col = treatment)
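If you want to see what goes into a plot like this, the sketch below computes the group means and variances by hand with base R. It is our own approximation for illustration, not the internals of meanvar.plot(), and the names cope, grp, grp_means and grp_vars are hypothetical.

```r
library(mvabund)

data(Tasmania)
cope <- as.matrix(Tasmania$copepods)
grp  <- Tasmania$tr.block

# Mean and variance of every taxon within every treatment-by-block group
grp_means <- apply(cope, 2, function(y) tapply(y, grp, mean))
grp_vars  <- apply(cope, 2, function(y) tapply(y, grp, var))

# Zeros cannot be shown on log axes, so drop them before plotting
keep <- grp_means > 0 & grp_vars > 0
plot(grp_means[keep], grp_vars[keep], log = "xy",
     xlab = "Mean (log scale)", ylab = "Variance (log scale)")

# Reference line for variance = mean, the Poisson assumption
abline(0, 1, lty = 2)
```

Points that climb away from the dashed line as the mean increases are consistent with variance growing faster than the mean.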

Hypothesis Testing

To test whether treatment and block had an effect on the abundances of copepods, we can use the anova() function. This function returns an analysis of deviance table which tests the significance of each model term. Setting p.uni = "adjusted" also reports a p-value for each species, adjusted for multiple testing across species.

anova(cope.nb, p.uni = "adjusted")

The test statistics in this table are by default calculated by summing the change in deviance across all responses. You could use a different type of test statistic by changing the test and cor.type arguments. We can see that there is a significant effect of the treatment factor, meaning that treatment has a significant multiplicative effect on mean abundance. The interaction between block and treatment is not significant, which gives no evidence that the multiplicative treatment effect differs across blocks.
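As a sketch of changing the test statistic, the call below asks for a Wald statistic with a shrunken estimate of the correlation between taxa. The choice of "wald" and "shrink" is ours, for illustration only, and nBoot is lowered just to keep the run short. See ?anova.manyglm for the full set of options.

```r
library(mvabund)

data(Tasmania)
abund     <- mvabund(Tasmania$copepods)
treatment <- Tasmania$treatment
block     <- Tasmania$block

cope.nb <- manyglm(abund ~ treatment * block, family = "negative.binomial")

# Wald statistic that accounts for correlation between taxa via shrinkage;
# fewer resamples (nBoot) means faster but coarser p-values
aov_wald <- anova(cope.nb, test = "wald", cor.type = "shrink", nBoot = 199)
aov_wald
```

Because p-values come from resampling, they will vary slightly from run to run.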

If you do not have a specific hypothesis in mind that you want to test, and are instead interested in which model terms are statistically significant, then the summary() function will come in handy. However, its results aren’t quite as trustworthy as those from anova(). The reason is that summary() takes re-samples under the alternative hypothesis, where there is a greater chance of fitted values being zero, especially for rarer taxa (e.g. if there is a treatment combination in which a taxon is never present). Abundances don’t re-sample well if their predicted mean is zero.

summary(cope.nb) 

If your goal is to obtain predicted values from the model, you can use the predict() function. Note that type = "response" will produce values on the scale of the response variable (i.e. counts).

predict(cope.nb, type = "response") 
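The output is large, so it can help to store it and inspect a corner of it. The sketch below refits the model so that it runs on its own; pred_link and pred_resp are hypothetical names.

```r
library(mvabund)

data(Tasmania)
abund     <- mvabund(Tasmania$copepods)
treatment <- Tasmania$treatment
block     <- Tasmania$block

cope.nb <- manyglm(abund ~ treatment * block, family = "negative.binomial")

# Without a type argument, predictions are on the scale of the linear predictor
pred_link <- predict(cope.nb)

# type = "response" gives predicted mean counts instead
pred_resp <- predict(cope.nb, type = "response")

# One row per sample, one column per taxon; show the first few of each
pred_resp[1:4, 1:3]
```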


aliceyiwang/mvabund documentation built on March 13, 2024, 1:58 a.m.