Get started"
In ambiR: Calculate AZTI’s Marine Biotic Index

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

This is a brief introduction to AMBI index calculations with ambiR.

Structuring species observation data

Species counts (or abundance) should be organized in a data frame arranged in long format. That is, species names are in one column and species counts in another column. If the data represents several stations and/or if there are replicates, then the data should include columns with this information.

Looking at the test_data example dataset included with the ambiR package, one can see how the data should be arranged:

library(ambiR)

head(test_data)

If your data look like this, you can go directly to [Calculate AMBI]. If not, the following examples show how to reorganize your data.

Species names in columns {#data-with-species-names-in-columns}

Here is an example dataframe where there are counts for species in separate columns:

library(dplyr)
library(tidyr)

wide_data_species <- test_data %>%
  pivot_wider(names_from = "species", 
              values_from = "count",
              values_fill = 0)

head(wide_data_species)

To arrange the data in the correct form, use tidyr::pivot_longer():

# columns 1 and 2 contain station and replicate information
# so, select all columns from 3 to be pivoted 

long_data <- wide_data_species %>%
  pivot_longer(cols = 3:ncol(wide_data_species), 
               names_to = "species",
               values_to = "count")

head(long_data)

Stations and replicates in columns

Here is an example where each column contains species counts for a station and replicate. The first row of the table contains the station ID 1, 2, ... and the second row contains the replicate ID a, b, .... The first column of the table contains species names.

df <- test_data %>%
  pivot_wider(names_from = c("station","replicate"), 
              values_from = "count", values_fill = 0)

stns <- names(df)[2:ncol(df)]

stns <- stns %>% 
  sapply(strsplit,"_") 
reps <- stns  %>%  
  sapply(function(x){x[2]})
stns <- stns  %>%  
  sapply(function(x){x[1]})
n <- 1+length(stns)
df2 <- matrix(nrow=2, ncol=n) %>% 
  as.data.frame()

df2[1,2:n] <- stns
df2[2,2:n] <- reps
names(df) <- names(df2)
df <- df %>%
  mutate(across(all_of(names(df)), as.character))

df <- df2 %>%
  bind_rows(df)

names(df) <- rep("", ncol(df))

wide_data_stns <- df

head(wide_data_stns)

Note that if the if the observation data only has stations or replicates, then rearranging the data can be done in one step, as in the previous example. But in this case, the station and replicate information are in separate rows. The restructuring process is a little more complicated.

Combine station and replicate values

Before we can use a pivot function, we need to combine the station ID and replicate ID for each column into a single value. In this example, each station ID and replicate ID are joined into a single character value, with an underscore to separate them _. The underscore will be used again after pivoting the table to identify where to split the combined station/replicate values back into separate values again.

If there are station names which already contain underscores, then another suitable character should be used when joining and splitting station/replicate IDs.

sep_character <- "_"

# get the station IDs from row 1, excluding column 1 (this contains species names)
stations <- wide_data_stns[1, 2:ncol(wide_data_stns)]

# get the replicate IDs from row 2
replicates <- wide_data_stns[2, 2:ncol(wide_data_stns)]

# combine the station and replicate for each column using an underscore
station_replicate <- paste(stations, replicates, sep=sep_character)

# now we have extracted the station/replicate information, we can drop the 
# first two rows of the data frame
wide_data_stns <- wide_data_stns[3:nrow(wide_data_stns),]

# apply the station_replicates as column names
names(wide_data_stns) <- c("species", station_replicate)

head(wide_data_stns)

Transpose using the combined station/replicate variable

We can see that the column names now contain the combined station and replicate information. We are ready to transpose the data.

# column 1 contains species names
# so, select all columns from 2 to be pivoted 

long_data <- wide_data_stns %>%
  pivot_longer(cols = 2:ncol(wide_data_stns), 
               names_to = "stn_rep",
               values_to = "count")

head(long_data)

Retrieve station and replicate variables from the transposed data

Now we can split the stn_rep column into separate columns for station and replicate, We will use the underscore we introduced earlier to identify where the split should occur:

long_data <- long_data %>%
  separate_wider_delim(cols="stn_rep", 
                       delim = sep_character,
                       names=c("station", "replicate"))

head(long_data)

Now we are ready to calculate AMBI...

Calculate AMBI

We have now ensured that our species abundance/count data have the correct structure, as in the example test_data provided:

head(test_data)

Call the AMBI() function:

res <- AMBI(test_data, by="station", var_rep="replicate", 
            var_species="species", var_count="count")

Results

The AMBI() function returns a list of three dataframes:

.$AMBI
.$AMBI_rep
.$matched

.$AMBI- the main results with the AMBI index calculated for each unique combination of by variables, in our case the results are per station.

In addition to the AMBI index, the results also include the Shannon diversity index H' and the Species richness S, the three metrics which are necessary to calculate M-AMBI.

res$AMBI

.$AMBI_rep - if the observations include replicates, then the function also returns results calculated for each replicate, within each unique combination of by variables:

res$AMBI_rep

.$matched - for each observation, this dataframe shows which species in the AMBI list the observed species was matched with, if any. It also shows the AMBI species group assigned. This dataframe has the same number of rows as the input data.

head(res$matched)

For more information about the principles underlying the AMBI calculations and the grouping of species according to sensitivity to pollution, see vignette("background").

Calculate M-AMBI {#calculate-m-ambi}

Calculate M-AMBI the multivariate AMBI index, based on the three separate species diversity metrics:

AMBI index AMBI.
Shannon diversity index H
Species richness S.

All three indices required to calculate MAMBI()are included in the results returned by the AMBI() function.

Limit values

In addition to index values calculated from observed species data, the M-AMBI factor analysis requires values defining the limits for the three metrics, corresponding to the best and worst possible conditions.

See @MUXIKA200716 for more details.

The default limit values used by MAMBI() are:

limits_AMBI <- c("bad" = 6, "high" = 0)

limits_H  <- c("bad" = 0, "high" = NA)

limits_S  <- c("bad" = 0, "high" = NA)

By default, upper limit values for H and S are not provided (= NA). If they are not provided explicitly, then the maximum values found in the input data will be used. The results returned by MAMBI() show the limits used.

Status class boundaries

MAMBI() also returns the status class (Bad, Poor, Moderate, Good or High) for each M-AMBI index value. To do this, it compares the calculated M-AMBI index value with values defining the boundaries between status classes.

PB - Poor / Bad
MP - Moderate / Poor
GM - Good / Moderate
HG - High / Good

The default values for the class boundaries are:

bounds <-c("PB" = 0.2, "MP" = 0.39, "GM" = 0.53, "HG" = 0.77)

Call `MAMBI()`

We call MAMBI() using the previously generated AMBI results:

res_mambi <- MAMBI(res$AMBI, var_H = "H", var_S = "S", var_AMBI = "AMBI")

M-AMBI results

In addition to retaining the values for AMBI. H and S, the results include the following information:

x, y, z - factor scores giving coordinates in the new factor space.
MAMBI - the calculated M-AMBI index value
Status - the status class for the M-AMBI index value
EQR - the normalised EQR index

The dataframe returned contains 2 more rows than the input data. Our input data contained 3 rows (1 for each station). The results contain 5 rows. The 2 additional rows show the limit values for each of the three metrics used in the M-AMBI calculations.

res_mambi %>%
  select(station, AMBI, H, S, x, y, z, MAMBI, Status, EQR)

EQR values {#eqr-values}

As well as the status class, the function also returns a normalised index value from 0 to 1 (EQR), calculated using the bounds boundary values for the M-AMBI index. The following EQR values correspond to status class boundaries:

0.2 - Poor / Bad
0.4 - Moderate / Poor
0.6 - Good / Moderate
0.8 _ High / Good

For example, the M-AMBI value for station 3 is 0.478. This lies between the M-AMBI value corresponding to the Moderate/Poor status class boundary ("MP" = 0.39) and the M-AMBI value corresponding to the Good/Moderate status class boundary ("GM" = 0.53).

The normalised EQR value is given by linear interpolation: $$ EQR = 0.4 + (0.6 - 0.4) \frac{0.478 - 0.39}{0.53 - 0.39 } $$

References

Any scripts or data that you put into this service are public.

ambiR documentation built on Dec. 20, 2025, 1:06 a.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

ambiR
Calculate AZTI’s Marine Biotic Index

Get started"
In ambiR: Calculate AZTI’s Marine Biotic Index

Structuring species observation data

Species names in columns {#data-with-species-names-in-columns}

Stations and replicates in columns

Combine station and replicate values

Transpose using the combined station/replicate variable

Retrieve station and replicate variables from the transposed data

Calculate AMBI

Results

Calculate M-AMBI {#calculate-m-ambi}

Limit values

Status class boundaries

Call `MAMBI()`

M-AMBI results

EQR values {#eqr-values}

References

Try the ambiR package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

ambiR Calculate AZTI’s Marine Biotic Index

Get started" In ambiR: Calculate AZTI’s Marine Biotic Index

Structuring species observation data

Species names in columns {#data-with-species-names-in-columns}

Stations and replicates in columns

Combine station and replicate values

Transpose using the combined station/replicate variable

Retrieve station and replicate variables from the transposed data

Calculate AMBI

Results

Calculate M-AMBI {#calculate-m-ambi}

Limit values

Status class boundaries

Call MAMBI()

M-AMBI results

EQR values {#eqr-values}

References

Try the ambiR package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

ambiR
Calculate AZTI’s Marine Biotic Index

Get started"
In ambiR: Calculate AZTI’s Marine Biotic Index

Call `MAMBI()`