optim_species: Optimisation methods for plot selection using different...

View source: R/species_optimisation.R

optim_species    R Documentation

Optimisation methods for plot selection using different diversity indices in order to maximise species accumulation


This function applies different optimisation methods to select a subset of plots that maximises species accumulation. The methods differ in the biodiversity index used to rank plots, as chosen by the user. The function operates under the 'maximum coverage problem' framework, which seeks the set of plots that represents the greatest number of species for a fixed number of plots. The user specifies how many plots to select, and each optimiser returns the plots that capture the most species for that number.


  optim_species(speciesVsitesMatrix, n.plt=250, richness=TRUE, RRR=TRUE,
    CWE=TRUE, shannon=TRUE, simpson=TRUE, simpson_beta=TRUE,
    start="fixed", plot_name=NULL, frequent=TRUE, random=FALSE,
    iterations=10, plot=TRUE, verbose=TRUE)



speciesVsitesMatrix
Data frame or matrix of species presence/absence or abundance, where rows are sites and columns are species. See Details for more information on formatting.

n.plt
Number of plots, n, in the subset to be optimised.

richness
Logical. Whether to use species richness as an optimiser. Defaults to TRUE.

RRR
Logical. Whether to use range rarity richness as an optimiser. Defaults to TRUE.

CWE
Logical. Whether to use Corrected Weighted Endemism as an optimiser. Defaults to TRUE.

shannon
Logical. Whether to use the Shannon-Wiener diversity index as an optimiser. Defaults to TRUE.

simpson
Logical. Whether to use the Simpson diversity index as an optimiser. Defaults to TRUE.

simpson_beta
Logical. Whether to use the Simpson dissimilarity as an optimiser. Defaults to TRUE.

start
Condition under which the first plot for the simpson_beta optimiser is selected. The default, "fixed", uses the most speciose plot as the start seed. See Details for additional options.

plot_name
Specific plot at which to start the optimisation when start = "defined". Defaults to NULL. If rownames are provided, plot_name should be a character string that matches the relevant rowname in the input species matrix. If there are no rownames, plot_name can be numeric and specifies the row to use as the start seed.

frequent
Logical. See Details. Defaults to TRUE.

iterations
Number of random seed replications for simpson_beta if frequent=TRUE, and number of random replicates if random=TRUE.

random
Logical. Whether to generate a set of random species accumulations. Defaults to FALSE.

plot
Logical. Whether to immediately plot the species accumulation curves for all the optimisers.

verbose
Logical. Whether to print progress of iterations to the console.


The input is a species versus sites matrix or data frame. If a data frame, the first column can list the site names. Site names must contain text; they cannot be numeric. The required data type depends on the optimiser: richness, RRR, CWE and simpson_beta require presence/absence data, whereas shannon and simpson require abundance data. If the input contains abundances, the function automatically generates binary presence/absence data when required. TERN plot species occurrence matrices generated by the ausplotsR function species_table can be passed directly to optim_species.
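The conversion from abundance to presence/absence mentioned above can be sketched in base R. The toy matrix abund and its site and species names are invented for illustration, and this is not necessarily how the function performs the conversion internally.

```r
# Toy abundance matrix (hypothetical data): rows are sites, columns are species
abund <- matrix(c(5, 0, 2,
                  0, 3, 0,
                  1, 1, 0),
                nrow = 3, byrow = TRUE,
                dimnames = list(c("siteA", "siteB", "siteC"),
                                c("sp1", "sp2", "sp3")))

# Any abundance above zero counts as a presence (1); zero stays an absence (0)
pa <- (abund > 0) * 1
print(pa)
```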

start refers to the first plot used as the starting seed by the simpson_beta optimiser. If "fixed", the plot with the greatest species richness is chosen as the start seed. If "random", a random plot is selected. If "defined", the user assigns a specific starting plot, in which case the plot_name argument must be provided.

plot_name must be provided when start = "defined" for the simpson_beta optimiser. If the input matrix or dataframe contains site names, then plot_name must match the rowname of the desired site to be used as the starting seed. If the input matrix does not include site names, then plot_name should specify the row number to be used as the starting seed.

richness refers to Species richness, which is the count of the number of species present in a given site. It is best used when the goal is to identify biodiversity hotspots.
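As a minimal sketch, species richness per site is a row sum of the presence/absence matrix, and sites can then be ranked by it. The matrix pa below is invented for illustration.

```r
# Toy presence/absence matrix (hypothetical data): rows are sites, columns are species
pa <- matrix(c(1, 0, 1, 1,
               0, 1, 0, 0,
               1, 1, 0, 0),
             nrow = 3, byrow = TRUE,
             dimnames = list(c("siteA", "siteB", "siteC"),
                             paste0("sp", 1:4)))

# Richness is the number of species present at each site
richness <- rowSums(pa)

# Rank sites from most to least speciose
ranked <- names(sort(richness, decreasing = TRUE))
print(ranked)
```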

RRR refers to range rarity richness, a rarity-weighted richness in which each species is weighted by the inverse of the number of sites in which it occurs, and the weights of the species present at a site are summed. It is best used when the goal is to identify areas of high biodiversity and biological uniqueness.
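A minimal base R sketch of this weighting, assuming the site score is the sum of inverse range sizes over the species present. The matrix pa is invented, and every species in it occurs in at least one site, so no division by zero arises.

```r
# Toy presence/absence matrix (hypothetical data): rows are sites, columns are species
pa <- matrix(c(1, 0, 1, 1,
               0, 1, 0, 0,
               1, 1, 0, 0),
             nrow = 3, byrow = TRUE,
             dimnames = list(c("siteA", "siteB", "siteC"),
                             paste0("sp", 1:4)))

# Range size of each species: the number of sites in which it occurs
range_size <- colSums(pa)

# Divide each species column by its range size, then sum across species per site
rrr <- rowSums(sweep(pa, 2, range_size, "/"))
print(rrr)
```

Here siteA scores highest because it holds two species (sp3 and sp4) that are found nowhere else.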

CWE refers to Corrected Weighted Endemism, which is calculated as range rarity richness (RRR) divided by species richness. It is best used when the goal is to identify centres of endemism, highlighting range-restricted species.
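The ratio can be sketched as follows, using an invented matrix pa in which a species-poor site holds a unique species. This illustrates the definition above rather than the package's internal code.

```r
# Toy presence/absence matrix (hypothetical data): rows are sites, columns are species
pa <- matrix(c(1, 1, 1, 0,
               1, 1, 0, 0,
               0, 0, 0, 1),
             nrow = 3, byrow = TRUE,
             dimnames = list(c("siteA", "siteB", "siteC"),
                             paste0("sp", 1:4)))

richness <- rowSums(pa)                                  # species per site
rrr <- rowSums(sweep(pa, 2, colSums(pa), "/"))           # sum of inverse range sizes
cwe <- rrr / richness                                    # rarity per species present
print(cwe)
```

Although siteC has the lowest richness, it has the highest CWE because its only species occurs nowhere else.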

shannon refers to the Shannon-Wiener diversity index, which combines species richness and evenness (equitability) through the species' relative abundances. The index assumes that all species are represented in a sample and that they are randomly sampled. It is defined as H = -\sum_i p_i \log_{b} p_i, where p_i is the proportional abundance of species i and b is the base of the logarithm. Natural logarithms are most commonly used; base b = 2 is sometimes preferred, but the choice only rescales H and does not change the ranking of sites.
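The formula can be checked with a few lines of base R. The helper shannon_index and the abundance vectors are hypothetical and written only for this illustration; the vegan package provides diversity() for the same calculation.

```r
# Hypothetical helper: Shannon-Wiener index for one site's abundance vector
shannon_index <- function(counts, base = exp(1)) {
  p <- counts[counts > 0] / sum(counts)   # proportional abundances, absent species dropped
  -sum(p * log(p, base = base))
}

even   <- c(10, 10, 10, 10)   # four equally abundant species
uneven <- c(37, 1, 1, 1)      # same richness, one dominant species

shannon_index(even)             # equals log(4) with natural logarithms
shannon_index(even, base = 2)   # equals 2 with base-2 logarithms
shannon_index(uneven)           # lower, because evenness is lower
```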

simpson refers to the Simpson diversity index, which also combines species richness and evenness through the species' relative abundances. It is a dominance index, giving more weight to common or dominant species. It is based on D = \sum_i p_i^2, where p_i is the proportional abundance of species i, and is returned as 1 - D.
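A short base R sketch of 1 - D follows. The helper simpson_index and the abundance vectors are hypothetical and serve only to illustrate the formula.

```r
# Hypothetical helper: Simpson diversity (1 - D) for one site's abundance vector
simpson_index <- function(counts) {
  p <- counts / sum(counts)   # proportional abundances
  1 - sum(p^2)
}

even      <- c(10, 10, 10, 10)   # four equally abundant species
dominated <- c(97, 1, 1, 1)      # one species makes up 97% of individuals

simpson_index(even)        # 0.75
simpson_index(dominated)   # 0.0588, close to zero because one species dominates
```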

simpson_beta refers to the Simpson dissimilarity, which comes from the partitioning of beta diversity into species replacement (i.e. turnover) and species loss (i.e. nestedness). The Simpson dissimilarity corresponds to the turnover component of the Sørensen dissimilarity, so this optimiser is used to maximise species turnover.
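For a pair of sites, the Simpson dissimilarity is commonly written as min(b, c) / (a + min(b, c)), where a is the number of shared species and b and c are the numbers of species unique to each site. The sketch below uses a hypothetical helper, simpson_dissimilarity, and invented presence/absence vectors; how optim_species applies the index across many sites is not shown here.

```r
# Hypothetical helper: pairwise Simpson dissimilarity from two presence/absence vectors
simpson_dissimilarity <- function(x, y) {
  x <- x > 0
  y <- y > 0
  shared <- sum(x & y)     # a: species in both sites
  only_x <- sum(x & !y)    # b: species only in the first site
  only_y <- sum(!x & y)    # c: species only in the second site
  min(only_x, only_y) / (shared + min(only_x, only_y))
}

site1 <- c(1, 1, 1, 1, 0)
site2 <- c(1, 1, 0, 0, 0)   # a nested subset of site1
site3 <- c(1, 0, 0, 0, 1)   # shares one species with site1 and adds a new one

simpson_dissimilarity(site1, site2)   # 0: pure nestedness, no turnover
simpson_dissimilarity(site1, site3)   # 0.5: one of site3's two species is replaced
```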

By default, the function compiles results for the top n plots (as specified by the user in n.plt) under every optimiser. The default start seed, "fixed", is the site with the highest species richness.

frequent refers to the sites most frequently selected by the simpson_beta optimiser across a number of iterations defined by the user (see iterations). This requires a "random" start seed for the simpson_beta optimiser. The result displays two accumulation curves: one for the most frequently selected plots, and one showing the mean and standard deviation of the species accumulation curves obtained from the random start seeds across all iterations.
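One way to picture the tallying step is sketched below. The list runs holds invented selections from five hypothetical runs with random start seeds; the actual bookkeeping inside optim_species may differ.

```r
# Hypothetical selections from five runs, each started from a different random seed plot
runs <- list(c("siteA", "siteC", "siteD"),
             c("siteA", "siteC", "siteD"),
             c("siteA", "siteC", "siteE"),
             c("siteB", "siteC", "siteD"),
             c("siteC", "siteD", "siteE"))

# Count how often each site was selected across runs
freq <- sort(table(unlist(runs)), decreasing = TRUE)

# Keep the n.plt most frequently selected sites
n.plt <- 3
most_frequent <- names(freq)[seq_len(n.plt)]
print(most_frequent)
```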

random refers to the species accumulation when the sites are selected randomly.
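A species accumulation curve for any site ordering can be sketched in base R as the running count of distinct species. The helper species_accumulation and the matrix pa are hypothetical; the function itself returns specaccum objects from vegan, which this sketch does not reproduce.

```r
# Toy presence/absence matrix (hypothetical data): rows are sites, columns are species
pa <- matrix(c(1, 1, 1, 0, 0,
               1, 1, 0, 0, 0,
               0, 0, 0, 1, 0,
               0, 0, 1, 0, 1),
             nrow = 4, byrow = TRUE,
             dimnames = list(c("siteA", "siteB", "siteC", "siteD"),
                             paste0("sp", 1:5)))

# Hypothetical helper: cumulative number of distinct species along a site ordering
species_accumulation <- function(pa, site_order) {
  seen <- rep(FALSE, ncol(pa))
  counts <- integer(length(site_order))
  for (k in seq_along(site_order)) {
    seen <- seen | (pa[site_order[k], ] > 0)   # add the species found at this site
    counts[k] <- sum(seen)
  }
  counts
}

# Sites in random order versus sites ordered by decreasing richness
set.seed(1)
random_order   <- sample(rownames(pa))
richness_order <- rownames(pa)[order(-rowSums(pa))]

acc_random  <- species_accumulation(pa, random_order)
acc_ordered <- species_accumulation(pa, richness_order)
```

Both curves end at the total number of species; an effective optimiser reaches high values in fewer sites than a random ordering typically does.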

plot calls a function that allows plotting all the species accumulation curves obtained for each of the optimisers included in the optim_species function, see plot_opt.


Returns a list containing, for each optimiser, a species accumulation object (see specaccum in the vegan R package). Species accumulation refers to the cumulative number of species for a given number of selected sites or individuals. The list also contains, for each optimiser, the sites selected to maximise species accumulation. By default, the function also plots the species accumulation curves obtained by the different optimisers.


Irene Martin-Fores, Samantha Munroe and Greg Guerin.


Albuquerque, F. & Beier, P. (2015) Rarity-weighted richness: a simple and reliable alternative to integer programming and heuristic algorithms for minimum set and maximum coverage problems in conservation planning. PLoS ONE 10, e0119905.

Baselga, A. (2010) Multiplicative partition of true diversity yields independent alpha and beta components; additive partition does not. Ecology 91, 1974-1981.

Baselga, A. (2012) The relationship between species replacement, dissimilarity derived from nestedness, and nestedness. Global Ecology and Biogeography 21, 1223-1232.

Baselga, A. & Leprieur, F. (2015) Comparing methods to separate components of beta diversity. Methods in Ecology and Evolution 6, 1069-1079.

Guerin, G.R. & Lowe, A.J. (2015) 'Sum of inverse range-sizes' (SIR), a biodiversity metric with many names and interpretations. Biodiversity and Conservation 24, 2877-2882.

Guerin, G.R., Ruokolainen, L. & Lowe, A.J. (2015) A georeferenced implementation of weighted endemism. Methods in Ecology and Evolution 6, 845-852.

Jost, L. (2007) Partitioning diversity into independent alpha and beta components. Ecology 88, 2427-2439.

Koleff, P., Gaston, K.J. & Lennon, J.J. (2003) Measuring beta diversity for presence-absence data. Journal of Animal Ecology 72, 367-382.

Martín-Forés, I., Guerin, G.R., Munroe, S.E. & Sparrow, B. (2021) Applying conservation reserve design strategies to define ecosystem monitoring priorities. Ecology and Evolution 11, 17060-17070.

Oksanen, J. et al. (2016) vegan: Community Ecology Package. R package version 2.4-3. Vienna: R Foundation for Statistical Computing.

See Also





  # example with the dune dataset from vegan
  library(ausplotsR)
  data(dune, package="vegan")
  optim_species(dune, n.plt=15, frequent=FALSE)
  # example with AusPlots data from ausplotsR
  ## Not run: 
  ausplotsdata <- get_ausplots(veg.PI=TRUE)
  ausplotsPAdata <- species_table(ausplotsdata$veg.PI, m_kind="percent_cover",
    cover_type="PFC", species_name="SN")
  optim_species(ausplotsPAdata, n.plt=5, iterations=5)
  optim_species(ausplotsPAdata, n.plt=5, start="defined", plot=TRUE,
    plot_name="WAANUL0001-56966", frequent=FALSE, random=TRUE, iterations=20)
  ## End(Not run)

ausplotsR documentation built on Nov. 17, 2023, 9:06 a.m.