mop: Analysis of extrapolation risks using the MOP metric



Analysis to calculate the mobility-oriented parity metric and other sub-products to represent dissimilarities and non-analogous conditions when comparing a set of reference conditions (M; m) against another set of conditions of interest (G; g).


mop(m, g, type = "basic", calculate_distance = FALSE,
    where_distance = "in_range", distance = "euclidean",
    scale = FALSE, center = FALSE, fix_NA = TRUE, percentage = 1,
    comp_each = 2000, tol = NULL, rescale_distance = FALSE,
    parallel = FALSE, n_cores = NULL, progress_bar = TRUE)



m - a SpatRaster or matrix of variables representing a set of conditions of reference (e.g., the set of conditions in which a model was calibrated). If a matrix is used, each column represents a variable.


g - a SpatRaster or matrix of variables representing a set of conditions of interest for which dissimilarity values and non-analogous conditions will be detected (e.g., conditions in which a model is projected). Variable names must match between m and g.


type - character, type of MOP analysis to be performed. See Details for options.


calculate_distance - logical, whether to calculate distances (dissimilarities) between m and g. The default, FALSE, runs rapidly and does not assess dissimilarity levels.


where_distance - character, where to calculate distances, considering how conditions in g are positioned in comparison to the range of conditions in m. See Details for options.


distance - character, which distances are calculated, "euclidean" or "mahalanobis". Valid if calculate_distance = TRUE.


scale - logical or numeric-alike, scaling options as in the scale argument of scale().


center - logical or numeric-alike, centering options as in the center argument of scale().


fix_NA - logical, whether to fix layers so cells with NA values are the same in all layers. Setting to FALSE may save time if the rasters are big and have no NA matching problems.


percentage - numeric, percentage of the conditions in m closest to each combination of conditions in g that is used to derive the mean environmental distance for that combination.


comp_each - numeric, number of combinations in g to be used for distance calculations at a time. Increasing this number requires more RAM.


tol - tolerance to detect linear dependencies when calculating Mahalanobis distances. The default, NULL, uses .Machine$double.eps.


rescale_distance - logical, whether to re-scale distances to the 0-1 range. Re-scaled values cannot be compared across runs that use different values of percentage.


parallel - logical, whether calculations should be performed in parallel using n_cores of the computer. Using this option will speed up the analysis but will demand more RAM.


n_cores - numeric, number of cores to be used in parallel processing. If parallel = TRUE and n_cores = NULL, all CPU cores on the current host minus one will be used.


progress_bar - logical, whether to show a progress bar.


type options return results that differ in the detail of how non-analogous conditions are identified.

  • basic - performs the calculation as proposed by Owens et al. (2013) doi:10.1016/j.ecolmodel.2013.04.011.

  • simple - calculates how many variables in the set of interest are non-analogous to those in the reference set.

  • detailed - calculates five additional extrapolation metrics. See mop_detailed under Value below for full details.
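
The range-based logic behind these options can be illustrated with a minimal base R sketch. This is not the package's internal code: the toy matrices and the object names (below, above, simple_like, basic_like) are hypothetical, and the sketch only mirrors the behaviour described for mop_basic, mop_simple, and the low/high ends reported by the detailed option.

```r
# Hypothetical toy data: rows are conditions, columns are variables
m <- cbind(temp = c(10, 15, 20), prec = c(100, 200, 300))
g <- cbind(temp = c(12, 25, 5), prec = c(150, 250, 400))

# Range of each variable in the reference set (row 1 = min, row 2 = max)
m_ranges <- apply(m, 2, range)

# TRUE where a value in g falls outside the reference range of its variable
below <- sweep(g, 2, m_ranges[1, ], "<")  # non-analogous towards low values
above <- sweep(g, 2, m_ranges[2, ], ">")  # non-analogous towards high values
outside <- below | above

# Number of non-analogous variables per condition in g
n_out <- rowSums(outside)

# "simple"-like result: count of non-analogous variables, NA if none
simple_like <- ifelse(n_out > 0, n_out, NA)

# "basic"-like result: 1 if any variable is non-analogous, NA otherwise
basic_like <- ifelse(n_out > 0, 1, NA)
```

In this toy case the first condition in g is inside all reference ranges, the second is outside for one variable, and the third is outside for both.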

where_distance options determine what values should be used to calculate dissimilarity:

  • in_range - only conditions inside m ranges

  • out_range - only conditions outside m ranges

  • all - all conditions
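
As a rough illustration of how distances, percentage, and where_distance relate, the following base R sketch computes, for each condition in g, the mean Euclidean distance to the closest share of conditions in m, and then keeps values only for the chosen subset. It rests on assumptions: the toy data and the helper mean_closest_dist() are hypothetical, percentage is read as a percent of the rows in m, and the package's own implementation may differ in its details.

```r
# Hypothetical toy data: rows are conditions, columns are variables
m <- cbind(temp = c(10, 15, 20, 18), prec = c(100, 200, 300, 250))
g <- cbind(temp = c(12, 25), prec = c(150, 250))

# Mean Euclidean distance from one condition in g to the closest
# `percentage` percent of conditions in m (hypothetical helper)
mean_closest_dist <- function(g_row, m, percentage = 50) {
  d <- sqrt(rowSums(sweep(m, 2, g_row, "-")^2))
  n_closest <- max(1, ceiling(nrow(m) * percentage / 100))
  mean(sort(d)[seq_len(n_closest)])
}

# Conditions in g that are inside the ranges of m for every variable
m_ranges <- apply(m, 2, range)
in_range <- apply(g, 1, function(x) {
  all(x >= m_ranges[1, ] & x <= m_ranges[2, ])
})

# Distances for all conditions in g (comparable to where_distance = "all")
d_all <- apply(g, 1, function(x) mean_closest_dist(x, m, percentage = 50))

# Keep distances only for the subset of interest
d_in  <- ifelse(in_range, d_all, NA)   # comparable to "in_range"
d_out <- ifelse(!in_range, d_all, NA)  # comparable to "out_range"
```

Here the first condition in g lies inside the ranges of m and the second lies outside, so d_in and d_out each hold one distance and one NA.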

When the variables used to represent conditions have different units, scaling and centering are recommended. This step is only valid when Euclidean distances are used.
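
A minimal sketch of what scaling and centering do, using base::scale() on hypothetical toy matrices. Re-using the means and standard deviations of m to transform g is an assumption made for illustration; how mop() applies the scale and center arguments internally is not described here. The last lines show why the step matters only for Euclidean distances: Mahalanobis distances come out the same with or without this transformation.

```r
# Hypothetical toy data with variables in very different units
m <- cbind(temp = c(10, 15, 20, 18), prec = c(300, 100, 250, 200))
g <- cbind(temp = c(12, 25), prec = c(150, 250))

# Center and scale the reference set
m_scaled <- scale(m, center = TRUE, scale = TRUE)

# Transform g with the reference means and standard deviations (assumption)
g_scaled <- scale(g,
                  center = attr(m_scaled, "scaled:center"),
                  scale = attr(m_scaled, "scaled:scale"))

# Euclidean distances from the first condition in g to every row of m
# change after scaling, because prec no longer dominates the result
eu_raw    <- sqrt(rowSums(sweep(m, 2, g[1, ], "-")^2))
eu_scaled <- sqrt(rowSums(sweep(m_scaled, 2, g_scaled[1, ], "-")^2))

# Squared Mahalanobis distances are unaffected by the same transformation
mh_raw    <- mahalanobis(m, center = g[1, ], cov = cov(m))
mh_scaled <- mahalanobis(m_scaled, center = g_scaled[1, ],
                         cov = cov(m_scaled))
isTRUE(all.equal(mh_raw, mh_scaled))
```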


An object of class mop_results containing:

  • summary - a list with details of the data used in the analysis:

    • variables - names of variables considered.

    • type - type of MOP analysis performed.

    • scale - value according to the argument scale.

    • center - value according to the argument center.

    • calculate_distance - value according to the argument calculate_distance.

    • distance - option regarding distance used.

    • percentage - percentage of m used as reference for distance calculation.

    • rescale_distance - value according to the argument rescale_distance.

    • fix_NA - value according to the argument fix_NA.

    • N_m - total number of elements (cells with values or valid rows) in m.

    • N_g - total number of elements (cells with values or valid rows) in g.

    • m_ranges - matrix with ranges of variables in reference conditions (m).

  • mop_distances - if calculate_distance = TRUE, a SpatRaster or vector with distance values for the set of interest (g). Higher values represent greater dissimilarity compared to the set of reference (m).

  • mop_basic - a SpatRaster or vector, for the set of interest, representing conditions in which at least one of the variables is non-analogous to the set of reference. Values are 1 for non-analogous conditions and NA for conditions inside the ranges of the reference set.

  • mop_simple - a SpatRaster or vector, for the set of interest, representing how many variables in the set of interest are non-analogous to those in the reference set. NA is used for conditions inside the ranges of the reference set.

  • mop_detailed - a list containing:

    • interpretation_combined - a data.frame to help identify combinations of variables in towards_low_combined and towards_high_combined that are non-analogous to m.

    • towards_low_end - a SpatRaster or matrix for all variables representing where non-analogous conditions were found towards low values of each variable.

    • towards_high_end - a SpatRaster or matrix for all variables representing where non-analogous conditions were found towards high values of each variable.

    • towards_low_combined - a SpatRaster or vector with values representing the identity of the variables found to have non-analogous conditions towards low values. If vector, interpretation requires the use of the data.frame interpretation_combined.

    • towards_high_combined - a SpatRaster or vector with values representing the identity of the variables found to have non-analogous conditions towards high values. If vector, interpretation requires the use of the data.frame interpretation_combined.

# data
reference_layers <- terra::rast(system.file("extdata", "reference_layers.tif",
                                            package = "mop"))

layers_of_interest <- terra::rast(system.file("extdata", "layers_of_interest.tif",
                                              package = "mop"))

# analysis
mop_res <- mop(m = reference_layers, g = layers_of_interest)


