uniformG_selection: Selection of survey sites maximizing uniformity in geography
In biosurvey: Tools for Biological Survey Planning


Selection of sites to be sampled in a survey, with the goal of maximizing uniformity of points in geographic space.

uniformG_selection(master, expected_points, guess_distances = TRUE,
                   initial_distance = NULL, increase = NULL,
                   max_n_samplings = 1, replicates = 10,
                   use_preselected_sites = TRUE,
                   median_distance_filter = NULL, set_seed = 1,
                   verbose = TRUE, force = FALSE)

`master`	master_matrix object derived from function `prepare_master_matrix` or master_selection object derived from functions `random_selection`, `uniformE_selection`, or `EG_selection`.
`expected_points`	(numeric) number of survey points (sites) to be selected.
`guess_distances`	(logical) whether to use an internal algorithm to select `initial_distance` and `increase` automatically. Default = TRUE. If FALSE, `initial_distance` and `increase` must be defined.
`initial_distance`	(numeric) distance in km to be used for a first process of thinning and detection of remaining points. Default = NULL.
`increase`	(numeric) initial value to be added to or subtracted from `initial_distance` until reaching the number of `expected_points`. Default = NULL.
`max_n_samplings`	(numeric) maximum number of samples to be chosen after performing all thinning `replicates`. Default = 1.
`replicates`	(numeric) number of thinning replicates. Default = 10.
`use_preselected_sites`	(logical) whether to use sites that were defined as part of the selected sites prior to any selection. The object in `master` must contain the preselected site(s) in an element named "preselected_sites" for this argument to be effective. Default = TRUE. See details for more information on the approach used.
`median_distance_filter`	(character) optional argument to define a median distance-based filter based on which sets of sampling sites will be selected. The default, NULL, does not apply such a filter. Options are: "max" and "min".
`set_seed`	(numeric) integer value to specify an initial seed. Default = 1.
`verbose`	(logical) whether or not to print messages about the process. Default = TRUE.
`force`	(logical) whether to replace an existing set of sites selected with this method in `master`. Default = FALSE.

Survey sites are selected by searching for maximum geographic distances among all sites. This approach helps in selecting points that cover most of the geographic extent of the region of interest. This type of selection could be appropriate when the region of interest has a complex geographic pattern (e.g., an archipelago). It does not consider environmental conditions in the region of interest, so important environmental combinations may not be represented in the final selection of sites.
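The arguments initial_distance and increase suggest a distance-based thinning search. The following is a minimal base-R sketch of that general idea, not the package's internal code: the helpers haversine_km, thin_points, and search_distance are hypothetical names, the Earth radius of 6371 km is an assumption, and halving the step when the search overshoots is an illustrative choice.

```r
# Great-circle distance in km from one point to one or more points
# (haversine formula; Earth radius of 6371 km is an assumption).
haversine_km <- function(lon1, lat1, lon2, lat2, radius = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius * asin(pmin(1, sqrt(a)))
}

# Greedy thinning: visit points in random order and keep a point only if it
# is at least `distance` km away from every point kept so far.
# `xy` is a two-column matrix (longitude, latitude).
thin_points <- function(xy, distance) {
  kept <- integer(0)
  for (i in sample(nrow(xy))) {
    if (length(kept) == 0) {
      kept <- i
      next
    }
    d <- haversine_km(xy[i, 1], xy[i, 2], xy[kept, 1], xy[kept, 2])
    if (all(d >= distance)) kept <- c(kept, i)
  }
  xy[kept, , drop = FALSE]
}

# Hypothetical search: enlarge the distance when too many points remain,
# reduce it when too few remain, and halve the step on each overshoot.
# Returns NULL if the expected number is not reached within `max_iter`.
search_distance <- function(xy, expected_points, initial_distance,
                            increase, max_iter = 100) {
  distance <- initial_distance
  step <- increase
  last_direction <- 0
  for (iter in seq_len(max_iter)) {
    thinned <- thin_points(xy, distance)
    n <- nrow(thinned)
    if (n == expected_points) {
      return(list(distance = distance, sites = thinned))
    }
    direction <- if (n > expected_points) 1 else -1
    if (last_direction != 0 && direction != last_direction) step <- step / 2
    distance <- max(distance + direction * step, 0)
    last_direction <- direction
  }
  NULL
}

# Usage with simulated coordinates (not real survey data)
set.seed(1)
xy <- cbind(lon = runif(200, -100, -90), lat = runif(200, 15, 25))
result <- search_distance(xy, expected_points = 10,
                          initial_distance = 100, increase = 50)
```

Because the thinning order is random, repeated runs can return different sets of the same size, which is one reason a function like this would be run for several replicates.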

Exploring the geographic and environmental spaces of the region of interest is a crucial first step before selecting survey sites. Such explorations can be done using the function explore_data_EG.

If use_preselected_sites = TRUE and such sites are included as an element of the object in master, the approach for selecting uniform sites in geography differs from the one described above. User-preselected sites will always be part of the sites selected. Other points are selected with an algorithm that searches for sites that are uniformly distributed in geographic space but at a distance from preselected sites that helps maintain uniformity. Note that preselected sites are not processed; therefore, uniformity among those points cannot be guaranteed.

Because multiple sets of sites can result from the selection when use_preselected_sites is set to FALSE, the argument median_distance_filter can be used to keep the set of sites with the maximum ("max") or minimum ("min") median distance among selected sites. The option "max" increases the geographic distance among sampling sites, which could be desirable if the goal is to cover the region of interest more broadly. The option "min" could be used when the goal is to reduce the resources and time needed to sample the sites.
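The effect of a median distance filter can be illustrated with a small base-R sketch. This is an assumption about the general idea, not the package's implementation: the helper names are hypothetical, and planar Euclidean distance from stats::dist stands in for geographic distance to keep the example short.

```r
# Median of all pairwise distances among the sites in one set.
# `sites` is a two-column matrix (x, y); planar distance is an assumption.
median_pairwise_distance <- function(sites) {
  median(as.numeric(dist(sites)))
}

# Keep the set with the largest ("max") or smallest ("min") median distance.
filter_by_median_distance <- function(site_sets,
                                      median_distance_filter = c("max", "min")) {
  median_distance_filter <- match.arg(median_distance_filter)
  medians <- vapply(site_sets, median_pairwise_distance, numeric(1))
  chosen <- switch(median_distance_filter,
                   max = which.max(medians),
                   min = which.min(medians))
  site_sets[[chosen]]
}

# Two toy sets of three sites each (made-up coordinates)
set_a <- cbind(x = c(0, 1, 2), y = c(0, 0, 0))   # pairwise distances 1, 2, 1
set_b <- cbind(x = c(0, 5, 10), y = c(0, 0, 0))  # pairwise distances 5, 10, 5

widest <- filter_by_median_distance(list(set_a, set_b), "max")   # set_b
closest <- filter_by_median_distance(list(set_a, set_b), "min")  # set_a
```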

A master_selection object (S3) with an element called selected_sites_G containing one or more sets of selected sites.

random_selection, uniformE_selection, EG_selection, plot_sites_EG

# Data
data("m_matrix", package = "biosurvey")

# Selecting sites uniformly in G space
selectionG <- uniformG_selection(m_matrix, expected_points = 40,
                                 max_n_samplings = 1, replicates = 5)
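
A further call, sketched here using only arguments from the usage above, keeps the candidate set with the largest median distance. It assumes the biosurvey package and its example data are installed; the value of max_n_samplings is an arbitrary choice for illustration.

```r
# Runs only if biosurvey is available (assumption: package installed)
if (requireNamespace("biosurvey", quietly = TRUE)) {
  library(biosurvey)
  data("m_matrix", package = "biosurvey")

  # Ignore preselected sites and keep the set with the maximum median distance
  selectionG_max <- uniformG_selection(m_matrix, expected_points = 40,
                                       max_n_samplings = 5, replicates = 5,
                                       use_preselected_sites = FALSE,
                                       median_distance_filter = "max")

  # Number of sets of selected sites stored in the result
  length(selectionG_max$selected_sites_G)
}
```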