knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

This package implements the buffered random sampling algorithm described in a paper by Kingsley, M., Kanneworff, P., & Carlsson, D. (2004).

Getting started

The main function of the package, buffered_random_sampling, requires two data sets: one with strata and one with elements (points) within the strata.

The package includes two example data sets, element and stratum, showing how they should look. (See the exact details by running ?element and ?stratum in the console.)

library(buffersam)
set.seed(92)

stratum
head(element)

To create an allocation run the function on two data sets:

allocation <- buffered_random_sampling(element, stratum)
allocation

Get feedback while the algorithm is running

It's possible to get feedback while the algorithm is running, to know how far it has come or better understand how the algorithm works. This feedback can be provided in text or visuals.

Visual feedback

It's possible to visualise the allocation by using visualise=TRUE.

allocation <- buffered_random_sampling(element, stratum, visualise=TRUE)

Text feedback

Use verbose=TRUE to print the status in the terminal as the algorithm is running.

allocation <- buffered_random_sampling(element, stratum, verbose=TRUE)

Pause algorithm

It's possible to pause the execution of the algorithm while it's running to allow time for inspecting the progress. This only really makes sense if used together with one of the other feedback parameters.

allocation <- buffered_random_sampling(element, stratum, pause=TRUE)

Level of detail

detail controls how often the algorithm should give feedback, visual, text, or pause. Level 1 for least detail, level 4 for most.

allocation <- buffered_random_sampling(element, stratum, verbose=TRUE, detail=2)
allocation <- buffered_random_sampling(element, stratum, verbose=TRUE, detail=3)

Preselect elements

It's possible that some elements need to be selected without regard to buffered random sampling. These elements should be pre-selected, as if the algorithm had already selected them. These could be elements from previous years that need to be repeated.

preselect_element <- head(element, 2)
preselect_element
allocation <- buffered_random_sampling(element, stratum, preselect_element = preselect_element)

Algorithm in words

This section contains a brief description of the buffered random algorithm in words. Please refer to the paper for exact details.

  1. Create a list of all strata and a list of all elements.
  2. Calculate the sampling density for each stratum by dividing the required number of stations by the stratum's area.
  3. Pick the stratum with the lowest sampling density as the current stratum.
  4. Set buffering distance to sqrt((4𝜏A)/(nĪ€)), where A is stratum area, n is the required number of stations for the current stratum, and 𝜏 is the 'packing intensity' which is set to 0.5 by default (see paper for more detail).
  5. Pick an element at random that is not closer than the buffering distance to any other selected elements.
    1. If the element is within the current stratum, mark it as selected.
    2. If the element is not within the current stratum, mark it as temporary selected.
  6. If the required number of stations has been reached:
    1. Remove the current stratum from the stratum list.
    2. Move all the temporary selected elements back to the element list.
    3. Go to 3.
  7. If the element list is empty:
    1. Move all the temporary selected elements back to the element list.
    2. Move all the selected elements from this stratum back to the element list.
    3. Reduce the buffering distance by 10 %.
    4. Go to 5.
  8. Go to 5.

Strata and Elements

The survey area is divided into a number of strata which contains a number of elements. Each element is a fixed point with a surrounding area. In the West Greenland shrimp survey, each element is the center point of a ~2 nautical miles square.

The algorithm doesn't impose any requirements on how the elements are laid out or how many of them are available. However, an element list is needed; the algorithm cannot select arbitrary points within each stratum.

Elements belonging to multiple strata

Elements can be placed, so its surrounding area is inside two or more strata. This can be represented in the elements data frame by having multiple rows with the same elementId but different values for stratum.

Before the algorithm runs, these elements are assigned to one stratum at random by removing all but one of the duplicated elementId's from the data set. If the elements data frame contains the column preselect_probability, the sampling will be done proportionally to the values in this column. A possible value for this variable is the area the element occupies in the given stratum.

This is done to avoid increasing the likelihood of these elements being selected. Pre-selected elements are always kept.

This peculiarity is not described in the original paper.



johan-ejstrud/buffersam documentation built on July 31, 2024, 7:40 p.m.