spread2: Simulate a contagious spread process on a landscape, with...

View source: R/spread2.R

spread2R Documentation

Simulate a contagious spread process on a landscape, with data.table internals

Description

This can be used to simulate fires, seed dispersal, calculation of iterative, concentric, symmetric (currently) landscape values and many other things. Essentially, it starts from a collection of cells (start, called "events") and spreads to neighbours, according to the directions and spreadProb with modifications due to other arguments. NOTE: the spread function is similar, but sometimes slightly faster, but less robust, and more difficult to use iteratively.

Usage

spread2(
  landscape,
  start = ncell(landscape)/2 - ncol(landscape)/2,
  spreadProb = 0.23,
  persistProb = NA_real_,
  asRaster = TRUE,
  maxSize,
  exactSize,
  directions = 8L,
  iterations = 1000000L,
  returnDistances = FALSE,
  returnDirections = FALSE,
  returnFrom = FALSE,
  maxRetriesPerID = 10,
  spreadProbRel = NA_real_,
  plot.it = FALSE,
  circle = FALSE,
  asymmetry = NA_real_,
  asymmetryAngle = NA_real_,
  allowOverlap = 0,
  neighProbs = NA_real_,
  oneNeighbourOnly = FALSE,
  skipChecks = FALSE
)

Arguments

landscape

Required. A RasterLayer object. This defines the possible locations for spreading events to start and spread2 into. Required.

start

Required. Either a vector of pixel numbers to initiate spreading, or a data.table that is the output of a previous spread2. If a vector, they should be cell indices (pixels) on the landscape. If user has x and y coordinates, these can be converted with cellFromXY().

spreadProb

Numeric of length 1 or length ncell(landscape) or a RasterLayer that is the identical dimensions as landscape. If numeric of length 1, then this is the global (absolute) probability of spreading into each cell from a neighbour. If a numeric of length ncell(landscape) or a raster, then this must be the cell-specific (absolute) probability of a "receiving" potential cell. Default is 0.23. If relative probabilities are required, use spreadProbRel. If used together, then the relative probabilities will be re-scaled so that the mean relative probability of potential neighbours is equal to the mean of spreadProb of the potential neighbours.

persistProb

Numeric of length 1 or RasterLayer. If numeric of length 1, then this is the global (absolute) probability of cell continuing to burn per time step. If a raster, then this must be the cell-specific (absolute) probability of a fire persisting. Default is NA, which is the same as 0, i.e. a cell only burns for one time step.

asRaster

Logical, length 1. If TRUE, the function will return a Raster where raster non NA values indicate the cells that were "active", and the value is the initial starting pixel.

maxSize

Numeric. Maximum number of cells for a single or all events to be spread2. Recycled to match start length, if it is not as long as start. This will be overridden if exactSize also provided. See section on 'Breaking out of spread2 events'.

exactSize

Numeric vector, length 1 or length(start). Similar to maxSize, but these will be the exact final sizes of the events. i.e., the spread2 events will continue until they are floor(exactSize). This will override maxSize if both provided. See Details.

directions

The number adjacent cells in which to look; default is 8 (Queen case). Can only be 4 or 8.

iterations

Number of iterations to spread2. Leaving this NULL allows the spread2 to continue until stops spreading itself (i.e., exhausts itself).

returnDistances

Logical. Should the function include a column with the individual cell distances from the locus where that event started. Default is FALSE. See Details.

returnDirections

Logical. Should the function include a column with the individual directions (in radians) from the locus where that event started. Default is FALSE.

returnFrom

Logical. Should the function return a column with the source, i.e, the lag 1 "from" pixel, for each iteration.

maxRetriesPerID

Only active if exactSize is used. This is the number of attempts that will be made per event ID, before abandoning, therefore completing the spread2 for that event with a size that is smaller than exactSize. Default 10 times.

spreadProbRel

Optional RasterLayer indicating a surface of relative probabilities useful when using neighProbs (which provides a mechanism for selecting a specific number of cells at each iteration). This indicates the relative probabilities for the selection of successful neighbours. spreadProb will still be evaluated after the relative probabilities and neighProbs has been evaluated, i.e., potential cells will be identified, then some could be rejected via spreadProb. If absolute spreadProb is not desired, be sure to set spreadProb = 1. Ignored if neighProbs is not provided.

plot.it

If TRUE, then plot the raster at every iteration, so one can watch the spread2 event grow.

circle

Logical. If TRUE, then outward spread2 will be by equidistant rings, rather than solely by adjacent cells (via directions arg.). Default is FALSE. Using circle = TRUE can be dramatically slower for large problems. Note, this will likely create unexpected results if spreadProb < 1.

asymmetry

A numeric or RasterLayer indicating the ratio of the asymmetry to be used. i.e., 1 is no asymmetry; 2 means that the angles in the direction of the asymmetryAngle are 2x the spreadProb of the angles opposite tot he asymmetryAngle Default is NA, indicating no asymmetry. See details. This is still experimental. Use with caution.

asymmetryAngle

A numeric or RasterLayer indicating the angle in degrees (0 is "up", as in North on a map), that describes which way the asymmetry is.

allowOverlap

numeric (logical will work for backwards compatibility). See details. Default is 0, i.e., no overlapping.

neighProbs

An optional numeric vector, whose sum is 1. It indicates the probabilities that an individual spread iteration will spread to ⁠1, 2, ..., length(neighProbs)⁠ neighbours, respectively. If this is used (i.e., something other than NA), circle and returnDistances will not work currently.

oneNeighbourOnly

Logical. Default is FALSE. If TRUE, then this spread algorithm will allow exactly one neighbour to be spread to (not fewer or more). This could be used, e.g., for an animal moving. If this is TRUE, then allowOverlap will be set to 2 if it is 0 or 1.

skipChecks

Logical. If TRUE, the argument checking (i.e., assertions) will be skipped. This should likely only be used once it is clear that the function arguments are well understood and function speed is of the primary importance. This is likely most useful in repeated iteration cases i.e., if this call is using the previous output from this same function.

Details

There are 2 main underlying algorithms for active cells to "spread" to nearby cells (adjacent cells): spreadProb and neighProb. Using spreadProb, every "active" pixel will assess all neighbours (either 4 or 8, depending on directions), and will "activate" whichever neighbours successfully pass independent calls to runif(1,0,1) < spreadProb. The algorithm will iterate again and again, each time starting from the newly "activated" cells. Several built-in decisions are as follows.

  1. no active cell can activate a cell that was already activated by the same event (i.e., "it won't go backwards"). 2. If allowOverlap is FALSE, then the previous rule will also apply, regardless of which "event" caused the pixels to be previously active.

This function can be interrupted before all active cells are exhausted if the iterations value is reached before there are no more active cells to spread2 into. The interrupted output (a data.table) can be passed subsequently as an input to this same function (as start). This is intended to be used for situations where external events happen during a spread2 event, or where one or more arguments to the spread2 function change before a spread2 event is completed. For example, if it is desired that the spreadProb change before a spread2 event is completed because, for example, a fire is spreading, and a new set of conditions arise due to a change in weather.

asymmetry here is slightly different than in the spread function, so that it can deal with a RasterLayer of asymmetryAngle. Here, the spreadProb values of a given set of neighbours around each active pixel are adjusted to create adjustedSpreadProb which is calculated maintain the following two qualities:

mean(spreadProb) = mean(ajustedSpreadProb)

and

max(spreadProb)/min(spreadProb) = asymmetry

along the axis of asymmetryAngle. NOTE: this means that the 8 neighbours around an active cell may not fulfill the preceeding equality if asymmetryAngle is not exactly one of the 8 angles of the 8 neighbours. This means that

max(spreadProb)/min(spreadProb)

will generally be less than asymmetry, for the 8 neighbours. The exact adjustment to the spreadProb is calculated with:

angleQuality <- (cos(angles - rad2(asymmetryAngle))+1)/2

which is multiplied to get an angle-adjusted spreadProb:

spreadProbAdj <- actualSpreadProb * angleQuality

which is then rescaled:

adjustedSpreadProb = (spreadProbAdj - min(spreadProbAdj)) * par2 + par1

, where par1 and par2 are parameters calculated internally to make the 2 conditions above true.

If maxSize or exactSize are used, then spreading will continue and stop before or at maxSize or at exactSize, respectively. If iterations is specified, then the function will end, and the returned data.table may (if maxSize) or will (if exactSize) have at least one active cell per event that did not already achieve maxSize or exactSize. This will be very useful to build new, customized higher-level wrapper functions that iteratively call spread2.

Value

Either a data.table (asRaster=FALSE) or a RasterLayer (asRaster=TRUE, the default). The data.table will have one attribute named spreadState, which is a list containing a data.table of current cluster-level information about the spread events. If asRaster=TRUE, then the data.table (with its spreadState attribute) will be attached to the Raster as an attribute named pixel as it provides pixel-level information about the spread events.

The RasterLayer represents every cell in which a successful spread2 event occurred. For the case of, say, a fire this would represent every cell that burned. If allowOverlap is TRUE, the return will always be a data.table.

If asRaster is FALSE, then this function returns a data.table with 3 (or 4 if returnFrom is TRUE) columns:

initialPixels the initial cell number of that particular spread2 event.
pixels The cell indices of cells that have been touched by the spread2 algorithm.
state a logical indicating whether the cell is active (i.e., could still be a source for spreading) or not (no spreading will occur from these cells).
from The pixel indices that were the immediately preceding "source" for each pixels, i.e., the lag 1 pixels. Only returned if returnFrom is TRUE

The attribute saved with the name "spreadState" (e.g., attr(output, "spreadState")) includes a data.table with columns:

id An arbitrary code, from 1 to length(start) for each "event".
initialPixels the initial cell number of that particular spread2 event.
numRetries The number of re-starts the event did because it got stuck (normally only because exactSize was used and was not achieved.
maxSize The number of pixels that were provided as inputs via maxSize or exactSize.
size The current size, in pixels, of each event.

and several other objects that provide significant speed ups in iterative calls to spread2. If the user runs spread2 iteratively, there will likely be significant speed gains if the data.table passed in to start should have the attribute attached, or re-attached if it was lost, e.g., via setattr(outInput, "spreadState", attr(out, "spreadState")), where out is the returned data.table from the previous call to spread2, and outInput is the modified data.table. Currently, the modified data.table must have the same order as out.

Breaking out of spread2 events

There are 3 ways for the spread2 to "stop" spreading. Here, each "event" is defined as all cells that are spawned from each unique start location. The ways outlined below are all acting at all times, i.e., they are not mutually exclusive. Therefore, it is the user's responsibility to make sure the different rules are interacting with each other correctly.

spreadProb Probabilistically, if spreadProb is low enough, active spreading events will stop. In practice, this number generally should be below 0.3 to actually see an event stop.
maxSize This is the number of cells that are "successfully" turned on during a spreading event. spreadProb will still be active, so, it is possible that the end size of each event is smaller than maxSize, but they will not be greater than maxSize
exactSize This is the number of cells that are "successfully" turned on during a spreading event. This will override an event that stops probabilistically via spreadProb, but forcing its last set of active cells to try again to find neighbours. It will try maxRetriesPerID times per event, before giving up. During those maxRetriesPerID times, it will try to "jump" up to 4 cells outwards from each of the active cells, every 5 retries.
iterations This is a hard cap on the number of internal iterations to complete before returning the current state of the system as a data.table.

Building custom spreading events

This function can be used iteratively, with relatively little overhead compared to using it non-iteratively. In general, this function can be called with arguments set as user needs, and with specifying e.g., iterations = 1. This means that the function will spread outwards 1 iteration, then stop. The returned object will be a data.table or RasterLayer that can be passed immediately back as the start argument into a subsequent call to spread2. This means that every argument can be updated at each iteration.

When using this function iteratively, there are several things to keep in mind. The output will likely be sorted differently than the input (i.e., the order of start, if a vector, may not be the same order as that returned). This means that when passing the same object back into the next iteration of the function call, maxSize or exactSize may not be in the same order. To get the same order, the easiest thing to do is sort the initial start objects by their pixel location, increasing. Then, of course, sorting any vectorized arguments (e.g., maxSize) accordingly.

NOTE: the data.table or RasterLayer should not be altered when passed back into spread2.

allowOverlap

If 1 (or TRUE), then individual events can overlap with one another, i.e., allow overlap between events. If 2 (or NA), then each pixel is essentially independent, allowing overlap between and within events. This likely requires a user to intervene as it is possible to spread back onto itself. If 3 (did not exist previously), individual events can overlap, and there can be overlap within an event, but only within an iteration, i.e., once an iteration is finished, and a pixel was activated, then the spreading will not return onto these pixels. If 0 (or FALSE), then once a pixel is activated, it cannot be re-activated, within or between event. This allows events to not interfere with one another i.e., they do not interact (this is slower than if allowOverlap = FALSE). Default is 0. In the case of 2 or 3, this would be, perhaps, useful for dispersal of, say, insect swarms.

Note

exactSize may not be achieved if there aren't enough cells in the map. Also, exactSize may not be achieved because the active cells are "stuck", i.e., they have no inactivated cells to move to; or the spreadProb is low. In the latter two cases, the algorithm will retry again, but it will only retry from the last iteration's active cells. The algorithm will only retry 10 times before quitting. Currently, there will also be an attempt to "jump" up to four cells away from the active cells to try to continue spreading.

A common way to use this function is to build wrappers around this, followed by iterative calls in a while loop. See example.

Author(s)

Eliot McIntire and Steve Cumming

See Also

spread() for a different implementation of the same algorithm. spread is less robust but it is often slightly faster.

Examples

library(terra)

origDTThreads <- data.table::setDTthreads(2L)
origNcpus <- options(Ncpus = 2L)

a <- rast(ext(0, 10, 0, 10), res = 1)
sams <- sort(sample(ncell(a), 3))

# Simple use -- similar to spread(...)
out <- spread2(a, start = sams, 0.225)
if (interactive()) {
  terra::plot(out)
}

# Use maxSize -- this gives an upper limit
maxSizes <- sort(sample(1:10, size = length(sams)))
out <- spread2(a, start = sams, 0.225, maxSize = maxSizes, asRaster = FALSE)
# check TRUE using data.table .N
out[, .N, by = "initialPixels"]$N <= maxSizes

# Use exactSize -- gives an exact size, if there is enough space on the Raster
exactSizes <- maxSizes
out <- spread2(a, start = sams, spreadProb = 0.225,
               exactSize = exactSizes, asRaster = FALSE)
out[, .N, by = "initialPixels"]$N == maxSizes # should be TRUE TRUE TRUE

# Use exactSize -- but where it can't be achieved
exactSizes <- sort(sample(100:110, size = length(sams)))
out <- spread2(a, start = sams, 1, exactSize = exactSizes)

# Iterative calling -- create a function with a high escape probability
spreadWithEscape <- function(ras, start, escapeProb, spreadProb) {
  out <- spread2(ras, start = sams, spreadProb = escapeProb, asRaster = FALSE)
  while (any(out$state == "sourceActive")) {
    # pass in previous output as start
    out <- spread2(ras, start = out, spreadProb = spreadProb,
                    asRaster = FALSE, skipChecks = TRUE) # skipChecks for speed
  }
  out
}

set.seed(421)
out1 <- spreadWithEscape(a, sams, escapeProb = 0.25, spreadProb = 0.225)
set.seed(421)
out2 <- spread2(a, sams, 0.225, asRaster = FALSE)
# The one with high escape probability is larger (most of the time)
NROW(out1) > NROW(out2) ## TODO: not true

## Use neighProbs, with a spreadProb that is a RasterLayer
# Create a raster of different values, which will be the relative probabilities
#   i.e., they are rescaled to relative probabilities within the 8 neighbour choices.
#   The neighProbs below means 70% of the time, 1 neighbour will be chosen,
#   30% of the time 2 neighbours.
#   The cells with spreadProb of 5 are 5 times more likely than cells with 1 to be chosen,
#   when they are both within the 8 neighbours
sp <- rast(ext(0, 3, 0, 3), res = 1, vals = 1:9) #small raster, simple values
# Check neighProbs worked
out <- list()

# enough replicates to see stabilized probabilities
for (i in 1:100) {
  out[[i]] <- spread2(sp, spreadProbRel = sp, spreadProb = 1,
                      start = 5, iterations = 1,
                      neighProbs = c(1), asRaster = FALSE)
}
out <- data.table::rbindlist(out)[pixels != 5] # remove starting cell
table(sp[out$pixels])
# should be non-significant -- note no 5 because that was the starting cell
#  This tests whether the null model is true ... there should be proportions
#  equivalent to 1:2:3:4:6:7:8:9 ... i.e,. cell 9 should have 9x as many events
#  spread to it as cell 1. This comes from sp object above which is providing
#  the relative spread probabilities
keep <- c(1:4, 6:9)
chisq.test(keep, unname(tabulate(sp[out$pixels]$lyr.1, 9)[keep]),
           simulate.p.value = TRUE)

## Example showing asymmetry
sams <- ncell(a) / 4 - ncol(a) / 4 * 3
circs <- spread2(a, spreadProb = 0.213, start = sams,
                 asymmetry = 2, asymmetryAngle = 135,
                 asRaster = TRUE)
## ADVANCED: Estimate spreadProb when using asymmetry, such that the expected
##   event size is the same as without using asymmetry

  if (interactive()) {
    # Still using `raster` as it works more easily with parallelism due to not using pointers
    #   This will updated at a later release
    if (requireNamespace("raster", quietly = TRUE)) {
      ras <- raster::raster(a)
      ras[] <- 1

      n <- 100
      sizes <- integer(n)
      for (i in 1:n) {
        circs <- spread2(ras, spreadProb = 0.225,
                         start = round(ncell(ras) / 4 - ncol(ras) / 4 * 3),
                         asRaster = FALSE)
        sizes[i] <- circs[, .N]
      }
      goalSize <- mean(sizes)

      if (requireNamespace("DEoptim", quietly = TRUE)) {
        library(parallel)
        library(DEoptim)

        # need 10 cores for 10 populations in DEoptim
        cl <- makeCluster(pmin(10, detectCores() - 2))
        parallel::clusterEvalQ(cl, {
          library(SpaDES.tools)
          library(terra)
          library(raster)
          library(fpCompare)
        })

        objFn <- function(sp, n = 20, ras, goalSize) {
          sizes <- integer(n)
          for (i in 1:n) {
            circs <- spread2(ras, spreadProb = sp, start = ncell(ras) / 4 - ncol(ras) / 4 * 3,
                             asymmetry = 2, asymmetryAngle = 135,
                             asRaster = FALSE)
            sizes[i] <- circs[, .N]
          }
          abs(mean(sizes) - goalSize)
        }
        aa <- DEoptim(objFn, lower = 0.2, upper = 0.23,
                      control =
                        DEoptim.control(
                          cluster = cl, NP = 10, VTR = 0.02,
                          # imposing itermax simply for example; should let go to completion
                          itermax = 5,
                          initialpop = as.matrix(rnorm(10, 0.213, 0.001))),
                      ras = ras, goalSize = goalSize)

        # The value of spreadProb that will give the
        #    same expected event sizes to spreadProb = 0.225 is:
        sp <- aa$optim$bestmem
        circs <- spread2(ras, spreadProb = sp, start = ncell(ras) / 4 - ncol(ras) / 4 * 3,
                         asymmetry = 2, asymmetryAngle = 135, asRaster = FALSE)

        stopCluster(cl)
      }
    }
  }

  # clean up
  data.table::setDTthreads(origDTThreads)
  options(Ncpus = origNcpus)


PredictiveEcology/SpaDES.tools documentation built on April 18, 2024, 3:21 a.m.