predImportMakeData: Master function to create common sets of data for simulations

View source: R/predImportMakeData.r

predImportMakeData    R Documentation

Master function to create common sets of data for simulations

Description

This function creates multiple simulated data sets for further analysis. The simulated data are of the kind typically needed for modeling species distributions (i.e., training and test presences, test absences, and background sites). A typical workflow is to use predImportMakeData to create simulated data sets, then predImportTrainModels to train species distribution models (SDMs) on those data sets, then predImportEval to evaluate the models.

Usage

predImportMakeData(
  geography,
  response,
  simDir,
  numTrainPres = 200,
  numTrainAbs = 200,
  numTestPres = 200,
  numTestAbs = 200,
  numBg = 10000,
  iters = 1:100,
  circle = FALSE,
  sizeNative = 1024,
  sizeResampled = NULL,
  fileFlag = NULL,
  userdata = NULL,
  b0 = NA,
  b1 = NA,
  b2 = NA,
  b11 = NA,
  b12 = NA,
  mu1 = NA,
  mu2 = NA,
  sigma1 = NA,
  sigma2 = NA,
  rho = NA,
  overwrite = FALSE,
  verbose = 1,
  ...
)

Arguments

geography

A list of lists describing the simulated environmental layers. Each sublist pertains to one layer. See genesis for details.

response

A function describing the response of the species to the environment. This must be one of: logistic, logisticShift, or gaussian.

simDir

Character, path name of directory in which scenario data files are saved.

numTrainPres

Positive integer, number of training presences to locate.

numTrainAbs

Positive integer, number of training absences to locate.

numTestPres

Positive integer, number of test presences to locate.

numTestAbs

Positive integer, number of test absences to locate.

numBg

Positive integer, number of background sites to locate for each of the training and test sets.

iters

Vector of positive integers, data iterations to generate.

circle

Logical, if FALSE (default), all landscapes are square. If TRUE then landscapes are circular.

sizeNative

Positive integer, size of landscape in number of cells on a side. This specifies the spatial resolution at which the species perceives the landscape. See Details.

sizeResampled

Positive integer, size of landscape in number of cells on a side. This specifies the spatial resolution at which environmental data for model calibration and evaluation is available to the "modeler". Note that resampling will thus change the environmental values of the training and test data. See Details.

fileFlag

Either NULL or a character string. If a character string, it is included in the name of each simulated data file and each model file. If NULL (default), nothing is added, so file names take the form "model XXX.RData". If a character string, file names take the form "ALGORITHM FLAG model XXX.RData", where "XXX" is the iteration number, "FLAG" is the string in fileFlag, and "ALGORITHM" is the name of the model algorithm.

userdata

Either NULL (default) or a one-row data frame to be included as part of the data in the sim object. It is useful for specifying aspects of the simulation that are not recorded by default. This metadata is included in the evaluation data frame created by the predImportEval function, so it is available for analysis.

b0

Numeric, parameters for logistic or logisticShift function specified in the response argument (above). Logistic intercept. Default is NA.

b1

Numeric, parameters for logistic or logisticShift function specified in the response argument (above). Logistic slope. Default is NA.

b2

Numeric, parameters for logistic or logisticShift function specified in the response argument (above). Logistic slope. Default is NA.

b11

Numeric, parameters for logistic or logisticShift function specified in the response argument (above). Left-right shift along variable. Default is NA.

b12

Numeric, parameters for logistic or logisticShift function specified in the response argument (above). Interaction term. Default is NA.

mu1

Numeric, parameters for gaussian function specified in the response argument (above). Mean of variable. Default is NA.

mu2

Numeric, parameters for gaussian function specified in the response argument (above). Mean of variable. Default is NA.

sigma1

Numeric, parameters for gaussian function specified in the response argument (above). Standard deviation of variable. Default is NA.

sigma2

Numeric, parameters for gaussian function specified in the response argument (above). Standard deviation of variable. Default is NA.

rho

Numeric, parameters for gaussian function specified in the response argument (above). Covariance term. Default is NA.

overwrite

Logical, if TRUE then save over pre-existing data files. Default is FALSE.

verbose

Numeric, if 0 then show minimal output, 1 more output, 2 even more, >2 all of it.

...

Other arguments (unused).

Details

This function can also be used to examine the effects of a mismatch between the spatial resolution at which a species responds to the environment and the resolution at which environmental data are available to the modeler. When sizeResampled is NULL, the response scale is the same as the scale of the environmental data. When sizeResampled is not NULL and differs from sizeNative, the landscape is resampled to the stated resolution before environmental calibration and evaluation data are extracted. However, presences, absences, and background sites are still drawn from the true probability of presence generated using the native-resolution landscape. Resampling to a different resolution may therefore be (intentionally) "confusing" to the model because it will be presented with environmental data that are not necessarily indicative of the observed state (presence/background).
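The resolution mismatch described above can be illustrated with a small base-R sketch. It does not use the package: the 8-cell landscape, the logistic curve, the probability-weighted sampling of presences, and resampling by block averaging are all assumptions chosen for illustration and may differ from what predImportMakeData does internally.

```r
set.seed(1)

sizeNative <- 8     # cells per side at the resolution the species "perceives"
sizeResampled <- 2  # cells per side at the resolution available to the modeler

# Native-resolution landscape: one variable increasing linearly from west to east.
envNative <- matrix(
  rep(seq(-1, 1, length.out = sizeNative), each = sizeNative),
  nrow = sizeNative
)

# True probability of presence at native resolution (illustrative logistic curve).
probNative <- 1 / (1 + exp(-2 * envNative))

# Draw presence cells with probability proportional to the true probability
# (an assumed sampling scheme, used here only for illustration).
numPres <- 20
presCells <- sample(length(probNative), numPres, replace = TRUE,
                    prob = as.vector(probNative))

# Resample the landscape by averaging square blocks of native cells.
blockSize <- sizeNative / sizeResampled
blockIndex <- ceiling(seq_len(sizeNative) / blockSize)
rowBlock <- blockIndex[as.vector(row(envNative))]
colBlock <- blockIndex[as.vector(col(envNative))]
envCoarse <- tapply(as.vector(envNative), list(rowBlock, colBlock), mean)

# Environment at each presence: what the species experienced versus what the
# modeler would extract from the resampled landscape.
presRow <- (presCells - 1) %% sizeNative + 1
presCol <- (presCells - 1) %/% sizeNative + 1
envAtPresNative <- envNative[presCells]
envAtPresCoarse <- envCoarse[cbind(blockIndex[presRow], blockIndex[presCol])]

comparison <- data.frame(native = envAtPresNative, coarse = envAtPresCoarse)
print(head(comparison))
```

In this sketch the presences are located using the native-resolution probabilities, but the "coarse" column holds only the block averages, so a model trained on it sees environmental values that differ from those that actually determined presence.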

Value

Nothing (saves data files to disc).

See Also

predImportTrainModels, predImportEval
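
Examples

The following is a hedged sketch of a call with a logistic response, using only the arguments listed under Usage. The structure of the geography list (the layer names T1 and F1 and the fields type, min, and max) is an assumption, not something this page specifies; see genesis for the authoritative format. The coefficient values, the userdata column, and the temporary output directory are likewise illustrative. The call is attempted only if the package is installed.

```r
# Hypothetical landscape definition: one linear layer and one random layer.
# The field names are assumptions; consult ?genesis for the actual format.
geography <- list(
  T1 = list(type = "linear", min = -1, max = 1),
  F1 = list(type = "random", min = -1, max = 1)
)

# Optional one-row metadata, carried through to the evaluation data frame.
userdata <- data.frame(scenario = "example", stringsAsFactors = FALSE)

# Illustrative output directory inside the session's temporary folder.
simDir <- file.path(tempdir(), "scenario data")
dir.create(simDir, showWarnings = FALSE, recursive = TRUE)

if (requireNamespace("enmSdmPredImport", quietly = TRUE)) {
  library(enmSdmPredImport)
  predImportMakeData(
    geography = geography,
    response = logistic,
    simDir = simDir,
    numTrainPres = 200,
    numTrainAbs = 200,
    numTestPres = 200,
    numTestAbs = 200,
    numBg = 10000,
    iters = 1:3,          # three data iterations instead of the default 100
    circle = FALSE,
    userdata = userdata,
    # Logistic parameters (illustrative values).
    b0 = 0, b1 = 2, b2 = 1, b11 = 0, b12 = 0,
    # Gaussian parameters, passed explicitly at their documented default.
    mu1 = NA, mu2 = NA, sigma1 = NA, sigma2 = NA, rho = NA,
    overwrite = FALSE,
    verbose = 1
  )
}
```

The data files written to simDir would then be passed on to predImportTrainModels and predImportEval.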


adamlilith/enmSdmPredImport documentation built on Dec. 31, 2022, 5:40 p.m.