validate: Validates the accuracy of the calibrated reproductive rate...

View source: R/validate.R

validate    R Documentation

Validates the accuracy of the calibrated reproductive rate and dispersal scales of the PoPS model.

Description

This function uses the quantity, allocation, and configuration disagreement to validate the model across the landscape using the parameters from the calibrate function. Ideally, the model is calibrated with 2 or more years of data and validated against the last year; if you have 6 or more years of data, the model can be validated against the final 2 years.
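
As a rough illustration of two of the measures named above, the sketch below computes quantity and allocation disagreement for a pair of presence/absence maps, using the common map-comparison definitions (net difference in counts versus swapped locations). The matrices `observed` and `simulated` and the helper `disagreement()` are hypothetical, and the package's own computation may differ in detail; configuration disagreement is not shown.

```r
# Sketch only: quantity and allocation disagreement for two binary maps.
# `observed` and `simulated` are hypothetical 0/1 matrices of equal size.
disagreement <- function(observed, simulated) {
  stopifnot(all(dim(observed) == dim(simulated)))
  false_negatives <- sum(observed == 1 & simulated == 0)
  false_positives <- sum(observed == 0 & simulated == 1)
  n <- length(observed)
  list(
    # Net difference in the number of infected cells.
    quantity = abs(false_positives - false_negatives) / n,
    # Errors that cancel in number but differ in location.
    allocation = 2 * min(false_positives, false_negatives) / n
  )
}

observed <- matrix(c(1, 1, 0, 0,
                     0, 1, 0, 0), nrow = 2, byrow = TRUE)
simulated <- matrix(c(1, 0, 1, 0,
                      0, 1, 1, 0), nrow = 2, byrow = TRUE)
result <- disagreement(observed, simulated)
```

In this toy case one infected cell is missed and two uninfected cells are predicted infected, so the two measures sum to the total share of mismatched cells (3 of 8).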

Usage

validate(
  infected_years_file,
  number_of_iterations = 10,
  number_of_cores = NA,
  parameter_means,
  parameter_cov_matrix,
  pest_host_table,
  competency_table,
  infected_file_list,
  host_file_list,
  total_populations_file,
  temp = FALSE,
  temperature_coefficient_file = "",
  precip = FALSE,
  precipitation_coefficient_file = "",
  model_type = "SI",
  latency_period = 0,
  time_step = "month",
  season_month_start = 1,
  season_month_end = 12,
  start_date = "2008-01-01",
  end_date = "2008-12-31",
  use_survival_rates = FALSE,
  survival_rate_month = 3,
  survival_rate_day = 15,
  survival_rates_file = "",
  use_lethal_temperature = FALSE,
  temperature_file = "",
  lethal_temperature = -12.87,
  lethal_temperature_month = 1,
  mortality_frequency = "year",
  mortality_frequency_n = 1,
  management = FALSE,
  treatment_dates = c(""),
  treatments_file = "",
  treatment_method = "ratio",
  natural_kernel_type = "cauchy",
  anthropogenic_kernel_type = "cauchy",
  natural_dir = "NONE",
  anthropogenic_dir = "NONE",
  pesticide_duration = 0,
  pesticide_efficacy = 1,
  mask = NULL,
  output_frequency = "year",
  output_frequency_n = 1,
  movements_file = "",
  use_movements = FALSE,
  start_exposed = FALSE,
  generate_stochasticity = TRUE,
  establishment_stochasticity = TRUE,
  movement_stochasticity = TRUE,
  dispersal_stochasticity = TRUE,
  establishment_probability = 0.5,
  dispersal_percentage = 0.99,
  quarantine_areas_file = "",
  use_quarantine = FALSE,
  use_spreadrates = FALSE,
  use_overpopulation_movements = FALSE,
  overpopulation_percentage = 0,
  leaving_percentage = 0,
  leaving_scale_coefficient = 1,
  exposed_file_list = "",
  write_outputs = "None",
  output_folder_path = "",
  point_file = "",
  network_filename = "",
  network_movement = "walk",
  use_distance = FALSE,
  use_configuration = FALSE,
  use_initial_condition_uncertainty = FALSE,
  use_host_uncertainty = FALSE,
  weather_type = "deterministic",
  temperature_coefficient_sd_file = "",
  precipitation_coefficient_sd_file = "",
  dispersers_to_soils_percentage = 0,
  quarantine_directions = "",
  multiple_random_seeds = FALSE,
  file_random_seeds = NULL,
  use_soils = FALSE,
  soil_starting_pest_file = "",
  start_with_soil_populations = FALSE,
  county_level_infection_data = FALSE
)

Arguments

infected_years_file

years of initial infection/infestation as individual locations of a pest or pathogen in raster format

number_of_iterations

How many iterations to run so that the calibration can converge (at least 10).

number_of_cores

How many cores to use (default = NA). If not set, uses the number of CPU cores - 1. Must be an integer >= 1.
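
The default described above can be sketched as follows. `resolve_cores()` is a hypothetical helper, not part of the package, and the fallback of 2 detected cores when detection fails is an assumption made only for this example.

```r
# Sketch only: resolve the number of cores the way the description reads.
resolve_cores <- function(number_of_cores = NA) {
  if (is.na(number_of_cores)) {
    detected <- parallel::detectCores()
    # Assumption: fall back to 2 if the core count cannot be detected.
    if (is.na(detected)) detected <- 2L
    return(max(1L, detected - 1L))
  }
  stopifnot(number_of_cores >= 1,
            number_of_cores == as.integer(number_of_cores))
  as.integer(number_of_cores)
}

cores_default <- resolve_cores()   # CPU cores - 1, but never below 1
cores_fixed <- resolve_cores(4)    # explicit request
```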

parameter_means

The parameter means from the ABC calibration function (posterior means).

parameter_cov_matrix

the parameter covariance matrix from the ABC calibration function (posterior covariance matrix)
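
One plausible way to use a vector of posterior means together with a posterior covariance matrix is to draw one parameter set per iteration from a multivariate normal distribution. Whether validate does exactly this internally is an assumption here; the parameter names and values below are purely illustrative.

```r
set.seed(42)

# Hypothetical posterior summary for two parameters (names are illustrative).
parameter_means <- c(reproductive_rate = 1.5, natural_distance_scale = 20)
parameter_cov_matrix <- matrix(c(0.04, 0.10,
                                 0.10, 4.00), nrow = 2)

# Draw n parameter sets from N(means, cov_matrix) using a Cholesky factor.
draw_parameters <- function(n, means, cov_matrix) {
  upper <- chol(cov_matrix)  # t(upper) %*% upper equals cov_matrix
  z <- matrix(rnorm(n * length(means)), nrow = n)
  draws <- sweep(z %*% upper, 2, means, `+`)
  colnames(draws) <- names(means)
  draws
}

# One row per iteration, one column per parameter.
draws <- draw_parameters(10, parameter_means, parameter_cov_matrix)
```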

pest_host_table

The file path to a csv that has these columns in this order: host, susceptibility, mortality rate, and mortality time lag as columns with each row being the species. Host species must be in the same order in the host_file_list, infected_file_list, pest_host_table rows, and competency_table columns. The host column is only used for metadata and labeling output files.

competency_table

A csv with the hosts as the first n columns (n being the number of hosts) and the last column being the competency value. Each row is a set of Booleans for host presence and the competency value (between 0 and 1) for that combination of hosts in a cell.
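
A minimal sketch of the layout described above for two hosts. The column names, the 0/1 encoding of the Booleans, the competency values, and the inclusion of an all-absent row are assumptions for illustration; check them against your installed version of the package.

```r
# Sketch only: a competency table for two hypothetical hosts.
competency_table <- data.frame(
  host_1 = c(0, 1, 0, 1),
  host_2 = c(0, 0, 1, 1),
  competency = c(0, 0.9, 0.3, 0.6)
)

# Basic checks matching the description: n host columns of presence flags,
# then a competency value between 0 and 1 in the last column.
n_hosts <- ncol(competency_table) - 1
host_columns <- competency_table[, seq_len(n_hosts), drop = FALSE]
stopifnot(all(as.matrix(host_columns) %in% c(0, 1)))
stopifnot(all(competency_table$competency >= 0 &
              competency_table$competency <= 1))

# Write to a temporary csv whose path could be passed as competency_table.
csv_path <- tempfile(fileext = ".csv")
write.csv(competency_table, csv_path, row.names = FALSE)
```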

infected_file_list

Paths to raster files with initial infections and standard deviation for each host. Each file can be in one of 2 formats (a single file with number of hosts, or a single file with 2 layers: number of hosts and standard deviation). Units for infections depend on data availability and on the units used for your host file (e.g. percent area, # of hosts per cell, etc.).

host_file_list

Paths to raster files with number of hosts and standard deviation on those estimates. Each file can be in one of 2 formats (a single file with number of hosts, or a single file with 2 layers: number of hosts and standard deviation). The units can take many forms; the two most common that we use are percent area (0 to 100) or # of hosts in the cell. The choice usually depends on data available and estimation methods.

total_populations_file

Path to raster file with number of total populations of all hosts and non-hosts. This depends on how your host data is set up. If host is percent area, then this should be a raster with values that are 100 anywhere with host. If the host file is # of hosts in a cell, then this should be a raster with values that are the max of the host raster anywhere the # of hosts is greater than 0.
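
The two rules above can be sketched with plain matrices standing in for rasters. The example values are hypothetical, and setting cells without hosts to 0 is an assumption, since the description only states the value where hosts are present.

```r
# Sketch only: derive total populations from a host layer.

# Case 1: hosts given as counts per cell.
host_counts <- matrix(c(0, 12, 40,
                        5,  0, 25), nrow = 2, byrow = TRUE)
# Max of the host layer wherever hosts are present (0 elsewhere: assumption).
total_from_counts <- ifelse(host_counts > 0, max(host_counts), 0)

# Case 2: hosts given as percent area (0 to 100).
host_percent <- matrix(c(0, 30, 100,
                         10, 0, 55), nrow = 2, byrow = TRUE)
# 100 wherever hosts are present (0 elsewhere: assumption).
total_from_percent <- ifelse(host_percent > 0, 100, 0)
```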

temp

boolean that allows the use of temperature coefficients to modify spread (TRUE or FALSE)

temperature_coefficient_file

Path to raster file with temperature coefficient data for the time step and time period specified (e.g. if time_step = week, start_date = 2017-01-01, and end_date = 2019-12-31, this file would have 52 * 3 = 156 bands with data being weekly temperature coefficients). We convert raw temperature values to coefficients that affect the reproduction and survival of the pest; all values in the raster are between 0 and 1.

precip

boolean that allows the use of precipitation coefficients to modify spread (TRUE or FALSE)

precipitation_coefficient_file

Path to raster file with precipitation coefficient data for the time step and time period specified (e.g. if time_step = week, start_date = 2017-01-01, and end_date = 2019-12-31, this file would have 52 * 3 = 156 bands with data being weekly precipitation coefficients). We convert raw precipitation values to coefficients that affect the reproduction and survival of the pest; all values in the raster are between 0 and 1.
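
The band arithmetic in the two coefficient file descriptions above can be checked with a small helper. `expected_bands()` is hypothetical and assumes the simulation covers whole calendar years; it only handles the weekly and monthly cases, and dates are written in ISO form so that `as.Date()` can parse them.

```r
# Sketch only: expected number of bands in a weather coefficient raster,
# assuming whole calendar years between start_date and end_date.
expected_bands <- function(start_date, end_date,
                           time_step = c("week", "month")) {
  time_step <- match.arg(time_step)
  years <- as.integer(format(as.Date(end_date), "%Y")) -
    as.integer(format(as.Date(start_date), "%Y")) + 1L
  steps_per_year <- c(week = 52L, month = 12L)[[time_step]]
  years * steps_per_year
}

weekly_bands <- expected_bands("2017-01-01", "2019-12-31", "week")   # 156
monthly_bands <- expected_bands("2017-01-01", "2019-12-31", "month") # 36
```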

model_type

What type of model most represents your system. Options are "SEI" (Susceptible - Exposed - Infected/Infested) or "SI" (Susceptible - Infected/Infested). Default value is "SI".

latency_period

How many time steps it takes for exposed populations to become infected/infested. This is an integer value and must be greater than 0 if model type is SEI.

time_step

How often spread should occur. Options: ('day', 'week', 'month').

season_month_start

When does spread first start occurring in the year for your pest or pathogen (integer value between 1 and 12)

season_month_end

When does spread end during the year for your pest or pathogen (integer value between 1 and 12)

start_date

Date to start the simulation with format ('YYYY-MM-DD'), as in the default "2008-01-01"

end_date

Date to end the simulation with format ('YYYY-MM-DD'), as in the default "2008-12-31"

use_survival_rates

Boolean to indicate if the model will use survival rates to limit the survival or emergence of overwintering generations.

survival_rate_month

What month do overwintering generations emerge? We suggest using the month before for this parameter, as it is when the survival rates raster will be applied.

survival_rate_day

What day should the survival rates be applied

survival_rates_file

Raster file with survival rates from 0 to 1 representing the percentage of emergence for a cell.

use_lethal_temperature

A boolean to answer the question: does your pest or pathogen have a temperature at which it cannot survive? (TRUE or FALSE)

temperature_file

Path to raster file with temperature data for minimum temperature

lethal_temperature

The temperature in degrees C at which lethal temperature related mortality occurs for your pest or pathogen (-50 to 60)

lethal_temperature_month

The month in which lethal temperature related mortality occurs for your pest or pathogen (integer value between 1 and 12)

mortality_frequency

Sets how often mortality calculations occur: either 'year', 'month', 'week', 'day', 'time step', or 'every_n_steps'

mortality_frequency_n

Sets number of units from mortality_frequency in which to run the mortality calculation if mortality_frequency is 'every_n_steps'. Must be an integer >= 1.

management

Boolean to allow use of management (TRUE or FALSE)

treatment_dates

List of dates on which to apply treatment, with format ('YYYY-MM-DD') (needs to be the same length as treatments_file and pesticide_duration)

treatments_file

Path to raster files with treatment data by dates. Needs to be a list of files the same length as treatment_dates and pesticide_duration.

treatment_method

What method to use when applying treatment, one of ("ratio" or "all infected"). "ratio" removes a portion of all infected and susceptibles; "all infected" removes all infected and a portion of susceptibles.

natural_kernel_type

What type of dispersal kernel should be used for natural dispersal. Current dispersal kernel options are ('cauchy', 'exponential', 'uniform', 'deterministic neighbor', 'power law', 'hyperbolic secant', 'gamma', 'weibull', 'logistic')

anthropogenic_kernel_type

What type of dispersal kernel should be used for anthropogenic dispersal. Current dispersal kernel options are ('cauchy', 'exponential', 'uniform', 'deterministic neighbor', 'power law', 'hyperbolic secant', 'gamma', 'weibull', 'logistic', 'network')

natural_dir

Sets the predominant direction of natural dispersal, usually due to wind. Values: ('N', 'NW', 'W', 'SW', 'S', 'SE', 'E', 'NE', 'NONE')

anthropogenic_dir

Sets the predominant direction of anthropogenic dispersal, usually due to human movement, typically over long distances (e.g. nursery trade, movement of firewood, etc.). Values: ('N', 'NW', 'W', 'SW', 'S', 'SE', 'E', 'NE', 'NONE')

pesticide_duration

How long the pesticide (herbicide, vaccine, etc.) lasts before the host is susceptible again. If the value is 0, the treatment is a culling (i.e. host removal), not a pesticide treatment. (Needs to be the same length as treatment_dates and treatments_file.)
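
The requirement that treatment_dates, treatments_file, and pesticide_duration have the same length can be checked before calling validate. `check_treatments()` below is a hypothetical helper, the file names are placeholders, and the hyphenated date format is assumed from the start_date default shown in Usage.

```r
# Sketch only: pre-flight check for the three treatment arguments.
check_treatments <- function(treatment_dates, treatments_file,
                             pesticide_duration) {
  lengths_match <-
    length(treatment_dates) == length(treatments_file) &&
    length(treatment_dates) == length(pesticide_duration)
  # Unparseable dates become NA under an explicit format.
  dates_ok <- !anyNA(as.Date(treatment_dates, format = "%Y-%m-%d"))
  lengths_match && dates_ok
}

ok <- check_treatments(
  treatment_dates = c("2008-06-01", "2008-09-01"),
  treatments_file = c("treatment_june.tif", "treatment_sept.tif"),  # placeholders
  pesticide_duration = c(0, 30)  # 0 = culling, 30 = lasting pesticide effect
)
mismatch <- check_treatments(
  treatment_dates = c("2008-06-01", "2008-09-01"),
  treatments_file = "treatment_june.tif",
  pesticide_duration = c(0, 30)
)
```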

pesticide_efficacy

How effective is the pesticide at preventing the disease or killing the pest (if this is 0.70 then when applied it successfully treats 70 percent of the plants or animals).

mask

Raster file used to provide a mask to remove 0's that are not true negatives from comparisons (e.g. mask out lakes and oceans from statistics if modeling terrestrial species).

output_frequency

Sets when outputs occur: either 'year', 'month', 'week', 'day', 'time step', or 'every_n_steps'

output_frequency_n

Sets number of units from output_frequency in which to export model results if output_frequency is 'every_n_steps'. Must be an integer >= 1.

movements_file

This is a csv file with columns lon_from, lat_from, lon_to, lat_to, number of animals, and date.

use_movements

This is a boolean to turn on use of the movement module.

start_exposed

Do your initial conditions start as exposed or infected (only used if model_type is "SEI"). Default is FALSE. If this is TRUE, you need to provide both infected_file_list (this can be a raster of all 0's) and exposed_file_list.

generate_stochasticity

Boolean to indicate whether to use stochasticity in reproductive functions. Default is TRUE.

establishment_stochasticity

Boolean to indicate whether to use stochasticity in establishment functions. Default is TRUE.

movement_stochasticity

Boolean to indicate whether to use stochasticity in movement functions. Default is TRUE.

dispersal_stochasticity

Boolean to indicate whether to use stochasticity in the dispersal kernel. Default is TRUE.

establishment_probability

Threshold to determine establishment if establishment_stochasticity is FALSE (range 0 to 1, default = 0.5)

dispersal_percentage

Percentage of dispersal used to calculate the bounding box for deterministic dispersal

quarantine_areas_file

Path to raster file with quarantine boundaries used in calculating likelihood of quarantine escape if use_quarantine is TRUE

use_quarantine

Boolean to indicate whether or not there is a quarantine area. If TRUE, you must pass in a raster file indicating the quarantine areas (default = FALSE).

use_spreadrates

Boolean to indicate whether or not to calculate spread rates

use_overpopulation_movements

Boolean to indicate whether to use the overpopulation pest movement module (driven by the natural kernel with its scale parameter modified by a coefficient)

overpopulation_percentage

Percentage of occupied hosts when the cell is considered to be overpopulated

leaving_percentage

Percentage of pests leaving an overpopulated cell

leaving_scale_coefficient

Coefficient to multiply scale parameter of the natural kernel (if applicable)

exposed_file_list

Paths to raster files with initial exposed populations and standard deviation for each host. Each file can be in one of 2 formats (a single file with number of hosts, or a single file with 2 layers: number of hosts and standard deviation). Units for infections depend on data availability and on the units used for your host file (e.g. percent area, # of hosts per cell, etc.).

write_outputs

One of "summary_outputs", "all_simulations", or "None". If not "None", output_folder_path must be provided.

output_folder_path

this is the full path with either / or \ (e.g., "C:/user_name/desktop/pops_sod_2020_2023/outputs/")

point_file

File for point comparison. If not provided, the point comparison calculations are skipped.

network_filename

The entire file path for the network file. Used if anthropogenic_kernel_type = 'network'.

network_movement

What movement type do you want to use in the network kernel either "walk", "jump", or "teleport". "walk" allows dispersing units to leave the network at any cell along the edge. "jump" automatically moves to the nearest node when moving through the network. "teleport" moves from node to node most likely used for airport and seaport networks.

use_distance

Boolean if you want to compare distance between simulations and observations. Default is FALSE.

use_configuration

Boolean if you want to use configuration disagreement for comparing model runs. Default is FALSE.

use_initial_condition_uncertainty

Boolean to indicate whether or not to propagate and partition uncertainty from initial conditions. If TRUE, each file in infected_file_list needs to have 2 layers, one with the mean value and one with the standard deviation. If an SEI model is used, each file in exposed_file_list also needs to have 2 layers, one with the mean value and one with the standard deviation.

use_host_uncertainty

Boolean to indicate whether or not to propagate and partition uncertainty from host data. If TRUE, each file in host_file_list needs to have 2 layers, one with the mean value and one with the standard deviation.

weather_type

String indicating how the weather data is passed in: either as a mean and standard deviation to represent uncertainty ("probabilistic") or as a time series ("deterministic").

temperature_coefficient_sd_file

Raster file with temperature coefficient standard deviation data for the time step and time period specified (e.g. if time_step = week, this file would have 52 bands with data being weekly temperature coefficient standard deviations). We convert raw temperature values to coefficients that affect the reproduction and survival of the pest; all values in the raster are between 0 and 1.

precipitation_coefficient_sd_file

Raster file with precipitation coefficient standard deviation data for the time step and time period specified (e.g. if time_step = week, this file would have 52 bands with data being weekly precipitation coefficient standard deviations). We convert raw precipitation values to coefficients that affect the reproduction and survival of the pest; all values in the raster are between 0 and 1.
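
To illustrate what passing weather as a mean and a standard deviation could look like for a single time step, the sketch below draws one coefficient realization per cell and keeps it within the 0 to 1 range stated above. The normal draw and the clamping are assumptions made for this example; the model's own sampling scheme may differ. The matrices stand in for one band of the mean and standard deviation rasters.

```r
set.seed(1)

# Hypothetical mean and standard deviation for one band (2 x 2 cells).
coefficient_mean <- matrix(c(0.20, 0.80,
                             0.50, 0.95), nrow = 2, byrow = TRUE)
coefficient_sd <- matrix(c(0.05, 0.10,
                           0.20, 0.10), nrow = 2, byrow = TRUE)

# Assumption: independent normal draw per cell, clamped to [0, 1].
draw <- coefficient_mean + coefficient_sd * rnorm(length(coefficient_mean))
draw <- pmin(pmax(draw, 0), 1)
```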

dispersers_to_soils_percentage

Range from 0 to 1 representing the percentage of dispersers that fall to the soil and survive.

quarantine_directions

String with comma separated directions to include in the quarantine direction analysis, e.g., 'N,E'. By default all directions (N, S, E, W) are considered
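
A comma separated direction string such as 'N,E' can be split and checked as follows. `parse_directions()` is a hypothetical helper; treating the empty default string as "all directions" follows the description above but is still an assumption about the implementation.

```r
# Sketch only: turn a quarantine_directions string into a character vector.
parse_directions <- function(quarantine_directions = "") {
  all_directions <- c("N", "S", "E", "W")
  # Assumption: an empty string means all four directions are considered.
  if (!nzchar(quarantine_directions)) {
    return(all_directions)
  }
  requested <- trimws(strsplit(quarantine_directions, ",", fixed = TRUE)[[1]])
  stopifnot(all(requested %in% all_directions))
  requested
}

north_east <- parse_directions("N,E")
everything <- parse_directions("")
```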

multiple_random_seeds

Boolean to indicate if the model should use multiple random seeds (allows for performing uncertainty partitioning) or a single random seed (backwards compatibility option). Default is FALSE.

file_random_seeds

A file path to the .csv file containing the random_seeds table. Use this if you are trying to recreate an exact analysis; otherwise we suggest leaving the default. Default is NULL, which draws the seed numbers for each.

use_soils

Boolean to indicate if pests establish in the soil and spread out from there. Typically used for soil borne pathogens.

soil_starting_pest_file

path to the raster file with the starting amount of pest or pathogen.

start_with_soil_populations

Boolean to indicate whether to use a starting soil pest or pathogen population. If TRUE, then soil_starting_pest_file is required.

county_level_infection_data

Boolean to indicate if infection data is at the county level. If TRUE, then the files in infected_file_list should be polygon rasters with county level infection/infestation counts.

Value

A data frame of statistical measures of model performance.


ncsu-landscape-dynamics/rpops documentation built on March 30, 2024, 2:17 p.m.