stats_evalues: Statistics of environmental conditions in M and for...

View source: R/stats_evalues.R

stats_evalues R Documentation

Statistics of environmental conditions in M and for occurrences (multiple variables)

Description

stats_evalues creates csv files with statistics of environmental conditions in accessible areas (M) and at species occurrence records. Data are read directly from a local directory, and the function can be applied to multiple species and multiple variables.

Usage

stats_evalues(stats = c("median", "range"), M_folder, M_format, occ_folder,
              longitude, latitude, var_folder, var_format, round = FALSE,
              round_names, multiplication_factor = 1, percentage_out = 0,
              save = FALSE, output_directory, overwrite = FALSE,
              verbose = TRUE)

Arguments

stats

(character) name or vector of names of functions to be applied to get basic statistics of environmental values.

M_folder

(character) name of the folder containing files representing the accessible area (M) for each species to be analyzed. See details.

M_format

(character) format of files representing the accessible area (M) for the species. Names of M files must match the ones for occurrence files in occ_folder. Format options are: "shp", "gpkg", or any of the options supported by rast (e.g., "tif" or "asc").

occ_folder

(character) name of the folder containing csv files of occurrence data for all species. Names of csv files must match the ones of M files in M_folder.

longitude

(character) name of the column in occurrence files containing values of longitude.

latitude

(character) name of the column in occurrence files containing values of latitude.

var_folder

(character) name of the folder containing layers to represent environmental variables.

var_format

format of layers to represent environmental variables. Format options are all the ones supported by rast (e.g., "tif" or "asc").

round

(logical) whether or not to round the values of one or more variables after multiplying them by the value in multiplication_factor. Default = FALSE. See details.

round_names

(character) names of the variables to be rounded. Default = NULL. If round = TRUE, names must be defined.

multiplication_factor

(numeric) value to be used to multiply the variables defined in round_names. Default = 1.

percentage_out

(numeric) percentage of extreme environmental data in M to be excluded in bin creation for further analyses. See details. Default = 0.

save

(logical) whether or not to save the results in the working directory. Default = FALSE.

output_directory

(character) name of the folder in which results will be written.

overwrite

(logical) whether or not to overwrite existing results in output_directory. Default = FALSE.

verbose

(logical) whether messages should be printed. Default = TRUE.
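Because names of M files must match names of occurrence csv files, it may help to compare both folders before running the function. The following is a minimal sketch in base R; check_matching_names is a hypothetical helper written for this illustration and is not part of nichevol, and the folder and file names are placeholders.

```r
# Hypothetical helper (not part of nichevol): compares base names of files
# in two folders, ignoring file extensions.
check_matching_names <- function(M_folder, M_format, occ_folder) {
  m_files <- list.files(M_folder, pattern = paste0("\\.", M_format, "$"))
  occ_files <- list.files(occ_folder, pattern = "\\.csv$")

  m_names <- tools::file_path_sans_ext(m_files)
  occ_names <- tools::file_path_sans_ext(occ_files)

  list(
    only_in_M = setdiff(m_names, occ_names),
    only_in_occ = setdiff(occ_names, m_names),
    all_match = setequal(m_names, occ_names)
  )
}

# Small demonstration with empty placeholder files in a temporary folder
demo_dir <- file.path(tempdir(), "name_check_demo")
m_dir <- file.path(demo_dir, "M_areas")
occ_dir <- file.path(demo_dir, "records")
dir.create(m_dir, recursive = TRUE, showWarnings = FALSE)
dir.create(occ_dir, recursive = TRUE, showWarnings = FALSE)

file.create(file.path(m_dir, c("sp1.gpkg", "sp2.gpkg")))
file.create(file.path(occ_dir, c("sp1.csv", "sp3.csv")))

# sp2 has no occurrence file and sp3 has no M file, so names do not all match
name_check <- check_matching_names(m_dir, "gpkg", occ_dir)
name_check
```

Any name listed in only_in_M or only_in_occ points to a species with a missing counterpart file.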

Details

Coordinates in csv files in occ_folder, SpatVector-like files in M_folder, and raster layers in var_folder must all be represented in the same geographic projection. WGS84 with no planar projection is recommended.

Accessible area (M) is understood as the geographic area that has been accessible for a species for relevant periods of time. Defining M is usually a hard task, but also a very important one, because it allows identifying uncertainties about the ability of a species to maintain populations in certain environmental conditions. For further details on this topic, see Barve et al. (2011) doi:10.1016/j.ecolmodel.2011.02.011 and Machado-Stredel et al. (2021) doi:10.21425/F5FBG48814.

Rounding variables may be useful when multiple variables are considered and the values of some or all of them are too small (e.g., when using principal components). To round specific variables, the arguments round, round_names, and multiplication_factor must be used accordingly.
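The effect of multiplying and then rounding can be illustrated with a few made-up values. This is only a sketch of the idea in base R, not the function's internal code; the data frame and its column names (pc1, temp) are invented for the example.

```r
# Illustrative values; the data and column names are made up for this sketch
pc_values <- data.frame(
  pc1 = c(0.0123, 0.0457, 0.0891),
  temp = c(12.4, 18.9, 25.1)
)

round_names <- "pc1"            # variables to be rescaled and rounded
multiplication_factor <- 1000   # scales small values before rounding

# Rounding pc1 directly would collapse every value to 0
round(pc_values$pc1)

# Multiplying first keeps the differences among values after rounding
rounded <- pc_values
rounded[round_names] <- round(pc_values[round_names] * multiplication_factor)
rounded
```

Variables not listed in round_names (temp in this sketch) are left unchanged.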

The percentage to be defined in percentage_out excludes a percentage of extreme environmental values to prevent the algorithm from considering extremely rare environmental values in the accessible area for the species (M). Being too rare, these values may have never been explored by the species; therefore, including them in the process of preparation of the table of characters (bin table) is risky.
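As a rough illustration of what excluding a percentage of extreme values means, the sketch below trims simulated M values symmetrically, removing half of the percentage from each tail with quantile. The symmetric trimming rule is an assumption made for this example; the procedure used internally by nichevol may differ.

```r
set.seed(1)

# Simulated environmental values in M, plus two very rare extremes
m_values <- c(rnorm(1000, mean = 20, sd = 3), 55, -10)

percentage_out <- 5

# Assumption for this sketch: half of the percentage is removed from each tail
tail_prob <- (percentage_out / 100) / 2
limits <- quantile(m_values, probs = c(tail_prob, 1 - tail_prob))

kept <- m_values[m_values >= limits[1] & m_values <= limits[2]]

range(m_values)  # includes the rare extremes
range(kept)      # extremes excluded
```

With percentage_out = 0, no values would be excluded and both ranges would be identical.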

Value

A list, named after the variables present in var_folder, containing all tables with statistics of environmental values in M and in species records. If save is set to TRUE, a folder named as in output_directory is created, containing all resulting csv files with the tables of statistics.
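To show how function names passed in stats can turn into a table of statistics, here is a small base R sketch. The values, the helper summarize_values, and the table layout are all hypothetical; the tables actually returned by stats_evalues may be organized differently.

```r
stat_names <- c("median", "range")

# Made-up environmental values for one variable
m_vals <- c(10.2, 11.5, 13.1, 14.8, 19.6)   # values in M
occ_vals <- c(11.5, 13.1, 14.8)             # values at occurrence records

# Hypothetical helper: applies each function, looked up by name, to the values
summarize_values <- function(values, stat_names) {
  unlist(lapply(stat_names, function(f) match.fun(f)(values)))
}

stats_table <- rbind(
  M = summarize_values(m_vals, stat_names),
  occurrences = summarize_values(occ_vals, stat_names)
)

# Column names are written by hand for these two statistics;
# range returns two values (minimum and maximum)
colnames(stats_table) <- c("median", "range_min", "range_max")
stats_table
```

Any function that accepts a numeric vector (for example "mean" or "sd") could be named in stat_names in this sketch, although the column names would need to be adjusted.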

Examples

# preparing data and directories for examples
## directories
tempdir <- file.path(tempdir(), "nevol_test")
dir.create(tempdir)

cvariables <- paste0(tempdir, "/variables")
dir.create(cvariables)

records <- paste0(tempdir, "/records")
dir.create(records)

m_areas <- paste0(tempdir, "/M_areas")
dir.create(m_areas)

## data
data("occ_list", package = "nichevol")

temp <- system.file("extdata", "temp.tif", package = "nichevol")

m_files <- list.files(system.file("extdata", package = "nichevol"),
                      pattern = "m\\d\\.gpkg$", full.names = TRUE)

## writing data in temporary directories
spnames <- sapply(occ_list, function (x) as.character(x[1, 1]))
ocnames <-  paste0(records, "/", spnames, ".csv")

occs <- lapply(seq_along(spnames), function (x) {
  write.csv(occ_list[[x]], ocnames[x], row.names = FALSE)
})

## file name of the layer, without its directory
otemp <- basename(temp)
file.copy(from = temp, to = paste0(cvariables, "/", otemp))

file.copy(from = m_files, to = paste0(m_areas, "/", spnames, ".gpkg"))

# running the function
stats <- stats_evalues(stats = c("median", "range"), M_folder = m_areas,
                       M_format = "gpkg", occ_folder = records,
                       longitude = "x", latitude = "y",
                       var_folder = cvariables, var_format = "tif",
                       percentage_out = 5)

nichevol documentation built on March 31, 2023, 5:38 p.m.