lapply_kfold_species: Apply a function over the folds of a set of species

View source: R/lapply.R

Description

lapply_kfold_species applies fun to each of the specified folds for all species, or for the provided subset of species, and returns the results as a list of lists.

Usage

lapply_kfold_species(fun, ..., species = NULL, fold_type = "disc", k = 1:5)

Arguments

fun

function. The function to be applied to the occurrence records of each species. Its parameters are the species name, a list with the occurrence and background training and test records, and a fold number.

...

optional arguments to fun.

species

dataframe or character vector. A dataframe like the one returned by list_species, or the names of the species. If NULL (default), fun is applied to all species.

fold_type

character. Type of partitioning to use: one of "disc", "grid_4", "grid_9", "random" or "targetgroup" (see Details). Default is "disc".

k

integer vector. Numbers of the folds you want to get data for. To get all 5 folds, use 1:5, which is the default.

Details

The parameters passed to fun are speciesname, data and fold. data is a list with 4 elements (occurrence_training, occurrence_test, background_training and background_test), and fold contains the fold number.
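As a minimal sketch of a function with this signature, the following counts the records in each of the four elements of data. The mock data list, the species name and the helper make_points are assumptions made only so the sketch runs without downloading any data; the real data frames contain whatever columns the package provides.

```r
# Hypothetical callback with the (speciesname, data, fold) signature
count_records <- function(speciesname, data, fold) {
  data.frame(
    species = speciesname,
    fold = fold,
    occurrence_training = nrow(data$occurrence_training),
    occurrence_test = nrow(data$occurrence_test),
    background_training = nrow(data$background_training),
    background_test = nrow(data$background_test)
  )
}

# Mock input (assumed structure) standing in for the list passed to fun
make_points <- function(n) {
  data.frame(longitude = seq_len(n), latitude = seq_len(n))
}
mock_data <- list(
  occurrence_training = make_points(8),
  occurrence_test = make_points(2),
  background_training = make_points(40),
  background_test = make_points(10)
)

counts <- count_records("Example species", mock_data, fold = 1)
print(counts)
```

With the package data available, the same function could presumably be passed directly, for example as lapply_kfold_species(count_records, fold_type = "random", k = 1:2).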

The available fold_type values are:

"disc": 5-fold disc partitioning of occurrences with pairwise distance sampled and buffer filtered random background points, equivalent to calling kfold_occurrence_background with occurrence_fold_type = "disc", k = 5, pwd_sample = TRUE, background_buffer = 200*1000

"grid_4" and "grid_9": 4-fold and 9-fold grid partitioning of occurrences with pairwise distance sampled and buffer filtered random background points, equivalent to calling kfold_occurrence_background with occurrence_fold_type = "grid", k = 4 (presumably k = 9 for "grid_9"), pwd_sample = TRUE, background_buffer = 200*1000

"random": 5-fold random partitioning of occurrences and random background points, equivalent to calling kfold_occurrence_background with occurrence_fold_type = "random", k = 5, pwd_sample = FALSE, background_buffer = 0

"targetgroup": partitioned in the same way as the "random" folds, but instead of random background points, a random subset of all occurrence points is used. This creates a targetgroup background point set with the same sampling bias as the entire dataset.

Value

A list with one named entry for every species provided, or for all species if none were provided. Every list entry is itself a list with the fold numbers in k as names and the result of fun as values.
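The nested structure can be traversed with two levels of indexing, first by species name and then by fold. The sketch below builds a mock result by hand and flattens it into a data frame. The species names and values are made up, and it assumes the fold names are the fold numbers stored as character strings.

```r
# Mock result with the documented shape: species -> fold -> result of fun
result <- list(
  "Species A" = list("1" = 10, "2" = 12),
  "Species B" = list("1" = 7, "2" = 9)
)

# Result of fun for one species and one fold
result[["Species A"]][["2"]]

# Flatten into one row per species and fold
rows <- lapply(names(result), function(sp) {
  folds <- result[[sp]]
  data.frame(
    species = sp,
    fold = as.integer(names(folds)),
    value = unlist(folds, use.names = FALSE)
  )
})
flat <- do.call(rbind, rows)
print(flat)
```

This only works as written when fun returns a single value per fold; richer return values would need their own flattening step.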

See Also

lapply_species, get_fold_data, list_species

Examples

## Not run: 
plot_occurrences <- function(speciesname, data, fold) {
   title <- paste0(speciesname, " (fold = ", fold, ")")
   # use the full element names documented in Details
   plot(data$occurrence_training[, c("longitude", "latitude")], pch = ".",
        col = "blue", main = title)
   points(data$occurrence_test[, c("longitude", "latitude")], pch = ".",
          col = "red")
}

# plot training (blue) and test (red) occurrences
# of the first 2 folds for the first 5 species
species <- list_species()
lapply_kfold_species(plot_occurrences, species=species[1:5,],
                     fold_type = "disc", k = 1:2)

## End(Not run)

lifewatch/marinespeed documentation built on Dec. 19, 2019, 2:59 a.m.