split_data | R Documentation |
split_data splits occurrence records into training and testing sets using one of two methods: random or block.
split_data(data, method = "random", longitude, latitude,
train_proportion, raster_layer = NULL,
background_n = 10000, save = FALSE, name = "occurrences")
data |
data.frame of occurrence records containing at least longitude and latitude columns. |
method |
(character) method for selecting training and testing data. Options are: "random" and "block"; default = "random". |
longitude |
(character) if |
latitude |
(character) if |
train_proportion |
(numeric) proportion (from 0 to 1) of data to be used
as training occurrences. The remaining data will be used for testing.
Default = 0.5 if |
raster_layer |
optional RasterLayer to prepare background data if
|
background_n |
(numeric) optional number of coordinates to be extracted
using the |
save |
(logical) whether or not to save the results in the working directory. Default = FALSE. |
name |
(character) if |
A list containing all, training, and testing occurrences. If save = TRUE,
three csv files will be written in the working directory, named according to
name plus the suffix _all for all records, _train for the training set, and
_test for the testing set.
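The random split and the file naming described above can be illustrated with a minimal base R sketch. It is not the implementation used by split_data: the helper random_split_sketch, its seed argument, the list element names (all, train, test), and the ".csv" extension are assumptions made only for illustration.

```r
# Hypothetical helper, not part of ellipsenm: a plain random split of rows.
random_split_sketch <- function(data, train_proportion = 0.5, seed = 1) {
  stopifnot(is.data.frame(data), nrow(data) > 1,
            train_proportion > 0, train_proportion < 1)
  set.seed(seed)                                    # reproducible draw
  n_train <- round(nrow(data) * train_proportion)   # size of the training set
  train_rows <- sample(nrow(data), n_train)         # rows used for training
  is_train <- seq_len(nrow(data)) %in% train_rows   # logical mask, safe if n_train is 0
  list(all = data,
       train = data[is_train, , drop = FALSE],
       test = data[!is_train, , drop = FALSE])
}

# Toy occurrence table with made-up coordinates
occ <- data.frame(longitude = seq(-100, -91, length.out = 10),
                  latitude = seq(15, 24, length.out = 10))
parts <- random_split_sketch(occ, train_proportion = 0.7)
sapply(parts, nrow)

# File names implied by name = "occs" plus the documented suffixes
# (the ".csv" extension is assumed)
file_names <- paste0("occs", c("_all", "_train", "_test"), ".csv")
file_names
```

With train_proportion = 0.7 and ten records, seven rows go to the training set and the remaining three to the testing set.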
If method = "block", an additional data.frame containing all data and
an extra column with IDs for each block will be added to the resulting list.
If save = TRUE, this data.frame will be written with the suffix _block.
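To make the idea of a block ID column concrete, here is a minimal base R sketch of one common way to define spatial blocks: splitting records at the median longitude and the median latitude. This page does not state how split_data builds its blocks, so the helper block_id_sketch, the column name block, the four-block layout, and the choice of held-out block are all assumptions.

```r
# Hypothetical helper, not part of ellipsenm: assigns each record to one of
# four blocks defined by the median longitude and the median latitude.
block_id_sketch <- function(data, longitude, latitude) {
  west <- data[[longitude]] <= median(data[[longitude]])
  south <- data[[latitude]] <= median(data[[latitude]])
  # IDs: 1 = south-west, 2 = south-east, 3 = north-west, 4 = north-east
  data$block <- 1L + as.integer(!west) + 2L * as.integer(!south)
  data
}

# Toy occurrence table with made-up coordinates
occ <- data.frame(longitude = c(-100, -99, -98, -97, -96, -95, -94, -93),
                  latitude = c(15, 22, 16, 23, 17, 24, 18, 25))
blocked <- block_id_sketch(occ, longitude = "longitude", latitude = "latitude")
table(blocked$block)

# Holding out one block (here block 4, an arbitrary choice) for testing
# leaves three of four blocks for training
train <- blocked[blocked$block != 4L, ]
test <- blocked[blocked$block == 4L, ]
c(train = nrow(train), test = nrow(test))
```

In this toy data each block holds two records, so holding out one block leaves six records (75%) for training and two (25%) for testing.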
If a raster layer is given in raster_layer, background coordinates
will be returned as part of this list. These elements are named bg_all, bg_train,
bg_test, and bg_block, for all, training, testing, and all background with
assigned blocks, respectively.
# reading data
occurrences <- read.csv(system.file("extdata", "occurrences.csv",
                                    package = "ellipsenm"))
# random split 50% for training and 50% for testing
data_split <- split_data(occurrences, train_proportion = 0.5)
names(data_split)
lapply(data_split, head)
lapply(data_split, dim)
# random split 70% for training and 30% for testing
data_split1 <- split_data(occurrences, train_proportion = 0.7)
names(data_split1)
lapply(data_split1, head)
lapply(data_split1, dim)
# split 75% for training and 25% for testing using blocks
data_split2 <- split_data(occurrences, method = "block", longitude = "longitude",
                          latitude = "latitude", train_proportion = 0.75)
names(data_split2)
lapply(data_split2, head)
lapply(data_split2, dim)
# split data using blocks and preparing background
r_layer <- raster::raster(system.file("extdata", "bio_1.tif",
                                      package = "ellipsenm"))
data_split3 <- split_data(occurrences, method = "block", longitude = "longitude",
                          latitude = "latitude", train_proportion = 0.75,
                          raster_layer = r_layer)
# saving data
data_split4 <- split_data(occurrences, train_proportion = 0.7, save = TRUE,
                          name = "occs")
# checking directory
dir()