With extremely large species occurrence datasets, the analyses may take a long time to run. Any number of sectors will yield accurate results, but computational time can be reduced by tuning the number of sectors. The higher the number of sectors, the larger the invasion radius at which points are compared in pairs in find_thresholds, so the fewer distances need to be calculated. Conversely, the lower the number of sectors, the better the pre-identification of spatial discontinuities and the more pruned the list of potential jumps, so the faster find_jumps runs. The lowest computational time is therefore obtained through a trade-off between dataset size, invasion radius, and number of sectors.
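The per-step timing pattern used later in this vignette can be sketched generically. The workload below (pairwise distances among random points) is a hypothetical stand-in for the slfjumps functions, and system.time() is used here as a compact alternative to paired Sys.time() calls:

```r
# Hypothetical stand-in workload: time the computation of all pairwise
# distances among n random points (NOT a slfjumps function).
time_workload <- function(n) {
  pts <- matrix(runif(2 * n), ncol = 2)  # n random points in the unit square
  system.time(dist(pts))[["elapsed"]]    # elapsed seconds for the distance matrix
}

# Collect one timing per problem size, mirroring the one-row-per-sector-count
# data frame assembled in the benchmarking loop of this vignette.
sizes <- c(200, 400, 800)
timings <- data.frame(n = sizes, elapsed = sapply(sizes, time_workload))
timings
```

The same accumulate-into-a-data-frame pattern is applied below with the actual slfjumps functions and sector counts.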
We demonstrate the effect of the number of sectors on computational time on the SLF dataset.
library(magrittr)
library(tidyverse)
library(here)
library(dplyr)
library(sf)
library(slfjumps)
Load the grid data created in the first vignette.
grid_data <- read.csv(file.path(here(), "exported-data", "grid_data.csv"))
Run the slfjumps functions successively for 8, 16, and 48 sectors and compare computation times.
sectors <- c(8, 16, 48)

optim <- data.frame(s = NULL, Time_sectors = NULL, Time_thresholds = NULL,
                    potJumps = NULL, Time_jumps = NULL, Jumps = NULL,
                    Time_secDiff = NULL)

for (s in sectors) {
  print(paste0("Sectors: ", s))

  # 1 Attribute sectors
  start.time.attribute_sectors <- Sys.time()
  grid_data_sectors <- slfjumps::attribute_sectors(dataset = grid_data,
                                                   nb_sectors = s,
                                                   centroid = c(-75.675340, 40.415240))
  end.time.attribute_sectors <- Sys.time()
  time.taken.attribute_sectors <- end.time.attribute_sectors - start.time.attribute_sectors

  # 2 Find thresholds
  start.time.find_thresholds <- Sys.time()
  Results_thresholds <- slfjumps::find_thresholds(dataset = grid_data_sectors,
                                                  gap_size = 15,
                                                  negatives = TRUE)
  preDist <- Results_thresholds$preDist
  potJumps <- Results_thresholds$potJumps
  end.time.find_thresholds <- Sys.time()
  time.taken.find_thresholds <- end.time.find_thresholds - start.time.find_thresholds

  # 3 Find jumps
  start.time.find_jumps <- Sys.time()
  Results_jumps <- slfjumps::find_jumps(grid_data = grid_data,
                                        potJumps = potJumps,
                                        gap_size = 15,
                                        crs = 4326)
  Jumps <- Results_jumps$Jumps
  diffusers <- Results_jumps$diffusers
  potDiffusion <- Results_jumps$potDiffusion
  end.time.find_jumps <- Sys.time()
  time.taken.find_jumps <- end.time.find_jumps - start.time.find_jumps

  # 4 Find secondary diffusion
  start.time.find_secDiff <- Sys.time()
  Results_secDiff <- slfjumps::find_secDiff(potDiffusion = potDiffusion,
                                            Jumps = Jumps,
                                            diffusers = diffusers,
                                            Dist = preDist,
                                            gap_size = 15,
                                            crs = 4326)
  end.time.find_secDiff <- Sys.time()
  time.taken.find_secDiff <- end.time.find_secDiff - start.time.find_secDiff

  result <- data.frame(s = s,
                       Time_sectors = time.taken.attribute_sectors,
                       Time_thresholds = time.taken.find_thresholds,
                       potJumps = dim(potJumps)[1],
                       Time_jumps = time.taken.find_jumps,
                       Jumps = dim(Jumps)[1],
                       Time_secDiff = time.taken.find_secDiff)
  optim <- rbind(optim, result)
}

optim
For this dataset, all computational times are decreased by dividing space into 16 sectors instead of 8. The data are not dense enough to divide space into 48 sectors, as indicated by multiple warning messages from find_thresholds.