extract_buffered_coords: Extract spatially buffered and temporally dynamic explanatory...

View source: R/extract_buffered_coords.R

extract_buffered_coords    R Documentation

Extract spatially buffered and temporally dynamic explanatory variable data for occurrence records.

Description

For each species occurrence record co-ordinate and date, spatially buffered and temporally dynamic explanatory data are extracted using Google Earth Engine.

Usage

extract_buffered_coords(
  occ.data,
  datasetname,
  bandname,
  spatial.res.metres,
  GEE.math.fun,
  moving.window.matrix,
  user.email,
  save.method,
  varname,
  temporal.res,
  temporal.level,
  temporal.direction,
  categories,
  save.directory,
  agg.factor,
  prj = "+proj=longlat +datum=WGS84",
  resume = TRUE
)

Arguments

occ.data

a data frame, with columns for occurrence record co-ordinates and dates, named as follows: record longitude as "x", latitude as "y", year as "year", month as "month", and day as "day".

datasetname

a character string, the Google Earth Engine dataset to extract data from.

bandname

a character string, the Google Earth Engine dataset bandname to extract data for.

spatial.res.metres

a numeric value, the spatial resolution in metres for data extraction.

GEE.math.fun

a character string, the mathematical function to compute across the specified spatial matrix and period for each record.

moving.window.matrix

a matrix of weights with an odd number of rows and columns, representing the spatial neighbourhood of cells (“moving window”) around the record co-ordinate across which GEE.math.fun is calculated. See details for more information.

user.email

a character string, user email for initialising Google Drive.

save.method

a character string, the method used to save extracted variable data. One of split or combined: can be abbreviated. See details.

varname

optional; a character string, a unique name for the explanatory variable. Default varname is “bandname_temporal.res_temporal.direction_GEE.math.fun_buffered”.

temporal.res

optional; a numeric value, the temporal resolution in days to extract data and calculate GEE.math.fun across from occurrence record date.

temporal.level

a character string, the temporal resolution of the explanatory variable data. One of day, month or year: can be abbreviated. Default is day.

temporal.direction

optional; a character string, the temporal direction for extracting data across relative to the record date. One of prior or post: can be abbreviated.

categories

optional; a character string, the categories to use in calculation if data are categorical. See details for more information.

save.directory

a character string, path to a local directory to save extracted variable data to.

agg.factor

optional; a positive integer, the aggregation factor expressed as number of cells in each direction. See details.

prj

a character string, the coordinate reference system of occ.data coordinates. Default is "+proj=longlat +datum=WGS84".

resume

a logical indicating whether to search save.directory and return to previous progress. Only possible if save.method = split was used in the previous run and is used again in the current run. Default = TRUE.
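
As an illustration of the column layout expected for occ.data, the following minimal sketch builds a small data frame and checks that the required columns are present. The co-ordinates and dates are made up for illustration, and the check is a suggestion rather than part of the package.

```r
# Hypothetical occurrence records; co-ordinates and dates are invented
occ.data <- data.frame(
  x = c(31.02, 28.51, 30.25),      # longitude
  y = c(-24.99, -22.13, -25.74),   # latitude
  year = c(2015, 2016, 2016),
  month = c(3, 11, 11),
  day = c(14, 2, 28)
)

# Confirm that every column named in the occ.data description is present
required.cols <- c("x", "y", "year", "month", "day")
missing.cols <- setdiff(required.cols, names(occ.data))
stopifnot(length(missing.cols) == 0)
```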

Details

For each individual species occurrence record co-ordinate and date, this function extracts data for a given band within a Google Earth Engine dataset across a user-specified spatial buffer and temporal period and calculates a mathematical function on such data.

Value

Returns details of successful explanatory variable extractions.

Temporal dimension

If temporal.res and temporal.direction are not given, the function extracts explanatory variable data for all of the cells surrounding and including the cell containing the occurrence record co-ordinates.

If temporal.res and temporal.direction are given, the function extracts explanatory variable data for which GEE.math.fun has first been calculated over this period in relation to the occurrence record date.
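
The period implied by temporal.res and temporal.direction can be sketched with base R date arithmetic. This is only an illustration: the record date is hypothetical, and whether the function includes the record date itself in the period is an assumption made here.

```r
# Hypothetical record date and settings
record.date <- as.Date("2016-11-02")
temporal.res <- 30               # days
temporal.direction <- "prior"    # or "post"

# Assumed convention: the period starts or ends on the record date
if (temporal.direction == "prior") {
  window.start <- record.date - temporal.res
  window.end <- record.date
} else {
  window.start <- record.date
  window.end <- record.date + temporal.res
}

print(c(window.start, window.end))
```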

Spatial dimension

Using the focal function from the terra R package (Hijmans et al., 2022), GEE.math.fun is calculated across the spatial buffer area from the record co-ordinate. The spatial buffer area used is specified by the argument moving.window.matrix, which dictates the neighbourhood of cells surrounding the cell containing the occurrence record to include in this calculation.

See function get_moving_window() to generate appropriate moving.window.matrix.
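
To make the moving window idea concrete, the sketch below applies a 3 x 3 matrix of weights around a single record cell using base R only. It does not call terra or reproduce the package internals; the grid values, window size and record position are all hypothetical.

```r
# Hypothetical 5 x 5 grid of cell values, with the record in the centre cell
cell.values <- matrix(1:25, nrow = 5, ncol = 5)

# A 3 x 3 window of weights; a weight of 1 includes the cell in the calculation
moving.window.matrix <- matrix(1, nrow = 3, ncol = 3)

record.row <- 3
record.col <- 3

# Number of cells to step out from the centre in each direction
half.width <- (nrow(moving.window.matrix) - 1) / 2

# Cells covered by the window when centred on the record cell
neighbourhood <- cell.values[
  (record.row - half.width):(record.row + half.width),
  (record.col - half.width):(record.col + half.width)
]

# Equivalent of GEE.math.fun = "sum" across the weighted neighbourhood
buffered.sum <- sum(neighbourhood * moving.window.matrix, na.rm = TRUE)
```

An odd number of rows and columns is what allows the window to be centred on the record cell, since half.width is then a whole number.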

Mathematical function

GEE.math.fun specifies the mathematical function to be calculated over the spatial buffered area and temporal period. Options are limited to Google Earth Engine ImageCollection Reducer functions (https://developers.google.com/earth-engine/apidocs/) for which an analogous R function is available. This includes: "allNonZero", "anyNonZero", "count", "first", "firstNonNull", "last", "lastNonNull", "max", "mean", "median", "min", "mode", "product", "sampleStdDev", "sampleVariance", "stdDev", "sum" and "variance".
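
The idea of an analogous R function for each reducer name can be sketched as a simple lookup. The mapping below is an assumption for illustration, not the package's internal table. It covers only a few of the options and treats "stdDev" as the population standard deviation and "sampleStdDev" as the sample standard deviation.

```r
# Hypothetical lookup from reducer names to R functions (not the package's own)
reducer.lookup <- list(
  mean = mean,
  median = stats::median,
  sum = sum,
  max = max,
  min = min,
  sampleStdDev = stats::sd,
  # Population standard deviation: divide by n rather than n - 1
  stdDev = function(x) sqrt(mean((x - mean(x))^2))
)

x <- c(2, 4, 4, 4, 5, 5, 7, 9)
GEE.math.fun <- "stdDev"
result <- reducer.lookup[[GEE.math.fun]](x)
```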

Categorical data

When explanatory variable data are categorical (e.g. land cover classes), argument categories can be used to specify the categories of importance to the calculation. The category or categories given will be converted into a binary representation, with “1” for those listed, and “0” for all others in the dataset. Ensure that the GEE.math.fun given is appropriate for such data. For example, "sum" gives the number of cells classified as suitable land cover across the “moving window” around the species occurrence record co-ordinates.
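
The binary conversion described above can be sketched in base R as follows. The land cover values are hypothetical, and the code illustrates the idea only, not the function's internal implementation.

```r
# Hypothetical land cover classes for the nine cells of a 3 x 3 moving window
land.cover <- c(6, 2, 7, 7, 1, 6, 4, 6, 9)

# Categories of interest, as would be passed to the categories argument
categories <- c(6, 7)

# 1 for cells in a listed category, 0 for all others
binary <- as.integer(land.cover %in% categories)

# With GEE.math.fun = "sum", the result is the count of matching cells
n.suitable <- sum(binary)
```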

Categorical data and temporally dynamic variables

Please be aware, if specific categories are given (argument categories) when extracting categorical data, then temporal buffering cannot be completed. The most recent categorical data to the occurrence record date will be used for spatial buffering.

If specific categories are not given when extracting from categorical datasets, be careful to choose appropriate mathematical functions for such data. For instance, "first" or "last" may be more relevant than the "sum" of land cover classification numbers.

Temporal level

temporal.level states the temporal resolution of the explanatory variable data and improves the speed of extract_buffered_coords() extraction. For example, if the explanatory data represents an annual variable, then all record co-ordinates from the same year can be extracted from the same buffered raster, saving computation time. However, if the explanatory data represents a daily variable, then only records from the exact same day can be extracted from the same raster. For the former, temporal.level argument should be year and for the latter, temporal.level should be day.
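
The saving in computation can be illustrated by counting the unique periods in a set of records at each temporal level, since one buffered raster is needed per unique period. The records below are hypothetical.

```r
# Hypothetical record dates
occ.data <- data.frame(
  year = c(2015, 2016, 2016, 2016, 2016),
  month = c(3, 11, 11, 12, 11),
  day = c(14, 2, 28, 28, 2)
)

# Number of unique periods at each temporal level
n.year <- nrow(unique(occ.data[, "year", drop = FALSE]))
n.month <- nrow(unique(occ.data[, c("year", "month")]))
n.day <- nrow(unique(occ.data[, c("year", "month", "day")]))

print(c(year = n.year, month = n.month, day = n.day))
```

Here the five records fall in two years, three year and month combinations, and four unique dates, so temporal.level = "year" would need the fewest rasters.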

Aggregation factor

agg.factor is the factor by which raster data are aggregated using the function aggregate in the terra R package (Hijmans et al., 2022). Aggregation uses GEE.math.fun as the function. Spatial buffering using the moving window matrix is applied after aggregation. This option is included to minimise computing time if data are of high spatial resolution and a large spatial buffer is needed. When generating moving.window.matrix with get_moving_window(), be sure to use the spatial resolution of the data after aggregation by this factor.
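
The post-aggregation resolution can be worked out with simple arithmetic, as sketched below. The conversion to degrees is a rough approximation that assumes about 111,320 metres per degree at the equator; it is not a calculation performed by the package.

```r
spatial.res.metres <- 500
agg.factor <- 12

# Cell size after aggregation: 12 cells of 500 m in each direction
aggregated.res.metres <- spatial.res.metres * agg.factor

# Rough conversion to degrees (assumption: ~111,320 m per degree at the equator)
aggregated.res.degrees <- aggregated.res.metres / 111320

print(aggregated.res.metres)
print(round(aggregated.res.degrees, 3))
```

A value of this order, rather than the resolution of the original 500 m data, is what should be passed as spatial.res.degrees to get_moving_window().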

Google Earth Engine

extract_buffered_coords() requires users to have installed R package rgee (Aybar et al., 2020) and initialised Google Earth Engine with valid log-in credentials. Please follow instructions on the following website https://cran.r-project.org/package=rgee

  • datasetname must be in the accepted Google Earth Engine catalogue layout (e.g. “MODIS/006/MCD12Q1” or “UCSB-CHG/CHIRPS/DAILY”)

  • bandname must be as specified under the dataset in the Google Earth Engine catalogue (e.g. “LC_Type5”, “precipitation”). For datasets and band names, see https://developers.google.com/earth-engine/datasets.

Google Drive

extract_buffered_coords() also requires users to have installed the R package googledrive (D'Agostino McGowan and Bryan, 2022) and initialised Google Drive with valid log-in credentials, which must be stated using argument user.email. Please follow instructions on https://googledrive.tidyverse.org/ for initialising the googledrive package.

Note: When running this function a folder labelled "dynamicSDM_download_bucket" will be created in your Google Drive. This will be emptied once the function has finished running and output rasters will be found in the save.drive.folder or save.directory specified.

Exporting extracted data

For save.method = combined, the function will save “csv” files containing all occurrence records and associated values for the explanatory variable.

For save.method = split, the function will save individual “csv” files for each record with each unique period of the given temporal.level (e.g. each year, each year and month combination or each unique date).

split protects users if the internet connection is lost when extracting data for large occurrence datasets. The argument resume can be used to return to previous progress if the connection is lost.

References

Aybar, C., Wu, Q., Bautista, L., Yali, R. and Barja, A., 2020. rgee: An R package for interacting with Google Earth Engine. Journal of Open Source Software, 5(51), p.2272.

D'Agostino McGowan L., and Bryan J., 2022. googledrive: An Interface to Google Drive. https://googledrive.tidyverse.org, https://github.com/tidyverse/googledrive.

Hijmans, R.J., Bivand, R., Forner, K., Ooms, J., Pebesma, E. and Sumner, M.D., 2022. Package 'terra'. Maintainer: Vienna, Austria.

Examples



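# Load the sample occurrence records and study extent
data(sample_filt_data)
data(sample_extent_data)

# Email of the Google account already authenticated on this machine
user.email <- as.character(gargle::gargle_oauth_sitrep()$email)

# Moving window with a 10 km radius at 0.05 degree resolution
# (roughly the cell size of 500 m data aggregated by a factor of 12)
moving.window <- get_moving_window(radial.distance = 10000,
                                   spatial.res.degrees = 0.05,
                                   spatial.ext = sample_extent_data)

# Sum the cells in land cover categories 6 and 7 around each record
extract_buffered_coords(occ.data = sample_filt_data,
                        datasetname = "MODIS/006/MCD12Q1",
                        bandname = "LC_Type5",
                        spatial.res.metres = 500,
                        GEE.math.fun = "sum",
                        moving.window.matrix = moving.window,
                        user.email = user.email,
                        save.method = "split",
                        temporal.level = "year",
                        categories = c(6, 7),
                        agg.factor = 12,
                        varname = "total_grass_crop_lc",
                        save.directory = tempdir()
)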


dynamicSDM documentation built on June 28, 2024, 5:08 p.m.