saveData: Save Species Occurrence Data


saveData    R Documentation

Save Species Occurrence Data

Description

This function saves the occurrence data in any given directory, separated by species taxonomy, collections, years and countries, as well as by the confidence level of species identifications and coordinates.

Usage

saveData(
  df,
  file.name = "output",
  dir.name = "",
  path = "",
  by = NULL,
  file.format = "csv",
  compress = TRUE,
  rm.dup = FALSE
)

Arguments

df

a data frame with the occurrence data, generally as the output of the plantR validation functions.

file.name

character. Name of the file in which the data will be saved, without the extension. Defaults to "output".

dir.name

character. Name of the folder where the data should be saved. Defaults to "plantR_output".

path

character. The path to the directory that will contain the output folder. Defaults to the user's working directory.

by

character. The variable used for separating the data into different files before saving. Defaults to NULL. The grouping options currently available are listed in Details.

file.format

character. The file extension to be used for saving, either 'csv' or 'rds'. Defaults to 'csv'.

compress

logical. Should the files be compressed? Defaults to TRUE.

rm.dup

logical. Should duplicated specimens be removed prior to saving? Defaults to FALSE.

Details

This function provides different options for saving occurrence data. Data can be saved as 'csv' (function fwrite from package data.table) or as 'rds' (base function saveRDS). The function tries to save data as fast as possible, but processing time depends greatly on the size of the dataset. Options to compress data are available, but please be sure to have enough memory space for saving large datasets.
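The difference between the two formats can be sketched in base R. This is only an illustration and not the code saveData runs internally: base write.csv stands in for data.table's fwrite, and the toy data frame, its column names, the folder name and the file names (including the '.csv.gz' extension) are all hypothetical.

```r
# Toy occurrence table; the column names are hypothetical.
occs <- data.frame(
  species = c("Euterpe edulis", "Araucaria angustifolia", "Euterpe edulis"),
  year = c(1998L, 2005L, 2011L),
  stringsAsFactors = FALSE
)

# Write into a scratch folder inside the session's temporary directory.
out_dir <- file.path(tempdir(), "save_formats_sketch")
dir.create(out_dir, showWarnings = FALSE)

# 'csv' route: plain text, compressed here by writing through a gzip connection.
csv_path <- file.path(out_dir, "output.csv.gz")
con <- gzfile(csv_path, "w")
write.csv(occs, con, row.names = FALSE)
close(con)

# 'rds' route: R's native serialization, which keeps column types as they are.
rds_path <- file.path(out_dir, "output.rds")
saveRDS(occs, rds_path, compress = TRUE)

# Read both files back; read.csv detects the gzip compression on its own.
from_csv <- read.csv(csv_path, stringsAsFactors = FALSE)
from_rds <- readRDS(rds_path)
```

The 'rds' copy comes back as the same R object, while the 'csv' copy is re-parsed from text, so column types are guessed again on reading.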

Currently, saving can be performed by grouping occurrence data by the following types of information:

  • code of the biological collection ('collection')

  • year of collection ('year')

  • taxonomy ('family', 'genus' or 'species')

  • country of collection ('country')

  • the confidence level of species identifications ('tax')

  • the validation categories of the geographical coordinates ('geo')

Note that if there are NAs in the grouping variable, the corresponding records will all be saved in a single file called 'NA.csv' or 'NA.rds'.
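The grouping behaviour described above, including the handling of NAs, can be sketched in base R as follows. This is a minimal sketch of the idea and not the internal code of saveData: the toy data frame, its column names and the output folder are hypothetical, and write.csv is used in place of fwrite.

```r
# Toy occurrence table with one missing country; the column names are hypothetical.
occs <- data.frame(
  species = c("sp. A", "sp. B", "sp. C", "sp. D"),
  country = c("brazil", "argentina", NA, "brazil"),
  stringsAsFactors = FALSE
)

# Write into a scratch folder inside the session's temporary directory.
out_dir <- file.path(tempdir(), "save_by_group_sketch")
dir.create(out_dir, showWarnings = FALSE)

# addNA() makes NA an explicit factor level, so split() keeps those rows
# instead of silently dropping them.
groups <- split(occs, addNA(occs$country, ifany = TRUE))

# Label the group of records with a missing value as "NA".
names(groups)[is.na(names(groups))] <- "NA"

# One file per group, named after the value of the grouping variable.
for (g in names(groups)) {
  write.csv(groups[[g]], file.path(out_dir, paste0(g, ".csv")), row.names = FALSE)
}

saved <- list.files(out_dir)
```

In this sketch the folder ends up with one file per country plus 'NA.csv', which holds the single record without a country.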


LimaRAF/plantR documentation built on Jan. 1, 2023, 10:18 a.m.