View source: R/zoop_synthesizer.R

Zoopsynther R Documentation

Integrates zooplankton datasets collected by the Interagency Ecological Program from the Sacramento-San Joaquin Delta

Description

This function returns an integrated zooplankton dataset with taxonomic issues resolved according to user specifications, along with important caveats about the data. It requires the output of the Zoopdownloader function to run. This output can be provided either as a list or as paths to saved .Rds files generated by the Zoopdownloader function. The function defaults to loading pre-packaged combined datasets (which may be outdated).

Usage

Zoopsynther(
  Data_type = NULL,
  Zoop = zooper::zoopComb,
  ZoopEnv = zooper::zoopEnvComb,
  Zoop_path = NULL,
  Env_path = NULL,
  Sources = c("EMP", "FRP", "FMWT", "STN", "20mm", "DOP"),
  Size_class = c("Micro", "Meso", "Macro"),
  Time_consistency = FALSE,
  Intro_lag = 2,
  Response = "CPUE",
  Taxa = NULL,
  Date_range = c(NA, NA),
  Months = NA,
  Years = NA,
  Sal_bott_range = NA,
  Sal_surf_range = NA,
  Temp_range = NA,
  Lat_range = NA,
  Long_range = NA,
  Reload_data = F,
  Redownload_data = F,
  All_env = T,
  Shiny = F,
  Crosswalk = zooper::crosswalk,
  Undersampled = zooper::undersampled,
  CompleteTaxaList = zooper::completeTaxaList,
  StartDates = zooper::startDates,
  ...
)

Arguments

Data_type

What type of data are you looking for? This option allows you to choose a final output dataset for either community (Data_type = "Community"; the default) or Taxa-specific (Data_type = "Taxa") analyses. NOTE: If you set Data_type = "Community", we do not recommend using the Taxa argument. See below for more explanation of this argument.

Zoop

Zooplankton data. You must provide the "Zooplankton" element from the list returned from Zoopdownloader(Save_object = FALSE, Return_object = TRUE, Return_object_type="List"). The default argument provides the built-in (and possibly outdated) version of this combined dataset. If you instead wish to provide paths to saved datasets from the Zoopdownloader function, set Data_list = NULL and provide Zoop_path.

ZoopEnv

Accessory environmental data. You must provide the "Environment" element from the list returned from Zoopdownloader(Save_object = FALSE, Return_object = TRUE, Return_object_type="List"). The default argument provides the built-in (and possibly outdated) version of this combined dataset. If you instead wish to provide paths to saved datasets from the Zoopdownloader function, set Data_list = NULL and provide Env_path.

Zoop_path

If you wish to save time by saving the combined zooplankton datasets returned from Zoopdownloader to disk, provide here the path to the combined zooplankton dataset on disk. You must also set Data_list = NULL.

Env_path

If you wish to save time by saving the combined zooplankton datasets returned from Zoopdownloader to disk, provide here the path to the combined accessory environmental data on disk. You must also set Data_list = NULL.

Sources

Source datasets to be included. Choices include "EMP" (Environmental Monitoring Program), "FRP" (Fish Restoration Program), "FMWT" (Fall Midwater Trawl), "STN" (Townet Survey), "DOP" (Directed Outflow Project), and "20mm" (20mm survey). The YBFMP datasets cannot be used in this function due to taxonomic and life stage issues with that dataset. Defaults to Sources = c("EMP", "FRP", "FMWT", "STN", "20mm", "DOP").

Size_class

Zooplankton size classes (as defined by net mesh sizes) to be included in the integrated dataset. Choices include "Micro" (43 µm), "Meso" (150-160 µm), and "Macro" (500-505 µm). Defaults to Size_class = c("Micro", "Meso", "Macro").

Time_consistency

Would you like to apply a fix to enforce consistent taxonomic resolution over time? Only available for the Community option.

Intro_lag

Only applicable if Time_consistency = TRUE. How many years after a species is introduced should we expect surveys to start counting them? Defaults to 2.

Response

Which response variable(s) would you like for the zooplankton data? Choices are "CPUE" (catch per unit effort) and "BPUE" (carbon biomass per unit effort, in µg/m³). Defaults to Response = "CPUE".

Taxa

If you only wish to include a subset of taxa, provide a character vector of the taxa you wish included. This can include taxa of any taxonomic level (e.g., Taxa = "Calanoida" to include only calanoids). NOTE: we do not recommend that you use this feature AND set Data_type = "Community". This argument is better suited to selecting higher-level taxa. If you wish to include just one or a few species, it would be faster to filter the output of Zoopdownloader to include those taxa. Defaults to NULL, which includes all taxa.

Date_range

Range of dates to include in the final dataset. To filter within a range of dates, include a character vector of 2 dates formatted exactly as yyyy-mm-dd, specifying the lower and upper bounds. To specify an infinite lower or upper bound (to include all values below or above a limit), input NA for that bound. Defaults to Date_range = c(NA, NA), which includes all dates.

Months

Months (as integers) to be included in the integrated dataset. If you wish to only include data from a subset of months, input a vector of integers corresponding to the months you wish to be included. Defaults to Months = NA, which includes all months.

Years

Years to be included in the integrated dataset. If you wish to only include data from a subset of years, input a vector of years you wish to be included. Defaults to Years = NA, which includes all years.

Sal_bott_range

Filter the data by bottom salinity values. Include a vector of length 2 specifying the minimum and maximum values you wish to include. To include all values above or below a limit, use Inf for the upper bound or -Inf for the lower bound, respectively. Defaults to Sal_bott_range = NA, which includes all bottom salinities.

Sal_surf_range

Same as previous, but for surface salinity.

Temp_range

Same as Sal_bott_range but for surface temperature.

Lat_range

Latitude range to include in the final dataset. Include a vector of length 2 specifying the minimum and maximum values you wish to include, in decimal degree format. Defaults to Lat_range = NA, which includes all latitudes.

Long_range

Same as previous, but for longitude. Don't forget that longitudes should be negative in the Delta!

Reload_data

If set to Reload_data = TRUE, runs the Zoopdownloader function to re-combine source datasets. To include local versions of the datasets without redownloading them, set Reload_data = TRUE and Redownload_data = FALSE. Defaults to Reload_data = FALSE.

Redownload_data

Should data be re-downloaded from the internet? If set to Redownload_data = TRUE, runs Zoopdownloader(Redownload_data=Redownload_data, Zoop_path=Zoop_path, Env_path=Env_path, ...). Defaults to Redownload_data = FALSE.

All_env

Should all environmental parameters be included? Defaults to All_env = TRUE.

Shiny

Is this function being used within the shiny app? If set to Shiny = TRUE, outputs a list with the integrated dataset as one component and the caveats as the other component. Defaults to Shiny = FALSE.

Crosswalk

Crosswalk table to be used for conversions. Must have columns named for each unique combination of source and size class with an underscore separator, as well as all taxonomic levels Phylum through Species, Taxname (full scientific name) and Lifestage. See crosswalk (the default) for an example.

Undersampled

A table listing the taxonomic names and life stages of plankton undersampled by each net mesh size (i.e. size class). See undersampled (the default) for an example.

CompleteTaxaList

Character vector of all taxonomic names in source datasets. Defaults to completeTaxaList.

StartDates

Tibble with the starting dates of each source dataset. Defaults to startDates.

...

Arguments passed to Zoopdownloader.
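
The range arguments above share one pattern: a length-2 vector of lower and upper bounds, where NA (for Date_range) or -Inf/Inf (for the numeric ranges) marks an open end. The following is a minimal base R sketch of that pattern on made-up data. The helper in_range, the toy data frame, and its column names are assumptions for illustration and are not part of zooper; the package's internal filtering may differ.

```r
# Hypothetical helper (not part of zooper): flag values inside a
# length-2 range, treating an NA bound as open (unbounded).
in_range <- function(x, range) {
  lower <- range[1]
  upper <- range[2]
  keep <- rep(TRUE, length(x))
  if (!is.na(lower)) keep <- keep & x >= lower
  if (!is.na(upper)) keep <- keep & x <= upper
  keep
}

# Made-up sample data, for illustration only
samples <- data.frame(
  Date    = as.Date(c("1989-05-01", "1995-07-15", "2001-02-10")),
  SalSurf = c(0.1, 5.2, 12.8)
)

# Open lower bound: everything up to and including 2000-09-30
date_keep <- in_range(samples$Date, as.Date(c(NA, "2000-09-30")))

# Open upper bound written with Inf, as for the salinity arguments
sal_keep <- in_range(samples$SalSurf, c(1, Inf))

# Rows that pass both filters
filtered <- samples[date_keep & sal_keep, ]
```

In this toy case only the 1995 sample passes both filters.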

Details

This function combines any combination of the zooplankton datasets (included as parameters) and calculates least common denominator taxa to facilitate comparisons across datasets with differing levels of taxonomic resolution. For more information on the source datasets see zooper.

Value

An integrated zooplankton dataset.

Data type

The Data_type parameter toggles between two approaches to resolving differences in taxonomic resolution. If you want all available data on given Taxa, use Data_type = "Taxa"; if you want to conduct a community analysis, use Data_type = "Community".

Briefly, Data_type = "Community" optimizes for community-level analyses by taking all taxa x life stage combinations that are not measured in every input dataset and summing them up the taxonomic hierarchy to the lowest taxonomic level they belong to that is covered by all datasets. Remaining taxa x life stage combinations that are not covered in all datasets up to the phylum level (usually something like Annelida, Nematoda, or insect pupae) are removed from the final dataset. However, some taxa x life stage combinations are retained if they are at a taxonomic level higher than species that is counted in some surveys, and a lower taxonomic level within this group is counted in all surveys. For example, if we had 3 surveys where surveys A and B count Pseudodiaptomus forbesi, Pseudodiaptomus marinus, and Pseudodiaptomus spp. (UnID), but survey C counts only P. forbesi and P. marinus, then the Pseudodiaptomus spp. (UnID) category would be retained after applying the community approach.
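
The roll-up idea can be illustrated with a small base R sketch on made-up data. This is one simplified reading of the approach, not the package's implementation: the surveys, the lineage table, and the helpers covered_by_all and lcd_taxon are hypothetical, and the sketch assumes that a survey "covers" a group if it counts at least one species in that group. Note that the toy scenario differs from the example above: here survey C does not count P. marinus.

```r
# Made-up data: which species each hypothetical survey counts
counted <- list(
  A = c("Pseudodiaptomus forbesi", "Pseudodiaptomus marinus", "Acartiella sinensis"),
  B = c("Pseudodiaptomus forbesi", "Pseudodiaptomus marinus", "Acartiella sinensis"),
  C = c("Pseudodiaptomus forbesi", "Acartiella sinensis")
)

# Toy lineage table, one row per species
lineage <- data.frame(
  Species = c("Pseudodiaptomus forbesi", "Pseudodiaptomus marinus", "Acartiella sinensis"),
  Genus   = c("Pseudodiaptomus", "Pseudodiaptomus", "Acartiella"),
  Order   = c("Calanoida", "Calanoida", "Calanoida"),
  stringsAsFactors = FALSE
)
levels_low_to_high <- c("Species", "Genus", "Order")

# Assumption of this sketch: a survey covers a group at a given level
# if it counts at least one species belonging to that group.
covered_by_all <- function(group, level) {
  members <- lineage$Species[lineage[[level]] == group]
  all(vapply(counted, function(taxa) any(members %in% taxa), logical(1)))
}

# Walk up the levels until every survey covers the group; return NA if
# no level qualifies (such taxa would be dropped in this sketch).
lcd_taxon <- function(species) {
  row <- lineage[lineage$Species == species, ]
  for (level in levels_low_to_high) {
    group <- row[[level]]
    if (covered_by_all(group, level)) return(group)
  }
  NA_character_
}

# Least common denominator taxon assigned to each species
lcd <- vapply(lineage$Species, lcd_taxon, character(1))
```

Under these assumptions, P. marinus is assigned to the genus Pseudodiaptomus because survey C does not count it at the species level, while the two species counted by every survey keep their species-level names.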

Data_type = "Taxa" optimizes for the Taxa-level user by maintaining all data at the original taxonomic level (but it outputs warnings for taxa not measured in all datasets, which we call "orphans"). To facilitate comparisons across datasets, this option also sums data into general categories that are comparable across all datasets and years: "summed groups." The new variable "Taxatype" identifies which taxa are summed groups (Taxatype = "Summed group"), which are measured to the species level (Taxatype = "Species"), and which are higher taxonomic groupings with the species designation unknown (Taxatype = "UnID species").
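
A short base R sketch of the two ideas in this paragraph, on made-up data. The survey names and taxa lists are hypothetical, and in the toy output table only the Taxatype column and its three values come from this documentation; the other column names and all numbers are assumptions for illustration.

```r
# Made-up taxa lists for three hypothetical surveys
counted <- list(
  SurveyA = c("Pseudodiaptomus forbesi", "Pseudodiaptomus marinus"),
  SurveyB = c("Pseudodiaptomus forbesi", "Pseudodiaptomus marinus"),
  SurveyC = c("Pseudodiaptomus forbesi")
)

# "Orphans": taxa counted by at least one survey but not by all of them
orphans <- setdiff(unique(unlist(counted)), Reduce(intersect, counted))

# Toy stand-in for Taxa-type output (values are invented)
toy <- data.frame(
  Taxname  = c("Pseudodiaptomus forbesi", "Pseudodiaptomus", "Calanoida"),
  Taxatype = c("Species", "UnID species", "Summed group"),
  CPUE     = c(120, 15, 400),
  stringsAsFactors = FALSE
)

# Work with one kind of record at a time
species_only <- toy[toy$Taxatype == "Species", ]
summed_only  <- toy[toy$Taxatype == "Summed group", ]
```

Because summed groups are built by summing lower-level data, adding a summed group to its member taxa would likely count the same animals twice, so it is usually safer to subset on Taxatype before aggregating.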

Author(s)

Sam Bashevkin

See Also

Zoopdownloader, Taxnamefinder, SourceTaxaKeyer, crosswalk, undersampled, zoopComb, zoopEnvComb, zooper

Examples

MyZoops <- Zoopsynther(
  Data_type = "Community",
  Sources = c("EMP", "FRP", "FMWT"),
  Size_class = "Meso",
  Date_range = c("1990-10-01", "2000-09-30")
)
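
The example above relies on the built-in (and possibly outdated) datasets. As described for the Zoop and ZoopEnv arguments, freshly combined data from Zoopdownloader can be supplied instead. The wrapper below is a sketch of that workflow: synthesize_fresh is a hypothetical name and not part of zooper, and the sketch assumes the zooper package is installed. It only defines the function; calling it runs Zoopdownloader, which may download data and take some time.

```r
# Hypothetical wrapper (not part of zooper): recombine the source
# datasets with Zoopdownloader, then pass the two list elements named
# in the argument descriptions to Zoopsynther.
synthesize_fresh <- function(...) {
  fresh <- zooper::Zoopdownloader(
    Save_object = FALSE,
    Return_object = TRUE,
    Return_object_type = "List"
  )
  zooper::Zoopsynther(
    Zoop = fresh$Zooplankton,    # "Zooplankton" element of the list
    ZoopEnv = fresh$Environment, # "Environment" element of the list
    ...                          # any other Zoopsynther arguments
  )
}

# Example call (not run here):
# MyZoops <- synthesize_fresh(Data_type = "Community", Size_class = "Meso")
```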

InteragencyEcologicalProgram/zooper documentation built on Feb. 6, 2025, 9:01 a.m.