View source: R/zoop_synthesizer.R
Zoopsynther | R Documentation |
This function returns an integrated zooplankton dataset with taxonomic issues resolved according to user specifications, along with important caveats about the data. It requires the output of the Zoopdownloader function, provided either as a list or as paths to saved .Rds files generated by Zoopdownloader. By default, the function loads pre-packaged combined datasets (which may be outdated).
Zoopsynther(
Data_type = NULL,
Zoop = zooper::zoopComb,
ZoopEnv = zooper::zoopEnvComb,
Zoop_path = NULL,
Env_path = NULL,
Sources = c("EMP", "FRP", "FMWT", "STN", "20mm", "DOP"),
Size_class = c("Micro", "Meso", "Macro"),
Time_consistency = FALSE,
Intro_lag = 2,
Response = "CPUE",
Taxa = NULL,
Date_range = c(NA, NA),
Months = NA,
Years = NA,
Sal_bott_range = NA,
Sal_surf_range = NA,
Temp_range = NA,
Lat_range = NA,
Long_range = NA,
Reload_data = F,
Redownload_data = F,
All_env = T,
Shiny = F,
Crosswalk = zooper::crosswalk,
Undersampled = zooper::undersampled,
CompleteTaxaList = zooper::completeTaxaList,
StartDates = zooper::startDates,
...
)
Data_type |
What type of data are you looking for? This option allows you to choose a final output dataset for either community ( |
Zoop |
Zooplankton data. You must provide the "Zooplankton" element from the list returned from |
ZoopEnv |
Accessory environmental data. You must provide the "Environment" element from the list returned from |
Zoop_path |
If you wish to save time by saving the combined zooplankton datasets returned from the |
Env_path |
If you wish to save time by saving the combined zooplankton datasets returned from the |
Sources |
Source datasets to be included. Choices include "EMP" (Environmental Monitoring Program), "FRP" (Fish Restoration Program), "FMWT" (Fall Midwater Trawl), "STN" (Townet Survey), "DOP" (Directed Outflow Project), and "20mm" (20mm survey). The YBFMP datasets cannot be used in this function due to taxonomic and life stage issues with that dataset. Defaults to |
Size_class |
Zooplankton size classes (as defined by net mesh sizes) to be included in the integrated dataset. Choices include "Micro" (43 |
Time_consistency |
Would you like to apply a fix to enforce consistent taxonomic resolution over time? Only available for the Community option. |
Intro_lag |
Only applicable if |
Response |
Which response variable(s) would you like for the zooplankton data? Choices are "CPUE" (catch per unit effort) and "BPUE" (carbon biomass per unit effort ( |
Taxa |
If you only wish to include a subset of taxa, provide a character vector of the taxa you wish included. This can include taxa of any taxonomic level (e.g., |
Date_range |
Range of dates to include in the final dataset. To filter within a range of dates, include a character vector of 2 dates formatted in the yyyy-mm-dd format exactly, specifying the upper and lower bounds. To specify an infinite upper or lower bound (to include all values above or below a limit) input |
Months |
Months (as integers) to be included in the integrated dataset. If you wish to only include data from a subset of months, input a vector of integers corresponding to the months you wish to be included. Defaults to |
Years |
Years to be included in the integrated dataset. If you wish to only include data from a subset of years, input a vector of years you wish to be included. Defaults to |
Sal_bott_range |
Filter the data by bottom salinity values. Include a vector of length 2 specifying the minimum and maximum values you wish to include. To include all values above or below a limit, utilize Inf or -Inf for the upper or lower bound respectively. Defaults to |
Sal_surf_range |
Same as previous, but for surface salinity. |
Temp_range |
Same as |
Lat_range |
Latitude range to include in the final dataset. Include a vector of length 2 specifying the minimum and maximum values you wish to include, in decimal degree format. Defaults to |
Long_range |
Same as previous, but for longitude. Don't forget that longitudes should be negative in the Delta! |
Reload_data |
If set to |
Redownload_data |
Should data be re-downloaded from the internet? If set to |
All_env |
Should all environmental parameters be included? Defaults to |
Shiny |
Is this function being used within the shiny app? If set to |
Crosswalk |
Crosswalk table to be used for conversions. Must have columns named for each unique combination of source and size class with an underscore separator, as well as all taxonomic levels Phylum through Species, Taxname (full scientific name) and Lifestage. See |
Undersampled |
A table listing the taxonomic names and life stages of plankton undersampled by each net mesh size (i.e. size class). See |
CompleteTaxaList |
Character vector of all taxonomic names in source datasets. Defaults to |
StartDates |
Tibble with the starting dates of each source dataset. Defaults to |
... |
Arguments passed to |
This function integrates any combination of the zooplankton datasets (supplied as parameters) and calculates least common denominator taxa to facilitate comparisons across datasets with differing levels of taxonomic resolution. For more information on the source datasets, see zooper.
An integrated zooplankton dataset.
The Data_type parameter toggles between two approaches to resolving differences in taxonomic resolution. If you want all available data on given taxa, use Data_type = "Taxa"; if you want to conduct a community analysis, use Data_type = "Community".
Briefly, Data_type = "Community" optimizes for community-level analyses by taking all taxa x life stage combinations that are not measured in every input dataset and summing them up the taxonomic hierarchy to the lowest taxonomic level they belong to that is covered by all datasets. Remaining taxa x life stage combinations that are not covered in all datasets up to the phylum level (usually something like Annelida, Nematoda, or Insect Pupae) are removed from the final dataset. However, some taxa x life stage combinations are retained if they are at a taxonomic level higher than species that is counted in some surveys, and a lower taxonomic level within this group is counted in all surveys. For example, if we had 3 surveys where surveys A and B count Pseudodiaptomus forbesi, Pseudodiaptomus marinus, and Pseudodiaptomus spp. (UnID), but survey C only counts P. forbesi and P. marinus, then the Pseudodiaptomus spp. (UnID) category would be retained after applying the community approach.
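The roll-up idea behind the community approach can be illustrated with a toy example in base R. This is only a simplified sketch, not the code Zoopsynther actually runs: the surveys, taxon names (GenusX, GenusY), and CPUE values are hypothetical, and the sketch assumes that the genus level is reported by every survey.

```r
# Hypothetical counts: surveys A and B identify two GenusX species,
# while survey C records GenusX only at the genus level (Species = NA).
# All three surveys identify "GenusY gamma" to species.
counts <- data.frame(
  Source  = c("A", "A", "A", "B", "B", "B", "C", "C"),
  Genus   = c("GenusX", "GenusX", "GenusY",
              "GenusX", "GenusX", "GenusY",
              "GenusX", "GenusY"),
  Species = c("GenusX alpha", "GenusX beta", "GenusY gamma",
              "GenusX alpha", "GenusX beta", "GenusY gamma",
              NA, "GenusY gamma"),
  CPUE    = c(10, 5, 1, 8, 2, 2, 20, 3),
  stringsAsFactors = FALSE
)

n_sources <- length(unique(counts$Source))

# Number of distinct surveys reporting each species (NA species are ignored)
species_n <- tapply(counts$Source, counts$Species,
                    function(s) length(unique(s)))

# Species reported by every survey keep their species-level name
covered <- names(species_n)[species_n == n_sources]

# Everything else is rolled up to the genus, assumed here to be the
# lowest level that all surveys report
counts$Taxname <- ifelse(counts$Species %in% covered,
                         counts$Species, counts$Genus)

# Sum CPUE within each survey and rolled-up taxon
community <- aggregate(CPUE ~ Source + Taxname, data = counts, FUN = sum)
print(community)
```

In this toy dataset the two GenusX species are summed to GenusX in surveys A and B so that they are comparable with survey C, while "GenusY gamma" stays at the species level because every survey reports it.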
Data_type = "Taxa" optimizes for the Taxa-level user by maintaining all data at the original taxonomic level (but it outputs warnings for taxa not measured in all datasets, which we call "orphans"). To facilitate comparisons across datasets, this option also sums data into general categories that are comparable across all datasets and years: "summed groups." The new variable "Taxatype" identifies which taxa are summed groups (Taxatype = "Summed group"), which are measured to the species level (Taxatype = "Species"), and which are higher taxonomic groupings with the species designation unknown (Taxatype = "UnID species").
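As a rough illustration of what "orphans" means, the base R sketch below flags taxa that are not reported by every survey. It is not the function's internal logic; the survey labels and taxon names are hypothetical.

```r
# Hypothetical record of which survey reports which taxon
obs <- data.frame(
  Source  = c("A", "A", "B", "B", "C"),
  Taxname = c("Taxon1", "Taxon2", "Taxon1", "Taxon2", "Taxon1"),
  stringsAsFactors = FALSE
)

n_sources <- length(unique(obs$Source))

# Count the distinct surveys that report each taxon
per_taxon <- tapply(obs$Source, obs$Taxname,
                    function(s) length(unique(s)))

# Orphans: taxa reported by fewer surveys than are present in the data
orphans <- names(per_taxon)[per_taxon < n_sources]
print(orphans)
```

Here Taxon2 is an orphan because survey C never reports it, so comparisons of Taxon2 across all three surveys would be misleading.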
Sam Bashevkin
Zoopdownloader, Taxnamefinder, SourceTaxaKeyer, crosswalk, undersampled, zoopComb, zoopEnvComb, zooper
MyZoops <- Zoopsynther(Data_type = "Community",
Sources = c("EMP", "FRP", "FMWT"),
Size_class = "Meso",
Date_range = c("1990-10-01", "2000-09-30"))
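A further sketch shows how the range filters described in the arguments might be combined. The argument names come from the usage above; the specific filter values (summer months, a surface salinity ceiling of 2, and the longitude bounds) are illustrative assumptions, not recommendations. The call is guarded so that the snippet only runs Zoopsynther when the zooper package is installed.

```r
# Illustrative filter settings (values are assumptions chosen for the example)
filters <- list(
  Data_type      = "Community",
  Sources        = c("EMP", "FMWT"),
  Size_class     = "Meso",
  Months         = 6:8,              # June through August, as integers
  Sal_surf_range = c(-Inf, 2),       # no lower bound on surface salinity
  Long_range     = c(-122.5, -121)   # longitudes are negative in the Delta
)

# Only call the function if zooper is available on this machine
if (requireNamespace("zooper", quietly = TRUE)) {
  SummerFresh <- do.call(zooper::Zoopsynther, filters)
}
```

Building the arguments as a list and passing them with do.call makes it easy to reuse the same filters across several calls, for example once with Data_type = "Community" and once with Data_type = "Taxa".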