View source: R/read_zooplankton_data.R
Reads IOPAN and NPI standard format zooplankton data
read_zooplankton_data(
data_file,
sheet = 1,
dataStart = NULL,
dataEnd = 1000,
dataCols = NULL,
output_format = "as.Date",
control_species = c("species", "stage", "size_op", "length"),
lookup_cols = "biomass_conv",
species_info_cols = NULL,
remove_missing = TRUE,
control_stations = FALSE,
add_coordinates = FALSE,
control_sample_names = TRUE,
round2ceiling = FALSE
)
data_file |
Path to the Excel file containing zooplankton data |
sheet |
The name or index of the sheet to read the zooplankton data from. See |
dataStart |
The row number where zooplankton data starts. If |
dataEnd |
The row number where zooplankton data ends. Values larger than the actual number of rows in the data are ignored. The default is 1000. Set to a higher value if your dataset has more rows than that. |
dataCols |
Optional numeric index indicating the column numbers that contain zooplankton data. Not implemented yet. |
output_format |
Output format for date. See |
control_species |
A character vector giving the names for species, stage, length operator and length columns from the Excel sheet. These names will be used as column names in the R output. The size operator ( |
lookup_cols |
Character vector specifying the names of columns from the zooplankton lookup list ( |
species_info_cols |
Character vector specifying the names of species information columns that should be preserved. Required only if |
remove_missing |
Logical indicating whether species with column sums of 0 should be removed from the output. |
control_stations |
Logical indicating whether station names should be controlled against a list of standardized station names (see |
add_coordinates |
If |
control_sample_names |
Logical indicating whether non-standard symbols in sample names should be replaced by standardized equivalents. May fix problems when trying to merge zooplankton samples with meta data from another file. These names tend to have typos. |
round2ceiling |
Logical indicating whether decimals should be rounded up to the nearest integer (ceiling): some Polish data come rounded this way. It is recommended to ask for non-rounded values, as using this parameter may lead to very large biases in biomass estimates of deep samples. This argument is included only to make it easier to test the impact of rounding. |
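The effect of ceiling rounding can be sketched in base R. The numbers below are hypothetical, and multiplying by layer thickness is only an assumed stand-in for whatever depth integration is applied to real samples; it is not taken from the function itself.

```r
# Hypothetical abundances (individuals per cubic metre) in five depth layers
abundance <- c(0.02, 0.10, 0.35, 1.20, 0.04)
# Hypothetical layer thicknesses in metres; deeper layers are thicker
layer_thickness <- c(50, 150, 300, 500, 1000)

# Ceiling rounding turns every small non-zero value into at least 1
rounded <- ceiling(abundance)

# Assumed depth integration: abundance times layer thickness, summed
integrated_raw <- sum(abundance * layer_thickness)
integrated_rounded <- sum(rounded * layer_thickness)

# Ratio above 1 means the rounded values overestimate the total
bias_ratio <- integrated_rounded / integrated_raw
```

In this toy case the thick, sparsely populated layers contribute most of the inflation, which is consistent with the warning about deep samples.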
Zooplankton taxonomy data from IOPAN are received in (more or less) standard format on MS Excel sheets. This function attempts to read that format and enable passing the data on for further manipulation in R. The structure of the Excel sheet is explained in Figure 1.
Figure 1. Example how zooplankton Excel sheets tend to be arranged.
Meta data are arranged row-wise (with headers on rows) and should contain the following fields: "expedition", "station", "sample_name", "date", "from", "to", "unit", and "comment". The field names will be guessed. If the function does not guess the names correctly, try changing the names to the required field names. The dataCols argument may be used to specify the column indices containing data to help the function (currently not implemented).
Data are listed column-wise for each station. Make sure that there are no blank data columns with meta data (entirely blank columns are OK) as the function cannot separate such columns yet. Specify the row number for the beginning of the data section (dataStart). Rows < dataStart will be considered as meta data. The dataEnd argument can be used in cases where the sheet contains scrap data. Rows > dataEnd will be dropped.
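The row logic can be illustrated with a toy sheet in base R. This is a minimal sketch, assuming that meta data occupy the rows above dataStart; the sheet contents, station names, and the objects meta_rows and data_rows are hypothetical and do not come from the function.

```r
# Toy stand-in for an Excel sheet read as text (layout is hypothetical)
sheet <- data.frame(
  label = c("station", "date", "Calanus finmarchicus", "Oithona similis", "scrap note"),
  S1 = c("KB3", "2019-07-20", "12", "40", "check this"),
  S2 = c("KB5", "2019-07-21", "3", "55", NA),
  stringsAsFactors = FALSE
)

dataStart <- 3  # first row of the data section
dataEnd <- 4    # last row of the data section; later rows are scrap

# Rows above dataStart are treated as meta data (assumption of this sketch)
meta_rows <- sheet[seq_len(dataStart - 1), ]

# A dataEnd beyond the last row of the sheet is simply capped
last_row <- min(dataEnd, nrow(sheet))
data_rows <- sheet[dataStart:last_row, ]
```

Leaving dataEnd at a large default such as 1000 would keep the scrap row in this toy sheet, which is why the argument is worth setting when a sheet contains notes below the data.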
The species list is arranged column-wise and the field headers should be listed in the control_species argument.
The correct Excel sheet containing all data is often named "ALL...", but this varies (purple text).
The function sums up duplicate species entries for each sample. The function attempts to match the species names in data_file with the accepted ones listed in ZOOPL. Sometimes this routine fails and manual fixes are required.
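Summing duplicate species entries per sample can be sketched with base R's rowsum(). This is not the function's own implementation; the species names, sample columns S1 and S2, and counts are hypothetical.

```r
# Hypothetical counts with one species listed twice
counts <- data.frame(
  species = c("Calanus finmarchicus", "Oithona similis", "Calanus finmarchicus"),
  S1 = c(12, 40, 5),
  S2 = c(3, 55, 0)
)

# rowsum() adds up rows that share the same grouping value,
# separately for each sample column
summed <- rowsum(counts[, c("S1", "S2")], group = counts$species)
```

The result has one row per unique species name, so any spelling variants left unmatched would still appear as separate rows and need manual fixing first.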
The function is currently relatively unstable and most likely requires manual debugging for each dataset.
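Given that instability, it may help to wrap the call so that a failure produces a message rather than stopping a script. The sketch below uses only arguments from the usage section; the file name and the dataStart value are hypothetical, and the call is skipped when the function or the file is not available.

```r
# Hypothetical path; replace with your own Excel file
data_file <- "zooplankton_counts.xlsx"

zoo <- NULL
if (exists("read_zooplankton_data", mode = "function") && file.exists(data_file)) {
  zoo <- tryCatch(
    read_zooplankton_data(
      data_file,
      sheet = 1,
      dataStart = 11,  # hypothetical first data row
      dataEnd = 1000
    ),
    error = function(e) {
      # Report the problem and return NULL so the script can continue
      message("Reading failed: ", conditionMessage(e))
      NULL
    }
  )
}
```

A NULL result then signals that the sheet needs manual inspection before trying again.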
Returns a list of class ZooplanktonData. The list contains 3 data frames: $data (abundance data), $meta (meta-data), and $splist (species information).
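The shape of the return value can be mimicked with a hand-built object. Only the class name and the three element names come from the description above; the columns and values inside each data frame are hypothetical and the real output may be organised differently.

```r
# Mock object with the documented class and element names
zoo_mock <- structure(
  list(
    data = data.frame(
      `Calanus finmarchicus` = c(17, 3),
      `Oithona similis` = c(40, 55),
      check.names = FALSE
    ),
    meta = data.frame(
      station = c("KB3", "KB5"),
      date = as.Date(c("2019-07-20", "2019-07-21"))
    ),
    splist = data.frame(
      species = c("Calanus finmarchicus", "Oithona similis"),
      stage = c("CV", NA)
    )
  ),
  class = "ZooplanktonData"
)

# The elements are reached with the usual list accessors
element_names <- names(zoo_mock)
all_data_frames <- all(vapply(unclass(zoo_mock), is.data.frame, logical(1)))
```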
Mikko Vihtakari, Anette Wold
Other ZooplanktonData: merge_zooplankton_data(), print.ZooplanktonData(), subset.ZooplanktonData(), summarize_zooplankton_data()