read_DARLEQ: Read DARLEQ diatom data from an Excel file

read_DARLEQ    R Documentation

Description

read_DARLEQ imports DARLEQ-formatted diatom data from an Excel file.

Usage

read_DARLEQ(file, sheet = NULL, verbose = TRUE)

Arguments

file

Name of Excel file. See Details below for guidelines on formatting the diatom data.

sheet

Name of the sheet within the Excel file. If left as NULL (the default), the function imports the first sheet in the Excel file.

verbose

Logical indicating whether the function should stop immediately on error (TRUE) or return a simpleError object (FALSE). Defaults to TRUE.
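
As a hedged illustration of the verbose argument, the sketch below shows one way a calling script might react to a simpleError returned with verbose = FALSE. The helper describe_import() and the sheet name "no_such_sheet" are hypothetical names introduced here for illustration; the call to read_DARLEQ is skipped if darleq3 is not installed.

```r
# describe_import() is a hypothetical helper, not part of darleq3.
# It reports whether an import result is an error or a usable list.
describe_import <- function(res) {
  if (inherits(res, "error")) {
    paste("Import failed:", conditionMessage(res))
  } else {
    paste("Imported elements:", paste(names(res), collapse = ", "))
  }
}

# Only attempt the real import when the package is available.
if (requireNamespace("darleq3", quietly = TRUE)) {
  fn <- system.file("extdata/DARLEQ2TestData.xlsx", package = "darleq3")
  # "no_such_sheet" is a deliberately invalid, made-up sheet name
  res <- darleq3::read_DARLEQ(fn, sheet = "no_such_sheet", verbose = FALSE)
  message(describe_import(res))
}
```

With verbose = TRUE the same call would stop the script instead, so the error-object form is mainly useful in batch processing where one bad file should not halt the run.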

Details

read_DARLEQ imports diatom data from an Excel file in either .xls or .xlsx format. An example Excel file is included in this package. See examples below to view it. The required data and layout are slightly different for river and lake samples. Figure 1 below shows the required format for performing TDI calculations for river samples.

The first four header rows are mandatory and must contain the following information:

  • Row 1: Sample identifier – a short numerical or alphanumeric code to uniquely identify the sample. This field cannot be empty (an empty cell indicates the end of data).

  • Row 2: Site identifier – a short numerical or alphanumeric code to uniquely identify the site. This code will be used to aggregate multiple samples when calculating confidence of class for a site.

  • Row 3: Sample Date in Day/Month/Year format. Missing dates are set to “Spring” for the purposes of classification using TDI3, and the affected samples are flagged with a warning.

  • Row 4: Mean annual alkalinity (or best available estimate) in mg l-1 (CaCO3). Missing values are set to 100 mg l-1 for the purposes of classification, and the affected samples are flagged with a warning. Alkalinity values outside the range of the site prediction algorithm are set to the nearest limit (6 or 150 mg l-1 for TDI3; 5 or 250 mg l-1 for TDI4 and TDI5LM / TDI5NGS).

  • Rows 5+: Further optional sample descriptors such as river name, reach name etc. These data are not used by the program but will be reproduced in the output. Note that the second column of the header information must be left blank.
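
The alkalinity rules in Row 4 can be sketched in base R as follows. This is a minimal illustration of the documented behaviour (default of 100 for missing values, then limiting to the stated range), not the code darleq3 uses internally; the function name adjust_alkalinity() and its output columns are assumptions made for this example.

```r
# Illustrative only: adjust_alkalinity() is a hypothetical helper, not a darleq3 function.
adjust_alkalinity <- function(alk, metric = c("TDI3", "TDI4", "TDI5LM", "TDI5NGS")) {
  metric <- match.arg(metric)
  # Limits quoted in the text: 6-150 for TDI3, 5-250 for TDI4 and TDI5LM / TDI5NGS
  limits <- if (metric == "TDI3") c(6, 150) else c(5, 250)
  was_missing <- is.na(alk)
  alk[was_missing] <- 100                      # documented default for missing values
  limited <- pmin(pmax(alk, limits[1]), limits[2])
  data.frame(alkalinity  = limited,
             was_missing = was_missing,        # these samples would carry a warning
             was_limited = limited != alk)
}

adjust_alkalinity(c(2, 80, NA, 400), "TDI3")
```

For the four values above, the TDI3 limits give alkalinities of 6, 80, 100 and 150, with the third sample marked as missing.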

Figure 1: Example format for river diatom samples

Identifiers for each row of the sample header information should be listed in column 1. Diatom data then follow the header information and may be in count or percentage format. The first column must contain the taxon code in either NBS or DiatCode (http://www.ecrc.ucl.ac.uk/?q=databases/diatcode) format. The codes in this column are used to link the data to the DARLEQ3 taxon list and ecological information and cannot be empty (an empty cell indicates the end of the data). The second column must include either the taxon name or code (i.e. a repeat of column 1). Empty (blank) cells in the count or percentage data matrix will be read as zero. Character data in the diatom matrix will generate an error. A full list of diatom codes (either NBS or DiatCodes) is available in the data frame darleq3_taxa.
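
The two rules for the diatom matrix (blank cells read as zero, character data raising an error) can be mimicked on a small in-memory table. The sketch below is base R only and does not reproduce the package's own parsing; clean_counts(), the taxon labels and the sample labels are placeholders invented for the example.

```r
# Illustrative only: clean_counts() is a hypothetical helper, not a darleq3 function.
clean_counts <- function(raw) {
  # raw: character matrix of cell contents; "" or NA stands for an empty cell
  raw[is.na(raw) | trimws(raw) == ""] <- "0"   # blank cells are read as zero
  num <- suppressWarnings(as.numeric(raw))
  if (anyNA(num)) {
    stop("Non-numeric data found in diatom matrix")
  }
  matrix(num, nrow = nrow(raw), dimnames = dimnames(raw))
}

# Placeholder taxon and sample labels; real files use NBS codes or DiatCodes
raw <- matrix(c("12", "", "3", "7"), nrow = 2,
              dimnames = list(c("TAXON_A", "TAXON_B"), c("S1", "S2")))
clean_counts(raw)
```

Running a check of this kind on data before saving the spreadsheet can catch stray text in the count matrix before the import fails.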

If the Diatom Acidification Metric (DAM) is to be calculated, rows 5 and 6 must contain estimates of mean annual Calcium and DOC concentrations, in ueq l-1 and mg l-1 respectively. Figure 2 shows an example formatted for calculation of TDI and DAM. Note that if only DAM scores are required the Alkalinity field may be left blank. Sample Date is not used for calculating DAM and may be left blank.

Figure 2: Example format for river diatom TDI and DAM samples

The required format for lake samples is shown in Figure 3. This is exactly the same as for river data except that the fourth row must contain a code indicating lake type according to the GB lake typology alkalinity classes. Marl lakes are included in the high alkalinity (HA) group. Peat and brackish lakes are not covered by the tool. Sample date for lake samples is not used in the class calculations and can contain missing values.

Figure 3: Example format for lake diatom LTDI samples

Value

A list with the following named elements:

header

data frame containing the rows of environmental data from the top of the Excel file (i.e. site, sample, water chemistry and data information)

diatom_data

data frame containing the diatom data

taxon_names

data frame containing taxon codes and names

file

name of the Excel file

filepath

full path to the Excel file

sheet

name of the Excel worksheet
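
Before passing the result to later steps, a script might confirm that all six documented elements are present. The sketch below assumes only the element names listed above; missing_elements() is a hypothetical helper, and the mock list stands in for a real return value so the example runs without the package or an Excel file.

```r
# Element names as documented in the Value section
expected_elements <- c("header", "diatom_data", "taxon_names",
                       "file", "filepath", "sheet")

# Hypothetical helper: returns the names of any documented elements that are absent
missing_elements <- function(d) setdiff(expected_elements, names(d))

# Mock object with made-up contents, used in place of a real import result
mock <- list(header = data.frame(), diatom_data = data.frame(),
             taxon_names = data.frame(), file = "example.xlsx",
             filepath = "/path/to/example.xlsx", sheet = "Sheet1")

missing_elements(mock)        # nothing missing
missing_elements(mock[-1])    # reports the dropped "header" element
```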

Author(s)

Steve Juggins Stephen.Juggins@ncl.ac.uk

Examples

fn <- system.file("extdata/DARLEQ2TestData.xlsx", package="darleq3")
d <- read_DARLEQ(fn, "Rivers TDI Test Data")
head(d$diatom_data)
head(d$header)
## Not run: 
# view the example dataset in Excel
# note: running the following lines opens the file in Excel (if installed);
# shell.exec() is only available on Windows
fn <- system.file("extdata/DARLEQ2TestData.xlsx", package="darleq3")
shell.exec(fn)

## End(Not run)


nsj3/darleq3 documentation built on Oct. 11, 2023, 4:37 a.m.