chkinp: Check input taxonomy and site data

chkinp    R Documentation

Check input taxonomy and site data

Description

Check input taxonomy and site data for required information.

Usage

chkinp(taxa, station, getval = FALSE)

Arguments

taxa

data.frame for input taxonomy data

station

data.frame for input station data

getval

logical; if TRUE, return a vector of the values that failed a check instead of an error message, which is useful for data prep

Details

The following are checked:

  • Required columns in taxonomy data: StationCode, SampleDate, Replicate, SampleTypeCode, BAResult, Result, FinalID

  • Taxonomic names are present in the STE reference file

  • Sites include both diatom and soft-bodied algae data (warning if not)

  • No missing abundance values for diatoms (needed for rarefaction)

  • One of CondQR50 or all predictors for the conductivity model in the station data

  • One of XerMtn or PSA6C in the station data

  • Additional required columns for the station data: StationCode, CondQR50, SITE_ELEV, TEMP_00_09, KFCT_AVE, AtmCa, PPT_00_09, MAX_ELEV

  • No missing data in additional required columns for the station data
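
The column and missing-value checks listed above can be illustrated with a minimal base R sketch. The helper check_required_cols() and the toy data are hypothetical and are not part of the ASCI package; the actual implementation in chkinp may differ.

```r
# Hypothetical helper, not part of ASCI: report required columns that are absent.
# With getval = TRUE the missing names are returned instead of raising an error.
check_required_cols <- function(dat, required, getval = FALSE) {
  missing_cols <- setdiff(required, names(dat))
  if (length(missing_cols) == 0) return(invisible(character(0)))
  if (getval) return(missing_cols)
  stop("Required columns not found: ", paste(missing_cols, collapse = ", "))
}

# Toy station data (made-up values) that lacks the AtmCa column
toy_station <- data.frame(
  StationCode = c("A", "B"),
  CondQR50 = c(100, NA),
  stringsAsFactors = FALSE
)

required <- c("StationCode", "CondQR50", "AtmCa")
missing_cols <- check_required_cols(toy_station, required, getval = TRUE)

# Required columns that are present but contain missing values
present <- intersect(required, names(toy_station))
has_na <- present[vapply(toy_station[present], anyNA, logical(1))]
```

Here missing_cols holds the absent column names and has_na holds the present columns with missing values, which mirrors the kind of information getval = TRUE is described as returning.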

Value

If all checks are met, a two-element list of the original data (named taxa) and the taxa removed by SampleID (named txrmv). The original data also includes a new column for SampleID. If the datasets do not meet the requirements, an error message is returned, or a vector of the values that caused the error if getval = TRUE. Site data will include only those sites in the taxonomic data.
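
As a rough illustration of how site data can be restricted to the sites in the taxonomic data, the following base R sketch subsets a station table by StationCode. The toy data frames are made up for this example, and the sketch is an assumption about the general approach rather than the code used inside chkinp.

```r
# Toy inputs with made-up values; only StationCode matters here
toy_taxa <- data.frame(StationCode = c("A", "A", "B"), stringsAsFactors = FALSE)
toy_sites <- data.frame(StationCode = c("A", "B", "C"), stringsAsFactors = FALSE)

# Keep only the stations that also appear in the taxonomy data
sites_kept <- toy_sites[toy_sites$StationCode %in% toy_taxa$StationCode, , drop = FALSE]

# Sites in the taxonomy data that have no station record
unmatched <- setdiff(toy_taxa$StationCode, toy_sites$StationCode)
```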

See Also

calcgis

Examples

# all checks passed, data returned with SampleID
chkinp(demo_algae_tax, demo_station)

# errors
## Not run: 
# missing columns in taxa data
tmp <- demo_algae_tax[, 1, drop = FALSE]
chkinp(tmp, demo_station)
chkinp(tmp, demo_station, getval = TRUE)

# incorrect taxonomy
tmp <- demo_algae_tax
tmp[1, 'FinalID'] <- 'asdf'
chkinp(tmp, demo_station)
chkinp(tmp, demo_station, getval = TRUE)

# sites missing one algae group (only diatoms are kept), returns only a warning
# requires dplyr for %>% and filter
library(dplyr)
tmp <- merge(demo_algae_tax, STE, all.x = TRUE) %>%
  filter(Class %in% 'Bacillariophyceae')
chkinp(tmp, demo_station)


# missing abundance data for diatoms
tmp <- demo_algae_tax
tmp$BAResult <- NA
chkinp(tmp, demo_station)
chkinp(tmp, demo_station, getval = TRUE)

# stations not shared between taxa and station
tmp <- demo_station[-1, ]
chkinp(demo_algae_tax, tmp)

# missing both of XerMtn and PSA6C in station
tmp <- demo_station[, !names(demo_station) %in% c('XerMtn', 'PSA6C')]
chkinp(demo_algae_tax, tmp)

# missing CondQR50 and incomplete predictor fields
tmp <- demo_station[, !names(demo_station) %in% c('CondQR50', 'TMAX_WS')]
chkinp(demo_algae_tax, tmp)

# missing remaining station predictors
tmp <- demo_station[, !names(demo_station) %in% c('AtmCa')]
chkinp(demo_algae_tax, tmp)

# missing data in remaining station predictors
tmp <- demo_station
tmp$AtmCa[2] <- NA
chkinp(demo_algae_tax, tmp)

## End(Not run)

fawda123/ASCI documentation built on Jan. 31, 2024, 4:10 a.m.