airdas_comments_process: Process comments in AirDAS data




Description

Extract miscellaneous information recorded in AirDAS data comments, i.e. comment-data.


Usage

airdas_comments_process(x, ...)

## S3 method for class 'data.frame'
airdas_comments_process(x, ...)

## S3 method for class 'airdas_dfr'
airdas_comments_process(x, comment.format = NULL, ...)

## S3 method for class 'airdas_df'
airdas_comments_process(x, comment.format = NULL, ...)



Arguments

x: airdas_dfr or airdas_df object, or a data frame that can be coerced to an airdas_dfr object




comment.format: list; default is NULL. See the 'Using comment.format' section


Details

Historically, project-specific or miscellaneous data have been recorded in AirDAS comments using specific formats and character codes. This function identifies and extracts these data from the comment text strings. However, different data types have different comment-data formats. Specifically, TURTLE and PHOCOENA comment-data use identifier codes that each signify a certain data pattern, while other comment-data (usually that of CARETTA) use data separated by some delimiter.


Value

x, filtered for comments with recorded data, with the following columns added:

See the additional sections for more context. If comment.format is NULL, then the output data frame will have two Misc# columns: a level one descriptor, e.g. "Fish ball" or "Jellyfish", and a level two descriptor, e.g. s, m, or c. However, if comment.format$n is, say, 4, then the output data frame will have columns Misc1, Misc2, Misc3, and Misc4.

Messages are printed in two cases: if comment.format is not NULL and no comment-data is identified using comment.format, or if x has TURTLE/PHOCOENA data but no TURTLE/PHOCOENA comment-data.

TURTLE and PHOCOENA comment-data

Currently supported data types are: fish balls, molas, jellyfish, and crab pots. See any of the AirDAS format PDFs (airdas_format_pdf) for information about the specific codes and formats used to record these data. All comments are converted to lower case for processing to avoid missing data.

These different codes contain (at most): a level one descriptor (e.g. fish ball or crab pot), a level two descriptor (e.g. size or jellyfish species), and a value (a count or percentage). Thus, the extracted data are returned together in this structure. The output data frame is long data, i.e. it has one piece of information per line. For instance, if the comment is "fb1s fb1m", then the output data frame will have one line for the small fish ball and one for the medium fish ball. See the Value section for more details.
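The long-data idea can be illustrated with a minimal base R sketch. This is not the package's implementation: the helper parse_comment_codes and the assumed code layout (letters, then a number, then an optional one-letter level two descriptor) are hypothetical and chosen only to match the "fb1s fb1m" example. The actual codes and formats are defined in the AirDAS format PDFs.

```r
# Hypothetical helper, for illustration only (not part of swfscAirDAS).
# Assumed code layout: letters, then a number, then an optional
# one-letter level two descriptor, e.g. "fb1s".
parse_comment_codes <- function(comment) {
  # Lower-case the comment first, mirroring the case-insensitive processing
  tokens <- strsplit(tolower(trimws(comment)), "[[:space:]]+")[[1]]
  pattern <- "^([a-z]+)([0-9]+)([a-z]?)$"
  tokens <- tokens[grepl(pattern, tokens)]

  # One row per recognized code, i.e. long data
  data.frame(
    code   = sub(pattern, "\\1", tokens),
    value  = as.numeric(sub(pattern, "\\2", tokens)),
    level2 = sub(pattern, "\\3", tokens),
    stringsAsFactors = FALSE
  )
}

parsed <- parse_comment_codes("FB1s fb1m")
print(parsed)
```

Under these assumptions, the single comment yields two rows, one per code, which is the sense in which the output is long data.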

Currently this function only recognizes mola data recorded using the "m1", "m2", and "m3" codes (small, medium, and large mola, respectively). Thus, "mola" is not recognized and processed.

The following codes are used for the level two descriptors:

Description   Code
Small         s
Medium        m
Large         l
Unknown       u
Chrysaora     c
Moon jelly    m
Egg yolk      e
Other         o

Using comment.format

comment.format is a list that allows the user to specify the comment-data format. To use this argument, data must be separated by a delimiter. This list must contain three named elements: n, sep, and type.

For instance, for most CARETTA data, comment.format should be list(n = 5, sep = ";", type = c("character", "character", "numeric", "numeric", "character")).
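To make the roles of n, sep, and type concrete, here is a minimal base R sketch of delimiter-based splitting. It is an illustration under stated assumptions, not the package's code: the helper split_comment_data and the sample comment string are hypothetical, and only the "character" and "numeric" types from the example above are handled.

```r
# Hypothetical stand-in for delimiter-based comment-data extraction
# (not part of swfscAirDAS).
split_comment_data <- function(comment, comment.format) {
  # Split on the literal delimiter and trim surrounding whitespace
  pieces <- trimws(strsplit(comment, comment.format$sep, fixed = TRUE)[[1]])

  # Treat comments without exactly n pieces as containing no comment-data
  if (length(pieces) != comment.format$n) return(NULL)

  # Convert each piece according to its declared type
  out <- lapply(seq_len(comment.format$n), function(i) {
    if (comment.format$type[i] == "numeric") as.numeric(pieces[i]) else pieces[i]
  })
  names(out) <- paste0("Misc", seq_len(comment.format$n))
  as.data.frame(out, stringsAsFactors = FALSE)
}

fmt <- list(
  n = 5, sep = ";",
  type = c("character", "character", "numeric", "numeric", "character")
)

# Made-up comment string, for illustration only
res <- split_comment_data("cc; a; 12; 3.5; note", fmt)
str(res)
```

With the real function, the same list would presumably be passed as airdas_comments_process(x, comment.format = fmt), following the usage shown earlier.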


Examples

library(swfscAirDAS)
y <- system.file("airdas_sample.das", package = "swfscAirDAS")
y.proc <- airdas_process(y)
airdas_comments_process(y.proc)


swfscAirDAS documentation built on Jan. 10, 2021, 6:02 p.m.