knitr::opts_chunk$set(
    echo = TRUE,
    message = FALSE,
    warning = FALSE)
options(knitr.kable.NA = "",
        knitr.table.format = "pandoc")

options("show.signif.stars"=FALSE,"stringsAsFactors"=FALSE,
        "max.print"=50000,"width"=240)

library(datalowSA)
suppressPackageStartupMessages(library(knitr))

Introduction

There are functions relating to data input and formatting to facilitate the use of many of the functions inside datalowSA and cede. Essentially there are five objects that can contain data to be used by datalowSA. Here we list them; each is described in more detail in the sections that follow.

  1. fish is a data.frame (matrix) that contains at least the year and the catch, but can also contain cpue (note that all column names are in lower case), along with any other variables that have a value for each year (or for a subset of years).
  2. glb is an R list containing a selection of constants (see later).
  3. props is another data.frame (matrix) containing properties such as length-at-age, weight-at-age, maturity-at-age, and selectivity-at-age. Indeed, any variable that varies with age could be included in here.
  4. agedata is a list of five objects relating to the ageing data.
  5. lendata is a similar list of five objects relating to the length data.

To illustrate the standard format of this input data there is a function dataTemplate. Because this function writes a csv file to disk, no example is run in this vignette, but the code would be dataTemplate(paste0(path,"/","fishery1.csv")), which would write a fishery1.csv file into the given path. Such a file can be read by the function readdata. To follow the example below with your own file you would first have to run dataTemplate and then, of course, alter the file path and name to suit. Note also that the data in this template are not internally consistent: the fish and glb objects relate to a deep water species, while the age data derive from the English Plaice flatfish data taken from Beverton and Holt (1957).
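The write-then-read sequence described above can be sketched as follows. This is a minimal illustration, not part of the original vignette: it assumes the template is written to a temporary directory (via tempdir()) rather than to a permanent location, and it only calls dataTemplate and readdata if datalowSA is installed. The object name fishdat is chosen here purely for illustration.

```r
# A sketch only: write the template to a temporary directory, then read it back.
path <- tempdir()                              # assumption: any writable directory would do
filename <- paste0(path, "/", "fishery1.csv")  # same construction as used in the text

if (requireNamespace("datalowSA", quietly = TRUE)) {
  datalowSA::dataTemplate(filename)            # writes the example csv file
  fishdat <- datalowSA::readdata(filename)     # property argument left at its default
  str(fishdat, max.level = 1)                  # top-level components only
}
```

Using a temporary directory keeps the sketch from leaving files behind; for real work you would substitute the directory in which you keep your data files.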

# To read your own file you would supply the path to where you stored fishery1.csv;
# here the built-in plaice data set is loaded instead
data("plaice")
str(plaice)

The fish object

print(plaice$fish)

It is common to have catch data for a series of years before any CPUE data are available and, as you can see, the empty cells have been replaced with the NA that R uses for missing data. This is more useful within R than using -99 or some other standard value. Currently the four columns present in the example are the only ones used within datalowSA and cede. However, we are still learning what data the various jurisdictions in Australia actually possess. Once that is known it should be possible to include methods that can usefully integrate any other data types available with a value each year.
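A short base R sketch may help show why NA is more useful than a sentinel such as -99. The cpue values below are invented for illustration and do not come from any data set in the package.

```r
# Hypothetical cpue series in which -99 was used to flag years without data
cpue_raw <- c(-99, -99, 1.20, 1.35, 1.06)

# Replace the sentinel value with NA
cpue <- cpue_raw
cpue[cpue == -99] <- NA

mean(cpue_raw)             # misleading: the -99 values are treated as real data
mean(cpue, na.rm = TRUE)   # the years without data are ignored
which(is.na(cpue))         # the positions of the missing years are easy to find
```

With NA in place, most R summary and plotting functions either skip the missing years or can be told to do so, whereas a sentinel value would silently be treated as an observation.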

The two packages are designed to use the component objects within the data set, and some methods (for example catch-MSY) only use the fish object and two parts of the glb (spsname and resilience). If that is all that is to be used (say where only catch data is available) then instead of filling in the whole data template the two objects can be created separately, which would be much more efficient.

year <- 1986:2016
catch <- c(112.9,206.3,95.7,183.1,147.4,198.9,102.1,235.5,247.8,426.8,448,577.4,
           558.5,427.9,509.3,502.4,429.6,360.2,306.2,195.7,210,287.3,214.2,260.6,
           272.2,356.9,345,282.7,285.1,237.8,233.3)
cpue <- c(1.2006,1.3547,1.0585,1.0846,0.9738,1.0437,0.7759,1.0532,1.284,1.3327,
          1.4014,NA,NA,1.142,0.9957,0.8818,0.7635,0.7668,0.7198,0.5997,0.6336,
          0.6936,0.8894,0.8644,0.8442,0.8427,0.8849,0.9964,0.9804,0.957,1.0629)
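# Combine the vectors into a data set; the last two arguments give the
# species name and the resilience
dat <- makedataset(year,catch,cpue,"testdata","verylow")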
dat

In summary, the minimum specification for the fish object is a column of year and a column of catch. cpue is optional as are any other columns you wish to add. While datalowSA does not yet use any extra columns they may prove useful to you when plotting out results. Missing data in the cpue time-series should be filled with NA values.
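The minimum specification above can be expressed as a small check in base R. The function check_fish is a hypothetical helper written for this illustration; it is not part of datalowSA, and the example data.frame uses only a few illustrative values.

```r
# check_fish is a hypothetical helper, not a datalowSA function.
# It tests the minimum specification: a data.frame with lower-case
# column names that include year and catch.
check_fish <- function(fish) {
  if (!is.data.frame(fish)) stop("fish must be a data.frame")
  nms <- colnames(fish)
  if (!all(nms == tolower(nms))) stop("column names must be in lower case")
  if (!all(c("year", "catch") %in% nms)) stop("year and catch columns are required")
  invisible(TRUE)
}

# Illustrative values only; cpue is optional and missing years are NA
fish <- data.frame(year = 2001:2004,
                   catch = c(112.9, 206.3, 95.7, 183.1),
                   cpue = c(NA, NA, 1.0585, 1.0846))
check_fish(fish)
```

A check of this kind fails early with a readable message, which can be easier to diagnose than an error raised deep inside an assessment function.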

The glb and props objects

The default glb object contains only the spsname and the resilience; the first is the first line of the .csv file and the second is identified under the RESILIENCE marker in the .csv file. The BIOLOGY marker is always required in the .csv file and leads to a matrix being defined that contains columns of the ages, the length-at-age, the weight-at-age, the maturity-at-age, the selectivity-at-age, and the fecundity-at-age. Prior to reading in the properties a number of biological parameters are entered, the number depending on whether the property parameter is TRUE. If TRUE then one first reads in four glb properties, the maxage, the M, the steepness, and the R0 values. The properties will then be read in directly.

An example file fisheryprops.csv is provided that illustrates the format for a file where the age-related properties are read in directly (as may be required if growth follows a pattern other than the von Bertalanffy curve, or if selectivity or maturity is described using something other than a logistic curve).

How the props matrix is filled is determined by the property parameter in the readdata function. If the property parameter is left as the default of FALSE, then readdata expects to find a series of constants as described in the "example.csv" file generated by the dataTemplate function. These constants are:

BIOLOGY

To apply the age-structured production model you really need to have estimates of these parameters, either from the fishery in question or from a meta-analysis of very similar species.
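To make concrete how constants of this kind translate into at-age properties, here is a minimal base R sketch. It assumes von Bertalanffy growth, a power relationship between weight and length, and logistic maturity and selectivity by age. All parameter values, the helper function logistic, and the column names are hypothetical; they are not the names or values used by readdata or by the template file.

```r
# All parameter values and column names below are hypothetical
ages <- 0:10
linf <- 50; k <- 0.2; t0 <- -0.5    # von Bertalanffy growth parameters
wa <- 1e-5; wb <- 3                 # weight = wa * length^wb
m50 <- 4; mdelta <- 1               # logistic maturity-at-age
s50 <- 3; sdelta <- 1               # logistic selectivity-at-age

# Logistic curve equal to 0.5 at x50, with steepness set by delta
logistic <- function(x, x50, delta) 1 / (1 + exp(-(x - x50) / delta))

laa <- linf * (1 - exp(-k * (ages - t0)))        # length-at-age
props <- data.frame(age = ages,
                    laa = laa,
                    waa = wa * laa^wb,               # weight-at-age
                    maa = logistic(ages, m50, mdelta),  # maturity-at-age
                    sela = logistic(ages, s50, sdelta), # selectivity-at-age
                    feca = NA)                       # fecundity-at-age left as NA
head(props)
```

If growth or selectivity does not follow these curves, the same data.frame layout could instead be filled with empirically derived values for each age.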

As an alternative, if you have empirically derived estimates of the length-at-age, the weight-at-age, the maturity-at-age, the selectivity-at-age, and the fecundity-at-age, these can be entered directly. This is especially useful if growth is not best described by the power equation, or selectivity is not logistic. Fecundity-at-age does not need to be filled in with anything other than NAs.

# Obviously you need to change this path to wherever you kept your copy of this file.
data("invert")
str(invert)

Note the greatly reduced list of biological properties inside glb.

agedata

This requires the AGE keyword followed immediately by the number of years of age composition data, the number of sexes for which data will be presented, and a vector of the age classes represented in the data. An example might be:

While the fishery might act on two sexes, either the sexes have not been distinguished or data are only available for females. In such a case the number of sexes is 1. If you do have age-composition data for both females and males then the number of sexes would be 2. The number of years of data and the number of sexes are multiplied together to identify how many lines of age-composition data are expected. If you had five years of female age composition and only four years of male age composition data then you would need to add a row of year, 2, NAs .... to balance the observations within each year.
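The row-count arithmetic and the NA padding can be sketched in base R. The layout used here (a year column, a sex column coded 1 and 2, then one column per age class) and all of the counts are assumptions made for illustration; the sketch does not reproduce the exact .csv format.

```r
nyears <- 5L
nsex <- 2L
years <- 2001:2005
ages <- 1:4

# Five years of data for sex 1 but only four years for sex 2
obs <- rbind(data.frame(year = years, sex = 1L),
             data.frame(year = years[1:4], sex = 2L))

# Attach placeholder counts-at-age (all values here are invented)
counts <- matrix(seq_len(nrow(obs) * length(ages)), nrow = nrow(obs),
                 dimnames = list(NULL, paste0("a", ages)))
obs <- cbind(obs, counts)

# Every year by sex combination that is expected
full <- expand.grid(year = years, sex = seq_len(nsex))

# Merging on year and sex fills the unmatched combination with NA
padded <- merge(full, obs, all.x = TRUE)
padded <- padded[order(padded$year, padded$sex), ]

nrow(padded) == nyears * nsex   # number of lines expected
padded[!complete.cases(padded), ]
```

The single incomplete row corresponds to the year in which only one sex was sampled, which is the balancing row of NAs described above.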

Currently, neither age-composition nor length-composition data have methods implemented within datalowSA. In theory this R package is for data-poor species, although visits to different jurisdictions are making it clear that the array of data available is more complex than previously envisaged.

lendata

The format used for age composition data is also used for length composition data.



haddonm/datalowSA documentation built on Nov. 5, 2023, 6:40 p.m.