cleanPacFIN | R Documentation |
Clean raw PacFIN data to remove unsuitable samples if CLEAN = TRUE and
convert units of measured quantities to work with downstream functions.
Raw data are meant to be inclusive of everything from PacFIN so users can
explore all that is available, but this means that raw data will ALWAYS
include information that is not appropriate for use in
US West Coast stock assessments.
cleanPacFIN(
Pdata,
keep_INPFC = lifecycle::deprecated(),
keep_gears,
keep_sample_type = c("M"),
keep_sample_method = "R",
keep_length_type,
keep_age_method = NULL,
keep_missing_lengths = lifecycle::deprecated(),
keep_states = c("WA", "OR", "CA"),
CLEAN = TRUE,
spp = NULL,
verbose = TRUE,
savedir
)
Pdata |
A data frame of biological samples
originating from the
Pacific Fisheries Information Network (PacFIN) data warehouse,
which originated in 2014. Data are pulled using SQL calls, see
|
keep_INPFC |
Deprecated. Areas are now defined using different methods. |
keep_gears |
A character vector including only the gear types you want
to label as unique fleets. Order matters and will define fleet numbering.
If the argument is missing, which is the default, then all found gear groups
are maintained and ordered alphabetically. For more details see
getGearGroup that lists a web link for where you can find the
available gear groupings and how they link to |
keep_sample_type |
A vector of character values specifying the types of
samples you want to keep. The default is to keep |
keep_sample_method |
A vector of character values specifying the types of
sampling methods you want to keep. The default is to keep |
keep_length_type |
A vector of character values specifying the types of
length samples to keep. There is no default value, though users will typically
want to keep |
keep_age_method |
A vector of ageing methods to retain in the data. All fish
aged with methods other than those listed will no longer be considered aged.
A value of |
keep_missing_lengths |
Deprecated. Just subset them using
|
keep_states |
A vector of states that you want to keep, where each state
is defined using a two-letter abbreviation, e.g., |
CLEAN |
A logical value used when you want to remove data from the input
data set. The default is |
spp |
A character string giving the species name to
ensure that the methods are species specific. Leave |
verbose |
A logical specifying if output should be written to the
screen or not. Good for testing and exploring your data but can be turned
off when output indicates information that you already know. The printing
of output to the screen does not affect any of the returned objects. The
default is to always print to the screen, i.e., |
savedir |
A file path to the directory where the results will be saved. The default is the current working directory. The path can be relative or absolute. |
The original fields in the returned data are left untouched, with the exception of
SEX: modified using nwfscSurvey::codify_sex() and upon return will only include character values such that fish with an unidentified sex are now "U".
Age: the best ages to use going forward rather than just the first age read.
The data are put through various tests before they are returned,
and the results of these tests are stored in the CLEAN column.
Thus, it is sometimes informative to run cleanPacFIN(CLEAN = FALSE)
and use frequency tables to inspect which groups of data will be removed
from the data set when you change the code to CLEAN = TRUE.
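The inspection described above can be sketched with base R frequency tables. This is a minimal illustration on a toy data frame, not output from PacFIN: the rows are invented, and treating CLEAN as a logical column where FALSE marks records that would be dropped is an assumption that should be checked against your own returned data.

```r
# Toy stand-in for the result of cleanPacFIN(CLEAN = FALSE).
# All values are hypothetical; CLEAN is ASSUMED to be logical, with
# FALSE marking records that would be removed under CLEAN = TRUE.
flagged <- data.frame(
  year = c(1985, 1985, 1985, 2010, 2010, 2010),
  state = c("WA", "OR", "CA", "WA", "OR", "CA"),
  CLEAN = c(FALSE, FALSE, TRUE, TRUE, TRUE, FALSE)
)

# Cross-tabulate the flag against year to see when records are lost.
by_year <- table(flagged$year, flagged$CLEAN, dnn = c("year", "CLEAN"))
print(by_year)

# Proportion of records in each year that would be dropped.
prop_removed <- tapply(!flagged$CLEAN, flagged$year, mean)
print(prop_removed)
```

The same tabulation can be repeated against state, fleet, or any other column of interest to see where removals concentrate.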
For example, many early length compositions do not have information on
the weight of fish that were sampled, and thus, there is no way to infer
how much the entire sample weighed or how much the tow/trip weighed.
Therefore, these data cannot be expanded and are removed using
CLEAN = TRUE. Some stock assessment authors, and even previous
versions of this very code, attempted to use adjacent years to inform
weights. Doing so required many assumptions, and state
representatives discouraged inferring data that did not exist.
The values created as new columns are for use by other functions in this package.
In particular, fishyr and season are useful if there are multiple
seasons (e.g., winter and summer, as in the petrale sole assessment), and the
year is adjusted so that "winter" occurs in one year, rather than across two.
The fleet, fishery, and state columns are meant for use in
stratifying the data according to the particulars of an assessment.
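The year adjustment described above can be illustrated with a small base R sketch. It is not the implementation used by getSeason: the month column, the choice of November through February as winter, and the season codes 1 and 2 are all assumptions made for illustration.

```r
# Hypothetical samples; the month column is an ASSUMED input for this sketch.
samples <- data.frame(
  year = c(2019, 2019, 2020, 2020),
  month = c(7, 12, 1, 6)
)

# ASSUMPTION: winter runs November through February, coded as season 1;
# all other months are summer, coded as season 2.
winter_months <- c(11, 12, 1, 2)
samples$season <- ifelse(samples$month %in% winter_months, 1, 2)

# Shift November and December into the following year so that one winter
# falls entirely within a single fishing year.
samples$fishyr <- ifelse(
  samples$month %in% c(11, 12),
  samples$year + 1,
  samples$year
)

print(samples)
```

With this convention, the December 2019 and January 2020 samples share fishyr 2020 and season 1, so a winter fishery is not split across two years.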
The input data filtered for desired areas and record types specified, with added columns
year: initialized from SAMPLE_YEAR
fleet: initialized to 1
fishery: initialized to 1
season: initialized to 1. Change using getSeason
state: initialized from SOURCE_AGID. Change using getState
length: length in mm, where NA indicates length is not available
lengthcm: floored cm from FORK_LENGTH when available, otherwise FISH_LENGTH
geargroup: the gear group associated with each GRID
weightkg: fish weight in kg from FISH_WEIGHT and FISH_WEIGHT_UNITS
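As a rough illustration of how the lengthcm and weightkg columns listed above relate to the raw fields, the following base R sketch applies the same rules to a toy data frame. It is not the package's code: the raw lengths are assumed to be in mm, and the unit codes "G" (grams) and "P" (pounds) are hypothetical placeholders that should be checked against the actual values of FISH_WEIGHT_UNITS in your data.

```r
# Toy raw records; all values are hypothetical.
raw <- data.frame(
  FORK_LENGTH = c(NA, 412, 305),
  FISH_LENGTH = c(357, 420, NA),
  FISH_WEIGHT = c(850, 2.2, NA),
  FISH_WEIGHT_UNITS = c("G", "P", NA)
)

# Prefer FORK_LENGTH when available, otherwise fall back to FISH_LENGTH.
# ASSUMPTION: both fields are recorded in mm.
length_mm <- ifelse(is.na(raw$FORK_LENGTH), raw$FISH_LENGTH, raw$FORK_LENGTH)

# Floor to whole centimeters.
lengthcm <- floor(length_mm / 10)

# ASSUMPTION: unit codes "G" and "P" stand for grams and pounds.
to_kg <- c(G = 0.001, P = 0.45359237)

# Look up the conversion factor by unit code; a missing unit yields NA.
weightkg <- unname(raw$FISH_WEIGHT * to_kg[raw$FISH_WEIGHT_UNITS])

print(data.frame(lengthcm = lengthcm, weightkg = weightkg))
```

Records with a missing weight or unit code come through as NA rather than being silently converted.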
Andi Stephens
getState, getSeason