View source: R/PullBDS.PacFIN.R
PullBDS.PacFIN | R Documentation |
Pull biological data from PacFIN (PACFIN_MARTS.comprehensive_bds_comm).
PullBDS.PacFIN(
pacfin_species_code,
username = getUserName("PacFIN"),
password = ask_password(),
savedir = getwd(),
verbose = TRUE
)
pacfin_species_code |
A vector of strings specifying the PacFIN species
code(s) you are interested in. This has sometimes been referred to as
|
username |
Most often, this is a string containing your username for the
database of interest. You can use |
password |
Most often, this is a string containing your password for
the database of interest. You can use the function |
savedir |
A file path to the directory where the results will be saved. The default is the current working directory. The path can be relative or absolute. |
verbose |
A logical specifying if output should be written to the
screen or not. Good for testing and exploring your data but can be turned
off when output indicates information that you already know. The printing
of output to the screen does not affect any of the returned objects. The
default is to always print to the screen, i.e., verbose = TRUE. |
Upon downloading, the data are changed from a long table to a wide table
using the combination of unique FISH_ID and AGE_SEQUENCE_NUMBER. This
change from long to wide allows for rows equating to a single fish with
columns containing information about all measurements for that fish. Multiple
age reads and information about those reads such as age reader will be in the
columns. The age read number, e.g., 1, 2, 3, 4, ..., is pasted onto the
column name separated by an underscore. So, the maximum number you see is the
maximum number of times an otolith was read in your data set. Not all double
reads are currently available within PacFIN and users should contact the
ageing labs if they wish to inform ageing-error matrices.
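The long-to-wide change described above can be sketched with base R's reshape(). This is only an illustration on made-up data, not the code used inside PullBDS.PacFIN(); the column AGE_READER is a hypothetical stand-in for the reader information.

```r
# Toy long-format data: one row per age read (all values are made up).
bds_long <- data.frame(
  FISH_ID = c(101, 101, 202),
  AGE_SEQUENCE_NUMBER = c(1, 2, 1),
  AGE_IN_YEARS = c(5, 6, 3),
  AGE_READER = c("A", "B", "A") # hypothetical column name
)

# One row per FISH_ID; the read number is pasted onto each
# column name after an underscore.
bds_wide <- reshape(
  bds_long,
  direction = "wide",
  idvar = "FISH_ID",
  timevar = "AGE_SEQUENCE_NUMBER",
  v.names = c("AGE_IN_YEARS", "AGE_READER"),
  sep = "_"
)
names(bds_wide)
```

In this toy example fish 202 was read only once, so its AGE_IN_YEARS_2 and AGE_READER_2 entries are NA.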
AGE_COUNT is a somewhat cryptic column name and does not always make sense
when compared to AGE_SEQUENCE_NUMBER. It was determined that the former is
useful to identify how many potential agers were exposed to this fish.
For example, if AGE_SEQUENCE_NUMBER has a maximum value of three for a
given FISH_ID, then you can expect AGE_COUNT to be three for all three
rows in the PacFIN database for that fish. This is not always true though.
Sometimes, not all AGE_SEQUENCE_NUMBERs are present and they can skip
numbers for a given FISH_ID, and in this case, AGE_COUNT will be the
maximum AGE_SEQUENCE_NUMBER for a given FISH_ID.
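The relationship between AGE_COUNT and AGE_SEQUENCE_NUMBER can be checked on long-format data with a few lines of base R. The sketch below uses made-up rows, and the helper columns count_is_max and skips_numbers are hypothetical names introduced only for illustration.

```r
# Toy long-format rows (made up): fish 202 is missing read number 2.
bds_long <- data.frame(
  FISH_ID = c(101, 101, 101, 202, 202),
  AGE_SEQUENCE_NUMBER = c(1, 2, 3, 1, 3),
  AGE_COUNT = c(3, 3, 3, 3, 3)
)

# Maximum read number and number of rows present for each fish.
max_seq <- ave(bds_long$AGE_SEQUENCE_NUMBER, bds_long$FISH_ID, FUN = max)
n_rows <- ave(bds_long$AGE_SEQUENCE_NUMBER, bds_long$FISH_ID, FUN = length)

# Does AGE_COUNT equal the maximum AGE_SEQUENCE_NUMBER for the fish?
bds_long$count_is_max <- bds_long$AGE_COUNT == max_seq
# Are there fewer rows than the maximum, i.e., were numbers skipped?
bds_long$skips_numbers <- n_rows < max_seq
bds_long
```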
FINAL_FISH_AGE_IN_YEARS is known as the best age for a given fish.
This will not always match an age reader or be a number determinable
from the individual age reads in AGE_IN_YEARS. Patrick explained to me
that when age reads do not agree, particularly for younger fish, then
the senior reader will work together with the junior reader to determine
an agreed-upon age. Other times, the senior reader's value will always
be used, or it could be that together they determine that they were both
wrong and a new age is proposed as the resolved age. Nevertheless,
it can be quite messy and there is no way to predict the best age.
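To see how often the best age differs from every individual read, one could compare FINAL_FISH_AGE_IN_YEARS against the numbered read columns. The sketch below uses made-up data and assumes the wide-format read columns are named AGE_IN_YEARS_1, AGE_IN_YEARS_2, and so on, following the underscore convention described earlier; check names() on your own download before relying on that pattern.

```r
# Toy wide-format data (made up); column names are assumed.
bds_wide <- data.frame(
  FISH_ID = c(1, 2, 3),
  FINAL_FISH_AGE_IN_YEARS = c(5, 7, 4),
  AGE_IN_YEARS_1 = c(5, 6, 3),
  AGE_IN_YEARS_2 = c(5, 8, NA)
)

# Find the individual read columns by their numeric suffix.
read_cols <- grep("^AGE_IN_YEARS_[0-9]+$", names(bds_wide), value = TRUE)

# TRUE when at least one read equals the best age for that fish.
reads <- as.matrix(bds_wide[, read_cols, drop = FALSE])
matches_a_read <- rowSums(
  reads == bds_wide$FINAL_FISH_AGE_IN_YEARS,
  na.rm = TRUE
) > 0

# Fish whose best age is not equal to any single read.
bds_wide[!matches_a_read, ]
```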
FISH_WEIGHT_GUTTED is typically only available for a small subset of
samples that were sampled "purposively" by Washington state. E.g., if a
fish is weighed whole and then headed and gutted and weighed again, then
there would be two rows with the same FISH_ID but different FISH_WEIGHT
entries in the PacFIN BDS table. The downloaded data are reshaped such that
this second gutted weight is placed in FISH_WEIGHT_GUTTED and the fish is
represented in a single row. Granted, these purposive samples should not be
used in an assessment of the population status but they are included in the
download for completeness.
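A quick way to see which fish carry a gutted weight is to flag rows where FISH_WEIGHT_GUTTED is not missing. The sketch below uses made-up values, and has_gutted_weight is a hypothetical helper column; whether a flagged fish really comes from a purposive sample should be confirmed with the sampling information rather than inferred from this flag alone.

```r
# Toy wide-format data (made up): only fish 2 was weighed twice.
bds <- data.frame(
  FISH_ID = c(1, 2, 3),
  FISH_WEIGHT = c(1200, 950, 1430),
  FISH_WEIGHT_GUTTED = c(NA, 810, NA)
)

# Flag fish that have a gutted weight so they can be reviewed.
bds$has_gutted_weight <- !is.na(bds$FISH_WEIGHT_GUTTED)
table(bds$has_gutted_weight)
bds[bds$has_gutted_weight, ]
```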
Values passed to PACFIN_SPECIES_CODE are searched for using regular
expression matching, which is different from the exact matching that is done
in PullCatch.PacFIN(). The use of pattern matching allows for species codes
with mistakes like leading and trailing spaces to be found. This is doable in
the biological data because data for nominal species codes are few. In my
experience, these mistakes in the species codes are more common for PacFIN
species codes that are three letters rather than the standard four letters.
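The difference between exact and pattern matching can be illustrated in R with %in% and grepl(). This is a sketch on made-up strings; the pattern that PullBDS.PacFIN() actually sends to the database is not shown here, and "POPX" is an invented code used only to show that a pattern also matches longer codes that contain the target.

```r
# Toy species codes as they might appear in a table (made up).
species_in_table <- c("POP", "POP ", " POP", "POPX", "PTRL")
target <- "POP"

# Exact matching misses entries with stray spaces.
exact_match <- species_in_table %in% target
# Pattern matching finds them, along with any code containing "POP".
pattern_match <- grepl(target, species_in_table)

data.frame(species_in_table, exact_match, pattern_match)
```

Because the pattern also matches longer codes, it is worth tabulating the species codes in the returned data to confirm that only the intended ones are present.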
An RData file is saved to the disk and the pulled data are returned as an
invisible() data frame. The saved data can be read back in using load(),
but note that upon loading, the object will be named bds.pacfin, which is
its name inside of the .RData file, and thus, the object will retain this
name within your work space unless you rename it. The data are in their raw
form, i.e., just as they were extracted from PacFIN, and will need to be
cleaned using cleanPacFIN() prior to their use in downstream functions.
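The naming behavior of load() can be demonstrated without a database connection. In the sketch below, the file name example_bds.RData and the object my_bds are hypothetical, and the data frame is a made-up stand-in for a real pull; the name of the file written by PullBDS.PacFIN() is not assumed here.

```r
# Stand-in for pulled data, saved under the name used inside the file.
bds.pacfin <- data.frame(FISH_ID = 1:2)
rdata_file <- file.path(tempdir(), "example_bds.RData") # hypothetical name
save(bds.pacfin, file = rdata_file)
rm(bds.pacfin)

# load() returns the names of the objects it restored, so the data
# can be loaded into a separate environment and assigned a new name.
load_env <- new.env()
loaded_names <- load(rdata_file, envir = load_env)
my_bds <- get(loaded_names[1], envir = load_env)
loaded_names
```

Loading into a separate environment avoids overwriting an existing bds.pacfin object in the work space, which is useful when pulls for several species are kept in the same session.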
John R. Wallace and Kelli F. Johnson
cleanColumns() to change to legacy column names
cleanPacFIN() to subset the data frame to those records that should be
used within West Coast assessments of marine populations
## Not run:
# You will be asked for your password
pd <- PullBDS.PacFIN(pacfin_species_code = "POP")
## End(Not run)