sample_index | R Documentation |
Create new catch-per-unit-effort (CPUE)/indices of abundance that are based on the numbers in a data file. Typically the data file will be filled with expected values rather than observed data but it does not have to be. Sampling can only occur on fleets, years, and months that have current observations. If rows of information are not sampled from, then they are removed. So, you can take away rows of data but you cannot add them with this function.
sample_index(
  dat_list,
  outfile = lifecycle::deprecated(),
  fleets,
  years,
  sds_obs = list(0.01),
  sds_out,
  seas = lifecycle::deprecated(),
  month = list(1)
)
dat_list |
A Stock Synthesis data list returned from
|
outfile |
A deprecated argument. |
fleets |
An integer vector specifying which fleets to sample from. The
order of the fleets matters here because you must retain the ordering for
all of the remaining input arguments. For example, both |
years |
A list the same length as |
sds_obs , sds_out , month |
A list the same length as |
seas |
A deprecated argument. |
Limitations to the functionality of this function are as follows:
you can only generate observations from rows of data that are present, e.g., you cannot make a new observation for a year that is not present in the passed data file;
no warning will be given if some of the desired year, month, fleet combinations are available but not all, instead just the combinations that are available will be returned in the data list object; and
sampling uses a log-normal distribution when the log-normal distribution is specified in CPUEinfo[["errtype"]] and a normal distribution for all other error types; see below for details on the log-normal sampling.
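Because no warning is given for missing combinations, it can help to check what is available before sampling. The following is a minimal sketch in base R, not part of ss3sim: the data frame `cpue` is a made-up stand-in for `dat_list[["CPUE"]]` (column names follow the examples on this page), and `key()` is a hypothetical helper introduced only for illustration.

```r
# Hypothetical stand-in for dat_list[["CPUE"]]; the values are invented.
cpue <- data.frame(
  year = c(76, 78, 80, 82),
  month = 7,
  index = 2,
  obs = c(1.2e9, 1.1e9, 9.8e8, 9.1e8),
  se_log = 0.2
)

# Year, month, fleet combinations you would like to sample.
# Year 84 is deliberately absent from `cpue`.
wanted <- expand.grid(year = c(80, 82, 84), month = 7, index = 2)

# Hypothetical helper: build one text key per row from year, month, and fleet.
key <- function(x) {
  paste(x[["year"]], x[["month"]], x[["index"]], sep = "_")
}

# Requested combinations that have no matching row in the data.
missing <- wanted[!key(wanted) %in% key(cpue), ]
missing
```

Any rows left in `missing` are combinations that would be silently absent from the returned data list.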
Samples are generated using the following equation when the log-normal distribution is specified:
B_y * exp(stats::rnorm(1, 0, sds_obs) - sds_obs^2 / 2),
where B_y is the expected biomass in year y and sds_obs is the standard deviation of the normally distributed biomass or the standard error of the log_e(B_y). For the error term, this is the same parameterization that is used in Stock Synthesis. More details can be found in the section on indices in the Stock Synthesis manual. The second term in the equation adjusts the random samples so their expected value is B_y, i.e., the log-normal bias correction.
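The effect of the bias-correction term can be checked by simulation. This is a minimal sketch that applies the log-normal equation to many draws at once, not the package's internal code; the values of `B_y`, `sds_obs`, and `n` are arbitrary assumptions chosen for illustration.

```r
set.seed(1)

B_y <- 1000     # hypothetical expected biomass
sds_obs <- 0.4  # hypothetical log-scale standard deviation
n <- 1e5        # number of simulated observations

# With the bias correction, as in the equation for log-normal sampling.
corrected <- B_y * exp(stats::rnorm(n, 0, sds_obs) - sds_obs^2 / 2)

# Without the bias correction, for comparison.
uncorrected <- B_y * exp(stats::rnorm(n, 0, sds_obs))

mean(corrected)    # close to B_y
mean(uncorrected)  # close to B_y * exp(sds_obs^2 / 2), i.e., biased high
```

The correction matters more as `sds_obs` grows, because the expected value of the uncorrected samples is inflated by a factor of exp(sds_obs^2 / 2).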
If you only know the coefficient of variation (CV), then the input error can be approximated using \sqrt{\log_e(1+CV^{2})}, where CV is assumed to be constant with mean changes in biomass. The log-normal distribution can be approximated by a proportional distribution or normal distribution only when the variance is low, i.e., CV < 0.50, which corresponds to a log-scale variance of about 0.22 (a log standard deviation of about 0.47).
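The conversion from a CV to a log-scale standard deviation can be written as a one-line helper. This is a sketch only: `cv_to_sd_log()` is a hypothetical name that is not exported by ss3sim, and the CV values are arbitrary.

```r
# Hypothetical helper: convert a coefficient of variation to the
# standard deviation on the log scale, sqrt(log(1 + CV^2)).
cv_to_sd_log <- function(cv) {
  sqrt(log(1 + cv^2))
}

cv <- c(0.05, 0.1, 0.2, 0.5, 1)
conversion <- data.frame(cv = cv, sd_log = cv_to_sd_log(cv))
conversion
```

For small CVs the two quantities are nearly equal, and they diverge as the CV increases. The result could be passed to `sds_obs`, e.g., `sds_obs = list(cv_to_sd_log(0.2))`.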
A Stock Synthesis data file list object is returned. The object will be a modified version of dat_list.
Cole Monnahan, Kotaro Ono
Other sampling functions: clean_data(), sample_agecomp(), sample_calcomp(), sample_catch(), sample_discard(), sample_lcomp(), sample_mlacomp(), sample_wtatage()
# Add a list from [r4ss::SS_readdat()] to your workspace, this is example
# data that is saved in the ss3sim package.
# Index data are saved in `dat_list[["CPUE"]]`
dat_list <- r4ss::SS_readdat(
  file = file.path(
    system.file("extdata", "example-om", package = "ss3sim"),
    "ss3_expected_values.dat"
  ),
  verbose = FALSE
)
# Sample from each available year from fleet 2 with an increasing trend in
# the observation error, i.e., the most recent year has the highest
# likelihood to be the furthest from the input data
ex1 <- sample_index(
  dat_list,
  fleets = 2,
  month = list(
    dat_list[["CPUE"]][dat_list[["CPUE"]][, "index"] == 2, "month"]
  ),
  years = list(dat_list[["CPUE"]][["year"]]),
  sds_obs = list(
    seq(0.001, 0.1, length.out = length(dat_list[["CPUE"]][["year"]]))
  )
)
## Not run:
# Sample from fewer years; note that sampling from more years than are
# present in the data will not work
ex2 <- sample_index(
  dat_list,
  fleets = 2,
  month = list(unique(
    dat_list[["CPUE"]][dat_list[["CPUE"]][, "index"] == 2, "month"]
  )),
  years = list(dat_list[["CPUE"]][["year"]][-c(1:2)]),
  sds_obs = list(0.001)
)
# The sd in the returned file can differ from the sd used to sample; this
# is helpful when you want to test what would happen if the estimation method
# was improperly specified
ex3 <- sample_index(
  dat_list = dat_list,
  fleets = 2,
  month = list(unique(
    dat_list[["CPUE"]][dat_list[["CPUE"]][, "index"] == 2, "month"]
  )),
  years = list(dat_list[["CPUE"]][["year"]]),
  sds_obs = list(0.01),
  sds_out = list(0.20)
)
ex3[["CPUE"]][["se_log"]]
## End(Not run)
# Sample from two fleets after adding fake CPUE data for fleet 1
dat_list2 <- dat_list
dat_list2[["CPUE"]] <- rbind(
  dat_list[["CPUE"]],
  dat_list[["CPUE"]] |>
    dplyr::mutate(index = 1, month = 1)
)
dat_list2[["N_cpue"]] <- NROW(dat_list2[["CPUE"]])
ex4 <- sample_index(
  dat_list = dat_list2,
  fleets = 1:2,
  month = list(1, 7),
  # Subset two years from each fleet
  years = list(c(76, 78), c(80, 82)),
  # Use the same sd values for both fleets
  sds_obs = list(0.01),
  sds_out = list(0.20)
)