likelihood_sparsesample: Gadget3 likelihood actions for sparse data
In lentinj/gadget3: Globally-Applicable Area Disaggregated General Ecosystem Toolbox V3

likelihood_sparsesample

R Documentation

Gadget3 likelihood actions for sparse data

Description

Compare model predictions against a set of sparse data points

Usage

g3l_sparsesample_linreg(
        fit = c('log', 'linear'),
        slope = 1,
        intercept = NULL )

g3l_sparsesample_sumsquares(
        weighting = "model_stddev" )

g3l_sparsesample(
        nll_name,
        obs_df,
        stocks,
        measurement_f = quote( wgt ),
        function_f = g3l_sparsesample_linreg(),
        predstocks = list(),
        area_group = NULL,
        weight = g3_parameterized(paste(
            if (length(predstocks) > 0) "csparse" else "asparse",
            function_f_name,
            nll_name,
            "weight",
            sep = "_"), optimise = FALSE, value = 1),
        run_at = g3_action_order$likelihood )

Arguments

`slope`, `intercept`	formula substituted into surveyindices calcuations to fix slope/intercept of linear regression, or NULL if not fixed. See below.
`fit`	Is the fit 'log' or 'linear'? See below.
`weighting`	Weighting applied to sum-of-squares. One of "model_stddev", "obs_stddev" or a formula.
`nll_name`	Character string, used to define the variable name for `obsstock` and `modelstock`. By default set to `(asparse\|csparse)_(name_of_function_f)_(nll_name)_weight`.
`obs_df`	Data.frame of observation data. See details.
`stocks`	A list of `g3_stock` objects to collect sparsesample data for, depending if `stocks` were provided.
`measurement_f`	formula to derive the model's equivalent predicted value for a data point. You can use `wgt` to refer to weight of matching individuals, `length` to refer to length of matching individuals.
`function_f`	A formula to compare `obs_df` to predicted values generated via `transform_f` and generate nll, defined by one of the `g3l_sparsesample_`* functions.
`predstocks`	A list of `g3_stock` predator or fleet objects. If present, we will compare against the model predicted catch. Without (the default), we compare against overall abundance.
`area_group`	List mapping area names used in `obs_df` to integer model areas, most likely generated by `g3_areas`.
`weight`	Weighting applied to this likelihood component. Default is a `g3_param` that defaults to 1, allowing weights to be altered without recompiling.
`run_at`	Integer order that actions will be run within model, see `g3_action_order`.

Details

The actions will define the following variables in your model, which could be reported with g3a_report_history:

nll_sp(abund|catch)_name__obs_mean: Observation mean, the mean column from obs_df
nll_sp(abund|catch)_name__obs_stddev: Observation standard deviation, the stddev column from obs_df
nll_sp(abund|catch)_name__obs_n: Observation number, the number column from obs_df
nll_sp(abund|catch)_name__model_sum: The corresponding model prediction vector, total datapoints. __model_sum / __model_n for the mean
nll_sp(abund|catch)_name__model_sqsum: The corresponding model prediction vector, sqared-sum datapoints.
nll_sp(abund|catch)_name__model_n: The number of data points at each point in the model prediction vector, if predstocks set this is the number of individuals caught matching the datapoint (length/age/...), otherwise abundance of individuals matching the datapoint.

obs_df format

data.frame of observation data. Unlike g3l_abundancedistribution, gaps and sparse data is accepted, and gaps will not be filled with zero.

For each row in the table, all matching predictions are aggregated. Aggregation columns include:

year: Required. The year the sample is from
step: Optional. The timestep/season the sample is from
area: Optional. Only aggregate predicted values from given area
age: Optional. Only aggregate predicted values with given age
length: Optional. Only aggregate predicted values with given length (matches nearest lengthgroup)

So, a row with "year=1998,age=4" will be compared against age 4 individuals of all lengths in 1998, step 1 & 2. A row with "year=2004,step=1,age=2,length=19" will be compared against individuals of age 4, length 10..20, in winter 2004.

The observation data is provided in the following columns:

mean: Required. Mean value at this data point
number: Optional. Number of data points, defaults to 1
stddev: Optional. Observed standard deviation (only required if weighting = "obs_stddev")

Value

g3l_sparsesample_linreg

Returns a formula for use as function_f:

If fit = "log":

\sum_{\it i}^{rows} (\alpha + \beta \log N_{i} - \log \frac{\nu_{i}}{P_{i}})^2

If fit = "linear":

\sum_{\it i}^{rows} (\alpha + \beta N_{i} - \frac{\nu_{i}}{P_{i}})^2

N_{i}: "mean" column from obs_df
\nu_{i}: Total predicted values for all data points, i.e. nll_spabund_name__model_sum
P_{i}: Number of data points, i.e. nll_spabund_name__model_n
\alpha: intercept parameter, defaults to 1, i.e. fixed slope
\beta: slope parameter, defaults to NULL, i.e. linear regression performed to find optimal value

If either alpha or beta is not provided, then linear regression is performed on N vs \nu for each value in table, and the optimal value used for each.

g3l_sparsesample_sumsquares

Returns a formula for use as function_f:

\sum_{\it i}^{rows} w (\frac{\nu_{i}}{P_{i}} - N_{i})^2

N_{i}

"mean" column from obs_df

\nu_{i}

Total predicted values, i.e. nll_spabund_name__model_sum

P_{i}

Number of data points, i.e. nll_spabund_name__model_n

w

weighting parameter, either:

1 / \sigma^2, using stddev of model predicted values if weighting = "model_stddev"
1 / \sigma^2, using stddev column from obs_df if weighting = "obs_stddev"
A custom forumla provided for weighting

Examples


st <- g3_stock("fish", c(10, 20, 30)) %>% g3s_age(3,5)

# Generate some random sparsesample samples
obs_df <- data.frame(
    # NB: No 1993, we don't have any samples for that year
    year = rep(c(1990, 1991, 1992, 1994), each = 2),
    step = 1:2 )
obs_df$age = floor(runif(nrow(obs_df), min = 3, max = 5.1))
obs_df$length = floor(runif(nrow(obs_df), min = 10, max = 50))
obs_df$mean = runif(nrow(obs_df), min = 10, max = 1000)

actions <- list(
    g3a_time(1990, 1994, c(6,6)),
    # Use otherfood to populate abundance / mean weight
    g3a_otherfood(st,
        quote( age * 100 + stock__minlen ),
        quote( cur_year * 1e5 + cur_step * 1e4 + 0 * stock__minlen ) ),
    g3l_sparsesample(
        "bt",
        obs_df,
        list(st),
        measurement_f = g3_formula(
          # Derive blubber thickness from length/weight
          ((wgt/(wmax.a * length^wmax.b) - 0.5) * 100 - 4.44) / (5693 * (length/wgt)^0.5),
          wmax.a = g3_parameterized("wmax.a", by_stock = TRUE),
          wmax.b = g3_parameterized("wmax.b", by_stock = TRUE),
          end = NULL ),
        function_f = g3l_sparsesample_linreg(fit = "linear") ),
    NULL )

model_fn <- g3_to_r(c(actions, list(
    g3a_report_detail(actions),  # TODO: Not reporting anything useful
    NULL )))
r <- attributes(model_fn())
colSums(r$dstart_fish__num)  # TODO: Report something related

lentinj/gadget3 documentation built on June 12, 2025, 5:46 a.m.