estN: Estimate Effective Sample Size

Description

Estimate the effective sample size for catch-at-age or catch-at-length data, based on the multinomial distribution.

Usage

estN(model, what="CAc", series=NULL, init=NULL, FUN=mean, ceiling=Inf,
     digits=0)

estN.int(P, Phat)  # internal function

Arguments

model

fitted scape model containing catch-at-age and/or catch-at-length data.

what

name of model element: "CAc", "CAs", "CLc", or "CLs".

series

vector of strings indicating which gears or surveys to analyze (all by default).

init

initial sample size, determining the relative pattern of the effective sample size between years.

FUN

function to standardize the effective sample size.

ceiling

largest possible sample size in one year.

digits

number of decimal places to use when rounding, or NULL to suppress rounding.

P

observed catch-at-age or catch-at-length matrix.

Phat

fitted catch-at-age or catch-at-length matrix.

Details

The init sample sizes set a fixed pattern for the relative sample sizes between years. For example, if there are two years of catch-at-age data and the initial sample sizes are 100 in year 1 and 200 in year 2, the effective sample size will be two times greater in year 2 than in year 1, although both will be scaled up or down depending on how closely the model fits the catch-at-age data. The value of init can be one of the following:

NULL

means read the initial sample sizes from the existing SS column (default).

model

means read the initial sample sizes from the SS column in that model (object of class scape).

numeric vector

means those are the initial sample sizes (same length as the number of years).

FALSE

means ignore the initial sample sizes and use the empirical multinomial sample size (nhat) in each year.

1

means calculate one effective sample size to use across all years, e.g. the mean or median of nhat.

The idea behind FUN=mean is to guarantee that regardless of the value of init, the mean effective sample size will always be the same. Other functions can be used to a similar effect, such as FUN=median.
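The standardization described above can be sketched in base R. This is a minimal illustration inferred from the documented behavior, not the package source: the helper name `scale_sample_size` and the input numbers are hypothetical, and the rule that the `init` pattern is multiplied by `FUN(nhat)/FUN(init)` is an assumption.

```r
# Hypothetical helper (not part of scape): scale the init pattern so that
# FUN of the result equals FUN of the empirical sample sizes nhat.
# Assumption: the ceiling and rounding are applied after scaling.
scale_sample_size <- function(nhat, init, FUN = mean, ceiling = Inf,
                              digits = 0) {
  n <- init * FUN(nhat) / FUN(init)          # keep the relative pattern of init
  n <- pmin(n, ceiling)                      # cap the sample size in any one year
  if (!is.null(digits)) n <- round(n, digits)
  n
}

nhat <- c(40, 120, 80, 160)    # made-up empirical sample sizes
init <- c(100, 200, 100, 200)  # made-up initial sample sizes (e.g. tows)

n_pattern  <- scale_sample_size(nhat, init, digits = NULL)       # init pattern
n_constant <- scale_sample_size(nhat, rep(1, 4), digits = NULL)  # like init=1
n_median   <- scale_sample_size(nhat, init, FUN = median, digits = NULL)

mean(n_pattern)    # same as mean(nhat)
mean(n_constant)   # same as mean(nhat)
median(n_median)   # same as median(nhat)
```

Under this assumed rule, years with twice the initial sample size get twice the effective sample size, while the chosen summary statistic stays fixed whatever pattern is supplied.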

The estN function is implemented for basic single-sex datasets. If the data are sex-specific, estN pools (averages) the sexes before estimating effective sample sizes. The general function estN.int, on the other hand, is suitable for analyzing any datasets in matrix format. The int in estN.int stands for internal (not integer), analogous to rep.int, seq.int, sort.int, and similar functions.

Value

Numeric vector of effective sample sizes (one value if init=1), or a list of such vectors when analyzing multiple series.

Note

This function uses the empirical multinomial sample size to estimate an effective sample size, which may be appropriate as likelihood weights for catch-at-age and catch-at-length data. The better the model fits the data, the larger the effective sample size. See McAllister and Ianelli (1997), Gavaris and Ianelli (2002), and Magnusson et al. (2013).

estN can be used iteratively, along with estSigmaI and estSigmaR, to assign likelihood weights that are indicated by the model fit to the data. Sigmas and sample sizes are then adjusted between model runs until they converge. The iterate function facilitates this procedure.

If P[t,a] is the observed proportion of fish at age (or length bin) a in year t, and Phat[t,a] is the fitted proportion, then the estimated sample size in that year is:

nhat[t] = sum_a(Phat[t,a]*(1-Phat[t,a])) / sum_a((P[t,a]-Phat[t,a])^2)
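The formula above can be written out in a few lines of base R. The sketch below is illustrative only: `nhat_multinomial` is a hypothetical helper, not the package's estN.int, and the proportions are made up. It assumes years in rows and ages (or length bins) in columns.

```r
# Hypothetical helper (not part of scape): empirical multinomial sample size
# for each year, given observed (P) and fitted (Phat) proportions.
nhat_multinomial <- function(P, Phat) {
  P <- as.matrix(P)
  Phat <- as.matrix(Phat)
  stopifnot(identical(dim(P), dim(Phat)))
  numerator <- rowSums(Phat * (1 - Phat))    # multinomial variance terms
  denominator <- rowSums((P - Phat)^2)       # squared residuals
  numerator / denominator
}

# Made-up proportions for two years and three ages
Phat <- matrix(c(0.50, 0.30, 0.20,
                 0.50, 0.30, 0.20), nrow = 2, byrow = TRUE,
               dimnames = list(c("2001", "2002"), c("1", "2", "3")))
P <- matrix(c(0.60, 0.20, 0.20,
              0.55, 0.25, 0.20), nrow = 2, byrow = TRUE,
            dimnames = dimnames(Phat))

nhat <- nhat_multinomial(P, Phat)
nhat  # approximately 31 for 2001 and 124 for 2002
```

In this made-up case the 2002 residuals are half the size of those in 2001, so the estimated sample size is four times larger, which illustrates why a closer fit implies a larger effective sample size.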

Due to the non-random and non-independent nature of sampling fish, the effective sample size, for statistical purposes, is much less than the number of fish sampled. Common starting points include using the number of tows as the sample size in each year, or using the empirical multinomial sample sizes. Those “initial” sample sizes can then be scaled up or down. Sample sizes between 20 and 200 are common in the stock assessment literature.

References

Gavaris, S. and Ianelli, J. N. (2002) Statistical issues in fisheries' stock assessments. Scandinavian Journal of Statistics, 29, 245–271.

Magnusson, A., Punt, A. E. and Hilborn, R. (2013) Measuring uncertainty in fisheries stock assessment: the delta method, bootstrap, and MCMC. Fish and Fisheries, 14, 325–342.

McAllister, M. K. and Ianelli, J. N. (1997) Bayesian stock assessment using catch-age data and the sampling-importance resampling algorithm. Canadian Journal of Fisheries and Aquatic Sciences, 54, 284–300.

See Also

getN, getSigmaI, getSigmaR, estN, estSigmaI, and estSigmaR extract and estimate sample sizes and sigmas.

iterate combines all the get* and est* functions in one call.

plotCA and plotCL show what is behind the sample-size estimation.

scape-package gives an overview of the package.

Examples

## Exploring candidate sample sizes:

getN(x.sbw)     # sample sizes used in assessment: number of tows
estN(x.sbw)     # effective sample size, given data (tows) and model fit
estN(x.sbw, ceiling=200)  # could use this
estN(x.sbw, init=FALSE)   # from model fit, disregarding tows
plotCA(x.sbw)             # years with good fit => large sample size
estN(x.sbw, init=1)       # one sample size across all years
estN(x.sbw, init=c(rep(1,14),rep(2,9)))  # two sampling periods

## Same mean, regardless of init:

mean(estN(x.sbw, digits=NULL))
mean(estN(x.sbw, digits=NULL, init=FALSE))
mean(estN(x.sbw, digits=NULL, init=1))
mean(estN(x.sbw, digits=NULL, init=c(rep(1,14),rep(2,9))))

## Same median, regardless of init:

median(estN(x.sbw, FUN=median, digits=NULL))
median(estN(x.sbw, FUN=median, digits=NULL, init=FALSE))
median(estN(x.sbw, FUN=median, digits=NULL, init=1))
median(estN(x.sbw, FUN=median, digits=NULL, init=c(rep(1,14),rep(2,9))))

## Multiple series:

getN(x.ling, "CLc")              # sample size used in assessment
getN(x.ling, "CLc", digits=0)    # rounded
estN(x.ling, "CLc")              # model fit implies larger sample sizes

getN(x.ling, "CLc", series="1", digits=0)  # get one series
estN(x.ling, "CLc", series="1")            # estimate one series

Example output

1979 1980 1981 1982 1983 1984 1985 1986 1988 1989 1990 1991 1992 1993 1994 1995 
  20   10   33   16   17   13   17   28  206  133   94   52  121   55   80   76 
1996 1997 1998 1999 2000 2001 2002 
  96  185  255  175  168  321  185 
1979 1980 1981 1982 1983 1984 1985 1986 1988 1989 1990 1991 1992 1993 1994 1995 
  38   19   63   31   33   25   33   54  396  256  181  100  233  106  154  146 
1996 1997 1998 1999 2000 2001 2002 
 185  356  490  336  323  617  356 
1979 1980 1981 1982 1983 1984 1985 1986 1988 1989 1990 1991 1992 1993 1994 1995 
  38   19   63   31   33   25   33   54  200  200  181  100  200  106  154  146 
1996 1997 1998 1999 2000 2001 2002 
 185  200  200  200  200  200  200 
1979 1980 1981 1982 1983 1984 1985 1986 1988 1989 1990 1991 1992 1993 1994 1995 
  16   49   45   23   50   29   93   61  203   65   96   20   48    5  683  343 
1996 1997 1998 1999 2000 2001 2002 
 945  584   52  543  179  263  134 
[1] 197
1979 1980 1981 1982 1983 1984 1985 1986 1988 1989 1990 1991 1992 1993 1994 1995 
 142  142  142  142  142  142  142  142  142  142  142  142  142  142  283  283 
1996 1997 1998 1999 2000 2001 2002 
 283  283  283  283  283  283  283 
[1] 196.928
[1] 196.928
[1] 196.928
[1] 196.928
[1] 65.34033
[1] 65.34033
[1] 65.34033
[1] 65.34033
$`1`
     1989      1990      1991      1992      1993      1994      1995      1996 
  2.46807  21.12030  40.96320  58.98490  56.26150  56.05500  23.26390  45.71440 
     1997      1998      1999      2000 
116.16300 140.97300  96.85780  45.78170 

$`2`
   1995    1996    1997    1998    1999    2000 
140.735 164.541 180.757 119.075 120.941 194.915 

$`1`
1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 
   2   21   41   59   56   56   23   46  116  141   97   46 

$`2`
1995 1996 1997 1998 1999 2000 
 141  165  181  119  121  195 

$`1`
1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 
  22  185  359  517  493  491  204  401 1018 1235  849  401 

$`2`
1995 1996 1997 1998 1999 2000 
 808  945 1038  684  695 1120 

1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 
   2   21   41   59   56   56   23   46  116  141   97   46 
1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 
  22  185  359  517  493  491  204  401 1018 1235  849  401 

scape documentation built on Nov. 23, 2020, 5:08 p.m.