preseqR.rSAC.bootstrap: Best practice for r-SAC

View source: R/rSAC.R

preseqR.rSAC.bootstrap    R Documentation

Best practice for r-SAC

Description

preseqR.rSAC.bootstrap predicts the expected number of species represented at least r times in a random sample based on the initial sample.

Usage

preseqR.rSAC.bootstrap(n, r=1, mt=20, size=SIZE.INIT, mu=MU.INIT, times=30,
                       conf=0.95)

Arguments

n

A two-column matrix. The first column is the frequency j = 1,2,…; the second column is N_j, the number of species represented exactly j times in the initial sample. The first column must be sorted in ascending order.

r

A positive integer, the minimum number of times a species must be represented in the sample. Default is 1.

mt

A positive integer constraining possible rational function approximations. Default is 20.

times

The number of bootstrap samples. Default is 30.

size

A positive double, the initial value of the parameter size in the negative binomial distribution for the EM algorithm. Default value is 1.

mu

A positive double, the initial value of the parameter mu in the negative binomial distribution for the EM algorithm. Default value is 0.5.

conf

The confidence level. Default is 0.95.
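
As an illustration of the format expected for the argument n, the following minimal sketch builds the two-column matrix from raw observations using base R only. The vector obs and its labels are hypothetical and are not part of the package.

```r
# Hypothetical raw data: one label per observed individual.
obs <- c("a", "a", "a", "b", "b", "c", "d", "d", "e")

# Number of individuals observed for each species.
per_species <- as.vector(table(obs))

# Number of species observed exactly j times, for each j that occurs.
freq_table <- table(per_species)

# Two-column matrix: frequency j (ascending) and N_j.
n <- cbind(j   = as.numeric(names(freq_table)),
           N_j = as.numeric(freq_table))
n
```

The sum of j * N_j over the rows gives the number of individuals in the initial sample, which is a quick sanity check on the matrix.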

Details

This is the bootstrap version of preseqR.rSAC. Each bootstrap sample is generated by randomly sampling the initial sample with replacement, and an estimator is constructed for each bootstrap sample. The median of the resulting estimates is used as the prediction for the number of species represented at least r times in a random sample.

The confidence interval is constructed based on a lognormal distribution.
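
The procedure described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the counts are hypothetical, the resampling scheme (a multinomial draw over the species counts) is one plausible reading of "sampling with replacement", the bias-corrected Chao1 statistic is a stand-in for the r-SAC estimator, and the lognormal interval shown is one common construction.

```r
set.seed(1)  # reproducible resampling

# Hypothetical counts: N_j species observed exactly j times.
n <- cbind(j = 1:4, N_j = c(50, 20, 10, 5))

# Stand-in statistic, NOT the r-SAC estimator: bias-corrected Chao1 richness.
chao1_bc <- function(h) {
  N1 <- sum(h[h[, 1] == 1, 2])
  N2 <- sum(h[h[, 1] == 2, 2])
  sum(h[, 2]) + N1 * (N1 - 1) / (2 * (N2 + 1))
}

times <- 30    # number of bootstrap samples
conf  <- 0.95  # confidence level

# Assumed resampling scheme: draw sum(N_j) species with replacement,
# then recompute the statistic on each resampled histogram.
estimates <- replicate(times, {
  resampled <- rmultinom(1, size = sum(n[, 2]), prob = n[, 2])[, 1]
  chao1_bc(cbind(n[, 1], resampled))
})

# Median as the point prediction, bootstrap standard deviation as the SE.
point <- median(estimates)
se    <- sd(estimates)

# Lognormal interval (assumed form): divide and multiply by a common factor.
z  <- qnorm((1 + conf) / 2)
C  <- exp(z * sqrt(log(1 + (se / point)^2)))
lb <- point / C
ub <- point * C
c(estimate = point, se = se, lb = lb, ub = ub)
```

An interval of this form is asymmetric around the point estimate and its lower bound stays positive, which suits a count.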

Value

f

The estimator for the r-SAC. The input of the estimator is a vector of sampling efforts t, i.e., sample sizes relative to the size of the initial sample. For example, t = 2 means a random sample that is twice the size of the initial sample.

se

The standard error for the estimator. The input is a vector of sampling efforts t.

lb

The lower bound of the confidence interval. The input is a vector of sampling efforts t.

ub

The upper bound of the confidence interval. The input is a vector of sampling efforts t.
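
All four returned functions take the sampling effort t as input. As a small sketch of how absolute sample sizes could be converted to t, assuming the size of the initial sample is the total number of individuals (the sum of j * N_j), with hypothetical counts and target sizes:

```r
# Hypothetical counts: N_j species observed exactly j times.
n <- cbind(j = 1:4, N_j = c(50, 20, 10, 5))

# Size of the initial sample: total number of individuals observed.
initial_size <- sum(n[, 1] * n[, 2])

# Hypothetical target sample sizes, in individuals.
target_sizes <- c(280, 1400)

# Sampling effort relative to the initial sample.
t <- target_sizes / initial_size
t
```

The resulting vector can then be passed to f, se, lb, or ub.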

Author(s)

Chao Deng

References

Efron, B., & Tibshirani, R. J. (1994). An Introduction to the Bootstrap. CRC Press.

Deng, C., Daley, T., Calabrese, P., Ren, J., & Smith, A.D. (2016). Estimating the number of species to attain sufficient representation in a random sample. arXiv preprint arXiv:1607.02804v3.

Examples

## load library
library(preseqR)

## import data
data(FisherButterfly)

## construct estimator for SAC
estimator1 <- preseqR.rSAC.bootstrap(FisherButterfly, r=1)
## The number of species represented at least once in a sample,
## when the sample is 10 or 20 times the size of the initial sample
estimator1$f(c(10, 20))
## The standard error of the estimates
estimator1$se(c(10, 20))
## The confidence interval of the estimates
lb <- estimator1$lb(c(10, 20))
ub <- estimator1$ub(c(10, 20))
matrix(c(lb, ub), byrow=FALSE, ncol=2)

## construct estimator for r-SAC
estimator2 <- preseqR.rSAC.bootstrap(FisherButterfly, r=2)
## The number of species represented at least twice in a sample,
## when the sample is 50 or 100 times the size of the initial sample
estimator2$f(c(50, 100))
## The standard error of the estimates
estimator2$se(c(50, 100))
## The confidence interval of the estimates
lb <- estimator2$lb(c(50, 100))
ub <- estimator2$ub(c(50, 100))
matrix(c(lb, ub), byrow=FALSE, ncol=2)

smithlabcode/preseqR documentation built on Sept. 13, 2022, 6:29 p.m.