
View source: R/sample_coverage.R

`preseqR.sample.cov` predicts the probability of observing a species
represented at least *r* times in a random sample.

```
preseqR.sample.cov(n, r=1, mt=20)
```

`n`
A two-column matrix.
The first column is the frequency

`r`
A positive integer. Default is 1.

`mt`
A positive integer constraining possible rational function approximations. Default is 20.
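As a rough illustration of what such a two-column matrix might look like, the sketch below builds one from raw observations in base R. It assumes the second column holds *N_j*, the number of species seen exactly *j* times (as defined in the Details section); the vector `obs` is made-up data, not part of preseqR.

```r
# Made-up raw data: one entry per observed individual, labelled by species.
obs <- c("a", "a", "b", "c", "c", "c", "d")

# Abundance of each species: a = 2, b = 1, c = 3, d = 1.
abund <- table(obs)

# Count how many species share each abundance (the "counts of counts").
freq <- table(as.vector(abund))

# Assumed layout: column 1 = frequency j (ascending), column 2 = N_j.
n <- cbind(as.numeric(names(freq)), as.vector(freq))
print(n)
```

Here two species are seen once, one species twice, and one species three times, so the rows are (1, 2), (2, 1), and (3, 1).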

Suppose a sample is given and one more individual is randomly drawn from the
population. `preseqR.sample.cov` estimates the probability that the
species to which this individual belongs has already been observed at least
*r* times in the sample. When *r = 1*, this probability is called the
sample coverage.

Let *N_j* be the number of species represented exactly *j* times in
a sample. The probability of observing a species represented at
least *r* times in the sample is estimated as
*∑_{j=r+1}^∞ jN_j / ∑_{j=1}^∞ jN_j*. For *r = 1* this reduces to
*1 − N_1 / ∑_{j=1}^∞ jN_j*, the classical estimate of sample coverage
(Good, 1953). The theory is described by Mao and Lindsay (2002).
For a random sample where *N_j* is unknown, a modified rational function
approximation is first used to predict the value of *N_j*. The predicted
values are then substituted into the formula above to obtain an estimator
for the probability of observing a species represented at least *r* times
in the sample.
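For the initial sample, where the *N_j* are known, the ratio above can be evaluated directly. The following is a minimal base-R sketch; `coverage_initial` is a hypothetical helper and `toy` is made-up data, neither of which is part of preseqR, and it does not reproduce the rational function approximation used for larger samples.

```r
# Hypothetical helper (not part of preseqR): plug-in estimate for the
# initial sample, computed directly from the observed counts N_j.
coverage_initial <- function(n, r = 1) {
  j  <- n[, 1]   # frequencies j
  Nj <- n[, 2]   # number of species observed exactly j times
  keep <- j > r  # the numerator sums over j = r + 1, r + 2, ...
  sum(j[keep] * Nj[keep]) / sum(j * Nj)
}

# Made-up toy data: 5 species seen once, 3 seen twice, 2 seen four times,
# for a total of 1*5 + 2*3 + 4*2 = 19 individuals.
toy <- cbind(c(1, 2, 4), c(5, 3, 2))

cov_r1 <- coverage_initial(toy, r = 1)  # (2*3 + 4*2) / 19 = 14/19
cov_r2 <- coverage_initial(toy, r = 2)  # (4*2) / 19 = 8/19
print(c(cov_r1, cov_r2))
```

With *r = 1* the result equals *1 − N_1 / 19 = 14/19*, matching the sample-coverage special case.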

This function is a fast version of `preseqR.sample.cov.bootstrap`.
It does not provide confidence intervals. To obtain confidence intervals
along with the estimates, use `preseqR.sample.cov.bootstrap`.

An estimator, returned as a function, for the probability of observing a
species represented at least *r* times in a random sample.
The estimator takes as input a vector of sampling efforts *t*, i.e.,
sample sizes relative to the initial sample.
For example, *t = 2* means a random sample that is twice the size of
the initial sample.

Chao Deng

Good, I. J. (1953). The population frequencies of species and the estimation of population parameters. Biometrika, 40(3-4), 237-264.

Mao, C. X. and Lindsay, B. G. (2002). A Poisson model for the coverage problem with a genomic application. Biometrika, 89(3), 669-682.

Deng, C., Daley, T., Calabrese, P., Ren, J., and Smith, A. D. (2016). Estimating the number of species to attain sufficient representation in a random sample. arXiv preprint arXiv:1607.02804v3.

```
## load library
library(preseqR)
## import data
data(FisherButterfly)
## construct the estimator for the sample coverage (r = 1)
estimator1 <- preseqR.sample.cov(FisherButterfly, r=1)
## Given a sample that is 10 or 20 times the size of the initial sample,
## suppose one more individual is randomly drawn from the population. The
## value of the function is the probability that the species of this
## individual has already been observed in the sample
estimator1(c(10, 20))
## construct the estimator for r = 2
estimator2 <- preseqR.sample.cov(FisherButterfly, r=2)
## the probability of observing a species represented at least twice when
## the sample is 50 or 100 times the size of the initial sample
estimator2(c(50, 100))
```

```
Warning messages:
1: In polynomial(p) : imaginary parts discarded in coercion
2: In polynomial(p) : imaginary parts discarded in coercion
3: In polynomial(p) : imaginary parts discarded in coercion
4: In polynomial(p) : imaginary parts discarded in coercion
[1] 0.9996047 0.9998965
Warning messages:
1: In polynomial(p) : imaginary parts discarded in coercion
2: In polynomial(p) : imaginary parts discarded in coercion
3: In polynomial(p) : imaginary parts discarded in coercion
4: In polynomial(p) : imaginary parts discarded in coercion
[1] 0.9999492 0.9999871
```
