breakaway-package: Species richness estimation and modelling in the...

Description

Species richness estimation is an important problem in biodiversity analysis. This package provides methods for estimating total species richness (observed plus unobserved) and a method for modelling total diversity with covariates. Microbial diversity datasets are characterized by a large number of rare species and a small number of highly abundant species, and the class of models implemented by breakaway is flexible enough to capture both features. breakaway_nof1 implements a similar procedure but does not require a singleton count. betta models total diversity with covariates in a way that accounts for its estimated nature, and thus for unobserved taxa; betta_random additionally permits random effects modelling.

Details

Package: breakaway
Type: Package
Version: 3.0
Date: 2016-03-29
License: GPL-2

The function breakaway estimates the total (observed plus unobserved) number of classes (usually, distinct species) from a sample's frequency counts. Standard errors and model fits are also given. The algorithm is based on the theory of characterizing distributions by ratios of their probabilities, and parameters are estimated via nonlinear regression. The class of models available is usually broad enough to account for the high-diversity case, which is often observed in microbial diversity datasets. Many classical estimation procedures either fail to provide an estimate or fit poorly in the microbial setting; breakaway is designed for this data structure.

Because sequencing errors may inflate the singleton count, breakaway_nof1 performs a similar procedure but does not require a singleton count. It can be used as an exploratory tool for investigating the plausibility of the given singleton count.

betta runs a regression-type analysis of estimated total diversity, which allows unobserved taxa to be accounted for. It does not require that breakaway be used for diversity estimation. A mixed-model approach accounts for the differing levels of confidence in the diversity estimates, and covariates constitute the fixed effects.

Support of this work from Cornell University's Department of Statistical Sciences is gratefully acknowledged.
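
As a rough illustration of the ratio idea described in the Details, the sketch below retraces the first estimate in the Example output section of this page from its printed coefficients. It evaluates the fitted ratio function at x = 0 as an approximation to f_1/f_0, divides the singleton count by that ratio to get the number of unobserved classes, and adds the observed count. The singleton count (277) and the number of observed classes (1000) are assumptions about the apples data, since neither value is printed on this page, and the variable names are hypothetical. This is a sketch of the arithmetic, not the package's internal code.

```r
# Coefficients and xbar copied from the printed breakaway(apples) output
beta0  <- 1.20345571
beta1  <- 0.05765149
alpha1 <- 0.03012304
xbar   <- 16.5

# Fitted ratio function: f_{x+1} / f_x as a function of x
fitted_ratio <- function(x) {
  (beta0 + beta1 * (x - xbar)) / (1 + alpha1 * (x - xbar))
}

# ASSUMPTIONS (not shown on this page): singleton count and observed classes
f1       <- 277
observed <- 1000

# Extrapolate to x = 0: f_1 / f_0 is approximated by fitted_ratio(0)
f0_hat    <- f1 / fitted_ratio(0)
total_hat <- observed + f0_hat
total_hat
```

Under these assumptions, total_hat should agree with the printed $est value up to rounding of the coefficients.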

Author(s)

Amy Willis & John Bunge

Maintainer: Amy Willis <[email protected]>

References

Willis, A. and Bunge, J. (2015). Estimating diversity via frequency ratios. Biometrics.

Willis, A. (2015). Species richness estimation with high diversity but spurious singletons. Under review.

Willis, A., Bunge, J., and Whitman, T. (2015). Inference for changes in biodiversity. arXiv preprint.

Rocchetti, I., Bunge, J. and Bohning, D. (2011). Population size estimation based upon ratios of recapture probabilities. Annals of Applied Statistics, 5.

Chao, A. and Bunge, J. (2002). Estimating the number of species in a stochastic abundance model. Biometrics, 58.

Chao, A. (1984). Nonparametric estimation of the number of classes in a population. Scandinavian Journal of Statistics, 4.

Examples

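library(breakaway)
# Estimate total richness from the apples frequency count table
breakaway(apples)
# Suppress printing and plotting; return the results as a list
breakaway(apples, plot = FALSE, print = FALSE, answers = TRUE)
# Drop the singleton row, since breakaway_nof1 does not use it
breakaway_nof1(apples[-1, ])
betta(c(1552, 1500, 884), c(305, 675, 205))  # estimates, then their standard errors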
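
The examples above take a frequency count table as input. As a hedged sketch of what such a table looks like, the snippet below builds one from a hypothetical vector of per-species abundances using base R only, then computes the empirical ratios f_{x+1}/f_x that the ratio model describes. The abundance vector and all variable names are invented for illustration. The two-column layout (frequency index, then number of species with that frequency) is an assumption about the expected input, so it is worth checking against the package documentation before passing such a table to breakaway.

```r
# Hypothetical abundances: number of times each of 12 species was observed
abundance <- c(1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 4, 9)

# Tabulate how many species were seen once, twice, and so on
tab <- table(abundance)
freq_table <- data.frame(
  index     = as.integer(names(tab)),
  frequency = as.integer(tab)
)
freq_table

# Empirical ratios f_{x+1} / f_x, defined only where x and x + 1 both occur
x_vals   <- freq_table$index
has_next <- (x_vals + 1) %in% x_vals
ratios <- data.frame(
  x     = x_vals[has_next],
  ratio = freq_table$frequency[match(x_vals[has_next] + 1, x_vals)] /
          freq_table$frequency[has_next]
)
ratios
```

Gaps in the observed frequencies (here, nothing between 4 and 9) leave the corresponding ratios undefined, which is why only x = 1, 2, 3 appear in the result.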

Example output

Iterative reweighting didn't produce any outcomes after the first iteration, so we use 1/x
################## breakaway ##################
	The best estimate of total diversity is 1552 
 	 with std error 305 
	The model employed was model_1_1 
	The function selected was
	  f_{x+1}/f_{x} ~ (beta0+beta1*(x-xbar))/(1+alpha1*(x-xbar)) 
       Coef estimates Coef std errors
beta0      1.20345571      0.16807523
beta1      0.05765149      0.02962841
alpha1     0.03012304      0.03782164
xbar			 16.5

$code
[1] 3

$name
[1] "model_1_1"

$para
       Coef estimates Coef std errors
beta0      1.20345571      0.16807523
beta1      0.05765149      0.02962841
alpha1     0.03012304      0.03782164

$est
[1] 1552.416

$seest
[1] 304.7069

$full
Nonlinear regression model
  model: lhs$y ~ structure_1_1(x, beta0, beta1, alpha1)
   data: lhs
  beta0   beta1  alpha1 
1.20346 0.05765 0.03012 
 weighted residual sum-of-squares: 1.274

Number of iterations to convergence: 8 
Achieved convergence tolerance: 8.57e-06

$ci
[1]  1006.52 47805.09

Iterative reweighting didn't produce any outcomes after the first iteration, so we use 1/x
################## breakaway ##################
	The best estimate of total diversity is 1500 
 	 with std error 1341 
	The model employed was model_1_1 
	The function selected was
	  f_{x+1}/f_{x} ~ (beta0+beta1*(x-xbar))/(1+alpha1*(x-xbar)) 
       Coef estimates Coef std errors
beta0      1.20078846      0.18102488
beta1      0.05614294      0.04800125
alpha1     0.02874889      0.05381204
xbar			 16.5

$table
     Estimates Standard Errors p-values
[1,]  1212.358        271.4992        0

$cov
         [,1]
[1,] 73711.81

$ssq_u
[1] 105923.2

$homogeneity
[1] 3.9872480 0.1362009

$global
[1] 19.93999  0.00000

$blups
[1] 1393.1890 1266.6155  977.2709

$blupses
[1] 256.2108 366.7010 189.8294
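
The betta summary above can be partly retraced by hand. The sketch below assumes that, with no covariates, the single reported estimate behaves like an inverse-variance weighted mean in which each weight is 1 / (ssq_u + se_i^2), with the printed $ssq_u value taken as the between-sample variance. That reading is an assumption made for illustration, and the variable names are hypothetical; it is not a description of the package's fitting code, which also has to estimate ssq_u itself.

```r
# Inputs from the betta() example call
est <- c(1552, 1500, 884)   # richness estimates
se  <- c(305, 675, 205)     # their standard errors

# Between-sample variance copied from the printed $ssq_u
ssq_u <- 105923.2

# ASSUMPTION: each weight combines between-sample and estimation variance
w <- 1 / (ssq_u + se^2)

pooled_est <- sum(w * est) / sum(w)   # compare with the Estimates column
pooled_var <- 1 / sum(w)              # compare with $cov
pooled_se  <- sqrt(pooled_var)        # compare with the Standard Errors column

c(estimate = pooled_est, variance = pooled_var, se = pooled_se)
```

The least precise estimate (standard error 675) receives the smallest weight, which is how the differing levels of confidence in the diversity estimates enter the analysis.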

breakaway documentation built on May 30, 2017, 3:14 a.m.