ChaoSpecies: Estimation of species richness in a community
In SpadeR: Species-Richness Prediction and Diversity Estimation with R


ChaoSpecies estimates species richness in a single community from five types of data: Type (1) abundance data (datatype = "abundance"), Type (1A) abundance-frequency counts (datatype = "abundance_freq_count"), Type (2) incidence-frequency data (datatype = "incidence_freq"), Type (2A) incidence-frequency counts (datatype = "incidence_freq_count"), and Type (2B) incidence-raw data (datatype = "incidence_raw"). See the SpadeR-package details for the data input formats.
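To illustrate how the three incidence summaries relate to one another, the following base-R sketch derives incidence frequencies and incidence-frequency counts from a small raw detection matrix. The matrix and all variable names are hypothetical, and the sketch does not reproduce the exact input layout that ChaoSpecies expects; for that, the SpadeR-package details remain the reference.

```r
# Hypothetical species-by-sampling-unit matrix (rows = species, 1 = detected).
inci_raw <- matrix(
  c(1, 0, 0, 0,
    1, 1, 0, 0,
    0, 1, 0, 0,
    1, 1, 1, 0,
    0, 0, 0, 1),
  nrow = 5, byrow = TRUE
)

n_units <- ncol(inci_raw)              # T, the number of sampling units
inci_freq <- rowSums(inci_raw > 0)     # units in which each species was detected

# Frequency counts: Q_j = number of species detected in exactly j units.
counts <- table(inci_freq[inci_freq > 0])
freq_counts <- data.frame(
  j   = as.integer(names(counts)),
  Q_j = as.integer(counts)
)

print(n_units)
print(inci_freq)
print(freq_counts)
```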

ChaoSpecies(data, datatype = c("abundance", "abundance_freq_count",
  "incidence_freq", "incidence_freq_count", "incidence_raw"), k = 10,
  conf = 0.95)

`data`	a matrix/data.frame of species abundances/incidences.
`datatype`	type of input data, "abundance", "abundance_freq_count", "incidence_freq", "incidence_freq_count" or "incidence_raw".
`k`	the cut-off point (default = 10). For abundance data it separates species into "abundant" and "rare" groups for the ACE estimator; for incidence data it separates species into "frequent" and "infrequent" groups for the ICE estimator.
`conf`	a positive number ≤ 1 specifying the confidence level of the confidence intervals (default = 0.95).
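As a rough illustration of what the cut-off k does for abundance data, the sketch below splits a hypothetical abundance vector into rare and abundant groups. It assumes the cut-off is inclusive (species with at most k individuals are rare), which is the usual ACE convention; the vector and the names used here are made up for the example.

```r
# Hypothetical abundance vector: one entry per observed species.
abundance <- c(1, 1, 2, 3, 5, 8, 10, 11, 25, 40)
k <- 10  # cut-off point, same as the default

# Assumption: species with 1..k individuals are "rare", the rest "abundant".
rare     <- abundance[abundance > 0 & abundance <= k]
abundant <- abundance[abundance > k]

group_summary <- c(
  n_rare = sum(rare),        # individuals in the rare group
  D_rare = length(rare),     # species in the rare group
  n_abun = sum(abundant),    # individuals in the abundant group
  D_abun = length(abundant)  # species in the abundant group
)
print(group_summary)
```

Only the rare group feeds the estimate of undetected species, so a larger k pulls more species into that calculation.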

A list of three objects:

$Basic_data_information and $Rare_species_group (or $Infreq_species_group for incidence data), which summarize the data.

$Species_table, a table of species richness estimates with their standard errors and confidence intervals.

Chao, A., and Chiu, C. H. (2012). Estimation of species richness and shared species richness. In N. Balakrishnan (ed.), Methods and Applications of Statistics in the Atmospheric and Earth Sciences, pp. 76-111. Wiley, New York.

Chao, A., and Chiu, C. H. (2016). Nonparametric estimation and comparison of species richness. Wiley Online Reference in the Life Science. In: eLS. John Wiley and Sons, Ltd: Chichester. DOI: 10.1002/9780470015902.a0026329.

Chao, A., and Chiu, C. H. (2016). Species richness: estimation and comparison. Wiley StatsRef: Statistics Reference Online. 1-26.

Chiu, C. H., Wang Y. T., Walther B. A. and Chao A. (2014). An improved non-parametric lower bound of species richness via the Good-Turing frequency formulas. Biometrics, 70, 671-682.

Gotelli, N. J. and Chao, A. (2013). Measuring and estimating species richness, species diversity, and biotic similarity from sampling data. Encyclopedia of Biodiversity, 2nd Edition, Vol. 5, pp. 195-211, Waltham, MA.

library(SpadeR)
data(ChaoSpeciesData)
# Type (1) abundance data
ChaoSpecies(ChaoSpeciesData$Abu, datatype = "abundance", k = 10, conf = 0.95)
# Type (1A) abundance-frequency counts data
ChaoSpecies(ChaoSpeciesData$Abu_count, datatype = "abundance_freq_count", k = 10, conf = 0.95)
# Type (2) incidence-frequency data
ChaoSpecies(ChaoSpeciesData$Inci, datatype = "incidence_freq", k = 10, conf = 0.95)
# Type (2A) incidence-frequency counts data
ChaoSpecies(ChaoSpeciesData$Inci_count, datatype = "incidence_freq_count", k = 10, conf = 0.95)
# Type (2B) incidence-raw data
ChaoSpecies(ChaoSpeciesData$Inci_raw, datatype = "incidence_raw", k = 10, conf = 0.95)

(1) BASIC DATA INFORMATION:

                                         Variable Value
    Sample size                                 n  1996
    Number of observed species                  D    25
    Coverage estimate for entire dataset        C 0.998
    CV for entire dataset                      CV 1.916
    Cut-off point                               k    10

                                                      Variable Value
    Number of observed individuals for rare group       n_rare    53
    Number of observed species for rare group           D_rare    11
    Estimate of the sample coverage for rare group      C_rare 0.943
    Estimate of CV for rare group in ACE               CV_rare 0.629
    Estimate of CV1 for rare group in ACE-1           CV1_rare  0.74
    Number of observed individuals for abundant group   n_abun  1943
    Number of observed species for abundant group       D_abun    14

NULL


(2) SPECIES RICHNESS ESTIMATORS TABLE:

                              Estimate  s.e. 95%Lower 95%Upper
    Homogeneous Model           25.660 0.954   25.082   30.295
    Homogeneous (MLE)           25.000 0.975   25.000   28.500
    Chao1 (Chao, 1984)          27.249 3.394   25.266   44.030
    Chao1-bc                    25.999 1.817   25.094   35.673
    iChao1 (Chiu et al. 2014)   27.249 3.394   25.266   44.030
    ACE (Chao & Lee, 1992)      26.920 2.367   25.292   37.639
    ACE-1 (Chao & Lee, 1992)    27.399 3.163   25.336   42.153
    1st order jackknife         27.998 2.449   25.739   37.171
    2nd order jackknife         28.998 4.240   25.730   46.915


(3) DESCRIPTION OF ESTIMATORS/MODELS:

Homogeneous Model: This model assumes that all species have the same incidence or detection probabilities. See Eq. (3.2) of Lee and Chao (1994) or Eq. (12a) in Chao and Chiu (2016b).

Chao2 (Chao, 1987): This approach uses the frequencies of uniques and duplicates to estimate the number of undetected species; see Chao (1987) or Eq. (11a) in Chao and Chiu (2016b).
     
Chao2-bc: A bias-corrected form for the Chao2 estimator; see Chao (2005).
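The Chao2 and Chao2-bc descriptions can be made concrete with a short base-R sketch. It uses the textbook forms of the two estimators and the incidence-frequency counts printed in the example output with D = 76 and T = 18; the function names are hypothetical, and the package's internal code may differ in details such as the handling of Q2 = 0.

```r
# Incidence-frequency counts Q1..Q10 copied from the example output (D = 76, T = 18).
Q <- c(43, 16, 8, 6, 0, 2, 1, 0, 0, 0)
n_units <- 18      # T, number of sampling units
D <- sum(Q)        # number of observed species

# Classic Chao2: uniques (Q1) and duplicates (Q2) estimate the undetected species.
chao2 <- function(D, Q1, Q2, n_units) {
  stopifnot(Q2 > 0)  # this sketch does not cover the Q2 = 0 case
  D + (n_units - 1) / n_units * Q1^2 / (2 * Q2)
}

# Bias-corrected form, which remains defined when Q2 = 0.
chao2_bc <- function(D, Q1, Q2, n_units) {
  D + (n_units - 1) / n_units * Q1 * (Q1 - 1) / (2 * (Q2 + 1))
}

chao2_est    <- chao2(D, Q[1], Q[2], n_units)
chao2_bc_est <- chao2_bc(D, Q[1], Q[2], n_units)
print(round(c(Chao2 = chao2_est, Chao2_bc = chao2_bc_est), 3))
```

Rounded to three decimals, both values agree with the Chao2 and Chao2-bc rows of that output (130.571 and 126.167).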
  
iChao2: An improved Chao2 estimator; see Chiu et al. (2014).

ICE (Incidence-based Coverage Estimator): A non-parametric estimator originally proposed by Lee and Chao (1994) in the context of capture-recapture data analysis. The observed species are separated as frequent and infrequent species groups; only data in the infrequent group are used to estimate the number of undetected species. The estimated CV for species in the infrequent group characterizes the degree of heterogeneity among species incidence probabilities. See Eq. (12b) of Chao and Chiu (2016b), which is an improved version of Eq. (3.18) in Lee and Chao (1994). This model is also called Model(h) in capture-recapture literature where h denotes "heterogeneity".
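The steps in the ICE description (split by the cut-off, estimate coverage and CV from the infrequent group, then inflate the count) can be sketched in base R as follows. The formulas are the commonly published ones, checked here only against the example output with D = 76 and T = 18; variable names are hypothetical and the package may treat some cases, such as raw incidence data, differently.

```r
# Incidence-frequency counts Q1..Q10 from the example output (D = 76, T = 18).
Q <- c(43, 16, 8, 6, 0, 2, 1, 0, 0, 0)
n_units <- 18
k <- 10

j <- seq_along(Q)
infreq <- j <= k                       # frequencies 1..k form the infrequent group
D_freq   <- 0                          # no species exceeds k in this example
D_infreq <- sum(Q[infreq])             # species in the infrequent group
U_infreq <- sum(j[infreq] * Q[infreq]) # incidences in the infrequent group

# Estimated sample coverage of the infrequent group.
a <- (n_units - 1) * Q[1]
C_infreq <- 1 - Q[1] / U_infreq * a / (a + 2 * Q[2])

# Squared CV of incidence probabilities, truncated at zero.
gamma2 <- max(
  D_infreq / C_infreq * n_units / (n_units - 1) *
    sum(j[infreq] * (j[infreq] - 1) * Q[infreq]) / (U_infreq * (U_infreq - 1)) - 1,
  0
)

homogeneous <- D_freq + D_infreq / C_infreq          # ICE with CV set to zero
ice         <- homogeneous + Q[1] / C_infreq * gamma2

print(round(c(C_infreq = C_infreq, CV_infreq = sqrt(gamma2),
              Homogeneous = homogeneous, ICE = ice), 3))
```

With these counts the sketch gives a coverage of about 0.71, a CV of about 0.662, and estimates of about 107.060 (homogeneous) and 133.595 (ICE), in line with the printed table for the frequency-count example.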

ICE-1: A modified ICE for highly-heterogeneous cases.

1st order jackknife: It uses the frequency of uniques to estimate the number of undetected species; see Burnham and Overton (1978).

2nd order jackknife: It uses the frequencies of uniques and duplicates to estimate the number of undetected species; see Burnham and Overton (1978).
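For incidence data, the two jackknife estimators reduce to simple expressions in the uniques Q1, the duplicates Q2 and the number of sampling units T. The sketch below uses the standard Burnham and Overton forms with the counts from the example output where D = 76 and T = 18; the variable names are illustrative only.

```r
# Values copied from the example output (D = 76, T = 18, Q1 = 43, Q2 = 16).
D <- 76
n_units <- 18
Q1 <- 43
Q2 <- 16

# First-order jackknife: uses only the uniques.
jack1 <- D + (n_units - 1) / n_units * Q1

# Second-order jackknife: uses uniques and duplicates.
jack2 <- D +
  (2 * n_units - 3) / n_units * Q1 -
  (n_units - 2)^2 / (n_units * (n_units - 1)) * Q2

print(round(c(jackknife1 = jack1, jackknife2 = jack2), 3))
```

The rounded results, 116.611 and 141.448, agree with the jackknife rows of that output.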

95% Confidence interval: A log-transformation is used for all estimators so that the lower bound of the resulting interval is at least the number of observed species. See Chao (1987).
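One common way to build such an interval is to treat the estimated number of undetected species (estimate minus D) as log-normal, which keeps the lower bound above D. The following base-R sketch assumes that construction; `log_ci` is a hypothetical helper, and it does not cover estimates equal to D, which the package evidently handles separately.

```r
# Log-transformed confidence interval for a richness estimate.
# est: point estimate, se: its standard error, D: observed species.
log_ci <- function(est, se, D, conf = 0.95) {
  stopifnot(est > D, se > 0, conf > 0, conf < 1)
  z <- qnorm(1 - (1 - conf) / 2)
  undetected <- est - D
  # Multiplicative factor applied to the estimated number of undetected species.
  R <- exp(z * sqrt(log(1 + (se / undetected)^2)))
  c(lower = D + undetected / R, upper = D + undetected * R)
}

# Chao2 row of the example output with D = 76: estimate 130.571, s.e. 22.753.
ci <- log_ci(est = 130.571, se = 22.753, D = 76)
print(round(ci, 2))
```

Using the rounded estimate and standard error, the bounds come out close to the printed 100.899 and 195.605; small differences in the last digits are expected because the inputs are rounded.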

(1) BASIC DATA INFORMATION:

                                         Variable Value
    Sample size                                 n  1008
    Number of observed species                  D   188
    Coverage estimate for entire dataset        C  0.94
    CV for entire dataset                      CV 1.567
    Cut-off point                               k    10

                                                      Variable Value
    Number of observed individuals for rare group       n_rare   515
    Number of observed species for rare group           D_rare   167
    Estimate of the sample coverage for rare group      C_rare 0.882
    Estimate of CV for rare group in ACE               CV_rare 0.715
    Estimate of CV1 for rare group in ACE-1           CV1_rare 0.891
    Number of observed individuals for abundant group   n_abun   493
    Number of observed species for abundant group       D_abun    21

NULL


(2) SPECIES RICHNESS ESTIMATORS TABLE:

                              Estimate   s.e. 95%Lower 95%Upper
    Homogeneous Model          210.438  6.323  201.053  226.572
    Homogeneous (MLE)          188.910  0.969  188.165  193.011
    Chao1 (Chao, 1984)         241.104 17.849  215.967  288.836
    Chao1-bc                   238.783 17.099  214.716  284.530
    iChao1 (Chiu et al. 2014)  254.136 10.632  236.358  278.448
    ACE (Chao & Lee, 1992)     245.828 15.195  222.850  283.957
    ACE-1 (Chao & Lee, 1992)   265.366 22.736  232.012  323.998
    1st order jackknife        248.939 11.037  230.852  274.661
    2nd order jackknife        274.923 19.105  244.787  321.050



(1) BASIC DATA INFORMATION:

                                         Variable Value
    Number of observed species                  D    34
    Number of sampling units                    T   121
    Total number of incidences                  U   461
    Coverage estimate for entire dataset        C 0.994
    CV for entire dataset                      CV 1.162

                                                      Variable Value
    Cut-off point                                            k    10
    Total number of incidences in infrequent group    U_infreq   115
    Number of observed species for infrequent group   D_infreq    23
    Estimated sample coverage for infrequent group    C_infreq 0.974
    Estimated CV for infrequent group in ICE         CV_infreq 0.384
    Estimated CV1 for infrequent group in ICE-1     CV1_infreq 0.412
    Number of observed species for frequent group       D_freq    11

                           Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8 Q9 Q10
    Incidence freq. counts  3  2  3  3  1  5  1  1  3   1


(2) SPECIES RICHNESS ESTIMATORS TABLE:

                              Estimate  s.e. 95%Lower 95%Upper
    Homogeneous Model           34.609 0.880   34.076   38.878
    Chao2 (Chao, 1987)          36.231 3.370   34.263   52.900
    Chao2-bc                    34.992 1.805   34.093   44.606
    iChao2 (Chiu et al. 2014)   36.723 2.403   34.615   46.053
    ICE (Lee & Chao, 1994)      35.064 1.371   34.153   41.398
    ICE-1 (Lee & Chao, 1994)    35.132 1.473   34.161   41.966
    1st order jackknife         36.975 2.434   34.731   46.103
    2nd order jackknife         37.975 4.193   34.730   55.652



(1) BASIC DATA INFORMATION:

                                         Variable Value
    Number of observed species                  D    76
    Number of sampling units                    T    18
    Total number of incidences                  U   142
    Coverage estimate for entire dataset        C  0.71
    CV for entire dataset                      CV 0.654

                                                      Variable Value
    Cut-off point                                            k    10
    Total number of incidences in infrequent group    U_infreq   142
    Number of observed species for infrequent group   D_infreq    76
    Estimated sample coverage for infrequent group    C_infreq  0.71
    Estimated CV for infrequent group in ICE         CV_infreq 0.662
    Estimated CV1 for infrequent group in ICE-1     CV1_infreq 0.891
    Number of observed species for frequent group       D_freq     0

                           Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8 Q9 Q10
    Incidence freq. counts 43 16  8  6  0  2  1  0  0   0


(2) SPECIES RICHNESS ESTIMATORS TABLE:

                              Estimate   s.e. 95%Lower 95%Upper
    Homogeneous Model          107.060 10.201   92.587  134.161
    Chao2 (Chao, 1987)         130.571 22.753  100.899  195.605
    Chao2-bc                   126.167 20.718   99.048  185.193
    iChao2 (Chiu et al. 2014)  139.901 15.602  115.874  178.408
    ICE (Lee & Chao, 1994)     133.595 20.793  105.003  190.374
    ICE-1 (Lee & Chao, 1994)   155.184 34.275  111.147  254.396
    1st order jackknife        116.611  8.886  102.581  138.048
    2nd order jackknife        141.448 14.872  118.160  177.598



(1) BASIC DATA INFORMATION:

                                         Variable Value
    Number of observed species                  D    76
    Number of sampling units                    T    18
    Total number of incidences                  U   142
    Coverage estimate for entire dataset        C  0.71
    CV for entire dataset                      CV 0.654

                                                      Variable Value
    Cut-off point                                            k    10
    Total number of incidences in infrequent group    U_infreq   142
    Number of observed species for infrequent group   D_infreq    76
    Estimated sample coverage for infrequent group    C_infreq  0.71
    Estimated CV for infrequent group in ICE         CV_infreq 0.662
    Estimated CV1 for infrequent group in ICE-1     CV1_infreq 0.891
    Number of observed species for frequent group       D_freq     0

                           Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8 Q9 Q10
    Incidence freq. counts 43 16  8  6  0  2  1  0  0   0


(2) SPECIES RICHNESS ESTIMATORS TABLE:

                              Estimate    s.e. 95%Lower 95%Upper
    Homogeneous Model          106.946  10.179   92.511  133.999
    Chao2 (Chao, 1987)         130.571  22.753  100.899  195.605
    Chao2-bc                   126.167  20.718   99.048  185.193
    iChao2 (Chiu et al. 2014)  139.901  15.602  115.874  178.408
    ICE (Lee & Chao, 1994)     133.661  20.827  105.028  190.539
    ICE-1 (Lee & Chao, 1994)   155.450  34.410  111.250  255.071
    1st order jackknife        116.611   8.886  102.581  138.048
    2nd order jackknife        141.448  14.872  118.160  177.598

