DiveMaster: DiveMaster


DiveMaster R Documentation

DiveMaster

Description

Implements the DivE diversity estimator.

Usage

DiveMaster(models, init.params, param.ranges, main.samp,
           tot.pop=(100*(DivSampleNum(main.samp,2)[1])), numit=10^5,
           varleft=1e-8, subsizes=6, dssamps=list(), nrf=1, minrarefac=1,
           NResamples=1000, minplaus=10,
           precision.lv=c(0.0001, 0.005, 0.005), plaus.pen=500,
           crit.wts=c(1.0, 1.0, 1.0, 1.0), fitloops=2, numpred=5)

Arguments

models

list of models; each model is written as a function: function(x, params) { with(as.list(params), <function of params>)}. Examples are given in the ModelSet data file as part of the DivE package.
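As a minimal sketch of the function form described above, the following defines a hypothetical two-parameter saturating model. The name sat_model, the parameter names a1 and a2, and the numeric values are assumptions chosen for illustration; they are not taken from the ModelSet data file.

```r
# Hypothetical saturating model in the required form (not from ModelSet).
# a1 and a2 are illustrative parameter names; in a real call they would
# have to match the column names of the init.params and param.ranges matrices.
sat_model <- function(x, params) {
  with(as.list(params), a1 * x / (a2 + x))
}

# Evaluate the model at three sample sizes with made-up parameter values.
pars <- c(a1 = 100, a2 = 50)
pred <- sat_model(c(10, 50, 1000), pars)
pred
```

Because with() looks up a1 and a2 by name, the order of the elements in params does not matter, but the names do.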

init.params

list of matrices of initial seed model parameters. For each matrix, each row represents a given parameter set; each column represents a parameter value. Column names must match parameter names (params) in the corresponding model in the list models. Examples are given in the ParamSeeds data file as part of the DivE package.

param.ranges

list of matrices of lower and upper model parameter bounds. Used for the modFit function. The first and second rows correspond to the lower and upper bounds respectively; each column represents a parameter value. Column names must match parameter names (params) in the corresponding model in the list models. Examples are given in the ParamRanges data file as part of the DivE package.
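The layout of one seed matrix and one bounds matrix might look like the sketch below. The parameter names a1 and a2 and all values are hypothetical; the two checks at the end are a suggested sanity test, not something DiveMaster is documented to perform.

```r
# Seed matrix: one row per parameter set, one named column per parameter.
seeds <- matrix(c(100, 50,
                  200, 80),
                nrow = 2, byrow = TRUE,
                dimnames = list(NULL, c("a1", "a2")))

# Bounds matrix: row 1 holds lower bounds, row 2 holds upper bounds.
ranges <- matrix(c(0,   0,
                   1e6, 1e6),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(NULL, c("a1", "a2")))

# Column names should agree, and every seed should lie within its bounds.
names_match <- identical(colnames(seeds), colnames(ranges))
in_bounds <- all(t(seeds) >= ranges[1, ] & t(seeds) <= ranges[2, ])
```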

main.samp

the main sample, either as a 2-column data.frame (species ID, count of species), or a vector of species IDs.
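The two accepted forms carry the same information. The sketch below, using made-up data, expands a 2-column data.frame into the equivalent vector of species IDs; how DiveMaster converts between the forms internally is not shown here.

```r
# Made-up sample: three species with counts 3, 1 and 2.
samp_df <- data.frame(species = c("A", "B", "C"), count = c(3, 1, 2))

# Equivalent vector form: one entry per observed individual.
samp_vec <- rep(as.character(samp_df$species), times = samp_df$count)

# Tabulating the vector recovers the counts.
table(samp_vec)
```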

tot.pop

total population (integer); default set to 100x the main.samp size.

numit

control argument passed to optimisation routine; the maximum number of iterations that modFit will perform. See modFit for details.

varleft

control argument passed to optimisation routine; see modFit for details.

subsizes

either number of nested subsamples (integer, must be 2 or greater), or a vector of nested sample lengths. If the former, then the vector of sample lengths will be created using the DivSampleNum function.

dssamps

list of user-specified DivSubsamples objects containing rarefaction data. The length of each component vector of each object in the list must correspond to the vector of nested sample lengths (as defined by the user in subsizes).

nrf

difference between lengths of successive rarefaction datapoints.

minrarefac

minimum rarefaction x-axis value. This argument is not used if a list of DivSubsamples objects is specified in dssamps.
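One way to picture how nrf and minrarefac interact is as an evenly spaced grid of rarefaction x-values. The sketch below assumes the grid is built like a seq() call; that construction is an assumption, and the upper limit of 40 is borrowed from the maxrarefac argument of the DivSubsamples call in the Examples.

```r
# Assumed grid construction (illustrative, not the package's internal code).
minrarefac <- 1
maxrarefac <- 40
nrf <- 2

# Successive rarefaction x-values differ by nrf, starting at minrarefac.
rarefac_x <- seq(minrarefac, maxrarefac, by = nrf)
rarefac_x
```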

NResamples

number of resamples used to calculate the rarefaction data. This parameter is not used if a list of DivSubsamples objects is specified in dssamps.

minplaus

lower x-axis bound for plausibility check.

precision.lv

vector of precision levels, one for each of the following criteria: 1. Discrepancy: mean percentage error between the rarefaction data points and the model prediction. 2. Sample accuracy: percentage error between the observed diversity of the full rarefaction data and the diversity of the full data estimated from the subsample. 3. Local similarity. The score for each criterion is defined as 1 + (multiples of bin sizes).
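One possible reading of "1 + (multiples of bin sizes)" is that each error is divided by its precision level and the number of whole bins is added to 1. The sketch below implements that reading only; the helper bin_score and the error values are hypothetical, and the package's actual scoring rule may differ in detail.

```r
# Hypothetical helper: score = 1 + number of whole bins contained in the error.
bin_score <- function(err, bin) {
  1 + floor(err / bin)
}

# Default precision levels from the Usage section.
precision.lv <- c(0.0001, 0.005, 0.005)

# Made-up errors for discrepancy, sample accuracy and local similarity.
errs <- c(0.00025, 0.012, 0.001)

scores <- bin_score(errs, precision.lv)
scores
```

Under this reading, a lower score indicates a better fit, and an error smaller than one bin scores exactly 1.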

plaus.pen

penalty score for breaking the plausibility criterion: a model fit should be monotonically increasing and should have a slowing rate of species accumulation.
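The plausibility criterion can be illustrated with finite differences. The sketch below is an assumed check, not the package's implementation: the helper is_plausible is hypothetical, it requires predictions on an evenly spaced x grid, and it treats a flat segment as acceptable.

```r
# Hypothetical check: the curve must not decrease and must not accelerate.
is_plausible <- function(y, tol = 1e-12) {
  d1 <- diff(y)    # first differences: species gained per step
  d2 <- diff(d1)   # second differences: change in the accumulation rate
  all(d1 >= -tol) && all(d2 <= tol)
}

x <- seq(10, 1000, by = 10)
ok_curve  <- is_plausible(100 * x / (50 + x))  # saturating: increasing, slowing
bad_curve <- is_plausible(x^2)                 # accelerating accumulation
```

Here ok_curve is TRUE and bad_curve is FALSE; a fit like the second one would incur the plaus.pen penalty.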

crit.wts

vector of weights of each of the four scoring criteria – fit, accuracy, similarity, plausibility. Default is c(1,1,1,1).

fitloops

number of fitting rounds performed for each model. In each round of fitting, the initial seed parameter values for each model will be the fitted parameters of the previous fitting run. This parameter has a significant impact on the computational time. The ‘sweet spot’ is 2.

numpred

number of top-scoring models used for diversity prediction. Default is 5.

Details

This is the master function of the DivE estimator. The default operation is a combination of four steps. 1. Generate a list of nested sample lengths from the main sample. 2. For each nested subsample, generate a vector of rarefaction data and their associated mean species diversity. 3. Fit a set of models to the generated data. 4. Evaluate the fits according to the DivE diversity estimation methodology and compare the scores across models and fitting criteria.
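Step 2 can be pictured with a small base R sketch. It is an illustration under stated assumptions, not the DivSubsamples implementation: the helper rarefy_mean is hypothetical, the sample is made up, and individuals are drawn without replacement.

```r
set.seed(1)

# Made-up sample of 20 individuals from four species.
samp <- rep(c("A", "B", "C", "D"), times = c(10, 5, 3, 2))

# Hypothetical helper: for each rarefaction size n, draw n individuals
# without replacement n_resamples times and average the species count.
rarefy_mean <- function(samp, sizes, n_resamples = 200) {
  sapply(sizes, function(n) {
    mean(replicate(n_resamples, length(unique(sample(samp, n)))))
  })
}

sizes <- seq(1, length(samp), by = 1)
rare_curve <- rarefy_mean(samp, sizes)
```

The resulting curve starts at 1 species for a single individual and reaches the observed 4 species at the full sample size; the models in step 3 are fitted to curves of this kind.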

A list of DiveMaster objects, each representing the fits to different sets of models, can be combined into a single DiveMaster object using the CombDM function. This is useful when running the DivE estimator with the full set of 58 models in a single run is not possible.

One can estimate the diversity for a given population using the PopDiversity function, where the arguments are the DiveMaster object and the population size respectively.

Value

A list of objects:

model.score

a matrix of aggregated model scores

fmm

a list of fitsingMod objects corresponding to the list of fitted models

ssm

a matrix of scores of the fit of the models corresponding to the list of fitted models

estimate

the estimate of species richness (number of species/classes or diversity) at population size tot.pop. This is the geometric mean of the predictions of the models with the top five scores

lower_estimate

as per estimate value, but the lowest prediction amongst the models having the top-five scores

upper_estimate

as per estimate value, but the highest prediction amongst the models having the top-five scores
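The relationship between estimate, lower_estimate and upper_estimate can be sketched as follows. The five predictions are made-up numbers standing in for the outputs of the top-scoring models at tot.pop; the code mirrors the descriptions above rather than the package source.

```r
# Made-up predictions from five top-scoring models at population size tot.pop.
preds <- c(1200, 1500, 1350, 1800, 1100)

estimate       <- exp(mean(log(preds)))  # geometric mean of the predictions
lower_estimate <- min(preds)             # lowest prediction among the top models
upper_estimate <- max(preds)             # highest prediction among the top models

c(lower = lower_estimate, estimate = estimate, upper = upper_estimate)
```

The geometric mean always lies between the lowest and highest prediction, so estimate falls within the reported range.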

models

list of original input models

m

number of top-scoring models used for the diversity estimate

Author(s)

Daniel J. Laydon, Aaron Sim, Charles R.M. Bangham, Becca Asquith

References

Laydon, D. J., Melamed, A., Sim, A., Gillet, N. A., Sim, K., Darko, S., Kroll, S., Douek, D. C., Price, D., Bangham, C. R. M., Asquith, B., Quantification of HTLV-1 clonality and TCR diversity, PLOS Comput. Biol. 2014

See Also

FitSingleMod, ScoreSingleMod

Examples

require(DivE)
data(Bact1)
data(ModelSet)
data(ParamSeeds)
data(ParamRanges)

testmodels <- list()
testmeta <- list()
paramranges <- list()

# Choose a single model 
testmodels <- c(testmodels, ModelSet[1])
# testmeta[[1]] <- ParamSeeds[[1]]  # Commented out for the sake of brevity
testmeta[[1]] <- matrix(c(0.9451638, 0.007428265, 0.9938149, 1.0147441, 0.009543598, 0.9870419),
                        nrow=2, byrow=TRUE, dimnames=list(c(), c("a1", "a2", "a3"))) # Example seeds
paramranges[[1]] <- ParamRanges[[1]]


# Create DivSubsamples object (NB: For quick illustration only -- not default parameters)
dss_1 <- DivSubsamples(Bact1, nrf=2, minrarefac=1, maxrarefac=40, NResamples=5)
dss_2 <- DivSubsamples(Bact1, nrf=2, minrarefac=1, maxrarefac=65, NResamples=5)
dss <- list(dss_2, dss_1)

# Implement the function (NB: For quick illustration only -- not default parameters)
out <- DiveMaster(models=testmodels, init.params=testmeta, param.ranges=paramranges,
                  main.samp=Bact1, subsizes=c(65, 40), NResamples=5, fitloops=1,
                  dssamps=dss, numit=2, varleft=10)

# DiveMaster Outputs
out
out$estimate

out$fmm$logistic

out$fmm$logistic$global

out$ssm

summary(out)

## Combining two DiveMaster objects (assuming a second object 'out2'):
# out3 <- CombDM(list(out, out2))

## To calculate the diversity for a different population size
# PopDiversity(dm=out, popsize=10^5, TopX=1)


DivE documentation built on Oct. 14, 2023, 5:08 p.m.