Calculate probability of ancestral states below a speciation event, given probabilities of the states on each descendant branch

Description

Given parameters for the relative weight of different geographic range inheritance scenarios at cladogenesis (speciation) events, this function calculates the probability of each possible ancestral state given the probabilities of each possible combination of tip states.

Usage

  rcpp_calc_anclikes_sp(Rcpp_leftprobs, Rcpp_rightprobs, l,
    s = 1, v = 1, j = 0, y = 1, dmat = NULL,
    maxent01s = NULL, maxent01v = NULL, maxent01j = NULL,
    maxent01y = NULL,
    max_minsize_as_function_of_ancsize = NULL,
    Rsp_rowsums = rep(1, length(Rcpp_leftprobs)),
    printmat = FALSE)

Arguments

Rcpp_leftprobs

Probabilities of the states at the base of the left descendant branch

Rcpp_rightprobs

Probabilities of the states at the base of the right descendant branch

l

List of state indices (0-based)

s

Relative weight of sympatric "subset" speciation. Default s=1 mimics LAGRANGE model.

v

Relative weight of vicariant speciation. Default v=1 mimics LAGRANGE model.

j

Relative weight of "founder event speciation"/jump speciation. Default j=0 mimics LAGRANGE model.

y

Relative weight of fully sympatric speciation (range-copying). Default y=1 mimics LAGRANGE model.

dmat

If given, a square matrix (numareas x numareas) giving multipliers for the probability of each dispersal event between areas. Default is NULL, which sets every cell of the dmat matrix to 1. Users may construct their own parameterized dmat (for example, making dmat a function of distance) for inclusion in ML or Bayesian analyses.

maxent01s

Matrix giving the relative weight of each possible descendant rangesize for the smaller range, for a given ancestral rangesize, for a subset-sympatric speciation event. Default is NULL, which means the script will set up the LAGRANGE model (one descendant always has range size 1).

maxent01v

Matrix giving the relative weight of each possible descendant rangesize for the smaller range, for a given ancestral rangesize, for a vicariance speciation event. Default is NULL, which means the script will set up the LAGRANGE model (one descendant always has range size 1).

maxent01j

Matrix giving the relative weight of each possible descendant rangesize for the smaller range, for a given ancestral rangesize, for a founder-event speciation event. Default is NULL, which means the script will set up the LAGRANGE model (one descendant always has range size 1).

maxent01y

Matrix giving the relative weight of each possible descendant rangesize for the smaller range, for a given ancestral rangesize, for a full-sympatric (range-copying) speciation event. Default is NULL, which means the script will set up the LAGRANGE model (one descendant always has range size 1).

max_minsize_as_function_of_ancsize

If given, any state with a range larger than this value will be given a probability of zero (for the branch with the smaller rangesize). This means that not every possible combination of ranges has to be checked; checking all of them can get very slow for large state spaces.

Rsp_rowsums

A vector of size (numstates) giving the sum of the relative probabilities of each combination of descendant states, assuming the probabilities of the left- and right-states are all equal (set to 1). This is thus the sum of the weights, and dividing by this normalization vector means that each row of the speciation probability matrix will sum to 1. The default assumes the weights sum to 1, but this is not usually the case. Rsp_rowsums need only be calculated once per tree+model combination, stored, and then re-used for each node in the tree, yielding significant time savings.

printmat

Should the probability matrix output be printed to screen? (useful for debugging, but can be dramatically slow in R.app for some reason for even moderate numbers of states; perhaps overrunning the line length...)
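
The normalization described for Rsp_rowsums can be sketched in base R as follows. This is only an illustration under stated assumptions: the matrix W and its values are made up, and the code does not reproduce the package's internal calculation.

```r
# Toy weight matrix (hypothetical values): rows are ancestral states,
# columns are (left, right) descendant-pair events.
W <- matrix(c(1, 0, 0, 0,
              1, 1, 1, 0,
              1, 1, 1, 1),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("anc1", "anc2", "anc3"), paste0("event", 1:4)))

# Row sums of the weights: computed once, then stored and re-used at each node
rowsums <- rowSums(W)

# Dividing each row by its own sum gives conditional probabilities.
# (The length-3 vector is recycled down the columns, so W[i, j] is divided
# by rowsums[i].)
P <- W / rowsums
print(rowSums(P))
```

Every row of P sums to 1, which is the property the Rsp_rowsums vector is meant to guarantee.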

Details

The Python/C++ program LAGRANGE (Ree & Smith 2008) gives a fixed equal probability to each range-inheritance scenario it allows:

(1) sympatric speciation with 1 area (e.g. A –> A,A);
(2) sympatric speciation where one species inherits the ancestral range, and the other inherits a 1-area subset of the ancestral range (e.g. ABC –> ABC,B);
(3) vicariant speciation with one daughter occupying an area of size 1 (e.g. ABCD –> ACD,B).

For example, if the ancestral range is ABC, the possible daughters are:

(Left, Right)

Vicariance: A,BC AB,C AC,B BC,A C,AB B,AC

Sympatric subset: A,ABC B,ABC C,ABC ABC,A ABC,B ABC,C

There are 12 possibilities, so LAGRANGE would give each a probability of 1/12, conditional on the ancestor having range ABC. All other imaginable scenarios are given probability 0 – e.g., sympatric speciation of a widespread range (ABC –> ABC,ABC), or jump dispersal leading to founder-event speciation (ABC –> ABC,D).
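
As a check on the count above, the following base R sketch enumerates the daughter pairs for ancestor ABC under the LAGRANGE rules and assigns each an equal probability. The helper name label and the events data frame are illustrative assumptions, not part of cladoRcpp.

```r
# Enumerate LAGRANGE-style (Left, Right) daughter pairs for ancestral range ABC.
anc <- c("A", "B", "C")
label <- function(x) paste(sort(x), collapse = "")  # hypothetical helper

pairs <- list()
for (a in anc) {
  rest <- setdiff(anc, a)
  # Vicariance: one daughter gets a single area, the other gets the remainder
  pairs[[length(pairs) + 1]] <- c(type = "vicariance", left = a, right = label(rest))
  pairs[[length(pairs) + 1]] <- c(type = "vicariance", left = label(rest), right = a)
  # Sympatric subset: one daughter gets a single area, the other keeps the full range
  pairs[[length(pairs) + 1]] <- c(type = "subset", left = a, right = label(anc))
  pairs[[length(pairs) + 1]] <- c(type = "subset", left = label(anc), right = a)
}

events <- as.data.frame(do.call(rbind, pairs), stringsAsFactors = FALSE)
events$prob <- 1 / nrow(events)  # equal probability for every allowed event
print(events)
```

The table has 12 rows (6 vicariance, 6 sympatric subset), each with probability 1/12.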

In BioGeoBEARS, the relative probability (or weight) of these categories is set by the s (sympatric-subset), v (vicariance), j (jump/founder-event), and y (sympatric-range-copying) parameters. These parameters do not have to sum to 1; they just give the relative weight of an event of each type. For example, if s=1, v=1, j=0, y=1, then each allowed sympatric-range-copying, sympatric-subset, and vicariance event is given equal probability (this is the LAGRANGE cladogenesis model).
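
A minimal sketch of how relative weights might turn into per-event probabilities, by dividing each weight by the total weight for one ancestral state. The helper event_probs is hypothetical, and the event counts are assumptions for ancestor ABC in a four-area system (A, B, C, D) with the smaller daughter restricted to range size 1: 6 subset events, 6 vicariance events, 2 jump events (ABC,D and D,ABC), and no range-copying events.

```r
# Hypothetical helper: convert per-event relative weights into probabilities
# for one ancestral state by dividing each weight by the row total.
event_probs <- function(counts, weights) {
  w <- weights[names(counts)]   # per-event weight for each category
  total <- sum(counts * w)      # sum of weights over all events for this ancestor
  w / total                     # probability of one single event of each type
}

# Assumed event counts for ancestor ABC (see the text above)
counts <- c(s = 6, v = 6, j = 2, y = 0)

lagrange  <- event_probs(counts, c(s = 1, v = 1, j = 0,   y = 1))
with_jump <- event_probs(counts, c(s = 1, v = 1, j = 0.5, y = 1))

print(lagrange)   # each s or v event: 1/12; each j event: 0
print(with_jump)  # each s or v event: 1/13; each j event: 0.5/13
```

The y entry is a per-event value too, but since the assumed count of range-copying events is zero it contributes nothing to the total. In both cases, sum(counts * probabilities) equals 1.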

The rcpp_calc_anclikes_sp function gets slow for large state spaces, because every possible combination of states at the Left and Right branches is checked. Even in C++ this gets slow: the (number of states) = 2^(number of areas), and the number of possible combinations of (ancestor, left, right) states is (number of states)*(number of states)*(number of states).
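
The growth described above can be tabulated directly from the two formulas in the text. The variable names are illustrative only.

```r
# Number of states and (ancestor, left, right) combinations for 1 to 10 areas
numareas  <- 1:10
numstates <- 2^numareas   # every subset of areas, including the empty range
combos    <- numstates^3  # ancestor x left x right

size_table <- data.frame(numareas, numstates, combos)
print(size_table)
```

With 10 areas there are 1024 states and 1024^3 (about 1.07 billion) combinations to check.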

Note: the maxent parameters allow the user to specify the probability distribution for different range sizes of the smaller-ranged descendant lineage. The defaults set these parameters so that the LAGRANGE model is implemented (the smaller descendant always has range size 1).

See rcpp_calc_anclikes_sp_COOprobs and rcpp_calc_anclikes_sp_COOweights_faster for successively faster solutions to this problem.

rcpp_calc_anclikes_sp is the byte-compiled version of rcpp_calc_anclikes_sp_prebyte; byte-compiling might make it faster.

For information on byte-compiling, see http://www.r-statistics.com/2012/04/speed-up-your-r-code-using-a-just-in-time-jit-compiler/ and cmpfun in the compiler package.
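
As a small illustration of the pattern, cmpfun from the compiler package takes a function and returns a byte-compiled copy with the same interface. The function slow_sum_prebyte below is a hypothetical stand-in, not part of cladoRcpp.

```r
library(compiler)

# Hypothetical plain-R function standing in for a "_prebyte" version
slow_sum_prebyte <- function(x) {
  total <- 0
  for (xi in x) total <- total + xi
  total
}

# cmpfun() returns a byte-compiled copy that is called the same way
slow_sum <- cmpfun(slow_sum_prebyte)

x <- as.numeric(1:1000)
print(identical(slow_sum(x), slow_sum_prebyte(x)))
```

R 3.4.0 and later byte-compile functions automatically through the JIT compiler, so on current versions an explicit cmpfun call may make little difference.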

Value

prob_ancestral_states The probabilities of the ancestral states.

Author(s)

Nicholas Matzke matzke@berkeley.edu

References

Matzke N (2012). "Founder-event speciation in BioGeoBEARS package dramatically improves likelihoods and alters parameter inference in Dispersal-Extinction-Cladogenesis (DEC) analyses." _Frontiers of Biogeography_, *4*(suppl. 1), pp. 210. ISSN 1948-6596, Poster abstract published in the Conference Program and Abstracts of the International Biogeography Society 6th Biannual Meeting, Miami, Florida. Poster Session P10: Historical and Paleo-Biogeography. Poster 129B. January 11, 2013, <URL: http://phylo.wikidot.com/matzke-2013-international-biogeography-society-poster>.

Ree RH and Smith SA (2008). "Maximum likelihood inference of geographic range evolution by dispersal, local extinction, and cladogenesis." _Systematic Biology_, *57*(1), pp. 4-14. <URL: http://dx.doi.org/10.1080/10635150701883881>, <URL: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=18253896>.

See Also

rcpp_calc_anclikes_sp, rcpp_calc_anclikes_sp_COOprobs, rcpp_calc_anclikes_sp_COOweights_faster

Examples

# For the basic logic of a probabilistic cladogenesis model, see
?rcpp_calc_anclikes_sp

# For examples of running the functions, see the comparison of all functions at:
# ?cladoRcpp
