R/transformPhylo.ll.R

Defines functions transformPhylo.ll

Documented in transformPhylo.ll

#' @title Log-likelhood for models of trait evolution. 
#' @description Fits likelihood models for various models of continuous character evolution.
#' @param y A matrix of trait values.
#' @param phy An object of class \code{phylo} (see \pkg{ape}).
#' @param model The model of trait evolution (see details).
#' @param meserr A vector (or matrix) of measurement error for each tip. This is only applicable to univariate analyses.
#' @param kappa Value of kappa transform.
#' @param lambda Value of lambda transform.
#' @param delta Value of delta transform.
#' @param alpha Value of alpha (OU) transform.
#' @param psi Value of psi transform.
#' @param lambda.sp Speciation rate estimate for the tree.
#' @param branchLabels Branches on which different psi parameters are estimated in the "multipsi" model.
#' @param nodeIDs Integer - ancestral nodes of clades.
#' @param rateType If model="clade", a vector specifying if rate shift occurs in a clade ("clade") or on the single branch leading to a clade ("branch").
#' @param branchRates Numeric vector specifying relative rates for individual branches
#' @param timeRates The rates (from ancient to recent) for the timeSlice model
#' @param splitTime A split time (measured from the present, or most recent species) at which a shift in the rate occurs for the "timeSlice" model
#' @param acdcRate Value of ACDC transform
#' @param cladeRates Numeric vector specifying relative rates for clades or logical to indicate scalar is included in the 'modeslice' model (the scalar value is included in the mode.param argument with the 'modeslice' model).
#' @param covPIC Logical. For multivariate analyses, allow for co-variance between traits rates (TRUE) or no covariance in trait rates (FALSE). If FALSE, only the trait variances not co-variances are used.
#' @param cophenetic.dist a cophenetic distance matrix showing the absolute distance between taxa - only applicable for OU models run on non-ultrmetric trees. If null will be calculated internally, but supplying the data can speed up run time
#' @param vcv.matrix a variance-covariance matrix - only applicable for OU models run on non-ultrmetric trees. If null will be calculated internally, but supplying the data can speed up run time
#' @param mode.order The order of modes for the 'modeslice' model. Any combination of 'BM', 'OU', 'acdc', and 'kappa'
#' @param rate.var Allows rate variation in BM modes in the 'modeslice' model
#' @param mode.param Parameters for the modes of evoluton in the 'modeslice' model
#' @param mu Phylogenetic mean estimate, Mainly for internal use when using the variance-covariance method to calculate likelihood for non-ultrametric trees with the OU model
#' @param sigma.sq Brownian variance estimate, mainly for internal use when using the variance-covariance method to calculate likelihood for non-ultrametric trees with the OU model
#' @details This function fits likelihood models (see below) for continuous character evolution where the parameter values are set a priori. The function returns the log-likihood and the Brownian variance (or variance covariance matrix).
#' \itemize{
#' \item {model="bm"} {Brownian motion (constant rates random walk).}
#' \item {model="kappa"} {fits Pagel's kappa by raising all branch lengths to the power kappa. As kappa approaches zero, trait change becomes focused at branching events. For complete phylogenies, if kappa approaches zero this infers speciational trait change. Default bounds from ~0 - 1.}
#' \item {model="lambda"} {fits Pagel's lambda to estimate phylogenetic signal by multiplying all internal branches of the tree by lambda, leaving tip branches as their original length (root to tip distances are unchanged). Default bounds from ~0 - 1.}
#' \item {model="delta"} {fits Pagel's delta by raising all node depths to the power delta. If delta <1, trait evolution is concentrated early in the tree whereas if delta >1 trait evolution is concentrated towards the tips. Values of delta above one can be difficult to fit reliably. If a nodeIDs is supplied, the model will fit a delta model nested within a clade, with a BM fit to the rest of the tree. Default bounds from ~0 - 5.}
#' \item {model="OU"} {fits an Ornstein-Uhlenbeck model - a random walk with a central tendency proportional to alpha. High values of alpha can be interpreted as evidence of evolutionary constraints, stabilising selection or weak phylogenetic signal. It is often difficult to distinguish among these possibilities. If a nodeIDs is supplied, the model will fit a OU model nested within a clade, with a BM fit to the rest of the tree. For OU models, alternative optimisation are performed with different starting values (1e-8, 0.01, 0.1, 1, 5). Default bounds from ~0 - 10.}
#' \item {model="ACDC"} {fits a model to in which rates can exponentially increased or decrease through time (Blomberg et al. 2003). If the upper bound is < 0, the model is equivalent to the 'Early Burst' model of Harmon et al. 2010. If a nodeIDs is supplied, the model will fit a ACDC model nested within a clade, with a BM fit to the rest of the tree. Default rate parameter bounds from ln(1e-10) ~ ln(20) divided by the root age. Note this process starts on the stem branch leading to the MRCA of the common node, unlike the other methods that start at the common node.}
#' \item {model="trend"} {fits a model in which the expectated mean change through time is non-zero, signifying a directional evolution to a larger or smaller trait value. This model is only appliacble to non-ultrametric trees.}
#' \item {model="psi"} {fits a model to assess to the relative contributions of speciation and gradual evolution to a trait's evolutionary rate (Ingram 2010). Note that the algorithm will automatically estimate speciation and extinction estimates, and will incorporate estimates of 'hidden' speciation if death estimates are greater than 0. }
#' \item {model="multipsi"} {fits a model to assess to the relative contributions of speciation and gradual evolution to a trait's evolutionary rate but allows seperate values of psi fitted to seperate branches (Ingram 2010; Ingram et al. 2016). Note that the algorithm will automatically estimate speciation and extinction estimates, and will incorporate estimates of 'hidden' speciation if death estimates are greater than 0.}
#' \item {model="free"} {fits Mooers et al's free model where each branch has its own rate of trait evolution. This can be a useful exploratory analysis but it is slow due to the number of parameters, particularly for large trees. Default rate parameter bounds from ~0 - 200.}
#' \item {model="clade"} {fits a model where particular clades are a priori hypothesised to have different rates of trait evolution (see O'Meara et al. 2006; Thomas et al. 2006, 2009). Clades are specified using nodeIDs and are defined as the mrca node. Default rate parameter bounds from ~0 - 200.}
#' \item {model="tm1"} {fits "clade" models without any a priori assertion of the location of phenotypic diversification rate shifts. It uses the same AIC approach as the runMedusa function in the geiger package (runMedusa tests for shifts in the rate of lineage diversification). The algorithm first fits a constant-rate Brownian model to the data, it then works iteratively through the phylogeny fitting a two-rate model at each node in turn. Each two-rate model is compared to the constant rate model and the best two-rate model is retained. Keeping the location of this rate shift intact, it then repeats the procedure for a three-rate model and so on. The maximum number of rate shifts can be specified a priori using nSplits. Limits can be applied to the size (species richness) of clades on which to infer new rate shifts using minCladeSize. This can be useful to enable large trees to be handled but should be used cautiously since specifiying a large minimum clade size may result in biologically interesting nested rate shifts being missed. Equally, very small clade sizes may provide poor estimates of rate that may not be informative. Limits on the search can also be placed using restrictNode. This requires a list where each element of the list is a vector of tip names that define monophyletic groups. Rate shifts will not be searched for within any of the defined groups. Default rate parameter bounds from ~0 - 1000.}
#' \item {model="tm2"} {this model is similar to "tm1", however, at each node it assesses the fit of two models. The first model is exactly as per "tm1". The second model infers a rate shift on the single branch descending directly from a node but not on any of the descending branches thereafter. Only the best fitting single-branch or whole clade model is retained for the next iteration. If a single-branch shift is favoured, this infers either that there was a rapid shift in trait value along the stem leading to the crown group, or that the members of the clade have undergone parallel shifts. In either case, this can be considered as a change in mean, though separating a single early shift from a clade-parallel shift is not possible with this method. }
#' \item {model="timeSlice"} {A model in which all branch rates change at a time or times set a priori by the user. If  Default rate parameter bounds from ~0 - 1000. If splitTime=NULL, all 1 Ma (as defined by test Age) intervals from the root of the tree - 10 and the youngest tip + 10 will be included in the search. The +/- 10 Ma age can be modified using the argument boundaryAge. At each stage the best fitting model will be stored, and the search will continue until n shifts, with n shifts defined by nSplits. If a single value or vector is used for splitTime, only these ages are included in the search.}
#' \item {model="modeslice"} {A model in which all branch modes change at a time or times set a priori by the user.}
#' } 
#' @return brownianVariance Brownian variance (or covariance for multiple traits) given the data and phylogeny
#' @return logLikelihood The log-likelihood of the model and data
#' @seealso \code{\link{transformPhylo.ML}},\code{\link{transformPhylo.MCMC}}, \code{\link{transformPhylo.sim}}
#' @references Felsenstein J. 1973. Maximum-likelihood estimation of evolutionary trees from continuous characters. Am. J. Hum. Genet. 25, 471-492.
#' Felsenstein J. 1985. Phylogenies and the comparative method. American Naturalist 125, 1-15.
#' Freckleton RP & Jetz W. 2009. Space versus phylogeny: disentangling phylogenetic and spatial signals in comparative data. Proc. Roy. Soc. B 276, 21-30. 
#' @references Ingram T. 2011. Speciation along a depth gradient in a marine adaptive radiation. Proc. Roy. Soc. B. 278, 613-618.
#' @references Ingram T,  Harrison AD, Mahler L, Castaneda MdR, Glor RE, Herrel A, Stuart YE, and Losos JB. 2016. Comparative tests of the role of dewlap size in Anolis lizard speciation. Proc. Roy. Soc. B. 283, 20162199. #' Mooers AO, Vamosi S, & Schluter D. 1999. Using phylogenies to test macroevolutionary models of trait evolution: sexual selection and speciation in Cranes (Gruinae). American Naturalist 154, 249-259.
#' O'Meara BC, Ane C, Sanderson MJ & Wainwright PC. 2006. Testing for different rates of continuous trait evolution using likelihood. Evolution 60, 922-933
#' Pagel M. 1997. Inferring evolutionary processes from phylogenies. Zoologica Scripta 26, 331-348.
#' Pagel M. 1999 Inferring the historical patterns of biological evolution. Nature 401, 877-884.
#' Thomas GH, Meiri S, & Phillimore AB. 2009. Body size diversification in Anolis: novel environments and island effects. Evolution 63, 2017-2030.
#' @author Gavin Thomas, Mark Puttick
#' @examples
#' # Data and phylogeny
#' data(anolis.tree)
#' data(anolis.data)
#' 
#' # anolis.data is not matrix and contains missing data so put together matrix of
#' # relevant traits (here female and male snout vent lengths) and remove species 
#' # with missing data from the matrix and phylogeny
#' sorted.traits <- sortTraitData(anolis.tree, anolis.data,
#' c("Female_SVL", "Male_SVL"), log.trait=TRUE, pass.ultrametric=TRUE)
#'
#' tree <- sorted.traits$phy
#' traits <- sorted.traits$trait
#' 
#' # log likelihood of kappa = 0.1 or 1
#' transformPhylo.ll(traits, phy=tree, model="kappa", kappa=0.1)
#' transformPhylo.ll(traits, phy=tree, model="kappa", kappa=1)
#' 
#' # log likelihood of lambda = 0.01 or 1
#' transformPhylo.ll(traits, phy=tree, model="lambda", lambda=0.01)
#' transformPhylo.ll(traits, phy=tree, model="lambda", lambda=1)
#' 
#' # log likelihood of delta = 1.5 or 1
#' transformPhylo.ll(traits, phy=tree, model="delta", delta=1.5)
#' transformPhylo.ll(traits, phy=tree, model="delta", delta=1)
#' 
#' # log likelihood of alpha = 0.001 or 2
#' transformPhylo.ll(traits, phy=tree, model="OU", alpha=0.001)
#' transformPhylo.ll(traits, phy=tree, model="OU", alpha=2)
#'
#' @export
#' @import mvtnorm

transformPhylo.ll <- function(y=NULL, phy, model=NULL, meserr=NULL, kappa=NULL, lambda=NULL, delta=NULL, alpha=NULL, psi=NULL, lambda.sp = NULL, nodeIDs=NULL, rateType=NULL, branchRates=NULL, cladeRates=NULL, timeRates=NULL, splitTime=NULL, branchLabels = NULL, acdcRate=NULL,  covPIC = TRUE, cophenetic.dist=NULL, vcv.matrix=NULL, mode.order=NULL, mode.param=NULL, rate.var=NULL, mu=NULL, sigma.sq=NULL) {
	
		model <- tolower(model)
  all.models <- c("bm", "kappa", "lambda", "delta", "ou", "acdc", "psi", "multipsi", "free", "clade", "timeslice", "modeslice")
  if (any(is.na((match(model, all.models))))) stop(paste(model, "not recognised - please provide one of", paste0(all.models, collapse = ", ")))
		
		contemp.ou <- TRUE
		if(model=="ou") contemp.ou <- is.ultrametric(phy)
		
	switch(model,		  
		   
		   "bm" = {
		   transformPhy <- transformPhylo(phy=phy, model="bm", meserr = meserr, y=y)
		   },
		   
		   "kappa" = {
		   transformPhy <- transformPhylo(phy=phy, model="kappa", kappa=kappa, nodeIDs=nodeIDs, meserr = meserr, y=y)
		   },
		   
		   "lambda" = {
		   transformPhy <- transformPhylo(phy=phy, model="lambda", lambda=lambda, meserr = meserr, y=y)
		   },
		   
		   "delta" = {
		   transformPhy <- transformPhylo(phy=phy, model="delta", delta=delta, nodeIDs=nodeIDs, meserr = meserr, y=y)
		   },
		   
		   "free" = {
		   transformPhy <- transformPhylo(phy=phy, model="free", branchRates=branchRates, meserr = meserr, y=y)
		   },
		   
		   "clade" = {
		   transformPhy <- transformPhylo(phy=phy, model="clade", nodeIDs=nodeIDs, cladeRates=cladeRates, rateType=rateType, meserr = meserr, y=y)
		   },
		   
		   "ou" = {
		   	if(contemp.ou) {
		   		transformPhy <- transformPhylo(phy=phy, model="OU", alpha=alpha, nodeIDs=nodeIDs, meserr = meserr, y=y)
		   	} else {
		   		transformPhy <- transformPhylo(phy=phy, model="OU", alpha=alpha, nodeIDs=nodeIDs, meserr = meserr, y=y, cophenetic.dist=cophenetic.dist, vcv.matrix=vcv.matrix)
		   		}
		   },
		   
		   "psi" = {
		   transformPhy <- transformPhylo(phy=phy, model="psi", psi=psi, meserr = meserr, y=y, lambda.sp=lambda.sp)
		   },
		   
		   "multipsi" = {
        	transformPhy <- transformPhylo(phy = phy, branchLabels = branchLabels, model = "multipsi", psi = psi, meserr = meserr, y = y, lambda.sp = lambda.sp)
		   },
		   
		   "timeslice" = {
		   transformPhy <- transformPhylo(phy=phy, model="timeSlice", timeRates=timeRates,  splitTime=splitTime, meserr = meserr, y=y)
		   },
		   
		   	"acdc" = {
		   transformPhy <- transformPhylo(phy=phy, model="acdc", acdcRate=acdcRate, nodeIDs=nodeIDs, cladeRates=cladeRates, y=y, meserr = meserr)
		   },
		   
		   	"modeslice"= {
	   		transformPhy <- transformPhylo(phy=phy, model="modeslice", mode.order=mode.order, mode.param=mode.param, meserr = meserr, splitTime=splitTime, rate.var=rate.var, cladeRates=cladeRates, y=y)
		  }
		 )
	
		if(methods::is(transformPhy)[1] == "phylo") {
			return(likTraitPhylo(y=y, phy=transformPhy, covPIC = covPIC))
		} else {
			transformPhy <- transformPhy * sigma.sq
			if(length(mu) == 1) {
				mean.est <- rep(mu, ncol(transformPhy))
			} else {
				mean.est <- mu
			}
	   		return(mvtnorm::dmvnorm(y[,1], mean=mean.est, sigma=transformPhy, log=TRUE))	
			}
		}

Try the motmot package in your browser

Any scripts or data that you put into this service are public.

motmot documentation built on Jan. 11, 2020, 9:15 a.m.