R/OUT_savage.dickey.R

Defines functions get.sd

Documented in get.sd

#' Calculate Savage-Dickey ratio to test for significant rate heterogeneity in a fitted evorates model
#' 
#' 
#' This function calculates the Savage-Dickey Ratio (i.e., the ratio of posterior to prior probability
#' density) at \code{R_sig2} = 0 given an \code{evorates_fit} object. This is a relatively simple test
#' statistic for determining whether the data exhibits evidence for trait evolution rate heterogeneity.
#' The lower the ratio is, the more evidence for heterogeneity, with 1/3 typically being an appropriate
#' threshold for "substantial evidence".
#' 
#' 
#' @param fit An object of class "\code{evorates_fit}". Notably, this test is only applicable to fits
#' with an unconstrained \code{R_sig2} parameter!
#' 
#' 
#' @return The Savage-Dickey ratio
#' 
#' 
#' @details You can ensure your fitted evorates model has an unconstrained \code{R_sig2} parameter by
#' checking if \code{fit$call$constrain.Rsig2} is \code{TRUE}. Attempting to use this function with a
#' fit with a constrained \code{R_sig2} parameter results in an error.
#' 
#' Unfortunately, we can only approximate the posterior density of \code{R_sig2} about 0. Base R's
#' \code{density()} function doesn't seem to do a good job here because \code{R_sig2} posteriors
#' are bounded to the left at 0 and often have fat tails that interact poorly with kernel-based
#' approach of \code{density()}. We thus instead use penalized likelihood fitting of splines via the
#' \code{logspline} package. This approach seems to better approximate densities of
#' distributions with fat tails where sampling gets quite diffuse, and can handle bounded distributions
#' to boot. The function divides the resulting approximate posterior density
#' by the analytically known prior density given by \code{dcauchy(0, 0, fit$call$Rsig2_prior_sig2)}.
#' 
#' 
#' @seealso \link[logspline]{logspline} and \link[logspline]{dlogspline} from the \code{logspline} package.
#' 
#' 
#' @examples
#' #get whale/dolphin evorates fit
#' data("cet_fit")
#' 
#' #get Savage-Dickey ratio
#' sdr <- get.sd(cet_fit) #substantial evidence since ratio is below 1/3!
#' 
#' #simulate some data on same tree evolving under Brownian Motion
#' sim <- sim.evorates(cet_fit$call$tree,
#'                     Rsig2 = 0, #no rate heterogeneity
#'                     R0 = mean(cet_fit %means% "bg_rate"), #set rate to estimated empirical background rate
#'                     Ysig2 = mean(cet_fit %means% "Y_sig2")) #set tip error to estimated empirical tip error
#' #fit simulated data with unconstrained R_sig2 (takes a few minutes)
#' fit <- fit.evorates(cet_fit$call$tree, sim$trait.data, chains = 1)
#' sdr <- get.sd(fit) #should almost always be over 1/3, though you might get the rare false positive
#' #ratios for BM simulations often exceed 3/1, indicating substantial evidence AGAINST rate heterogeneity!
#' 
#' #fit simulated data with contrained R_sig2 (takes a little bit)
#' fit <- fit.evorates(cet_fit$call$tree, sim$trait.data, constrain.Rsig2 = TRUE, chains = 1)
#' sdr <- get.sd(fit) #returns error
#' 
#' 
#' @export
get.sd<-function(fit){
  #interesting--logspline internally changes names of evorates element
  #this is potentially indicative of a more general undesirable side effect
  if(fit$call$constrain.Rsig2){
    stop("fit needs unconstrained R_sig2 parameter to calculate Savage-Dickey ratio!")
  }
  0.5*dlogspline(0,logspline(.strip.par.class(.call.select(fit$chains,'R_sig2')),lbound=0))/
    dcauchy(0,0,fit$call$Rsig2_prior_sig)
}
bstaggmartin/backwards-BM-simulator documentation built on June 3, 2024, 5:51 p.m.