OU_RSS: Calculates the RSS under a (multivariate) phylogenetic OU...

View source: R/OUphylregression.R

OU_RSSR Documentation

Calculates the RSS under a (multivariate) phylogenetic OU model of evolution

Description

The OU_RSS function calculates the residual sum of squares (RSS) for given data under a multivariate OU model evolving on the phylogeny. The user is recommended to install the suggested package PCMBaseCpp which significantly speeds up the calculations (see Details).

Usage

OU_RSS(mY, phyltree, modelParams, M.error = NULL, do_centre = NA,
regimes = NULL, regimes.times = NULL, root.regime = NULL)

Arguments

mY

A matrix with the rows corresponding to the tip species while the columns correspond to the traits. The rows should be named by species
(field phyltree$tip.label), if not, then a warning is thrown and the order of the species is assumed to be the same as the order in which the species are in the phylogeny (i.e. correspond to the node indices 1:n, where n is the number of tips). The columns should be named by traits, otherwise a warning is thrown and generic names are generated.

phyltree

The phylogeny in phylo format. The tree can be obtained from e.g. a nexus file by the read.nexus() function from the ape package. The "standard" ape node indexing is assumed: for a tree with n tips, the tips should have indices 1:n and the root index n+1. The root.edge field is ignored.

modelParams

List of model parameters of the BM/OUOU/OUBM model as ParamsInModel part of output of BrownianMotionModel/ouchModel/mvslouchModel.

M.error

An optional measurement error covariance structure. The measurement errors between species are assumed independent. The program tries to recognize the structure of the passed matrix and accepts the following possibilities :

  • a single number that is a common measurement error for all tips and species,

  • a m element vector with each value corresponding to a variable, measurement errors are independent between variables and each species is assumed to have the same measurement errors,

  • a m x m ((number of variables) x (number of variables)) matrix, all species will have the same measurement error,

  • a list of length n (number of species), each list element is the covariance structure for the appropriate (numbering according to tree) species, either a single number (each variable has same variance), vector (of length m for each variable), or m x m matrix, the order of the list has to correspond to the order of the nodes in the phyltree object,

  • NULL no measurement error.

From version 2.0.0 of mvSLOUCH it is impossible to pass a single joint measurement error matrix for all the species and traits.

do_centre

Should the data, mY, be centred (admissable values are "average", "phylaverage" or "evolutionary_model") or not (NA). If "average", then each column (trait) is centred by its arithmetic average, if "evolutionary_model", then mY is centred by the expectation under the evolutoniary model (inferred from modelParams) and if "phylaverage", then mY is centred by a phylogenetically weighted arithmetic average (see Details).

regimes

A vector or list of regimes. If vector then each entry corresponds to each of the branches of phyltree, i.e. to each row of phyltree$edge. If list then each list entry corresponds to a tip node and is a vector for regimes on that lineage. If NULL, then a constant regime is assumed on the whole tree.

regimes.times

A list of vectors for each tree node, it starts with 0 and ends with the current time of the species. In between are the times where the regimes (niches) changed. If NULL then each branch is considered to be a regime.

root.regime

The regime at the root of the tree. If not given, then it is taken as the regime that is present on the daughter lineages stemming from the root and is the most frequent one in the regimes vector. If more than one regime has the same maximum frequency, then alphabetically first one of the maximum ones is taken.

Details

The matrix algebra calculations are done using the likelihood function offerred by the PCMBase package. However, there is a C++ backend, PCMBaseCpp. If it is not available, then the likelihood is calculated slower using pure R. However, with the calculations in C++ up to a 100-fold increase in speed is possible (more realistically 10-20 times). The PCMBaseCpp package is available from https://github.com/venelin/PCMBaseCpp.

For a given input data matrix, mY, the function considers the stacking of it by rows (i.e. stacking species by species). Let Y = vec(mY), i.e. Y<-c(t(mY)), V be the between-species-between-traits variance-covariance matrix (under the parameters passed in modelParams) and v a centring vector (if do_centre is NA, then v=0). The function calculates the value of the quadratic form

(Y-v)^{T}V^{-1}(Y-v).

A special centring is when do_centre equals "phylaverage". In this situation the centring vector is a phylogenetically weighted average, i.e.

v=(D^{T}V^{-1}D)^{-1}D^{T}V^{-1}Y ,

where denoting 1_{n} as a column vector of n ones and Id_{k} as the identity matrix with rows and columns equalling the number or columns of mY,

D=1_{n}\otimes Id_{k}.

Value

The value of the residual sum of squares, quadratic form with respect to the between-species-between-traits precision matrix. Also the used phylogeny is returned.

Author(s)

Krzysztof Bartoszek

Examples

RNGversion(min(as.character(getRversion()),"3.6.1"))
set.seed(12345, kind = "Mersenne-Twister", normal.kind = "Inversion")
### We will first simulate a small phylogenetic tree using functions from ape.
### For simulating the tree one could also use alternative functions, e.g. sim.bd.taxa 
### from the TreeSim package
phyltree<-ape::rtree(5)

## The line below is not necessary but advisable for speed
phyltree<-phyltree_paths(phyltree)

### Define a vector of regimes.
regimes<-c("small","small","large","small","small","large","large","large")

### Define SDE parameters to be able to simulate data under the OUOU model.
## 3D model
## OUOUparameters<-list(vY0=matrix(c(1,-1,0.5),nrow=3,ncol=1),
## A=rbind(c(9,0,0),c(0,5,0),c(0,0,1)),mPsi=cbind("small"=c(1,-1,0.5),"large"=c(-1,1,0.5)),
## Syy=rbind(c(1,0.25,0.3),c(0,1,0.2),c(0,0,1)))
## 2D model used to reduce running time on CRAN
OUOUparameters<-list(vY0=matrix(c(1,-1),nrow=2,ncol=1),
A=rbind(c(9,0),c(0,5)),mPsi=cbind("small"=c(1,-1),"large"=c(-1,1)),
Syy=rbind(c(1,0.25),c(0,1)))

### Now simulate the data.
OUOUdata<-simulOUCHProcPhylTree(phyltree,OUOUparameters,regimes,NULL)
OUOUdata<-OUOUdata[phyltree$tip.label,,drop=FALSE]

## below will return the RSS under the assumed OUOU model of evolution
OU_RSS(OUOUdata, phyltree, OUOUparameters, M.error=NULL, 
do_centre="evolutionary_model", regimes = regimes)

mvSLOUCH documentation built on Nov. 21, 2023, 1:08 a.m.