View source: R/OUphylregression.R
OU_phylreg | R Documentation |
The OU_phylreg
function does a phylogenetic regression
for given response and design matrices under a multivariate OU model evolving on the phylogeny.
The user is recommended to install the suggested package
PCMBaseCpp which significantly speeds up the calculations (see Details).
OU_phylreg(mY, mD, phyltree, modelParams, regimes = NULL, kY = NULL, M.error = NULL,
signif_level = 0.05, regimes.times = NULL, root.regime = NULL, b_GLSB = FALSE,
b_GLSX0 = FALSE, signsB = NULL, signsvX0 = NULL, estimate.root.state = FALSE)
mY |
A matrix with the rows corresponding to the tip species while the columns correspond to the traits.
The rows should be named by species |
mD |
A design matrix with the rows corresponding to the traits in the tips species while the columns correspond to
the unknown regression variables. The number or rows have to correspond to the number of elements in
|
phyltree |
The phylogeny in |
modelParams |
List of model parameters of the BM/OUOU/OUBM model as |
regimes |
A vector or list of regimes. If vector then each entry corresponds to each of the branches of |
kY |
Number of "Y" (response) variables if the considered model is an OUBM one. The first |
M.error |
An optional measurement error covariance structure. The measurement errors between species are assumed independent. The program tries to recognize the structure of the passed matrix and accepts the following possibilities :
From version |
signif_level |
The significance level to be taken when calculating regression confidence intervals, i.e.
|
regimes.times |
A list of vectors for each tree node, it starts with |
root.regime |
The regime at the root of the tree. If not given, then it is taken as the regime that is present
on the daughter lineages stemming from the root and is the most frequent one in the |
b_GLSB |
If the evolutionary model is an OUBM one (and |
b_GLSX0 |
If the evolutionary model is an OUBM or BM one (and |
signsB |
A matrix of constraints on the estimation of |
signsvX0 |
A matrix of constraints on the estimation of |
estimate.root.state |
Should the root state be estimate |
The matrix algebra calculations are done using the likelihood function offerred by the PCMBase package. However, there is a C++ backend, PCMBaseCpp. If it is not available, then the likelihood is calculated slower using pure R. However, with the calculations in C++ up to a 100-fold increase in speed is possible (more realistically 10-20 times). The PCMBaseCpp package is available from https://github.com/venelin/PCMBaseCpp.
For a given input data matrix, mY
, the function considers the stacking of it by rows
(i.e. stacking species by species). Let Y = vec(mY
), i.e. Y<-c(t(mY))
,
V be the between-species-between-traits variance-covariance matrix (under the parameters passed
in modelParams
). The function calculates the value of the generalized least squares
estimator (not directly, but as a transformation of the likelihood provided by PCMBase)
v=(D^{T}V^{-1}D)^{-1}D^{T}V^{-1}Y.
The user can provide the design matrix directly or if mD
is NA
, then the design matrix
induced by the evolutionary model in modelParams
is assumed. The following parameters
can be estimated: vX0
(if b_GLSX0
is TRUE
, BM model); mPsi
,
vY0
(if estimate.root.state
is TRUE
, otherwise set at optimum) for OUOU model;
vX0
(if b_GLSX0
is TRUE
) mPsi
, vY0
(if estimate.root.state
is TRUE
, otherwise set at optimum), B
(if b_GLSB
is TRUE
). One can
constrain (some of) the elements of the matrices to be estimated to be postive, negative or
equal to some value. For B
and vX0
this was described in the description of the
arguments of signsB
and signsvX0
. For mPsi
and vY0
one does this
in the respective entries of modelParams
. There matrix entries can be set to "+"
,
"-"
, NA
or some specific value. In the OUBM case the model specfic design matrix
is not derived from the conditional expectation of all of the responses on all of the predictors,
but from the conditional expectations of each tip species independently (as if V were block
diagonal). This is as the joint condtional expecation design matrix cannot be calculated
at the moment in an efficient manner and would cause a serious computational bottleneck.
However this only makes a difference if B
is to be estimated inside the GLS.
Special support is given if one wants to compute a phylogenetically weighted mean.
If mD
is set to "phylaverage"
, then it is calculated as
D_p=1_{n}\otimes Id_{k},
where 1_{n}
is a column vector of n ones and Id_{k}
is
the identity matrix with rows and columns equalling the number or columns of mY
.
A list with the following entries
The regression estimates
The covariance matrix between regression estimates.
The confidence intervals for each estimated parameter.
The model parameters updated if anything was estimated from them in the procedure.
The used or calculated design matrix.
The residual sum of squares.
R2, where the alternative model is the sample average.
R2, where the alternative model is the phylogenetically weighted sample average, i.e. the design matrix
is D_p
.
The RSS with respect to the sample average.
The RSS with respect to the phylogenetically weighted sample average.
The phylogeny used, returned as in the estimation procedure some additional fields are calculated. This could
help in a speed up if the OU_phylreg
is used in some iterative procedure.
Krzysztof Bartoszek
Bartoszek, K. and Pienaar, J. and Mostad. P. and Andersson, S. and Hansen, T. F. (2012) A phylogenetic comparative method for studying multivariate adaptation. Journal of Theoretical Biology 314:204-215.
Hansen, T.F. and Bartoszek, K. (2012) Interpreting the evolutionary regression: the interplay between observational and biological errors in phylogenetic comparative studies. Systematic Biology 61(3):413-425.
RNGversion(min(as.character(getRversion()),"3.6.1"))
set.seed(12345, kind = "Mersenne-Twister", normal.kind = "Inversion")
### We will first simulate a small phylogenetic tree using functions from ape.
### For simulating the tree one could also use alternative functions, e.g. sim.bd.taxa
### from the TreeSim package
phyltree<-ape::rtree(5)
## The line below is not necessary but advisable for speed
phyltree<-phyltree_paths(phyltree)
### Define a vector of regimes.
regimes<-c("small","small","large","small","small","large","large","large")
### Define SDE parameters to be able to simulate data under the OUOU model.
## 3D model
## OUOUparameters<-list(vY0=matrix(c(1,-1,0.5),nrow=3,ncol=1),
## A=rbind(c(9,0,0),c(0,5,0),c(0,0,1)),mPsi=cbind("small"=c(1,-1,0.5),"large"=c(-1,1,0.5)),
## Syy=rbind(c(1,0.25,0.3),c(0,1,0.2),c(0,0,1)))
## 2D model used to reduce running time on CRAN
OUOUparameters<-list(vY0=matrix(c(1,-1),nrow=2,ncol=1),
A=rbind(c(9,0),c(0,5)),mPsi=cbind("small"=c(1,-1),"large"=c(-1,1)),
Syy=rbind(c(1,0.25),c(0,1)))
### Now simulate the data.
OUOUdata<-simulOUCHProcPhylTree(phyltree,OUOUparameters,regimes,NULL)
OUOUdata<-OUOUdata[phyltree$tip.label,,drop=FALSE]
OUOUparameters_reg<-OUOUparameters
OUOUparameters_reg$mPsi<-apply(OUOUparameters_reg$mPsi,c(1,2),function(x){NA})
OUOUparameters_reg$vY0<-apply(OUOUparameters_reg$vY0,c(1,2),function(x){NA})
## estimate parameters under OUOU model
OU_phylreg(OUOUdata, NA, phyltree, OUOUparameters_reg, regimes=regimes,
kY=NULL, M.error=NULL)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.