RRphylo: Evolutionary rates computation along phylogenies

RRphylo    R Documentation

Evolutionary rates computation along phylogenies

Description

The function RRphylo (Castiglione et al. 2018) performs phylogenetic ridge regression. It takes a tree and a vector of tip data (phenotypes) as input, calculates the regularization factor, and produces the matrices of tip-to-root (makeL) and node-to-root (makeL1) distances, the vector of ancestral state estimates, the vector of predicted phenotypes, and the vector of rates for all the branches of the tree. For multivariate data, rates are given both as one vector per variable and as a single multivariate vector obtained by computing the Euclidean norm of the individual rate vectors.
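The ridge regression step described above can be illustrated on a toy three-tip tree using base R only. This is a minimal sketch under stated assumptions, not the package's implementation: the tree, the phenotype values, the root value (root_value) and the fixed penalty (lambda) are made up for illustration, whereas RRphylo fits the regularization factor itself. The root column of the tip-to-root matrix is left out for brevity.

```r
# Toy tree ((A:1,B:1):1,C:2). Columns are the branches leading to the
# internal node, tip A, tip B and tip C (hypothetical layout).
L <- matrix(c(1, 1, 0, 0,
              1, 0, 1, 0,
              0, 0, 0, 2),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("A", "B", "C"), c("node", "A", "B", "C")))

y <- c(A = 2.0, B = 3.0, C = 1.0)   # made-up tip phenotypes
root_value <- 1.5                   # assumed phenotype at the root
lambda <- 1                         # assumed penalty; RRphylo estimates it

# Ridge solution: rates = (L'L + lambda * I)^-1 L' (y - root_value)
rates <- solve(t(L) %*% L + lambda * diag(ncol(L)),
               t(L) %*% (y - root_value))

# Predicted tip phenotypes: root value plus the sum, along each
# root-to-tip path, of branch length times branch rate
predicted <- drop(L %*% rates) + root_value
```

Increasing lambda in this sketch shrinks all rates toward zero, so predicted tip values move toward the root value. Setting it close to zero lets the rates reproduce the tip data almost exactly.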

Usage

RRphylo(tree,y,cov=NULL,rootV=NULL,aces=NULL,x1=NULL,
  aces.x1=NULL,clus=0.5,verbose=FALSE)

Arguments

tree

a phylogenetic tree. The tree need not be ultrametric or fully dichotomous.

y

either a single vector variable or a multivariate dataset. In either case, y must be named. A categorical variable should be supplied to the function as a numeric vector.

cov

the covariate to be supplied if its effect on the rates must be accounted for. In this case, the residuals of the regression of rates against the covariate are used as rates. 'cov' must be as long as the number of nodes plus the number of tips of the tree. It can be obtained by running RRphylo on the covariate itself and combining the resulting vector of ancestral states with the tip values, as in the example below. See RRphylo vignette - covariate for details.

rootV

phenotypic value (values if multivariate) at the tree root. If rootV=NULL, the function 'learns' about the root value from the 10% of tips closest in time to the tree root, weighted by their temporal distance from the root itself (phenotypes of closer tips weigh more than those of more distant tips).

aces

a named vector (or matrix if y is multivariate) of ancestral character values at nodes. Names correspond to the nodes in the tree. See RRphylo vignette - aces for details.

x1

the additional predictor(s) to be supplied to perform the multiple version of RRphylo. The 'x1' vector/matrix must be as long as the number of nodes plus the number of tips of the tree. It can be obtained by running RRphylo on each predictor separately and combining the resulting vector of ancestral states with the tip values. See RRphylo vignette - predictor for details.

aces.x1

a named vector/matrix of ancestral character values at nodes for x1. It must be indicated if both aces and x1 are specified. Names/rownames correspond to the nodes in the tree.

clus

the proportion of clusters to be used in parallel computing. Default is 0.5. To run the single-threaded version of RRphylo set clus = 0.

verbose

logical indicating whether a log file "RRlog.txt" recording the progress of the run should be written to the working directory.
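Because 'clus' is a proportion rather than a count, asking for a fixed number of workers means dividing by the number of available cores, as the package examples do. The sketch below uses only the base 'parallel' package. The rounding rule that maps the proportion back to a worker count is an assumption made for illustration and is not taken from the RRphylo source.

```r
# Aim for two worker processes regardless of machine size
n_cores <- parallel::detectCores()
if (is.na(n_cores)) n_cores <- 1L   # detectCores() can return NA

clus <- 2 / n_cores                 # proportion to pass as 'clus'

# Assumed mapping from proportion to worker count (illustrative only)
n_workers <- round(n_cores * clus)
```

With clus = 0 no cluster is requested, which corresponds to the single-threaded version described above.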

Value

tree

the tree used by RRphylo, i.e. the fully dichotomous version of the tree argument. Trees with polytomies are resolved using the multi2di function in the package ape. Note that tip labels are ordered according to their position in the tree.

tip.path

an n * m matrix, where n = number of tips and m = number of branches (i.e. 2*n-1). Each row represents the branch lengths along a root-to-tip path.

node.path

an n * n matrix, where n = number of internal branches. Each row represents the branch lengths along a root-to-node path.

rates

single rate values computed for each branch of the tree. If y is a single vector variable, rates are equal to multiple.rates. If y is a multivariate dataset, rates are computed as the square root of the sum of squares of each row of $multiple.rates.

aces

the phenotypes reconstructed at nodes.

predicted.phenotypes

the vector of estimated tip values. It is a matrix in the case of multivariate data.

multiple.rates

an n * m matrix, where n = number of branches (i.e. 2 * number of tips - 1) and m = number of variables. For each branch, the column entries represent the evolutionary rate.

lambda

the regularization factor fitted within RRphylo by the inner function optL. With multivariate data, optL is run once per variable, so the function returns one lambda for each variable.

ace.values

if aces are specified, a data frame containing, for each node, the corresponding node number on the RRphylo tree along with the estimated values.

x1.rate

if x1 is specified, the partial regression coefficient for x1.
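The relation between the per-variable rates and the single rate vector for multivariate data (square root of the row-wise sum of squares) can be written in one line of base R. The matrix below is made up for illustration and only mimics the shape of the multiple.rates element (branches in rows, variables in columns).

```r
# Made-up per-variable rates for 5 branches and 2 variables
multiple_rates <- matrix(c( 0.3, -0.4,
                            0.0,  1.0,
                           -0.6,  0.8,
                            0.5,  0.0,
                            0.0,  0.0),
                         ncol = 2, byrow = TRUE,
                         dimnames = list(paste0("b", 1:5), c("var1", "var2")))

# One rate per branch: Euclidean norm of each row
rates <- sqrt(rowSums(multiple_rates^2))
```

The norm is non-negative by construction, so the sign of the individual per-variable rates is only available in the per-variable matrix.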

Author(s)

Pasquale Raia, Silvia Castiglione, Carmela Serio, Alessandro Mondanaro, Marina Melchionna, Mirko Di Febbraro, Antonio Profico, Francesco Carotenuto

References

Castiglione, S., Tesone, G., Piccolo, M., Melchionna, M., Mondanaro, A., Serio, C., Di Febbraro, M., & Raia, P. (2018). A new method for testing evolutionary rate variation and shifts in phenotypic evolution. Methods in Ecology and Evolution, 9: 974-983. doi:10.1111/2041-210X.12954

Serio, C., Castiglione, S., Tesone, G., Piccolo, M., Melchionna, M., Mondanaro, A., Di Febbraro, M., & Raia, P. (2019). Macroevolution of toothed whales exceptional relative brain size. Evolutionary Biology, 46: 332-342. doi:10.1007/s11692-019-09485-7

Melchionna, M., Mondanaro, A., Serio, C., Castiglione, S., Di Febbraro, M., Rook, L., Diniz-Filho, J.A.F., Manzi, G., Profico, A., Sansalone, G., & Raia, P. (2020). Macroevolutionary trends of brain mass in Primates. Biological Journal of the Linnean Society, 129: 14-25. doi:10.1093/biolinnean/blz161

Castiglione, S., Serio, C., Mondanaro, A., Melchionna, M., Carotenuto, F., Di Febbraro, M., Profico, A., Tamagnini, D., & Raia, P. (2020). Ancestral State Estimation with Phylogenetic Ridge Regression. Evolutionary Biology, 47: 220-232. doi:10.1007/s11692-020-09505-x

Castiglione, S., Serio, C., Piccolo, M., Mondanaro, A., Melchionna, M., Di Febbraro, M., Sansalone, G., Wroe, S., & Raia, P. (2020). The influence of domestication, insularity and sociality on the tempo and mode of brain size evolution in mammals. Biological Journal of the Linnean Society, 132: 221-231. doi:10.1093/biolinnean/blaa186

Examples

## Not run:
data("DataOrnithodirans")
DataOrnithodirans$treedino->treedino
DataOrnithodirans$massdino->massdino
cc<- 2/parallel::detectCores()

# Case 1. "RRphylo" without accounting for the effect of a covariate
RRphylo(tree=treedino,y=massdino,clus=cc)->RRcova

# Case 2. "RRphylo" accounting for the effect of a covariate
# The covariate here is body mass itself: its ancestral state values are
# taken from the "RRphylo" run of Case 1 (RRcova) and joined to the tip values
c(RRcova$aces,massdino)->cov.values
c(rownames(RRcova$aces),names(massdino))->names(cov.values)

RRphylo(tree=treedino,y=massdino,cov=cov.values,clus=cc)->RR

# Case 3. "RRphylo" specifying the ancestral states
data("DataCetaceans")
DataCetaceans$treecet->treecet
DataCetaceans$masscet->masscet
DataCetaceans$brainmasscet->brainmasscet
DataCetaceans$aceMyst->aceMyst

RRphylo(tree=treecet,y=masscet,aces=aceMyst,clus=cc)->RR

# Case 4. Multiple "RRphylo"
library(ape)
drop.tip(treecet,treecet$tip.label[-match(names(brainmasscet),treecet$tip.label)])->treecet.multi
masscet[match(treecet.multi$tip.label,names(masscet))]->masscet.multi

RRphylo(tree=treecet.multi, y=masscet.multi,clus=cc)->RRmass.multi
RRmass.multi$aces[,1]->acemass.multi
c(acemass.multi,masscet.multi)->x1.mass

RRphylo(tree=treecet.multi,y=brainmasscet,x1=x1.mass,clus=cc)->RR

# Case 5. Categorical and multiple "RRphylo" with 2 additional predictors
library(phytools)

set.seed(1458)
rtree(50)->tree
fastBM(tree)->y
jitter(y)*10->y1
rep(1,length(y))->y2
y2[sample(1:50,20)]<-2
names(y2)<-names(y)

c(RRphylo(tree,y1,clus=cc)$aces[,1],y1)->x1

RRphylo(tree,y2,clus=cc)->RRcat ### this is the RRphylo on the categorical variable
c(RRcat$aces[,1],y2)->x2

cbind(c(jitter(mean(y1[tips(tree,83)])),1),
      c(jitter(mean(y1[tips(tree,53)])),2))->acex
c(jitter(mean(y[tips(tree,83)])),jitter(mean(y[tips(tree,53)])))->acesy
names(acesy)<-rownames(acex)<-c(83,53)

RRphylo(tree,y,aces=acesy,x1=cbind(x1,x2),aces.x1 = acex,clus=cc)

    
## End(Not run)

RRphylo documentation built on June 7, 2023, 5:49 p.m.