fit_t_pl: High-dimensional phylogenetic models of trait evolution

View source: R/fit_t_pl.R

fit_t_plR Documentation

High-dimensional phylogenetic models of trait evolution

Description

Fits high-dimensional model of trait evolution on trees through penalized likelihood. A phylogenetic Leave-One-Out Cross-Validated log-likelihood (LOOCV) is used to estimate model parameters.

Usage


fit_t_pl(Y, tree, model=c("BM", "OU", "EB", "lambda"),
		 method=c("RidgeAlt", "RidgeArch", "RidgeAltapprox", 
		 "LASSO", "LASSOapprox"), targM=c("null", "Variance", 
		 "unitVariance"), REML=TRUE, up=NULL, low=NULL, 
		 tol=NULL, starting=NULL, SE=NULL,
		 scale.height=TRUE, ...)
  

Arguments

Y

A matrix of phenotypic traits values (the variables are represented as columns)

tree

An object of class 'phylo' (see ape documentation)

model

The evolutionary model, "BM" is Brownian Motion, "OU" is Ornstein-Uhlenbeck, "EB" is Early Burst, and "lambda" is Pagel's lambda transformation.

method

The penalty method. "RidgeArch": Archetype (linear) Ridge penalty, "RidgeAlt": Quadratic Ridge penalty, "LASSO": Least Absolute Selection and Shrinkage Operator. "RidgeAltapprox" and "LASSOapprox" are fast approximations of the LOOCV for the Ridge quadratic and LASSO penalties

targM

The target matrix used for the Ridge regularizations. "null" is a null target, "Variance" for a diagonal unequal variance target, "unitVariance" for an equal diagonal target. Only works with "RidgeArch","RidgeAlt", and "RidgeAltapprox" methods.

REML

Use REML (default) or ML for estimating the parameters.

up

Upper bound for the parameter search of the evolutionary model (optional).

low

Lower bound for the parameter search of the evolutionary model (optional).

tol

minimum value for the regularization parameter. Singularities can occur with a zero value in high-dimensional cases. (default is NULL)

starting

Starting values for the parameter search (optional).

SE

Standard errors associated with values in Y. If TRUE, SE will be estimated.

scale.height

Whether the tree should be scaled to unit length or not. (default is TRUE)

...

Options to be passed through. (e.g., echo=FALSE to stop printing messages)

Details

fit_t_pl allows fitting various multivariate evolutionary models to high-dimensional datasets (where the number of variables p is larger than n). Models estimates are more accurate than maximum likelihood methods. Models fit can be compared using the GIC criterion (see ?GIC). Details about the methods are described in Clavel et al. (2019).

Value

a list with the following components

loocv

the (negative) cross-validated penalized likelihood

model.par

the evolutionary model parameter estimates

gamma

the regularization/tuning parameter of the penalized likelihood

corrstruct

a list with the tansformed variables and the phylogenetic tree with branch length stretched to the model estimated parameters

model

the evolutionary model

method

the penalization method

p

the number of traits

n

the number of species

targM

the target used for Ridge Penalization

R

a list with the estimated evolutionary covariance matrix and it's inverse

REML

logical indicating if the REML (TRUE) or ML (FALSE) method has been used

variables

Y is the input dataset and tree is the input phylogenetic tree

SE

the estimated standard error

Note

The LASSO is computationally intensive. Please wait! For highly-dimensional datasets you should favor the "RidgeArch" method to speed up the computations. The Ridge penalties with "null" or "unitVariance" targets are rotation invariants.

Author(s)

J. Clavel

References

Clavel, J., Aristide, L., Morlon, H., 2019. A Penalized Likelihood framework for high-dimensional phylogenetic comparative methods and an application to new-world monkeys brain evolution. Syst. Biol. 68: 93-116.

See Also

ancestral, phyl.pca_pl, GIC.fit_pl.rpanda, gic_criterion mvgls

Examples



if(test){
require(mvMORPH)
set.seed(1)
n <- 32 # number of species
p <- 31 # number of traits

tree <- pbtree(n=n) # phylogenetic tree
R <- Posdef(p)      # a random symmetric matrix (covariance)

# simulate a dataset
Y <- mvSIM(tree, model="BM1", nsim=1, param=list(sigma=R))

# fit the model
fit_t_pl(Y, tree, model="BM", method="RidgeAlt")

# try on rotated axis (using PCA)
trans <- prcomp(Y, center=FALSE)
fit_t_pl(trans$x, tree, model="BM", method="RidgeAlt")

# Estimate the SE (similar to Pagel's lambda for BM). 
# Advised with empirical datasets
fit_t_pl(Y, tree, model="BM", method="RidgeAlt", SE=TRUE)
}


RPANDA documentation built on Oct. 24, 2022, 5:06 p.m.