estimate_shift_configuration: Detects evolutionary shifts under an OU model

Description Usage Arguments Details Value References Examples

View source: R/shift_configuration.R

Description

This function takes in one or multiple traits, and automatically detects the phylogenetic placement and the magnitude of shifts in the evolution of these traits. The model assumes an Ornstein-Uhlenbeck process whose parameters are estimated (adaptation ‘strength’ alpha and drift variance sigma^2). Instantaneous shifts in the optimal trait value affect the traits over time.

Usage

1
2
3
4
5
6
7
8
estimate_shift_configuration(tree, Y,
  max.nShifts = floor(length(tree$tip.label)/2), criterion = c("pBIC",
  "pBICess", "mBIC", "BIC", "AICc"), root.model = c("OUfixedRoot",
  "OUrandomRoot"), candid.edges = NA, quietly = TRUE,
  alpha.starting.value = NA, alpha.upper = alpha_upper_bound(tree),
  alpha.lower = NA, lars.alg = c("lasso", "stepwise"), nCores = 1,
  rescale = TRUE, edge.length.threshold = .Machine$double.eps,
  grp.delta = 1/16, grp.seq.ub = 5, l1ou.options = NA)

Arguments

tree

ultrametric tree of class phylo with branch lengths, and edges in postorder.

Y

trait vector/matrix without missing entries. The row names of the data must be in the same order as the tip labels.

max.nShifts

upper bound for the number of shifts. The default value is half the number of tips.

criterion

information criterion for model selection (see Details in configuration_ic).

root.model

ancestral state model at the root.

candid.edges

a vector of indices of candidate edges where the shifts may occur. If provided, shifts will only be allowed on these edges; otherwise all edges will be considered.

quietly

logical. If FALSE, a basic summary of the progress and results is printed.

alpha.starting.value

optional starting value for the optimization of the phylogenetic adaptation rate.

alpha.upper

optional upper bound for the phylogenetic adaptation rate. The default value is log(2) over the minimum branch length connected to tips.

alpha.lower

optional lower bound for the phylogenetic adaptation rate.

lars.alg

model selection algorithm for LARS in univariate case.

nCores

number of processes to be created for parallel computing. If nCores=1 then it will run sequentially. Otherwise, it creates nCores processes by using mclapply function. For parallel computing it, requires parallel package.

rescale

logical. If TRUE, the columns of the trait matrix are first rescaled so that all have the same l2-norm. If TRUE, the scores will be based on the rescale one.

edge.length.threshold

minimum edge length that is considered non-zero. Branches with shorter length are considered as soft polytomies, disallowing shifts on such branches.

grp.delta

internal (used when the data contain multiple traits). The input lambda sequence for the group lasso, in ‘grplasso’, will be lambda.max*(0.5^seq(0, grp.seq.ub, grp.delta) ).

grp.seq.ub

(used for multiple traits). The input lambda sequence for grplasso will be lambda.max*(0.5^seq(0, grp.seq.ub, grp.delta) ).

l1ou.options

if provided, all the default values will be ignored.

Details

For information criteria: see configuration_ic.

Value

Y

input trait vector/matrix.

tree

input tree.

shift.configuration

estimated shift positions, i.e. vector of indices of edges where the estimated shifts occur.

shift.values

estimates of the shift values.

shift.means

estimates change of the expectation of the shift values

nShifts

estimated number of shifts.

optima

optimum values of the trait at tips. If the data are multivariate, this is a matrix where each row corresponds to a tip.

edge.optima

optimum values of the trait on the edges. If the data are multivariate, this is a matrix where each row corresponds to an edge.

alpha

maximum likelihood estimate(s) of the adaptation rate alpha, one per trait.

sigma2

maximum likelihood estimate(s) of the variance rate sigma^2, one per trait.

mu

fitted values, i.e. estimated trait means.

residuals

residuals. These residuals are phylogenetically correlated.

score

information criterion value of the estimated shift configuration.

profile

list of shift configurations sorted by their ic scores.

l1ou.options

list of options that were used.

References

Mohammad Khabbazian, Ricardo Kriebel, Karl Rohe, and Cécile Ané (2016). "Fast and accurate detection of evolutionary shifts in Ornstein-Uhlenbeck models". Methods in Ecology and Evolution. doi:10.1111/2041-210X.12534

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
data(lizard.tree, lizard.traits)
# here lizard.traits already has row names:
rownames(lizard.traits)
# also, it is a matrix (not data frame) so columns retain row names:
names(lizard.traits[,1])
# If your trait data "dat" does not have row names but instead has
# species names in a column called "species", then you can
# create row names containing the species names like this:
# rownames(dat) <- dat$species
lizard <- adjust_data(lizard.tree, lizard.traits[,1])
eModel <- estimate_shift_configuration(lizard$tree, lizard$Y)
eModel
 
## use parallel computing to accelerate the computation
eModel.par <- estimate_shift_configuration(lizard$tree, lizard$Y, nCores=8)

stopifnot( identical( sort(eModel.par$shift.configuration), sort(eModel$shift.configuration) ) ) ## TRUE

nEdges <- Nedge(lizard.tree) # total number of edges
ew <- rep(1,nEdges)  # to set default edge width of 1
ew[eModel$shift.configuration] <- 3   # to widen edges with a shift 
plot(eModel, cex=0.5, label.offset=0.02, edge.width=ew)

# example to constrain the set of candidate branches with a shift
eModel <- estimate_shift_configuration(lizard$tree, lizard$Y, criterion="AICc")
ce <- eModel$shift.configuration # set of candidate edges
eModel <- estimate_shift_configuration(lizard$tree, lizard$Y, candid.edges = ce)
plot(eModel, edge.ann.cex=0.7, cex=0.5, label.offset=0.02)

khabbazian/l1ou documentation built on July 30, 2018, 12:05 p.m.