PEM-functions: Phylogenetic Eigenvector Maps
In MPSEM: Modelling Phylogenetic Signals using Eigenvector Maps

PEM-functions

R Documentation

Phylogenetic Eigenvector Maps

Description

Functions to calculate and manipulate Phylogenetic Eigenvector Maps (PEM), which are sets of eigenfunctions describing the structure of a phylogenetic graph. Each computation function is briefly described in section Functions below.

Usage

InflMat(x)

PEMweights(d, a = 0, psi = 1)

PEM.build(
  x,
  d = "distance",
  sp = "species",
  a = 0,
  psi = 1,
  tol = .Machine$double.eps^0.5
)

PEM.updater(object, a, psi = 1, tol = .Machine$double.eps^0.5)

PEM.fitSimple(
  y,
  x,
  w,
  d = "distance",
  sp = "species",
  lower = 0,
  upper = 1,
  tol = .Machine$double.eps^0.5
)

PEM.forcedSimple(
  y,
  x,
  w,
  d = "distance",
  sp = "species",
  a = 0,
  psi = 1,
  tol = .Machine$double.eps^0.5
)

getGraphLocations(tpall, targets)

getAncGraphLocations(x, tpall)

Locations2PEMscores(object, gsc)

Arguments

`x`	A `graph-class` object containing a phylogenetic graph.
`d`	The name of the member of `x$edge` where the phylogenetic distances (edge lengths) can be found.
`a`	The steepness parameter describing whether changes occur, on average: progressively long edges (a close to 0) or abruptly at vertices (a close to 1).
`psi`	Relative evolution rate along the edges (default: 1). This parameter is only relevant when multiple values are assigned to different portions of the phylogeny.
`sp`	Name of the member of `x$vertex` where a `logical` vertex property can be found, specifying which vertices are species (see `graph-class`).
`tol`	Eigenvalue threshold indicating that eigenvectors as usable.
`object`	A `PEM-class` object containing a Phylogenetic Eigenvector Map.
`y`	One or many response variable(s) in the form of a single numeric vector or a `matrix`, respectively.
`w`	A `graph-class` object containing a phylogenetic graph.
`lower`	Lower limit for the L-BFGS-B optimization algorithm implemented in `optim`.
`upper`	Upper limit for the L-BFGS-B optimization algorithm implemented in `optim`.
`tpall`	First parameter of function `getGraphLocations`: Phylogenetic tree object with class ‘phylo’ (package ape) containing all species (model and target) used in the study.
`targets`	Name of the target species to extract using the tree `tpall`.
`gsc`	The output of `getGraphLocations`.

Details

Functions InflMat and PEMweights are used internally by PEM.build to create a binary matrix referred to as an ‘influence matrix’ and weight its columns. That matrix has a row for each vertex (or node) of graph ‘x’ and a column for each of its edges. The elements of the influence matrix are 1 whenever the vertex associated with a row is located in the tree, either directly or indirectly downward the edge associated with a column. That function is implemented in C language using recursive function calls. Although InflMat allows one to use multiple roots. User must therefore make sure that the graph provided to PEMap is single-rooted.

Function PEM.build is used to produce a phylogenetic eigenvector map, while function PEM.updater allows one to re-calculate a PEM-class object with new weighting function parameters. Function PEM.fitSimple performs a maximum likelihood estimation of parameters a and psi assuming single values for the whole tree, whereas function PEM.forcedSimple allows one to impose values to arguments a and psi of a PEM-class object, while making the function produce the same details as PEM.fitSimple would have produced; these details are necessary to make predictions.

Functions getGraphLocations returns the coordinates of a species in terms of its position with respect to the influence matrix while function Locations2PEMscores transforms these coordinates into sets of scores that can be used to make predictions. Function getAncGraphLocations produces the same output as getGraphLocations, but for the ancestral species (i.e. the nodes of the phylogeny) in order to estimate ancestral trait values.

Value

Function InflMat returns the influence matrix of graph x and function PEMweights returns weights corresponding to the distances. Functions PEM.build, PEM.fitSimple and PEM.forcedSimple return a PEM-class object. Function getGraphLocations returns a list whose first member is an influence coordinate matrix whose rows refer to the target species and columns refer to the edges. The second member contains the lengths of the terminal edges connecting each target species to the rest of the phylogeny.

Function Locations2PEMscores returns a list whose first member is a PEM score matrix whose rows refer to the target species and columns refer to the eigenvectors. The second member contains the variance associated with the terminal edges connecting the target species to the phylogeny.

Functions

InflMat(): Influence Matrix

Calculates the influence matrix of a phylogenetic graph. The influence matrix is a binary matrix whose rows and columns correspond to the vertices and edges of the phylogenetic graph, respectively, and whose elements describe whether a given edge had been taken by any ancestors of a vertex (representing extinct of extant species) during evolution (value = 1) or not (value = 0).
PEMweights(): PEM Weighting

A power function to obtain the edge weights used during PEM calculation.
PEM.build(): PEM Building

Calculates a PEM with parameters given by arguments a and psi.
PEM.updater(): PEM Update

Update a PEM with new parameters given by arguments a and psi.
PEM.fitSimple(): Fitting a PEM to Data while Estimating Global Steepness

Fits a PEM to a data set estimating the selection (steepness) parameter using gradient descent. The selection and evolution rate (psi = 1) are assumed to be homogeneous for the whole phylogenetic network.
PEM.forcedSimple(): Fitting a PEM to Data while Forcing Global Steepness

Fits a PEM to a data set forcing a user-provided selection (steepness) parameter. The selection and evolution rate (psi = 1) are assumed to be homogeneous for the whole phylogenetic network.
getGraphLocations(): Get Phylogenetic Graph Locations

Takes a phylogenetic tree and a list of species to be removed, and produce a phylogenic graph without these species together with the locations of the removed species on that graph (i.e., the location where the removed species would be found should they be inserted again in the phylogenetic graph).
getAncGraphLocations(): Get Ancestral Species Location

Get the location on the phylogenetic graph of the immediate ancestors for a list of species. The species of the list remain in the resulting phylogenetic graph. This function is useful for estimating the ancestral state of a trait.
Locations2PEMscores(): PEM Score Calculation

Calculates the scores of an extant or ancestral species on a phylogenetic eigenvector map (i.e., its value on the eigenvectors of the map) from its location on the phylogenetic graph used to build that map.

Author(s)

Guillaume Guénard [aut, cre] (ORCID: <https://orcid.org/0000-0003-0761-3072>), Pierre Legendre [ctb] (ORCID: <https://orcid.org/0000-0002-3838-3305>) – Maintainer: Guillaume Guénard <guillaume.guenard@gmail.com>

References

Guénard, G., Legendre, P., and Peres-Neto, P. 2013. Phylogenetic eigenvector maps: a framework to model and predict species traits. Methods in Ecology and Evolution. 4: 1120–1131

Makarenkov, V., Legendre, L. & Desdevise, Y. 2004. Modelling phylogenetic relationships using reticulated networks. Zoologica Scripta 33: 89–96

Blanchet, F. G., Legendre, P. & Borcard, D. 2008. Modelling directional spatial processes in ecological data. Ecological Modelling 215: 325–336

Examples

### Synthetic example

## This example describes the phyogeny of 7 species (A to G) in a tree with 6
## nodes, presented in Newick format, read by function
## read.tree of package ape.

t1 <- read.tree(text=paste(
            "(((A:0.15,B:0.2)N4:0.15,C:0.35)N2:0.25,((D:0.25,E:0.1)N5:0.3,",
            "(F:0.15,G:0.2)N6:0.3)N3:0.1)N1;",sep=""))
t1                 # Summary of the structure of the tree
summary(t1)

x <- Phylo2DirectedGraph(t1)

## Calculate the (binary) influence matrix; E1 to E12 are the tree edges
## Edge E12 comes from the tree origin
InflMat(x)
InflMat(x)[x$vertex$species,]

## Building phylogenetic eigenvector maps
PEM1 <- PEM.build(x)
PEM2 <- PEM.build(x, a = 0.2)
PEM3 <- PEM.build(x, a = 1)
PEM4 <- PEM.updater(PEM3,a=0.5)

## Print summary statistics about PEM1
print(PEM1)

## Extract the eigenvectors (species A--G, 6 eigenvectors)
as.data.frame(PEM4)

## Example of a made up set of trait values for the 7 species
y <- c(A=-1.1436265,B=-0.3186166,C=1.9364105,D=1.7164079,E=1.0013993,
       F=-1.8586351,G=-2.0236371)

## Estimate a single steepness parameter for the whole tree
PEMfs1 <- PEM.fitSimple(y=y, x=NULL, w=x, d="distance", sp="species",
                        lower=0, upper=1)
PEMfs1$optim       # Optimisation results

## Force neutral evolution over the whole tree
PEMfrc1 <- PEM.forcedSimple(y=y,x=NULL,w=x,d="distance",sp="species",a=0)
PEMfrc1$x$edge$a   # Steepness parameter forced on each individual edge

## Graph locations for target species X, Y, and Z not found in the original
## data set
tpAll <- read.tree(text=paste("((X:0.45,((A:0.15,B:0.2)N4:0.15,",
                              "(C:0.25,Z:0.2)NZ:0.1)N2:0.05)NX:0.2,",
                              "(((D:0.25,E:0.1)N5:0.05,Y:0.25)NY:0.25,",
                              "(F:0.15,G:0.2)N6:0.3)N3:0.1)N1;",sep=""))
tpAll
summary(tpAll)     # Summary of the structure of the tree

grloc <- getGraphLocations(tpAll, c("X","Y","Z"))
grloc

PEMfs2 <- PEM.fitSimple(y=y, x=NULL, w=grloc$x, d="distance", sp="species",
                        lower=0, upper=1)
PEMfs2

## Same as for PEMfs1$optim
PEMfs2$optim

## Get the PEM scores from the species graph locations:
PEMsc1 <- Locations2PEMscores(PEMfs2, grloc)
lm1 <- lm(y ~ V_2 + V_3 + V_5, data=PEMfs2)

## Making prdictions for the species in locations `grloc`
## using linear model `lm1`:
ypred <- predict(object=PEMfs2, targets=grloc, lmobject=lm1, interval="none")

## Removing species X, Y, and Z from the tree in `tpAll`:
tpModel <- drop.tip(tpAll, c("X","Y","Z"))

## Plot the results
layout(t(c(1,1,2)))
par(mar=c(6,2,2,0.5)+0.1)
plot(tpModel, show.tip.label=TRUE, show.node.label=TRUE, root.edge = TRUE,
     srt = 0, adj=0.5, label.offset=0.08, font=1, cex=1.5, xpd=TRUE)
edgelabels(paste("E", 1:nrow(tpModel$edge), sep=""),
           edge=1:nrow(tpModel$edge), bg="white", font=1, cex=1)
points(x=0.20,y=2.25,pch=21,bg="black")
lines(x=c(0.20,0.20,0.65), y=c(2.25,0.55,0.55), xpd=TRUE, lty=2)
text("X",x=0.69, y=0.55, xpd=TRUE, font=1, cex=1.5)
points(x=0.35, y=4.5,pch=21,bg="black")
lines(x=c(0.35,0.35,0.6), y=c(4.5,5.47,5.47), xpd=TRUE, lty=2)
text("Y", x=0.64, y=5.47, xpd=TRUE, font=1, cex=1.5)
points(x=0.35, y=3, pch=21, bg="black")
lines(x=c(0.35,0.35,0.55), y=c(3,3.5,3.5), xpd=TRUE, lty=2)
text("Z", x=0.59, y=3.5, xpd=TRUE, font=1, cex=1.5)
text(c("NX","NY","NZ"), x=c(0.20,0.35,0.35), y=c(2.25,4.5,3)+0.3*c(1,-1,-1),
     font=1, cex=1)
add.scale.bar(length=0.1, cex=1.25)
par(mar=c(3.75,0,2,2)+0.1)
plot(x=y, y=1:7, ylim=c(0.45,7), xlim=c(-4,4), axes=FALSE, type="n", xlab="")
axis(1, label=c("-4","-2","0","2","4"), at=c(-4,-2,0,2,4))
abline(v=0)

## Plot the observed values
points(x=y, y=1:7, xlim=c(-2,2), pch=21, bg="black")
text("B)", x=-3.5, y=7, cex=1.5, xpd=TRUE)
text("Trait value", x=0, y=-0.5, cex=1.25, xpd=TRUE)

## Plot the predicted values
points(x=ypred, y=c(0.5,5.5,3.5), pch=23, bg="white", cex=1.25)

## Estimate the ancestral trait values
ANCloc <- getAncGraphLocations(x)
PEMfsAnc <- PEM.fitSimple(y=y, x=NULL, w=ANCloc$x, d="distance",
                          sp="species", lower=0, upper=1)
PEMfsAnc$optim

## Get the PEM scores from the species graph locations:
PEManc1 <- Locations2PEMscores(PEMfsAnc, ANCloc)

## Making predictions for the ancestral species whose locations are found in
## `ANCloc` using the linear model `lm1`:
y_anc <- predict(object=PEMfsAnc, targets=ANCloc, lmobject=lm1,
                 interval="confidence")

MPSEM documentation built on June 8, 2025, 1:56 p.m.