ppca: Phylogenetic principal component analysis

View source: R/ppca.R

ppcaR Documentation

Phylogenetic principal component analysis

Description

These functions are designed to perform a phylogenetic principal component analysis (pPCA, Jombart et al. 2010) and to display the results.

Usage

ppca(
  x,
  prox = NULL,
  method = c("patristic", "nNodes", "oriAbouheif", "Abouheif", "sumDD"),
  f = function(x) {
     1/x
 },
  center = TRUE,
  scale = TRUE,
  scannf = TRUE,
  nfposi = 1,
  nfnega = 0
)

## S3 method for class 'ppca'
scatter(x, axes = 1:ncol(x$li), useLag = FALSE, ...)

## S3 method for class 'ppca'
print(x, ...)

## S3 method for class 'ppca'
summary(object, ..., printres = TRUE)

## S3 method for class 'ppca'
screeplot(x, ..., main = NULL)

## S3 method for class 'ppca'
plot(x, axes = 1:ncol(x$li), useLag = FALSE, ...)

Arguments

x

a phylo4d object (for ppca) or a ppca object (for other methods).

prox

a marix of phylogenetic proximities as returned by proxTips. If not provided, this matrix will be constructed using the arguments method and a.

method

a character string (full or abbreviated without ambiguity) specifying the method used to compute proximities; possible values are:
- patristic: (inversed sum of) branch lengths
- nNodes: (inversed) number of nodes on the path between the nodes
- oriAbouheif: original Abouheif's proximity, with diagonal (see details in proxTips)
- Abouheif: Abouheif's proximity (see details in proxTips)
- sumDD: (inversed) sum of direct descendants of all nodes on the path (see details in proxTips).

f

a function to change a distance into a proximity.

center

a logical indicating whether traits should be centred to mean zero (TRUE, default) or not (FALSE).

scale

a logical indicating whether traits should be scaled to unit variance (TRUE, default) or not (FALSE).

scannf

a logical stating whether eigenvalues should be chosen interactively (TRUE, default) or not (FALSE).

nfposi

an integer giving the number of positive eigenvalues retained ('global structures').

nfnega

an integer giving the number of negative eigenvalues retained ('local structures').

axes

the index of the principal components to be represented.

useLag

a logical stating whether the lagged components (x\$ls) should be used instead of the components (x\$li).

...

further arguments passed to other methods. Can be used to provide arguments to table.phylo4d in plot method.

object

a ppca object.

printres

a logical stating whether results should be printed on the screen (TRUE, default) or not (FALSE).

main

a title for the screeplot; if NULL, a default one is used.

Details

ppca performs the phylogenetic component analysis. Other functions are:

- print.ppca: prints the ppca content

- summary.ppca: provides useful information about a ppca object, including the decomposition of eigenvalues of all axes

- scatter.ppca: plot principal components using table.phylo4d

- screeplot.ppca: graphical display of the decomposition of pPCA eigenvalues

- plot.ppca: several graphics describing a ppca object

The phylogenetic Principal Component Analysis (pPCA, Jombart et al., 2010) is derived from the spatial Principal Component Analysis (spca, Jombart et al. 2008), implemented in the adegenet package (see spca).

pPCA is designed to investigate phylogenetic patterns a set of quantitative traits. The analysis returns principal components maximizing the product of variance of the scores and their phylogenetic autocorrelation (Moran's I), therefore reflecting life histories that are phylogenetically structured. Large positive and large negative eigenvalues correspond to global and local structures.

Value

The class ppca are given to lists with the following components:

eig

a numeric vector of eigenvalues.

nfposi

an integer giving the number of global structures retained.

nfnega

an integer giving the number of local structures retained.

c1

a data.frame of loadings of traits for each axis.

li

a data.frame of coordinates of taxa onto the ppca axes (i.e., principal components).

ls

a data.frame of lagged prinpal components; useful to represent of global scores.

as

a data.frame giving the coordinates of the axes of an 'ordinary' PCA onto the ppca axes.

call

the matched call.

tre

a phylogenetic tre with class phylo4.

prox

a matrix of phylogenetic proximities.

Other functions have different outputs:

- scatter.ppca returns the matched call.

Author(s)

Thibaut Jombart tjombart@imperial.ac.uk

References

Jombart, T.; Pavoine, S.; Dufour, A. & Pontier, D. (2010, in press) Exploring phylogeny as a source of ecological variation: a methodological approach. doi:10.1016/j.jtbi.2010.03.038

Jombart, T., Devillard, S., Dufour, A.-B. and Pontier, D. (2008) Revealing cryptic phylogenetic patterns in genetic variability by a new multivariate method. Heredity, 101, 92–103.

See Also

The implementation of spca in the adegenet package (adegenet)

Examples


data(lizards)

if(require(ape) && require(phylobase)){

#### ORIGINAL EXAMPLE FROM JOMBART ET AL 2010 ####


## BUILD A TREE AND A PHYLO4D OBJECT
liz.tre <- read.tree(tex=lizards$hprA)
liz.4d <- phylo4d(liz.tre, lizards$traits)
oldpar <- par(mar=rep(.1,4))
table.phylo4d(liz.4d,var.lab=c(names(lizards$traits),
   "ACP 1\n(\"size effect\")"),show.node=FALSE, cex.lab=1.2)


## REMOVE DUPLICATED POPULATIONS
liz.4d <- prune(liz.4d, c(7,14))
table.phylo4d(liz.4d)


## CORRECT LABELS
lab <- c("Pa", "Ph", "Ll", "Lmca", "Lmcy", "Phha", "Pha",
   "Pb", "Pm", "Ae", "Tt", "Ts", "Lviv", "La", "Ls", "Lvir")
tipLabels(liz.4d) <- lab


## REMOVE SIZE EFFECT
dat <- tdata(liz.4d, type="tip")
dat <- log(dat)
newdat <- data.frame(lapply(dat, function(v) residuals(lm(v~dat$mean.L))))
rownames(newdat) <- rownames(dat)
tdata(liz.4d, type="tip") <- newdat[,-1] # replace data in the phylo4d object


## pPCA
liz.ppca <- ppca(liz.4d,scale=FALSE,scannf=FALSE,nfposi=1,nfnega=1, method="Abouheif")
liz.ppca
tempcol <- rep("grey",7)
tempcol[c(1,7)] <- "black"
barplot(liz.ppca$eig,main='pPCA eigenvalues',cex.main=1.8,col=tempcol)

par(mar=rep(.1,4))
plot(liz.ppca,ratio.tree=.7)


## CONTRIBUTIONS TO PC (LOADINGS) (viewed as dotcharts)
dotchart(liz.ppca$c1[,1],lab=rownames(liz.ppca$c1),main="Global principal
component 1")
abline(v=0,lty=2)

dotchart(liz.ppca$c1[,2],lab=rownames(liz.ppca$c1),main="Local principal
component 1")
abline(v=0,lty=2)


## REPRODUCE FIGURES FROM THE PAPER
obj.ppca <- liz.4d
tdata(obj.ppca, type="tip") <- liz.ppca$li
myLab <- paste(" ",rownames(liz.ppca$li), sep="")

## FIGURE 1
par(mar=c(.1,2.4,2.1,1))
table.phylo4d(obj.ppca, ratio=.7, var.lab=c("1st global PC", "1st local
   PC"), tip.label=myLab,box=FALSE,cex.lab=1.4, cex.sym=1.2, show.node.label=TRUE)
add.scatter.eig(liz.ppca$eig,1,1,1,csub=1.2, posi="topleft", ratio=.23)


## FIGURE 2
s.arrow(liz.ppca$c1,xlim=c(-1,1),clab=1.3,cgrid=1.3)



#### ANOTHER EXAMPLE - INCLUDING NA REPLACEMENT ####
## LOAD THE DATA
data(maples)
tre <- read.tree(text=maples$tre)
x <- phylo4d(tre, maples$tab)
omar <- par("mar")
par(mar=rep(.1,4))
table.phylo4d(x, cex.lab=.5, cex.sym=.6, ratio=.1) # note NAs in last trait ('x')

## FUNCTION TO REPLACE NAS
f1 <- function(vec){
if(any(is.na(vec))){
m <- mean(vec, na.rm=TRUE)
vec[is.na(vec)] <- m
}
return(vec)
}


## PERFORM THE PPCA
dat <- apply(maples$tab,2,f1) # replace NAs
x.noNA <- phylo4d(tre, as.data.frame(dat))
map.ppca <- ppca(x.noNA, scannf=FALSE, method="Abouheif")
map.ppca


## SOME GRAPHICS
screeplot(map.ppca)
scatter(map.ppca, useLag=TRUE)
plot(map.ppca, useLag=TRUE)


## MOST STRUCTURED TRAITS
a <- map.ppca$c1[,1] # loadings on PC 1
names(a) <- row.names(map.ppca$c1)
highContrib <- a[a< quantile(a,0.1) | a>quantile(a,0.9)]
datSel <- cbind.data.frame(dat[, names(highContrib)], map.ppca$li)
temp <- phylo4d(tre, datSel)
table.phylo4d(temp) # plot of most structured traits


## PHYLOGENETIC AUTOCORRELATION TESTS FOR THESE TRAITS
prox <- proxTips(tre, method="Abouheif")
abouheif.moran(dat[, names(highContrib)], prox)
par(oldpar)

}


adephylo documentation built on Oct. 6, 2023, 5:07 p.m.