spca: Spatial principal component analysis


Description

These functions implement the spatial principal component analysis (sPCA). The function spca is a generic with methods for matrix, data.frame, genind and genpop objects (see Usage).

The core computation uses multispati from the ade4 package.

Besides the set of spca functions, other functions include the print, summary, plot, screeplot and colorplot methods for spca objects (see Usage).

A tutorial on sPCA can be opened using:
adegenetTutorial(which="spca").

Usage

spca(...)

## Default S3 method:
spca(x, ...)

## S3 method for class 'matrix'
spca(x, xy = NULL, cn = NULL, matWeight = NULL,
            center = TRUE, scale = FALSE, scannf = TRUE,
            nfposi = 1, nfnega = 1,
            type = NULL, ask = TRUE,
            plot.nb = TRUE, edit.nb = FALSE,
            truenames = TRUE,
            d1 = NULL, d2 = NULL, k = NULL,
            a = NULL, dmin = NULL, ...)

## S3 method for class 'data.frame'
spca(x, xy = NULL, cn = NULL, matWeight = NULL,
            center = TRUE, scale = FALSE, scannf = TRUE,
            nfposi = 1, nfnega = 1,
            type = NULL, ask = TRUE,
            plot.nb = TRUE, edit.nb = FALSE,
            truenames = TRUE,
            d1 = NULL, d2 = NULL, k = NULL,
            a = NULL, dmin = NULL, ...)

## S3 method for class 'genind'
spca(obj, xy = NULL, cn = NULL, matWeight = NULL,
            scale = FALSE, scannf = TRUE,
            nfposi = 1, nfnega = 1,
            type = NULL, ask = TRUE,
            plot.nb = TRUE, edit.nb = FALSE,
            truenames = TRUE,
            d1 = NULL, d2 = NULL, k = NULL,
            a = NULL, dmin = NULL, ...)

## S3 method for class 'genpop'
spca(obj, xy = NULL, cn = NULL, matWeight = NULL,
            scale = FALSE, scannf = TRUE,
            nfposi = 1, nfnega = 1,
            type = NULL, ask = TRUE,
            plot.nb = TRUE, edit.nb = FALSE,
            truenames = TRUE,
            d1 = NULL, d2 = NULL, k = NULL,
            a = NULL, dmin = NULL, ...)


## S3 method for class 'spca'
print(x, ...)

## S3 method for class 'spca'
summary(object, ..., printres=TRUE)

## S3 method for class 'spca'
plot(x, axis = 1, useLag=FALSE, ...)

## S3 method for class 'spca'
screeplot(x, ..., main=NULL)

## S3 method for class 'spca'
colorplot(x, axes=1:ncol(x$li), useLag=FALSE, ...)

Arguments

x

a matrix or a data.frame of numeric values, with individuals in rows and variables in columns; categorical variables with a binary coding are acceptable too; for print and plotting functions, a spca object.

obj

a genind or genpop object.

xy

a matrix or data.frame with two columns for x and y coordinates. If xy is not provided, it is taken from obj$other$xy when that element exists. Can be NULL if a nb object is provided in cn.
Longitude/latitude coordinates should first be converted using a suitable projection (see 'See Also' section).

cn

a connection network of the class 'nb' (package spdep). Can be NULL if xy is provided. Can be easily obtained using the function chooseCN (see details).

matWeight

a square matrix of spatial weights, indicating the spatial proximities between entities. If provided, this argument prevails over cn (see details).

center

a logical indicating whether data should be centred to a mean of zero; used implicitly for genind or genpop objects.

scale

a logical indicating whether data should be scaled to unit variance (TRUE) or not (FALSE, default).

scannf

a logical stating whether eigenvalues should be chosen interactively (TRUE, default) or not (FALSE).

nfposi

an integer giving the number of positive eigenvalues retained ('global structures').

nfnega

an integer giving the number of negative eigenvalues retained ('local structures').

type

an integer giving the type of graph (see details in chooseCN help page). If provided, ask is set to FALSE.

ask

a logical stating whether the graph should be chosen interactively (TRUE, default) or not (FALSE).

plot.nb

a logical stating whether the resulting graph should be plotted (TRUE, default) or not (FALSE).

edit.nb

a logical stating whether the resulting graph should be edited manually for corrections (TRUE) or not (FALSE, default).

truenames

a logical stating whether true names should be used for 'obj' (TRUE, default) instead of generic labels (FALSE).

d1

the minimum distance between any two neighbours. Used if type=5.

d2

the maximum distance between any two neighbours. Used if type=5.

k

the number of neighbours per point. Used if type=6.

a

the exponent of the inverse distance matrix. Used if type=7.

dmin

the minimum distance between any two distinct points. Used to avoid infinite spatial proximities (defined as the inverse of spatial distances). Used if type=7.

object

a spca object.

printres

a logical stating whether results should be printed on the screen (TRUE, default) or not (FALSE).

axis

an integer between 1 and (nfposi+nfnega) indicating which axis should be plotted.

main

a title for the screeplot; if NULL, a default one is used.

...

further arguments passed to other methods.

axes

the indices of the columns of x$li (or of x$ls if useLag is TRUE) to be represented. Up to three axes can be chosen.

useLag

a logical stating whether the lagged components (x$ls) should be used instead of the components (x$li).

Details

The spatial principal component analysis (sPCA) is designed to investigate spatial patterns in genetic variability. Given multilocus genotypes (individual level) or allelic frequencies (population level) and spatial coordinates, it finds individual (or population) scores maximizing the product of variance and spatial autocorrelation (Moran's I). Large positive and large negative eigenvalues correspond to global and local structures, respectively.
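
The criterion described above can be illustrated with a small base R sketch. The helper names moran_i and spca_criterion and the toy data are hypothetical and are not part of adegenet; the sketch assumes row-standardised spatial weights and a variance computed with 1/n weights, and it only evaluates the criterion for given scores rather than maximizing it.

```r
# Toy illustration only; moran_i() and spca_criterion() are hypothetical helpers.
# Five points on a line, each connected to its adjacent points.
n <- 5
W <- matrix(0, n, n)
for (i in seq_len(n - 1)) {
  W[i, i + 1] <- 1
  W[i + 1, i] <- 1
}
W <- W / rowSums(W)  # row-standardise: each row sums to 1

# Moran's I of a score vector u, assuming row-standardised weights W
moran_i <- function(u, W) {
  u <- u - mean(u)
  as.numeric(crossprod(u, W %*% u) / crossprod(u))
}

# Product of variance (1/n weights) and Moran's I
spca_criterion <- function(u, W) {
  u_c <- u - mean(u)
  (sum(u_c^2) / length(u)) * moran_i(u, W)
}

smooth_scores <- c(1, 2, 3, 4, 5)    # neighbours resemble each other
alternating   <- c(1, -1, 1, -1, 1)  # neighbours differ strongly

moran_i(smooth_scores, W)        # positive autocorrelation ("global" pattern)
moran_i(alternating, W)          # negative autocorrelation ("local" pattern)
spca_criterion(smooth_scores, W)
spca_criterion(alternating, W)
```

In this toy case the smooth gradient gives a positive product and the alternating pattern a negative one, which matches the reading of positive and negative eigenvalues as global and local structures.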

Spatial weights can be obtained in several ways, depending on how the arguments xy, cn, and matWeight are set.
When several acceptable ways are used at the same time, priority is as follows:
matWeight > cn > xy
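
As an illustration of the matWeight route, here is a minimal base R sketch of an inverse-distance weight matrix. The helper inv_dist_weights and the coordinates are hypothetical. Flooring distances at dmin and raising them to the exponent a follows the descriptions of the arguments a and dmin, but this is not necessarily how chooseCN computes its type 7 weights internally.

```r
# Hypothetical coordinates for four sampling sites; sites 2 and 3 are co-located
xy <- cbind(x = c(0, 1, 1, 5), y = c(0, 0, 0, 0))

# Hypothetical helper: inverse-distance proximities with exponent a,
# flooring small distances at dmin so that no proximity is infinite
inv_dist_weights <- function(xy, a = 1, dmin = 0.5) {
  d <- as.matrix(dist(xy))  # pairwise Euclidean distances
  d[d < dmin] <- dmin       # co-located points get distance dmin instead of 0
  w <- 1 / d^a              # proximities decrease with distance
  diag(w) <- 0              # a site is not its own neighbour
  w
}

mw <- inv_dist_weights(xy, a = 2, dmin = 0.5)
mw
```

A square matrix such as mw could then be supplied as matWeight = mw in a call to spca, in which case it would prevail over cn and xy.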

Value

Objects of class spca are lists with the following components:

eig

a numeric vector of eigenvalues.

nfposi

an integer giving the number of global structures retained.

nfnega

an integer giving the number of local structures retained.

c1

a data.frame of allele loadings for each axis.

li

a data.frame of row (individuals or populations) coordinates onto the sPCA axes.

ls

a data.frame of lag vectors of the row coordinates; useful to clarify maps of global scores.

as

a data.frame giving the coordinates of the PCA axes onto the sPCA axes.

call

the matched call.

xy

a matrix of spatial coordinates.

lw

a list of spatial weights of class listw.

Other functions have different outputs:
- summary.spca returns a list with 3 components: Istat, giving the null, minimum and maximum Moran's I values; pca, giving variance and I statistics for the principal component analysis; and spca, giving variance and I statistics for the sPCA.

- plot.spca returns the matched call.

- screeplot.spca returns the matched call.
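
The components listed above can be inspected as in the following sketch. It assumes that adegenet and the packages it needs for a type 1 graph are installed, and it reuses the non-interactive call shown for dat2A in the spcaIllus example (with the plot.nb argument written out in full). The object names res and s are arbitrary.

```r
library(adegenet)

data(spcaIllus)
dat2A <- spcaIllus$dat2A  # one of the simulated genind objects

# Non-interactive sPCA: graph type 1, one global axis, no local axis
res <- spca(dat2A, xy = dat2A$other$xy, ask = FALSE, type = 1,
            plot.nb = FALSE, scannf = FALSE, nfposi = 1, nfnega = 0)

head(res$eig)  # eigenvalues
head(res$li)   # scores of individuals on the retained axis
head(res$ls)   # lagged scores

# Summary statistics without printing them
s <- summary(res, printres = FALSE)
names(s)
s$Istat
```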

Author(s)

Thibaut Jombart t.jombart@imperial.ac.uk

References

Jombart, T., Devillard, S., Dufour, A.-B. and Pontier, D. (2008) Revealing cryptic spatial patterns in genetic variability by a new multivariate method. Heredity, 101, 92–103.

Wartenberg, D. E. (1985) Multivariate spatial correlation: a method for exploratory geographical analysis. Geographical Analysis, 17, 263–283.

Moran, P.A.P. (1948) The interpretation of statistical maps. Journal of the Royal Statistical Society, B 10, 243–251.

Moran, P.A.P. (1950) Notes on continuous stochastic phenomena. Biometrika, 37, 17–23.

de Jong, P., Sprenger, C. and van Veen, F. (1984) On extreme values of Moran's I and Geary's c. Geographical Analysis, 16, 17–24.

See Also

spcaIllus and rupica for datasets illustrating the sPCA
global.rtest and local.rtest
chooseCN, multispati, multispati.randtest
convUL, from the package 'PBSmapping' to convert longitude/latitude to UTM coordinates.

Examples

## data(spcaIllus) illustrates the sPCA
## see ?spcaIllus
##
## Not run: 
example(spcaIllus)
example(rupica)

## End(Not run)

Example output

Loading required package: ade4

   /// adegenet 2.0.1 is loaded ////////////

   > overview: '?adegenet'
   > tutorials/doc/questions: 'adegenetWeb()' 
   > bug reports/feature requests: adegenetIssues()



spcIll> data(spcaIllus)

spcIll> attach(spcaIllus)

spcIll> opar <- par(no.readonly=TRUE)

spcIll> ## comparison PCA vs sPCA
spcIll> 
spcIll> # PCA
spcIll> pca2A <- dudi.pca(dat2A$tab,center=TRUE,scale=FALSE,scannf=FALSE)

spcIll> pca2B <- dudi.pca(dat2B$tab,center=TRUE,scale=FALSE,scannf=FALSE)

spcIll> pca2C <- dudi.pca(dat2C$tab,center=TRUE,scale=FALSE,scannf=FALSE)

spcIll> pca3 <- dudi.pca(dat3$tab,center=TRUE,scale=FALSE,scannf=FALSE,nf=2)

spcIll> pca4 <- dudi.pca(dat4$tab,center=TRUE,scale=FALSE,scannf=FALSE,nf=2)

spcIll> # sPCA
spcIll> spca2A <-spca(dat2A,xy=dat2A$other$xy,ask=FALSE,type=1,
spcIll+ plot=FALSE,scannf=FALSE,nfposi=1,nfnega=0)

     PLEASE NOTE:  The components "delsgs" and "summary" of the
 object returned by deldir() are now DATA FRAMES rather than
 matrices (as they were prior to release 0.0-18).
 See help("deldir").
 
     PLEASE NOTE: The process that deldir() uses for determining
 duplicated points has changed from that used in version
 0.0-9 of this package (and previously). See help("deldir").



spcIll> spca2B <- spca(dat2B,xy=dat2B$other$xy,ask=FALSE,type=1,
spcIll+ plot=FALSE,scannf=FALSE,nfposi=1,nfnega=0)

spcIll> spca2C <- spca(dat2C,xy=dat2C$other$xy,ask=FALSE,
spcIll+ type=1,plot=FALSE,scannf=FALSE,nfposi=0,nfnega=1)

spcIll> spca3 <- spca(dat3,xy=dat3$other$xy,ask=FALSE,
spcIll+ type=1,plot=FALSE,scannf=FALSE,nfposi=1,nfnega=1)

spcIll> spca4 <- spca(dat4,xy=dat4$other$xy,ask=FALSE,
spcIll+ type=1,plot=FALSE,scannf=FALSE,nfposi=1,nfnega=1)

spcIll> # an auxiliary function for graphics
spcIll> plotaux <- function(x,analysis,axis=1,lab=NULL,...){
spcIll+ neig <- NULL
spcIll+ if(inherits(analysis,"spca")) neig <- nb2neig(analysis$lw$neighbours)
spcIll+ xrange <- range(x$other$xy[,1])
spcIll+ xlim <- xrange + c(-diff(xrange)*.1 , diff(xrange)*.45)
spcIll+ yrange <- range(x$other$xy[,2])
spcIll+ ylim <- yrange + c(-diff(yrange)*.45 , diff(yrange)*.1)
spcIll+ 
spcIll+ s.value(x$other$xy,analysis$li[,axis],include.ori=FALSE,addaxes=FALSE,
spcIll+ cgrid=0,grid=FALSE,neig=neig,cleg=0,xlim=xlim,ylim=ylim,...)
spcIll+ 
spcIll+ par(mar=rep(.1,4))
spcIll+ if(is.null(lab)) lab = gsub("[P]","",x$pop)
spcIll+ text(x$other$xy, lab=lab, col="blue", cex=1.2, font=2)
spcIll+ add.scatter({barplot(analysis$eig,col="grey");box();
spcIll+ title("Eigenvalues",line=-1)},posi="bottomright",ratio=.3)
spcIll+ }

spcIll> # plots
spcIll> plotaux(dat2A,pca2A,sub="dat2A - PCA",pos="bottomleft",csub=2)

spcIll> plotaux(dat2A,spca2A,sub="dat2A - sPCA glob1",pos="bottomleft",csub=2)

spcIll> plotaux(dat2B,pca2B,sub="dat2B - PCA",pos="bottomleft",csub=2)

spcIll> plotaux(dat2B,spca2B,sub="dat2B - sPCA glob1",pos="bottomleft",csub=2)

spcIll> plotaux(dat2C,pca2C,sub="dat2C - PCA",pos="bottomleft",csub=2)

spcIll> plotaux(dat2C,spca2C,sub="dat2C - sPCA loc1",pos="bottomleft",csub=2,axis=2)

spcIll> par(mfrow=c(2,2))

spcIll> plotaux(dat3,pca3,sub="dat3 - PCA axis1",pos="bottomleft",csub=2)

spcIll> plotaux(dat3,spca3,sub="dat3 - sPCA glob1",pos="bottomleft",csub=2)

spcIll> plotaux(dat3,pca3,sub="dat3 - PCA axis2",pos="bottomleft",csub=2,axis=2)

spcIll> plotaux(dat3,spca3,sub="dat3 - sPCA loc1",pos="bottomleft",csub=2,axis=2)

spcIll> plotaux(dat4,pca4,lab=dat4$other$sup.pop,sub="dat4 - PCA axis1",
spcIll+ pos="bottomleft",csub=2)

spcIll> plotaux(dat4,spca4,lab=dat4$other$sup.pop,sub="dat4 - sPCA glob1",
spcIll+ pos="bottomleft",csub=2)

spcIll> plotaux(dat4,pca4,lab=dat4$other$sup.pop,sub="dat4 - PCA axis2",
spcIll+ pos="bottomleft",csub=2,axis=2)

spcIll> plotaux(dat4,spca4,lab=dat4$other$sup.pop,sub="dat4 - sPCA loc1",
spcIll+ pos="bottomleft",csub=2,axis=2)

spcIll> # color plot
spcIll> par(opar)

spcIll> colorplot(spca3, cex=4, main="colorplot sPCA dat3")

spcIll> text(spca3$xy[,1], spca3$xy[,2], dat3$pop)

spcIll> colorplot(spca4, cex=4, main="colorplot sPCA dat4")

spcIll> text(spca4$xy[,1], spca4$xy[,2], dat4$other$sup.pop)

spcIll> # detach data
spcIll> detach(spcaIllus)

rupica> data(rupica)

rupica> rupica
/// GENIND OBJECT /////////

 // 335 individuals; 9 loci; 55 alleles; size: 217.2 Kb

 // Basic content
   @tab:  335 x 55 matrix of allele counts
   @loc.n.all: number of alleles per locus (range: 4-10)
   @loc.fac: locus factor for the 55 columns of @tab
   @all.names: list of allele names for each locus
   @ploidy: ploidy of each individual  (range: 2-2)
   @type:  codom
   @call: NULL

 // Optional content
   @other: a list containing: xy  mnt  showBauges 


rupica> ## Not run: 
rupica> ##D if(require(adehabitat)){
rupica> ##D 
rupica> ##D ## see the sampling area
rupica> ##D showBauges <- rupica$other$showBauges
rupica> ##D showBauges()
rupica> ##D points(rupica$other$xy,col="red")
rupica> ##D 
rupica> ##D ## perform a sPCA
rupica> ##D spca1 <- spca(rupica,type=5,d1=0,d2=2300,plot=FALSE,scannf=FALSE,nfposi=2,nfnega=0)
rupica> ##D barplot(spca1$eig,col=rep(c("black","grey"),c(2,100)),main="sPCA eigenvalues")
rupica> ##D screeplot(spca1,main="sPCA eigenvalues: decomposition")
rupica> ##D 
rupica> ##D ## data visualization
rupica> ##D showBauges(,labcex=1)
rupica> ##D s.value(spca1$xy,spca1$ls[,1],add.p=TRUE,csize=.5)
rupica> ##D add.scatter.eig(spca1$eig,1,1,1,posi="topleft",sub="Eigenvalues")
rupica> ##D 
rupica> ##D showBauges(,labcex=1)
rupica> ##D s.value(spca1$xy,spca1$ls[,2],add.p=TRUE,csize=.5)
rupica> ##D add.scatter.eig(spca1$eig,2,2,2,posi="topleft",sub="Eigenvalues")
rupica> ##D 
rupica> ##D rupica$other$showBauges()
rupica> ##D colorplot(spca1$xy,spca1$li,cex=1.5,add.plot=TRUE)
rupica> ##D 
rupica> ##D ## global and local tests
rupica> ##D Gtest <- global.rtest(rupica@tab,spca1$lw,nperm=999)
rupica> ##D Gtest
rupica> ##D plot(Gtest)
rupica> ##D Ltest <- local.rtest(rupica@tab,spca1$lw,nperm=999)
rupica> ##D Ltest
rupica> ##D plot(Ltest)
rupica> ##D }
rupica> ## End(Not run)

adegenet documentation built on Oct. 10, 2021, 1:09 a.m.