eof: EOF (Empirical Orthogonal Functions analysis)

View source: R/eof.R

eofR Documentation

EOF (Empirical Orthogonal Functions analysis)

Description

This function conducts an Empirical Orthogonal Function (EOF) analysis via a covariance matrix (cov4gappy function) and is especially designed to handle gappy data (i.e. containing missing values - NaN)

Usage

eof(
  F1,
  centered = TRUE,
  scaled = FALSE,
  nu = NULL,
  method = NULL,
  recursive = FALSE
)

Arguments

F1

A data field. The data should be arraunged as samples in the column dimension (typically each column is a time series for a spatial location).

centered

Logical (TRUE/FALSE) to define if F1 should be centered prior to the analysis. Defaults to TRUE

scaled

Logical (TRUE/FALSE) to define if F1 should be scaled prior to the analysis. Defaults to TRUE

nu

Numeric value. Defines the number of EOFs to return. Defaults to return the full set of EOFs.

method

Method for matrix decomposition (svd, eigen, irlba, svds). Defaults to "svd" when method = NULL and "svds" when method = NULL and recursive = TRUE. Use of "svds" or "irlba" calculates a partial SVD, which is recommended when recursive = TRUE due to faster computation speed.

recursive

Logical. When TRUE, the function follows the method of "Recursively Subtracted Empirical Orthogonal Functions" (RSEOF) (Taylor et al. 2013). RSEOF is a modification of a least squares EOF approach for gappy data (LSEOF) (see von Storch and Zwiers 1999)

Details

Taylor et al. (2013) demonstrated that the RSEOF approach (i.e. recursive = TRUE) more accurately estimates EOFs from a gappy field than the traditional LSEOF method. Pre-treatment of gappy fields through in EOF interpolation (dineof) may provide the most accurate estimate of EOFs; however, RSEOF can be much faster in cases where the number of columns in F1 is smaller than the number of rows.

All methods should give identical results when recursive = TRUE, despite differences in calculation time. When recursive = FALSE, "svd" and "eigen" give similar results for non-gappy fields, but will differ with gappy fields due to decomposition of a nonpositive definite covariance matrix, as calculated with cov4gappy. Specifically, "eigen" will produce negative eigenvalues for trailing EOFs, while singular values derived from "svd" will be strictly positive. Faster computation time with "svds" over "irlba" may not result when recursive = TRUE due to the iterative computation of leading vectors only.

Value

Results of eof are returned as a list containing the following components:

u EOFs.
Lambda Singular values.
A EOF coefficients (i.e. 'Principal Components').
tot.var Total variance of field F1 (from cov4gappy).
F1_dim Dimensions of field F1.
F1_center Vector of center values from each column in field F1.
F1_scale Vector of scale values from each column in field F1.

References

Bjoernsson, H. and Venegas, S.A. (1997). "A manual for EOF and SVD analyses of climate data", McGill University, CCGCR Report No. 97-1, Montreal, Quebec, 52pp.

von Storch, H, Zwiers, F.W. (1999). Statistical analysis in climate research. Cambridge University Press.

Taylor, Marc H., Martin Losch, Manfred Wenzel, Jens Schroeter (2013). On the Sensitivity of Field Reconstruction and Prediction Using Empirical Orthogonal Functions Derived from Gappy Data. J. Climate, 26, 9194-9205. pdf

Examples


# EOF of 'iris' dataset
Et <- eof(iris[,1:4])
plot(Et$A, col=iris$Species)

# Compare to results of 'prcomp'
Pt <- prcomp(iris[,1:4])
SIGN <- diag(sign(diag(cor(Et$A, Pt$x)))) # correction for differing sign
matplot(Et$A %*% SIGN, Pt$x)
abline(0,1, col=8)

# Compare to a gappy dataset (sign of loadings may differ between methods)
iris.gappy <- as.matrix(iris[,1:4])
set.seed(1)
iris.gappy[sample(length(iris.gappy), 0.25*length(iris.gappy))] <- NaN
Eg <- eof(iris.gappy, method="svd", recursive=TRUE) # recursive ("RSEOF")
op <- par(mfrow=c(1,2))
plot(Et$A, col=iris$Species)
plot(Eg$A, col=iris$Species)
par(op)

# Compare Non-gappy vs. Gappy EOF loadings
op <- par(no.readonly = TRUE)
layout(matrix(c(1,2,1,3),2,2), widths=c(3,3), heights=c(1,4))
par(mar=c(0,0,0,0))
plot(1, t="n", axes=FALSE, ann=FALSE)
legend("center", ncol=4, legend=colnames(iris.gappy), border=1, bty="n", 
fill=rainbow(4))
par(mar=c(6,3,2,1))
barplot(Et$u, beside=TRUE, col=rainbow(4), ylim=range(Et$u)*c(1.15,1.15))
mtext("Non-gappy", side=3, line=0)
axis(1, labels=paste("EOF", 1:4), at=c(3, 8, 13, 18), las=2, tick=FALSE)
barplot(Eg$u, beside=TRUE, col=rainbow(4), ylim=range(Et$u)*c(1.15,1.15))
mtext("Gappy", side=3, line=0)
axis(1, labels=paste("EOF", 1:4), at=c(3, 8, 13, 18), las=2, tick=FALSE)
par(op)


### EOF of climate field example
library(maps) # required package to add map

# load data
data(sst)
names(sst)

# EOF
E <- eof(sst$field, nu=10)

# Plot of EOF loading and PC
eof.num <- 2 # EOF number to plot
op <- par(no.readonly=TRUE)
layout(matrix(c(1,3,2,3),nrow=2, ncol=2), widths=c(5,1), heights=c(3,3), respect=TRUE)
op <- par(ps=10, cex=1)
par(mar=c(4,4,1,1))
PAL <- colorPalette(c("blue", "cyan", "grey90", "yellow", "red"), c(10,1,1,10))
ZLIM <- c(-1,1)*max(abs(E$u[,eof.num]))
COL <- val2col(E$u[,eof.num], col=PAL(100), zlim=ZLIM)
plot(lat ~ lon, data=sst$grid, pch=22, bg=COL, col=COL, cex=2)
map("world", add=TRUE)
par(mar=c(4,0,1,4))
imageScale(E$u[,eof.num], col=PAL(100), zlim=ZLIM, axis.pos=4)
par(mar=c(4,4,1,4))
plot(sst$date, E$A[,eof.num], t="l", xlab="date", ylab="")
lines(loess.smooth(sst$date, E$A[,eof.num], span=1/3), col=rgb(0.5,0.5,1), lwd=2) # smoothed signal
abline(h=0, v=seq(as.Date("1000-01-01"), as.Date("2100-01-01"), by="10 years"), col=8, lty=3)
par(op)



marchtaylor/sinkr documentation built on July 4, 2022, 5:48 p.m.