spatpca: Regularized PCA for spatial data

Description Usage Arguments Details Value Author(s) References See Also Examples

View source: R/SpatPCA.R

Description

Produce spatial dominant patterns and spatial predictions at the designated locations according to the specified tuning parameters or the selected tuning parameters by the M-fold cross-validation.

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
spatpca(
  x,
  Y,
  M = 5,
  K = NULL,
  is_K_selected = ifelse(is.null(K), TRUE, FALSE),
  tau1 = NULL,
  tau2 = NULL,
  gamma = NULL,
  is_Y_detrended = FALSE,
  maxit = 100,
  thr = 1e-04,
  num_cores = NULL
)

Arguments

x

Location matrix (p \times d). Each row is a location. d is the dimension of locations

Y

Data matrix (n \times p) stores the values at p locations with sample size n.

M

Optional number of folds for cross validation; default is 5.

K

Optional user-supplied number of eigenfunctions; default is NULL. If K is NULL or is_K_selected is TRUE, K is selected automatically.

is_K_selected

If TRUE, K is selected automatically; otherwise, is_K_selected is set to be user-supplied K. Default depends on user-supplied K.

tau1

Optional user-supplied numeric vector of a nonnegative smoothness parameter sequence. If NULL, 10 tau1 values in a range are used.

tau2

Optional user-supplied numeric vector of a nonnegative sparseness parameter sequence. If NULL, none of tau2 is used.

gamma

Optional user-supplied numeric vector of a nonnegative tuning parameter sequence. If NULL, 10 values in a range are used.

is_Y_detrended

If TRUE, center the columns of Y. Default is FALSE.

maxit

Maximum number of iterations. Default value is 100.

thr

Threshold for convergence. Default value is 10^{-4}.

num_cores

Number of cores used to parallel computing. Default value is NULL (See RcppParallel::defaultNumThreads())

Details

An ADMM form of the proposed objective function is written as

\min_{\mathbf{Φ}} \|\mathbf{Y}-\mathbf{Y}\mathbf{Φ}\mathbf{Φ}'\|^2_F +τ_1\mbox{tr}(\mathbf{Φ}^T\mathbf{Ω}\mathbf{Φ})+τ_2∑_{k=1}^K∑_{j=1}^p |φ_{jk}|,

\mbox{subject to $ \mathbf{Φ}^T\mathbf{Φ}=\mathbf{I}_K$,} where \mathbf{Y} is a data matrix, {\mathbf{Ω}} is a smoothness matrix, and \mathbf{Φ}=\{φ_{jk}\}.

Value

A list of objects including

eigenfn

Estimated eigenfunctions at the new locations, x_new.

selected_K

Selected K based on CV. Execute the algorithm when is_K_selected is TRUE.

selected_tau1

Selected tau1.

selected_tau2

Selected tau2.

selected_gamma

Selected gamma.

cv_score_tau1

cv scores for tau1.

cv_score_tau2

cv scores for tau2.

cv_score_gamma

cv scores for gamma.

tau1

Sequence of tau1-values used in the process.

tau2

Sequence of tau2-values used in the process.

gamma

Sequence of gamma-values used in the process.

detrended_Y

If is_Y_detrended is TRUE, detrended_Y means Y is detrended; else, detrended_Y is equal to Y.

scaled_x

Input location matrix. Only scale when it is one-dimensional

Author(s)

Wen-Ting Wang and Hsin-Cheng Huang

References

Wang, W.-T. and Huang, H.-C. (2017). Regularized principal component analysis for spatial data. Journal of Computational and Graphical Statistics 26 14-25.

See Also

predict

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
# The following examples only use two threads for parallel computing.
## 1D: regular locations
x_1D <- as.matrix(seq(-5, 5, length = 50))
Phi_1D <- exp(-x_1D^2) / norm(exp(-x_1D^2), "F")
set.seed(1234)
Y_1D <- rnorm(n = 100, sd = 3) %*% t(Phi_1D) + matrix(rnorm(n = 100 * 50), 100, 50)
cv_1D <- spatpca(x = x_1D, Y = Y_1D, num_cores = 2)
plot(x_1D, cv_1D$eigenfn[, 1], type = "l", main = "1st eigenfunction")
lines(x_1D, svd(Y_1D)$v[, 1], col = "red")
legend("topleft", c("SpatPCA", "PCA"), lty = 1:1, col = 1:2)


## 2D: Daily 8-hour ozone averages for sites in the Midwest (USA)
library(fields)
library(pracma)
library(maps)
data(ozone2)
x <- ozone2$lon.lat
Y <- ozone2$y
date <- as.Date(ozone2$date, format = "%y%m%d")
rmna <- !colSums(is.na(Y))
YY <- matrix(Y[, rmna], nrow = nrow(Y))
YY <- detrend(YY, "linear")
xx <- x[rmna, ]
cv <- spatpca(x = xx, Y = YY)
quilt.plot(xx, cv$eigenfn[, 1])
map("state", xlim = range(xx[, 1]), ylim = range(xx[, 2]), add = TRUE)
map.text("state", xlim = range(xx[, 1]), ylim = range(xx[, 2]), cex = 2, add = TRUE)
plot(date, YY %*% cv$eigenfn[, 1], type = "l", ylab = "1st Principal Component")
### new loactions
new_p <- 200
x_lon <- seq(min(xx[, 1]), max(xx[, 1]), length = new_p)
x_lat <- seq(min(xx[, 2]), max(xx[, 2]), length = new_p)
xx_new <- as.matrix(expand.grid(x = x_lon, y = x_lat))
eof <- spatpca(x = xx,
               Y = YY,
               K = cv$selected_K, 
               tau1 = cv$selected_tau1, 
               tau2 = cv$selected_tau2)
predicted_eof <- predictEigenfunction(eof, xx_new)              
quilt.plot(xx_new,
           predicted_eof[,1],
           nx = new_p, 
           ny = new_p, 
           xlab = "lon.", 
           ylab = "lat.")
map("state", xlim = range(x_lon), ylim = range(x_lat), add = TRUE)
map.text("state", xlim = range(x_lon), ylim = range(x_lat), cex = 2, add = TRUE)
## 3D: regular locations
p <- 10
x <- y <- z <- as.matrix(seq(-5, 5, length = p))
d <- expand.grid(x, y, z)
Phi_3D <- rowSums(exp(-d^2)) / norm(as.matrix(rowSums(exp(-d^2))), "F")
Y_3D <- rnorm(n = 100, sd = 3) %*% t(Phi_3D) + matrix(rnorm(n = 100 * p^3), 100, p^3)
cv_3D <- spatpca(x = d, Y = Y_3D, tau2 = seq(0, 1000, length = 10))
library(plot3D)
library(RColorBrewer)
cols <- colorRampPalette(brewer.pal(9, "Blues"))(p)
isosurf3D(x, y, z,
         colvar = array(cv_3D$eigenfn[, 1], c(p, p, p)),
         level= seq(min(cv_3D$eigenfn[, 1]), max(cv_3D$eigenfn[, 1]), length = p),
         ticktype = "detailed",
         colkey = list(side = 1),
         col = cols)

SpatPCA documentation built on Jan. 31, 2021, 5:05 p.m.