robfpca.partial: Robust functional principal component analysis for partially...

View source: R/robfpca.partial.R

robfpca.partialR Documentation

Robust functional principal component analysis for partially observed functional data

Description

Robust functional principal component analysis (FPCA) for partially observed functional data is performed based on the pairwise robust covariance function estimation and eigenanalysis.

Usage

robfpca.partial(
  X,
  type = c("huber", "bisquare", "tdist"),
  grid = NULL,
  K = NULL,
  PVE = 0.99,
  MM = TRUE,
  smooth = TRUE,
  bw = NULL,
  cv = FALSE,
  df = 3,
  cv_optns = list(bw_cand = NULL, K = 5, ncores = 1)
)

Arguments

X

a n x p matrix. It allows NA.

type

the option for robust dispersion estimator. "huber", "bisquare", and "tdist" are supported.

grid

a vector containing the observed timepoints

K

a number of FPCs. If K is NULL, K is selected by PVE.

PVE

a proportion of variance explained

MM

the option for M-scale estimator in GK identity. If it is FALSE, the same method using type is used, that is the iterative algorithm. The closed form solution using method of moments can be used when MM == TRUE. Defalut is TRUE.

smooth

If it is TRUE, bivariate Nadaraya-Watson smoothing is performed using fields::smooth2d(). Default is TRUE.

bw

a bandwidth when smooth = TRUE.

cv

If it is TRUE, K-fold cross-validation is performed for the bandwidth selection when smooth = TRUE.

df

the degrees of freedm when type = "tdist".

cv_optns

the options of K-fold cross-validation when cv = TRUE. See Details.

Details

The location and scale functions are computed via pointwise M-estimator, and the covariance function is obtained via robust pairwise computation based on Orthogonalized Gnanadesikan-Kettenring (OGK) estimation. Additionally, bivariate Nadaraya-Watson smoothing is applied for smoothed covariance surfaces. To deal with the missing segments, FPCA is performed via PACE (Principal Analysis via Conditional Expectation).

The options of cv_optns:

bw_cand

a vector contains the candidates of bandwidths for bivariate smoothing.

K

the number of folds for K-fold cross validation.

ncores

the number of cores on foreach for parallel computing.

Value

a list contatining as follows:

data

a matrix which is the input X

lambda

the first K eigenvalues

eig.fun

the first K eigenvectors

pc.score

the first K FPC scores

K

a number of FPCs

PVE

a proportion of variance explained

work.grid

a work grid

eig.obj

an object of the eigenanalysis

mu

a mean function

cov

a covariance function

sig2

a noise variance

cov.obj

the object from cov_ogk(). See ?cov_ogk().

References

Park, Y., Kim, H., & Lim, Y. (2022+). Functional principal component analysis for partially observed elliptical process, Under review.

Maronna, R. A., & Zamar, R. H. (2002). Robust estimates of location and dispersion for high-dimensional datasets. Technometrics, 44(4), 307-317.

Yao, F., Müller, H. G., & Wang, J. L. (2005). Functional data analysis for sparse longitudinal data. Journal of the American statistical association, 100(470), 577-590.

Examples

### Generate example data
set.seed(100)
x.list <- sim_delaigle(n = 100,
                       type = "partial",
                       out.prop = 0.2,
                       out.type = 1,
                       dist = "normal")
x <- list2matrix(x.list)
matplot(t(x), type = "l")

### Robust FPCA for partially observed functional data
### Given bandwidth = 0.1
fpca.obj <- robfpca.partial(x,
                            type = "huber",
                            PVE = 0.95,
                            bw = 0.1)
fpc.score <- fpca.obj$pc.score

# ### Give same result in the above
# work.grid <- seq(0, 1, length.out = 51)
# cov.obj <- cov_ogk(x,
#                    type = "huber",
#                    bw = 0.1)
# fpca.obj <- funPCA(Lt = x.list$Lt,
#                    Ly = x.list$Ly,
#                    mu = cov.obj$mean,
#                    cov = cov.obj$cov,
#                    work.grid = work.grid,
#                    PVE = 0.95)
# fpc.score <- fpca.obj$pc.score


### Robust FPCA for partially observed functional data
### Choose bandwidth using 5-fold cross-validation
set.seed(1000)
bw_cand <- seq(0.02, 0.3, length.out = 10)
fpca.obj <- robfpca.partial(x,
                            type = "huber",
                            PVE = 0.95,
                            cv = TRUE,
                            cv_optns = list(bw_cand = bw_cand,
                                            K = 5,
                                            ncores = 1))
fpc.score <- fpca.obj$pc.score

# ### Give same result in the above
# set.seed(1000)
# work.grid <- seq(0, 1, length.out = 51)
# bw_cand <- seq(0.02, 0.3, length.out = 10)
# cov.cv.obj <- cv.cov_ogk(x,
#                          K = 5,
#                          bw_cand = bw_cand,
#                          MM = TRUE,
#                          type = 'huber')
# print(cov.cv.obj$selected_bw)
# cov.obj <- cov_ogk(x,
#                    type = "huber",
#                    bw = cov.cv.obj$selected_bw)
# fpca.obj <- funPCA(Lt = x.list$Lt,
#                    Ly = x.list$Ly,
#                    mu = cov.obj$mean,
#                    cov = cov.obj$cov,
#                    work.grid = work.grid,
#                    PVE = 0.95)
# fpc.score <- fpca.obj$pc.score


new_data <- x[1:5, ]   # example of new data

### Predict FPC score
pred_score <- predict(fpca.obj, type = "score", newdata = new_data)
pred_score

### Reconstruction
pred_reconstr <- predict(fpca.obj, type = "reconstr", newdata = new_data)
pred_reconstr
par(mfrow = c(1, 2))
matplot(t(new_data), type = "l",
        xlab = "t", ylab = "", main = "Observed curves")
matplot(t(pred_reconstr), type = "l",
        xlab = "t", ylab = "", main = "Reconstructed curves")

### Completion
pred_comp <- predict(fpca.obj, type = "comp", newdata = new_data)
pred_comp
matplot(t(new_data), type = "l",
        xlab = "t", ylab = "", main = "Completion")
matlines(t(pred_comp), type = "l", lty = 1, lwd = 2)



statKim/robfpca documentation built on April 15, 2023, 10:12 p.m.