RFPCA: Riemannian Functional Principal Component Analysis

View source: R/RFPCA.R

RFPCA R Documentation

Riemannian Functional Principal Component Analysis

Description

FPCA for Riemannian manifold-valued functional data. The Riemannian (or multi-dimensional) functional data can be dense or sparse.

RFPCA for Functional Data Analysis of Riemannian manifold data

Usage

RFPCA(Ly, Lt, optns = list())

Arguments

Ly

A list of matrices, each D by n_i, containing the observed values for individual i. In each matrix, columns correspond to different time points and rows correspond to different dimensions.

Lt

A list of n vectors containing the observation time points for each individual, corresponding to the columns of the matrices in Ly. Each vector should be sorted in ascending order.

optns

A list of control options specified by list(name=value). See ‘Details’.
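As an illustration of the input layout described above, the following is a minimal base-R sketch that builds a toy Ly and Lt by hand. The sample sizes, the variable name nObs, and the circle-valued curves are assumptions made up for this example; they are not part of the package.

```r
# Hypothetical toy data: n = 3 individuals observed in D = 2 dimensions.
set.seed(1)
D <- 2
nObs <- c(4, 6, 5)  # n_i, the number of observations for each individual

# Lt: one sorted vector of observation times per individual
Lt <- lapply(nObs, function(ni) sort(runif(ni)))

# Ly: one D by n_i matrix per individual; column j holds the value at Lt[[i]][j]
Ly <- lapply(Lt, function(tt) rbind(cos(2 * pi * tt), sin(2 * pi * tt)))

# Sanity checks on the expected layout
stopifnot(length(Ly) == length(Lt))
stopifnot(all(vapply(Ly, nrow, integer(1)) == D))
stopifnot(all(vapply(Ly, ncol, integer(1)) == lengths(Lt)))
stopifnot(!any(vapply(Lt, is.unsorted, logical(1))))
```

Checks of this kind can catch a transposed matrix (n_i by D instead of D by n_i) before the data are passed to RFPCA.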

Details

Supported classes include: 'Sphere' (default), 'Euclidean', 'SO', 'HS', 'L2', 'Dens'.

Available control options are

mfd

A structure such as structure(1, class='CLASS'), where CLASS is one of the supported classes above. Takes precedence over mfdName. Default: structure(1, class='Sphere')

mfdName

The name of a manifold. Supported values are 'Sphere', 'Euclidean', 'SO', 'HS', 'L2', 'Dens'. Default: 'Sphere'

dataType

The type of design, distinguishing mainly between sparse and dense functional data; one of 'Sparse', 'Dense', 'DenseWithMV', 'p>>n'. Default: determined automatically based on 'IsRegular'

userBwMu

The bandwidth for smoothing the mean function. Can be either a scalar specifying the bandwidth, or 'GCV' for generalized cross-validation. MUST BE SPECIFIED

userBwCov

The bandwidth for smoothing the covariance function. Can be a scalar specifying the bandwidth. If userBwCov = 'GCV', then userBwCov will be set to twice the GCV-selected bandwidth for mu. MUST BE SPECIFIED

ToutRange

Truncate the FPCA to be only within ToutRange. Default: c(-Inf, Inf)

npoly

The degree of local polynomials for smoothing. Default: 1 (local linear)

nRegGrid

The number of support points in each direction of covariance surface. Default: 51

kernel

Smoothing kernel choice, common for mu and covariance; "rect", "gauss", "epan", "gausvar", "quar". Default: "gauss"; dense data are assumed noise-less so no smoothing is performed.

error

Assume measurement error in the dataset. The error is assumed to be isometric on the tangent space. Default: TRUE

maxK

The maximum number of principal components to consider. Default: Inf if the smoothing method is used, and 30 for cross-sectional estimate

userSigma2

The user-defined measurement error variance. A positive scalar. Default: NULL

methodMuCovEst

The method to estimate the mean and covariance in the case of dense functional data; 'cross-sectional', 'smooth'. Default: 'cross-sectional'

methodXi

The method to estimate the PC scores; 'CE' (Conditional Expectation), 'IN' (Numerical Integration). Default: 'CE' for sparse data and dense data with missing values, 'IN' for dense data.

obsGridOnly

If TRUE, then assume the observation grids are regular, and use them as the regGrid/workGrid. This may speed up the grid conversion and eigendecomposition if length(obsGrid) is small. Default: TRUE if the Lt are regular and length(obsGrid) <= nRegGrid, and FALSE otherwise.
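The control options above are collected in an ordinary named list. A minimal sketch, using only option names documented in this section; the numeric values are illustrative assumptions, not recommended settings:

```r
# Manifold object; when both are given, mfd takes precedence over mfdName
mfd <- structure(1, class = 'Sphere')

optns <- list(
  mfd = mfd,
  userBwMu = 0.2,    # scalar bandwidth for the mean function (illustrative value)
  userBwCov = 0.4,   # scalar bandwidth for the covariance function (illustrative value)
  kernel = 'epan',   # one of the documented kernel names
  maxK = 5,          # consider at most 5 principal components
  error = TRUE       # assume measurement error
)

# The class attribute is what identifies the manifold
stopifnot(inherits(optns$mfd, 'Sphere'))
str(optns)
```

The resulting list would then be passed as the third argument, as in RFPCA(Ly, Lt, optns).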

References:

Dai X, Lin Z, Müller HG. Modeling sparse longitudinal data on Riemannian manifolds. Biometrics. 2021;77(4):1328–41.

Lin Z, Yao F. Intrinsic Riemannian functional data analysis. The Annals of Statistics. 2019;47(6):3533–77.

Dai X, Müller HG. Principal component analysis for functional data on Riemannian manifolds and spheres. The Annals of Statistics. 2018;46(6B):3334–61.

Maintainer: Xiongtao Dai xdai@iastate.edu

Value

A list containing the following fields:

muReg

A D by nRegGrid matrix containing the mean function estimate on regGrid.

muWork

A D by nWorkGrid matrix containing the mean function estimate on workGrid.

muObs

A D by nObsGrid matrix containing the mean function estimate on obsGrid.

cov

An nWorkGrid by nWorkGrid by D by D array of the smoothed covariance surface.

covObs

An nObsGrid by nObsGrid by D by D array of the smoothed covariance surface interpolated onto the obsGrid.

phi

An nWorkGrid by D by K array containing eigenfunctions supported on workGrid, where D is the ambient dimension.

phiObsTrunc

A possibly truncated version of phi, supported on the truncated obsGrid

lambda

A vector of length K containing eigenvalues.

xi

An n by K matrix containing the FPC score estimates.

sigma2

Variance of the measurement error.

obsGrid

The (sorted) grid points where all observation points are pooled.

regGrid

A vector of length nRegGrid. The internal regular grid on which the eigenanalysis is carried out.

workGrid

Duplicates regGrid. A vector of length nWorkGrid. The internal regular grid on which the eigenanalysis is carried out.

workGridTrunc

A possibly truncated version of regGrid.

K

Number of components returned

userBwMu

The selected (or user specified) bandwidth for smoothing the mean function.

userBwCov

The selected (or user specified) bandwidth for smoothing the covariance function.

mfd

The manifold on which the analysis is performed.

optns

A list of actually used options.
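To make the array layouts above concrete, here is a small base-R sketch that works on mock objects with the documented shapes. The objects phi and lambda below are random or made-up stand-ins, not real RFPCA output, and the fraction of variance explained (fve) is a common summary computed here for illustration; it is not one of the returned fields.

```r
# Mock objects with the documented shapes (NOT real RFPCA output)
set.seed(1)
nWorkGrid <- 51
D <- 3
K <- 4
phi <- array(rnorm(nWorkGrid * D * K), dim = c(nWorkGrid, D, K))
lambda <- c(0.5, 0.2, 0.1, 0.05)  # made-up eigenvalues, in decreasing order

# The k-th eigenfunction is an nWorkGrid by D matrix:
# one row per grid point, one column per ambient dimension
phi1 <- phi[, , 1]
stopifnot(identical(dim(phi1), c(51L, 3L)))

# Cumulative fraction of variance explained by the first k components
fve <- cumsum(lambda) / sum(lambda)
print(round(fve, 3))
```

With a fitted object res, the same indexing would apply to res$phi and res$lambda, assuming the shapes listed above.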

Author(s)

Xiongtao Dai xdai@iastate.edu, Zhenhua Lin

Examples

library(RFPCA)

# First simulate some data
set.seed(1)
n <- 50
m <- 20 # Number of different pooled time points
K <- 20
lambda <- 0.07 ^ (seq_len(K) / 2)
basisType <- 'legendre01'
sparsity <- 5:15
sigma2 <- 0.01
muList <- list(
  function(x) x * 2, 
  function(x) sin(x * 1 * pi) * pi / 2 * 0.6,
  function(x) rep(0, length(x))
)
D <- length(muList)
pts <- seq(0, 1, length.out=m)
mfd <- structure(1, class='Euclidean')
mu <- Makemu(mfd, muList, c(rep(0, D - 1), 1), pts)

# Generate noisy samples
samp <- MakeMfdProcess(mfd, n, mu, pts, K = K, lambda=lambda, basisType=basisType, sigma2=sigma2)
spSamp <- SparsifyM(samp$X, samp$T, sparsity)
yList <- spSamp$Ly
tList <- spSamp$Lt

# Fit model
bw <- 0.2
kern <- 'epan'

resEu <- RFPCA(yList, tList, 
  list(userBwMu=bw, 
       userBwCov=bw * 2, 
       kernel=kern, 
       maxK=K, 
       mfdName='Euclidean', 
       error=TRUE))

# Solid curve stands for the true mean and dashed for the estimated mean function.
matplot(pts, t(mu), type='l', lty=1)
matplot(pts, t(resEu$muObs), type='l', lty=2, add=TRUE)

# The first three principal components are well estimated;
# here the estimated scores for the 3rd component are compared with the true scores
plot(resEu$xi[, 3], samp$xi[, 3])


CrossD/RFPCA documentation built on Aug. 24, 2023, 4:42 p.m.