View source: R/MFPCA_calculation.R
MFPCA | R Documentation |
This function calculates a multivariate functional principal component analysis (MFPCA) based on i.i.d. observations x_1, …, x_N of a multivariate functional data-generating process X = X^(1), …, X^(p) with elements X^(j) in L^2(calT_j) defined on a domain calT_j of IR^{d_j}. In particular, the elements can be defined on different (dimensional) domains. The results contain the mean function, the estimated multivariate functional principal components \hat ψ_1, …, \hat ψ_M (having the same structure as x_i), the associated eigenvalues \hat ν_1 ≥q … ≥q \hat ν_M > 0 and the individual scores \hat ρ_{im} = \hat{<x_i, ψ_m>}. Moreover, estimated trajectories for each observation based on the truncated Karhunen-Loeve representation
\hat x_i = ∑_{m = 1}^M \hat ρ_{im} \hat ψ_m
are given
if desired (fit = TRUE
). The implementation of the observations
x_i =
(x_i^(1), … , x_i^(p)), i = 1 , …, N, the mean function and
multivariate functional principal components \hat ψ_1, …, \hat
ψ_M uses the multiFunData
class, which is defined
in the package funData.
MFPCA( mFData, M, uniExpansions, weights = rep(1, length(mFData)), fit = FALSE, approx.eigen = FALSE, bootstrap = FALSE, nBootstrap = NULL, bootstrapAlpha = 0.05, bootstrapStrat = NULL, verbose = options()$verbose )
mFData |
A |
M |
The number of multivariate functional principal components to calculate. |
uniExpansions |
A list characterizing the (univariate) expansion that is calculated for each element. See Details. |
weights |
An optional vector of weights, defaults to |
fit |
Logical. If |
approx.eigen |
Logical. If |
bootstrap |
Logical. If |
nBootstrap |
The number of bootstrap iterations to use. Defaults to
|
bootstrapAlpha |
A vector of numerics (or a single number) giving the
significance level for bootstrap intervals. Defaults to |
bootstrapStrat |
A stratification variable for bootstrap. Must be a
factor of length |
verbose |
Logical. If |
If the elements vary considerably in domain, range or variation, a weight vector w_1 , …, w_p can be supplied and the MFPCA is based on the weighted scalar product
<<f,g>>_w = ∑_{j = 1}^p w_j \int_\calT_j f^(j)(t) g^(j)(t) d t
and the corresponding weighted covariance operator Γ_w.
If bootstrap = TRUE
, pointwise bootstrap
confidence bands are generated for the multivariate eigenvalues \hat
ν_1, …, \hat ν_M as well as for multivariate functional principal
components \hat ψ_1, …, \hat
ψ_M. The parameter nBootstrap
gives the number of bootstrap
iterations. In each iteration, the observations are resampled on the level of
(multivariate) functions and the whole MFPCA is recalculated. In particular,
if the univariate basis depends on the data (FPCA approaches), basis
functions and scores are both re-estimated. If the basis functions are fixed
(e.g. splines), the scores from the original estimate are used to speed up
the calculations. The confidence bands for the eigenfunctions are calculated
separately for each element as pointwise percentile bootstrap confidence
intervals. Analogously, the confidence bands for the eigenvalues are also
percentile bootstrap confidence bands. The significance level(s) can be
defined by the bootstrapAlpha
parameter, which defaults to 5%. As a
result, the MFPCA
function returns a list CI
of the same length
as bootstrapAlpha
, containing the lower and upper bounds of the
confidence bands for the principal components as multiFunData
objects
of the same structure as mFData
. The confidence bands for the
eigenvalues are returned in a list CIvalues
, containing the upper and
lower bounds for each significance level.
The multivariate functional principal
component analysis relies on a univariate basis expansion for each element
X^(j). The univariate basis representation is calculated using
the univDecomp
function, that passes the univariate functional
observations and optional parameters to the specific function. The univariate
decompositions are specified via the uniExpansions
argument in the
MFPCA
function. It is a list of the same length as the mFData
object, i.e. having one entry for each element of the multivariate functional
data. For each element, uniExpansion
must specify at least the type of
basis functions to use. Additionally, one may add further parameters. The
following basis representations are supported:
Given basis
functions. Then uniExpansions[[j]] = list(type = "given", functions,
scores, ortho)
, where functions
is a funData
object on the
same domain as mFData
, containing the given basis functions. The
parameters scores
and ortho
are optional. scores
is an
N x K
matrix containing the scores (or coefficients) of the observed
functions for the given basis functions, where N
is the number of
observed functions and K
is the number of basis functions. Note that
the scores need to be demeaned to give meaningful results. If scores are not
supplied, they are calculated using the given basis functions. The parameter
ortho
specifies whether the given basis functions are orthonormal
orhto = TRUE
or not ortho = FALSE
. If ortho
is not
supplied, the functions are treated as non-orthogonal. scores
and
ortho
are not checked for plausibility, use them at your own risk!
Univariate functional principal component analysis. Then
uniExpansions[[j]] = list(type = "uFPCA", nbasis, pve, npc, makePD)
,
where nbasis,pve,npc,makePD
are parameters passed to the
PACE
function for calculating the univariate functional
principal component analysis.
Basis functions expansions from the
package fda. Then uniExpansions[[j]] = list(type = "fda", ...)
,
where ...
are passed to funData2fd
, which
heavily builds on eval.fd
. If fda is not available,
a warning is thrown.
Spline basis functions (not penalized). Then
uniExpansions[[j]] = list(type = "splines1D", bs, m, k)
, where
bs,m,k
are passed to the functions univDecomp
and
univExpansion
. For two-dimensional tensor product splines, use
type = "splines2D"
.
Spline basis functions (with smoothness
penalty). Then uniExpansions[[j]] = list(type = "splines1Dpen", bs, m,
k)
, where bs,m,k
are passed to the functions univDecomp
and univExpansion
. Analogously to the unpenalized case, use
type = "splines2Dpen"
for 2D penalized tensor product splines.
Cosine basis functions. Use uniExpansions[[j]] = list(type = "DCT2D",
qThresh, parallel)
for functions one two-dimensional domains (images) and
type = "DCT3D"
for 3D images. The calculation is based on the discrete
cosine transform (DCT) implemented in the C-library fftw3
. If this
library is not available, the function will throw a warning. qThresh
gives the quantile for hard thresholding the basis coefficients based on
their absolute value. If parallel = TRUE
, the coefficients for
different images are calculated in parallel.
See univDecomp
and univExpansion
for details.
An object of class MFPCAfit
containing the following
components:
values |
A vector of estimated eigenvalues \hat ν_1 , … , \hat ν_M. |
functions |
A
|
scores |
A matrix of dimension |
vectors |
A matrix representing the eigenvectors associated with the combined univariate score vectors. This might be helpful for calculating predictions. |
normFactors |
The normalizing factors used for calculating the multivariate eigenfunctions and scores. This might be helpful when calculation predictions. |
meanFunction |
A multivariate functional
data object, corresponding to the mean function. The MFPCA is applied to
the de-meaned functions in |
fit |
A
|
CI |
A
list of the same length as |
CIvalues |
A list of the same
length as |
C. Happ, S. Greven (2018): Multivariate Functional Principal Component Analysis for Data Observed on Different (Dimensional) Domains. Journal of the American Statistical Association, 113(522): 649-659. DOI: doi: 10.1080/01621459.2016.1273115
C. Happ-Kurz (2020): Object-Oriented Software for Functional Data. Journal of Statistical Software, 93(5): 1-38. DOI: doi: 10.18637/jss.v093.i05
See Happ-Kurz (2020. doi: 10.18637/jss.v093.i05) for a general
introduction to the funData package and it's interplay with
MFPCA. This file also includes a case study on how to use
MFPCA
. Useful functions: multiFunData
,
PACE
, univDecomp
, univExpansion
,
summary
,
plot
,
scoreplot
oldPar <- par(no.readonly = TRUE) set.seed(1) ### simulate data (one-dimensional domains) sim <- simMultiFunData(type = "split", argvals = list(seq(0,1,0.01), seq(-0.5,0.5,0.02)), M = 5, eFunType = "Poly", eValType = "linear", N = 100) # MFPCA based on univariate FPCA uFPCA <- MFPCA(sim$simData, M = 5, uniExpansions = list(list(type = "uFPCA"), list(type = "uFPCA"))) summary(uFPCA) plot(uFPCA) # plot the eigenfunctions as perturbations of the mean scoreplot(uFPCA) # plot the scores # MFPCA based on univariate spline expansions splines <- MFPCA(sim$simData, M = 5, uniExpansions = list(list(type = "splines1D", k = 10), list(type = "splines1D", k = 10)), fit = TRUE) # calculate reconstruction, too summary(splines) plot(splines) # plot the eigenfunctions as perturbations of the mean scoreplot(splines) # plot the scores ### Compare estimates to true eigenfunctions # flip to make results more clear uFPCA$functions <- flipFuns(sim$trueFuns, uFPCA$functions) splines$functions <- flipFuns(sim$trueFuns, splines$functions) par(mfrow = c(1,2)) plot(sim$trueFuns[[1]], main = "Eigenfunctions\n1st Element", lwd = 2) plot(uFPCA$functions[[1]], lty = 2, add = TRUE) plot(splines$functions[[1]], lty = 3, add = TRUE) plot(sim$trueFuns[[2]], main = "Eigenfunctions\n2nd Element", lwd = 2) plot(uFPCA$functions[[2]], lty = 2, add = TRUE) plot(splines$functions[[2]], lty = 3, add = TRUE) legend("bottomleft", c("True", "uFPCA", "splines"), lty = 1:3, lwd = c(2,1,1)) # Test reconstruction for the first 10 observations plot(sim$simData[[1]], obs = 1:10, main = "Reconstruction\n1st Element", lwd = 2) plot(splines$fit[[1]], obs = 1:10, lty = 2, col = 1, add = TRUE) plot(sim$simData[[2]], obs = 1:10, main = "Reconstruction\n2nd Element", lwd = 2) plot(splines$fit[[2]], obs = 1:10, lty = 2, col = 1, add = TRUE) legend("bottomleft", c("True", "Reconstruction"), lty = c(1,2), lwd = c(2,1)) # MFPCA with Bootstrap-CI for the first 2 eigenfunctions ### ATTENTION: Takes long splinesBoot <- MFPCA(sim$simData, M = 2, uniExpansions = list(list(type = "splines1D", k = 10), list(type = "splines1D", k = 10)), bootstrap = TRUE, nBootstrap = 100, bootstrapAlpha = c(0.05, 0.1), verbose = TRUE) summary(splinesBoot) plot(splinesBoot$functions[[1]], ylim = c(-2,1.5)) plot(splinesBoot$CI$alpha_0.05$lower[[1]], lty = 2, add = TRUE) plot(splinesBoot$CI$alpha_0.05$upper[[1]], lty = 2, add = TRUE) plot(splinesBoot$CI$alpha_0.1$lower[[1]], lty = 3, add = TRUE) plot(splinesBoot$CI$alpha_0.1$upper[[1]], lty = 3, add = TRUE) abline(h = 0, col = "gray") plot(splinesBoot$functions[[2]], ylim = c(-1,2.5)) plot(splinesBoot$CI$alpha_0.05$lower[[2]], lty = 2, add = TRUE) plot(splinesBoot$CI$alpha_0.05$upper[[2]], lty = 2, add = TRUE) plot(splinesBoot$CI$alpha_0.1$lower[[2]], lty = 3, add = TRUE) plot(splinesBoot$CI$alpha_0.1$upper[[2]], lty = 3, add = TRUE) abline(h = 0, col = "gray") legend("topleft", c("Estimate", "95% CI", "90% CI"), lty = 1:3, lwd = c(2,1,1)) # Plot 95% confidence bands for eigenvalues plot(1:2, splinesBoot$values, pch = 20, ylim = c(0, 1.5), main = "Estimated eigenvalues with 95% CI", xlab = "Eigenvalue no.", ylab = "") arrows(1:2, splinesBoot$CIvalues$alpha_0.05$lower, 1:2, splinesBoot$CIvalues$alpha_0.05$upper, length = 0.05, angle = 90, code = 3) points(1:2, sim$trueVals[1:2], pch = 20, col = 4) legend("topright", c("Estimate", "True value"), pch = 20, col = c(1,4)) ### simulate data (two- and one-dimensional domains) ### ATTENTION: Takes long set.seed(2) sim <- simMultiFunData(type = "weighted", argvals = list(list(seq(0,1,0.01), seq(-1,1,0.02)), list(seq(-0.5,0.5,0.01))), M = list(c(4,5), 20), eFunType = list(c("Fourier", "Fourier"), "Poly"), eValType = "exponential", N = 150) # MFPCA based on univariate spline expansions (for images) and univariate FPCA (for functions) pca <- MFPCA(sim$simData, M = 10, uniExpansions = list(list(type = "splines2D", k = c(10,12)), list(type = "uFPCA"))) summary(pca) plot(pca) # plot the eigenfunctions as perturbations of the mean scoreplot(pca) # plot the scores ### Compare to true eigenfunctions # flip to make results more clear pca$functions <- flipFuns(sim$trueFuns[1:10], pca$functions) par(mfrow = c(5,2), mar = rep(2,4)) for(m in 2:6) # for m = 1, image.plot (used in plot(funData)) produces an error... { plot(sim$trueFuns[[1]], main = paste("True, m = ", m), obs = m) plot(pca$functions[[1]], main = paste("Estimate, m = ", m), obs = m) } par(mfrow = c(1,1)) plot(sim$trueFuns[[2]], main = "Eigenfunctions (2nd element)", lwd = 2, obs= 1:5) plot(pca$functions[[2]], lty = 2, add = TRUE, obs= 1:5) legend("bottomleft", c("True", "MFPCA"), lty = 1:2, lwd = c(2,1)) par(oldPar)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.