cv.fof.wv: Cross-validation for wavelet-based linear function-on-function regression


Description

This function performs cross-validation and builds the final model for the following linear function-on-function regression model:

Y(t)= μ(t)+\int Z_1(s)β_1(s,t)ds+...+\int Z_p(s)β_p(s,t)ds+ε(t),

where μ(t) is the intercept function. The {Z_i(s),1≤ i≤ p} are p functional predictors and {β_i(s,t),1≤ i≤ p} are their corresponding coefficient functions, where p is a positive integer. The ε(t) is the noise function. This method first applies the fast wavelet transformation (FWT) to the predictor curves, and transforms the original model to a function-on-scalar model with wavelet coefficients as scalar predictors. Then applying a dimension reduction based on signal compression to approximate the regression function, we have a function-on-scalar model with a small number of uncorrelated scalar predictors.
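
For illustration, the snippet below is a minimal sketch (not part of the package) of how one predictor curve observed on a dyadic grid of 2^5 points can be converted to its vector of wavelet coefficients with the wavethresh package; rows of such vectors form the matrix X passed to cv.fof.wv. See also the Examples section.

library(wavethresh)
n.wv <- 5
z <- sin(2*pi*seq(0, 1, length=2^n.wv))      # a toy predictor curve Z(s)
wobj <- wd(z)                                # fast wavelet transform (FWT)
coefs <- accessC(wobj, level=0)              # coarsest scaling coefficient
for(j in 0:(n.wv-1))
  coefs <- c(coefs, accessD(wobj, level=j))  # detail coefficients at each level
length(coefs)                                # 2^n.wv scalar predictors per curve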

Usage

cv.fof.wv(X, Y, t.y, K.cv = 5, upp.comp=10, thresh=0.01)

Arguments

X

the n*q matrix of wavelet coefficients of the predictor curves, where n is the sample size and q is the total number of wavelet coefficients for the original functional predictors.

Y

the n*m data matrix for the functional response Y(t), where n is the sample size and m is the number of the observation time points for Y(t).

t.y

the vector of observation time points of the functional response Y(t).

K.cv

the number of CV folds. Default is 5.

upp.comp

the upper bound for the maximum number of components to be calculated. Default is 10.

thresh

a number between 0 and 1 used to determine the maximum number of components to calculate. This maximum is between 1 and the value of "upp.comp" above. The optimal number of components is then chosen between 1 and this maximum, together with the other tuning parameters, by cross-validation. A smaller thresh value leads to a larger maximum number of components and a longer running time; a larger thresh value requires less running time, but may miss important components and lead to a larger prediction error. Default is 0.01.

Details

This method first expresses the functional predictors and the coefficient functions β_i(·,t) using wavelet expansions. Let \bold{X} and \bold{β}(t) denote the corresponding vectors of all concatenated wavelet coefficients. Then the original model is transformed to

Y(t)=μ(t)+\bold{X}^T \bold{β}(t)+\varepsilon(t).

We use the decomposition \bold{β}(t)=∑_{k=1}^K \bold{α}_k w_k(t) based on the KL expansion of \bold{X}^T \bold{β}(t), where the \bold{α}_k's are vectors of the same length as \bold{β}(t). We estimate \bold{α}_k for each k by solving the penalized generalized eigenvalue problem

max_{\bold{α}} \frac{\bold{α}^T\hat{\bold{B}}\bold{α}}{\bold{α}^T\hat{\bold{Σ}}\bold{α}+P(\bold{α})}

{\rm{ s.t. }}\quad \bold{α}^T \hat{\bold{Σ}}\bold{α}=1

{\rm{ and }}\quad \bold{α}^T \hat{\bold{Σ}}\hat{\bold{α}}_{k'}=0 \quad{\rm{for}}\quad k'<k,

where \hat{\bold{B}}=∑_{\ell=1}^n∑_{\ell'=1}^n \{x_{\ell}-\bar{x}\}\int\{y_{\ell}(t)-\bar{y}(t)\}\{y_{\ell'}(t)-\bar{y}(t)\}dt \{x_{\ell'}-\bar{x}\}^T/n^2, \hat{\bold{Σ}}=∑_{\ell=1}^n \{x_{\ell}-\bar{x}\}\{x_{\ell}-\bar{x}\}^T/n, and penalty

P(\bold{α})=τ\{(1-λ)||\bold{α}||_2^2+ λ ||\bold{α}||_1^2\}.

Then we estimate w_{k}(t), k>0 by regressing Y(t) on \{\hat{z}_{1},... \hat{z}_{K}\} with penalty κ ∑_{k=0}^K \|w''_k\|^2 tuned by the smoothness parameter κ. Here \hat{z}_{k}=(\bold{X}-\bar{\bold{X}})^T\hat{\bold{α}}_{k}.
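
The generalized eigenvalue problem above can be illustrated with a short, self-contained sketch. The code below is only a conceptual illustration (none of the objects come from FRegSigCom): it computes \hat{\bold{B}} and \hat{\bold{Σ}} from simulated centered data and extracts the leading component direction without the penalty P(\bold{α}) or the orthogonality constraints, which the package handles internally.

set.seed(1)
n <- 100; q <- 32; m <- 50
x <- matrix(rnorm(n*q), n, q)                # wavelet coefficients of the predictors
y <- matrix(rnorm(n*m), n, m)                # functional response on m time points
x.c <- scale(x, scale=FALSE)                 # center the predictors
y.c <- scale(y, scale=FALSE)                 # center the response
G <- tcrossprod(y.c)/m                       # Riemann sum for the inner integral over t
B.hat <- crossprod(x.c, G %*% x.c)/n^2       # estimate of B
Sigma.hat <- crossprod(x.c)/n                # estimate of Sigma
M <- solve(Sigma.hat + 1e-6*diag(q), B.hat)  # small ridge added for numerical stability
alpha1 <- Re(eigen(M)$vectors[, 1])          # unpenalized first direction alpha_1
z1 <- x.c %*% alpha1                         # first component scores z.hat_1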

Value

An object of the “cv.fof.wv” class, which is used in the function pred.fof.wv for prediction.

min.error

minimum CV error.

X

input X.

Y

input Y.

errors

list for CV errors.

opt.K

optimal number of components selected by CV.

opt.smooth

optimal smoothness tuning parameter κ.

min.tau

optimal tuning parameter τ.

min.lambda

optimal tuning parameter λ.

params.set

the set of tuning parameters used in CV.

...

other output for internal use.
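
For example, assuming the returned object is a named list with the components listed above (and using the object name fit.cv from the Examples below), the selected tuning parameters can be inspected as in this sketch:

fit.cv$opt.K        # optimal number of components
fit.cv$opt.smooth   # optimal smoothness parameter kappa
fit.cv$min.tau      # optimal tau
fit.cv$min.lambda   # optimal lambda
fit.cv$min.error    # minimum CV error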

Author(s)

Ruiyan Luo and Xin Qi

References

Ruiyan Luo, Xin Qi and Yanhong Wang. (2016) Functional wavelet regression for function-on-function linear models. Electronic Journal of Statistics. 10(2): 3179-3216.

Examples

########################################################################
## Example: wavelet function-on-function regression
########################################################################
ptm <- proc.time()
library(FRegSigCom)
library(wavethresh)
library(refund)
data(DTI)

I=which(is.na(apply(DTI$cca,1,mean))) # indices of curves with missing values
Y=DTI$cca[-I,] # functional response
X=DTI$rcst[-I,21:52] #functional predictor
n.wv=5

diagmat <- diag(2^n.wv)
W.x <- diagmat
for(i in 1:2^n.wv){ # build the discrete wavelet transform matrix W.x column by column
  tmp <- wd(diagmat[i,])
  tmp.cof <- accessC(tmp, level=0)
  for(j in 0:(n.wv-1))
    tmp.cof <- c(tmp.cof, accessD(tmp, level=j))
  W.x[,i] <- tmp.cof
}
X.wv=X%*%t(W.x) # wavelet coefficients of the predictor curves

t.y <- seq(0,1,length=dim(Y)[2])
# randomly split the observations into a training set with 50 observations
# and a test set with the rest.
train.id=sample(1:nrow(Y), 50)
X.wv.train <- X.wv[train.id,]
Y.train <- Y[train.id, ]
X.wv.test <- X.wv[-(train.id),]
Y.test <- Y[-(train.id), ]

fit.cv=cv.fof.wv(X.wv.train, Y.train, t.y, upp.comp=5) # use default upp.comp or larger
Y.pred=pred.fof.wv(fit.cv, X.wv.test)
error<- mean((Y.pred-Y.test)^2)
print(c(" prediction error=", error))
print(proc.time()-ptm)
