wcr: Principal component regression and partial least squares in...
In refund.wave: Wavelet-Domain Regression with Functional Data

Performs generalized linear scalar-on-function or scalar-on-image regression in the wavelet domain, by sparse principal component regression (PCR) and sparse partial least squares (PLS).

wcr(y, xfuncs, min.scale, nfeatures, ncomp, method = c("pcr", "pls"), 
    mean.signal.term = FALSE, covt = NULL, filter.number = 10, 
    wavelet.family = "DaubLeAsymm", family = "gaussian", cv1 = FALSE, nfold = 5, 
    nsplit = 1, store.cv = FALSE, store.glm = FALSE, seed = NULL)
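
A minimal usage sketch on simulated 1D signals follows. The data, the true coefficient function `beta`, and the chosen tuning values are illustrative assumptions, not taken from the package; the calls use only the arguments listed in the signature above and run only if refund.wave is installed.

```r
# Simulated data (hypothetical): n signals observed at d = 64 sites (a power of 2).
set.seed(1)
n <- 50
d <- 64
xfuncs <- matrix(rnorm(n * d), n, d)
beta <- sin(2 * pi * seq_len(d) / d)              # assumed true coefficient function
y <- as.numeric(xfuncs %*% beta / d + rnorm(n, sd = 0.1))

if (requireNamespace("refund.wave", quietly = TRUE)) {
  # Single value for each tuning parameter: no CV unless cv1 = TRUE.
  fit_pcr <- refund.wave::wcr(y, xfuncs, min.scale = 0, nfeatures = 20,
                              ncomp = 5, method = "pcr")

  # Several candidate values: CV selects among the combinations.
  fit_cv <- refund.wave::wcr(y, xfuncs, min.scale = c(0, 2),
                             nfeatures = c(10, 20), ncomp = c(3, 5),
                             method = "pls", nfold = 5, store.cv = TRUE,
                             seed = 1)
  print(fit_cv$tuning.params)
}
```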

`y`	scalar outcome vector.
`xfuncs`	functional predictors. For 1D predictors, an n \times d matrix of signals, where n is the length of `y` and d is the number of sites at which each signal is defined. For 2D predictors, an n \times d \times d array comprising n images of dimension d \times d. For 3D predictors, an n \times d \times d \times d array comprising n images of dimension d \times d \times d. Note that d must be a power of 2.
`min.scale`	either a scalar, or a vector of values to be compared. Used to control the coarseness level of wavelet decomposition. Possible values are 0,1,…,log_2(d) - 1.
`nfeatures`	number(s) of features, i.e. wavelet coefficients, to retain for prediction of `y`: either a scalar, or a vector of values to be compared.
`ncomp`	number(s) of principal components (if `method="pcr"`) or PLS components (if `method="pls"`): either a scalar, or a vector of values to be compared.
`method`	either `"pcr"` (principal component regression, the default) or `"pls"` (partial least squares).
`mean.signal.term`	logical: should the mean of each subject's signal be included as a covariate? By default, `FALSE`.
`covt`	covariates, if any: an n-row matrix, or a vector of length n.
`filter.number`	argument passed to function `wd`, `imwd`, or `wd3D` in the wavethresh package. Selects the smoothness of the wavelet used in the decomposition.
`wavelet.family`	family of wavelets: passed to function `wd`, `imwd`, or `wd3D`.
`family`	generalized linear model family. The current version supports `"gaussian"` (the default) and `"binomial"`.
`cv1`	logical: should cross-validation be performed (to estimate prediction error) even if a single value is provided for each of `min.scale`, `nfeatures` and `ncomp`? By default, `FALSE`. Note that whenever multiple candidate values are provided for one or more of these tuning parameters, CV is performed to select the best model.
`nfold`	the number of validation sets ("folds") into which the data are divided.
`nsplit`	number of splits into `nfold` validation sets; CV is computed by averaging over these splits.
`store.cv`	logical: should the output include a CV result table?
`store.glm`	logical: should the output include the fitted `glm`?
`seed`	the seed for random data division. If `seed = NULL`, a random seed is used.
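
The shape requirements on `xfuncs` and the admissible values of `min.scale` can be checked before fitting. The helper below, `check_wcr_inputs`, is a hypothetical illustration written in base R; it is not part of refund.wave and only encodes the constraints stated in the argument descriptions above.

```r
# Hypothetical pre-flight check (not a refund.wave function).
check_wcr_inputs <- function(y, xfuncs) {
  dims <- dim(xfuncs)
  if (is.null(dims) || !(length(dims) %in% 2:4)) {
    stop("xfuncs must be an n x d matrix or an n x d x d (x d) array")
  }
  if (dims[1] != length(y)) {
    stop("first dimension of xfuncs must equal length(y)")
  }
  d <- dims[2]
  if (any(dims[-1] != d)) {
    stop("all signal/image dimensions must equal d")
  }
  if (log2(d) %% 1 != 0) {
    stop("d must be a power of 2")
  }
  list(signal.dim = length(dims) - 1,           # 1, 2 or 3
       d = d,
       min.scale.values = 0:(log2(d) - 1))      # possible min.scale values
}

# Ten 16 x 16 images: a 2D predictor with min.scale in 0, 1, 2, 3.
info <- check_wcr_inputs(rnorm(10), array(0, dim = c(10, 16, 16)))
```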

Briefly, the algorithm works by (1) applying the discrete wavelet transform (DWT) to the functional/image predictors; (2) retaining only the `nfeatures` wavelet coefficients having the highest variance (for PCR; cf. Johnstone and Lu, 2009) or highest covariance with `y` (for PLS); (3) regressing `y` on the leading `ncomp` PCs or PLS components, along with any scalar covariates; and (4) applying the inverse DWT to the result to obtain the coefficient function estimate `fhat`.
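
The four steps can be sketched in base R for the Gaussian PCR case with 1D signals. This is an illustration under stated assumptions, not the package's implementation: an orthonormal Haar transform (the hypothetical helper `haar_matrix`) stands in for the Daubechies wavelets computed by wavethresh, the data are simulated, and covariates, decorrelation, CV and any scaling conventions used for `fhat` are omitted.

```r
# Orthonormal Haar transform matrix for d = 2^J (stand-in for wavethresh::wd).
haar_matrix <- function(d) {
  stopifnot(d >= 1, log2(d) %% 1 == 0)
  if (d == 1) return(matrix(1, 1, 1))
  H <- haar_matrix(d / 2)
  rbind(kronecker(H, matrix(c(1, 1), 1, 2)),               # coarser-scale rows
        kronecker(diag(d / 2), matrix(c(1, -1), 1, 2))) /  # finest-scale details
    sqrt(2)
}

set.seed(42)
n <- 40; d <- 32; nfeatures <- 10; ncomp <- 3
X <- matrix(rnorm(n * d), n, d)
y <- as.numeric(X %*% sin(2 * pi * seq_len(d) / d) / d + rnorm(n, sd = 0.1))

# (1) DWT of each signal (each row of X).
W <- haar_matrix(d)
Xw <- X %*% t(W)

# (2) Retain the nfeatures wavelet coefficients with the highest variance.
keep <- order(apply(Xw, 2, var), decreasing = TRUE)[seq_len(nfeatures)]
Xk <- Xw[, keep, drop = FALSE]

# (3) Regress y on the leading ncomp principal components.
ctr <- colMeans(Xk)
Xc <- sweep(Xk, 2, ctr)
V <- svd(Xc)$v[, seq_len(ncomp), drop = FALSE]
scores <- Xc %*% V
fit <- lm(y ~ scores)
gamma <- coef(fit)[-1]

# (4) Map the PC coefficients back to the wavelet domain, then invert the DWT.
beta_w <- numeric(d)
beta_w[keep] <- drop(V %*% gamma)
fhat <- as.numeric(t(W) %*% beta_w)   # inverse of an orthonormal transform

# Intercept on the original scale, so that fitted = a0 + X %*% fhat.
a0 <- coef(fit)[1] - sum(ctr * drop(V %*% gamma))
```

Because the transform is orthonormal, the fitted values from the regression on the scores coincide with `a0 + X %*% fhat`, which is what makes `fhat` interpretable as a coefficient function on the original domain.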

This function supports only the standard DWT (see argument `type` in `wd`) with periodic boundary handling (see argument `bc` in `wd`).
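
For reference, the corresponding 1D decomposition can be requested from wavethresh directly. The sketch below assumes wavethresh is installed and uses a simulated signal; it shows the `type` and `bc` settings named above together with the default `filter.number` and `wavelet.family` of `wcr`, and checks that `wr` inverts the transform.

```r
if (requireNamespace("wavethresh", quietly = TRUE)) {
  set.seed(1)
  x <- rnorm(64)                                   # length must be a power of 2

  # Standard (decimated) DWT with periodic boundary handling.
  w <- wavethresh::wd(x, filter.number = 10, family = "DaubLeAsymm",
                      type = "wavelet", bc = "periodic")

  # The inverse transform should reproduce the signal up to rounding error.
  x_back <- wavethresh::wr(w)
  max_err <- max(abs(x - x_back))
}
```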

For 2D predictors, setting `min.scale=1` will lead to an error, due to a technical detail regarding `imwd`. Please contact the author if a workaround is needed.

See the Details for `fpcr` in refund for a note regarding decorrelation.

An object of class "wcr". This is a list that, if `store.glm = TRUE`, includes all components of the fitted `glm` object. The following components are included even if `store.glm = FALSE`:

`fitted.values`	the fitted values.
`param.coef`	coefficients for covariates with decorrelation. The model is fitted after decorrelating the functional predictors from any scalar covariates; but for CV, one needs the "undecorrelated" coefficients from the training-set models.
`undecor.coef`	coefficients for covariates without decorrelation. See `param.coef`.
`fhat`	coefficient function estimate.
`Rsq`	coefficient of determination.
`tuning.params`	if CV is performed, a 2 \times 4 table giving the indices and values of `min.scale`, `nfeatures` and `ncomp` chosen by CV.
`cv.table`	a table giving the CV criterion for each combination of `min.scale`, `nfeatures` and `ncomp`, if `store.cv = TRUE`; otherwise, the CV criterion only for the optimized combination of these parameters. Set to `NULL` if CV is not performed.
`se.cv`	if `store.cv = TRUE`, the standard error of the CV estimate for each combination of `min.scale`, `nfeatures` and `ncomp`.
`family`	generalized linear model family.
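
The CV criterion reported in `cv.table` is an average over `nsplit` random divisions of the data into `nfold` folds. The sketch below shows that generic scheme for one candidate model, using squared prediction error and an ordinary linear model on simulated data as stand-ins; the helper `cv_one_split` is hypothetical, and the package's own criterion and standard-error calculation may differ.

```r
set.seed(123)                       # plays the role of the seed argument
n <- 60; nfold <- 5; nsplit <- 3
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n, sd = 0.5)

# One random division into nfold validation sets; returns the mean squared
# prediction error over all held-out observations.
cv_one_split <- function(x, y, nfold) {
  fold <- sample(rep(seq_len(nfold), length.out = length(y)))
  sq_err <- numeric(length(y))
  for (k in seq_len(nfold)) {
    test <- fold == k
    fit <- lm(y ~ x, data = data.frame(x = x[!test], y = y[!test]))
    pred <- predict(fit, newdata = data.frame(x = x[test]))
    sq_err[test] <- (y[test] - pred)^2
  }
  mean(sq_err)
}

# Average the criterion over nsplit independent splits.
cv_by_split <- replicate(nsplit, cv_one_split(x, y, nfold))
cv_criterion <- mean(cv_by_split)
```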

Lan Huo lan.huo@nyumc.org

Johnstone, I. M., and Lu, Y. (2009). On consistency and sparsity for principal components analysis in high dimensions. Journal of the American Statistical Association, 104, 682–693.

wnet