fosr: Function-on-scalar regression
In refund: Regression with Functional Data

Description

Fit linear regression with functional responses and scalar predictors, with efficient selection of optimal smoothing parameters.
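For orientation, the model being fitted can be written in a generic form as below. This is a sketch in notation chosen here, not taken from the package: n curves, p scalar predictors, and a derivative penalty of order m corresponding to pen.order. The second display is the usual shape of a penalized least-squares criterion with a single smoothing parameter; the exact criterion used by fosr is the one described in Reiss et al. (2010).

```latex
% Function-on-scalar linear model: curve i, predictors x_{i1}, ..., x_{ip}
Y_i(t) = \sum_{j=1}^{p} x_{ij}\,\beta_j(t) + \varepsilon_i(t),
\qquad i = 1, \dots, n

% Generic penalized least-squares criterion (single smoothing parameter \lambda,
% derivative penalty of order m)
\sum_{i=1}^{n} \int \Bigl[ Y_i(t) - \sum_{j=1}^{p} x_{ij}\,\beta_j(t) \Bigr]^2 dt
\;+\; \lambda \sum_{j=1}^{p} \int \bigl[ \beta_j^{(m)}(t) \bigr]^2 dt
```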

Usage

fosr(
  formula = NULL,
  Y = NULL,
  fdobj = NULL,
  data = NULL,
  X,
  con = NULL,
  argvals = NULL,
  method = c("OLS", "GLS", "mix"),
  gam.method = c("REML", "ML", "GCV.Cp", "GACV.Cp", "P-REML", "P-ML"),
  cov.method = c("naive", "mod.chol"),
  lambda = NULL,
  nbasis = 15,
  norder = 4,
  pen.order = 2,
  multi.sp = ifelse(method == "OLS", FALSE, TRUE),
  pve = 0.99,
  max.iter = 1,
  maxlam = NULL,
  cv1 = FALSE,
  scale = FALSE
)

Arguments

formula
Formula for fitting fosr. If used, the data argument must not be NULL.

Y, fdobj
the functional responses, given as either an n × d matrix Y or a functional data object (class "fd") as in the fda package.

data
data frame containing the predictors and responses.

X
the model matrix, whose columns represent scalar predictors. Should ordinarily include a column of 1s.

con
a row vector or matrix of linear contrasts of the coefficient functions, to be constrained to equal zero.

argvals
the d argument values at which the coefficient functions will be evaluated.

method
estimation method: "OLS" for penalized ordinary least squares, "GLS" for penalized generalized least squares, "mix" for mixed effect models.

gam.method
smoothing parameter selection method, to be passed to gam: "REML" for restricted maximum likelihood, "GCV.Cp" for generalized cross-validation.

cov.method
covariance estimation method: the current options are naive or modified Cholesky. See Details.

lambda
smoothing parameter value. If NULL, the smoothing parameter(s) will be estimated. See Details.

nbasis, norder
number of basis functions, and order of splines (the default, 4, gives cubic splines), for the B-spline basis used to represent the coefficient functions. When the functional responses are supplied using fdobj, these arguments are ignored in favor of the values pertaining to the supplied object.

pen.order
order of derivative penalty.

multi.sp
a logical value indicating whether separate smoothing parameters should be estimated for each coefficient function. Currently must be FALSE if method = "OLS".

pve
if method = "mix", the proportion of variance explained by the principal components; defaults to 0.99.

max.iter
maximum number of iterations if method = "GLS".

maxlam
maximum smoothing parameter value to consider (when lamvec=NULL; see lofocv).

cv1
logical value indicating whether a cross-validation score should be computed even if a single fixed lambda is specified (when method = "OLS").

scale
logical value or vector determining scaling of the matrix X (see scale, to which the value of this argument is passed).

Details

The GLS method requires estimating the residual covariance matrix, which has dimension d × d when the responses are given by Y, or nbasis × nbasis when they are given by fdobj. When cov.method = "naive", the ordinary sample covariance is used. In high-dimensional settings, which are typical, this estimate is either singular or nonsingular but unstable. cov.method = "mod.chol" implements the modified Cholesky method of Pourahmadi (1999) for estimating covariance matrices whose inverse is banded. The number of bands is chosen to maximize the p-value for a sphericity test (Ledoit and Wolf, 2002) applied to the "prewhitened" residuals. Note, however, that the banded inverse covariance assumption is sometimes inappropriate, e.g., for periodic functional responses.
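The singularity of the naive estimate is easy to see in a small simulation. The sketch below uses base R only and simulated residuals rather than output from fosr; the sizes n and d and the name resid_mat are chosen here for illustration. With fewer curves than grid points, the sample covariance cannot have full rank.

```r
# Illustration only: simulated residuals, not output from fosr().
set.seed(1)
n <- 20   # number of curves (hypothetical)
d <- 50   # number of grid points per curve (hypothetical), with d > n
resid_mat <- matrix(rnorm(n * d), nrow = n, ncol = d)

# Naive estimate: ordinary sample covariance of the residuals (d x d)
S <- cov(resid_mat)

# Centering removes one degree of freedom, so the rank is at most n - 1
rank_S <- qr(S)$rank
cat("dimension:", dim(S), " rank:", rank_S, "\n")

# The smallest eigenvalue is numerically zero, so S cannot be inverted stably
min_eig <- min(eigen(S, symmetric = TRUE, only.values = TRUE)$values)
cat("smallest eigenvalue:", min_eig, "\n")
```

Because the GLS weights depend on the inverse of this matrix, some form of regularization is needed in this regime, and a banded-inverse estimate such as "mod.chol" is one option.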

There are three types of values for argument lambda:

1. if NULL, the smoothing parameter is estimated by gam (package mgcv) if method = "GLS", or by optimize if method = "OLS";

2. if a scalar, this value is used as the smoothing parameter (but only for the initial model, if method = "GLS");

3. if a vector, this is used as a grid of values for optimizing the cross-validation score (provided method = "OLS"; otherwise an error message is issued).

Please note that currently, if multi.sp = TRUE, then lambda must be NULL and method must be "GLS".
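The three ways of supplying lambda for method = "OLS" can be mimicked on a toy problem. The sketch below is not the fosr implementation: it uses a ridge-penalized regression in base R with ordinary leave-one-out cross-validation, whereas fosr scores lambda by leave-one-function-out cross-validation (see lofocv). All object names and the simulated data are assumptions made for illustration.

```r
# Illustration only: a toy ridge-penalized regression, not fosr() itself.
set.seed(2)
n <- 60
X <- cbind(1, matrix(rnorm(n * 5), n, 5))   # hypothetical design matrix
beta <- c(1, 2, 0, 0, -1, 0.5)
y <- drop(X %*% beta + rnorm(n))

# Leave-one-out CV score for a given lambda, using the hat-matrix shortcut
loocv <- function(lambda) {
  H <- X %*% solve(crossprod(X) + lambda * diag(ncol(X)), t(X))
  mean(((y - drop(H %*% y)) / (1 - diag(H)))^2)
}

# Analogue of a vector lambda: score every value on a grid, keep the best
grid <- 10^seq(-3, 3, length.out = 25)
scores <- vapply(grid, loocv, numeric(1))
lambda_grid <- grid[which.min(scores)]

# Analogue of lambda = NULL: continuous search with optimize() on the log scale
opt <- optimize(function(l) loocv(exp(l)), interval = log(range(grid)))
lambda_opt <- exp(opt$minimum)

# Analogue of a scalar lambda: the supplied value is used as given
lambda_fixed <- 1

cat("grid:", lambda_grid, " optimize:", lambda_opt, " fixed:", lambda_fixed, "\n")
```

A grid makes the shape of the cross-validation curve visible, while optimize() avoids having to guess a suitable grid resolution.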

Value

An object of class fosr, which is a list with the following elements:

fd
object of class "fd" representing the estimated coefficient functions. Its main components are a basis and a matrix of coefficients with respect to that basis.

pca.resid
if method = "mix", an object representing a functional PCA of the residuals, performed by fpca.sc if the responses are in raw form or by pca.fd if in functional-data-object form.

U
if method = "mix", an n × m matrix of random effects, where m is the number of functional PCs needed to explain proportion pve of the residual variance. These random effects can be interpreted as shrunken FPC scores.

yhat, resid
objects of the same form as the functional responses (see arguments Y and fdobj), giving the fitted values and residuals.

est.func
matrix of values of the coefficient function estimates at the points given by argvals.

se.func
matrix of values of the standard error estimates for the coefficient functions, at the points given by argvals.

argvals
points at which the coefficient functions are evaluated.

fit
fit object returned by amc.

edf
effective degrees of freedom of the fit.

lambda
smoothing parameter, or vector of smoothing parameters.

cv
cross-validated integrated squared error if method = "OLS", otherwise NULL.

roughness
value of the roughness penalty.

resp.type
"raw" or "fd", indicating whether the responses were supplied in raw or functional-data-object form.
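One common use of est.func and se.func is to form approximate pointwise bands around each coefficient function. The sketch below uses a mock list with made-up values in place of a fitted fosr object, and it assumes that rows of both matrices correspond to argvals and columns to coefficient functions; check the dimensions of a real fit before relying on that layout.

```r
# Mock object standing in for a fitted fosr object (all values are made up)
argvals <- seq(0, 1, length.out = 11)
mock_fit <- list(
  argvals  = argvals,
  est.func = cbind(sin(2 * pi * argvals), cos(2 * pi * argvals)),
  se.func  = matrix(0.1, nrow = length(argvals), ncol = 2)
)

# Approximate pointwise 95% bands: estimate +/- 2 standard errors
lower <- mock_fit$est.func - 2 * mock_fit$se.func
upper <- mock_fit$est.func + 2 * mock_fit$se.func

# Tabulate the band for the first coefficient function
band1 <- data.frame(
  argvals  = mock_fit$argvals,
  estimate = mock_fit$est.func[, 1],
  lower    = lower[, 1],
  upper    = upper[, 1]
)
print(head(band1))
```

These bands are pointwise, not simultaneous, so they do not by themselves support statements about an entire coefficient function at once.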

Author(s)

Philip Reiss phil.reiss@nyumc.org, Lan Huo, and Fabian Scheipl

References

Ledoit, O., and Wolf, M. (2002). Some hypothesis tests for the covariance matrix when the dimension is large compared to the sample size. Annals of Statistics, 30(4), 1081–1102.

Pourahmadi, M. (1999). Joint mean-covariance models with applications to longitudinal data: unconstrained parameterisation. Biometrika, 86(3), 677–690.

Ramsay, J. O., and Silverman, B. W. (2005). Functional Data Analysis, 2nd ed., Chapter 13. New York: Springer.

Reiss, P. T., Huang, L., and Mennes, M. (2010). Fast function-on-scalar regression with penalized basis expansions. International Journal of Biostatistics, 6(1), article 28. Available at https://pubmed.ncbi.nlm.nih.gov/21969982/

Examples

## Not run:
require(fda)
# The first two lines, adapted from help(fRegress) in package fda,
# set up a functional data object representing daily average
# temperatures at 35 sites in Canada
daybasis25 <- create.fourier.basis(rangeval=c(0, 365), nbasis=25,
                                   axes=list('axesIntervals'))
Temp.fd <- with(CanadianWeather, smooth.basisPar(day.5,
                dailyAv[,,'Temperature.C'], daybasis25)$fd)

modmat = cbind(1, model.matrix(~ factor(CanadianWeather$region) - 1))
constraints = matrix(c(0,1,1,1,1), 1)

# Penalized OLS with smoothing parameter chosen by grid search
olsmod = fosr(fdobj = Temp.fd, X = modmat, con = constraints,
              method="OLS", lambda=100*10:30)
plot(olsmod, 1)

# Fit fosr using the formula interface
set.seed(2121)
data1 <- pffrSim(scenario="ff", n=40)
formod = fosr(Y~xlin+xsmoo, data=data1)
plot(formod, 1)

# Penalized GLS
glsmod = fosr(fdobj = Temp.fd, X = modmat, con = constraints, method="GLS")
plot(glsmod, 1)
## End(Not run)

Example output

Attaching package: 'fda'

The following object is masked from 'package:graphics':

matplot

Calculating CV for candidate smoothing parameter values...
      lambda  LOFO-CV
 [1,]   1000 418231.0
 [2,]   1100 418229.5
 [3,]   1200 418228.3
 [4,]   1300 418227.3
 [5,]   1400 418226.6
 [6,]   1500 418225.9
 [7,]   1600 418225.5
 [8,]   1700 418225.1
 [9,]   1800 418224.9
[10,]   1900 418224.9
[11,]   2000 418224.9
[12,]   2100 418225.0
[13,]   2200 418225.2
[14,]   2300 418225.4
[15,]   2400 418225.8
[16,]   2500 418226.2
[17,]   2600 418226.6
[18,]   2700 418227.1
[19,]   2800 418227.7
[20,]   2900 418228.3
[21,]   3000 418228.9
Finding optimal lambda by optimize()...

refund documentation built on July 1, 2021, 9:06 a.m.