# fRegress: Functional Regression Analysis In fda: Functional Data Analysis

## Description

This function carries out a functional regression analysis, where either the dependent variable or one or more independent variables are functional. Non-functional variables may be used on either side of the equation. In a simple problem where there is a single scalar independent covariate with values z_i, i=1,…,N and a single functional covariate with values x_i(t), the two versions of the model fit by fRegress are the scalar dependent variable model

$$y_i = \beta_1 z_i + \int x_i(t) \, \beta_2(t) \, dt + e_i$$

and the concurrent functional dependent variable model

$$y_i(t) = \beta_1(t) \, z_i + \beta_2(t) \, x_i(t) + e_i(t).$$

In these models, the final term e_i or e_i(t) is a residual, lack of fit or error term.

In the concurrent functional linear model for a functional dependent variable, all functional variables are evaluated at a common time or argument value $t$. That is, the fit is defined in terms of the behavior of all variables at a fixed time, or in terms of "now" behavior.
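The "now" behavior of the concurrent model can be illustrated with a small simulation in base R. This is only a sketch under stated assumptions: the data, the grid `tgrid`, and the true coefficient functions are invented for illustration, and fitting a separate unsmoothed least-squares regression at each argument value is a simplified stand-in for the basis-expansion fit that fRegress performs.

```r
# Simulated illustration of the concurrent model on a discrete grid (base R only).
# All data and names here are hypothetical; this is not how fRegress is implemented.
set.seed(1)
N     <- 200                              # number of replications
tgrid <- seq(0, 1, length.out = 11)       # common argument values t
z     <- rnorm(N)                         # scalar covariate z_i

# Functional covariate x_i(t): a random level plus a random sinusoid amplitude
x <- outer(rnorm(N), rep(1, length(tgrid))) +
     outer(rnorm(N), sin(2 * pi * tgrid))

beta1 <- 1 + tgrid                        # true beta_1(t)
beta2 <- cos(pi * tgrid)                  # true beta_2(t)

# y_i(t) = beta_1(t) z_i + beta_2(t) x_i(t) + e_i(t)
y <- outer(z, beta1) + sweep(x, 2, beta2, "*") +
     matrix(rnorm(N * length(tgrid), sd = 0.1), N)

# Fit a separate least-squares regression at each t, using only values at that t
pointwise <- sapply(seq_along(tgrid), function(k) {
  coef(lm(y[, k] ~ 0 + z + x[, k]))
})
rownames(pointwise) <- c("beta1", "beta2")

# Estimated coefficient functions on the grid, next to the argument values
round(rbind(t = tgrid, pointwise), 2)
```

Each column of `pointwise` depends only on the data observed at one value of $t$, which is what "concurrent" means here.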

All regression coefficient functions β_j(t) are considered to be functional. In the case of a scalar dependent variable, the regression coefficient for a scalar covariate is converted to a functional variable with a constant basis. All regression coefficient functions can be forced to be smooth through the use of roughness penalties, and consequently are specified in the argument list as functional parameter objects.
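For orientation, a roughness-penalized least squares criterion for the concurrent model is commonly written as below. This is a sketch in the notation of the models above, not a statement of the exact criterion coded in fRegress: $x_{ij}(t)$ denotes the $j$-th independent variable for replication $i$, and $L_j$ and $\lambda_j$ are assumed to correspond to the `Lfdobj` and `lambda` settings of the functional parameter object for $\beta_j$.

$$
\mathrm{PENSSE}(\beta) = \sum_{i=1}^{N} \int \Big[ y_i(t) - \sum_{j} x_{ij}(t) \, \beta_j(t) \Big]^2 dt + \sum_{j} \lambda_j \int \big[ L_j \beta_j(t) \big]^2 dt
$$

Larger values of $\lambda_j$ push $\beta_j$ toward functions for which $L_j \beta_j$ is small. With $\lambda_j = 0$ the fit is unpenalized.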

## Usage

```r
fRegress(y, ...)

## S3 method for class 'formula'
fRegress(y, data=NULL, betalist=NULL, wt=NULL,
         y2cMap=NULL, SigmaE=NULL,
         method=c('fRegress', 'model'), sep='.', ...)

## S3 method for class 'character'
fRegress(y, data=NULL, betalist=NULL, wt=NULL,
         y2cMap=NULL, SigmaE=NULL,
         method=c('fRegress', 'model'), sep='.', ...)

## S3 method for class 'fd'
fRegress(y, xfdlist, betalist, wt=NULL,
         y2cMap=NULL, SigmaE=NULL, returnMatrix=FALSE, ...)

## S3 method for class 'fdPar'
fRegress(y, xfdlist, betalist, wt=NULL,
         y2cMap=NULL, SigmaE=NULL, returnMatrix=FALSE, ...)

## S3 method for class 'numeric'
fRegress(y, xfdlist, betalist, wt=NULL,
         y2cMap=NULL, SigmaE=NULL, returnMatrix=FALSE, ...)
```

## Arguments

- `y`: the dependent variable object. It may be an object of five possible classes:
  - `character` or `formula`: a formula object, or a character object that can be coerced into a formula, providing a symbolic description of the model to be fitted and satisfying the following rules:
    - The left hand side, formula `y`, must be either a numeric vector or a univariate object of class `fd` or `fdPar`. If it is of class `fd`, it is replaced by `fdPar(y, ...)`.
    - All objects named on the right hand side must be either numeric or `fd` (functional data) or `fdPar`. The number of replications of `fd` or `fdPar` object(s) must match each other and the number of observations of numeric objects named, as well as the number of replications of the dependent variable object.
    - The right hand side of this formula is translated into `xfdlist`, then passed to another method for fitting (unless `method = 'model'`).
    - Multivariate independent variables are allowed in a formula and are split into univariate independent variables in the resulting `xfdlist`. Similarly, categorical independent variables with k levels are translated into k-1 contrasts in `xfdlist`.
    - Any smoothing information is passed to the corresponding component of `betalist`.
  - scalar: a vector if the dependent variable is scalar.
  - `fd`: a functional data object if the dependent variable is functional. A `y` of this class is replaced by `fdPar(y, ...)` and passed to `fRegress.fdPar`.
  - `fdPar`: a functional parameter object if the dependent variable is functional, and if it is desired to smooth the prediction of the dependent variable.
- `data`: an optional list or data.frame containing names of objects identified in the formula or character `y`.
- `xfdlist`: a list of length equal to the number of independent variables (including any intercept). Members of this list are the independent variables. They can be objects of either of these two classes:
  - scalar: a numeric vector if the independent variable is scalar.
  - `fd`: a (univariate) functional data object.

  In either case, the object must have the same number of replications as the dependent variable object. That is, if it is a scalar, it must be of the same length as the dependent variable, and if it is functional, it must have the same number of replications as the dependent variable. (Only univariate independent variables are currently allowed in `xfdlist`.)
- `betalist`: For the `fd`, `fdPar`, and `numeric` methods, `betalist` must be a list of length equal to `length(xfdlist)`. Members of this list are functional parameter objects (class `fdPar`) defining the regression functions to be estimated. Even if a corresponding independent variable is scalar, its regression coefficient must be functional if the dependent variable is functional. (If the dependent variable is a scalar, the coefficients of scalar independent variables, including the intercept, must be constants, but the coefficients of functional independent variables must be functional.) Each of these functional parameter objects defines a single functional data object, that is, with only one replication.

  For the `formula` and `character` methods, `betalist` can be either a list, as for the other methods, or `NULL`, in which case a list is created. If `betalist` is created, it will use the bases from the corresponding component of `xfdlist` if it is functional, or from the response variable otherwise. Smoothing information (arguments `Lfdobj`, `lambda`, `estimate`, and `penmat` of function `fdPar`) will come from the corresponding component of `xfdlist` if it is of class `fdPar` (or, for scalar independent variables, from the response variable if it is of class `fdPar`), or from optional `...` arguments if the reference variable is not of class `fdPar`.
- `wt`: weights for weighted least squares.
- `y2cMap`: the matrix mapping from the vector of observed values to the coefficients for the dependent variable. This is output by function `smooth.basis`. If this is supplied, confidence limits are computed, otherwise not.
- `SigmaE`: estimate of the covariances among the residuals. This can only be estimated after a preliminary analysis with `fRegress`.
- `method`: a character string matching either `'fRegress'`, for functional regression estimation, or `'model'`, to create the argument lists for functional regression estimation without running it.
- `sep`: separator for creating names for multiple variables for `fRegress.fdPar` or `fRegress.numeric` created from single variables on the right hand side of the formula `y`. This happens with multidimensional `fd` objects as well as with categorical variables.
- `returnMatrix`: logical. If `TRUE`, a two-dimensional matrix is returned using a special class from the Matrix package.
- `...`: optional arguments.

## Details

Alternative forms of functional regression can be categorized with traditional least squares using the following 2 x 2 table:

| response \ explanatory variable | scalar | function |
|---|---|---|
| scalar | lm | fRegress.numeric |
| function | fRegress.fd or fRegress.fdPar | fRegress.fd or fRegress.fdPar or linmod |

For fRegress.numeric, the numeric response is assumed to be the sum of integrals of xfd * beta for all functional xfd terms.
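The integral term can be made concrete with a numerical quadrature in base R. The sketch below assumes simulated curves observed on a fine grid. The names `tgrid`, `w`, `a`, and `b` are hypothetical, and the trapezoidal rule stands in for the basis-function inner products that the fda package computes.

```r
# Scalar-response linear predictor: beta_1 z_i + integral of x_i(t) beta_2(t) dt,
# approximated with the trapezoidal rule on a grid (base R only, simulated data).
set.seed(2)
N     <- 50
tgrid <- seq(0, 1, length.out = 201)
h     <- tgrid[2] - tgrid[1]

# Trapezoidal quadrature weights: h in the interior, h/2 at both end points
w <- rep(h, length(tgrid))
w[c(1, length(w))] <- h / 2

# Hypothetical functional covariate x_i(t) = a_i sin(2 pi t) + b_i cos(2 pi t)
a <- rnorm(N)
b <- rnorm(N)
x <- outer(a, sin(2 * pi * tgrid)) + outer(b, cos(2 * pi * tgrid))

beta2 <- sin(2 * pi * tgrid)   # assumed coefficient function beta_2(t)
beta1 <- 0.5                   # assumed scalar coefficient
z     <- rnorm(N)              # scalar covariate z_i

# Integral term for every replication at once: sum over t of w(t) x_i(t) beta_2(t)
integral_term <- as.numeric(x %*% (w * beta2))

# Linear predictor of the scalar dependent variable model (without the error term)
lin_pred <- beta1 * z + integral_term

# For this choice of x and beta_2 the exact integral is a_i / 2
all.equal(integral_term, a / 2)
```

With several functional covariates, one such integral term is added per covariate.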

fRegress.fd or .fdPar produces a concurrent regression with each beta being also a (univariate) function.

linmod predicts a functional response from a convolution integral, estimating a bivariate regression function.
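As a point of comparison with the concurrent model, the model fitted by linmod can be sketched as follows. The notation is an assumption chosen to match the Description section, with $\beta_0(t)$ an intercept function and $\beta_1(s,t)$ the bivariate regression function:

$$
y_i(t) = \beta_0(t) + \int x_i(s) \, \beta_1(s,t) \, ds + e_i(t)
$$

Here the response at time $t$ may depend on the covariate at every time $s$, whereas the concurrent model uses only $x_i(t)$.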

In the computation of regression function estimates in fRegress, all independent variables are treated as if they are functional. If argument xfdlist contains one or more vectors, these are converted to functional data objects having the constant basis with coefficients equal to the elements of the vector.
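A minimal sketch of that conversion, done by hand, is shown below. It assumes the fda package is installed and uses made-up covariate values `z`. It mirrors the description above rather than reproducing the internal code of fRegress.

```r
library(fda)

# Scalar covariate values, one per replication (illustrative numbers only)
z <- c(1.2, 0.7, 3.1)

# A constant basis over the same range as the other functional variables
constbasis <- create.constant.basis(c(0, 365))

# One coefficient per replication: a 1 x N coefficient matrix
zfd <- fd(matrix(z, nrow = 1), constbasis)

# Evaluate the constant functions at a few argument values;
# rows index argument values, columns index replications
zvals <- eval.fd(c(0, 180, 365), zfd)
zvals
```

Each column of `zvals` repeats one element of `z`, so the scalar covariate behaves as a function that does not vary with $t$.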

Needless to say, if all the variables in the model are scalar, do NOT use this function. Instead, use either lm or lsfit.
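For the all-scalar case, the two base R alternatives give the same least-squares coefficients. A brief sketch with simulated data (the variable names are illustrative):

```r
# All-scalar regression: use lm or lsfit rather than fRegress (simulated data)
set.seed(3)
x <- rnorm(30)
y <- 2 + 3 * x + rnorm(30, sd = 0.5)

fit_lm <- lm(y ~ x)     # formula interface, returns a rich model object
fit_ls <- lsfit(x, y)   # matrix interface, adds an intercept by default

# Both solve the same least-squares problem
all.equal(unname(coef(fit_lm)), unname(fit_ls$coefficients))
```

`lm` is usually the more convenient choice because its result works with `summary`, `predict`, and `anova`.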

These functions provide a partial implementation of Ramsay and Silverman (2005, chapters 12-20).

## Value

These functions return either a standard fRegress fit object or a model specification:

- fRegress fit: a list of class `fRegress` with the following components:
  - `y`: the first argument in the call to `fRegress` (coerced to class `fdPar`).
  - `xfdlist`: the second argument in the call to `fRegress`.
  - `betalist`: the third argument in the call to `fRegress`.
  - `betaestlist`: a list of length equal to the number of independent variables and with members having the same functional parameter structure as the corresponding members of `betalist`. These are the estimated regression coefficient functions.
  - `yhatfdobj`: a functional parameter object (class `fdPar`) if the dependent variable is functional, or a vector if the dependent variable is scalar. This is the set of values predicted by the functional regression model for the dependent variable.
  - `Cmatinv`: a matrix containing the inverse of the coefficient matrix for the linear equations that define the solution to the regression problem. This matrix is required for function `fRegress.stderr`, which estimates confidence regions for the regression coefficient function estimates.
  - `wt`: the vector of weights input or inferred.

  If `class(y)` is numeric, the `fRegress` object also includes:
  - `df`: equivalent degrees of freedom for the fit.
  - `OCV`: the leave-one-out cross validation score for the model.
  - `gcv`: the generalized cross validation score.

  If `class(y)` is either `fd` or `fdPar`, the `fRegress` object returned also includes 5 other components:
  - `y2cMap`: an input `y2cMap`.
  - `SigmaE`: an input `SigmaE`.
  - `betastderrlist`: an `fd` object estimating the standard errors of `betaestlist`.
  - `bvar`: a covariance matrix.
  - `c2bMap`: a map.
- model specification: The `fRegress.formula` and `fRegress.character` functions translate the formula into the argument list required by `fRegress.fdPar` or `fRegress.numeric`. With the default value `'fRegress'` for the argument `method`, this list is then used to call the appropriate other `fRegress` function. Alternatively, to see how the formula is translated, use the value `'model'` for the argument `method`. In that case, the function returns a list with the arguments otherwise passed to these other functions plus the following additional components:
  - `xfdlist0`: a list of the objects named on the right hand side of `formula`. This will differ from `xfdlist` for any categorical or multivariate right hand side object.
  - `type`: the `type` component of any `fd` object on the right hand side of `formula`.
  - `nbasis`: a vector containing the `nbasis` components of variables named in `formula` having such components.
  - `xVars`: an integer vector, named by the variables on the right hand side of `formula`, giving the number of variables in `xfdlist` that correspond to each. This can exceed 1 for any multivariate object on the right hand side of class either numeric or `fd`, as well as for any categorical variable.

## Author(s)

J. O. Ramsay, Giles Hooker, and Spencer Graves

## References

Ramsay, James O., Hooker, Giles, and Graves, Spencer (2009) Functional Data Analysis in R and Matlab, Springer, New York.

Ramsay, James O., and Silverman, Bernard W. (2005), Functional Data Analysis, 2nd ed., Springer, New York.

## See Also

fRegress.formula, fRegress.stderr, fRegress.CV, linmod

## Examples

```r
library(fda)  # provides fRegress and the CanadianWeather, day.5 and gait data

###
### scalar response and explanatory variable
###   ... to compare fRegress and lm
###
# example from help('lm')
ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
group <- gl(2,10,20, labels=c("Ctl","Trt"))
weight <- c(ctl, trt)
lm.D9 <- lm(weight ~ group)

fRegress.D9 <- fRegress(weight ~ group)

(lm.D9.coef <- coef(lm.D9))

(fRegress.D9.coef <- sapply(fRegress.D9$betaestlist, coef))

all.equal(as.numeric(lm.D9.coef), as.numeric(fRegress.D9.coef))

###
### vector response with functional explanatory variable
###

##
## set up
##
annualprec <- log10(apply(CanadianWeather$dailyAv[,,"Precipitation.mm"],
                          2, sum))
# The simplest 'fRegress' call is singular with more bases
# than observations, so we use a small basis for this example
smallbasis <- create.fourier.basis(c(0, 365), 25)
# There are other ways to handle this,
# but we will not discuss them here
tempfd <- smooth.basis(day.5,
                       CanadianWeather$dailyAv[,,"Temperature.C"],
                       smallbasis)$fd

##
## formula interface
##
precip.Temp.f <- fRegress(annualprec ~ tempfd)

##
## Get the default setup and modify it
##
precip.Temp.mdl <- fRegress(annualprec ~ tempfd, method='m')
# First confirm we get the same answer as above:
precip.Temp.m <- do.call('fRegress', precip.Temp.mdl)

all.equal(precip.Temp.m, precip.Temp.f)

# set up a smaller basis than for temperature
nbetabasis <- 21
betabasis2. <- create.fourier.basis(c(0, 365), nbetabasis)
betafd2. <- fd(rep(0, nbetabasis), betabasis2.)
# add smoothing
betafdPar2. <- fdPar(betafd2., lambda=10)

precip.Temp.mdl2 <- precip.Temp.mdl
precip.Temp.mdl2[['betalist']][['tempfd']] <- betafdPar2.

# Now do it.
precip.Temp.m2 <- do.call('fRegress', precip.Temp.mdl2)

# Compare the two fits
precip.Temp.f[['df']]   # 26
precip.Temp.m2[['df']]  # 22 = saved 4 degrees of freedom

(var.e.f <- mean(with(precip.Temp.f, (yhatfdobj-yfdPar)^2)))
(var.e.m2 <- mean(with(precip.Temp.m2, (yhatfdobj-yfdPar)^2)))
# with a modest increase in lack of fit.

##
## Manual construction of xfdlist and betalist
##
xfdlist <- list(const=rep(1, 35), tempfd=tempfd)

# The intercept must be constant for a scalar response
betabasis1 <- create.constant.basis(c(0, 365))
betafd1 <- fd(0, betabasis1)
betafdPar1 <- fdPar(betafd1)

betafd2 <- with(tempfd, fd(basisobj=basis, fdnames=fdnames))
# convert to an fdPar object
betafdPar2 <- fdPar(betafd2)

betalist <- list(const=betafdPar1, tempfd=betafdPar2)

precip.Temp <- fRegress(annualprec, xfdlist, betalist)

all.equal(precip.Temp, precip.Temp.f)

###
### functional response with vector explanatory variables
###

##
## simplest: formula interface
##
daybasis65 <- create.fourier.basis(rangeval=c(0, 365), nbasis=65,
                                   axes=list('axesIntervals'))
Temp.fd <- with(CanadianWeather,
                smooth.basisPar(day.5, dailyAv[,,'Temperature.C'],
                                daybasis65)$fd)
TempRgn.f <- fRegress(Temp.fd ~ region, CanadianWeather)

##
## Get the default setup and possibly modify it
##
TempRgn.mdl <- fRegress(Temp.fd ~ region, CanadianWeather, method='m')

# make desired modifications here
# then run
TempRgn.m <- do.call('fRegress', TempRgn.mdl)

# no change, so match the first run
all.equal(TempRgn.m, TempRgn.f)

##
## More detailed set up
##
region.contrasts <- model.matrix(~factor(CanadianWeather$region))
rgnContr3 <- region.contrasts
dim(rgnContr3) <- c(1, 35, 4)
dimnames(rgnContr3) <- list('', CanadianWeather$place,
    c('const', paste('region',
                     c('Atlantic', 'Continental', 'Pacific'), sep='.')))

const365 <- create.constant.basis(c(0, 365))
region.fd.Atlantic <- fd(matrix(rgnContr3[,,2], 1), const365)
region.fd.Continental <- fd(matrix(rgnContr3[,,3], 1), const365)
region.fd.Pacific <- fd(matrix(rgnContr3[,,4], 1), const365)
region.fdlist <- list(const=rep(1, 35),
                      region.Atlantic=region.fd.Atlantic,
                      region.Continental=region.fd.Continental,
                      region.Pacific=region.fd.Pacific)

beta1 <- with(Temp.fd, fd(basisobj=basis, fdnames=fdnames))
beta0 <- fdPar(beta1)
betalist <- list(const=beta0, region.Atlantic=beta0,
                 region.Continental=beta0, region.Pacific=beta0)

TempRgn <- fRegress(Temp.fd, region.fdlist, betalist)

all.equal(TempRgn, TempRgn.f)

###
### functional response with
###   (concurrent) functional explanatory variable
###

##
## predict knee angle from hip angle; from demo('gait', package='fda')
##

##
## formula interface
##
(gaittime <- as.numeric(dimnames(gait)[[1]])*20)
gaitrange <- c(0,20)
gaitbasis <- create.fourier.basis(gaitrange, nbasis=21)
harmaccelLfd <- vec2Lfd(c(0, (2*pi/20)^2, 0), rangeval=gaitrange)
gaitfd <- smooth.basisPar(gaittime, gait, gaitbasis,
                          Lfdobj=harmaccelLfd, lambda=1e-2)$fd
hipfd <- gaitfd[,1]
kneefd <- gaitfd[,2]

knee.hip.f <- fRegress(kneefd ~ hipfd)

##
## manual set-up
##
# set up the list of covariate objects
const <- rep(1, dim(kneefd$coef)[2])
xfdlist <- list(const=const, hipfd=hipfd)

beta0 <- with(kneefd, fd(basisobj=basis, fdnames=fdnames))
beta1 <- with(hipfd, fd(basisobj=basis, fdnames=fdnames))

betalist <- list(const=fdPar(beta0), hipfd=fdPar(beta1))

fRegressout <- fRegress(kneefd, xfdlist, betalist)

all.equal(fRegressout, knee.hip.f)

# See also the following demos:
# demo('canadian-weather', package='fda')
# demo('gait', package='fda')
# demo('refinery', package='fda')
# demo('weatherANOVA', package='fda')
# demo('weatherlm', package='fda')
```