af: Construct an FGAM regression term
In refund: Regression with Functional Data

Description

Defines a term \int_{T} F(X_i(t), t) dt for inclusion in an mgcv::gam formula (or bam, gamm, or gamm4::gamm4) as constructed by pfr, where F(x,t) is an unknown smooth bivariate function and X_i(t) is a functional predictor on the closed interval T. See smooth.terms for a list of bivariate basis and penalty options; the default is a tensor product basis with marginal cubic regression splines for estimating F(x,t).
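To make the integral concrete, the following base-R sketch approximates \int_{T} F(X_i(t), t) dt for a known F using trapezoidal quadrature weights. It is only an illustration: in practice af estimates F from data, and the names F_true, tgrid, tmat, and fgam_term are hypothetical, not part of refund.

```r
# Illustrative only: F is a known function here, whereas af() estimates it.
F_true <- function(x, t) x^2 * t

N <- 3                                  # number of curves
J <- 101                                # evaluation points per curve
tgrid <- seq(0, 1, length.out = J)      # equally spaced grid on T = [0, 1]

# Curves X_i(t) = i * t, stored as an N x J matrix (one row per curve)
X <- outer(seq_len(N), tgrid)

# Trapezoidal quadrature weights, repeated for every curve (N x J)
h <- diff(tgrid)
w <- c(h[1], h[-1] + h[-length(h)], h[length(h)]) / 2
L <- matrix(w, nrow = N, ncol = J, byrow = TRUE)
tmat <- matrix(tgrid, nrow = N, ncol = J, byrow = TRUE)

# Approximate the integral of F(X_i(t), t) over T for each curve
fgam_term <- rowSums(L * F_true(X, tmat))
print(fgam_term)
```

For this choice of F and X the exact values are i^2 / 4, so the printed numbers should be close to 0.25, 1, and 2.25. The weighted row sum is the same kind of quadrature that the L and integration arguments control.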

Usage

    af(
      X,
      argvals = NULL,
      xind = NULL,
      basistype = c("te", "t2", "s"),
      integration = c("simpson", "trapezoidal", "riemann"),
      L = NULL,
      presmooth = NULL,
      presmooth.opts = NULL,
      Xrange = range(X, na.rm = T),
      Qtransform = FALSE,
      ...
    )

Arguments

X: functional predictors, typically expressed as an N by J matrix, where N is the number of curves (rows) and J is the number of evaluation points (columns). May include missing/sparse functions, which are indicated by NA values. Alternatively, can be an object of class "fd"; see fd.

argvals: indices of evaluation of X, i.e. (t_{i1}, ..., t_{iJ}) for subject i. May be entered as either a length-J vector or an N by J matrix. Indices may be unequally spaced. Entering as a matrix allows for different observation times for each subject. If NULL, defaults to an equally spaced grid between 0 and 1 (or within X$basis$rangeval if X is an fd object).

xind: same as argvals. It will not be supported in the next version of refund.

basistype: defaults to "te", i.e. a tensor product spline to represent F(x,t). Alternatively, use "s" for bivariate basis functions (see s) or "t2" for an alternative parameterization of tensor product splines (see t2).

integration: method used for numerical integration. Defaults to "simpson" (Simpson's rule) for calculating entries in L. Alternatively, and for non-equidistant grids, "trapezoidal" or "riemann".

L: an optional N by ncol(argvals) matrix giving the weights for the numerical integration over t. If present, overrides integration.

presmooth: string indicating the method to be used for preprocessing the functional predictor prior to fitting. Options are fpca.sc, fpca.face, fpca.ssvd, bspline, and interpolate. Defaults to NULL, indicating no preprocessing. See create.prep.func.

presmooth.opts: list of options passed to the preprocessing method; see create.prep.func.

Xrange: numeric; range to use when specifying the marginal basis for the x-axis. It may be desirable to increase this slightly over the default of range(X) if you are concerned about predicting for future observed curves that take values outside of range(X).

Qtransform: logical; should the functional predictor be transformed using the empirical cdf, applying a quantile transformation to each column of X prior to fitting?

...: optional arguments for basis and penalization to be passed to the function indicated by basistype. These could include, for example, "bs", "k", "m", etc. See te or s for details.
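The quantile transformation described for Qtransform can be sketched in base R as follows. This is a minimal illustration of applying the empirical cdf column by column, assuming complete data with no ties; refund's internal implementation may differ in its details, and the name X_q is hypothetical.

```r
set.seed(1)

# Simulated functional predictor: 20 curves observed at 5 evaluation points
X <- matrix(rnorm(20 * 5, mean = 3, sd = 2), nrow = 20, ncol = 5)

# Replace each value by the empirical cdf of its column evaluated at that
# value, so every column is mapped onto the interval (0, 1]
X_q <- apply(X, 2, function(col) ecdf(col)(col))

dim(X_q)
range(X_q)
```

After the transformation, each column holds values of the form rank / N, so the x-axis of the estimated surface is on the probability scale rather than the original measurement scale.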

Value

A list with the following entries:

call: a "call" to te (or s, t2) using the appropriately constructed covariate and weight matrices.

argvals: the argvals argument supplied to af.

L: the matrix of weights used for the integration.

xindname: the name used for the functional predictor variable in the formula used by mgcv.

tindname: the name used for the argvals variable in the formula used by mgcv.

Lname: the name used for the L variable in the formula used by mgcv.

presmooth: the presmooth argument supplied to af.

Xrange: the Xrange argument supplied to af.

prep.func: a function that preprocesses data based on the preprocessing method specified in presmooth. See create.prep.func.
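As a rough picture of how the entries above fit together, the sketch below assembles an unevaluated call to te from three variable names, with the weight matrix passed through the by argument in the style of mgcv's linear.functional.terms. The specific names are hypothetical placeholders; af chooses its own names and reports them in xindname, tindname, and Lname.

```r
# Hypothetical names for illustration; af() picks its own and returns them.
xindname <- "cca.omat"   # matrix of functional predictor values
tindname <- "cca.tmat"   # matrix of evaluation points
Lname    <- "L.cca"      # matrix of integration weights

# Build an unevaluated call of the form te(<x>, <t>, by = <L>)
smooth_call <- as.call(list(
  as.name("te"),
  as.name(xindname),
  as.name(tindname),
  by = as.name(Lname)
))

print(smooth_call)
```

With matrix arguments and a matrix by variable, mgcv sums the smooth over the columns of each row, weighted by the by matrix, which is how the integral over t is approximated.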

Author(s)

Mathew W. McLean mathew.w.mclean@gmail.com, Fabian Scheipl, and Jonathan Gellar

References

McLean, M. W., Hooker, G., Staicu, A.-M., Scheipl, F., and Ruppert, D. (2014). Functional generalized additive models. Journal of Computational and Graphical Statistics, 23 (1), pp. 249-269. Available at https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3982924/.

See Also

pfr, lf, and mgcv's linear.functional.terms; see pfr for further examples.

Examples

    ## Not run:
    data(DTI)
    ## only consider first visit and cases (no PASAT scores for controls)
    DTI1 <- DTI[DTI$visit==1 & DTI$case==1,]
    DTI2 <- DTI1[complete.cases(DTI1),]

    ## fit FGAM using FA measurements along corpus callosum
    ## as functional predictor with PASAT as response
    ## using 8 cubic B-splines for marginal bases with third
    ## order marginal difference penalties
    ## specifying gamma > 1 enforces more smoothing when using
    ## GCV to choose smoothing parameters
    fit1 <- pfr(pasat ~ af(cca, k=c(8,8), m=list(c(2,3), c(2,3)),
                           presmooth="bspline", bs="ps"),
                method="GCV.Cp", gamma=1.2, data=DTI2)
    plot(fit1, scheme=2)
    vis.pfr(fit1)

    ## af term for the cca measurements plus an lf term for the rcst measurements
    ## leave out 10 samples for prediction
    test <- sample(nrow(DTI2), 10)
    fit2 <- pfr(pasat ~ af(cca, k=c(7,7), m=list(c(2,2), c(2,2)), bs="ps",
                           presmooth="fpca.face") +
                        lf(rcst, k=7, m=c(2,2), bs="ps"),
                method="GCV.Cp", gamma=1.2, data=DTI2[-test,])
    par(mfrow=c(1,2))
    plot(fit2, scheme=2, rug=FALSE)
    vis.pfr(fit2, select=1, xval=.6)
    pred <- predict(fit2, newdata = DTI2[test,], type='response',
                    PredOutOfRange = TRUE)
    sqrt(mean((DTI2$pasat[test] - pred)^2))

    ## Try to predict the binary response disease status (case or control)
    ## using the quantile transformed measurements from the rcst tract
    ## with a smooth component for a scalar covariate that is pure noise
    DTI3 <- DTI[DTI$visit==1,]
    DTI3 <- DTI3[complete.cases(DTI3$rcst),]
    z1 <- rnorm(nrow(DTI3))
    fit3 <- pfr(case ~ af(rcst, k=c(7,7), m = list(c(2, 1), c(2, 1)), bs="ps",
                          presmooth="fpca.face", Qtransform=TRUE) +
                       s(z1, k = 10),
                family="binomial", select=TRUE, data=DTI3)
    par(mfrow=c(1,2))
    plot(fit3, scheme=2, rug=FALSE)
    abline(h=0, col="green")

    # 4 versions: fit with/without Qtransform, plotted with/without Qtransform
    fit4 <- pfr(case ~ af(rcst, k=c(7,7), m = list(c(2, 1), c(2, 1)), bs="ps",
                          presmooth="fpca.face", Qtransform=FALSE) +
                       s(z1, k = 10),
                family="binomial", select=TRUE, data=DTI3)
    par(mfrow=c(2,2))
    zlms <- c(-7.2,4.3)
    plot(fit4, select=1, scheme=2, main="QT=FALSE", zlim=zlms,
         xlab="t", ylab="rcst")
    plot(fit4, select=1, scheme=2, Qtransform=TRUE, main="QT=FALSE", rug=FALSE,
         zlim=zlms, xlab="t", ylab="p(rcst)")
    plot(fit3, select=1, scheme=2, main="QT=TRUE", zlim=zlms,
         xlab="t", ylab="rcst")
    plot(fit3, select=1, scheme=2, Qtransform=TRUE, main="QT=TRUE", rug=FALSE,
         zlim=zlms, xlab="t", ylab="p(rcst)")

    vis.pfr(fit3, select=1, plot.type="contour")

    ## End(Not run)