funreg: Perform penalized functional regression

Description Usage Arguments Value Note References See Also Examples

View source: R/FunReg.r

Description

Performs a penalized functional regression as in Goldsmith et al. (2012) on irregularly measured data such as that found in ecological momentary assessment (see Walls & Schafer, 2006; Shiffman, Stone, & Hufford, 2008).

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
funreg(
  id,
  response,
  time,
  x,
  basis.method = 1,
  deg = 2,
  deg.penalty = 2,
  family = gaussian,
  other.covariates = NULL,
  num.bins = 35,
  preferred.num.eigenfunctions = 30,
  preferred.num.knots.for.beta = 35,
  se.method = 1,
  smoothing.method = 1,
  times.for.fit.grid = NULL
)

Arguments

id

An integer or string uniquely identifying the subject to which each observation belongs

response

The response, as a vector, one for each subject

time

The time of observation, as a vector, one for each observation (i.e., each assessment on each person)

x

The functional predictor(s), as a matrix, one row for each observation (i.e., for each assessment on each person)

basis.method

An integer, either 1 or 2, describing how the beta function should be internally modeled. A value of 1 indicates that a truncated power spline basis should be used, and a value of 2 indicates that a B-spline basis should be used.

deg

An integer, either 1, 2, or 3, describing how complicated the behavior of the beta function between knots may be. 1, 2, or 3 represent linear, quadratic or cubic function between knots.

deg.penalty

Only relevant for B-splines. The difference order used to weight the smoothing penalty (see Eilers and Marx, 1996)

family

The response distribution. For example, this is family=gaussian for normal linear models, family=binomial for logistic regression models, or family=poisson for count models. See the gam documentation in the mgcv package, or use help(family) for details on family objects.

other.covariates

Subject-level (time-invariant) covariates, if any, as a matrix, one column per covariate. The default, NULL, means that no subject-level covariates will be included in the model.

num.bins

The number of knots used in the spline basis for the beta function. The default is based on the Goldsmith et al. (2011) sample code.

preferred.num.eigenfunctions

The number of eigenfunctions to use in approximating the covariance function of x (see Goldsmith et al., 2011)

preferred.num.knots.for.beta

number of knots to use in the spline estimation. The default, is based on the Goldsmith et al (2011) sample code.

se.method

An integer, either 1 or 2, describing how the standard errors should be calculated. A value of 1 means that the uncertainty related to selecting the smoothing parameter is ignored. Option 2 means that a Bayesian approach is used to try to take this uncertainty into account (see the documentation for Wood's mgcv package).

smoothing.method

An integer, either 1 or 2, describing how the weight of the smoothing penalty should be determined. Option 1 means that the smoothing weight should be estimated using an approach similar to restricted maximum likelihood, and Option 2 means an approach similar to generalized cross-validation. Option 1 is strongly recommended (based both on our experience and on remarks in the documentation for the gam function in the mgcv package).

times.for.fit.grid

Points at which to calculate the estimated beta function. The default, NULL, means that the code will choose these times automatically.

Value

An object of type funreg. This object can be examined using summary, print, or fitted.

Note

This function mostly follows code by Jeff Goldsmith and co-workers: the sample code from Goldsmith et al (2011), and the "pfr" function in the "refund" R package. However, this code is adapted here to allow idiosyncratic measurement times and unequal numbers of observations per subject to be handled easily, and also allows the use of a different estimation method. Also follows some sample code for penalized B-splines from Eilers and Marx (1996) in implementing B-splines. As the pfr function in refund also does, the function calls the gam function in the mgcv package (Wood 2011) to do much of the internal calculations. This function may be somewhat obselete, because a more flexible function is available in the new version of pfr in the refund package (see http://jeffgoldsmith.com/software.html).

In the example below, to fit a more complicated model, replace x=SampleFunregData$x1 with x=cbind(SampleFunregData$x1, SampleFunregData$x2),other.covariates=cbind(SampleFunregData$s1, SampleFunregData$s2, SampleFunregData$s3, SampleFunregData$s4). This model will take longer to run, perhaps 10 or 20 seconds. Then try plot(complex.model).

References

Crainiceanu, C., Reiss, P., Goldsmith, J., Huang, L., Huo, L., Scheipl, F. (2012). refund: Regression with Functional Data (version 0.1-6). R package Available online at cran.r-project.org.

Eilers, P. H. C., and Marx, B. D. (1996). Flexible smoothing with B-splines and penalties. Statistical Science, 11, 89-121. DOI:10.1.1.47.4521.

Goldsmith, J., Bobb, J., Crainiceanu, C. M., Caffo, B., and Reich, D. (2011). Penalized functional regression. Journal of Computational and Graphical Statistics, 20(4), 830-851. DOI: 10.1198/jcgs.2010.10007. For writing parts of this function I especially followed "PFR_Example.R", in the supplemental materials for this paper, written on Jan. 15 2010, by Jeff Goldsmith.

Ruppert, D., Wand, M., and Carroll, R. (2003). Semiparametric regression. Cambridge, UK: Cambridge University Press.

Shiffman, S., Stone, A. A., and Hufford, M. R. (2008). Ecological momentary assessment. Annual Review of Clinical Psychology, 4, 1-32. DOI:10.1146/annurev.clinpsy.3.022806.091415.

Walls, T. A., & Schafer, J. L. (2006) Models for intensive longitudinal data. New York: Oxford.

Wood, S.N. (2006) Generalized Additive Models: An Introduction with R. Chapman and Hall/CRC.

Wood, S.N. (2011) Fast stable restricted maximum likelihood and marginal likelihood estimation of semiparametric generalized linear models. Journal of the Royal Statistical Society (B) 73(1):3-36. DOI:10.1111/j.1467-9868.2010.00749.x

See Also

fitted.funreg, link{plot.funreg}, print.funreg, link{summary.funreg}

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
simple.model <- funreg(id=SampleFunregData$id,
                       response=SampleFunregData$y,
                       time=SampleFunregData$time,
                       x=SampleFunregData$x1,
                       family=binomial);
print(simple.model);                 
par(mfrow=c(2,2));                          
plot(x=simple.model$model.for.x[[1]]$bin.midpoints,
     y=simple.model$model.for.x[[1]]$mu.x.by.bin,
     xlab="Time t",ylab="X(t)",main="Smoothed mean x values"); 
# The smoothed average value of the predictor function x(t) at different times t. 
# The ``[[1]]'' after model.for.x is there because model.for.x is a list with one entry.
# This is because more than one functional covariate is allowed.
plot(simple.model,type="correlations");
# The marginal correlation of x(t) with y at different times t.
# It appears that earlier time points are more strongly related to y.
plot(simple.model,type="coefficients");
# The functional regression coefficient of y on x(t).  
# It also appears that earlier time points are more strongly related to y.
plot(simple.model$subject.info$response,
     simple.model$subject.info$fitted,
     main="Predictive Performance",
     xlab="True Y",
     ylab="Fitted Y");  

funreg documentation built on Oct. 4, 2021, 5:07 p.m.

Related to funreg in funreg...