fregression: Approximate low-rank processes from sparse longitudinal...
In kidzik/fcomplete: Trajectory estimation for sparsely observed longitudinal data

fregression

R Documentation

Approximate low-rank processes from sparse longitudinal observations

Description

Approximates individual trajectories from sparse, noisy observations. Suppose that we track a certain measurement over time at irregular timepoints for multiple subjects. We want to approximate the progression process for each individual.

Usage

fregression(
  formula,
  data,
  covariates = NULL,
  bins = 51,
  method = c("fimpute", "fpcs", "mean", "pg"),
  lambda = c(0),
  maxIter = 1e+05,
  lambda.reg = 0,
  d = 7,
  K = NULL,
  K.reg = NULL,
  thresh = 1e-05,
  final = "soft",
  fold = 5,
  cv.ratio = 0.05,
  projection = "separate",
  verbose = 0,
  scale.covariates = TRUE,
  basis.type = "splines",
  lr = 1
)

Arguments

`formula`	formula describing the linear relation between processes and indicating the time and grouping variables. See Details.
`data`	data in the long format.
`bins`	number of bins for matrix representation of the data
`method`	algorithm to use for finding model parameters: `fpcs` for functional principal components, `mean` for mean impute, `fimpute` for functional impute, `pg` for proximal gradient
`lambda`	lambdas for SVD regularization in functional impute
`lambda.reg`	lambdas for SVD regularization in regression
`d`	dimensionality of the basis
`K`	upper bound of dimensionality for SVD regularization
`K.reg`	upper bound of dimensionality for regression
`thresh`	threshold for convergence in functional impute
`final`	should the final model use `"hard"` or `"soft"` impute after choosing the optimal `lambda`
`fold`	how many folds in cross-validation
`projection`	`"joint"` or `"separate"` (default). If multiple regressors are available, whether to project them jointly or separately
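To make the roles of `lambda`, `K` and `final` more concrete, here is a minimal base-R sketch of soft and hard thresholding of singular values as they are usually defined in the matrix-completion literature. It is an illustration under assumptions, not the package's implementation: the toy matrix `M` and the variable names are made up, and fcomplete may combine these steps differently.

```r
# Toy matrix standing in for a subjects-by-bins matrix (illustrative assumption).
set.seed(1)
M <- matrix(rnorm(20 * 6), nrow = 20, ncol = 6)
s <- svd(M)

lambda <- 2  # plays the role of the `lambda` penalty
K <- 3       # plays the role of the `K` rank bound

# Soft thresholding: shrink every singular value by lambda, floor at zero,
# then keep at most the K leading values.
d.soft <- pmax(s$d - lambda, 0)
d.soft[seq_along(d.soft) > K] <- 0

# Hard thresholding: keep singular values above lambda unshrunk, zero the rest.
d.hard <- ifelse(s$d > lambda, s$d, 0)

# Low-rank reconstructions from the thresholded spectra.
M.soft <- s$u %*% diag(d.soft) %*% t(s$v)
M.hard <- s$u %*% diag(d.hard) %*% t(s$v)
```

Soft thresholding biases the retained singular values towards zero, while hard thresholding leaves them untouched. That difference is one reason to refit with a hard threshold once a penalty has been selected.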

Details

For a subject i, we observe Y^{i}(t), X_1^{i}(t), ..., X_p^{i}(t) at irregular, subject-specific timepoints t \in \{t_1, ..., t_p\}, where 0 < t_j < T. We can bin the time interval [0, T] and represent each individual as a vector of fixed length with missing values. Let Y, X_1, ..., X_p be the resulting matrices, in which rows correspond to subjects and columns to timepoints.
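The binning step can be sketched in base R as follows. The data frame `long`, the window end `T.max` and the small number of bins are hypothetical values chosen for readability. How the package itself handles several observations falling into one bin is not stated here, so this sketch simply lets the last one win.

```r
# Hypothetical long-format observations: 3 subjects, irregular times in [0, T].
long <- data.frame(
  id   = c(1, 1, 1, 2, 2, 3),
  time = c(0.5, 3.2, 9.1, 1.0, 7.4, 4.8),
  Y    = c(1.2, 1.9, 3.1, 0.7, 2.2, 1.5)
)

T.max <- 10  # end of the observation window [0, T]
bins  <- 5   # number of bins (the `bins` argument defaults to 51)

# Assign each timepoint to one of `bins` equal-width intervals of [0, T].
breaks <- seq(0, T.max, length.out = bins + 1)
bin <- cut(long$time, breaks = breaks, labels = FALSE, include.lowest = TRUE)

# Subjects in rows, bins in columns; unobserved cells stay NA.
ids <- sort(unique(long$id))
Ymat <- matrix(NA_real_, nrow = length(ids), ncol = bins,
               dimnames = list(ids, seq_len(bins)))
# Matrix indexing by (row, column) pairs; a later observation in the same
# cell would overwrite an earlier one.
Ymat[cbind(match(long$id, ids), bin)] <- long$Y
```

Each row of `Ymat` is then a fixed-length, mostly missing representation of one subject's trajectory.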

There are multiple methods for approximating the process Y. We can:

regress Y on X_1, X_2, ..., X_p using sparse functional regression
project each subject into a latent space and impute Y, X_1, X_2, ..., X_p simultaneously
use only information from Y, via functional PCA or functional impute.

Function fregression is an interface for fitting models in all three scenarios. Suppose data is a data matrix in the long format, i.e. data is a matrix with p + 3 columns, where data[,1] is the subject ID, data[,2] is time, data[,3] is the observed value of Y, and the remaining columns are covariates X_1, ..., X_p. Each row corresponds to one observation for one subject.

There are three possible formulas:

Y ~ time + X1 + X2 | subjectID executes functional regression
Y + X1 + X2 ~ time | subjectID executes dimensionality reduction
Y ~ time | subjectID executes functional impute or functional PCA depending on the choice of method parameter
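As an illustration of how these three forms differ structurally, the following base-R sketch splits a formula of the shape `lhs ~ rhs | group` into its parts. `split_formula` is a hypothetical helper written for this example, not a function from fcomplete, and the package's own parsing may work differently.

```r
# Split a formula of the form  lhs ~ rhs | group  into its three parts.
# `split_formula` is a hypothetical helper, not part of fcomplete.
split_formula <- function(f) {
  stopifnot(inherits(f, "formula"), length(f) == 3)
  rhs <- f[[3]]
  # `|` binds tighter than `~`, so the right-hand side is a call to `|`.
  stopifnot(is.call(rhs), identical(rhs[[1]], as.name("|")))
  list(
    response = all.vars(f[[2]]),   # variables left of `~`
    terms    = all.vars(rhs[[2]]), # variables between `~` and `|`
    group    = all.vars(rhs[[3]])  # grouping variable right of `|`
  )
}

reg  <- split_formula(Y ~ time + X1 + X2 | subjectID)  # functional regression
dimr <- split_formula(Y + X1 + X2 ~ time | subjectID)  # dimensionality reduction
imp  <- split_formula(Y ~ time | subjectID)            # impute or functional PCA
```

In this reading, the regression form has extra terms beside `time` on the right-hand side, the dimensionality-reduction form has several responses on the left, and the last form has a single response and only `time`.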

Value

Returns a list

fit fitted matrix Y
meta results of cross-validation
u, d, v SVD of the underlying processes, if the functional impute method has been chosen

In the case of a multidimensional SVD and simultaneous approximation of Y, X_1, X_2, ..., X_p, $fit is a list of models for Y, X_1, X_2, ..., X_p.

References

James, Gareth M., Trevor J. Hastie, and Catherine A. Sugar. Principal component models for sparse functional data. Biometrika 87.3 (2000): 587-602.

Kidzinski, Lukasz, and Trevor J. Hastie. Modeling longitudinal data using matrix completion. Under review (2021).

Examples

# Load the package (assumes fcomplete is installed)
library(fcomplete)

# SIMULATE DATA
simulation = fsimulate(seed = 1)
data = simulation$data
ftrue = simulation$ftrue
K = simulation$params$K

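# Baseline: mean impute
model.mean = fregression(Y ~ time | id, data,
                         method = "mean")
# Functional principal components with candidate ranks K
model.fpca = fregression(Y ~ time | id, data,
                         lambda = 0, K = c(3,4,5), thresh = 1e-7, method = "fpcs")

# Functional impute (the default method) over a grid of penalties
lambdas = c(2,3,4,5,6,8,10,12,15,20)
model.fimp = fregression(Y ~ time | id, data,
                         lambda = lambdas, thresh = 1e-5, final = "hard")
# Dimensionality reduction of Y, X1 and X2 together
# (`covariates` must exist in the workspace; it is not created above)
model.fcmp = fregression(Y + X1 + X2 ~ time | id, data, covariates,
                         lambda = lambdas, K = K, final = "hard")
# Functional regression of Y on U1 and U2, with model.fcmp$u passed as covariates
model.freg = fregression(Y ~ U1 + U2 + time | id, data, model.fcmp$u,
                         lambda = lambdas, thresh = 1e-5,
                         lambda.reg = 0.1, method = "fpcs", K = K)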

kidzik/fcomplete documentation built on Aug. 24, 2023, 5:44 a.m.