fregression: Approximate low-rank processes from sparse longitudinal...

View source: R/fregression.R

Description

The method approximates a process from sparse observations. Suppose that, for a given subject, one or more observations are measured at irregular timepoints. We assume that these are noisy observations of some underlying process, and we want to approximate this process for each individual.

Usage

fregression(formula, data, covariates = NULL, bins = 51,
  method = c("fimpute", "fpcs", "mean"), lambda = c(0), maxIter = 1e+05,
  lambda.reg = 0, d = 7, K = NULL, K.reg = NULL, thresh = 1e-05,
  final = "soft", fold = 5, cv.ratio = 0.05, projection = "separate",
  verbose = 0, scale.covariates = TRUE)

Arguments

formula

formula describing the linear relation between processes and indicating the time and grouping variables. See Details.

data

data in the long format. Use fc.long2wide and fc.wide2long for conversions

bins

number of bins for matrix representation of the data

method

method for functional impute: fpcs for functional principal components, mean for mean impute, and fimpute for functional impute

lambda

lambdas for SVD regularization in functional impute

lambda.reg

lambdas for SVD regularization in regression

d

dimensionality of the basis

K

upper bound of dimensionality for SVD regularization

K.reg

upper bound of dimensionality for regression

thresh

threshold for convergence in functional impute

final

should the final model use "hard" or "soft" impute after choosing the optimal lambda

fold

number of folds (repetitions) in cross-validation

projection

"joint" or "separate" (default). If multiple regressors are available, project them jointly or separately
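The lambda, K, thresh, maxIter and final arguments refer to SVD regularization inside an iterative impute. The following is a minimal base-R sketch of that general idea (soft versus hard thresholding of singular values, and a soft-impute style loop). It is not the package's implementation: the helper names svd_threshold and soft_impute, the convergence criterion, and the simulated matrix are all assumptions made for illustration.

```r
# Shrink (soft) or truncate (hard) the singular values of a complete matrix.
# Hypothetical helper, not part of fcomplete.
svd_threshold <- function(M, lambda = 0, K = min(dim(M)),
                          type = c("soft", "hard")) {
  type <- match.arg(type)
  s <- svd(M)
  k <- min(K, length(s$d))            # keep at most K components
  d <- s$d[seq_len(k)]
  if (type == "soft") d <- pmax(d - lambda, 0)  # soft: shrink by lambda
  U <- s$u[, seq_len(k), drop = FALSE]
  V <- s$v[, seq_len(k), drop = FALSE]
  U %*% diag(d, nrow = k) %*% t(V)
}

# Repeatedly fill missing cells with the current low-rank fit until the
# relative change of the fit drops below thresh (assumed criterion).
soft_impute <- function(Y, lambda, K = min(dim(Y)),
                        thresh = 1e-5, maxIter = 1000) {
  miss <- is.na(Y)
  fit <- matrix(0, nrow(Y), ncol(Y))
  for (iter in seq_len(maxIter)) {
    filled <- Y
    filled[miss] <- fit[miss]
    new_fit <- svd_threshold(filled, lambda, K, "soft")
    change <- sum((new_fit - fit)^2) / max(sum(fit^2), 1e-12)
    fit <- new_fit
    if (change < thresh) break
  }
  fit
}

# Simulated rank-2 matrix with 15 of 60 cells removed
set.seed(1)
truth <- matrix(rnorm(20), 10, 2) %*% t(matrix(rnorm(12), 6, 2))
Y_obs <- truth
Y_obs[sample(length(Y_obs), 15)] <- NA
fit <- soft_impute(Y_obs, lambda = 0.1, K = 2)
```

In this sketch a "hard" final step would correspond to calling svd_threshold(..., type = "hard") on the filled matrix, which keeps the leading K singular values without shrinking them.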

Details

For a subject i, we observe Y^{i}(t), X_1^{i}(t), ..., X_p^{i}(t) at irregular, subject-specific timepoints t \in \{t_1, ..., t_p\}, where 0 < t_j < T. We can bin the time interval [0,T] and represent each individual as a vector of fixed length with missing values. Stacking these vectors gives matrices Y, X1, ..., Xp, in which rows correspond to subjects and columns to timepoints.
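The binning step can be sketched in base R as follows. This is only an illustration under assumptions: the toy observations, the equal-width bins, and the names obs, T_max, and Y_mat are hypothetical, and the package may bin differently (for instance when two observations of one subject fall into the same bin, which this sketch would simply overwrite).

```r
# Hypothetical sparse observations: 3 subjects, irregular timepoints in (0, T_max)
T_max <- 10
obs <- data.frame(
  id   = c(1, 1, 1, 2, 2, 3),
  time = c(0.5, 4.2, 9.1, 2.0, 7.7, 5.0),
  Y    = c(1.0, 1.4, 2.1, 0.3, 0.9, 1.8)
)

bins <- 5  # number of equal-width bins covering [0, T_max]

# Map each observation time to a bin index in 1..bins
bin_idx <- cut(obs$time, breaks = seq(0, T_max, length.out = bins + 1),
               labels = FALSE, include.lowest = TRUE)

# Rows are subjects, columns are time bins; unobserved cells stay NA
ids <- sort(unique(obs$id))
Y_mat <- matrix(NA_real_, nrow = length(ids), ncol = bins,
                dimnames = list(as.character(ids), NULL))
Y_mat[cbind(match(obs$id, ids), bin_idx)] <- obs$Y

print(Y_mat)
```

Each subject now has a fixed-length row, and most cells are missing because observations are sparse.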

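There are multiple methods of approximating the process Y. As in the Examples, we can:

approximate Y alone from its own sparse observations,

approximate Y jointly with X1, ..., Xp,

regress Y on covariates.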

The function fregression supports all three scenarios. Suppose data contains information in the long format, i.e. data is a matrix with p + 3 columns, where data[,1] is the subject ID, data[,2] is the time, data[,3] is an observed value of Y, and the remaining columns are the covariates X1, ..., Xp. Each row corresponds to one observation for one subject.
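As an illustration of the long format described here, the following base-R sketch builds a small table with p = 2 covariates. The column names, the number of observations per subject, and the random values are assumptions chosen for the example, not output of the package.

```r
set.seed(1)
n_obs <- c(3, 2, 4)  # hypothetical number of observations per subject

long <- data.frame(
  id   = rep(seq_along(n_obs), times = n_obs),  # column 1: subject ID
  time = runif(sum(n_obs), min = 0, max = 1),   # column 2: observation time
  Y    = rnorm(sum(n_obs)),                     # column 3: observed value of Y
  X1   = rnorm(sum(n_obs)),                     # remaining columns: covariates
  X2   = rnorm(sum(n_obs))
)

# Sort by subject and time so each subject's observations are contiguous
long <- long[order(long$id, long$time), ]

p <- ncol(long) - 3  # number of covariates
print(head(long))
```

Each row is one observation of one subject, so subjects with more visits contribute more rows.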

There are three possible formulas, matching the calls in the Examples:

Y ~ time | id

Y + X1 + X2 ~ time | id

Y ~ U1 + U2 + time | id
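The formulas used in the Examples section all share the shape response ~ terms | group. The sketch below shows how such formulas decompose in base R. The helper split_formula is hypothetical and is not how the package necessarily parses its formula argument.

```r
# The three formula shapes that appear in the Examples section
formulas <- list(
  Y ~ time | id,
  Y + X1 + X2 ~ time | id,
  Y ~ U1 + U2 + time | id
)

# Hypothetical helper: split a formula into response, right-hand-side terms,
# and grouping variable. `|` binds tighter than `~`, so the right-hand side
# is a single call to `|`.
split_formula <- function(f) {
  rhs <- f[[3]]
  stopifnot(identical(rhs[[1]], as.name("|")))
  list(
    response = all.vars(f[[2]]),
    terms    = all.vars(rhs[[2]]),
    group    = all.vars(rhs[[3]])
  )
}

parts <- lapply(formulas, split_formula)
str(parts)
```

In all three cases the grouping variable is id and time appears on the right-hand side; the formulas differ in whether several processes sit on the left or additional regressors sit on the right.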

Value

Returns a list

In the case of multidimensional SVD and simultaneous approximation of Y, X1, X2, ..., Xp, $fit is a list of models for Y, X1, X2, ..., Xp.

References

James, Gareth M., Trevor J. Hastie, and Catherine A. Sugar. Principal component models for sparse functional data. Biometrika 87.3 (2000): 587-602.

Examples

# SIMULATE DATA
simulation = fsimulate(seed = 1)
data = simulation$data
ftrue = simulation$ftrue
K = simulation$params$K

# Mean impute
model.mean = fregression(Y ~ time | id, data,
                         method = "mean")
# Functional principal components
model.fpca = fregression(Y ~ time | id, data,
                         lambda = 0, K = c(3,4,5), thresh = 1e-7, method = "fpcs")

# Functional impute of Y alone over a grid of lambdas
lambdas = c(2,3,4,5,6,8,10,12,15,20)
model.fimp = fregression(Y ~ time | id, data,
                         lambda = lambdas, thresh = 1e-5, final = "hard")
# Joint approximation of Y, X1 and X2
# NOTE: `covariates` is not created above and must be defined before this call
model.fcmp = fregression(Y + X1 + X2 ~ time | id, data, covariates,
                         lambda = lambdas, K = K, final = "hard")
# Regression of Y on U1 and U2, passing model.fcmp$u as covariates
model.freg = fregression(Y ~ U1 + U2 + time | id, data, model.fcmp$u,
                         lambda = lambdas, thresh = 1e-5,
                         lambda.reg = 0.1, method = "fpcs", K = K)

kidzik/fcomplete documentation built on Nov. 13, 2018, 1:14 p.m.