misfit: MISFIT (Multiple Imputation for Sparsely-sampled Functions at...

In justin-petrovich/sparsefreg: Scalar-on-Function Regression with a Highly Irregularly-Sampled Functional Covariate

View source: R/misfit.R

Performs MISFIT for either linear (family="gaussian") or logistic (family="binomial") regression.

misfit(
  dat,
  grid,
  nimps = 10,
  J = NULL,
  pve = 0.95,
  family = "gaussian",
  link = NULL,
  impute_type = "Multiple",
  cond.y = T,
  seed = NULL,
  user_params = NULL,
  use_fcr = TRUE,
  k = -1,
  fcr.args = list(use_bam = T, niter = 1),
  face.args = list(knots = 12, lower = -3, pve = 0.95)
)

`dat`	A data frame with n rows and either 3 or 4 columns, where N is the number of subjects, subject i has m_i observations, and ∑_{i=1}^N m_i = n. If `cond.y` is TRUE, it should include 4 columns, named 'X', 'y', 'subj', and 'argvals'. If `cond.y` is FALSE, only 3 columns are needed (no 'y' variable is used).
`grid`	A length M vector of the unique desired grid points on which to evaluate the function.
`nimps`	An integer specifying the number of desired imputations, if `impute_type` is "Multiple".
`J`	An integer specifying the number of FPCs to include in the regression model. By default (NULL), J will be chosen as the minimum number of FPCs required to explain a given percentage of variance.
`pve`	The desired percentage of variance to be explained by the FPCs. Only used if `J` is not supplied. Defaults to 0.95.
`family`	A string indicating the family of the response variable. Currently only "gaussian" (linear regression) and "binomial" (logistic regression) are supported.
`impute_type`	A string indicating whether to use mean or multiple imputation. Only accepts "Mean" or "Multiple". Defaults to "Multiple".
`cond.y`	A boolean indicating whether to condition on the response variable when imputing. Defaults to TRUE.
`seed`	An integer used to specify the seed. Optional, but useful for making results reproducible in the Multiple Imputation step.
`user_params`	An optional list of user-defined imputation parameters. Currently, the user must provide either all necessary imputation parameters, or none. See 'Details'.
`use_fcr`	A boolean indicating whether to use `fcr` or `FPCA` when estimating the necessary imputation parameters. TRUE indicates `fcr`, FALSE indicates `FPCA` (the PACE method). Default is TRUE. See 'Details' for more discussion.
`k`	Dimension of the smooth terms used in `fcr`. The default in the usage above is -1.
`fcr.args`	A list of arguments which can be passed to `fcr` (for the estimation of imputation parameters). Default is to use `use_bam` = TRUE and `niter` = 1. The list must not contain the formula, which is constructed within `misfit`. See `fcr` for more details.
`face.args`	A list of arguments to be passed to the underlying function `face.sparse`. Currently defaults to setting `knots` = 12, `lower` = -3, and `pve` = 0.95. See `face.sparse` for more details.
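
The rule described for `J` and `pve` (the minimum number of FPCs needed to explain a given share of variance) can be illustrated with a short sketch. `choose_J` and `lam_example` are hypothetical names introduced here, not part of sparsefreg, and the package's internal selection may differ in detail.

```r
# Hypothetical helper: smallest J whose leading eigenvalues explain at least
# a proportion `pve` of the total variance.
choose_J <- function(lam, pve = 0.95) {
  lam <- lam[lam > 0]                   # ignore non-positive eigenvalues
  explained <- cumsum(lam) / sum(lam)   # cumulative proportion of variance
  which(explained >= pve)[1]
}

# Made-up eigenvalues, sorted in decreasing order
lam_example <- c(5, 3, 1.2, 0.5, 0.3)
choose_J(lam_example, pve = 0.95)  # 4
choose_J(lam_example, pve = 0.90)  # 3
```

Supplying `J` directly bypasses this rule, which is why `pve` is only used when `J` is NULL.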

When using the user_params argument, the user must supply a list containing the following elements.

Linear Regression:

'Cx': An M × M matrix representing the covariance function of X(t), evaluated on grid. Should not be missing any values.
'mux': A length M numeric vector representing the mean function of X(t), evaluated on grid. Should not be missing any values.
'var_delt': A single numeric value representing the variance of δ, the measurement error associated with X(t).
'muy': A single numeric value representing the mean of Y.
'lam': A numeric vector of length at least J, representing the eigenvalues of C_X(t,s), the covariance function of X(t).
'phi': A matrix with M rows and at least J columns, representing the eigenfunctions of C_X(t,s) (one per column) evaluated on grid. Should not be missing any values.
'Cxy': A numeric vector of length M, representing the cross-covariance C_{XY}(t) evaluated on grid. Should not be missing any values.
'var_y': A single numeric value representing the variance of Y.
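
When the data-generating model is known, as in a simulation, these elements can be computed directly. The sketch below assumes the model y = alpha + ∫X(t)β(t)dt + ε, with the integral approximated by a Riemann sum over an equally spaced grid on [0, 1]. The exponential covariance, `beta`, `alpha`, and `var_eps` are illustrative assumptions, not values prescribed by the package.

```r
M <- 50
J <- 3
grid <- seq(0, 1, length.out = M)

# Assumed "true" quantities for the simulation
mux      <- rep(0, M)
Cx       <- exp(-abs(outer(grid, grid, "-")) / 0.5)  # illustrative covariance
beta     <- sin(2 * pi * grid)
alpha    <- 0
var_eps  <- 1
var_delt <- 0.5

# Eigen decomposition, rescaled so that sum(phi[, j]^2) / M = 1
eig <- eigen(Cx, symmetric = TRUE)
lam <- eig$values / M
phi <- eig$vectors * sqrt(M)

# Moments implied by y = alpha + sum(X * beta) / M + eps
muy   <- alpha + sum(mux * beta) / M
Cxy   <- c(Cx %*% beta) / M
var_y <- c(t(beta) %*% Cx %*% beta) / M^2 + var_eps

user_params <- list(Cx = Cx, mux = mux, var_delt = var_delt, muy = muy,
                    lam = lam, phi = phi, Cxy = Cxy, var_y = var_y)
```

The resulting list could then be passed as `user_params` together with the same `grid`.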

Logistic Regression:

'Cx': An M × M matrix representing the covariance function of X(t), evaluated on grid. Should not be missing any values.
'mu0': A length M numeric vector representing the mean function of X(t)|Y = 0, evaluated on grid. Should not be missing any values.
'mu1': A length M numeric vector representing the mean function of X(t)|Y = 1, evaluated on grid. Should not be missing any values.
'var_delt': A single numeric value representing the variance of δ, the measurement error associated with X(t).
'lam': A numeric vector of length at least J, representing the eigenvalues of C_X(t,s), the covariance function of X(t).
'phi': A matrix with M rows and at least J columns, representing the eigenfunctions of C_X(t,s) (one per column) evaluated on grid. Should not be missing any values.
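
Because the user must provide either all necessary parameters or none, it can help to check a list before passing it on. The following is a minimal sketch of such a check based only on the names and dimensions listed above; `check_user_params` is a hypothetical helper, not a function in sparsefreg.

```r
# Hypothetical helper: verifies names, dimensions, and missing values of a
# user_params list for either family.
check_user_params <- function(params, M, J, family = c("gaussian", "binomial")) {
  family <- match.arg(family)
  required <- if (family == "gaussian") {
    c("Cx", "mux", "var_delt", "muy", "lam", "phi", "Cxy", "var_y")
  } else {
    c("Cx", "mu0", "mu1", "var_delt", "lam", "phi")
  }
  missing_names <- setdiff(required, names(params))
  if (length(missing_names) > 0) {
    stop("missing elements: ", paste(missing_names, collapse = ", "))
  }
  stopifnot(
    is.matrix(params$Cx), all(dim(params$Cx) == c(M, M)), !anyNA(params$Cx),
    is.matrix(params$phi), nrow(params$phi) == M, ncol(params$phi) >= J,
    !anyNA(params$phi),
    length(params$lam) >= J,
    length(params$var_delt) == 1
  )
  # Length-M vectors evaluated on the grid
  grid_vectors <- if (family == "gaussian") c("mux", "Cxy") else c("mu0", "mu1")
  for (nm in grid_vectors) {
    stopifnot(length(params[[nm]]) == M, !anyNA(params[[nm]]))
  }
  invisible(TRUE)
}

# Usage with made-up logistic-regression parameters
M <- 20
J <- 2
grid <- seq(0, 1, length.out = M)
Cx <- exp(-abs(outer(grid, grid, "-")))
eig <- eigen(Cx, symmetric = TRUE)
logit_params <- list(Cx = Cx, mu0 = rep(0, M), mu1 = rep(0.5, M),
                     var_delt = 0.5,
                     lam = eig$values / M, phi = eig$vectors * sqrt(M))
check_user_params(logit_params, M = M, J = J, family = "binomial")
```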

By default, use_fcr is TRUE, meaning that fcr is used to estimate the imputation parameters. Using FPCA (i.e. use_fcr = FALSE) is roughly 10 times faster, at least for small to moderate data sets. For a single call to the function this difference matters little, as both methods complete in under a minute, but it becomes significant when running simulations. More testing is needed to determine which method estimates the imputation parameters more accurately. See 'References' below for details on the methods used in fcr and FPCA.
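
To see how the two settings compare on a particular data set, the run times can be measured directly. The sketch below is one possible way to do so; `time_misfit` is a hypothetical wrapper, and it assumes that `misfit` is exported from an installed copy of sparsefreg.

```r
# Hypothetical wrapper: elapsed time (in seconds) of misfit() with
# use_fcr = TRUE and with use_fcr = FALSE.
time_misfit <- function(dat, grid, ...) {
  if (!requireNamespace("sparsefreg", quietly = TRUE)) {
    stop("package 'sparsefreg' is required")
  }
  sapply(c(fcr = TRUE, FPCA = FALSE), function(flag) {
    system.time(
      sparsefreg::misfit(dat, grid = grid, use_fcr = flag, ...)
    )[["elapsed"]]
  })
}
```

With the objects from the example further down, a call might look like `time_misfit(obsdf, grid, nimps = 10, J = 5, seed = 1)`. Timings depend on the machine and the data, so the factor of roughly 10 should be treated as indicative.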

Leroux, A., Xiao, L., Crainiceanu, C., & Checkley, W. (2018). Dynamic prediction in functional concurrent regression with an application to child growth. Statistics in medicine, 37(8), 1376-1388.

Yao, F., Mueller, H.-G., & Wang, J.-L. (2005). Functional data analysis for sparse longitudinal data. Journal of the American Statistical Association, 100(470), 577-590.

## Not run: 

###################################################################
#------- Example Using MISFIT for a Linear SoF Model -------------#
###################################################################

library(MASS)       # provides mvrnorm()
library(sparsefreg) # provides misfit()

set.seed(123)

## Data generation
M <- 100 # grid size
N <- 400 # sample size
m <- 2 # observations per subject
J <- 5 # number of FPCs to use
nimps <- 10 # number of imputations
var_eps <- 1 # variance of model error
var_delt <- 0.5 # variance of measurement error
grid <- seq(from=0,to=1,length.out = M)
mux <- rep(0,M)
Cx_f <- function(t,s,sig2=1,rho=0.5){ # Matern covariance function with nu = 5/2
 d <- abs(outer(t,s,"-"))
 sig2*(1+sqrt(5)*d/rho + 5*d^2/(3*rho^2))*exp(-sqrt(5)*d/rho)
}
Cx <- Cx_f(grid,grid)
eig <- eigen(Cx,symmetric = TRUE) # decompose once and reuse
lam <- eig$values/M
phi <- eig$vectors*sqrt(M)

beta <- 10*(sin(2*pi*grid)+1)
alpha <- 0

X_s <- mvrnorm(N,mux,Cx)
X_comp <- X_s + rnorm(N*M,sd = sqrt(var_delt))
Xi <- sweep(X_s,2,mux)%*%phi/M # true FPC scores (mux is subtracted from every row)
eps <- rnorm(N,0,sd = sqrt(var_eps))
y <- c(alpha + X_s%*%beta/M + eps)

X_mat<-matrix(nrow=N,ncol=m)
T_mat<-matrix(nrow=N,ncol=m)
ind_obs<-matrix(nrow=N,ncol=m)

for(i in 1:N){
 ind_obs[i,]<-sort(sample(1:M,m,replace=FALSE))
 X_mat[i,]<-X_comp[i,ind_obs[i,]]
 T_mat[i,]<-grid[ind_obs[i,]]
}

## Observe subject 1 at both endpoints so the observed argvals span the full grid
spt<-1
ind_obs[spt,1] = 1; ind_obs[spt,m] = M
X_mat[spt,]<-X_comp[spt,ind_obs[spt,]]
T_mat[spt,]<-grid[ind_obs[spt,]]

## Create data frame for observed data
obsdf <- data.frame("X" = c(t(X_mat)),"argvals" = c(t(T_mat)),
                   "y" = rep(y,each = m),"subj" = rep(1:N,each = m))

misfit_out <- misfit(obsdf,grid = grid,nimps = nimps,J = J)


## End(Not run)
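
The example above uses the same number of observations m for every subject. As a small complement, the sketch below builds the long-format data frame by hand for unequal m_i, following the column names given for `dat`. All values are made up and the object names (`toy_dat`, `m_i`, `y_subj`) are hypothetical.

```r
# Hypothetical toy data: N = 3 subjects with m_i = 2, 3, and 1 observations.
set.seed(1)
grid <- seq(0, 1, length.out = 11)
m_i <- c(2, 3, 1)
N <- length(m_i)
n <- sum(m_i)           # total number of rows
y_subj <- rnorm(N)      # one scalar response per subject

toy_dat <- data.frame(
  X       = rnorm(n),                       # noisy observations of X(t)
  y       = rep(y_subj, times = m_i),       # response repeated on each row
  subj    = rep(seq_len(N), times = m_i),   # subject identifier
  argvals = unlist(lapply(m_i, function(k) sort(sample(grid, k))))
)

# With cond.y = FALSE, only three columns are needed
toy_dat_no_y <- toy_dat[, c("X", "subj", "argvals")]
```

A data set this small only illustrates the layout; it is far too small to fit the model.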

justin-petrovich/sparsefreg documentation built on Aug. 20, 2020, 9:04 p.m.
