simLong: Simulate Low/High Dimensional and Linear/Nonlinear...

simLong    R Documentation

Simulate Low/High Dimensional and Linear/Nonlinear Longitudinal dataset.

Description

Simulate a p-dimensional linear or nonlinear mixed-effects model given by:

Y_i(t)=f(X_i(t))+Z_i(t)\beta_i+\epsilon_i

with Y_i(t) the output at time t for the ith individual; X_i(t) the input predictors (fixed effects) at time t for the ith individual; Z_i(t) the random-effects covariates at time t for the ith individual, with random-effect coefficients \beta_i; and \epsilon_i the residual error with variance \sigma^2. In the linear case, f(X_i(t)) = X_i(t)\theta, where each of the p elements of \theta equals 1; in the nonlinear case, the approach of Capitaine et al. (2021) is adapted.
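To make the linear case concrete, the following base-R sketch simulates data from the model above. It is an illustration only, not the MEGB implementation: the function name sim_linear_sketch is hypothetical, and it assumes Z_i(t) = (1, t) (a random intercept and a random slope in time), independent standard-normal predictors (so rho_W and rel_p are ignored), and \theta equal to a vector of ones.

```r
# Illustrative sketch only; NOT the code used by MEGB::simLong.
# Assumptions: Z_i(t) = (1, t), i.i.d. N(0, 1) predictors, theta = rep(1, p).
sim_linear_sketch <- function(n, p, time_points, rho_Z = 0.5,
                              sd_intercept = 2, sd_slope = 1, noise_sd = 1) {
  N    <- n * time_points
  id   <- rep(seq_len(n), each = time_points)
  time <- rep(seq_len(time_points), times = n)

  # 2 x 2 covariance of (intercept, slope), built from the two standard
  # deviations and their correlation rho_Z.
  cov_is <- rho_Z * sd_intercept * sd_slope
  Sigma  <- matrix(c(sd_intercept^2, cov_is, cov_is, sd_slope^2), nrow = 2)

  # One (intercept, slope) pair per individual; each row of B has
  # covariance Sigma because chol(Sigma) is upper triangular with R'R = Sigma.
  B <- matrix(rnorm(n * 2), nrow = n) %*% chol(Sigma)

  # Fixed-effect predictors and the linear fixed part X_i(t) %*% theta.
  X <- matrix(rnorm(N * p), nrow = N,
              dimnames = list(NULL, paste0("X", seq_len(p))))
  theta <- rep(1, p)
  fixed <- as.vector(X %*% theta)

  # Random part Z_i(t) beta_i = intercept_i + slope_i * t, plus residual noise.
  random <- B[id, 1] + B[id, 2] * time
  Y <- fixed + random + rnorm(N, sd = noise_sd)

  data.frame(id = id, time = time, Y = Y,
             RandomIntercept = B[id, 1], RandomSlope = B[id, 2], X)
}

set.seed(1)
d <- sim_linear_sketch(n = 4, p = 3, time_points = 5)
dim(d)  # n * time_points rows, p + 5 columns
```

The sketch returns the same column layout as described under Value, which makes it convenient for checking downstream code before calling simLong itself.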

Usage

simLong(
  n,
  p,
  rel_p = 6,
  time_points,
  rho_W = 0.5,
  rho_Z = 0.5,
  random_sd_intercept = 2,
  random_sd_slope = 1,
  noise_sd = 1,
  linear = TRUE
)

Arguments

n

[numeric]: Number of individuals.

p

[numeric]: Number of predictors.

rel_p

[numeric]: Number of relevant predictors (true predictors that are correlated with the outcome). The default value is rel_p=6 if linear and rel_p=2 if nonlinear.

time_points

[numeric]: Number of realizations (time points) per individual. The Usage signature lists no default for this argument; the example below uses time_points=10.

rho_W

[numeric]: Within-subject correlation. The default value is rho_W=0.5.

rho_Z

[numeric]: Correlation between intercept and slope for the random effect coefficients. The default value is rho_Z=0.5.

random_sd_intercept

[numeric]: Standard deviation of the random intercept. The default value in the Usage signature is random_sd_intercept=2; the example below uses sqrt(0.5).

random_sd_slope

[numeric]: Standard deviation of the random slope. The default value in the Usage signature is random_sd_slope=1; the example below uses sqrt(3).

noise_sd

[numeric]: Standard deviation of the residual error (noise). The default value in the Usage signature is noise_sd=1; the example below uses 0.5.

linear

[boolean]: If TRUE, a linear mixed-effects model is simulated; otherwise, a semi-parametric model similar to the one used in Capitaine et al. (2021) is simulated.

Value

A data frame of dimension (n*time_points) by (p+5) containing the following elements:

  • id: vector of the individual IDs.

  • time: vector of the time realizations.

  • Y: vector of the outcome variable.

  • RandomIntercept: vector of the Random Intercept.

  • RandomSlope: vector of the Random Slope.

  • Vars: remaining columns, corresponding to the fixed-effect variables.
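The layout described above can be checked programmatically before modelling. The helper below is a hypothetical sketch, not part of MEGB; it assumes the five leading columns carry exactly the names listed above and treats every other column as a fixed-effect predictor.

```r
# Hypothetical helper (not part of MEGB): verify the documented layout of a
# simLong-style data frame and return only the predictor columns.
check_simlong_layout <- function(df, n, p, time_points) {
  meta <- c("id", "time", "Y", "RandomIntercept", "RandomSlope")
  stopifnot(
    is.data.frame(df),
    nrow(df) == n * time_points,  # one row per individual per time point
    ncol(df) == p + 5,            # five bookkeeping columns plus p predictors
    all(meta %in% names(df))
  )
  # Everything that is not a bookkeeping column is treated as a predictor.
  df[, setdiff(names(df), meta), drop = FALSE]
}

# Toy data frame with the assumed layout: 2 individuals, 3 time points, 2 predictors.
toy <- data.frame(id = rep(1:2, each = 3), time = rep(1:3, times = 2), Y = 0,
                  RandomIntercept = 0, RandomSlope = 0, V1 = 1, V2 = 2)
X_toy <- check_simlong_layout(toy, n = 2, p = 2, time_points = 3)
names(X_toy)
```

With the data frame generated in the Examples section, the corresponding call would be check_simlong_layout(data, n = 17, p = 6, time_points = 10).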

Examples

library(MEGB)
set.seed(1)
# Generate a nonlinear longitudinal dataset: 17 individuals, 6 predictors,
# 10 time points per individual.
data <- simLong(n = 17, p = 6, rel_p = 6, time_points = 10, rho_W = 0.6, rho_Z = 0.6,
                random_sd_intercept = sqrt(0.5),
                random_sd_slope = sqrt(3),
                noise_sd = 0.5, linear = FALSE)
head(data)   # first six rows of the data
# Plot the outcome trajectory of every individual:
w <- which(data$id == 1)
plot(data$time[w], data$Y[w], type = "l", ylim = range(data$Y), col = "grey")
for (i in unique(data$id)) {
  w <- which(data$id == i)
  lines(data$time[w], data$Y[w], col = "grey")
}
# Plot the fixed-effect predictors (every column after the first five).
# The panel titles require the latex2exp package.
oldpar <- par(no.readonly = TRUE)   # save graphical parameters
oldopt <- options()                 # save options
X <- data[, -(1:5)]
par(mfrow = c(2, 3), mar = c(2, 3, 3, 2))
for (i in seq_len(ncol(X))) {
  w <- which(data$id == 1)
  plot(data$time[w], X[w, i], col = "grey", ylim = range(X[, i]),
       xlim = c(1, max(data$time)), main = latex2exp::TeX(paste0("$X^{(", i, ")}$")))
  for (k in unique(data$id)) {
    w <- which(data$id == k)
    lines(data$time[w], X[w, i], col = "grey")
  }
}
par(oldpar)       # restore graphical parameters
options(oldopt)   # restore options


MEGB documentation built on April 4, 2025, 2:59 a.m.
