(S)REEMforest algorithm
REEMforest(
  X,
  Y,
  id,
  Z,
  iter = 100,
  mtry,
  ntree = 500,
  time,
  sto,
  delta = 0.001
)
X
[matrix]: A matrix containing the predictors of the fixed effects.
Y
[vector]: A vector containing the output trajectories.
id
[vector]: The vector of identifiers for the different trajectories.
Z
[matrix]: A matrix containing the predictors of the random effects.
iter
[numeric]: Maximal number of iterations of the algorithm. The default is set to 100.
mtry
[numeric]: Number of variables randomly sampled as candidates at each split. The default value is …
ntree
[numeric]: Number of trees to grow. This should not be set to too small a number, to ensure that every input row gets predicted at least a few times. The default value is 500.
time
[time]: The vector of the measurement times associated with the trajectories in Y, Z and X.
sto
[character]: Defines the covariance function of the stochastic process, can be either …
delta
[numeric]: The algorithm stops when the difference in log-likelihood between two iterations is smaller than delta.
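The stopping rule controlled by delta and iter can be illustrated with a small base R sketch. It is not the package's internal code: the helper name has_converged and the log-likelihood values are made up for illustration, and the sketch assumes the difference is taken in absolute value between the last two iterations.

```r
# Hypothetical helper: TRUE when the last two log-likelihood values
# differ by less than delta (absolute difference is an assumption).
has_converged <- function(loglik, delta = 0.001) {
  k <- length(loglik)
  if (k < 2) return(FALSE)          # need at least two iterations to compare
  abs(loglik[k] - loglik[k - 1]) < delta
}

# Made-up log-likelihood trace, one value per iteration
loglik_trace <- c(-250.0, -231.4, -230.9, -230.8995)

after_three <- has_converged(loglik_trace[1:3])  # change of 0.5: keep iterating
after_four  <- has_converged(loglik_trace)       # change of 0.0005: stop
```

In a fit, iterations would continue until either such a check succeeds or iter iterations have been run.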
(S)REEMforest is an adaptation of the random forest regression method to longitudinal data, introduced by Capitaine et al. (2020) <doi:10.1177/0962280220946080>. The algorithm estimates the parameters of the following semi-parametric stochastic mixed-effects model:
Y_i(t)=f(X_i(t))+Z_i(t)β_i + ω_i(t)+ε_i
where Y_i(t) is the output at time t for the ith individual; X_i(t) are the input predictors (fixed effects) at time t for the ith individual; Z_i(t) are the random effects at time t for the ith individual; ω_i(t) is the stochastic process at time t for the ith individual, which models the serial correlations of the output measurements; and ε_i is the residual error.
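To make the model concrete, the following base R sketch simulates data from it. Everything specific here is an assumption chosen for illustration: the mean function f, the random intercept and slope design for Z, the Brownian motion used for ω_i(t), and all parameter values. It does not reproduce how the package generates or fits data.

```r
set.seed(1)
n_id  <- 5                        # number of individuals (illustrative)
times <- seq(0.1, 1, by = 0.1)    # common measurement times (assumption)
n_t   <- length(times)

id   <- rep(seq_len(n_id), each = n_t)
time <- rep(times, times = n_id)
N    <- length(id)

# Fixed-effect predictors X_i(t) and a hypothetical nonlinear mean f
X <- cbind(x1 = runif(N), x2 = rnorm(N))
f <- function(X) sin(2 * pi * X[, "x1"]) + 0.5 * X[, "x2"]^2

# Random-effect design Z_i(t): intercept and slope in time; one beta_i per row
Z    <- cbind(1, time)
beta <- matrix(rnorm(n_id * 2, sd = 0.5), nrow = n_id)

# omega_i(t): one Brownian motion path per individual, built as the
# cumulative sum of independent Gaussian increments
sigma_sto <- 0.3
omega <- unlist(lapply(seq_len(n_id), function(i) {
  increments <- rnorm(n_t, sd = sigma_sto * sqrt(diff(c(0, times))))
  cumsum(increments)
}))

eps <- rnorm(N, sd = 0.1)         # residual error

# Y_i(t) = f(X_i(t)) + Z_i(t) beta_i + omega_i(t) + eps
Y <- f(X) + rowSums(Z * beta[id, ]) + omega + eps
```

The objects X, Y, Z, id and time produced this way have the same roles as the arguments of REEMforest.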
A fitted (S)REEMforest model which is a list of the following elements:
forest:
Random forest obtained at the last iteration.
random_effects:
Predictions of random effects for different trajectories.
id_btilde:
Identifiers of individuals associated with the predictions random_effects.
var_random_effects:
Estimation of the variance covariance matrix of random effects.
sigma_sto:
Estimation of the volatility parameter of the stochastic process.
sigma:
Estimation of the residual variance parameter.
time:
The vector of the measurement times associated with the trajectories in Y, Z and X.
sto:
Stochastic process used in the model.
Vraisemblance:
Log-likelihood of the different iterations.
id:
Vector of the identifiers for the different trajectories.
OOB:
OOB error of the fitted random forest at each iteration.
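As an illustration of how random_effects and id_btilde relate, the sketch below looks up the predicted random effects for one individual. The fit object is a hand-made stand-in with made-up values, not the output of REEMforest, and the helper get_random_effects is hypothetical. It assumes that row k of random_effects corresponds to the kth entry of id_btilde.

```r
# Mock object mimicking two elements of a fitted model (values are made up)
fit <- list(
  random_effects = matrix(c( 0.4, -0.1,
                            -0.3,  0.2,
                             0.0,  0.5), ncol = 2, byrow = TRUE),
  id_btilde = c(3, 1, 2)
)

# Hypothetical helper: return the random-effect predictions for one individual
get_random_effects <- function(fit, individual) {
  row <- match(individual, fit$id_btilde)   # position of the identifier
  if (is.na(row)) stop("unknown individual: ", individual)
  fit$random_effects[row, ]
}

b_1 <- get_random_effects(fit, 1)   # second row of the mock matrix
```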
library(LongituRF)
set.seed(123)
data <- DataLongGenerator(n=20) # Generate data for n=20 individuals.
# Train a SREEMforest model on the generated data. Should take ~ 50 seconds.
# The data are generated with a Brownian motion,
# so we use sto="BM" to specify a Brownian motion as the stochastic process.
SREEMF <- REEMforest(X=data$X,Y=data$Y,Z=data$Z,id=data$id,time=data$time,mtry=2,ntree=500,sto="BM")
SREEMF$forest # is the fitted random forest (obtained at the last iteration).
SREEMF$random_effects # are the predicted random effects for each individual.
SREEMF$omega # are the predicted stochastic processes.
plot(SREEMF$Vraisemblance) # evolution of the log-likelihood.
SREEMF$OOB # OOB error at each iteration.