MERFranger | R Documentation |
This function enables the use of Mixed Effects Random Forests (MERFs) by
combining a random forest from ranger with a model capturing random effects from
lme4. The MERF algorithm is an iterative procedure reminiscent of an EM algorithm
(see Details). The function is the base function for the wrapping function (SAEforest_model)
and should not be used directly by the ordinary user. Recommended exceptions are applications exceeding
the scope of the existing wrapper functions, or further research. The function MERFranger
can model complex patterns of structural relations (see Examples). It returns
an object of class MERFranger, which can be used to produce unit-level predictions. In contrast to
the wrapping functions, this function does not directly provide SAE estimates of domain-specific indicators.
MERFranger(
Y,
X,
random,
data,
importance = "none",
initialRandomEffects = 0,
ErrorTolerance = 1e-04,
MaxIterations = 25,
na.rm = TRUE,
...
)
Y |
Continuous input value of target variable. |
X |
Matrix of predictive covariates. |
random |
Specification of random effects terms following the syntax of lmer.
Random effect terms are specified by vertical bars |
data |
data.frame of sample data including the specified elements of |
importance |
Variable importance mode processed by the random forest from the ranger. Must be 'none', 'impurity', 'impurity_corrected', 'permutation'. For further details see ranger. |
initialRandomEffects |
Numeric value or vector of initial estimate of random effects. Defaults to 0. |
ErrorTolerance |
Numeric value to monitor the MERF algorithm's convergence. Defaults to 1e-04. |
MaxIterations |
Numeric value specifying the maximal amount of iterations for the MERF algorithm. Defaults to 25. |
na.rm |
Logical. Whether missing values should be removed. Defaults to TRUE. |
... |
Additional parameters are directly passed to the random forest ranger.
Most important parameters are for instance |
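As an illustration of how the importance argument and the additional parameters in ... reach the forest, the following sketch requests impurity-based importance and passes num.trees and mtry through to ranger. The object name model_imp and the chosen values (num.trees = 50, mtry = 3) are assumptions made for this example, not recommendations.

```r
library(SAEforest)

# Sample data shipped with the package (same set-up as in the Examples)
data("eusilcA_smp")
income  <- eusilcA_smp$eqIncome
X_covar <- eusilcA_smp[, -c(1, 16, 17, 18)]

# importance is a named argument of MERFranger;
# num.trees and mtry are passed via ... to ranger (illustrative values)
model_imp <- MERFranger(Y = income, X = X_covar, random = "(1|district)",
                        data = eusilcA_smp, importance = "impurity",
                        num.trees = 50, mtry = 3)

# Variable importance is stored in the ranger component of the result
sort(model_imp$Forest$variable.importance, decreasing = TRUE)
```

With importance = "none" (the default), the forest stores no importance values, so the last line is only meaningful when one of the other modes is requested.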
There is a predict method for objects obtained by MERFranger.
The MERF algorithm iterates between two steps: a) estimating the random forest function, assuming the random effects term to be correct, and b) estimating the random effects part, assuming the OOB-predictions from the forest to be correct. Overall convergence of the algorithm is monitored by the log-likelihood of a joint model of both components. For further details see Krennmair & Schmid (2022) or Hajjem et al. (2014).
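The two alternating steps can be sketched in a few lines of R. This is a simplified illustration under stated assumptions, not the code used inside MERFranger: the simulated data, the helper name merf_sketch, the variable names and the relative log-likelihood stopping rule are all invented for this example.

```r
library(ranger)
library(lme4)

set.seed(1)
# Simulated clustered data (hypothetical, for illustration only)
n_groups <- 20
n_per    <- 30
toy <- data.frame(
  group = factor(rep(seq_len(n_groups), each = n_per)),
  x1    = runif(n_groups * n_per),
  x2    = runif(n_groups * n_per)
)
u <- rnorm(n_groups, sd = 1)  # true random intercepts
toy$y <- sin(2 * pi * toy$x1) + toy$x2^2 +
  u[as.integer(toy$group)] + rnorm(nrow(toy), sd = 0.5)

# Hypothetical helper: alternates forest and random-intercept estimation
merf_sketch <- function(data, tol = 1e-4, max_iter = 25) {
  ran_eff     <- rep(0, nrow(data))  # initial random effects per unit
  loglik_old  <- -Inf
  loglik_path <- numeric(0)
  for (iter in seq_len(max_iter)) {
    # Step a) fit the forest to the response net of current random effects
    data$y_adj <- data$y - ran_eff
    forest <- ranger(y_adj ~ x1 + x2, data = data, num.trees = 100)
    # Step b) fit random intercepts, holding the OOB predictions fixed as offset
    data$oob <- forest$predictions
    re_mod <- lmer(y ~ -1 + (1 | group), data = data,
                   offset = oob, REML = FALSE)
    # Monitor convergence via the log-likelihood of the mixed model
    loglik_new  <- as.numeric(logLik(re_mod))
    loglik_path <- c(loglik_path, loglik_new)
    converged   <- is.finite(loglik_old) &&
      abs((loglik_new - loglik_old) / loglik_old) < tol
    # Map the estimated group intercepts back to the individual units
    ran_eff <- ranef(re_mod)$group[as.character(data$group), "(Intercept)"]
    if (converged) break
    loglik_old <- loglik_new
  }
  list(forest = forest, effect_model = re_mod,
       loglik = loglik_path, iterations = iter)
}

fit <- merf_sketch(toy)
fit$iterations
fit$loglik
```

Because each forest is grown with fresh randomness, the log-likelihood path fluctuates slightly between iterations, so the loop may stop at max_iter rather than at the tolerance.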
Note that the MERFranger object is a composition of elements from a random forest of class
ranger and a random effects model of class merMod. Thus, all generic functions are
applicable to the corresponding objects. For further details on generic functions see ranger
and lmer as well as the examples below.
An object of class MERFranger includes the following elements:
Forest |
A random forest of class ranger modelling fixed effects of the model. |
EffectModel |
A model of random effects of class |
RandomEffects |
List element containing the values of random intercepts from |
RanEffSD |
Numeric value of the standard deviation of random intercepts. |
ErrorSD |
Numeric value of standard deviation of unit-level errors. |
VarianceCovariance |
VarCorr matrix from |
LogLik |
Vector with numerical entries showing the log-likelihood of the MERF algorithm. |
IterationsUsed |
Number of iterations used until convergence of the MERF algorithm. |
OOBresiduals |
Vector of OOB-residuals. |
Random |
Character specifying the random intercept in the random effects model. |
ErrorTolerance |
Numerical value to monitor the MERF algorithm's convergence. |
initialRandomEffects |
Numeric value or vector of initial specification of random effects. |
MaxIterations |
Numeric value specifying the maximal amount of iterations for the MERF algorithm. |
Hajjem, A., Bellavance, F., & Larocque, D. (2014). Mixed-Effects Random Forest for Clustered Data. Journal of Statistical Computation and Simulation, 84 (6), 1313–1328.
Krennmair, P., & Schmid, T. (2022). Flexible Domain Prediction Using Mixed Effects Random Forests. Journal of the Royal Statistical Society: Series C (Applied Statistics) (forthcoming).
SAEforest, ranger, lmer, SAEforest_model
# Load the package and data
library(SAEforest)
data("eusilcA_pop")
data("eusilcA_smp")
income <- eusilcA_smp$eqIncome
X_covar <- eusilcA_smp[, -c(1, 16, 17, 18)]
# Example 1:
# Calculating general model used in wrapper functions
model1 <- MERFranger(Y = income, X = X_covar, random = "(1|district)",
data = eusilcA_smp, num.trees=50)
# get individual predictions:
ind_pred <- predict(model1, eusilcA_pop)
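Since the returned object combines a ranger forest and a merMod model, its components can be inspected with the usual tools for those classes. The sketch below refits the model from Example 1 so that it runs on its own; the element names are those listed under Value, while the choice of which generics to call is only an illustration.

```r
library(SAEforest)

# Refit the model from Example 1 so the block is self-contained
data("eusilcA_smp")
income  <- eusilcA_smp$eqIncome
X_covar <- eusilcA_smp[, -c(1, 16, 17, 18)]
model1 <- MERFranger(Y = income, X = X_covar, random = "(1|district)",
                     data = eusilcA_smp, num.trees = 50)

# Generic functions applied to the two components
print(model1$Forest)                           # ranger print method
lme4::VarCorr(model1$EffectModel)              # variance components
head(lme4::ranef(model1$EffectModel)$district) # estimated random intercepts

# Convergence diagnostics stored in the MERFranger object
model1$IterationsUsed
model1$LogLik
```

If IterationsUsed equals MaxIterations, the algorithm may have stopped at the iteration limit rather than at ErrorTolerance, and the log-likelihood path in LogLik is worth checking.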