preselect: Preselect variables for MetaForest analysis

Description

Takes a MetaForest object and applies one of several algorithms for variable selection.

Usage

preselect(x, replications = 100L, algorithm = "replicate", ...)

Arguments

x

Model to perform variable selection for. Accepts MetaForest objects.

replications

Integer. Number of replications to run for variable preselection. Default: 100.

algorithm

Character. Preselection method to apply. Currently, 'replicate', 'recursive', and 'bootstrap' are available.

...

Other arguments to be passed to and from functions.

Details

The methods currently available under algorithm are:

replicate

This simply replicates the analysis: the forest has access to the full data set in every replication, but the trees are grown on different bootstrap samples across replications, so only the Monte Carlo error varies.

bootstrap

This replicates the analysis on bootstrapped samples, which means each replication has access to a different sub-sample of the full data set. When this algorithm is selected, cases are either bootstrap-sampled by study, or a new study column is generated and a clustered MetaForest is grown. The clustering is needed because some of the rows in the bootstrapped data will be duplicated, which would otherwise lead to an under-estimation of the OOB error.
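The role of the generated study column can be illustrated with a minimal base-R sketch. It is not the package's implementation: the helper bootstrap_with_clusters and the column name boot_study are hypothetical, and the toy data are made up. The sketch resamples rows with replacement and records the original row index as a cluster identifier, so that copies of the same row could be kept together by a clustered forest instead of being split between in-bag and out-of-bag samples.

```r
# Hypothetical helper, not part of metaforest: row-level bootstrap that
# records which original row each resampled row came from.
bootstrap_with_clusters <- function(data) {
  # Draw row indices with replacement; some rows will appear more than once
  idx <- sample(nrow(data), size = nrow(data), replace = TRUE)
  boot <- data[idx, , drop = FALSE]
  # Assumption: copies of one original row share a cluster id, so a clustered
  # analysis can treat them as a single unit
  boot$boot_study <- idx
  rownames(boot) <- NULL
  boot
}

set.seed(42)
# Toy data, invented for illustration only
toy <- data.frame(yi = c(0.1, 0.4, -0.2, 0.3, 0.0), mod = c(1, 2, 3, 4, 5))
boot <- bootstrap_with_clusters(toy)

# Rows with the same boot_study value are duplicates of one original row
table(boot$boot_study)
```

In a sketch like this, the new column would then be supplied as the clustering variable when the model is refitted on the bootstrapped data.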

recursive

Starting with all moderators, the variable with the most negative variable importance is dropped from the model, and the analysis is re-run. This is repeated until only variables with a positive variable importance are left, or until no variables are left. Across replications, the proportion of final models containing each variable reflects its importance.
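The elimination loop described for the recursive algorithm can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: toy_importance is a hypothetical stand-in for refitting the forest and extracting variable importances, the moderator names a to d and their values are made up, and recursive_select is a hypothetical helper name.

```r
# Hypothetical stand-in for refitting a forest on the remaining moderators
# and returning their variable importances (toy values plus random noise).
toy_importance <- function(vars) {
  true_signal <- c(a = 0.30, b = 0.10, c = 0.00, d = -0.05)
  true_signal[vars] + rnorm(length(vars), sd = 0.05)
}

# Hypothetical helper: drop the variable with the lowest importance until
# all remaining importances are positive or no variables are left.
recursive_select <- function(vars, importance_fun) {
  while (length(vars) > 0) {
    imp <- importance_fun(vars)     # "re-run the analysis"
    if (all(imp > 0)) break         # only positive importances remain
    vars <- vars[-which.min(imp)]   # drop the most negative variable
  }
  vars
}

set.seed(123)
all_vars <- c("a", "b", "c", "d")
# Repeat the whole procedure to obtain one final model per replication
final <- replicate(50, recursive_select(all_vars, toy_importance),
                   simplify = FALSE)

# Proportion of final models that retain each variable
selected <- sapply(all_vars, function(v) {
  mean(sapply(final, function(f) v %in% f))
})
selected
```

Variables with a consistently positive importance end up in most final models, while variables whose importance hovers around or below zero are retained only occasionally.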

Value

An object of class 'mf_preselect'

Examples

## Not run: 
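library(metaforest)  # also attaches metafor, which provides dat.bourassa1996 and escalc()

# Load the example data and compute log odds ratios (yi) and sampling variances (vi)
data <- get(data(dat.bourassa1996))
data <- escalc(measure = "OR", ai = lh.le, bi = lh.re, ci = rh.le, di = rh.re,
               data = data, add = 1/2, to = "all")
# Impute missing mean ages with the median; convert columns 5 to 8 to factors
data$mage[is.na(data$mage)] <- median(data$mage, na.rm = TRUE)
data[c(5:8)] <- lapply(data[c(5:8)], factor)
data$yi <- as.numeric(data$yi)
# Fit the MetaForest model, clustering on the "sample" column
mf.model <- MetaForest(formula = yi ~ selection + investigator + hand_assess +
                         eye_assess + mage + sex,
                       data = data, study = "sample",
                       whichweights = "unif", num.trees = 300)
# Preselect variables using 10 bootstrap replications
preselect(mf.model,
          replications = 10,
          algorithm = "bootstrap")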

## End(Not run)

Example output

Loading required package: ggplot2
Loading required package: metafor
Loading required package: Matrix
Loading 'metafor' package (version 2.4-0). For an overview 
and introduction to the package please type: help(metafor).
Loading required package: ranger
Loading required package: data.table
Mean variable importance across bootstrap replications:
  eye_assess investigator    selection          sex  hand_assess         mage 
    1.62e-01     3.21e-02     7.38e-03    -8.63e-05    -1.93e-03    -2.45e-02 

SD of variable importance across bootstrap replications:
  eye_assess investigator    selection          sex  hand_assess         mage 
      0.2287       0.0650       0.0223       0.0373       0.0680       0.0637 


R2 median:  -0.134447 
R2 sd: 0.2292673

metaforest documentation built on Jan. 8, 2020, 9:06 a.m.