FREEtree: Initial FREEtree call which then calls actual FREEtree...


View source: R/FREEtree.r

Description

Entry point for FREEtree: the initial call, which then dispatches to the actual FREEtree methods depending on the parameters passed.

Usage

FREEtree(
  data,
  fixed_regress = NULL,
  fixed_split = NULL,
  var_select = NULL,
  power = 6,
  minModuleSize = 1,
  cluster,
  maxdepth_factor_screen = 0.04,
  maxdepth_factor_select = 0.5,
  Fuzzy = TRUE,
  minsize_multiplier = 5,
  alpha_screen = 0.2,
  alpha_select = 0.2,
  alpha_predict = 0.05
)

Arguments

data

data to train or test FREEtree on.

fixed_regress

user-specified char vector of regressors that will never be screened out; if fixed_regress = NULL, the method uses a principal component (PC) as the regressor at the screening step.
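As a rough illustration of what using a principal component as a regressor can look like, the sketch below computes the first principal component of a block of correlated features with base R's prcomp. The simulated data and the names module_features and pc1 are hypothetical, and this is not necessarily how FREEtree computes its PC internally.

```r
# Hypothetical module: 30 observations of 4 correlated features.
set.seed(1)
latent <- rnorm(30)
module_features <- sapply(1:4, function(j) latent + rnorm(30, sd = 0.3))
colnames(module_features) <- paste0("V", 1:4)

# First principal component scores: one value per observation,
# usable as a single summary regressor for the whole module.
pca <- prcomp(module_features, center = TRUE, scale. = TRUE)
pc1 <- pca$x[, 1]

length(pc1)
```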

fixed_split

user specified char vector of features to be used in splitting with certainty.

var_select

a char vector containing features to be selected. These features will be clustered by WGCNA and the chosen ones will be used in regression and splitting.

power

soft thresholding power parameter of WGCNA.
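To give a feel for what the soft thresholding power does, here is a minimal base R sketch of an unsigned soft-thresholded adjacency, where each absolute correlation is raised to power. The matrix x is simulated, and the assumption that FREEtree relies on an unsigned network is mine, not stated above.

```r
# Hypothetical feature matrix: 50 observations of 5 features.
set.seed(42)
x <- matrix(rnorm(50 * 5), nrow = 50)
colnames(x) <- paste0("V", 1:5)

power <- 6  # same value as the default in the usage above

# Unsigned soft-thresholded adjacency: |correlation|^power.
adjacency <- abs(cor(x))^power

# The diagonal stays at 1; weak off-diagonal correlations shrink toward 0.
round(adjacency, 3)
```

A larger power suppresses weak correlations more strongly, so fewer features end up grouped together.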

minModuleSize

WGCNA's minimum module size parameter.

cluster

name of the variable identifying each cluster (the random-effect grouping variable), following glmer's implementation.

maxdepth_factor_screen

when selecting features from one module, the maxdepth of the glmertree is set to the ceiling of maxdepth_factor_screen*(number of features in that module). Default is 0.04.
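The depth rule for the screening step is simple arithmetic; the sketch below applies it to a few hypothetical module sizes (the sizes and the name module_sizes are illustrative only).

```r
maxdepth_factor_screen <- 0.04  # default from the usage above

# Hypothetical module sizes (number of features per module).
module_sizes <- c(small = 10, medium = 60, large = 210)

# Depth rule for screening: ceiling(factor * number of features in the module).
maxdepth_screen <- ceiling(maxdepth_factor_screen * module_sizes)
maxdepth_screen
```

Because of the ceiling, even a very small module gets a tree of depth at least 1.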

maxdepth_factor_select

given the screened features (from each module; if Fuzzy = FALSE, these are the non-grey features selected from each non-grey module), a second selection is made among them. The maxdepth of that glmertree is set to the ceiling of maxdepth_factor_select*(number of screened features). Default is 0.5. For the prediction tree (final tree), maxdepth is set to the length of split_var (fixed plus chosen features).
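The selection-step and final-tree depths can be worked through in the same way. In this sketch the screened and chosen feature names are made up for illustration; only the two depth rules come from the description above.

```r
maxdepth_factor_select <- 0.5

# Hypothetical features that survived the screening step.
screened <- paste0("V", c(3, 17, 42, 88, 120, 151, 199))

# Selection step: ceiling(factor * number of screened features).
maxdepth_select <- ceiling(maxdepth_factor_select * length(screened))

# Final tree: depth equals the number of splitting variables
# (user-fixed ones plus those chosen by selection).
fixed_split <- c("treatment")
chosen <- c("V17", "V88")  # hypothetical outcome of the selection step
split_var <- c(fixed_split, chosen)
maxdepth_predict <- length(split_var)

c(select = maxdepth_select, predict = maxdepth_predict)
```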

Fuzzy

boolean. If Fuzzy = TRUE, screening is done as in Fuzzy Forest. If Fuzzy = FALSE, the method first screens within non-grey modules and then selects the final non-grey features from those screened in each non-grey module. These final non-grey features are used as regressors (plus fixed_regress), with the grey features as split_var, to select grey features. The final non-grey features and the selected grey features are then used together as splitting and regression variables for the final prediction. Fuzzy = FALSE is intended for cases where there are many non-grey features and you want to protect grey features.

minsize_multiplier

At the final prediction tree, minsize = minsize_multiplier times the number of final regressors. The default is 5. Note that minsize is set only for the final prediction tree and not for the trees at the feature selection step, since less care is needed during feature selection. Note that when tuning the parameters, a larger alpha and a smaller minsize_multiplier will result in a deeper tree and may therefore cause overfitting. It is recommended to decrease alpha and decrease minsize_multiplier at the same time.
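A small worked example of the minsize rule follows. It assumes that the final regressors are the fixed regressors plus the selected features; the selected feature names are hypothetical.

```r
minsize_multiplier <- 5  # default from the usage above

fixed_regress <- c("time", "time2")
selected_regress <- c("V17", "V88")  # hypothetical selected features

# Assumption: final regressors = fixed regressors + selected features.
final_regressors <- c(fixed_regress, selected_regress)

# Minimum node size for the final prediction tree.
minsize <- minsize_multiplier * length(final_regressors)
minsize
```

With four final regressors, each terminal node of the final tree must hold at least 20 observations, which limits how deep the tree can grow.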

alpha_screen

alpha used in screening step.

alpha_select

alpha used in selection step.

alpha_predict

alpha used in prediction step.

Value

a glmertree object (trained tree).

Examples

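# load the package; `data` below is assumed to be the example dataset it ships with
library(FREEtree)
# locate example data file
dataf <- system.file("data/data.RData", package = "FREEtree")
# fit FREEtree with fixed regressors, a fixed splitting variable, and 200 candidate features
mytree <- FREEtree(data, fixed_regress = c("time", "time2"), fixed_split = c("treatment"),
                   var_select = paste("V", 1:200, sep = ""), minModuleSize = 5,
                   cluster = "patient", Fuzzy = TRUE, maxdepth_factor_select = 0.5,
                   maxdepth_factor_screen = 0.04, minsize_multiplier = 5,
                   alpha_screen = 0.2, alpha_select = 0.2, alpha_predict = 0.05)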

FREEtree documentation built on July 1, 2020, 6:26 p.m.