screen.inter: Adaptive function for screening interactions
In sprinter: Framework for Screening Prognostic Interactions


fit.logicReg and fit.rf screen for interactions in high-dimensional datasets and are intended for use as the argument screen.inter of the function sprinter. They return a variable importance measure for each variable.

fit.rf(nr, data, indices, seed.interselect, ...)

fit.rf.select(nr, data, indices, seed.interselect, n.select, ...)

fit.logicReg(nr, data, indices, seed.interselect,
       type,
       nleaves,
       ntrees, ...)

fit.logicReg.select(nr, data, indices, seed.interselect,
       type,
       nleaves,
       ntrees,
       n.select, ...)

`nr`	number of the current resampling run.
`data`	data frame containing the outcome (y) and the x-variables of the model. The data are orthogonalized to the clinical covariates and to the main effects identified in the main effects detection step.
`indices`	indices to build the resample dataset.
`seed.interselect`	seed for random number generator.
`n.select`	number of variables selected for fitting the random forest or the logic regression (used by fit.rf.select and fit.logicReg.select).
`type`	type of model to be fitted. For survival data, choose (4) proportional hazards model (Cox regression), (5) exponential survival model, or (0) your own scoring function.
`nleaves`	maximum number of leaves to be fit in all trees combined.
`ntrees`	number of logic trees to be fit.

`...`	further arguments passed to methods.

The functions fit.logicReg and fit.rf are adapted for use in the function sprinter in order to screen interactions. They evaluate a variable importance measure for each variable, which sprinter uses to pre-select relevant interaction candidates. Within sprinter, the identified candidates are combined pairwise and offered as possible predictors for the final model.
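The pairwise combination described above can be illustrated with a small sketch. The importance values and variable names are made up, and the code only shows the idea of pairing candidates with positive importance; it is not taken from the sprinter source, which may build the interaction terms differently.

```r
# Hypothetical importance vector, as a screening function might return it
importance <- c(x1 = 0.12, x2 = 0, x3 = 0.03, x4 = -0.01, x5 = 0.4)

# Keep the variables whose importance is larger than zero
candidates <- names(importance)[importance > 0]

# Combine the candidates pairwise into interaction labels
pairs <- combn(candidates, 2, paste, collapse = ":")
pairs
```

With three candidates this yields three pairs; in general k candidates give k * (k - 1) / 2 pairs, which is why a strict pre-selection keeps the final model search manageable.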

fit.rf

This function fits a random forest for survival data and ranks each variable by its permutation accuracy importance. For more information about fitting the random forest, see rfsrc.

fit.rf.select

This function fits a random forest for survival data on a restricted data set. The number of covariables in the restricted data set is set by n.select: the n.select variables with the smallest univariate p-values from Cox regression are kept.
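The restriction step can be sketched as follows. This is a minimal illustration, not the package's implementation: the helper name select.by.cox is hypothetical, and the outcome columns are assumed to be called time and status, which this page does not confirm.

```r
library(survival)

# Hypothetical helper: keep the n.select variables with the smallest
# univariate Cox p-values. Assumes outcome columns `time` and `status`.
select.by.cox <- function(data, n.select) {
  xnames <- setdiff(names(data), c("time", "status"))
  pvals <- vapply(xnames, function(v) {
    fit <- coxph(as.formula(paste("Surv(time, status) ~", v)), data = data)
    summary(fit)$coefficients[1, "Pr(>|z|)"]
  }, numeric(1))
  keep <- names(sort(pvals))[seq_len(min(n.select, length(pvals)))]
  data[, c("time", "status", keep), drop = FALSE]
}

# Toy data: only x1 influences the hazard
set.seed(42)
n <- 200
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
toy$time <- rexp(n, rate = exp(1.5 * toy$x1))
toy$status <- 1
restricted <- select.by.cox(toy, n.select = 2)
names(restricted)
```

The restricted data frame would then be passed on to the screening method in place of the full data set.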

fit.logicReg

To apply logic regression, all continuous variables are first converted to binary variables by splitting at the median. The logic regression is then fitted to the binary data set. The variable importance measure is one if the variable is included in the model and zero if it is not. To obtain information about the variables in a multiple model, setting select = 2 is obligatory.
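The median split can be sketched in base R. The helper name binarize.median is hypothetical, and coding values strictly above the median as 1 is an assumption; the package may treat values equal to the median differently.

```r
# Dichotomize each column at its median (assumption: values strictly
# above the median are coded 1, all others 0).
binarize.median <- function(x) {
  as.data.frame(lapply(x, function(col) {
    as.integer(col > median(col, na.rm = TRUE))
  }))
}

x <- data.frame(a = c(1, 2, 3, 4, 5), b = c(10, 0, 5, 20, 1))
result <- binarize.median(x)
result
```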

fit.logicReg.select

This function performs logic regression on a restricted data set. The number of covariables in the restricted data set is set by n.select: the n.select variables with the smallest univariate p-values from Cox regression are kept.

Implementing new functions for the argument `screen.inter`

New functions for screening interactions must return an importance measure for each variable as a vector of length p. Variables with an importance larger than zero are interpreted as relevant for the model.
The function must accept the following arguments:

nr	number of the current resampling run.
data	data frame containing the y-outcome and x-variables in the model.
indices	indices to build the resample dataset.
seed.interselect	seed for random number generator.

Any function that follows this convention can be used for screening potential interaction candidates.
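A minimal sketch of a user-defined screening function following the four-argument interface is given here. The function name fit.unicox and the argument alpha are hypothetical. The sketch also assumes, without confirmation from this page, that the outcome columns of data are named time and status, that indices is a list holding one index vector per resampling run, and that seed.interselect holds one seed per run. The univariate Cox screen only illustrates the interface; it is not an interaction-specific method.

```r
library(survival)

# Hypothetical screening function. Assumptions: outcome columns `time`
# and `status`; `indices` is a list of index vectors (one per run);
# `seed.interselect` is a vector of seeds (one per run).
fit.unicox <- function(nr, data, indices, seed.interselect, alpha = 0.05, ...) {
  set.seed(seed.interselect[nr])
  resample <- data[indices[[nr]], , drop = FALSE]
  xnames <- setdiff(names(resample), c("time", "status"))

  vapply(xnames, function(v) {
    fit <- coxph(as.formula(paste("Surv(time, status) ~", v)), data = resample)
    p <- summary(fit)$coefficients[1, "Pr(>|z|)"]
    # Importance > 0 marks the variable as relevant, 0 drops it
    if (is.na(p) || p >= alpha) 0 else 1 - p
  }, numeric(1))
}

# Toy call with two bootstrap resamples
set.seed(1)
n <- 100
toy <- data.frame(time = rexp(n), status = rbinom(n, 1, 0.7),
                  x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
idx <- list(sample(n, replace = TRUE), sample(n, replace = TRUE))
imp <- fit.unicox(nr = 1, data = toy, indices = idx, seed.interselect = c(11, 12))
imp
```

The returned vector has one named entry per x-variable, so its length equals p as required.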

fit.rf and fit.logicReg return a vector of length p, containing the variable importance of each variable in the data set.

fit.rf evaluates the permutation accuracy importance (PAM) as a measure of variable importance. The function fit.logicReg returns whether a variable is included in the model (1) or not (0).

Written by Isabell Hoffmann isabell.hoffmann@uni-mainz.de.

Ruczinski I, Kooperberg C, LeBlanc ML (2003). Logic Regression, Journal of Computational and Graphical Statistics, 12, 475-511.

Breiman L (2001). Random Forests, Machine Learning, 45, 5-32.

logreg, rfsrc

sprinter documentation built on May 1, 2019, 8:20 p.m.