# globalboosttest: Testing the additional predictive value of high-dimensional data

## Description

The function globalboosttest implements a permutation-based testing procedure to globally test the (additional) predictive value of a large set of predictors given that a small set of predictors is already available.
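The idea behind a permutation test of additional predictive value can be sketched in base R. The snippet below is a simplified stand-in, not the package's implementation: it uses ordinary least squares instead of boosting, simulated data, and mean squared error as the risk. All object names (`risk`, `risk_real`, `risk_perm`) are illustrative assumptions.

```r
# Simplified illustration: least squares replaces boosting (assumption).
set.seed(1)
n <- 60; p <- 5; q <- 2
Z <- matrix(rnorm(n * q), n, q)   # small set of predictors already available
X <- matrix(rnorm(n * p), n, p)   # candidate predictors to be tested
Y <- drop(Z %*% c(1, -1) + 1.5 * X[, 1] + rnorm(n))

# Condition on Z: fit Y on Z alone and keep the residuals
res <- residuals(lm(Y ~ Z))

# Risk (mean squared error) left after explaining the residuals with X
risk <- function(Xmat) mean(residuals(lm(res ~ Xmat))^2)
risk_real <- risk(X)

# Permute the rows of X to break its link with Y while leaving Z untouched
nperm <- 200
risk_perm <- replicate(nperm, risk(X[sample(n), , drop = FALSE]))

# p-value: share of permutations whose risk is at least as low as the real one
pvalue <- mean(risk_perm <= risk_real)
pvalue
```

Because the first column of the simulated `X` truly affects `Y`, the real-data risk should fall well below the permuted risks, which gives a small p-value.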

## Usage

    globalboosttest(X,Y,Z=NULL,nperm=1000,mstop=1000,mstopAIC=FALSE,pvalueonly=TRUE,plot=FALSE,...)

## Arguments

| Argument | Description |
|---|---|
| X | An n x p matrix or data frame with observations in rows and variables in columns, whose additional predictive value is to be tested. |
| Y | Either an n-vector of type factor (if the prediction outcome is binary), a numeric vector of length n (if the prediction outcome is numeric and uncensored), or a Surv object (if the prediction outcome is a survival time). |
| Z | An n x q matrix or data frame with observations in rows and variables in columns, on which we want to condition. Note that q should be smaller than n. If Z=NULL, the function globalboosttest simply assesses the predictive value of X without conditioning. |
| nperm | The number of permutations used to derive the p-value. |
| mstop | A numeric vector giving the number(s) of boosting steps at which the p-value is to be calculated. |
| mstopAIC | If TRUE, the best number of boosting steps is determined based on AIC using the non-permuted data from the range 1:max(mstop). |
| pvalueonly | Should the function return only the permutation p-value, or also the risk for all numbers of boosting steps and all permutations? |
| plot | If TRUE, a plot is produced showing the minimized criterion for real data (in black) and permuted data (in grey). |
| ... | Further arguments to be passed to the plot function if plot=TRUE. |
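As a rough illustration of the input shapes described above, the following base R sketch builds simulated inputs and checks their dimensions. The dimensions, the column names of `Z`, and the object names `Y_binary` and `Y_numeric` are hypothetical. A survival outcome would instead be built with `Surv(time, status)` from the survival package, as in the Examples section.

```r
set.seed(7)
n <- 40; p <- 200; q <- 3

# High-dimensional predictors under test: n x p
X <- matrix(rnorm(n * p), nrow = n, ncol = p)

# Low-dimensional predictors to condition on: n x q (hypothetical columns)
Z <- data.frame(age    = rnorm(n, mean = 60, sd = 10),
                stage  = rbinom(n, size = 1, prob = 0.5),
                marker = rnorm(n))

# Binary outcome must be a factor; a continuous outcome is a numeric vector
Y_binary  <- factor(rbinom(n, size = 1, prob = 0.5))
Y_numeric <- rnorm(n)

# Consistency checks implied by the argument descriptions
stopifnot(nrow(X) == n, nrow(Z) == n)
stopifnot(length(Y_binary) == n, length(Y_numeric) == n)
stopifnot(ncol(Z) < n)   # q should be smaller than n
```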

## Details

See Boulesteix and Hothorn (2010) for details on the methodology. If mstopAIC=TRUE, the number of boosting steps is selected from the range 1 to max(mstop), regardless of which specific values the vector mstop contains.

## Value

A list with the following components:

| Component | Description |
|---|---|
| riskreal | A numeric vector of length max(mstop) giving the risk computed from the original data set with mstop from 1 to max(mstop) (if pvalueonly=FALSE). |
| riskperm | An nperm x max(mstop) matrix giving the risk computed from the nperm permuted data sets with mstop from 1 to max(mstop) (if pvalueonly=FALSE). |
| mstopAIC | The number of boosting steps selected using the AIC-based procedure (if mstopAIC=TRUE). |
| pvalue | A numeric vector of length length(mstop) (if mstopAIC=FALSE) or length(mstop)+1 (if mstopAIC=TRUE) giving the permutation p-values obtained for each considered value of mstop. |
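To make the relationship between the risk components and the p-values concrete, here is a minimal sketch that derives one p-value per entry of `mstop` from objects shaped like `riskreal` and `riskperm`. The risk values are synthetic, and the comparison rule (the fraction of permutations whose risk is no larger than the real-data risk) is an assumption about how such a p-value could be formed, not a statement of the package's internal code.

```r
set.seed(42)
nperm <- 25
mstop <- c(100, 500, 1000)
max_m <- max(mstop)

# Synthetic stand-ins with the documented shapes (values are made up)
riskreal <- 1 / sqrt(seq_len(max_m))                    # length max(mstop)
riskperm <- outer(runif(nperm, 0.9, 1.3), riskreal)     # nperm x max(mstop)

# Assumed rule: share of permuted risks that are <= the real risk at step m
pvalue <- sapply(mstop, function(m) mean(riskperm[, m] <= riskreal[m]))
names(pvalue) <- paste0("mstop=", mstop)
pvalue
```

In this toy setup every permuted risk curve is a rescaled copy of the real one, so the three p-values coincide. With real output they would generally differ across values of `mstop`.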

## Author(s)

Anne-Laure Boulesteix (http://www.ibe.med.uni-muenchen.de/organisation/mitarbeiter/020_professuren/boulesteix/eng.html),

Torsten Hothorn (http://www.statistik.lmu.de/~hothorn/)

## References

A.-L. Boulesteix and T. Hothorn (2010). Testing the additional predictive value of high-dimensional data. BMC Bioinformatics 11:78.

## Examples

    # load globalboosttest library
    library(globalboosttest)

    # load the simulated data with binary outcome
    data(simdatabin)
    attach(simdatabin)
    # Test with 25 permutations
    test<-globalboosttest(X=X,Y=Y,Z=Z,nperm=25,mstop=c(100,500,1000))

    # load the simulated data with survival outcome
    data(simdatasurv)
    attach(simdatasurv)
    # Test with 25 permutations
    test<-globalboosttest(X=X,Y=Surv(time,status),Z=NULL,nperm=25,mstop=c(100,500,1000),mstopAIC=FALSE)

globalboosttest documentation built on May 29, 2017, 3:27 p.m.