# globalboosttest: Testing the additional predictive value of high-dimensional data

## Description

The function `globalboosttest` implements a permutation-based testing procedure to globally test the (additional) predictive value of a large set of predictors given that a small set of predictors is already available.
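The idea of a permutation-based global test can be illustrated with a minimal base R sketch. This is not the package's implementation: ordinary least squares stands in for boosting, the residual sum of squares stands in for the risk, and all names (`risk`, `risk_real`, `risk_perm`) and the simulated data are hypothetical. It assumes that the rows of `X` are permuted to break the link between `X` and the outcome after adjusting for `Z`.

```r
# Simplified illustration in base R; NOT the boosting procedure used by globalboosttest.
set.seed(1)
n <- 60   # number of observations
p <- 5    # number of candidate predictors in X

Z <- matrix(rnorm(n), n, 1)        # small set of predictors already available
X <- matrix(rnorm(n * p), n, p)    # predictors whose added value is tested
Y <- 1.0 * Z[, 1] + 1.5 * X[, 1] + rnorm(n)

# Step 1: adjust the outcome for Z
res <- residuals(lm(Y ~ Z))

# Step 2: risk (here: residual sum of squares) of a model using X (assumed stand-in)
risk <- function(Xmat, r) sum(residuals(lm(r ~ Xmat))^2)
risk_real <- risk(X, res)

# Step 3: recompute the risk after permuting the rows of X
nperm <- 200
risk_perm <- replicate(nperm, risk(X[sample(n), , drop = FALSE], res))

# Step 4: permutation p-value = share of permuted risks at or below the real risk
pvalue <- mean(risk_perm <= risk_real)
pvalue
```

A small p-value here indicates that the real `X` reduces the risk more than permuted versions of `X` do, which is the kind of evidence the global test looks for.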

## Usage

```r
globalboosttest(X, Y, Z = NULL, nperm = 1000, mstop = 1000, mstopAIC = FALSE,
                pvalueonly = TRUE, plot = FALSE, ...)
```

## Arguments

| Argument | Description |
|---|---|
| `X` | An n x p matrix or data frame with observations in rows and variables in columns, whose additional predictive value is to be tested. |
| `Y` | Either a factor of length n (if the prediction outcome is binary), a numeric vector of length n (if the prediction outcome is numeric and uncensored), or a `Surv` object (if the prediction outcome is a survival time). |
| `Z` | An n x q matrix or data frame with observations in rows and variables in columns, on which we want to condition. Note that q should be smaller than n. If `Z=NULL`, the function `globalboosttest` simply assesses the predictive value of `X` without conditioning. |
| `nperm` | The number of permutations used to derive the p-value. |
| `mstop` | A numeric vector giving the number(s) of boosting steps at which the p-value is to be calculated. |
| `mstopAIC` | If `TRUE`, the best number of boosting steps is determined based on AIC using the non-permuted data, from the range `1:max(mstop)`. |
| `pvalueonly` | Should the function return only the permutation p-value, or also the risk for all numbers of boosting steps and all permutations? |
| `plot` | If `TRUE`, a plot is drawn showing the minimized criterion for the real data (in black) and the permuted data (in grey). |
| `...` | Further arguments to be passed to the `plot` function if `plot=TRUE`. |

## Details

See Boulesteix and Hothorn (2010) for details on the methodology. If `mstopAIC=TRUE`, the number of boosting steps is chosen from 1 to `max(mstop)`, independently of the specific values included in the vector `mstop`.

## Value

A list with the following components:

| Component | Description |
|---|---|
| `riskreal` | A numeric vector of length `max(mstop)` giving the risk computed from the original data set with mstop from 1 to `max(mstop)` (if `pvalueonly=FALSE`). |
| `riskperm` | An `nperm` x `max(mstop)` matrix giving the risk computed from the `nperm` permuted data sets with mstop from 1 to `max(mstop)` (if `pvalueonly=FALSE`). |
| `mstopAIC` | The number of boosting steps selected using the AIC-based procedure (if `mstopAIC=TRUE`). |
| `pvalue` | A numeric vector of length `length(mstop)` (if `mstopAIC=FALSE`) or `length(mstop)+1` (if `mstopAIC=TRUE`) giving the permutation p-values obtained for each considered value of `mstop`. |
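To make the shapes of `riskreal`, `riskperm` and `pvalue` concrete, the following base R sketch builds mock risk curves with the documented dimensions and derives one p-value per entry of `mstop`. The risk values are simulated and purely hypothetical, and the comparison rule (share of permuted risks at or below the real risk) is an assumption made for illustration, not a statement about how the package computes its p-values.

```r
# Mock objects with the documented shapes; the numbers are simulated, not package output.
set.seed(42)
nperm <- 25
mstop <- c(100, 500, 1000)
max_mstop <- max(mstop)

# riskreal: one risk value per boosting step 1..max(mstop)
riskreal <- 100 - 0.02 * seq_len(max_mstop)

# riskperm: nperm x max(mstop) matrix, one row per permuted data set
riskperm <- matrix(
  100 - 0.01 * rep(seq_len(max_mstop), each = nperm) +
    rnorm(nperm * max_mstop, sd = 0.5),
  nrow = nperm, ncol = max_mstop
)

# Assumed rule: p-value at step m = share of permuted risks <= real risk at step m
pvalue <- sapply(mstop, function(m) mean(riskperm[, m] <= riskreal[m]))
names(pvalue) <- mstop
pvalue
```

With `pvalueonly=FALSE`, the returned `riskreal` and `riskperm` could be inspected in a similar way, for example to see how the evidence changes with the number of boosting steps.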

## Author(s)

Anne-Laure Boulesteix (http://www.ibe.med.uni-muenchen.de/organisation/mitarbeiter/020_professuren/boulesteix/eng.html),

Torsten Hothorn (http://www.statistik.lmu.de/~hothorn/)

## References

A. L. Boulesteix and T. Hothorn (2010). Testing the additional predictive value of high-dimensional data. BMC Bioinformatics 11:78.

## Examples

```r
# load globalboosttest library
library(globalboosttest)

# load the simulated data with binary outcome
data(simdatabin)
attach(simdatabin)

# Test with 25 permutations
test <- globalboosttest(X = X, Y = Y, Z = Z, nperm = 25, mstop = c(100, 500, 1000))

# load the simulated data with survival outcome
data(simdatasurv)
attach(simdatasurv)

# Test with 25 permutations
test <- globalboosttest(X = X, Y = Surv(time, status), Z = NULL, nperm = 25,
                        mstop = c(100, 500, 1000), mstopAIC = FALSE)
```

globalboosttest documentation built on May 2, 2019, 2:09 a.m.