fprUnoShort: False Positive Rate for arbitrary predition models

Description Usage Arguments Details Value Note Author(s) References See Also Examples

View source: R/DiscSurvEvaluation.R

Description

Estimates the false positive rate (based on concept of Uno, et al.) for an arbitrary discrete survival prediction model on one test data set.

Usage

1
fprUnoShort(timepoint, marker, newTime, newEvent)

Arguments

timepoint

Gives the discrete time interval of which the fpr is evaluated (numeric scalar).

marker

Gives the predicted values of the linear predictor of a regression model (numeric vector). May also be on the response scale.

newTime

New time intervals in the test data (integer vector).

newEvent

New event indicators in the test data (integer binary vector).

Details

This function is useful, if other models than generalized, linear models (glm) should be used for prediction. In the case of glm better use the cross validation version fprUno.

Value

Note

It is assumed that all time points up to the last interval [a_q, Inf) are available in short format. If not already present, these can be added manually.

Author(s)

Thomas Welchowski [email protected]

Matthias Schmid [email protected]

References

Hajime Uno and Tianxi Cai and Lu Tian and L. J. Wei, (2007), Evaluating Prediction Rules for t-Year Survivors With Censored Regression Models, Journal of the American Statistical Association

Patrick J. Heagerty and Yingye Zheng, (2005), Survival Model Predictive Accuracy and ROC Curves, Biometrics 61, 92-105

See Also

tprUno, tprUnoShort, aucUno, concorIndex, createFolds, glm

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
##################################################
# Example with unemployment data and prior fitting

library(Ecdat)
library(caret)
library(mgcv)
data(UnempDur)
summary(UnempDur$spell)
# Extract subset of data
set.seed(635)
IDsample <- sample(1:dim(UnempDur)[1], 100)
UnempDurSubset <- UnempDur [IDsample, ]
set.seed(-570)
TrainingSample <- sample(1:100, 75)
UnempDurSubsetTrain <- UnempDurSubset [TrainingSample, ]
UnempDurSubsetTest <- UnempDurSubset [-TrainingSample, ]

# Convert to long format
UnempDurSubsetTrainLong <- dataLong(dataSet=UnempDurSubsetTrain, 
timeColumn="spell", censColumn="censor1")

# Estimate gam with smooth baseline
gamFit <- gam(formula=y ~ s(I(as.numeric(as.character(timeInt)))) + 
s(age) + s(logwage), data=UnempDurSubsetTrainLong, family=binomial())
gamFitPreds <- predict(gamFit, newdata=cbind(UnempDurSubsetTest, timeInt=UnempDurSubsetTest$spell))

# Estimate false positive rate
fprGamFit <- fprUnoShort (timepoint=1, marker=gamFitPreds, 
newTime=UnempDurSubsetTest$spell, newEvent=UnempDurSubsetTest$censor1)
plot(fprGamFit)

#####################################
# Example National Wilm's Tumor Study

library(survival)
head(nwtco)
summary(nwtco$rel)

# Select subset
set.seed(-375)
Indices <- sample(1:dim(nwtco)[1], 500)
nwtcoSub <- nwtco [Indices, ]

# Convert time range to 30 intervals
intLim <- quantile(nwtcoSub$edrel, prob=seq(0, 1, length.out=30))
intLim [length(intLim)] <- intLim [length(intLim)] + 1
nwtcoSubTemp <- contToDisc(dataSet=nwtcoSub, timeColumn="edrel", intervalLimits=intLim)
nwtcoSubTemp$instit <- factor(nwtcoSubTemp$instit)
nwtcoSubTemp$histol <- factor(nwtcoSubTemp$histol)
nwtcoSubTemp$stage <- factor(nwtcoSubTemp$stage)

# Split in training and test sample
set.seed(-570)
TrainingSample <- sample(1:dim(nwtcoSubTemp)[1], round(dim(nwtcoSubTemp)[1]*0.75))
nwtcoSubTempTrain <- nwtcoSubTemp [TrainingSample, ]
nwtcoSubTempTest <- nwtcoSubTemp [-TrainingSample, ]

# Convert to long format
nwtcoSubTempTrainLong <- dataLong(dataSet=nwtcoSubTempTrain, 
timeColumn="timeDisc", censColumn="rel")

# Estimate glm
inputFormula <- y ~ timeInt + histol + instit + stage
glmFit <- glm(formula=inputFormula, data=nwtcoSubTempTrainLong, family=binomial())
linPreds <- predict(glmFit, newdata=cbind(nwtcoSubTempTest, 
timeInt=nwtcoSubTempTest$timeDisc))

# Estimate tpr given one training and one test sample at time interval 10
fprFit <- fprUnoShort (timepoint=10, marker=linPreds, 
newTime=nwtcoSubTempTest$timeDisc, newEvent=nwtcoSubTempTest$rel)
plot(fprFit)

discSurv documentation built on May 29, 2017, 6:47 p.m.