Prediction from Fitted Piecewise Constant Hazards Models

Description

This function returns predictions for an object of class “pch”, usually the result of a call to pchreg.

Usage

## S3 method for class 'pch'
predict(object, type = c("distr", "quantile", "sim"), 
   newdata, p, sim.method = c("quantile", "sample"), ...)

Arguments

object

a “pch” object.

type

a character string (just the first letter can be used) indicating the type of prediction. See ‘Details’.

newdata

optional data frame in which to look for variables with which to predict. It must include all the covariates that enter the model and, if type = 'distr', also the time variable. If omitted, the original data are used.

p

vector of probabilities (the orders of the quantiles to be computed, e.g., 0.5 for the median), to be specified if type = "quantile".

sim.method

a character string (just the first letter can be used) indicating the simulation method if type = "sim".

...

for future methods.

Details

If type = "distr" (the default), this function returns a data frame with columns (haz, Haz, Surv, f) containing the fitted values of the hazard function, the cumulative hazard, the survival function, and the probability density function, respectively.
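
The four columns are linked by the usual identities for a piecewise constant hazard. The following is a minimal base-R sketch of those identities, not the package's internal code: the breaks, the hazard values, and the helper name pch_distr are hypothetical and chosen only for illustration.

```r
# Hypothetical piecewise constant hazard (not taken from a fitted model)
breaks <- c(0, 1, 2, 4)        # interval endpoints
lambda <- c(0.5, 1.0, 0.25)    # hazard within each interval (assumed > 0)

# Illustrative helper: hazard, cumulative hazard, survival and density at t.
# Assumes t >= breaks[1]; beyond the last break the last hazard is carried forward.
pch_distr <- function(t, breaks, lambda) {
  # index of the interval containing each t (forced into 1..length(lambda))
  j <- findInterval(t, breaks, all.inside = TRUE)
  # cumulative hazard accumulated up to each break
  H0 <- c(0, cumsum(lambda * diff(breaks)))
  haz <- lambda[j]
  Haz <- H0[j] + haz * (t - breaks[j])
  Surv <- exp(-Haz)
  data.frame(haz = haz, Haz = Haz, Surv = Surv, f = haz * Surv)
}

pch_distr(c(0.5, 1.5, 3), breaks, lambda)
```

In this sketch the density is obtained as f = haz * Surv, and Surv = exp(-Haz), so any one of the columns can be checked against the others.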

If type = "quantile", a data frame with the fitted quantiles (corresponding to the supplied values of p) is returned.
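
For intuition, a quantile of a piecewise constant hazard distribution can be obtained by inverting the cumulative hazard. The sketch below is written in base R under assumed inputs; the breaks, hazards, and the helper name pch_quantile are hypothetical, and the package may compute fitted quantiles differently.

```r
# Hypothetical piecewise constant hazard (not taken from a fitted model)
breaks <- c(0, 1, 2, 4)        # interval endpoints
lambda <- c(0.5, 1.0, 0.25)    # hazard within each interval (assumed > 0)

# Illustrative helper: solve Haz(q) = -log(1 - p) for q.
pch_quantile <- function(p, breaks, lambda) {
  H0 <- c(0, cumsum(lambda * diff(breaks)))  # cumulative hazard at each break
  target <- -log(1 - p)                      # cumulative hazard to be reached
  # interval in which the target is reached; targets beyond the last break
  # are assigned to the last interval, i.e. its hazard is carried forward
  j <- findInterval(target, H0, all.inside = TRUE)
  breaks[j] + (target - H0[j]) / lambda[j]
}

pch_quantile(c(0.25, 0.5, 0.75), breaks, lambda)
```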

If type = "sim", new data are simulated from the fitted model. Two methods are available. With sim.method = "quantile", data are simulated by applying the estimated quantile function to a vector of random uniform numbers. With sim.method = "sample", the quantile function is only used to identify the time interval, and the data are resampled from the observed values in that interval. The second method only works properly if there is a large number of breaks; however, it is less sensitive to model misspecification and facilitates sampling from distributions with a probability mass or non-compact support.
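
The difference between the two simulation methods can be illustrated with a small base-R sketch. Everything here is an assumption made for illustration (the breaks, the hazards, the observed times obs, and the helper qfun); it mimics the idea described above and is not the package's implementation.

```r
set.seed(1)

# Hypothetical inputs (not taken from a fitted model)
breaks <- c(0, 1, 2, 4)                       # interval endpoints
lambda <- c(0.5, 1.0, 0.25)                   # hazard within each interval
obs <- c(0.2, 0.7, 1.1, 1.6, 1.9, 2.5, 3.8)   # observed times, some in every interval

# Illustrative quantile function: invert the cumulative hazard
qfun <- function(p) {
  H0 <- c(0, cumsum(lambda * diff(breaks)))
  target <- -log(1 - p)
  j <- findInterval(target, H0, all.inside = TRUE)
  breaks[j] + (target - H0[j]) / lambda[j]
}

u <- runif(5)

# "quantile"-style simulation: quantile function applied to uniform draws
sim_q <- qfun(u)

# "sample"-style simulation: the quantile only selects the interval,
# then an observed value from that interval is drawn at random
obs_int <- findInterval(obs, breaks, all.inside = TRUE)
sim_int <- findInterval(sim_q, breaks, all.inside = TRUE)
sim_s <- vapply(sim_int, function(k) {
  pool <- obs[obs_int == k]
  pool[sample.int(length(pool), 1)]   # index-based draw, safe for pools of size 1
}, numeric(1))
```

In this sketch sim_s can only take values that were actually observed, which is why the second approach needs many breaks (so that each interval is narrow) but preserves features such as gaps or point masses in the data.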

If the data are censored, some high quantiles may not be estimated: beyond the last observable quantile, all types of predictions (including type = "sim" with sim.method = "sample") are computed assuming that the hazard remains constant after the last interval.

Predictions are computed at newdata, if supplied. In the current implementation, newdata must include all the variables that are needed for the prediction. Note that if type = "distr", new values of the response variable are also required.

Value

If type = "distr", a four-column data frame with columns (haz, Haz, Surv, f). If type = "quantile", a named data frame with a column for each value of p. If type = "sim", a vector of simulated data.

The presence of NA values will always cause the prediction to be NA.

Author(s)

Paolo Frumento <paolo.frumento@ki.se>

See Also

pchreg

Examples

  library(pch)
  library(survival) # provides Surv()

  # using simulated data
  
  ##### EXAMPLE 1 - Continuous distribution ############################
  
  n <- 1000
  x <- runif(n)
  time <- rnorm(n, 1 + x, 1 + x)
  cens <- rnorm(n,2,2)
  y <- pmin(time,cens) # censored variable
  d <- (time <= cens) # indicator of the event
  model <- pchreg(Surv(y,d) ~ x, breaks = 20)

  # predicting hazard, cumulative hazard, survival, density

  pred <- predict(model, type = "distr")
  plot(pred$Surv, 1 - pnorm(y, 1 + x, 1 + x)); abline(0,1) 
  # true vs fitted survival
  
  
  # predicting quartiles

  predQ <- predict(model, type = "quantile", p = c(0.25,0.5,0.75))
  plot(x,time)
  points(x, qnorm(0.5, 1 + x, 1 + x), col = "red") # true median
  points(x, predQ$p0.5, col = "green")             # fitted median
  
  
  # simulating new data
  
  tsim1 <- predict(model, type = "sim", sim.method = "quantile")
  tsim2 <- predict(model, type = "sim", sim.method = "sample")

  qt <- quantile(time, (1:9)/10)  # deciles of time
  q1 <- quantile(tsim1, (1:9)/10) # deciles of tsim1
  q2 <- quantile(tsim2, (1:9)/10) # deciles of tsim2

  par(mfrow = c(1,2))
  plot(qt,q1, main = "sim.method = 'quantile'"); abline(0,1)
  plot(qt,q2, main = "sim.method = 'sample'"); abline(0,1)

  # prediction with newdata
  
  predict(model, type = "distr", newdata = data.frame(y = 0, x = 0.5)) # need y!
  predict(model, type = "quantile", p = 0.5, newdata = data.frame(x = 0.5))
  predict(model, type = "sim", sim.method = "sample", newdata = data.frame(x = c(0,1)))

  ##### EXAMPLE 2 - non-compact support ############################
  # to simulate, sim.method = "sample" is recommended ##############
  
  n <- 1000
  t <- c(rnorm(n,-5), rnorm(n,5)) 
  model <- pchreg(Surv(t) ~ 1, breaks = 30)
  
  tsim1 <- predict(model, type = "sim", sim.method = "quantile")
  tsim2 <- predict(model, type = "sim", sim.method = "sample")
  
  par(mfrow = c(1,3))
  hist(t, main = "true distribution")
  hist(tsim1, main = "sim.method = 'quantile'") # the empty spaces are 'filled'
  hist(tsim2, main = "sim.method = 'sample'")   # perfect!