# bshazard: Nonparametric Smoothing of the Hazard Function In bshazard: Nonparametric Smoothing of the Hazard Function

## Description

The function estimates the hazard function non parametrically from a survival object (possibly adjusted for covariates). The smoothed estimate is based on B-splines from the perspective of generalized linear mixed models. Left truncated and right censoring data are allowed.

## Usage

 ```1 2``` ```## Default S3 method: bshazard(formula,data,nbin=NULL,nk,degree,l0,lambda,phi,alpha,err,verbose) ```

## Arguments

 `formula` a formula object, which must have a Surv object as the response on the left of the ~ operator. On the right, if desired are the names of the covariates in the Poisson model separated by + operators. For unadjusted estimates the right hand side should be ~1. `data` a data frame containing the variables named in the formula. `nbin` number of bins (equally spaced). If omitted, the function will split the data at each distinct time in the data (censoring or event). `nk` number of knots for B-splines (including minimum and maximum, default is 31): as a rule of thumb the number of observations divided by 4, but more than 40 knots are rarely needed. `degree` degree of B-splines, representing the effective number of parameters in the piecewise segments of splines (default is 1, corresponding to a linear segment) `l0` Starting value for the smoothing parameter lambda (default is 10). Relevant only if lambda=NULL `lambda` smoothing parameter. If not provided (default is NULL) it is estimated from the data as phi times the reciprocal of the variance of the random effect (by an iterative algorithm). A high value imposes a smoother estimate. `phi` overdispersion parameter. If not provided (default is NULL) it is estimated from the data. `alpha` 1 minus the level for the two-sided confidence interval for the hazard function. Default is 0.05. `err` relative error for the iterative process in the lambda/phi estimation (default is 0.0001) `verbose` TRUE/FALSE Print each iteration step (default is T)

## Details

The time axis is partitioned in a number of intervals equal to the number of different observations (if not fixed otherwise by the option `nbin`). The number of events in each interval is modeled by a Poisson model and the smoothing parameter (lambda) is estimated by a mixed model framework. The number of knots is 31 by default. The code also allows for overdispersion (phi). Adjustment for covariates can increase the computation time.

## Value

The output of the bshazard function can be divided into three parts: 1. a data frame with the original data split in time intervals, 2. the set of vectors containing the estimated hazard and confidence limits and 3. the parameter estimates.

 `raw.data` data frame with original data split in bins (intervals of time), containing: `time` mid point of each bin `n.event` number of events in each bin `PY` total person-time in each bin (for each covariate value) `raw.hazard` number of events in each interval divided by PY `...` covariate values `nobs` number of records in input data
 `time` mid point of each bin at which the curve is computed `hazard` hazard estimate for each bin `lower.ci` lower limit of the hazard confidence interval (depends on alpha level) `upper.ci` upper limit of the hazard confidence interval (depends on alpha level) `(cov.value)` values of covariates at which hazard is computed (mean value) `fitted.values` estimated number of events at each time (from the Poisson model)
 `coefficients` coefficient estimates for each covariate `phi` overdispersion parameter `sv2` variance of the random effect corresponding to the inverse of the smoothing parameter multiplied by phi `df` degrees of freedom representing the effective number of smoothing parameters

## Author(s)

Paola Rebora, Agus Salim, Marie Reilly Maintainer: Paola Rebora <paola.rebora@unimib.it>

## References

Rebora P, Salim A, Reilly M (2014) bshazard: A Flexible Tool for Nonparametric Smoothing of the Hazard Function.The R Journal Vol. 6/2:114-122.

Lee Y, Nelder JA, Pawitan Y (2006). Generalized Linear Models with Random Effects: Unified Analysis via H-likelihood, volume 106. Chapman & Hall/CRC.

Pawitan Y (2001). In All Likelihood: Statistical Modelling and Inference Using Likelihood. Oxford University Press

## Examples

 ``` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60``` ```data(lung,package="survival") fit<-bshazard(Surv(time, status==2) ~ 1,data=lung) plot(fit\$time, fit\$hazard,xlab='Time', ylab='Rate/person-days',type='l') points(fit\$raw.data\$time,fit\$raw.data\$raw.hazard,cex=.3, lwd=3,col=1) lines(fit\$time, fit\$hazard, lty=1, lwd=3) lines(fit\$time, fit\$lower.ci, lty=1, lwd=1) lines(fit\$time, fit\$upper.ci, lty=1, lwd=1) # or alternatively plot(fit) #Example to graphically evaluate the Markov assumpion in an illness-death model ### data simulated under EXTENDED SEMI-MARKOV model set.seed(123) n <- 500 beta<-log(3) R <- runif(n, min = 0, max =2) cat <- cut(R, breaks = seq(0,2,0.5), labels = seq(1,4,1)) T12 <- 1/0.2*( (-log(runif(n, min = 0, max = 1))) / (exp(beta*R)))**(1/0.6) C <- runif(n, min =0, max = 10) T12obs <- pmin(T12, C) status <- ifelse(T12obs < C, 1, 0) T012obs <- R+T12obs #R: simulated time to illness #cat: time to illness categorised in 4 classes #T12obs: time from illness to death or censoring #T012obs: time from origin to death or censoring #status: indicator of death (1=death,0=censoring) # Hazard estimate in the Clock Reset scale (time from illness) by time to illness class fit <- bshazard(Surv(T12obs[cat == 1],status[cat==1]) ~ 1) plot(fit,conf.int=FALSE,xlab='Time since illness',xlim=c(0,5),ylim=c(0,10),lwd=2,col=rainbow(1)) for(i in 2:4) { fit <- bshazard(Surv(T12obs[cat == i],status[cat==i]) ~ 1) lines(fit\$time, fit\$hazard, type = 'l', lwd = 2, col = rainbow(4)[i]) } legend("top",title="Time to illness",c("<=0.5","0.5-|1","1-|1.5","1.5-|2"),col=c(rainbow(4)),lwd =2) # Hazard estimate in the Clock Forward scale (time from origin) by time to illness class fit <- bshazard(Surv(R[cat == 1],T012obs[cat == 1],status[cat==1]) ~ 1) plot(fit,conf.int=FALSE,xlab='Time since origin',xlim=c(0,5),ylim=c(0,10),lwd=2,col=rainbow(1)) for(i in 2:4) { fit <- bshazard(Surv(R[cat == i],T012obs[cat == i],status[cat==i]) ~ 1) lines(fit\$time, fit\$hazard, type = 'l', lwd = 2, col = rainbow(4)[i]) } legend("top",title="Time to illness",c("<=0.5","0.5-|1","1-|1.5","1.5-|2"),col=c(rainbow(4)),lwd =2) #Alternatively an adjusted estimate can be performed, assuming proportionl hazard. ##This computation can be time consuming! ## Not run: fit.adj <- bshazard(Surv(T12obs,status) ~ cat ) plot(fit.adj,overall=FALSE, xlab = 'Time since illness',col=rainbow(4),lwd=2, xlim = c(0,5), ylim = c(0,10)) legend("top",title="Time to illness",c("<=0.5","0.5-|1","1-|1.5","1.5-|2"),col=c(rainbow(4)),lwd =2) ## End(Not run) ### A more general setting with examples of Markov assumption evaluation can be found ### in Bernasconi et al. Stat in Med 2016 ## The function is currently defined as function (x, ...) UseMethod("bshazard") ```

bshazard documentation built on May 2, 2019, 5:58 a.m.