crr: Competing Risks Regression

Description Usage Arguments Details Value References See Also Examples


regression modeling of subdistribution functions in competing risks


crr(ftime, fstatus, cov1, cov2, tf, cengroup, failcode=1, cencode=0,
subset, na.action=na.omit, gtol=1e-06, maxiter=10, init, variance=TRUE)



vector of failure/censoring times


vector with a unique code for each failure type and a separate code for censored observations


matrix (nobs x ncovs) of fixed covariates (either cov1, cov2, or both are required)


matrix of covariates that will be multiplied by functions of time; if used, often these covariates would also appear in cov1 to give a prop hazards effect plus a time interaction


functions of time. A function that takes a vector of times as an argument and returns a matrix whose jth column is the value of the time function corresponding to the jth column of cov2 evaluated at the input time vector. At time tk, the model includes the term cov2[,j]*tf(tk)[,j] as a covariate.


vector with different values for each group with a distinct censoring distribution (the censoring distribution is estimated separately within these groups). All data in one group, if missing.


code of fstatus that denotes the failure type of interest


code of fstatus that denotes censored observations


a logical vector specifying a subset of cases to include in the analysis


a function specifying the action to take for any cases missing any of ftime, fstatus, cov1, cov2, cengroup, or subset.


iteration stops when a function of the gradient is < gtol


maximum number of iterations in Newton algorithm (0 computes scores and var at init, but performs no iterations)


initial values of regression parameters (default=all 0)


If FALSE, then suppresses computation of the variance estimate and residuals


Fits the 'proportional subdistribution hazards' regression model described in Fine and Gray (1999). This model directly assesses the effect of covariates on the subdistribution of a particular type of failure in a competing risks setting. The method implemented here is described in the paper as the weighted estimating equation.

While the use of model formulas is not supported, the model.matrix function can be used to generate suitable matrices of covariates from factors, eg model.matrix(~factor1+factor2)[,-1] will generate the variables for the factor coding of the factors factor1 and factor2. The final [,-1] removes the constant term from the output of model.matrix.

The basic model assumes the subdistribution with covariates z is a constant shift on the complementary log log scale from a baseline subdistribution function. This can be generalized by including interactions of z with functions of time to allow the magnitude of the shift to change with follow-up time, through the cov2 and tfs arguments. For example, if z is a vector of covariate values, and uft is a vector containing the unique failure times for failures of the type of interest (sorted in ascending order), then the coefficients a, b and c in the quadratic (in time) model a*z+b*z*t+c*z*t*t can be fit by specifying cov1=z, cov2=cbind(z,z), tf=function(uft) cbind(uft,uft*uft).

This function uses an estimate of the survivor function of the censoring distribution to reweight contributions to the risk sets for failures from competing causes. In a generalization of the methodology in the paper, the censoring distribution can be estimated separately within strata defined by the cengroup argument. If the censoring distribution is different within groups defined by covariates in the model, then validity of the method requires using separate estimates of the censoring distribution within those groups.

The residuals returned are analogous to the Schoenfeld residuals in ordinary survival models. Plotting the jth column of res against the vector of unique failure times checks for lack of fit over time in the corresponding covariate (column of cov1).

If variance=FALSE, then some of the functionality in summary.crr and print.crr will be lost. This option can be useful in situations where crr is called repeatedly for point estimates, but standard errors are not required, such as in some approaches to stepwise model selection.


Returns a list of class crr, with components


the estimated regression coefficients


log pseudo-liklihood evaluated at coef


derivitives of the log pseudo-likelihood evaluated at coef


-second derivatives of the log pseudo-likelihood


estimated variance covariance matrix of coef


matrix of residuals giving the contribution to each score (columns) at each unique failure time (rows)


vector of unique failure times


jumps in the Breslow-type estimate of the underlying sub-distribution cumulative hazard (used by predict.crr())


the tfs matrix (output of tf(), if used)


TRUE if the iterative algorithm converged


The call to crr


The number of observations used in fitting the model


The number of observations removed from the input data due to missing values


The value of the log pseudo-likelihood when all the coefficients are 0


- inverse of second derivative matrix of the log pseudo-likelihood


Fine JP and Gray RJ (1999) A proportional hazards model for the subdistribution of a competing risk. JASA 94:496-509.

See Also

predict.crr print.crr plot.predict.crr summary.crr


# simulated data to test 
ftime <- rexp(200)
fstatus <- sample(0:2,200,replace=TRUE)
cov <- matrix(runif(600),nrow=200)
dimnames(cov)[[2]] <- c('x1','x2','x3')
print(z <- crr(ftime,fstatus,cov))
z.p <- predict(z,rbind(c(.1,.5,.8),c(.1,.5,.2)))
# quadratic in time for first cov
crr(ftime,fstatus,cov,cbind(cov[,1],cov[,1]),function(Uft) cbind(Uft,Uft^2))
#additional examples in test.R

Example output

Loading required package: survival
convergence:  TRUE 
     x1      x2      x3 
-0.6306  0.2024  0.6384 
standard errors:
[1] 0.4043 0.3782 0.4105
two-sided p-values:
  x1   x2   x3 
0.12 0.59 0.12 
Competing Risks Regression

crr(ftime = ftime, fstatus = fstatus, cov1 = cov)

     coef exp(coef) se(coef)      z p-value
x1 -0.631     0.532    0.404 -1.560    0.12
x2  0.202     1.224    0.378  0.535    0.59
x3  0.638     1.893    0.411  1.555    0.12

   exp(coef) exp(-coef)  2.5% 97.5%
x1     0.532      1.879 0.241  1.18
x2     1.224      0.817 0.583  2.57
x3     1.893      0.528 0.847  4.23

Num. cases = 200
Pseudo Log-likelihood = -314 
Pseudo likelihood ratio test = 5.36  on 3 df,
convergence:  TRUE 
      x1       x2       x3 
 0.01017 -0.17020 -0.58140 
standard errors:
[1] 0.4418 0.4799 0.4217
two-sided p-values:
  x1   x2   x3 
0.98 0.72 0.17 
convergence:  TRUE 
                            x1                             x2 
                       -0.7257                         0.1976 
                            x3 cbind(cov[, 1], cov[, 1])1*Uft 
                        0.6392                         0.3692 
cbind(cov[, 1], cov[, 1])2*tf2 
standard errors:
[1] 0.8403 0.3785 0.4116 1.1810 0.3142
two-sided p-values:
                            x1                             x2 
                          0.39                           0.60 
                            x3 cbind(cov[, 1], cov[, 1])1*Uft 
                          0.12                           0.75 
cbind(cov[, 1], cov[, 1])2*tf2 

cmprsk documentation built on May 29, 2017, 8 p.m.