Introduction to nph and Usage Instructions"

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

Overview

The nph package includes functions to model survival distributions in terms of piecewise constant hazards and to simulate data from the specified distributions.

Installation {-}

The package is available from CRAN and can be installed directly from R.

install.packages("nph")

# For dev version
# install.packages("devtools")
devtools::install_github("repo/nph")

Getting started

Basically, there are three mechanisms for non-proportionality available in this package:

These scenarios are illustrated in the following figures. Note that the hazard ratio is not constant across time.

library(nph)
times <- c(0, 5 * 365)   # Time interval boundaries, in days

# Treatment group
t_resp <- 1              # There are no subgroups
B5 <- pop_pchaz(
  T = times,
  lambdaMat1    = m2r(matrix(18, nrow = 1)),
  lambdaMat2    = m2r(matrix(11, nrow = 1)),
  lambdaProgMat = m2r(matrix( 9, nrow = 1)),
  p = t_resp,
  timezero = FALSE, discrete_approximation = TRUE
)

# Control group
c_resp <- 1              # There are no subgroups
K5  <- pop_pchaz(
  T = times,
  lambdaMat1    = m2r(matrix(11, nrow = 1)),
  lambdaMat2    = m2r(matrix( 9, nrow = 1)),
  lambdaProgMat = m2r(matrix( 5, nrow = 1)),
  p = c_resp,
  timezero = TRUE, discrete_approximation = TRUE
)
plot_shhr(A = K5, B = B5, main = "Different disease progression by treatment")
times <- c(0, 100, 5 * 365)   # Time interval boundaries, in days

# Treatment group
t_resp <- 1               # There are no subgroups
B5 <- pop_pchaz(
  T = times,
  lambdaMat1    = m2r(matrix(c(11, 18), nrow = 1)),
  lambdaMat2    = m2r(matrix(c( 9, 11), nrow = 1)),
  lambdaProgMat = m2r(matrix(c( 5,  9), nrow = 1)),
  p = t_resp,
  timezero = FALSE, discrete_approximation = TRUE
)

# Control group
c_resp <- 1 
K5  <- pop_pchaz(
  T = times,
  lambdaMat1    = m2r(matrix(c(11, 11), nrow = 1)),
  lambdaMat2    = m2r(matrix(c( 9,  9), nrow = 1)),
  lambdaProgMat = m2r(matrix(c( 5,  5), nrow = 1)),
  p = c_resp,
  timezero = TRUE, discrete_approximation = TRUE
)

plot_shhr(K5, B5, main = "Different effect by time intervals")
times <- c(0, 5 * 365)   # Time interval boundaries, in days

# Treatment group
t_resp <-  c(.2, .8)
B5 <- pop_pchaz(
  T = times,
  lambdaMat1    = m2r(matrix(c(30,
                               18),nrow = 2)),
  lambdaMat2    = m2r(matrix(c(20,
                               11),nrow = 2)),
  lambdaProgMat = m2r(matrix(c(15, 
                               9), nrow = 2)),
  p = t_resp,
  timezero = FALSE, discrete_approximation = TRUE
)

# Control group
c_resp <- 1 
K5  <- pop_pchaz(
  T = times,
  lambdaMat1    = m2r(matrix(11,nrow = 1)),
  lambdaMat2    = m2r(matrix(9, nrow = 1)),
  lambdaProgMat = m2r(matrix(5, nrow = 1)),
  p = c_resp,
  timezero = TRUE, discrete_approximation = TRUE
)

plot_shhr(K5, B5, main = "Presence of subgroups with differential treatment effect")

Basics

The functions of the package can be grouped according to their functionality.

Functions for modelling/setting the underlying survival model:

Functions for generating simulated dataset given for a specified survival model:

Functions for performing statistical tests:

Plotting functions:

The basic underlying model for the survival mechanism assumes that each patient can be in one of three states: Alive with no progression of disease, Alive with progression of disease, and Dead.

times <- c(0, 5 * 365)   # Time interval boundaries, in days

# Treatment group
t_resp <- 1              # There are no subgroups
B5 <- pop_pchaz(
  T = times,
  lambdaMat1    = m2r(matrix(18, nrow = 1)),
  lambdaMat2    = m2r(matrix(11, nrow = 1)),
  lambdaProgMat = m2r(matrix( 9, nrow = 1)),
  p = t_resp,
  timezero = FALSE, discrete_approximation = TRUE
)

# Control group
c_resp <- 1              # There are no subgroups
K5  <- pop_pchaz(
  T = times,
  lambdaMat1    = m2r(matrix(11, nrow = 1)),
  lambdaMat2    = m2r(matrix( 9, nrow = 1)),
  lambdaProgMat = m2r(matrix( 5, nrow = 1)),
  p = c_resp,
  timezero = TRUE, discrete_approximation = TRUE
)
plot_diagram(K5, B5, which = "Control")

Creating the population model with pop_pchaz

The first step is to create the population model with the pop_pchaz function. As the previous figure shows, there are three hazard rates that need to be defined: the hazard of disease progression $\lambda_P(t)$, the hazard of death given no progression $\lambda_P(t)$, and the hazard of death given progression $\lambda_P(t)$. The arguments lambdaProgMat, lambdaMat1, and lambdaMat2 in the pop_pchaz function correspond to the three hazard rates, respectively.

The hazard rates are assumed piecewise constant functions across $k$ time intervals $[t_{j-1}, t_j)$, $j=1, \ldots,k$ with $0=t_0pop_pchaz function has also an argument T that is a vector to specify $t_0,t_1,\ldots,t_k$. When T is of length 2 (and therefore only one time interval) lambdaProgMat, lambdaMat1, and lambdaMat2 are scalars. If T is of length > 2, then the lambdaProgMat, lambdaMat1, and lambdaMat2 are matrices where the number of columns is equal to the number of time intervals [\begin{bmatrix}\lambda^{[t_0-t_1)} & \lambda^{[t_1-t_2)} & \ldots &\lambda^{[t_{k-1}-t_k)}\end{bmatrix}.]

For example, if the patients are followed for two years but the hazards change after the first year, then T should be specified as c(0, 365, 2*365). If we assume a hazard rate for death of 0.02 and 0.04 for the first and second year respectively, the we should specify lambdaMat1 = matrix(c(0.02, 0.04), ncol = 2).

Finally, is is also possible to specify different hazard rates for subgroups. The pop_pchaz has the argument p which is intended to specify the subgroup prevalences. Given $m$ subgroups with relative sizes $p_1, p_2, \ldots, p_m$, then the p argument should be specified as c(p_1, p_2, ..., p_m). The lambdaProgMat, lambdaMat1, and lambdaMat2 then should have the number of row equal to the number of defined subgroups:

[ \begin{bmatrix} \lambda_1 \ \lambda_2 \ \ldots \ \lambda_m \end{bmatrix}. ]

For example, if patients can be divided into two subgroup with prevalences 0.2 and 0.8 with hazard rates a hazard rate for death of 0.02 and 0.03 thoughout a one year interval, then we define T = c(0, 365), p = c(0.2, 0.8) and lambdaMat1 = matrix(c(0.02, 0.03), nrow = 2).

Naturally, it is possible to combine multiple time intervals and subgroups, then the hazard matrices have the form:

knitr::include_graphics("lambdamat.png")

Below, we consider an example where there two subgroups and two time intervals. In practice, this situation correspond to the case where there is a delayed effect of the drug. Note that for specifying the hazard matrices, we use the median time to death/progression and use the function m2r (also provided in the package) to obtain the hazard rates.

times <- c(0, 100, 5 * 365)   # Time interval boundaries, in days
t_resp <- c(0.2, 0.8) #Proportion of subgroups
B5 <- pop_pchaz(
  T = times,
  lambdaMat1    = m2r(matrix(c(11, 30,
                               11, 18), byrow = TRUE, nrow = 2)),
  lambdaMat2    = m2r(matrix(c( 9, 20,
                                9, 11), byrow = TRUE, nrow = 2)),
  lambdaProgMat = m2r(matrix(c( 5, 15,
                                5,  9), byrow = TRUE, nrow = 2)),
  p = t_resp, discrete_approximation = TRUE
)

The results object is of class mixpch, which has a dedicated plotting function to visualize the survival and hazard functions.

plot(B5, main = "Survival function")
plot(B5, fun = "haz", main = "Hazard function")
plot(B5, fun = "cumhaz", main = "Cumulative Hazard function")

\newpage

Creating a simulated dataset with sample_fun

The sample_fun function is designed to generate a simulated dataset that would be obtained from a parallel group randomised clinical trial.

The first step is to create two objects with the (theoretical) survival functions for the treatment and control groups using pop_pchaz:

times <- c(0, 100, 5 * 365)   # Time interval boundaries, in days
# Treatment group
B5 <- pop_pchaz(T = times,
                   lambdaMat1    = m2r(matrix(c(11, 30,
                                                11, 18), byrow = TRUE, nrow = 2)),
                   lambdaMat2    = m2r(matrix(c( 9, 20,
                                                 9, 11), byrow = TRUE, nrow = 2)),
                   lambdaProgMat = m2r(matrix(c( 5, 15,
                                                 5,  9), byrow = TRUE, nrow = 2)),
                   p = c(0.2, 0.8),#Proportion of subgroups
                discrete_approximation = TRUE 
)
# Control group
K5  <- pop_pchaz(T = times,
                    lambdaMat1    = m2r(matrix(c(11, 11), nrow = 1)),
                    lambdaMat2    = m2r(matrix(c( 9,  9), nrow = 1)),
                    lambdaProgMat = m2r(matrix(c( 5,  5), nrow = 1)),
                    p = 1, discrete_approximation = TRUE
)

Then, using the resulting objects, we use them to generate a dataset with the sample_fun function:

# Study set up and Simulation of a data set until interim analysis at 150 events
set.seed(15657)
dat <- sample_fun(K5, B5,
                  r0 = 0.5,                     # Allocation ratio
                  eventEnd = 450,               # maximal number of events
                  lambdaRecr = 300 / 365,       # recruitment rate per day (Poisson assumption)
                  lambdaCens = 0.013 / 365,     # censoring rate per day  (Exponential assumption)
                  maxRecrCalendarTime = 3 * 365,# Maximal duration of recruitment
                  maxCalendar = 4 * 365.25)     # Maximal study duration
head(dat)
tail(dat)

\newpage

The weighted log-rank test and the max-LRtest

The weighted log-rank test is implemented using the function logrank.test, which uses the statistic:

[ z = \sum_{t\in\mathcal{D}} w(t)(d_{t, ctr} - e_{t,ctr}) / \sqrt{\sum_{t\in\mathcal{D}} w(t)^2 var(d_{t, ctr})}. ]

where $w(t)$ are the Fleming-Harrington $\rho-\gamma$ family weights, such that $w(t)=\widehat{S}(t)^{\rho}(1-\widehat{S}(t))^{\gamma}$. Under the the least favorable configuration in $H_0$, the test statistic is asymptotically standard normally distributed and large values of $z$ are in favor of the alternative.

For example, the following code performs the weighted log-rank test using the simulated dataset and $\rho = 1$ and $\gamma = 0$.

logrank.test(time  = dat$y,
             event = dat$event,
             group = dat$group,
             # alternative = "greater",
             rho   = 1,
             gamma = 0) 
# survival::survdiff(formula = survival::Surv(time  = dat$y, event = dat$event) ~ dat$group)

For a set of $k$ different weight functions $w_1(t), \ldots, w_k(t)$, the maximum log-rank test statistic is $z_{max} = \max_{i=1,\ldots,k}z_i$. Under the least favorable configuration in $H_0$, approximately $(Z_1, \ldots, Z_k) \sim N_k(0, \Sigma)$. The $p$-value of the maximum test, $P_{H_0}(Z_{max} > z_{max})$, is calculated based on this multivariate normal approximation via numeric integration.

The following code performs the maximum log-rank test using four combinations of $\rho$ and $\gamma$ for the weights.

lrmt = logrank.maxtest(
      time  = dat$y,
      event = dat$event,
      group = dat$group,
      rho   = c(0, 0, 1, 1),
      gamma = c(0, 1, 0, 1)
)
lrmt

The individual tests can also be accesed using the testListe elements in the resulting object.

lrmt$logrank.test[[1]]
lrmt$logrank.test[[2]]
lrmt$logrank.test[[3]]
lrmt$logrank.test[[4]]


Try the nph package in your browser

Any scripts or data that you put into this service are public.

nph documentation built on May 17, 2022, 1:06 a.m.