Determine the optimal cut point for a continuous variable in a coxph or survfit model.
x 
A 
defCont 
definition of a continuous variable.

... 
Additional arguments (not implemented). 
For a cut point mu of a predictor K, the observations are split
into two groups: those with K >= mu and those with K < mu.
The score (or logrank) statistic, sc, is calculated for each unique
element k in K. At each event time i it uses
e1[i], the number of events, and
n1[i], the number at risk,
both counted among those above the cut point;
e[i] and n[i] are the corresponding counts in both groups combined.
The basic statistic is
sc[k] = sum(e1[i] - n1[i] * e[i] / n[i])
The sum is taken across times with observed events, to D,
the largest of these.
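As a minimal sketch of the score statistic described above, the following base R code computes sc[k] for every candidate cut point in a small made-up data set. The data and the helper name score_stat are hypothetical and are not part of cutp; e[i] and n[i] are assumed to be the events and the number at risk in both groups combined.

```r
# Hypothetical toy data: follow-up time, event indicator (1 = event), predictor K
time   <- c(2, 3, 5, 7, 8, 11)
status <- c(1, 1, 0, 1, 1, 0)
K      <- c(40, 25, 31, 52, 28, 45)

# Score (logrank) statistic for the split K >= mu versus K < mu
score_stat <- function(mu, time, status, K) {
  event_times <- sort(unique(time[status == 1]))  # times with observed events
  above <- K >= mu
  terms <- vapply(event_times, function(ti) {
    at_risk <- time >= ti
    n  <- sum(at_risk)                            # n[i]: at risk, both groups
    e  <- sum(time == ti & status == 1)           # e[i]: events, both groups
    n1 <- sum(at_risk & above)                    # n1[i]: at risk, above the cut
    e1 <- sum(time == ti & status == 1 & above)   # e1[i]: events, above the cut
    e1 - n1 * e / n
  }, numeric(1))
  sum(terms)
}

# One statistic per unique value of K, in increasing order of K
sc <- vapply(sort(unique(K)), score_stat, numeric(1),
             time = time, status = status, K = K)
```

At the smallest value of K every observation falls in the upper group, so e1[i] = e[i] and n1[i] = n[i] and the statistic is zero; this is a quick sanity check on any implementation.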
It is normalized (standardized), in the case of censoring,
by finding s^2 which is:
s^2 = (1 / (D - 1)) * sum[i:D]((1 - sum[j:i](1 / (D - j + 1)))^2)
The test statistic is then
Q = max(abs(sc[k])) / (s * sqrt(D - 1))
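The normalization can be sketched in a few lines of base R. This is an illustration only, not the code used by cutp: the function names s2_fun and Q_fun are hypothetical, D is assumed to be the number of event times, and the square in s^2 is read as applying to the whole bracket (1 minus the inner sum), as in Contal and O'Quigley.

```r
# s^2 for D event times
s2_fun <- function(D) {
  # inner[i] = sum over j = 1..i of 1 / (D - j + 1)
  inner <- cumsum(1 / (D - seq_len(D) + 1))
  sum((1 - inner)^2) / (D - 1)
}

# Q from a vector of score statistics sc (one per candidate cut point)
Q_fun <- function(sc, D) {
  max(abs(sc)) / (sqrt(s2_fun(D)) * sqrt(D - 1))
}

s2_fun(2)   # 0.5
```

With D = 2 the inner sums are 1/2 and 3/2, both squared deviations from 1 equal 1/4, and s^2 is 0.5.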
Under the null hypothesis that the chosen cut point does not predict survival, Q has a limiting distribution which is that of the supremum of the absolute value of a Brownian bridge:
p = P(Q >= q) = 2 * sum[i:Inf]((-1)^(i + 1) * e^(-2 * i^2 * q^2))
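The infinite series converges quickly for moderate q, so a truncated sum is usually enough. The sketch below is an assumption-laden illustration rather than the package's own code: the name p_from_Q and the default of 100 terms are hypothetical choices.

```r
# Approximate p = P(Q >= q) by truncating the alternating series
p_from_Q <- function(q, n_terms = 100L) {
  i <- seq_len(n_terms)
  p <- 2 * sum((-1)^(i + 1) * exp(-2 * i^2 * q^2))
  # Clamp to [0, 1]; for q very close to 0 the truncated series is unreliable
  min(max(p, 0), 1)
}

p_from_Q(1.36)  # close to the familiar 5% point of this distribution
```

Larger values of Q give smaller p-values, which is why the tables returned by cutp can be ranked by either quantity.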
A list of data.tables.
There is one list element per continuous variable.
Each has a column with possible values of the cut point
(i.e. unique values of the variable), and the
additional columns:
U 
The score (logrank) test for a model with the variable 'cut' into those >= the cutpoint and those below. 
Q 
The test statistic. 
p 
The p-value. 
The tables are ordered by p-value, lowest first.
Contal C, O'Quigley J, 1999. An application of changepoint methods in studying the effect of age on survival in breast cancer. Computational Statistics & Data Analysis 30(3):253–70.
Mandrekar JN, Mandrekar SJ, Cha SS, 2003. Cutpoint Determination Methods in Survival Analysis using SAS. Proceedings of the 28th SAS Users Group International Conference (SUGI). Paper 261-28.
## Mandrekar et al. above
library("survival") # coxph, Surv
library("survMisc") # cutp
data("bmt", package="KMsurv")
b1 <- bmt[bmt$group==1, ] # ALL patients
c1 <- coxph(Surv(t2, d3) ~ z1, data=b1) # z1=age
c1 <- cutp(c1)$z1
data.table::setorder(c1, "z1")
## [] below is used to print data.table to console
c1[]
## Not run:
## compare to output from survival::coxph
matrix(
    unlist(
        lapply(26:30,
               function(i) c(i, summary(coxph(Surv(t2, d3) ~ z1 >= i, data=b1))$sctest))),
    ncol=5,
    dimnames=list(c("age", "score_test", "df", "p")))
cutp(coxph(Surv(t2, d3) ~ z1, data=bmt[bmt$group==2, ]))$z1[]
cutp(coxph(Surv(t2, d3) ~ z1, data=bmt[bmt$group==3, ]))[[1]][]
## K&M. Example 8.3, pg 273-274.
data("kidtran", package="KMsurv")
k1 <- kidtran
## patients who are male and black
k2 <- k1[k1$gender==1 & k1$race==2, ]
c2 <- coxph(Surv(time, delta) ~ age, data=k2)
print(cutp(c2))
## check significance of computed value
summary(coxph(Surv(time, delta) ~ age >= 58, data=k2))
k3 <- k1[k1$gender==2 & k1$race==2, ]
c3 <- coxph(Surv(time, delta) ~ age, data=k3)
print(cutp(c3))
## doesn't apply to binary variables e.g. gender
print(cutp(coxph(Surv(time, delta) ~ age + gender, data=k1)))
## End(Not run)
