Influence function calculation for Brier score for event time data

\newcommand{\Xj}{\ensuremath{X^{\prime}}} \newcommand{\xj}{\ensuremath{x^{\prime}}} \newcommand{\AUC}{\ensuremath{\operatorname{AUC}}} \newcommand{\Brier}{\ensuremath{\operatorname{Brier}}} \newcommand{\survtau}{\ensuremath{\tilde S(\tau)}} \newcommand{\hatsurvtau}{\ensuremath{\hat{\tilde{S}}(\tau)}} \newcommand{\Htau}{\ensuremath{H(\tau)}} \newcommand{\hatHtau}{\ensuremath{\hat H^{(i)}(\tau)}}

\newcommand{\Utau}{\ensuremath{U(\tau)}} \newcommand{\hatUtau}{\ensuremath{\hat U^{(i)}(\tau)}} \newcommand{\Vtau}{\ensuremath{V(\tau)}} \newcommand{\hatVtau}{\ensuremath{\hat V^{(i)}(\tau)}} \newcommand{\Wtau}{\ensuremath{W(\tau)}} \newcommand{\hatWtau}{\ensuremath{\hat W^{(i)}(\tau)}} \newcommand{\margprob}{\ensuremath{F}} \newcommand{\hatmargprob}{\ensuremath{\hat{F}}} \newcommand{\Zi}{\ensuremath{Z_i}} \newcommand{\emp}{\ensuremath{I\negthickspace P_n}} \newcommand{\ifauc}{\ensuremath{\mathrm{IF}{\mathrm{AUC}}}} \newcommand{\hatifauc}{\ensuremath{\mathrm{\widehat{IF}}{\mathrm{AUC}}}} \newcommand{\ifnu}{\ensuremath{\mathrm{IF}{\nu}}} \newcommand{\ifnuc}{\ensuremath{\mathrm{IF}^1{\nu}}} \newcommand{\hatifnu}{\ensuremath{\mathrm{\widehat{IF}}{\nu}}} \newcommand{\hatifnuc}{\ensuremath{\mathrm{\widehat{IF}}^1{\nu}}} \newcommand{\ifmu}{\ensuremath{\mathrm{IF}{\mu}}} \newcommand{\hatifmu}{\ensuremath{\mathrm{\widehat{IF}}{\mu}}} \newcommand{\ifmuc}{\ensuremath{\mathrm{IF}^1_{\mu}}} \newcommand{\hatifmuc}{\ensuremath{\mathrm{\widehat{IF}}^1_{\mu}}}

\newcommand{\AUCO}{\hat{\theta}^{(1,1)}{\tau,m}} \newcommand{\auc}{\widehat{\text{auc}}} \newcommand{\aucO}{\theta{\tau,m}} \newcommand{\D}{\mathrm{d}} \newcommand{\Db}{D^{m,b}} \newcommand{\Dm}{D{m}} \newcommand{\E}{\mathbb{E}} \newcommand{\G}{\hat G_n} \newcommand{\g}{\hat{\mathbb{G}}} \newcommand{\II}{\mathcal{I}} \newcommand{\I}[1]{\II_{{#1}}} \newcommand{\Ibi}{\I{N_i^b=0}} \newcommand{\Ibj}{\I{N_j^b=0}} \newcommand{\Ibk}{\I{N_k^b=0}} \newcommand{\IC}{\text{IF}} \newcommand{\Isi}{\I{N_i=0}} \newcommand{\LL}{{\I{\T_{0}\leq \tau}-\R(X_{0})}^2} \newcommand{\Lbi}{{\I{\T_i\leq \tau}-\Rb(X_i)}^2} \newcommand{\Li}{{\I{\T_i\leq \tau}-\R(X_i)}^2} \newcommand{\Lj}{{\I{\T_j\leq \tau}-\R(X_j)}^2} \newcommand{\loo}{leave-one-out bootstrap } \newcommand{\lpo}{leave-pair-out bootstrap } \newcommand{\Lt}{\ensuremath{L_t}} \newcommand{\M}{\ensuremath{\omega_{\tau,m}}} \newcommand{\m}{\hat{\omega}^{(1)}{\tau,m}} \newcommand{\mbin}{\hat{\omega}^{(1)}{m}} \newcommand{\MU}{\ensuremath{\hat{\mu}^{(1)}{\tau,m}}} \newcommand{\p}{\ensuremath{\P_n}} \newcommand{\PSI}{\ensuremath{\hat{\psi}^{(1)}{m}}} \renewcommand{\P}{\ensuremath{\mathrm{P}}} \newcommand{\PEK}{\ensuremath{P_{\epsilon,k}}} \newcommand{\Qej}{\ensuremath{Q_{n,\epsilon,j}}} \newcommand{\RR}{\ensuremath{R_{\tau}}} \newcommand{\R}{\ensuremath{\RR(D_m)}} \newcommand{\Rb}{\ensuremath{\RR(D_{m,b}^)}} \newcommand{\Rbi}{\ensuremath{R(D_{m,b}^*)}} \newcommand{\T}{\ensuremath{\tilde{T}}} \newcommand{\V}{\text{Var}} \newcommand{\X}{\ensuremath{\tilde{X}}} \newcommand{\XX}{X} \newcommand{\W}{W_\tau} \newcommand{\Wi}{\W(X_i,\hat G_n)} \newcommand{\Wk}{\W(X_k,\hat G_n)} \newcommand{\WWi}{\mathcal{W}_{T_i}(X_i)} \newcommand{\WWj}{\mathcal{W}_t(X_j)}

knitr::opts_chunk$set(echo = TRUE)

Introduction

In this document, we consider the estimation of the Brier score - a discrimination measure. Let $R(D_m)(X)$ be a (risk) prediction for $X$ for a model $R$ trained on a data set $D_m$ of size $m$. Then, for binary data $Z=(Y,X)$, the Brier score is defined as: $$ \text{Brier}{R, D_m} = \mathbb{E}[(Y-R(D_m)(X))^2] = \int (y-R(D_m)(x))^2 P(dz) $$ Note that in the above definition that $D_m$ is fixed, so the Brier score above will depend on which data set that the model is trained on. To describe the situation with competing risks (and also survival) we introduce a random variable (D\in{1,2}) which indicates the cause (i.e., type of the event) observed at time (T) such that (D=1) means that the event of interest occurred, and (D=2) that a competing risk occurred. We let (Q) denote the joint probability measure of the uncensored data, ((T,D,X)\sim Q), and $P$ the joint probability measure of the right censored data $O=(\tilde{T},\Delta,X) \sim P$ now with $\Delta=D 1{{T \leq C}}$ taking values in the set ({0,1,2}). Also let $G$ denote the survival function for the censoring distribution. Now for event type data $Z=(T,D,X)$, we consider the above as a time-dependent discrimination measure with $Y=I(T \leq \tau)$ for some fixed $\tau$. Thus, $$ \begin{aligned} \text{Brier}{R, D_m, \tau} &= \mathbb{E}[(I(T \leq\tau, D=1)-R{\tau}(D_m)(X))^2] = \int (I(t \leq \tau, d= 1)-R_\tau(D_m)(x))^2 Q(dz) \ &= \int (I(t \leq \tau, \delta= 1)-R_\tau(D_m)(x))^2 W_\tau(z; G) P(dz) = \sum_{\delta = 0, 1, 2} \int\left{1_{{ t \leq \tau, \delta = 1}}R_\tau(D_m)(x)\right}^{2} W_\tau(z; G) P(d t, \delta, d x) \end{aligned} $$ Here we used IPCW with $W_\tau(z; G) = \frac{I(t \leq \tau, \delta = 1)}{G(t-|x)} + \frac{I(t \leq \tau, \delta=2)}{G(t-|x)} + \frac{I(t > \tau)}{G(\tau |x)}$.

In the situation with cross-validation, it will be of interest to estimate $\mathbb{E}{D_m}[\text{Brier}{R, D_m}]$ or $\mathbb{E}{D_m}[\text{Brier}{R, D_m, \tau}]$, i.e. the expected performance of the model over all training data sets of size $m$.

In the below sections, we will suggest some estimators of the Brier score and their asymptotic variances (by using influence functions). Also, we will calculate the efficient influence function for the Brier score.

Estimators of the Brier score

When we are dealing with the Brier score for binary data, we can consider the Brier score as a functional of the probability measure $P$. Plugging in the empirical measure $\mathbb{P}n$ instead of $P$, we get $$ \widehat{\text{Brier}}{R, D_m} = \int (y-R(D_m)(x))^2 \mathbb{P}n(dz) = \frac{1}{n} \sum{i=1}^n (Y_i-R(D_m)(X_i))^2 $$ By the functional delta method, one can derive the influence function of the above estimator as, $$ \text{IF}{\widehat{\text{Brier}}{R, D_m}}(Y_i,X_i) = (Y_i-R (D_m)(X_i))^2 - \text{Brier}{R, D_m} $$ This is also the efficient influence function. One can estimate this consistently by $$ \hat{\text{IF}}{\widehat{\text{Brier}}{R, D_m}}(Y_i,X_i) = (Y_i-R(D_m)(X_i))^2 - \widehat{\text{Brier}}{R, D_m} $$ The situation with type data is more complicated as one needs to estimate the censoring weights in $W_\tau(z;G)$. However, if we assume that the censoring does not depend on the covariates, then it is natural to estimate $G$ with the Kaplan-Meier estimator $\hat{G}{\text{KM}}$. Then by the plug-in principle, we consider the estimator, $$ \begin{aligned} \widetilde{\text{Brier}}{R, D_m, \tau} = \int (I(t \leq \tau, d= 1)-R(D_m)(x))^2 W_\tau(z; \hat{G}{\text{KM}}) \mathbb{P}_n(dz) = \frac{1}{n} \sum{i=1}^n (I(\tilde{T}i \leq \tau, \Delta_i = 1)-R{\tau}(D_m)(X_i))^2 W_\tau(Z_i; \hat{G}{\text{KM}}) \end{aligned} $$ The influence function of this may again be found by functional delta method as $$ \begin{aligned} \text{IF}{\widetilde{\text{Brier}}{R, D_m, \tau}}(\tilde{T}_i, \Delta_i) &= (I(\tilde{T}_i \leq \tau, \Delta_i = 1)-R{\tau}(D_m)(X_i))^2 W_\tau(Z_i; G)-\text{Brier}{R, D_m, \tau} \ &+ \int (I(t \leq \tau, \delta= 1)-R(D_m)(x))^2 \text{IF}{W_{\tau}, \hat{G}{\text{KM}}} (z; G)(\tilde{T}_i, \Delta_i) P(dz) \end{aligned} $$ with $$ \begin{aligned} \text{IF}{W_{\tau}, \hat{G}{\text{KM}}}(z;G)(\tilde{T}_i, \Delta_i) &= \frac{I(t \leq \tau, \delta = 1)}{G(t-)} \text{IF}{\hat{\Lambda}C}(t-)(\tilde{T}_i, \Delta_i) + \frac{I(t \leq \tau, \delta=2)}{G(t-)}\text{IF}{\hat{\Lambda}C}(t-)(\tilde{T}_i, \Delta_i) \ &+ \frac{I(t > \tau)}{G(\tau)}\text{IF}{\hat{\Lambda}C}(\tau)(\tilde{T}_i, \Delta_i) \end{aligned} $$ where $\text{IF}{\hat{\Lambda}C}(t)$ is the influence function of the cumulative hazard of the censoring at time $t$ of the Kaplan-Meier estimator of the censoring. We can estimate the influence function of the estimator, by plugging in $\hat{G}\text{KM}$ for $G$, $\hat{\text{IF}}{\hat{\Lambda}_C}(t)(\tilde{T}_i, \Delta_i)$ for $\text{IF}{\hat{\Lambda}_C}(t)(\tilde{T}_i, \Delta_i)$ and $\mathbb{P}_n$ for $P$ and of course, the Brier score with this estimator.

In a similar manner, we can also derive the influence function, when we instead try to estimate $G$ by a Cox model. Then the estimator is $$ \begin{aligned} \widecheck{\text{Brier}}{R, D_m, \tau} = \int (I(t \leq \tau, d= 1)-R(D_m)(x))^2 W\tau(z; \hat{G}{\text{Cox}}) \mathbb{P}_n(dz) = \frac{1}{n} \sum{i=1}^n (I(\tilde{T}i \leq \tau, \Delta_i = 1)-R{\tau}(D_m)(X_i))^2 W_\tau(Z_i; \hat{G}{\text{Cox}}) \end{aligned} $$ The influence function of this may again be found by functional delta method as $$ \begin{aligned} \text{IF}{\widecheck{\text{Brier}}{R, D_m, \tau}}(\tilde{T}_i, \Delta_i, X_i) &= (I(\tilde{T}_i \leq \tau, \Delta_i = 1)-R{\tau}(D_m)(X_i))^2 W_\tau(Z_i; G)-\text{Brier}{R, D_m, \tau} \ &+ \int (I(t \leq \tau, \delta= 1)-R(D_m)(x))^2 \text{IF}{W_{\tau}, \hat{G}{\text{Cox}}} (z; G)(\tilde{T}_i, \Delta_i, X_i) P(dz) \end{aligned} $$ with $$ \begin{aligned} \text{IF}{W_{\tau}, \hat{G}{\text{Cox}}}(z;G)(\tilde{T}_i, \Delta_i,X_i) &= \frac{I(t \leq \tau, \delta = 1)}{G(t- | x)} \text{IF}{\hat{\Lambda}C}(t-, x)(\tilde{T}_i, \Delta_i, X_i) + \frac{I(t \leq \tau, \delta=2)}{G(t- | x)}\text{IF}{\hat{\Lambda}C}(t- ,x)(\tilde{T}_i, \Delta_i, X_i) \ &+ \frac{I(t > \tau)}{G(\tau | x)}\text{IF}{\hat{\Lambda}C}(\tau, x)(\tilde{T}_i, \Delta_i, X_i) \end{aligned} $$ where $\text{IF}{\hat{\Lambda}C}(t,x)$ is the influence function of the cumulative hazard estimated by a Cox regression at time $t$ given covariate $x$. We can estimate the influence function of the estimator, by plugging in $\hat{G}\text{Cox}$ for $G$, $\hat{\text{IF}}{\hat{\Lambda}_C}(t,x)(\tilde{T}_i, \Delta_i)$ for $\text{IF}{\hat{\Lambda}C}(t,x)(\tilde{T}_i, \Delta_i)$ and $\mathbb{P}_n$ for $P$. This yields that $$ \begin{aligned} \hat{\text{IF}}{\widecheck{\text{Brier}}{R, D_m, \tau}}(\tilde{T}_i, \Delta_i, X_i) &= (I(\tilde{T}_i \leq \tau, \Delta_i = 1)-R{\tau}(D_m)(X_i))^2 W_\tau(Z_i; \hat{G}{\text{Cox}})-\widecheck{\text{Brier}}{R, D_m, \tau} \ &+ \int (I(t \leq \tau, \delta= 1)-R(D_m)(x))^2 \hat{\text{IF}}{W{\tau}, \hat{G}{\text{Cox}}} (z; \hat{G}{\text{Cox}})(\tilde{T}i, \Delta_i, X_i) \mathbb{P}_n(dz) \ &= (I(\tilde{T}_i \leq \tau, \Delta_i = 1)-R{\tau}(D_m)(X_i))^2 W_\tau(Z_i; \hat{G}{\text{Cox}})-\widecheck{\text{Brier}}{R, D_m, \tau} \ &+ \frac{1}{n} \sum_{j=1}^n (I(\tilde{T}j \leq \tau, \Delta_j= 1)-R(D_m)(X_j))^2 \hat{\text{IF}}{W_{\tau}, \hat{G}{\text{Cox}}} (Z_j; \hat{G}{\text{Cox}})(\tilde{T}i, \Delta_i, X_i) \ \end{aligned} $$ with $$ \begin{aligned} \hat{\text{IF}}{W_{\tau}, \hat{G}{\text{Cox}}} (z; \hat{G}{\text{Cox}})(\tilde{T}i, \Delta_i, X_i) &= \frac{I(t \leq \tau, \delta = 1)}{\hat{G}{\text{Cox}}(t- | x)} \hat{\text{IF}}{\hat{\Lambda}_C}(t-, x)(\tilde{T}_i, \Delta_i, X_i) + \frac{I(t \leq \tau, \delta=2)}{\hat{G}{\text{Cox}}(t- | x)}\hat{\text{IF}}{\hat{\Lambda}_C}(t- ,x)(\tilde{T}_i, \Delta_i, X_i) \ &+ \frac{I(t > \tau)}{\hat{G}{\text{Cox}}(\tau | x)}\hat{\text{IF}}{\hat{\Lambda}_C}(\tau, x)(\tilde{T}_i, \Delta_i, X_i) \end{aligned} $$ Here is $\hat{\text{IF}}{\hat{\Lambda}_C}(t ,x)$ is the estimated influence function of the cumulative hazard of the censoring when estimated by a Cox regression at time $t$ given covariate $x$.

Calculation of the efficient influence function for event type data

Taking the Gateaux derivative of the Brier score written as a functional of $P$ and $G$ yields, $$ \begin{aligned} \text{IF}{\text{Brier}{R, D_m, \tau}}(\tilde{T}i, \Delta_i,X_i) &=(I(\tilde{T}_i \leq \tau, D_i = 1)-R\tau(D_m)(X_i))^2 W_\tau(Z_i; G)-\text{Brier}{R, D_m, \tau} \ &+ \int (I(t \leq \tau, \delta= 1)-R(D_m)(x))^2 \text{IF}{W_{\tau}, G} (z; G)(\tilde{T}i, \Delta_i, X_i) P(dz) \end{aligned} $$ with $$ \begin{aligned} \text{IF}{W_{\tau}, \hat{G}{\text{Cox}}}(z;G)(\tilde{T}_i, \Delta_i,X_i) &= \frac{I(t \leq \tau, \delta = 1)}{G(t- | x)} \text{IF}{\Lambda_C}(t-, x)(\tilde{T}i, \Delta_i, X_i) + \frac{I(t \leq \tau, \delta=2)}{G(t- | x)}\text{IF}{\Lambda_C}(t- ,x)(\tilde{T}i, \Delta_i, X_i) \ &+ \frac{I(t > \tau)}{G(\tau | x)}\text{IF}{\Lambda_C}(\tau, x)(\tilde{T}i, \Delta_i, X_i) \end{aligned} $$ Here we have that $$ \text{IF}{\Lambda_C}(t ,x)(\tilde{T}i, \Delta_i, X_i) = \frac{\mathbbm{1}{{\tilde{T_i} \leq t, \Delta_i = 0}} \delta_{X_i}(x)}{G(\tilde{T_i}|X_i)S(\tilde{T_i}|X_i)} -\int_0^{\tilde{T}i \wedge t} \frac{ \delta{Z_i}(z)dP(s,0 | X_i)}{G(s|X_i)^2S(s|X_i)^2} $$ which is the efficient influence function of the cumulative hazard of the censoring. This is in general not a proper function of $x$. However, when both the outcome and the censoring does not depend on the covariates, one can see that the efficient influence function of the Brier score coincides with the estimator of the Brier when we use Kaplan-Meier censoring. In general though, our estimators are not efficient. Using the properties of the dirac measure yield that the last term in the EIF is: \begin{align} &\int I(t > \tau)R_\tau(D_m)(x) ^2 \frac{\text{IF}{\Lambda_C}(\tau ,x)(\tilde{T}_i, \Delta_i, X_i)}{G(\tau|x)}P(dt,dx) + \int I(t \leq \tau)(1-R\tau(D_m)(x))^2 \frac{\text{IF}{\Lambda_C}(t- ,x)(\tilde{T}_i, \Delta_i, X_i)}{G(t-|x)}P(dt,1,dx) \ &+ \int I(t \leq \tau)R\tau(D_m)(x) ^2 \frac{\text{IF}{\Lambda_C}(t- ,x)(\tilde{T}_i, \Delta_i, X_i)}{G(t-|x)}P(dt,2,dx) \ &= R\tau(D_m)(X_i)^2 S(\tau | X_i)\left(\frac{I(\tilde{T}i \leq \tau, \Delta_i = 0)}{G(\tilde{T}_i|X_i)S(\tilde{T}_i|X_i)}-\int_0^{\tilde{T}_i \wedge \tau} \frac{1}{G(s|X_i)^2S(s|X_i)^2}P(ds,0|X_i)\right) \ &+ (1-R\tau(D_m)(X_i))^2 \left(\frac{I(\tilde{T}i \leq \tau, \Delta_i = 0)}{G(\tilde{T}_i|X_i)S(\tilde{T}_i|X_i)}(F_1(\tau|X_i)-F_1(\tilde{T}_i|X_i))-\int_0^{\tilde{T}_i \wedge \tau} \frac{(F_1(\tau|X_i)-F_1(s|X_i))}{G(s|X_i)^2S(s|X_i)^2}P(ds,0|X_i)\right) \ &+ R\tau(D_m)(X_i)^2 \left(\frac{I(\tilde{T}i \leq \tau, \Delta_i = 0)}{G(\tilde{T}_i|X_i)S(\tilde{T}_i|X_i)}(F_2(\tau|X_i)-F_2(\tilde{T}_i|X_i))-\int_0^{\tilde{T}_i \wedge \tau} \frac{(F_2(\tau|X_i)-F_2(s|X_i))}{G(s|X_i)^2S(s|X_i)^2}P(ds,0|X_i)\right) \ &= R\tau(D_m)(X_i)^2 S(\tau | X_i)\left(\frac{I(\tilde{T}i \leq \tau, \Delta_i = 0)}{G(\tilde{T}_i|X_i)S(\tilde{T}_i|X_i)}-\int_0^{\tilde{T}_i \wedge \tau} \frac{1}{G(s|X_i)S(s|X_i)}\Lambda_C(ds|X_i)\right) \ &+ (1-R\tau(D_m)(X_i))^2 \left(\frac{I(\tilde{T}i \leq \tau, \Delta_i = 0)}{G(\tilde{T}_i|X_i)S(\tilde{T}_i|X_i)}(F_1(\tau|X_i)-F_1(\tilde{T}_i|X_i))-\int_0^{\tilde{T}_i \wedge \tau} \frac{(F_1(\tau|X_i)-F_1(s|X_i))}{G(s|X_i)S(s|X_i)}\Lambda_C(ds|X_i)\right) \ &+ R\tau(D_m)(X_i)^2 \left(\frac{I(\tilde{T}_i \leq \tau, \Delta_i = 0)}{G(\tilde{T}_i|X_i)S(\tilde{T}_i|X_i)}(F_2(\tau|X_i)-F_2(\tilde{T}_i|X_i))-\int_0^{\tilde{T}_i \wedge \tau} \frac{(F_2(\tau|X_i)-F_2(s|X_i))}{G(s|X_i)S(s|X_i)}\Lambda_C(ds|X_i)\right) \end{align} where $\Lambda_C$ is the cumulative hazard of the censoring and $F_1$ and $F_2$ are the subdistribution functions (or risk functions) of event 1 and 2.

IF with cross-validation

Binary case

Our cross-validation algorithm repeatedly splits the dataset (D_n) of size $n$ into training and validation datasets as follows. Let (B) be a large integer. For each (b=1,\ldots,B) we draw a bootstrap dataset (\Db={O_{b,1}^,\ldots,O_{b,m}^}) of size (m\le n) with or without replacement from the data (D_n). For (i=1,\ldots,n) let (N_i^b) be the number of times subject (i) is included in (\Db). For subsampling bootstrap (without replacement), (N_i^b) is either 0 or 1. In step (b) of the cross-validation algorithm the bootstrap dataset (\Db) is used for training. We apply (R ) to (\Db) to obtain the prediction model (\Rbi). All subjects (i) for which (N_i^b=0) are out-of-bag and we let these subjects form the validation dataset of step (b).

We now calculate the influence function in the case with binary outcome data ((Y\in {0,1}). In this case, we consider the following functional which describes the expected Brier score of the model (R) on average across all possible training datasets (D_m) of size (m). The expectation is taken with respect to the data of subject (0): \begin{align} \psi_{m}(P)&=\int\left(\int\cdots\int\left{y_{0}-R(\Dm)(x_{0})\right}^2 \prod_{i=1}^m P(d o_i)\right) P(d o_{0}) \end{align}

for some sample size $m < n$. Let \begin{equation} \omega_{m} (Y_{0},X_{0})=\mathbb{E}{D_m} [(Y{0}-R(D_m)(X_{0}))^2 | Y_{0}, X_{0}] \end{equation} because then $\psi_{m}(P) = \mathbb{E}{O{0}}[\omega_{m} (Y_{0},X_{0})]$. For estimating $\omega_{\tau,m}$, we propose to use leave one-out bootstrap estimation: \begin{equation} \hat{\omega}{m} (Y_i,X_i)=\frac{\sum{b=1}^B (Y_{i}-R(D_m)(X_{i}))^2I(N_i^b=0) }{\sum_{b=1}^B I(N_i^b=0)} \end{equation} Finally, the Brier score may then be estimated by leave one-out bootstrap estimation as ยด \begin{equation} \PSI=\frac{1}{n}\sum_{i=1}^n\hat{\omega}_{m} (Y_i,X_i) \end{equation} Finally, we also want standard errors of the estimates. For this, we consider the influence function by taking the Gateaux derivative and get: \begin{align} \IC_{\psi}(m;o_k)&=\frac{\partial}{\partial\epsilon}\psi_{m}(P_{\epsilon,k})\Big\vert_{\epsilon=0} \ &= \int \frac{\partial}{\partial\epsilon} \left(\int\cdots\int\left{y_{0}-R(\Dm)(x_{0})\right}^2 \prod_{i=1}^m \PEK (d o_i)\right) \PEK(d o_{0}) \Big\vert_{\epsilon=0} \ &= \int \frac{\partial}{\partial\epsilon} \left(\int\cdots\int\left{y_{0}-R(\Dm)(x_{0})\right}^2 \prod_{i=1}^m \PEK (d o_i)\right) \Big\vert_{\epsilon=0} P(d o_{0}) \ &+ \int \left(\int\cdots\int\left{y_{0}-R(\Dm)(x_{0})\right}^2 \prod_{i=1}^m P (d o_i)\right) \frac{\partial}{\partial\epsilon} \PEK(d o_{0}) \Big\vert_{\epsilon=0} \ &= \int \sum_{j=1}^m \left(\int\cdots\int\left{y_{0}-R(\Dm)(x_{0})\right}^2 \delta_{o_k}(o_j) \prod_{i \neq j} P(d o_i)\right) P(d o_{0}) - m \psi_{m}(P) \ %notation??? &+ \omega_{m} (Y_{k},X_{k})-\psi_{m}(P) \ &= \omega_{m} (Y_{k},X_{k})-(m+1) \psi_{m}(P) \ &+\int \sum_{j=1}^m \left(\int\cdots\int\left{y_{0}-R(\Dm)(x_{0})\right}^2 \delta_{o_k}(o_j) \prod_{i \neq j} P(d o_i)\right) P(d o_{0}) \end{align} wherein we used the product rule of differentiation. Note we use the approximation that $$\int \sum_{j=1}^m \left(\int\cdots\int\left{y_{0}-R(\Dm)(x_{0})\right}^2 \delta_{o_k}(o_j) \prod_{i \neq j} (d o_i)\right) P(d o_{0}) \approx m\psi_{m}(P)$$ Then \begin{equation} \IC_{\psi}(m;o_k) = \omega_{m} (Y_{k},X_{k}) - \psi_{m}(P) \end{equation} For estimating the influence function, we suggest the estimator: \begin{align} \widehat{\IC}_{\psi}(m;O_k)&=\mbin (Y_k,X_k)-\PSI \end{align}

Survival and competing risk case

Let us now try to expand this to the case with (right-censored) survival data, i.e. $O=(T, \Delta, X)$ and let $\bar{O} = (\tilde{T},X)$ denote the true event time, i.e. $T=\min{\tilde{T}, C}$ and $\Delta = I(T\leq C)$, where $C$ is the censoring time. Also let $Y_i = I(\tilde{T} \leq \tau)$ (or $T$, depending on whichever is the most appropriate), where $\tau$ be some prespecified time point and (\G) be an estimate of the censoring distribution (G) based on (D_n). Then we are concerned with the functional $\mu_{\tau, m}$ \begin{equation} \mu_{\tau,m}=\E_{O_{0}}\big[\E_{D_{m}}\big[\LL\big|\T_{0},X_{0}\big]\big] \end{equation} By rewriting the above above a bit (i.e. by using standard tricks when rewriting in terms of the observed data), it can be shown that this quantity can be defined in terms of the observed data, i.e. it can be expressed as the value of a statistical functional (\psi_{\tau,m}:\mathcal{P}\to [0,1]) \begin{align} \psi_{\tau,m}(\P)&=\int\left(\int\cdots\int\left{\I{u_{0}\leq\tau}-\RR(\Dm)(x_{0})\right}^2 \prod_{i=1}^m\P(\D o_i)\right)\ &\qquad\qquad\times\W(o_{0},\kappa_{\tau,x_{0}}(\P))\P(\D o_{0})\&=\mu_{\tau,m}. \end{align} with \begin{align} \Wi=\frac{\I{T_i\leq \tau}\Delta_i}{\G(T_i\vert X_i)}+\frac{\I{T_i>\tau}}{\G(\tau\vert X_i)} \end{align} then this can be estimated in much the same way as before, i.e. as \begin{equation} \MU=\frac{1}{n}\sum_{i=1}^n\m(T_i,X_i)\Wi \end{equation} Here, we redefine that \begin{equation} \m(\T_i,X_i)=\frac{\sum_{b=1}^B\Lbi\Ibi}{\sum_{b=1}^B\I{N_i^b=0}} \end{equation} In much the same way as before, we may find the the influence function of $\psi_{\tau,m}(\P)$ to be almost the same as before with an additional term corresponding to the fact that the censoring distribution has to be estimated, i.e. \begin{align} \IC_{\psi}(\tau,m;o_k)&=\M(t_k,x_k)\W(o_k,\kappa_{\tau,x_{0}}(\P))-(m+1)\, \mu_{\tau,m}\ &+\int\Big[\sum_{j=1}^m \int\cdots\int{\I{t_{0}\leq\tau}-\RR({o_i}{i=1}^n)(x{0})}^2 \delta_{o_k}(o_j)\prod_{i\neq j}\P(\D o_i)\Big]\W(o_{0},\kappa_{\tau,x_{0}}(\P))\P(\D o_{0}) \ &+\int\M(t_{0},x_{0})\left[\frac{\I{t_{0}\leq \tau}\delta_{0}}{G(t_{0}-\vert x_{0})} f_k(t_{0},x_{0}) +\frac{\I{t_{0}>\tau}}{G(\tau\vert x_{0})}f_k(\tau,x_{0}) \right]\P(\D o_{0}) \end{align} We can use the approximation that $\int\Big[\sum_{j=1}^m \int\cdots\int{\I{t_{0}\leq\tau}-\RR({o_i}{i=1}^n)(x{0})}^2 \delta_{o_k}(o_j)\prod_{i\neq j}\P(\D o_i)\Big]\W(o_{0},\kappa_{\tau,x_{0}}(\P))\P(\D o_{0}) \approx m\mu_{\tau,m}$. Then approximately, \begin{align} \IC_{\psi}(\tau,m;o_k)&\approx \M(t_k,x_k)\W(o_k,\kappa_{\tau,x_{0}}(\P))-\, \mu_{\tau,m}\ &+\int\M(t_{0},x_{0})\left[\frac{\I{t_{0}\leq \tau}\delta_{0}}{G(t_{0}-\vert x_{0})} f_k(t_{0},x_{0}) +\frac{\I{t_{0}>\tau}}{G(\tau\vert x_{0})}f_k(\tau,x_{0}) \right]\P(\D o_{0}) \end{align} This approximation can be justified in the sense that influence functions should have mean zero. This can then be estimated in much the same way as before \begin{align} \widehat{\IC}{\psi}(\tau,m;O_k)&=\m(T_k,X_k)\W(O_k,\G)-\, \MU \ &+\frac{1}{n}\sum{i=1}^n\m(T_i,X_i)\left[\frac{\I{T_i\leq\tau}\Delta_i}{\G(T_i-\vert X_i)}\hat{f_k} (T_i-,X_i) +\frac{\I{T_i>\tau}}{\G (\tau\vert X_i)}\hat{f_k} (\tau ,X_i)\right] \end{align} Here the last term corresponds to the censoring being unknown. Note that this corresponds exactly to estimating as the train-validation case situation with the residuals now being the cross-validated residuals $\m(T_k,X_k)\W(O_k,\G)$. This is for the survival case; for the competing risk case, we would use \begin{align} \widehat{\IC}{\psi}(\tau,m;O_k)&=\m(T_k,X_k)\W(O_k,\G)-\, \MU \ &+\frac{1}{n}\sum{i=1}^n\m(T_i,X_i)\left[\frac{\I{T_i\leq\tau, \Delta_i = 1}}{\G(T_i-\vert X_i)}\hat{f_k} (T_i-,X_i) + \frac{\I{T_i\leq\tau, \Delta_i = 2}}{\G(T_i-\vert X_i)}\hat{f_k} (T_i-,X_i) +\frac{\I{T_i>\tau}}{\G (\tau\vert X_i)}\hat{f_k} (\tau ,X_i)\right] \end{align} as an estimator.



Try the riskRegression package in your browser

Any scripts or data that you put into this service are public.

riskRegression documentation built on Sept. 8, 2023, 6:12 p.m.