mlr_measures_surv.dcalib | R Documentation |
This calibration method is defined by calculating the following statistic:
s = B/n \sum_i (P_i - n/B)^2
where B is the number of 'buckets' (that equally divide [0,1] into intervals), n is the number of predictions, and P_i is the observed number of observations in the i-th interval (the formula compares it with the expected count n/B). An observation is assigned to the i-th bucket if its predicted survival probability at the time of event falls within the corresponding interval.
This statistic assumes that censoring time is independent of death time.
A model is well D-calibrated if s \sim Unif(B), tested with chisq.test (p > 0.05 if well-calibrated, i.e. higher p-values are preferred).
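The statistic and its test can be illustrated in base R. This is only a minimal sketch under simplifying assumptions, not the measure's implementation: the vector surv_at_event is simulated (hypothetical) and every observation is treated as an uncensored event, so each one contributes a count of one to exactly one bucket.

```r
# Hypothetical input: predicted survival probabilities evaluated at each
# observation's event time; all observations are assumed uncensored.
set.seed(1)
n <- 200
B <- 10
surv_at_event <- runif(n)

# Assign each observation to one of B equal-width buckets on [0, 1].
bucket <- cut(surv_at_event, breaks = seq(0, 1, length.out = B + 1),
              include.lowest = TRUE)
P <- as.numeric(table(bucket))  # observed count per bucket

# Statistic from the formula: s = (B / n) * sum_i (P_i - n / B)^2
s <- B / n * sum((P - n / B)^2)

# The same value is the statistic of a chi-squared goodness-of-fit test
# against equal bucket probabilities.
test <- chisq.test(P)
all.equal(s, unname(test$statistic))
test$p.value
```

Here chisq.test(P) tests the bucket counts against equal expected counts n/B, which is why its statistic coincides with s.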
Model i is better calibrated than model j if s(i) < s(j), meaning that lower values of this measure are preferred.
This measure can return either the test statistic or the p-value from the chisq.test. The former is useful for model comparison, whereas the latter is useful for determining whether a model is well-calibrated. If chisq = FALSE and s is the returned statistic, then you can manually compute the p-value with pchisq(s, B - 1, lower.tail = FALSE).
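As a small worked example of the manual conversion, the snippet below turns a statistic into a p-value. The value s = 10 is a hypothetical score chosen for illustration, not output from a fitted model.

```r
B <- 10   # number of buckets (the default)
s <- 10   # hypothetical statistic returned with chisq = FALSE

# Upper-tail probability of a chi-squared distribution with B - 1 degrees
# of freedom.
p_value <- pchisq(s, df = B - 1, lower.tail = FALSE)
p_value          # approximately 0.35
p_value > 0.05   # TRUE means D-calibration is not rejected at the 5% level
```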
NOTE: This measure is still experimental both theoretically and in implementation. Results should therefore only be taken as an indicator of performance and not for conclusive judgements about model calibration.
This Measure can be instantiated via the dictionary mlr_measures or with the associated sugar function msr():
MeasureSurvDCalibration$new()
mlr_measures$get("surv.dcalib")
msr("surv.dcalib")
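A possible end-to-end use of the measure is sketched below. It assumes that the mlr3 and mlr3proba packages are installed and uses the "rats" task and the "surv.coxph" learner purely as illustrative choices; scoring on the training data is done only to keep the example short.

```r
library(mlr3)
library(mlr3proba)

# Illustrative task and learner; any learner that predicts `distr` would do.
task <- tsk("rats")
learner <- lrn("surv.coxph")
prediction <- learner$train(task)$predict(task)

# Test statistic (default, chisq = FALSE): lower is better.
s_stat <- unname(prediction$score(msr("surv.dcalib")))

# p-value of the corresponding chisq.test: higher is better.
p_val <- unname(prediction$score(msr("surv.dcalib", chisq = TRUE)))

s_stat
p_val
```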
Id | Type | Default | Levels | Range |
B | integer | 10 | | [1, \infty) |
chisq | logical | FALSE | TRUE, FALSE | - |
truncate | numeric | Inf | | [0, \infty) |
Type: "surv"
Range: [0, \infty)
Minimize: TRUE
Required prediction: distr
B (integer(1))
Number of buckets to test for uniform predictions over. Default of 10 is recommended by Haider et al. (2020). Changing this parameter affects truncate.
chisq (logical(1))
If TRUE, returns the p-value of the corresponding chisq.test instead of the measure. Default is FALSE, which returns the statistic s. You can manually get the p-value by executing pchisq(s, B - 1, lower.tail = FALSE). The null hypothesis is that the model is D-calibrated.
truncate (double(1))
This parameter controls the upper bound of the output statistic when chisq is FALSE. We use truncate = Inf by default, but values between 10 and 16 are sufficient for most purposes; these correspond to p-values of approximately 0.35 to 0.06 for the chisq.test using the default B = 10 buckets.
Larger values of the statistic translate to even lower p-values (for a fixed B) and thus indicate less D-calibrated models.
If the number of buckets B changes, you will probably want to change the truncate value as well so that it corresponds to the same p-value significance. Note that truncation may severely limit automated tuning with this measure.
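One way to keep truncate aligned with a fixed p-value when B changes is to invert the chi-squared tail probability with qchisq. The helper truncate_for below is a hypothetical name introduced only for this sketch and is not part of mlr3proba.

```r
# p-values implied by the truncation values 10 and 16 with the default B = 10.
pchisq(c(10, 16), df = 10 - 1, lower.tail = FALSE)

# Hypothetical helper: the statistic cut-off whose upper-tail p-value is `p`
# for a test with B buckets (B - 1 degrees of freedom).
truncate_for <- function(p, B) qchisq(p, df = B - 1, lower.tail = FALSE)

# p-value that truncate = 16 corresponds to under the default B = 10.
p_target <- pchisq(16, df = 10 - 1, lower.tail = FALSE)

trunc_10 <- truncate_for(p_target, B = 10)  # recovers 16
trunc_20 <- truncate_for(p_target, B = 20)  # larger cut-off for more buckets
c(trunc_10, trunc_20)
```

The resulting value could then be passed as truncate when constructing the measure with a non-default B.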
mlr3::Measure -> mlr3proba::MeasureSurv -> MeasureSurvDCalibration
new()
Creates a new instance of this R6 class.
MeasureSurvDCalibration$new()
clone()
The objects of this class are cloneable with this method.
MeasureSurvDCalibration$clone(deep = FALSE)
deep
Whether to make a deep clone.
Haider, Humza, Hoehn, Bret, Davis, Sarah, Greiner, Russell (2020). “Effective Ways to Build and Evaluate Individual Survival Distributions.” Journal of Machine Learning Research, 21(85), 1–63. https://jmlr.org/papers/v21/18-772.html.
Other survival measures: mlr_measures_surv.calib_alpha, mlr_measures_surv.calib_beta, mlr_measures_surv.calib_index, mlr_measures_surv.chambless_auc, mlr_measures_surv.cindex, mlr_measures_surv.graf, mlr_measures_surv.hung_auc, mlr_measures_surv.intlogloss, mlr_measures_surv.logloss, mlr_measures_surv.mae, mlr_measures_surv.mse, mlr_measures_surv.nagelk_r2, mlr_measures_surv.oquigley_r2, mlr_measures_surv.rcll, mlr_measures_surv.rmse, mlr_measures_surv.schmid, mlr_measures_surv.song_auc, mlr_measures_surv.song_tnr, mlr_measures_surv.song_tpr, mlr_measures_surv.uno_auc, mlr_measures_surv.uno_tnr, mlr_measures_surv.uno_tpr, mlr_measures_surv.xu_r2
Other calibration survival measures: mlr_measures_surv.calib_alpha, mlr_measures_surv.calib_beta, mlr_measures_surv.calib_index
Other distr survival measures: mlr_measures_surv.calib_alpha, mlr_measures_surv.calib_index, mlr_measures_surv.graf, mlr_measures_surv.intlogloss, mlr_measures_surv.logloss, mlr_measures_surv.rcll, mlr_measures_surv.schmid