extract_data | R Documentation |
Compute various data related to model performance and calibration from the provided dataset and familiarEnsemble object, and store it as a familiarData object.
extract_data(
  object,
  data,
  data_element = waiver(),
  is_pre_processed = FALSE,
  cl = NULL,
  time_max = waiver(),
  aggregation_method = waiver(),
  rank_threshold = waiver(),
  ensemble_method = waiver(),
  stratification_method = waiver(),
  evaluation_times = waiver(),
  metric = waiver(),
  feature_cluster_method = waiver(),
  feature_cluster_cut_method = waiver(),
  feature_linkage_method = waiver(),
  feature_similarity_metric = waiver(),
  feature_similarity_threshold = waiver(),
  sample_cluster_method = waiver(),
  sample_linkage_method = waiver(),
  sample_similarity_metric = waiver(),
  sample_limit = waiver(),
  detail_level = waiver(),
  estimation_type = waiver(),
  aggregate_results = waiver(),
  confidence_level = waiver(),
  bootstrap_ci_method = waiver(),
  icc_type = waiver(),
  dynamic_model_loading = FALSE,
  message_indent = 0L,
  verbose = FALSE,
  ...
)
object |
A |
data |
A |
data_element |
String indicating which data elements are to be extracted.
Default is |
is_pre_processed |
Flag that indicates whether the data was already
pre-processed externally, e.g. normalised and clustered. Only used if the
|
cl |
Cluster created using the |
time_max |
Time point which is used as the benchmark for e.g. cumulative
risks generated by random forest, or the cut-off value for Uno's concordance
index. If not provided explicitly, this parameter is read from settings used
at creation of the underlying |
aggregation_method |
Method for aggregating variable importances for the purpose of evaluation. Variable importances are determined during feature selection steps and after training the model. Both types are evaluated, but feature selection variable importance is only evaluated at run-time. See the documentation for the If not provided explicitly, this parameter is read from settings used at
creation of the underlying |
rank_threshold |
The threshold used to define the subset of highly important features during evaluation. See the documentation for the If not provided explicitly, this parameter is read from settings used at
creation of the underlying |
ensemble_method |
Method for ensembling predictions from models for the same sample. Available methods are:
|
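As a rough illustration of what ensembling predictions for the same sample means, the sketch below aggregates per-model predictions into one value per sample. It uses base R only; the prediction matrix, the model names, and the choice of median and mean as aggregation functions are assumptions made for the example, not a statement of which methods this argument accepts.

```r
# Hypothetical predictions: rows are samples, columns are models in an ensemble.
predictions <- matrix(
  c(0.10, 0.20, 0.90,
    0.70, 0.80, 0.60,
    0.40, 0.50, 0.30),
  nrow = 3,
  byrow = TRUE,
  dimnames = list(
    c("sample_1", "sample_2", "sample_3"),
    c("model_a", "model_b", "model_c")
  )
)

# One ensemble value per sample, aggregated across the models (assumed methods).
ensemble_median <- apply(predictions, 1, median)
ensemble_mean <- rowMeans(predictions)
```

A median is less sensitive than a mean to a single model that disagrees strongly with the others, as with sample_1 here.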
stratification_method |
(optional) Method for determining the stratification threshold for creating survival groups. The actual, model-dependent, threshold value is obtained from the development data, and can afterwards be used to perform stratification on validation data. The following stratification methods are available:
One or more stratification methods can be selected simultaneously. This parameter is only relevant for |
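The two-step idea described above (derive a threshold from development data, then reuse it on validation data) can be sketched in base R as follows. The risk scores are made up, and using the median as the threshold is an assumption chosen for illustration; it is not necessarily one of the methods offered by this argument.

```r
# Hypothetical predicted risk scores from a single model.
development_risk <- c(0.12, 0.35, 0.48, 0.51, 0.77, 0.90)
validation_risk <- c(0.05, 0.50, 0.49, 0.95)

# Assumed method: the median of the development predictions is the threshold.
threshold <- median(development_risk)

# The threshold is fixed after development and only applied to validation data.
validation_group <- ifelse(validation_risk > threshold, "high", "low")
```

Deriving the threshold from development data alone keeps the validation data out of the choice of cut-off, so the validation groups are not tuned to the data they are evaluated on.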
evaluation_times |
One or more time points that are used in the analysis of
survival problems when data has to be assessed at a set time, e.g.
calibration. If not provided explicitly, this parameter is read from
settings used at creation of the underlying |
metric |
One or more metrics for assessing model performance. See the
vignette on performance metrics for the available metrics. If not provided
explicitly, this parameter is read from settings used at creation of the
underlying |
feature_cluster_method |
The method used to perform clustering. These are
the same methods as for the
If not provided explicitly, this parameter is read from settings used at
creation of the underlying |
feature_cluster_cut_method |
The method used to divide features into
separate clusters. The available methods are the same as for the
If not provided explicitly, this parameter is read from settings used at
creation of the underlying |
feature_linkage_method |
The method used for agglomerative clustering in
If not provided explicitly, this parameter is read from settings used at
creation of the underlying |
feature_similarity_metric |
Metric to determine pairwise similarity
between features. Similarity is computed in the same manner as for
clustering, and If not provided explicitly, this parameter is read from settings used at
creation of the underlying |
feature_similarity_threshold |
The threshold level for pair-wise
similarity that is required to form feature clusters with the If not provided explicitly, this parameter is read from settings used at
creation of the underlying |
sample_cluster_method |
The method used to perform clustering based on
distance between samples. These are the same methods as for the
If not provided explicitly, this parameter is read from settings used at
creation of the underlying |
sample_linkage_method |
The method used for agglomerative clustering in
If not provided explicitly, this parameter is read from settings used at
creation of the underlying |
sample_similarity_metric |
Metric to determine pairwise similarity
between samples. Similarity is computed in the same manner as for
clustering, but The underlying feature data is scaled to the If not provided explicitly, this parameter is read from settings used at
creation of the underlying |
sample_limit |
(optional) Set the upper limit of the number of samples that are used during evaluation steps. Cannot be less than 20. This setting can be specified per data element by providing a parameter
value in a named list with data elements, e.g.
This parameter can be set for the following data elements:
|
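A minimal sketch of the named-list form of sample_limit follows, together with a check of the documented lower bound of 20. The element names element_a and element_b are placeholders, since the actual data element names are not reproduced here, and the check is a hypothetical helper rather than the package's own validation.

```r
# Placeholder names: substitute the data elements that support sample_limit.
sample_limit <- list("element_a" = 100L, "element_b" = 1000L)

# Hypothetical check: every entry is named and respects the minimum of 20.
limits <- unlist(sample_limit)
has_names <- !is.null(names(sample_limit)) && all(nzchar(names(sample_limit)))
limits_valid <- has_names && all(limits >= 20L)
```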
detail_level |
(optional) Sets the level at which results are computed and aggregated.
Note that each level of detail has a different interpretation for bootstrap
confidence intervals. For
A non-default |
estimation_type |
(optional) Sets the type of estimation that should be possible. This has the following options:
As with |
aggregate_results |
(optional) Flag that signifies whether results
should be aggregated during evaluation. If The default value is equal to As with |
confidence_level |
(optional) Numeric value for the level at which
confidence intervals are determined. In the case bootstraps are used to
determine the confidence intervals bootstrap estimation, The default value is |
bootstrap_ci_method |
(optional) Method used to determine bootstrap confidence intervals (Efron and Hastie, 2016). The following methods are implemented:
Note that the standard method is not implemented because this method is often not suitable due to non-normal distributions. The bias-corrected and accelerated (BCa) method is not implemented yet. |
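To make the relation between confidence_level and a bootstrap confidence interval concrete, the sketch below computes a percentile interval for a sample mean in base R. The percentile method is used here as a common textbook approach and is an assumption; the simulated data, the number of bootstraps, and the statistic are likewise chosen only for the example.

```r
set.seed(1)

# Simulated data standing in for a performance metric computed per sample.
x <- rnorm(50, mean = 10, sd = 2)

confidence_level <- 0.95
n_bootstraps <- 1000L

# Resample with replacement and recompute the statistic each time.
boot_means <- replicate(n_bootstraps, mean(sample(x, replace = TRUE)))

# Percentile interval: take the alpha/2 and 1 - alpha/2 quantiles.
alpha <- 1 - confidence_level
ci <- quantile(boot_means, probs = c(alpha / 2, 1 - alpha / 2), names = FALSE)
```

Unlike the standard (normal approximation) interval, the percentile interval does not assume that the bootstrap distribution is symmetric.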
icc_type |
String indicating the type of intraclass correlation
coefficient ( |
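For readers unfamiliar with intraclass correlation coefficients, the sketch below computes the one-way random effects form, ICC(1,1) in the notation of Shrout and Fleiss, from between-target and within-target mean squares. It is base R only and is not the package's implementation; choosing ICC(1,1) is an assumption made for illustration, and the ratings are the small six-target, four-rater example commonly reproduced from that paper.

```r
# Ratings: rows are targets (e.g. features or subjects), columns are raters.
ratings <- matrix(
  c( 9, 2, 5, 8,
     6, 1, 3, 2,
     8, 4, 6, 8,
     7, 1, 2, 6,
    10, 5, 6, 9,
     6, 2, 4, 7),
  ncol = 4,
  byrow = TRUE
)

n_targets <- nrow(ratings)
n_raters <- ncol(ratings)
target_means <- rowMeans(ratings)
grand_mean <- mean(ratings)

# Between-target and within-target mean squares from a one-way ANOVA.
ms_between <- n_raters * sum((target_means - grand_mean)^2) / (n_targets - 1)
ms_within <- sum((ratings - target_means)^2) / (n_targets * (n_raters - 1))

# ICC(1,1): single-rating reliability under the one-way random effects model.
icc_1_1 <- (ms_between - ms_within) / (ms_between + (n_raters - 1) * ms_within)
```

The other ICC types additionally separate rater effects from residual error, so they require a two-way decomposition instead of the one-way split used here.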
dynamic_model_loading |
(optional) Enables dynamic loading of models
during the evaluation process, if |
message_indent |
Number of indentation steps for messages shown during computation and extraction of various data elements. |
verbose |
Flag to indicate whether feedback should be provided on the computation and extraction of various data elements. |
... |
Unused arguments. |
A familiarData object.
Shrout, P. E. & Fleiss, J. L. Intraclass correlations: uses in assessing rater reliability. Psychol. Bull. 86, 420–428 (1979).