| familiarDataElement-class | R Documentation |
Most attributes of the familiarData object are objects of the familiarDataElement class. This (super-)class is used to allow for standardised aggregation and processing of evaluation data.
data: Evaluation data, typically a data.table or list.
identifiers: Identifiers of the data, e.g. the generating model name, learner, etc.
detail_level: Sets the level at which results are computed and aggregated.
ensemble: Results are computed at the ensemble level, i.e. over all
models in the ensemble. This means that, for example, bias-corrected
estimates of model performance are assessed by creating (at least) 20
bootstraps and computing the model performance of the ensemble model for
each bootstrap.
hybrid (default): Results are computed at the level of models in an
ensemble. This means that, for example, bias-corrected estimates of model
performance are directly computed using the models in the ensemble. If
there are at least 20 trained models in the ensemble, performance is
computed for each model, in contrast to ensemble where performance is
computed for the ensemble of models. If there are fewer than 20 trained
models in the ensemble, bootstraps are created so that at least 20 point
estimates can be made.
model: Results are computed at the model level. This means that, for
example, bias-corrected estimates of model performance are assessed by
creating (at least) 20 bootstraps and computing the performance of the
model for each bootstrap.
Note that each level of detail has a different interpretation for bootstrap
confidence intervals. For ensemble and model these are the confidence
intervals for the ensemble and an individual model, respectively. That is,
the confidence interval describes the range where an estimate produced by a
respective ensemble or model trained on a repeat of the experiment may be
found with the probability of the confidence level. For hybrid, it
represents the range where any single model trained on a repeat of the
experiment may be found with the probability of the confidence level. By
definition, confidence intervals obtained using hybrid are at least as
wide as those for ensemble. hybrid offers the correct interpretation if
the goal of the analysis is to assess the result of a single, unspecified,
model.
Some child classes do not use this parameter.
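The difference between the ensemble and hybrid levels can be sketched in base R. This is an illustration only, not familiar's implementation: the simulated predictions, the rmse helper, and the choice to form the ensemble prediction by averaging over models are all assumptions made for this example.

```r
set.seed(1)

# Hypothetical setup: 25 models in an ensemble, each predicting the same
# 100 samples. Added noise stands in for between-model differences.
n_models <- 25
n_samples <- 100
truth <- rnorm(n_samples)
predictions <- sapply(
  seq_len(n_models),
  function(ii) truth + rnorm(n_samples, sd = 0.5)
)

# Root mean squared error as a stand-in performance metric (assumption).
rmse <- function(predicted, observed) sqrt(mean((predicted - observed)^2))

# hybrid-like: one point estimate per model in the ensemble. With 25
# models, no bootstraps are needed to reach 20 point estimates.
hybrid_estimates <- apply(predictions, 2, rmse, observed = truth)

# ensemble-like: combine the models first (here by averaging, an
# assumption), then bootstrap the samples to obtain 20 point estimates
# for the ensemble as a whole.
ensemble_prediction <- rowMeans(predictions)
ensemble_estimates <- replicate(20, {
  idx <- sample.int(n_samples, replace = TRUE)
  rmse(ensemble_prediction[idx], truth[idx])
})

length(hybrid_estimates)    # one estimate per model
length(ensemble_estimates)  # one estimate per bootstrap
```

The model level would follow the ensemble-like pattern, but applied to the predictions of a single model instead of the combined prediction.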
estimation_type: Sets the type of estimation that should be possible. The following options are available:
point: Point estimates.
bias_correction or bc: Bias-corrected estimates. A bias-corrected
estimate is computed from (at least) 20 point estimates, and familiar may
bootstrap the data to create them.
bootstrap_confidence_interval or bci (default): Bias-corrected
estimates with bootstrap confidence intervals (Efron and Hastie, 2016). The
number of point estimates required depends on the confidence_level
parameter, and familiar may bootstrap the data to create them.
Some child classes do not use this parameter.
confidence_level: (optional) Numeric value for the level at which
confidence intervals are determined. If bootstraps are used to
determine the confidence intervals, familiar uses
the rule of thumb n = 20 / ci.level to determine the number of
required bootstraps.
bootstrap_ci_method: Method used to determine bootstrap confidence intervals (Efron and Hastie, 2016). The following methods are implemented:
percentile (default): Confidence intervals obtained using the percentile
method.
bc: Bias-corrected confidence intervals.
Note that the standard method is not implemented because it is often unsuitable when the underlying distributions are non-normal. The bias-corrected and accelerated (BCa) method is not implemented yet.
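The two implemented methods can be illustrated with a minimal base R sketch of the textbook percentile and bias-corrected (BC) intervals. It is not familiar's own code: the simulated data, the use of the mean as the statistic, and the 400 bootstrap replicates are assumptions chosen for the example.

```r
set.seed(42)

# Hypothetical data: a skewed sample, with the mean as the statistic.
x <- rexp(200)
theta_hat <- mean(x)

# Bootstrap point estimates (400 replicates is an arbitrary choice here).
theta_boot <- replicate(400, mean(sample(x, replace = TRUE)))

confidence_level <- 0.95
alpha <- 1 - confidence_level
probs_percentile <- c(alpha / 2, 1 - alpha / 2)

# percentile: read the interval directly from the bootstrap distribution.
ci_percentile <- quantile(theta_boot, probs = probs_percentile, names = FALSE)

# bc: shift the percentiles by z0, which measures how far the original
# estimate lies from the median of the bootstrap distribution.
z0 <- qnorm(mean(theta_boot <= theta_hat))
probs_bc <- pnorm(2 * z0 + qnorm(probs_percentile))
ci_bc <- quantile(theta_boot, probs = probs_bc, names = FALSE)

ci_percentile
ci_bc
```

When exactly half of the bootstrap estimates fall at or below the original estimate, z0 is zero and the BC interval coincides with the percentile interval.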
value_column: Identifies the column(s) in the data attribute that contain
values.
grouping_column: Identifies the column(s) in the data attribute that serve
as identifiers for grouping during aggregation. Familiar will
automatically assign items from the identifiers attribute to the data and
this attribute when combining multiple familiarDataElements of the same
(child) class.
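How identifiers, value_column and grouping_column interact during combination and aggregation can be sketched with plain lists and data frames. This does not use the familiarDataElement class or any familiar function: the add_identifiers helper, the example metric and learner names, and the use of the mean for aggregation are hypothetical.

```r
# Two hypothetical elements, represented as plain lists rather than
# familiarDataElement objects.
element_a <- list(
  data = data.frame(metric = "auc", value = c(0.80, 0.84)),
  identifiers = list(learner = "glm"),
  value_column = "value",
  grouping_column = "metric"
)
element_b <- list(
  data = data.frame(metric = "auc", value = c(0.70, 0.74)),
  identifiers = list(learner = "random_forest"),
  value_column = "value",
  grouping_column = "metric"
)

# Hypothetical helper: copy each identifier into the data as a column and
# register that column as a grouping column.
add_identifiers <- function(element) {
  for (name in names(element$identifiers)) {
    element$data[[name]] <- element$identifiers[[name]]
  }
  element$grouping_column <- union(
    element$grouping_column,
    names(element$identifiers)
  )
  element
}

elements <- lapply(list(element_a, element_b), add_identifiers)
combined <- do.call(rbind, lapply(elements, function(e) e$data))
grouping_column <- elements[[1]]$grouping_column
value_column <- elements[[1]]$value_column

# Aggregate the value column within each group (mean is an assumption).
aggregated <- aggregate(
  combined[value_column],
  by = combined[grouping_column],
  FUN = mean
)
aggregated
```

After combination, the learner identifier distinguishes rows that would otherwise be pooled, so aggregation yields one row per metric and learner.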
is_aggregated: Defines whether the object was aggregated.
Efron, B. & Hastie, T. Computer Age Statistical Inference. (Cambridge University Press, 2016).