as_prediction_table: Convert to prediction table object

View source: R/PredictionTable.R

as_prediction_table R Documentation

Description

Creates a prediction table object from input data.

Usage

as_prediction_table(
  x,
  type,
  y = waiver(),
  batch_id = waiver(),
  sample_id = waiver(),
  series_id = waiver(),
  repetition_id = waiver(),
  time = waiver(),
  class_levels = waiver(),
  value_range = waiver(),
  event_indicator = waiver(),
  censoring_indicator = waiver(),
  learner = waiver(),
  vimp_method = waiver(),
  model_object = NULL,
  data = NULL
)

Arguments

x

Values predicted using a learner. For all but classification problems, predicted values should be a single vector of values in any format that results in a single-column data.table using data.table::as.data.table. For classification problems, predicted values are probabilities for each class. Here, it is recommended to ensure probabilities can be mapped to their respective class, e.g. using a named list.
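The accepted shapes for x can be checked before calling the function. The sketch below only uses data.table::as.data.table to show that a plain vector yields a single column, and that a named list of class probabilities keeps each probability vector attached to its class. The class names and values are made up for illustration. The final call is guarded so that it only runs when an installed familiar exports as_prediction_table, and it assumes that a numeric vector for x and y is sufficient for a regression table.

```r
library(data.table)

# Regression-style predictions: a plain numeric vector converts to a
# single-column data.table.
predicted_values <- c(2.1, 3.4, 1.8)
regression_table <- as.data.table(predicted_values)
print(ncol(regression_table))

# Classification-style predictions: a named list maps every probability
# vector to its class (hypothetical class names).
class_probabilities <- list(
  benign = c(0.9, 0.2, 0.6),
  malignant = c(0.1, 0.8, 0.4)
)
classification_table <- as.data.table(class_probabilities)
print(names(classification_table))

# Assumption: this minimal call is enough for a regression table. It is
# skipped when familiar is absent or does not export the function.
if (requireNamespace("familiar", quietly = TRUE) &&
    "as_prediction_table" %in% getNamespaceExports("familiar")) {
  prediction_table <- familiar::as_prediction_table(
    x = predicted_values,
    type = "regression",
    y = c(2.0, 3.0, 2.2)
  )
}
```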

type

The type of prediction table that should be created. The following types are available:

  • regression: The predicted values are values for a regression.

  • classification: The predicted values are probabilities for specific classes.

  • hazard_ratio: The predicted values are hazard ratios.

  • cumulative_hazard: The predicted values are cumulative hazards at the time point given by the time argument.

  • expected_survival_time: The predicted values are expected survival times.

  • survival_probability: The predicted values are survival probabilities at the time point given by the time argument.

y

Known outcome value corresponding to each entry in x. For survival-related outcomes, two sets of values are expected, corresponding to the observed time and event status, respectively. Alternatively, a survival::Surv object can be provided.
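For survival outcomes, one way to bundle observed time and event status is a survival::Surv object. The following sketch builds one from made-up values. How as_prediction_table processes the object internally is not shown here.

```r
library(survival)

# Hypothetical follow-up data: observed times and event status, where
# 1 marks an event and 0 marks right-censoring.
observed_time <- c(5.2, 3.1, 8.0, 6.4)
event_status <- c(1, 0, 1, 0)

# Surv() combines both vectors into a single right-censored outcome object.
outcome <- Surv(time = observed_time, event = event_status)
print(outcome)
```

An object like outcome could then be supplied as y in place of two separate sets of values.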

batch_id

(optional) Array of batch or cohort identifiers.

In familiar, every row of data is organised by four identifiers:

  • The batch identifier batch_id: This denotes the group to which a set of samples belongs, e.g. patients from a single study, samples measured in a batch, etc. The batch identifier is used for batch normalisation, as well as selection of development and validation datasets.

  • The sample identifier sample_id: This denotes the sample level, e.g. data from a single individual. Subsets of data, e.g. bootstraps or cross-validation folds, are created at this level.

  • The series identifier series_id: Indicates measurements on a single sample that may not share the same outcome value, e.g. a time series, or the number of cells in a view.

  • The repetition identifier repetition_id: Indicates repeated measurements in a single series where any feature values may differ, but the outcome does not. Repetition identifiers are always set implicitly when multiple entries are encountered that belong to the same series of the same sample in the same batch and share the same outcome.
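The four-level hierarchy can be illustrated with a small table. The sketch below numbers rows that share batch, sample, series and outcome, which mirrors the rule for implicit repetition identifiers described above. It illustrates the rule with data.table and is not familiar's internal code. All column values are made up.

```r
library(data.table)

# Hypothetical data: patient_1 has two repeated measurements in series 1
# (same outcome) and one measurement in series 2 (different outcome).
measurements <- data.table(
  batch_id = c("study_A", "study_A", "study_A", "study_B"),
  sample_id = c("patient_1", "patient_1", "patient_1", "patient_2"),
  series_id = c(1L, 1L, 2L, 1L),
  outcome = c(0.8, 0.8, 1.3, 0.5),
  feature = c(10.1, 10.4, 12.0, 9.7)
)

# Rows with identical batch, sample, series and outcome count as
# repetitions and are numbered within their group.
measurements[, repetition_id := seq_len(.N),
             by = .(batch_id, sample_id, series_id, outcome)]
print(measurements)
```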

sample_id

(optional) Array of sample or subject identifiers. See batch_id above for more details.

If unset, every row will be identified as a single sample.

series_id

(optional) Array of series identifiers, which distinguish between measurements that are part of a series for a single sample. See batch_id above for more details.

repetition_id

(optional) Array of repetition identifiers, which distinguish between repeated measurements within a single series. See batch_id above for more details.

time

Time point at which the predicted values are generated, e.g. the cumulative risks generated by a random forest.

This parameter is only relevant for survival outcomes.

class_levels

(optional) Class levels for binomial or multinomial outcomes. This argument can be used to specify the ordering of levels for categorical outcomes. These class levels must exactly match the levels present in the outcome column.
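Because the class levels have to match the levels in the outcome exactly, it can help to check this before passing them on. The sketch below uses base R only, with made-up class names, to verify the match and fix the level order.

```r
# Hypothetical observed classes and the desired level ordering.
observed_class <- c("malignant", "benign", "benign", "malignant")
class_levels <- c("benign", "malignant")

# Stop early if the levels do not exactly match the observed classes.
stopifnot(setequal(unique(observed_class), class_levels))

# Encode the outcome with the requested ordering of levels.
outcome <- factor(observed_class, levels = class_levels)
print(levels(outcome))
```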

value_range

Range of observed, not predicted, values.

This parameter is only relevant for continuous outcomes.

event_indicator

(recommended) Indicator for events in survival and competing_risk analyses. familiar will automatically recognise 1, true, t, y and yes as event indicators, including different capitalisations. If this parameter is set, it replaces the default values.

censoring_indicator

(recommended) Indicator for right-censoring in survival and competing_risk analyses. familiar will automatically recognise 0, false, f, n and no as censoring indicators, including different capitalisations. If this parameter is set, it replaces the default values.
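The default matching of event and censoring indicators can be pictured as a case-insensitive lookup. The helper classify_status below is hypothetical and only illustrates the documented defaults. It is not the code familiar uses internally.

```r
# Default indicator values as listed in the documentation, in lower case.
default_event_indicators <- c("1", "true", "t", "y", "yes")
default_censoring_indicators <- c("0", "false", "f", "n", "no")

# Hypothetical helper: labels each status value as "event" or "censored",
# ignoring capitalisation. Unrecognised values become NA.
classify_status <- function(status) {
  normalised <- tolower(as.character(status))
  ifelse(
    normalised %in% default_event_indicators, "event",
    ifelse(normalised %in% default_censoring_indicators, "censored", NA_character_)
  )
}

raw_status <- c("Yes", "no", "TRUE", "0", "T", "unknown")
print(classify_status(raw_status))
```

Values that fall outside both default sets, such as "unknown" here, are the situation in which setting event_indicator and censoring_indicator explicitly would be needed.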

learner

The type of learner that generated the predictions.

vimp_method

The type of variable importance method for identifying the features included by the learner that generated the predictions.

model_object

A familiarModel or familiarEnsemble that can be used (and is used internally) for setting several of the other arguments of this function.

data

A familiar dataObject object that can be used (and is used internally) for setting many of the other arguments of this function.

Value

A prediction table object.


familiar documentation built on May 23, 2026, 1:07 a.m.