dot-check_class_level_plausibility: Internal function to test plausibility of provided class...
In familiar: End-to-End Automated Machine Learning and Model Evaluation

.check_class_level_plausibility

R Documentation

Internal function to test plausibility of provided class levels

Description

This function checks whether the outcome data contain categorical levels that are not among the user-provided class levels.
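The check described above can be illustrated with a minimal base R sketch. It is not the implementation used in familiar: `find_unexpected_levels` and `example_data` are hypothetical names introduced here, and the sketch assumes the outcome is stored in a single column of a plain `data.frame`.

```r
# Hypothetical helper for illustration only; not part of familiar.
find_unexpected_levels <- function(data, outcome_column, class_levels) {
  # Unique, non-missing values observed in the outcome column.
  observed <- unique(as.character(data[[outcome_column]]))
  observed <- observed[!is.na(observed)]

  # Levels present in the data but absent from the user-provided levels.
  setdiff(observed, as.character(class_levels))
}

# Hypothetical example data with one level the user did not declare.
example_data <- data.frame(
  outcome = c("benign", "malignant", "unknown", "benign"),
  stringsAsFactors = FALSE
)

unexpected <- find_unexpected_levels(
  data = example_data,
  outcome_column = "outcome",
  class_levels = c("benign", "malignant")
)

print(unexpected)  # [1] "unknown"
```

A non-empty result indicates that the provided class levels are not plausible for the data.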

Usage

.check_class_level_plausibility(
  data,
  outcome_type,
  outcome_column,
  class_levels,
  check_stringency = "strict"
)

Arguments

`data`	Data set as loaded using the `.load_data` function.
`outcome_type`	(recommended) Type of outcome found in the outcome column. The outcome type determines many aspects of the overall process, e.g. the available feature selection methods and learners, but also the type of assessments that can be conducted to evaluate the resulting models. Implemented outcome types are: `binomial`: categorical outcome with 2 levels. `multinomial`: categorical outcome with 2 or more levels. `count`: Poisson-distributed numeric outcomes. `continuous`: general continuous numeric outcomes. `survival`: survival outcome for time-to-event data. If not provided, the algorithm will attempt to infer `outcome_type` from the contents of the outcome column. This may lead to unexpected results, and we therefore advise providing this information manually. Note that `competing_risk` survival analysis is not fully supported, and is currently not a valid choice for `outcome_type`.
`outcome_column`	(recommended) Name of the column containing the outcome of interest. May be identified from a formula, if a formula is provided as an argument. Otherwise an error is raised. Note that outcomes of the `survival` and `competing_risk` types require two columns: one that indicates the time-to-event or the time of last follow-up, and one that indicates the event status.
`class_levels`	(optional) Class levels for `binomial` or `multinomial` outcomes. This argument can be used to specify the ordering of levels for categorical outcomes. These class levels must exactly match the levels present in the outcome column.
`check_stringency`	Specifies the stringency of various checks. Possible values are: `strict`: default value used for `summon_familiar`. Thoroughly checks input data. Used internally for checking development data. `external_warn`: value used for `extract_data` and related methods. Less stringent checks, but will warn about possible issues. Used internally for checking data for evaluation and explanation. `external`: value used for external methods such as `predict`. Less stringent checks, particularly for identifier and outcome columns, which may be completely absent. Used internally for `predict`.
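To illustrate why `class_levels` must match the levels in the outcome column, the following base R sketch shows how user-provided levels fix the ordering of a factor, and what happens to values that are not among them. The vectors `outcome` and `class_levels` are hypothetical example data; the sketch shows standard `factor()` behaviour and makes no claim about how familiar encodes outcomes internally.

```r
# Hypothetical outcome values and user-provided class levels.
outcome <- c("malignant", "benign", "benign", "unknown")
class_levels <- c("malignant", "benign")

# Without explicit levels, factor() sorts the unique values.
default_factor <- factor(outcome)
print(levels(default_factor))

# With explicit levels, the user-provided ordering is kept.
ordered_factor <- factor(outcome, levels = class_levels)
print(levels(ordered_factor))  # [1] "malignant" "benign"

# Values not listed in class_levels silently become NA, which is
# the kind of mismatch a plausibility check is meant to catch.
print(is.na(ordered_factor))   # [1] FALSE FALSE FALSE  TRUE
```

Because the conversion to `NA` happens without a warning in base R, checking the levels before encoding avoids silently discarding outcome values.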