CrossValidationRiskCalculator: CrossValidationRiskCalculator

Description

Class that contains various methods for calculating the cross-validated risk of an estimator.

Usage

CrossValidationRiskCalculator

Format

An object of class R6ClassGenerator of length 24.

Methods

initialize()

Creates a new cross-validated risk calculator.

calculate_evaluation(predicted.outcome, observed.outcome, relevantVariables, add_evaluation_measure_name=TRUE)

Calculates an evaluation using the provided predicted and observed outcomes. It loops over a list of RelevantVariable objects to evaluate all data provided to it. If predicted.outcome contains both a normalized and a denormalized entry, the normalized entry is used by default. Setting add_evaluation_measure_name to TRUE adds the name of the evaluation metric that was used to the names of the output.

The input data should look as follows:

list
  normalized
    AlgorithmName1: data.table with a W, A, Y entry.
    AlgorithmName2: data.table with a W, A, Y entry.
  denormalized
    AlgorithmName1: data.table with a W, A, Y entry.
    AlgorithmName2: data.table with a W, A, Y entry.

The output data then looks as follows:

list
  AlgorithmName1: data.table with a W, A, Y entry.
  AlgorithmName2: data.table with a W, A, Y entry.

@param predicted.outcome the outcome predicted by the various algorithms in the super learner. This is a list which either has two entries (normalized and denormalized), and in which both those entries have a list of ML outputs, or it is a list of the outputs of each of the algorithms (e.g., the normalized output directly).

@param observed.outcome the actual data that was observed in the study.

@param relevantVariables the relevant variables that are included in the prediction.

@param add_evaluation_measure_name (default TRUE) should we add the name of the evaluation metric to the output?

@return a list with the evaluation of each of the algorithms.
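
The input structure and the "normalized by default" rule described above can be sketched in base R. This is a minimal illustration, not the class's actual code: select_predictions and make_predictions are hypothetical helpers, and plain data.frame objects stand in for the data.table objects the class works with.

```r
# Hypothetical helper: builds a one-row prediction table for W, A and Y.
# A data.frame stands in for the data.table used by the real class.
make_predictions <- function(offset) {
  data.frame(W = 0.1 + offset, A = 0.5 + offset, Y = 0.9 + offset)
}

# Input in the two-entry form: normalized and denormalized predictions,
# each holding one table per algorithm.
predicted.outcome <- list(
  normalized = list(
    AlgorithmName1 = make_predictions(0),
    AlgorithmName2 = make_predictions(0.05)
  ),
  denormalized = list(
    AlgorithmName1 = make_predictions(10),
    AlgorithmName2 = make_predictions(20)
  )
)

# Hypothetical helper (an assumption, not part of the package API):
# use the normalized entry when it exists, otherwise assume the input
# already is the per-algorithm list.
select_predictions <- function(predicted.outcome) {
  if ("normalized" %in% names(predicted.outcome)) {
    return(predicted.outcome$normalized)
  }
  predicted.outcome
}

selected <- select_predictions(predicted.outcome)
print(names(selected))
```

Passing the per-algorithm list directly (the second accepted input form) leaves it unchanged under this sketch.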

evaluate_single_outcome(observed.outcome, predicted.outcome, relevantVariables)

Performs the evaluation of a single estimator. In this case the data of just one estimator can be provided, such as: AlgorithmName1 data.table with a W, A, Y entry. The function then uses the default evaluation metric to determine the performance of the estimator.

@param predicted.outcome the outcome predicted by a single algorithm in the super learner.

@param observed.outcome the actual data that was observed in the study.

@param relevantVariables the relevant variables that are included in the prediction.

@return a list with the evaluation of the algorithm.

calculate_risk(predicted.outcome, observed.outcome, relevantVariables)

Calculates the CV risk for each of the relevant variables provided, based on the predicted and observed outcomes. Like calculate_evaluation, this function expects a list of predictions.

@param predicted.outcome the outcome predicted by the various algorithms in the super learner. This is a list which either has two entries (normalized and denormalized), and in which both those entries have a list of ML outputs, or it is a list of the outputs of each of the algorithms (e.g., the normalized output directly).

@param observed.outcome the actual data that was observed in the study (empirically, or from a simulation).

@param relevantVariables the relevant variables that are included in the prediction.

@param add_evaluation_measure_name (default TRUE) should we add the name of the evaluation metric to the output?

@return a list of lists, in which each element is the risk for one estimator. In each list per estimator, each element corresponds to one of the relevant variables.
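
The shape of the return value (one list per estimator, one entry per relevant variable) can be illustrated with a short base R sketch. The loss is an assumption: this page does not state which loss the class uses, so mean squared error is used purely for illustration. Plain variable names stand in for RelevantVariable objects, and squared_error_risk is a hypothetical helper.

```r
# Assumed loss, for illustration only: mean squared error.
squared_error_risk <- function(predicted, observed) {
  mean((predicted - observed)^2)
}

# Observed data and per-algorithm predictions; data.frames stand in
# for the data.table objects used by the real class.
observed.outcome <- data.frame(W = c(1, 2), A = c(0, 1), Y = c(3, 5))

predicted.outcome <- list(
  AlgorithmName1 = data.frame(W = c(1, 2), A = c(0, 0), Y = c(2, 4)),
  AlgorithmName2 = data.frame(W = c(2, 3), A = c(0, 1), Y = c(3, 5))
)

# Plain names stand in for the RelevantVariable objects.
variable_names <- c("W", "A", "Y")

# Outer list: one element per estimator.
# Inner list: one risk per relevant variable.
risk <- lapply(predicted.outcome, function(predictions) {
  per_variable <- lapply(variable_names, function(name) {
    squared_error_risk(predictions[[name]], observed.outcome[[name]])
  })
  setNames(per_variable, variable_names)
})

str(risk)
```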

update_risk(predicted.outcome, observed.outcome, relevantVariables, current_count, current_risk)

Function used by the OSL to update a previously calculated risk, using the equation by Benkeser et al. (2017). It multiplies the previous risk (current_risk) by current_count, adds the new risk, and divides the result by current_count + 1 to obtain the updated risk estimate. This avoids recalculating the whole risk when only one update is required.

@param predicted.outcome the outcome predicted by the various algorithms in the super learner. This is a list which either has two entries (normalized and denormalized), and in which both those entries have a list of ML outputs, or it is a list of the outputs of each of the algorithms (e.g., the normalized output directly).

@param observed.outcome the actual data that was observed in the study (empirically, or from a simulation).

@param relevantVariables the relevant variables for which the distributions have been calculated.

@param current_count the current number of evaluations performed for calculating the current_risk.

@param current_risk the previously calculated risk of each of the estimators (calculated over current_count number of evaluations).

@return a list of lists with the updated risk for each estimator, and for each estimator an estimate of the risk for each relevant variable.
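
The update rule described above is a running average: updated = (current_count * current_risk + new_risk) / (current_count + 1). The following base R sketch applies it to a nested list of risks. The function name update_running_risk and the example numbers are illustrative assumptions, not the package's internals.

```r
# Running-average update as described in the text:
# (current_count * current_risk + new_risk) / (current_count + 1)
update_running_risk <- function(current_risk, new_risk, current_count) {
  (current_count * current_risk + new_risk) / (current_count + 1)
}

# Illustrative risks: one list per estimator, one value per variable.
current_risk <- list(AlgorithmName1 = list(W = 0.2, Y = 0.4))
new_risk <- list(AlgorithmName1 = list(W = 0.5, Y = 0.1))
current_count <- 3

# Apply the update per estimator and per relevant variable.
updated <- Map(function(old, new) {
  Map(function(o, n) update_running_risk(o, n, current_count), old, new)
}, current_risk, new_risk)

# Sanity check: the incremental update matches recomputing the mean
# over all individual risks.
block_risks <- c(0.1, 0.2, 0.3)
incremental <- update_running_risk(mean(block_risks), 0.6, length(block_risks))
batch <- mean(c(block_risks, 0.6))
print(all.equal(incremental, batch))
```

Only the previous risk and the count need to be stored, which is what makes the update cheap in an online setting.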

update_single_risk(old_risk, new_risks, current_count, relevantVariables)

Instead of updating the risk for each of the estimators, one can also update a single risk. In this case the risks are updated using the old_risk and new_risks variables. Essentially, this function performs the internals of the update_risk function, but it expects risks that were calculated beforehand instead of predictions and observed outcomes. It uses the equation by Benkeser et al. (2017): the previous risk (old_risk) is multiplied by current_count, the new risk is added, and the result is divided by current_count + 1 to obtain the updated risk estimate. This avoids recalculating the whole risk when only one update is required.

@param old_risk the old risks, calculated in a previous iteration.

@param new_risks the new risks, calculated using the current machine learning estimators.

@param current_count the number of iterations used to calculate the old risk.

@param relevantVariables the relevant variables for which the predictions have been created.

@return the updated risk as a data.table.


frbl/OnlineSuperLearner documentation built on Feb. 9, 2020, 9:28 p.m.