OnlineSuperLearner.SampleIteratively: OnlineSuperLearner.SampleIteratively

Description

This class offers the functionality to sample from a fitted OnlineSuperLearner instance. This is also the class used when calling the sampleIteratively function on that class.

Usage

OnlineSuperLearner.SampleIteratively

Format

An object of class R6ClassGenerator of length 24.

Methods

initialize(osl, relevantVariables, summary_measure_generator, remove_future_variables = FALSE, verbose = FALSE)

Creates a new instance of the OnlineSuperLearner.SampleIteratively class. The constructor expects the fitted OSL instance, because that is what the data is sampled from. The instance is stored, so you don't have to provide it with every sampling call. If you update the OSL, recreate this object with the updated osl and attributes.

@param osl OnlineSuperLearner a fitted instance of the OnlineSuperLearner class.

@param relevantVariables list a list of RelevantVariable objects. These are the relevantVariables we have fit a conditional density for.

@param summary_measure_generator SummaryMeasureGenerator an object of the type SummaryMeasureGenerator. This generator is used to get new observations with the correct aggregated columns.

@param remove_future_variables boolean (default = FALSE) whether to automatically remove variables that lie in the future. This acts as an extra check on the code: when the sampling procedure starts, it determines which data is needed for the first run and removes all data that is not needed (and should therefore be sampled). If a mistake was made, for example a future value is used, setting this to TRUE makes it detectable. The only reason to leave this at FALSE is that the check might make the sampling procedure slightly slower (though probably not significantly).

@param verbose (default = FALSE) the verbosity (how much logging). Note that this might be propagated to other classes.

validate_parameters(start_from_variable, start_from_time, tau, discrete, return_type, intervention)

Validates the parameters provided to the sample functions. It is called from the sample functions whenever the corresponding check parameter is TRUE.

@param start_from_variable RelevantVariable should be an instance of a RelevantVariable class.

@param start_from_time integer should be an integer value $x$ where $1 <= x < tau$.

@param tau integer the time at which the intervention is measured. Should be greater than or equal to 1.

@param discrete boolean whether to run the discrete superlearner. Should be in c(TRUE, FALSE).

@param return_type string a string specifying the return type of the sampling algorithm. Should be one of c('observations', 'full', 'summary_measures'). Each of them causes a different set of data to be returned from the sampling procedure. See the sample_iteratively function for more details.

@param intervention list (an intervention instance) each intervention provided to this (and any other class) should follow the intervention specification guidelines in the InterventionParser class.
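The documented constraints can be illustrated with a small stand-alone check. This is a sketch in base R, not the package's implementation: the function name validate_parameters_sketch is hypothetical, and the checks on start_from_variable and intervention are left out because they depend on the RelevantVariable and InterventionParser classes.

```r
# Hypothetical stand-alone validator; it mirrors the documented constraints only.
validate_parameters_sketch <- function(start_from_time, tau, discrete, return_type) {
  valid_return_types <- c("observations", "full", "summary_measures")

  # tau: a single whole number greater than or equal to 1
  if (!is.numeric(tau) || length(tau) != 1 || tau != round(tau) || tau < 1) {
    stop("tau should be an integer greater than or equal to 1")
  }
  # start_from_time: a single whole number x with 1 <= x < tau
  if (!is.numeric(start_from_time) || length(start_from_time) != 1 ||
      start_from_time != round(start_from_time) ||
      start_from_time < 1 || start_from_time >= tau) {
    stop("start_from_time should be an integer x with 1 <= x < tau")
  }
  # discrete: exactly TRUE or FALSE
  if (!isTRUE(discrete) && !isFALSE(discrete)) {
    stop("discrete should be TRUE or FALSE")
  }
  # return_type: one of the three documented strings
  if (!is.character(return_type) || length(return_type) != 1 ||
      !(return_type %in% valid_return_types)) {
    stop("return_type should be one of: ",
         paste(valid_return_types, collapse = ", "))
  }
  invisible(TRUE)
}

# Passes silently with the documented defaults of sample_iteratively
validate_parameters_sketch(start_from_time = 1, tau = 10,
                           discrete = TRUE, return_type = "observations")
```

Failing fast with stop() keeps an invalid combination from reaching the (comparatively expensive) sampling loop.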

sample_iteratively(data, tau = 10, intervention = NULL, discrete = TRUE, return_type = 'observations', start_from_variable = NULL, start_from_time = 1, check=TRUE)

Samples from the OnlineSuperLearner, applying an intervention if one is provided. This function takes the OSL from the instance variable of the object and runs the sampling procedure iteratively. That is, it starts from the first variable (or start_from_variable) at time 1 (or start_from_time), samples that variable, then the next, and so on until it reaches the last variable in the series. It then moves on to the next unit of time and repeats this until it reaches $tau$, the time at which the outcome of the intervention should be recorded. Note that some preliminary analysis showed that for this function approx 94

@param data data.table the initial data to initialize the sampling procedure with ($O_0$).

@param tau integer (default = 10) the time at which one wants to measure the outcome of the sampling procedure.

@param intervention list (default = NULL) (Should be an intervention instance) each intervention provided to this (and any other class) should follow the intervention specification guidelines in the InterventionParser class.

@param discrete boolean (default = TRUE) whether to run the discrete (= TRUE) or the continuous superlearner (= FALSE).

@param return_type string (default = 'observations') should be one of 'observations', 'full', or 'summary_measures'. With 'observations', only the denormalized observations are returned (not the summaries); with 'summary_measures', only the normalized summary measures are returned; with 'full', both are returned (normalized and denormalized).

@param start_from_variable RelevantVariable (default = NULL) the RelevantVariable to start the sampling from. When NULL, sampling starts from the first one in the sequence.

@param start_from_time integer (default = 1) the time to start the sampling from.

@param check boolean (default = TRUE) whether to check that the provided parameters are valid.

@return data.table a data.table with the sampled values. The size and shape of the data.table differs according to the return_type specified.
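The order in which variables are visited can be made concrete with a short base R sketch. It only enumerates the (time, variable) pairs and samples nothing. The function name sampling_order_sketch and the variable names W, A and Y are hypothetical, and it is an assumption of this sketch that the full block at time tau is visited.

```r
# Hypothetical helper: lists the (time, variable) pairs in visiting order.
sampling_order_sketch <- function(variable_names, tau,
                                  start_from_variable = NULL,
                                  start_from_time = 1) {
  # When no start variable is given, begin with the first one in the sequence
  if (is.null(start_from_variable)) start_from_variable <- variable_names[1]
  first_index <- match(start_from_variable, variable_names)
  if (is.na(first_index)) stop("unknown start_from_variable")

  visited <- list()
  for (current_time in seq(start_from_time, tau)) {
    # Only the first block may start part-way; later blocks start at variable 1
    from <- if (current_time == start_from_time) first_index else 1
    for (i in seq(from, length(variable_names))) {
      visited[[length(visited) + 1]] <- data.frame(
        time = current_time,
        variable = variable_names[i],
        stringsAsFactors = FALSE
      )
    }
  }
  do.call(rbind, visited)
}

# Start at variable A in the first block, then run one more full block
visit_order <- sampling_order_sketch(c("W", "A", "Y"), tau = 2,
                                     start_from_variable = "A")
print(visit_order)
```

In this example the first block covers A and Y at time 1, and the second block covers W, A and Y at time 2.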

sample_single_block(current_time, start_from_variable, data, intervention)

Samples a single block from the OnlineSuperLearner. This function is used internally by the sample_iteratively function, and samples a single block (instead of a full series).

@param current_time integer because we sample a single block, we need to know where we are in the sampling process. This variable represents the current time in the sampling process.

@param start_from_variable RelevantVariable the RelevantVariable from which we should start the sampling procedure.

@param data data.table the current data needed to initialize the sampling of this block.

@param intervention list the intervention that might (or might not) be used in the sampling of the present block. This takes into account the current time provided to this function: if the specified intervention should be performed at a different time, no intervention is performed for now. See the InterventionParser class for the correct specification of this element.

@return list of data.tables with the sampled block, both normalized and denormalized.

sample_or_intervene_current_rv(data, intervention, current_time, current_rv, discrete)

On an even lower level (a lower level than sampling a block), we need to sample RelevantVariables. This is the heart of the sampling procedure, and is used to sample (if the current time and RV are not an intervention node) or intervene (if they are) and get a new value for the next relevant variable in line.

@param data data.table the current data needed to initialize the sampling of the RelevantVariable.

@param intervention list the intervention that might (or might not) be used in the sampling of the present variable. This takes into account the current time provided to this function: if the specified intervention should be performed at a different time, no intervention is performed for now. See the InterventionParser class for the correct specification of this element.

@param current_time integer because we sample a single variable within a block, we need to know where we are in the sampling process. This variable represents the current time in the sampling process.

@param current_rv RelevantVariable the RelevantVariable we are currently working with.

@param discrete boolean (default = TRUE) whether to run the discrete (= TRUE) or the continuous superlearner (= FALSE).

@return list containing a normalized and denormalized value.
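The decision between sampling and intervening can be sketched in base R as follows. The intervention layout used here (parallel vectors variable, when and what) is an assumption made for illustration; the InterventionParser class defines the actual specification. The function name sample_or_intervene_sketch and the sampler argument are hypothetical, and the sketch returns a single value instead of a normalized and denormalized pair.

```r
# Assumed intervention layout for this sketch: which variable to set,
# at which time, and to what value (see InterventionParser for the real format).
intervention <- list(variable = c("A", "A"), when = c(2, 3), what = c(1, 0))

sample_or_intervene_sketch <- function(intervention, current_time,
                                       current_rv_name, sampler) {
  if (!is.null(intervention)) {
    hit <- which(intervention$variable == current_rv_name &
                   intervention$when == current_time)
    # Intervention node: use the prescribed value instead of sampling
    if (length(hit) > 0) return(intervention$what[hit[1]])
  }
  # Not an intervention node: draw a value from the supplied sampler
  sampler()
}

# A stub sampler stands in for the fitted conditional density
stub_sampler <- function() rnorm(1)

sample_or_intervene_sketch(intervention, current_time = 2,
                           current_rv_name = "A", sampler = stub_sampler)
sample_or_intervene_sketch(intervention, current_time = 1,
                           current_rv_name = "A", sampler = stub_sampler)
```

The first call hits the intervention at time 2 and returns the prescribed value; the second call has no matching intervention, so it falls through to the sampler.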

perform_intervention(parsed_intervention)

Instead of sampling data, this function takes care of setting an intervention.

@param parsed_intervention list (intervention instance) the intervention to perform on the data.

perform_sample(data, current_rv, discrete)

Actually runs the prediction and performs the sample from the OSL.

@param data data.table the current data needed to initialize the sampling of the RelevantVariable.

@param current_rv RelevantVariable the RelevantVariable we are currently working with.

@param discrete boolean (default = TRUE) whether to run the discrete (= TRUE) or the continuous superlearner (= FALSE).

@return list the normalized and denormalized outcome.

create_correct_result(result, result_denormalized_observations, return_type)

As described in the sample_iteratively function, the result that is returned depends on a return type. This function takes care of transforming the full result into the result for the specified return type.

@param result data.table the result after running a sampling procedure (the normalized results)

@param result_denormalized_observations data.table the result after running a sampling procedure (the denormalized results)

@param return_type string the return type requested. Should be one of 'observations', 'full', or 'summary_measures'.

@return data.table the data.table containing the results requested.
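The mapping from return type to result can be sketched as a simple switch. This is an illustration in base R with data.frames standing in for data.tables; the function name select_result_sketch is hypothetical. How the package combines the two tables for 'full' is not described here, so returning them as a named list is purely an assumption of this sketch.

```r
# Hypothetical selector; 'full' returning a named list is an assumption.
select_result_sketch <- function(result, result_denormalized_observations,
                                 return_type) {
  switch(return_type,
    # Only the denormalized observations
    observations = result_denormalized_observations,
    # Only the normalized summary measures
    summary_measures = result,
    # Both the normalized and the denormalized results
    full = list(normalized = result,
                denormalized = result_denormalized_observations),
    stop("unknown return_type: ", return_type)
  )
}

# Made-up one-row tables, used only to show which table is selected
normalized <- data.frame(Y = 0.5)
denormalized <- data.frame(Y = 42)

select_result_sketch(normalized, denormalized, "observations")
names(select_result_sketch(normalized, denormalized, "full"))
```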

set_start_from_variable(start_from_variable = NULL)

Small helper function to set the start_from_variable in the sample_iteratively function. If no variable is provided (== NULL), the first one is selected. If one is provided, that one is used.

@param start_from_variable RelevantVariable (default = NULL) the RelevantVariable to start from. If NULL, we return the first one.

@return RelevantVariable the actual start_from_variable which is a RelevantVariable to start from. If NULL is provided, we return the first one.

get_online_super_learner

Active method. Returns the provided instance of the OnlineSuperLearner (the one provided when initializing the object).

is_removing_future_variables

Active method. Is the current instance removing future variables?

get_relevant_variables

Active method. Returns the list of RelevantVariable objects provided on initialization.

get_relevant_variable_names

Active method. Returns a list with names of the RelevantVariables provided on initialization.

get_summary_measure_generator

Active method. Returns the summary_measure_generator provided on initialization.


frbl/OnlineSuperLearner documentation built on Feb. 9, 2020, 9:28 p.m.