bandit_policy: Define a Bandit policy for early termination of HyperDrive...
In azuremlsdk: Interface to the 'Azure Machine Learning' 'SDK'

Description Usage Arguments Value Details Examples

Bandit is an early termination policy based on slack factor/slack amount and evaluation interval. The policy early terminates any runs where the primary metric is not within the specified slack factor/slack amount with respect to the best performing training run.

bandit_policy(
  slack_factor = NULL,
  slack_amount = NULL,
  evaluation_interval = 1L,
  delay_evaluation = 0L
)

`slack_factor`	A double of the ratio of the allowed distance from the best performing run.
`slack_amount`	A double of the absolute distance allowed from the best performing run.
`evaluation_interval`	An integer of the frequency for applying policy.
`delay_evaluation`	An integer of the number of intervals for which to delay the first evaluation.

The BanditPolicy object.

The Bandit policy takes the following configuration parameters:

slack_factor or slack_amount: The slack allowed with respect to the best performing training run. slack_factor specifies the allowable slack as a ration. slack_amount specifies the allowable slack as an absolute amount, instead of a ratio.
evaluation_interval: Optional. The frequency for applying the policy. Each time the training script logs the primary metric counts as one interval.
delay_evaluation: Optional. The number of intervals to delay the policy evaluation. Use this parameter to avoid premature termination of training runs. If specified, the policy applies every multiple of evaluation_interval that is greater than or equal to delay_evaluation.

Any run that doesn't fall within the slack factor or slack amount of the evaluation metric with respect to the best performing run will be terminated.

Consider a Bandit policy with slack_factor = 0.2 and evaluation_interval = 100. Assume that run X is the currently best performing run with an AUC (performance metric) of 0.8 after 100 intervals. Further, assume the best AUC reported for a run is Y. This policy compares the value (Y + Y * 0.2) to 0.8, and if smaller, cancels the run. If delay_evaluation = 200, then the first time the policy will be applied is at interval 200.

Now, consider a Bandit policy with slack_amount = 0.2 and evaluation_interval = 100. If run 3 is the currently best performing run with an AUC (performance metric) of 0.8 after 100 intervals, then any run with an AUC less than 0.6 (0.8 - 0.2) after 100 iterations will be terminated. Similarly, the delay_evaluation can also be used to delay the first termination policy evaluation for a specific number of sequences.

# In this example, the early termination policy is applied at every interval
# when metrics are reported, starting at evaluation interval 5. Any run whose
# best metric is less than (1 / (1 + 0.1)) or 91\% of the best performing run will
# be terminated
## Not run: 
early_termination_policy = bandit_policy(slack_factor = 0.1,
                                         evaluation_interval = 1L,
                                         delay_evaluation = 5L)

## End(Not run)