extend_with_decoupled_weight_decay: Factory function returning an optimizer class with decoupled weight decay


View source: R/weight_decay_optimizers.R

Description

Factory function returning an optimizer class with decoupled weight decay

Usage

extend_with_decoupled_weight_decay(base_optimizer)

Arguments

base_optimizer

An optimizer class that inherits from tf$optimizers$Optimizer.

Details

The API of the new optimizer class slightly differs from the API of the base optimizer:

- The first argument to the constructor is the weight decay rate.

- minimize and apply_gradients accept the optional keyword argument decay_var_list, which specifies the variables that should be decayed. If NULL, all variables that are optimized are decayed.
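
For illustration, here is a minimal sketch of a custom training step that uses apply_gradients together with decay_var_list. The variables w and b, their shapes, and the toy loss are made up for this example; only w is listed in decay_var_list, so only w is decayed.

library(tensorflow)
library(tfaddons)

# Extend SGD with decoupled weight decay; any tf$keras optimizer class can be used.
SGDW <- extend_with_decoupled_weight_decay(tf$keras$optimizers$SGD)
opt <- SGDW(weight_decay = 1e-4, learning_rate = 0.01)

w <- tf$Variable(tf$ones(shape(3L)))   # will be decayed
b <- tf$Variable(tf$zeros(shape(1L)))  # will not be decayed

with(tf$GradientTape() %as% tape, {
  loss <- tf$reduce_sum(tf$square(w)) + tf$reduce_sum(tf$square(b))
})
grads <- tape$gradient(loss, list(w, b))

# Pair each gradient with its variable; only w appears in decay_var_list.
opt$apply_gradients(list(list(grads[[1]], w), list(grads[[2]], b)),
                    decay_var_list = list(w))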

Value

A new optimizer class that inherits from DecoupledWeightDecayExtension and base_optimizer.

Note

This extension decays weights BEFORE applying the update based on the gradient, i.e. it only has the desired behaviour for optimizers whose update step does not depend on the current value of 'var'.

When applying a decay to the learning rate, be sure to manually apply the decay to 'weight_decay' as well, as in the sketch below.
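
If the learning rate is decayed manually in a custom training loop, the same factor can be applied to weight_decay so the two stay proportional. The sketch below is only illustrative: it assumes the extended optimizer exposes learning_rate and weight_decay as assignable hyperparameters (as tf$keras optimizers generally do), and the per-epoch decay factor of 0.9 is an arbitrary choice.

library(tensorflow)
library(tfaddons)

MyAdamW <- extend_with_decoupled_weight_decay(tf$keras$optimizers$Adam)
optimizer <- MyAdamW(weight_decay = 1e-4, learning_rate = 1e-3)

base_lr <- 1e-3
base_wd <- 1e-4

for (epoch in 0:9) {
  decay <- 0.9 ^ epoch
  # Keep weight_decay proportional to the decayed learning rate
  # (assumes both are assignable hyperparameters on the optimizer).
  optimizer$learning_rate <- base_lr * decay
  optimizer$weight_decay <- base_wd * decay
  # ... run the training steps for this epoch ...
}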

Examples

## Not run: 

library(tensorflow)
library(tfaddons)

# MyAdamW is a new optimizer class
MyAdamW <- extend_with_decoupled_weight_decay(tf$keras$optimizers$Adam)
# Create a MyAdamW object
optimizer <- MyAdamW(weight_decay = 0.001, learning_rate = 0.001)
# Update var1 and var2, but only decay var1
# (loss, var1 and var2 are assumed to be defined by the surrounding training code)
optimizer$minimize(loss, var_list = list(var1, var2), decay_var_list = list(var1))


## End(Not run)
