EdNetTrain: Train a neural network model


View source: R/EdNetTrain.R

Description

Train a neural network model

Usage

EdNetTrain(
  X,
  Y,
  family=NULL,
  learning_rate=0.05,
  num_epochs,
  hidden_layer_dims=NULL,
  hidden_layer_activations=NULL,
  weight=NULL,
  offset=NULL,
  optimiser="GradientDescent",
  keep_prob=NULL,
  input_keep_prob=NULL,
  tweediePower=ifelse(family=="tweedie", 1.5, NULL),
  alpha=0,
  lambda=0,
  mini_batch_size=NULL,
  dev_set=NULL,
  beta1=ifelse(optimiser %in% c("Momentum", "Adam"), 0.9, NULL),
  beta2=ifelse(optimiser %in% c("RMSProp", "Adam"), 0.999, NULL),
  epsilon=ifelse(optimiser %in% c("RMSProp", "Adam"), 1E-8, NULL),
  initialisation_constant=2,
  print_every_n=NULL,
  seed=1984L,
  plot=TRUE,
  checkpoint=NULL,
  keep=FALSE
)

Arguments

X

A matrix with rows as training examples and columns as input features

Y

A matrix with rows as training examples and columns as target values

family

Type of regression to be performed. One of "binary", "multiclass", "gaussian", "poisson", "gamma", "tweedie". Ignored if starting from a checkpoint model. Alternatively, you can specify a custom family as a named list with the following elements:

"family" - a character string of length 1, for reference only (must be "multiclass" if target values have dimension > 1)

"link.inv" - the inverse link function, used to activate the output layer

"costfun" - a function with parameters 'Y' and 'Y_hat' representing the cost function to be minimised

"gradfun" - a function with parameters 'Y' and 'Y_hat' representing the gradient of the cost function with respect to the linear (pre-activation) matrix in the output layer
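As an illustration, a custom family for Poisson regression with a log link might look like the sketch below. This is an assumption-laden example, not code from the package: the element names follow the description above, and the gradient is left unscaled per example, so it may need dividing by the number of training examples (or weighting) to match the package's internal conventions.

poisson_family <- list(
  family   = "poisson",                                        # for reference only
  link.inv = exp,                                              # inverse of the log link
  costfun  = function(Y, Y_hat) mean(Y_hat - Y * log(Y_hat)),  # Poisson deviance, up to a constant in Y
  gradfun  = function(Y, Y_hat) Y_hat - Y                      # d(cost)/d(linear predictor), since Y_hat = exp(Z)
)

Such a list would then be passed as family = poisson_family in place of one of the built-in family names.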

learning_rate

Learning rate to use.

num_epochs

Number of epochs (complete passes through the training data) to be performed. If using mini-batches, the number of iterations may be much higher.

hidden_layer_dims

Integer vector representing the dimensions of the hidden layers. Should not be specified if starting from a checkpoint model.

hidden_layer_activations

A character vector of the same length as hidden_layer_dims, or of length 1, in which case the same activation function is used for all hidden layers. Should only contain "relu" or "tanh", as these are the only supported activation functions for hidden layers. Should not be specified if starting from a checkpoint model.
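For instance (illustrative values), a network with two hidden layers of 64 and 32 units, both using relu, could be specified as:

hidden_layer_dims        <- c(64L, 32L)   # two hidden layers: 64 units, then 32
hidden_layer_activations <- "relu"        # length 1: recycled across both layers
# equivalent to c("relu", "relu"); mixing, e.g. c("relu", "tanh"), is also allowed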

weight

An optional vector of case weights, equal in length to the number of rows of X and Y.

offset

A matrix with the same dimensions as Y, to be used as an offset model. The offset must be on the linear-predictor scale, since it is applied before the activation function.
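For example, with a log-link family such as "poisson", a natural offset is the log of predictions from a prior model. The prior_pred values below are hypothetical:

prior_pred <- c(1.2, 0.8, 2.5)              # hypothetical prior-model predictions (response scale)
offset <- matrix(log(prior_pred), ncol = 1) # log puts the offset on the linear-predictor scale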

optimiser

Type of optimiser to use. One of "GradientDescent", "Momentum", "RMSProp", "Adam".

keep_prob

Keep probabilities for applying drop-out in hidden layers. Either a constant or a vector the same length as the hidden_layer_dims vector. If NULL no drop-out is applied.

input_keep_prob

Keep probabilities for applying drop-out in the input layer. Needs to be a single constant. If NULL no drop-out is applied.

tweediePower

Tweedie power parameter. Only applicable in Tweedie regression. Should be a number between 1 and 2.

alpha

L1 regularisation term.

lambda

L2 regularisation term.

mini_batch_size

Size of mini-batches to use. If NULL, the full training set is used for each iteration.

dev_set

Integer vector identifying hold-out data. Integers refer to individual training examples (rows) in the order presented in X.
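For example, a random 20% hold-out could be constructed as follows (X here is an illustrative training matrix):

set.seed(1984)
X <- matrix(rnorm(100 * 4), nrow = 100)                  # illustrative training matrix
dev_idx <- sample(nrow(X), size = floor(0.2 * nrow(X)))  # indices of a 20% hold-out
# then call EdNetTrain(..., dev_set = dev_idx)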

beta1

Exponential weighting term for gradients when using Momentum or Adam optimisation.

beta2

Exponential weighting term for squares of gradients when using RMSProp or Adam optimisation.

epsilon

Small number used for numerical stability to prevent division by zero when using RMSProp or Adam optimisation.

initialisation_constant

Weights are initialised randomly with variance k / n, where k is the initialisation_constant and n is the dimension of the previous layer. The default of 2 is recommended for relu activations; change to 1 for tanh, although it can be tuned for any specific learning task.
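Illustratively (this is a sketch of the rule, not the package's internal code), the scheme corresponds to drawing each weight as below; with relu activations and k = 2 this matches He initialisation, and k = 1 approximates Xavier initialisation for tanh:

n_prev <- 64L   # units in the previous layer
n_this <- 32L   # units in the current layer
k <- 2          # initialisation_constant
W <- matrix(rnorm(n_this * n_prev, mean = 0, sd = sqrt(k / n_prev)),
            nrow = n_this, ncol = n_prev)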

print_every_n

Print info to the log every n epochs. If NULL, no printing is done.

seed

Random seed to use for repeatability.

plot

Whether to plot the cost function when printing to the log.

checkpoint

Rather than initialising new parameters, start training from a checkpoint model.

keep

Whether to keep the X and Y data in the final output.

Value

An object of class EdNetModel.

Author(s)

Edwin Graham <edwingraham1984@gmail.com>

Examples

# No example yet
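# Until an official example is added, the following minimal sketch shows a
# plausible call for binary classification on simulated data; all
# hyperparameter values are illustrative, not recommendations.

set.seed(1984)
X <- matrix(rnorm(500 * 3), nrow = 500, ncol = 3)   # 500 examples, 3 features
p <- 1 / (1 + exp(-(X %*% c(1, -2, 0.5))))          # true class probabilities
Y <- matrix(rbinom(500, 1, p), ncol = 1)            # binary targets

model <- EdNetTrain(
  X, Y,
  family                   = "binary",
  learning_rate            = 0.05,
  num_epochs               = 100,
  hidden_layer_dims        = c(8L, 4L),
  hidden_layer_activations = "relu",
  optimiser                = "Adam",
  dev_set                  = sample(nrow(X), 100),  # 100-row hold-out
  print_every_n            = 10
)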
