netzuko: Fit a neural network using back-propagation

View source: R/netzuko.R

netzuko R Documentation

Fit a neural network using back-propagation

Description

Fit a neural network using back-propagation

Usage

netzuko(
  x_train,
  y_train,
  x_test = NULL,
  y_test = NULL,
  output_type = NULL,
  num_hidden = c(2, 2),
  iter = 300,
  activation = c("relu", "tanh", "logistic"),
  step_size = 0.01,
  batch_size = 128,
  lambda = 1e-05,
  momentum = 0.9,
  dropout = FALSE,
  retain_rate = 0.5,
  adam = FALSE,
  beta_1 = 0.9,
  beta_2 = 0.999,
  epsilon = 1e-08,
  ini_w = NULL,
  ini_method = c("normalized", "uniform", "gaussian"),
  scale = FALSE,
  sparse = FALSE,
  verbose = FALSE,
  keep_grad = FALSE
)

Arguments

x_train

The training inputs

y_train

The training outputs

x_test

The test inputs

y_test

The test outputs

output_type

The output type: either "numeric" (regression) or "categorical" (classification). If NULL, the function will try to guess the output type based on y_train

num_hidden

A vector whose length equals the number of hidden layers and whose values give the number of hidden units in the corresponding layers. The default c(2, 2) fits a neural network with 2 hidden layers of 2 hidden units each.

iter

The number of iterations of gradient descent

activation

The hidden unit activation function: "relu" (the default), "tanh", or "logistic"

step_size

The step size for gradient descent

batch_size

The batch size for stochastic gradient descent. If NULL, full-batch (non-stochastic) gradient descent is run (see Details)

lambda

The weight decay (L2 regularization) parameter

momentum

The coefficient for the momentum term in gradient descent

dropout

If dropout should be used

retain_rate

If dropout is used, the retain rate for the input and hidden units

adam

If ADAM should be used for the weight updates (see the sketch under Details)

beta_1

The exponential decay rate for the first moment (mean) estimates in ADAM

beta_2

The exponential decay rate for the second moment (uncentered variance) estimates in ADAM

epsilon

A small constant added in the ADAM update for numerical stability

ini_w

A list of initial weights. If not provided, the function will initialize the weights automatically with small random values, following ini_method.

ini_method

The weight initialization method: "normalized" (the default), "uniform", or "gaussian"

scale

If TRUE, standardize the inputs (and, for regression, the outputs) before fitting; the reported costs are then on the standardized scale (see the last example)

sparse

If the input matrix is sparse, setting sparse to TRUE can speed up the code.

verbose

Will display fitting progress when set to TRUE

keep_grad

If the gradients at each iteration should be saved (for research purposes)
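
Details

The gradient descent options can be combined through the arguments above. A minimal sketch of illustrative calls (the values are examples, not recommendations):

## Full-batch (non-stochastic) gradient descent:
fit_gd = netzuko(x_train, y_train, batch_size = NULL)
## ADAM updates with the standard decay rates:
fit_adam = netzuko(x_train, y_train, adam = TRUE, beta_1 = 0.9, beta_2 = 0.999)
## Dropout, retaining input and hidden units with probability 0.8:
fit_drop = netzuko(x_train, y_train, dropout = TRUE, retain_rate = 0.8)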

Value

A list containing the following elements:

cost_train: The training cost by iteration

cost_test: The test cost by iteration

w: The list of weights at the final iteration
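
The returned w has the same list-of-weights structure that the ini_w argument accepts, so a fit can be continued from where a previous run stopped. A minimal sketch (verify that the two formats match on your version of the package):

fit_a = netzuko(x_train, y_train, iter = 100)
fit_b = netzuko(x_train, y_train, iter = 100, ini_w = fit_a$w)  # warm start
plot(c(fit_a$cost_train, fit_b$cost_train), type = "l")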

Examples

## Classification: a three-level outcome driven by two logistic signals
set.seed(8)
logistic = function(alpha, beta, x) 1/(1 + exp(-(alpha + beta*x)))
x_train = matrix(rnorm(300), 100, 3)
y_train = factor(rbinom(100, 1, prob = logistic(alpha = 0, beta = 1, x_train[,1])) +
                  rbinom(100, 1, prob = logistic(alpha = 0, beta = 1, x_train[,2])))
x_test = matrix(rnorm(3000), 1000, 3)
y_test = factor(rbinom(1000, 1, prob = logistic(alpha = 0, beta = 1, x_test[,1])) +
                 rbinom(1000, 1, prob = logistic(alpha = 0, beta = 1, x_test[,2])))
fit = netzuko(x_train, y_train, x_test, y_test, num_hidden = c(3, 3), step_size = 0.01, iter = 200)
plot(fit$cost_train, type = "l")
lines(fit$cost_test, col = 2)
## Without test data, only the training cost is tracked
fit_2 = netzuko(x_train, y_train, iter = 200)
plot(fit_2$cost_train, type = "l")
## The same data with logistic hidden units
fit_3 = netzuko(x_train, y_train, x_test, y_test, iter = 200, activation = "logistic")
plot(fit_3$cost_train, type = "l")
lines(fit_3$cost_test, col = 2)
## Binary classification with a single input column
y_train = factor(rbinom(100, 1, prob = logistic(alpha = 0, beta = 1, x_train[,1])))
y_test = factor(rbinom(1000, 1, prob = logistic(alpha = 0, beta = 1, x_test[,1])))
fit_4 = netzuko(x_train[,1], y_train, x_test[,1], y_test, iter = 200, num_hidden = 2)
plot(fit_4$cost_train, type = "l", ylim = range(c(fit_4$cost_train, fit_4$cost_test)))
lines(fit_4$cost_test, col = 2)
## Regression: a single numeric output
x_train = matrix(rnorm(300), 100, 3)
y_train = x_train[,1]^2
x_test = matrix(rnorm(3000), 1000, 3)
y_test = x_test[,1]^2
fit_5 = netzuko(x_train, y_train, x_test, y_test, step_size = 0.003, iter = 200)
plot(fit_5$cost_train, type = "l")
lines(fit_5$cost_test, col = 2)
## Regression with two output columns
y_train = cbind(y_train, x_train[,2]^2)
y_test = cbind(y_test, x_test[,2]^2)
fit_6 = netzuko(x_train, y_train, x_test, y_test, step_size = 0.003, iter = 200)
plot(fit_6$cost_train, type = "l")
lines(fit_6$cost_test, col = 2)
pred_6 = predict(fit_6, x_test)
## The final test cost matches the squared-error cost computed by hand
fit_6$cost_test[200]
mean(rowSums((y_test - pred_6)^2))/2
## Standardize the inputs and outputs before fitting
fit_7 = netzuko(x_train, y_train, x_test, y_test, step_size = 0.01, iter = 500, scale = TRUE)
plot(fit_7$cost_train, type = "l")
lines(fit_7$cost_test, col = 2)
pred_7 = predict(fit_7, x_test)
fit_7$cost_test[500]
## With scale = TRUE the costs are reported on the standardized scale, so
## put y_test and the predictions on that scale before comparing
tmp = scale_matrix(y_train, intercept = FALSE)
mean(rowSums((scale_matrix(y_test, tmp$mean_x, tmp$sd_x, intercept = FALSE)$x -
  scale_matrix(pred_7, tmp$mean_x, tmp$sd_x, intercept = FALSE)$x)^2))/2
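
## A sketch of scoring a classification fit with predict(). The return
## format of predict() for categorical outputs is assumed here to be a
## matrix of class probabilities -- inspect str(pred_8) to confirm.
set.seed(9)
x_cls = matrix(rnorm(300), 100, 3)
y_cls = factor(rbinom(100, 1, prob = logistic(alpha = 0, beta = 1, x_cls[,1])))
fit_8 = netzuko(x_cls, y_cls, iter = 200)
pred_8 = predict(fit_8, x_cls)
str(pred_8)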
