netzuko: Fit a neural network using back-propagation

View source: R/netzuko.R

netzuko R Documentation

Fit a neural network using back-propagation

Description

Fit a neural network using back-propagation

Usage

netzuko(
  x_train,
  y_train,
  x_test = NULL,
  y_test = NULL,
  output_type = NULL,
  num_hidden = c(2, 2),
  iter = 300,
  activation = c("relu", "tanh", "logistic"),
  step_size = 0.01,
  batch_size = 128,
  lambda = 1e-05,
  momentum = 0.9,
  dropout = FALSE,
  retain_rate = 0.5,
  adam = FALSE,
  beta_1 = 0.9,
  beta_2 = 0.999,
  epsilon = 1e-08,
  ini_w = NULL,
  ini_method = c("normalized", "uniform", "gaussian"),
  scale = FALSE,
  sparse = FALSE,
  verbose = FALSE,
  keep_grad = FALSE
)

Arguments

x_train

The training inputs

y_train

The training outputs

x_test

The test inputs

y_test

The test outputs

output_type

The output type: either "numeric" (regression) or "categorical" (classification). If NULL, the function will try to guess the output type based on y_train

num_hidden

A vector whose length equals the number of hidden layers and whose values give the number of hidden units in the corresponding layers. The default c(2, 2) fits a neural network with 2 hidden layers of 2 hidden units each.

iter

The number of iterations of gradient descent

activation

The hidden unit activation function: "relu" (the default), "tanh", or "logistic"

step_size

The step size for gradient descent

batch_size

The batch size for stochastic gradient descent. If NULL, full-batch (non-stochastic) gradient descent is run (see Details)

lambda

The weight decay (L2 regularization) parameter

momentum

The coefficient for the momentum term in gradient descent

dropout

If dropout should be used

retain_rate

If dropout is used, the retain rate for the input and hidden units

adam

If ADAM should be used for the weight updates (see the sketch under Details)

beta_1

The exponential decay rate for the first moment (mean) estimates in ADAM

beta_2

The exponential decay rate for the second moment (uncentered variance) estimates in ADAM

epsilon

A small constant added in the ADAM update for numerical stability

ini_w

A list of initial weights. If not provided, the function will initialize the weights automatically with small random values, following ini_method.

ini_method

The weight initialization method: "normalized" (the default), "uniform", or "gaussian"

scale

If TRUE, standardize the inputs (and, for regression, the outputs) before fitting; the reported costs are then on the standardized scale (see the last example)

sparse

If the input matrix is sparse, setting sparse to TRUE can speed up the code.

verbose

Will display fitting progress when set to TRUE

keep_grad

If the gradients at each iteration should be saved (for research purposes)
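
Details

The gradient descent options can be combined through the arguments above. A minimal sketch of illustrative calls (the values are examples, not recommendations):

## Full-batch (non-stochastic) gradient descent:
fit_gd = netzuko(x_train, y_train, batch_size = NULL)
## ADAM updates with the standard decay rates:
fit_adam = netzuko(x_train, y_train, adam = TRUE, beta_1 = 0.9, beta_2 = 0.999)
## Dropout, retaining input and hidden units with probability 0.8:
fit_drop = netzuko(x_train, y_train, dropout = TRUE, retain_rate = 0.8)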

Value

A list containing the following elements:

cost_train: The training cost by iteration

cost_test: The test cost by iteration

w: The list of weights at the final iteration
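
The returned w has the same list-of-weights structure that the ini_w argument accepts, so a fit can be continued from where a previous run stopped. A minimal sketch (verify that the two formats match on your version of the package):

fit_a = netzuko(x_train, y_train, iter = 100)
fit_b = netzuko(x_train, y_train, iter = 100, ini_w = fit_a$w)  # warm start
plot(c(fit_a$cost_train, fit_b$cost_train), type = "l")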

Examples

## Classification: a three-level outcome driven by two logistic signals
set.seed(8)
logistic = function(alpha, beta, x) 1/(1 + exp(-(alpha + beta*x)))
x_train = matrix(rnorm(300), 100, 3)
y_train = factor(rbinom(100, 1, prob = logistic(alpha = 0, beta = 1, x_train[,1])) +
                  rbinom(100, 1, prob = logistic(alpha = 0, beta = 1, x_train[,2])))
x_test = matrix(rnorm(3000), 1000, 3)
y_test = factor(rbinom(1000, 1, prob = logistic(alpha = 0, beta = 1, x_test[,1])) +
                 rbinom(1000, 1, prob = logistic(alpha = 0, beta = 1, x_test[,2])))
fit = netzuko(x_train, y_train, x_test, y_test, num_hidden = c(3, 3), step_size = 0.01, iter = 200)
plot(fit$cost_train, type = "l")
lines(fit$cost_test, col = 2)
## Without test data, only the training cost is tracked
fit_2 = netzuko(x_train, y_train, iter = 200)
plot(fit_2$cost_train, type = "l")
## The same data with logistic hidden units
fit_3 = netzuko(x_train, y_train, x_test, y_test, iter = 200, activation = "logistic")
plot(fit_3$cost_train, type = "l")
lines(fit_3$cost_test, col = 2)
## Binary classification with a single input column
y_train = factor(rbinom(100, 1, prob = logistic(alpha = 0, beta = 1, x_train[,1])))
y_test = factor(rbinom(1000, 1, prob = logistic(alpha = 0, beta = 1, x_test[,1])))
fit_4 = netzuko(x_train[,1], y_train, x_test[,1], y_test, iter = 200, num_hidden = 2)
plot(fit_4$cost_train, type = "l", ylim = range(c(fit_4$cost_train, fit_4$cost_test)))
lines(fit_4$cost_test, col = 2)
## Regression: a single numeric output
x_train = matrix(rnorm(300), 100, 3)
y_train = x_train[,1]^2
x_test = matrix(rnorm(3000), 1000, 3)
y_test = x_test[,1]^2
fit_5 = netzuko(x_train, y_train, x_test, y_test, step_size = 0.003, iter = 200)
plot(fit_5$cost_train, type = "l")
lines(fit_5$cost_test, col = 2)
## Regression with two output columns
y_train = cbind(y_train, x_train[,2]^2)
y_test = cbind(y_test, x_test[,2]^2)
fit_6 = netzuko(x_train, y_train, x_test, y_test, step_size = 0.003, iter = 200)
plot(fit_6$cost_train, type = "l")
lines(fit_6$cost_test, col = 2)
pred_6 = predict(fit_6, x_test)
## The final test cost matches the squared-error cost computed by hand
fit_6$cost_test[200]
mean(rowSums((y_test - pred_6)^2))/2
## Standardize the inputs and outputs before fitting
fit_7 = netzuko(x_train, y_train, x_test, y_test, step_size = 0.01, iter = 500, scale = TRUE)
plot(fit_7$cost_train, type = "l")
lines(fit_7$cost_test, col = 2)
pred_7 = predict(fit_7, x_test)
fit_7$cost_test[500]
## With scale = TRUE the costs are reported on the standardized scale, so
## put y_test and the predictions on that scale before comparing
tmp = scale_matrix(y_train, intercept = FALSE)
mean(rowSums((scale_matrix(y_test, tmp$mean_x, tmp$sd_x, intercept = FALSE)$x -
  scale_matrix(pred_7, tmp$mean_x, tmp$sd_x, intercept = FALSE)$x)^2))/2
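
## A sketch of scoring a classification fit with predict(). The return
## format of predict() for categorical outputs is assumed here to be a
## matrix of class probabilities -- inspect str(pred_8) to confirm.
set.seed(9)
x_cls = matrix(rnorm(300), 100, 3)
y_cls = factor(rbinom(100, 1, prob = logistic(alpha = 0, beta = 1, x_cls[,1])))
fit_8 = netzuko(x_cls, y_cls, iter = 200)
pred_8 = predict(fit_8, x_cls)
str(pred_8)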
