netzuko: Fit a neural network using back-propagation
Usage

netzuko(
  x_train,
  y_train,
  x_test = NULL,
  y_test = NULL,
  output_type = NULL,
  num_hidden = c(2, 2),
  iter = 300,
  activation = c("relu", "tanh", "logistic"),
  step_size = 0.01,
  batch_size = 128,
  lambda = 1e-05,
  momentum = 0.9,
  dropout = FALSE,
  retain_rate = 0.5,
  adam = FALSE,
  beta_1 = 0.9,
  beta_2 = 0.999,
  epsilon = 1e-08,
  ini_w = NULL,
  ini_method = c("normalized", "uniform", "gaussian"),
  scale = FALSE,
  sparse = FALSE,
  verbose = FALSE,
  keep_grad = FALSE
)
Arguments

x_train: The training inputs.

y_train: The training outputs.

x_test: The test inputs.

y_test: The test outputs.
output_type: The output type, either "numeric" (regression) or "categorical" (classification). If NULL, the function will try to guess the output type from y_train.
num_hidden: A vector whose length equals the number of hidden layers and whose entries give the number of hidden units in the corresponding layers. The default c(2, 2) fits a neural network with 2 hidden layers of 2 hidden units each.
iter: The number of iterations of gradient descent.

activation: The hidden unit activation function ("relu", "tanh", or "logistic").

step_size: The step size for gradient descent.
batch_size: The batch size for stochastic gradient descent. If NULL, run (non-stochastic) gradient descent; see the first sketch after this list.
lambda: The weight decay parameter.

momentum: The coefficient of the momentum term in gradient descent.
dropout: Whether dropout should be used; see the second sketch after this list.

retain_rate: If dropout is used, the retain rate for the input and hidden units.

adam: Whether Adam should be used for the weight updates; see the first sketch after this list.
beta_1: A parameter for Adam (the exponential decay rate for the first moment estimates).

beta_2: A parameter for Adam (the exponential decay rate for the second moment estimates).

epsilon: A parameter for Adam (a small constant that guards against division by zero).
ini_w: A list of initial weights. If not provided, the function will initialize the weights automatically by simulating from a Gaussian distribution with small variance.

ini_method: The weight initialization method ("normalized", "uniform", or "gaussian").

scale: Whether to scale the inputs and outputs before fitting. When TRUE, the reported costs are on the scaled outputs; see the last example, which back-transforms with scale_matrix.
sparse: If the input matrix is sparse, setting sparse to TRUE can speed up the code; see the second sketch after this list.
verbose: If TRUE, display the fitting progress.

keep_grad: Whether to save the gradients at each iteration (for research purposes).
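The sketch below is not part of the package's own examples; it is a minimal illustration of the optimizer-related arguments above (batch_size, momentum, adam), using made-up data of the same shape as in the Examples section:

# Illustrative data, not from the package examples
set.seed(1)
x_train = matrix(rnorm(300), 100, 3)
y_train = factor(rbinom(100, 1, prob = 0.5))

# Full (non-stochastic) gradient descent: batch_size = NULL
fit_gd = netzuko(x_train, y_train, batch_size = NULL, iter = 100)

# Stochastic gradient descent with a smaller batch and the momentum update
fit_sgd = netzuko(x_train, y_train, batch_size = 32, momentum = 0.9, iter = 100)

# Adam updates in place of the momentum update
fit_adam = netzuko(x_train, y_train, adam = TRUE, beta_1 = 0.9,
                   beta_2 = 0.999, epsilon = 1e-8, iter = 100)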
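A similar sketch for the dropout and sparse-input options. Only the sparse = TRUE flag itself is documented; that netzuko accepts a sparse matrix from the Matrix package is an assumption, so check the package source before relying on it:

# Dropout: retain 80% of the input and hidden units at each update
fit_drop = netzuko(x_train, y_train, dropout = TRUE, retain_rate = 0.8, iter = 100)

# Sparse inputs (assumption: a Matrix-package sparse matrix is accepted)
library(Matrix)
x_sparse = Matrix(x_train, sparse = TRUE)
fit_sparse = netzuko(x_sparse, y_train, sparse = TRUE, iter = 100)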
Value

A list containing the following elements:

cost_train: The training cost by iteration.

cost_test: The test cost by iteration.

w: The list of weights at the final iteration.
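A small sketch of how the returned elements might be used. That the weight list in fit$w is in the format expected by ini_w is an assumption suggested by the matching descriptions above:

# Illustrative data, not from the package examples
set.seed(2)
x_train = matrix(rnorm(300), 100, 3); y_train = x_train[, 1]^2
x_test = matrix(rnorm(300), 100, 3); y_test = x_test[, 1]^2

# Fit, inspect convergence, then warm-start a second run from the final weights
fit = netzuko(x_train, y_train, x_test, y_test, iter = 100)
plot(fit$cost_train, type = "l")
lines(fit$cost_test, col = 2)
fit_more = netzuko(x_train, y_train, x_test, y_test, iter = 100, ini_w = fit$w)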
Examples

set.seed(8)
logistic = function(alpha, beta, x) 1/(1 + exp(-(alpha + beta*x)))

# Classification: a three-class outcome driven by the first two inputs
x_train = matrix(rnorm(300), 100, 3)
y_train = factor(rbinom(100, 1, prob = logistic(alpha = 0, beta = 1, x_train[,1])) +
                 rbinom(100, 1, prob = logistic(alpha = 0, beta = 1, x_train[,2])))
x_test = matrix(rnorm(3000), 1000, 3)
y_test = factor(rbinom(1000, 1, prob = logistic(alpha = 0, beta = 1, x_test[,1])) +
                rbinom(1000, 1, prob = logistic(alpha = 0, beta = 1, x_test[,2])))
fit = netzuko(x_train, y_train, x_test, y_test, num_hidden = c(3, 3),
              step_size = 0.01, iter = 200)
plot(fit$cost_train, type = "l")
lines(fit$cost_test, col = 2)

# Training data only: no test cost is computed
fit_2 = netzuko(x_train, y_train, iter = 200)
plot(fit_2$cost_train, type = "l")

# Logistic activation in the hidden layers
fit_3 = netzuko(x_train, y_train, x_test, y_test, iter = 200, activation = "logistic")
plot(fit_3$cost_train, type = "l")
lines(fit_3$cost_test, col = 2)

# Binary outcome with a single input and a single hidden layer
y_train = factor(rbinom(100, 1, prob = logistic(alpha = 0, beta = 1, x_train[,1])))
y_test = factor(rbinom(1000, 1, prob = logistic(alpha = 0, beta = 1, x_test[,1])))
fit_4 = netzuko(x_train[,1], y_train, x_test[,1], y_test, iter = 200, num_hidden = 2)
plot(fit_4$cost_train, type = "l", ylim = range(c(fit_4$cost_train, fit_4$cost_test)))
lines(fit_4$cost_test, col = 2)

# Regression: a single numeric output
x_train = matrix(rnorm(300), 100, 3)
y_train = x_train[,1]^2
x_test = matrix(rnorm(3000), 1000, 3)
y_test = x_test[,1]^2
fit_5 = netzuko(x_train, y_train, x_test, y_test, step_size = 0.003, iter = 200)
plot(fit_5$cost_train, type = "l")
lines(fit_5$cost_test, col = 2)

# Regression with two numeric outputs
y_train = cbind(y_train, x_train[,2]^2)
y_test = cbind(y_test, x_test[,2]^2)
fit_6 = netzuko(x_train, y_train, x_test, y_test, step_size = 0.003, iter = 200)
plot(fit_6$cost_train, type = "l")
lines(fit_6$cost_test, col = 2)

# The final test cost equals half the mean squared error of the predictions
pred_6 = predict(fit_6, x_test)
fit_6$cost_test[200]
mean(rowSums((y_test - pred_6)^2))/2

# With scale = TRUE, reproduce the final test cost by putting y_test and the
# predictions on the training scale with scale_matrix
fit_7 = netzuko(x_train, y_train, x_test, y_test, step_size = 0.01, iter = 500,
                scale = TRUE)
plot(fit_7$cost_train, type = "l")
lines(fit_7$cost_test, col = 2)
pred_7 = predict(fit_7, x_test)
fit_7$cost_test[500]
tmp = scale_matrix(y_train, intercept = FALSE)
mean(rowSums((scale_matrix(y_test, tmp$mean_x, tmp$sd_x, intercept = FALSE)$x -
              scale_matrix(pred_7, tmp$mean_x, tmp$sd_x, intercept = FALSE)$x)^2))/2