Neural Network with MXNet in Five Minutes

This is the first tutorial for new users of the R package mxnet. You will learn to construct a neural network to do regression in 5 minutes.

We will walk through a classification task and a regression task. The data we use comes from the package mlbench.

Classification

First of all, let us load in the data and preprocess it:

require(mlbench)
require(mxnet)

data(Sonar, package = "mlbench")

# Convert the factor labels (M/R) to zero-indexed numeric labels (0/1),
# as required by the softmax output
Sonar[, 61] <- as.numeric(Sonar[, 61]) - 1
# Split into training and test sets by row index
train.ind <- c(1:50, 100:150)
train.x <- data.matrix(Sonar[train.ind, 1:60])
train.y <- Sonar[train.ind, 61]
test.x <- data.matrix(Sonar[-train.ind, 1:60])
test.y <- Sonar[-train.ind, 61]

Next we are going to use a multi-layer perceptron (MLP) as our classifier. mxnet provides a function called mx.mlp so that users can build a general multi-layer neural network to do classification (out_activation="softmax") or regression (out_activation="rmse"). Note that for the softmax activation the labels must be zero-indexed, not one-indexed, which is why we subtracted 1 above. In the data we use:

table(train.y)
table(test.y)
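As an aside, the generic recipe for turning any factor label into the zero-indexed form that softmax expects is as.numeric(f) - 1; here is a tiny illustration with a hypothetical factor:

f <- factor(c("M", "R", "M"))  # hypothetical labels with the same levels as Sonar's
as.numeric(f) - 1              # yields 0 1 0: levels map to 0, 1, ... in level order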

There are several parameters we have to feed to mx.mlp: the training data and labels, the number of nodes in each hidden layer (hidden_node), the number of output nodes (out_node), the output activation (out_activation), and further training parameters such as the number of rounds (num.round), the batch size (array.batch.size), the learning rate, the momentum, and the evaluation metric (eval.metric).

The following code piece shows a possible usage of mx.mlp:

mx.set.seed(0)
model <- mx.mlp(train.x, train.y, hidden_node=10, out_node=2, out_activation="softmax",
                num.round=20, array.batch.size=15, learning.rate=0.07, momentum=0.9, 
                eval.metric=mx.metric.accuracy)

Note that mx.set.seed, rather than R's own set.seed, is the correct function to control the random processes inside mxnet. You can see the accuracy in each round during training. It is also easy to make predictions and evaluate them.
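For example (illustrative values only), a reproducible run is obtained like this:

# set.seed(42)   # seeds R's own RNG only; mxnet's backend RNG is unaffected
mx.set.seed(42)  # seeds mxnet's backend RNG (e.g. weight initialization)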

To get an idea of what is happening, we can easily view the computation graph from R.

graph.viz(model$symbol)
preds <- predict(model, test.x)
pred.label <- max.col(t(preds)) - 1
table(pred.label, test.y)

Note that for multi-class prediction, mxnet outputs an nclass x nexamples matrix, where each row corresponds to the predicted probability of that class.
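As a quick follow-up, the overall test accuracy can be read off directly from the predicted labels:

# Fraction of test examples whose predicted label matches the true label
mean(pred.label == test.y)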

Regression

Again, let us preprocess the data first.

data(BostonHousing, package = "mlbench")

# Column 14 (medv) is the regression target; use every third row for training
train.ind <- seq(1, 506, 3)
train.x <- data.matrix(BostonHousing[train.ind, -14])
train.y <- BostonHousing[train.ind, 14]
test.x <- data.matrix(BostonHousing[-train.ind, -14])
test.y <- BostonHousing[-train.ind, 14]

Although we can use mx.mlp again to do regression by changing the out_activation, this time we are going to introduce a more flexible way to configure neural networks in mxnet: the "Symbol" system, which takes care of the links among nodes, the activations, the dropout ratio, and so on. To configure a multi-layer neural network, we can do it in the following way:

# Define the input data
data <- mx.symbol.Variable("data")
# A fully connected layer
# data: input source
# num_hidden: number of neurons in this layer (here a single output neuron)
fc1 <- mx.symbol.FullyConnected(data, num_hidden=1)

# Use linear regression for the output layer
lro <- mx.symbol.LinearRegressionOutput(fc1)

What matters most for a regression task is the last function, LinearRegressionOutput: it makes the network optimize the squared loss. In this configuration we dropped the hidden layer, so the input layer is directly connected to the output layer. We can now train on this simple data set.
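If you do want a hidden layer, the same Symbol API composes one explicitly. Here is a minimal sketch; the layer size and relu activation are illustrative choices and are not used in the rest of this tutorial:

# Hypothetical deeper variant: one hidden layer with a relu activation
fc <- mx.symbol.FullyConnected(data, num_hidden=10)
act <- mx.symbol.Activation(fc, act_type="relu")
out <- mx.symbol.FullyConnected(act, num_hidden=1)
lro.deep <- mx.symbol.LinearRegressionOutput(out)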

Next we can train a model with this structure and other parameters using mx.model.FeedForward.create:

mx.set.seed(0)
model <- mx.model.FeedForward.create(lro, X=train.x, y=train.y,
                                     ctx=mx.cpu(), num.round=50, array.batch.size=20,
                                     learning.rate=2e-6, momentum=0.9, eval.metric=mx.metric.rmse)

It is also easy to make predictions and evaluate them:

preds <- predict(model, test.x)
# Root mean squared error on the test set
sqrt(mean((preds - test.y)^2))

Currently there are four pre-defined metrics: "accuracy", "rmse", "mae" and "rmsle". One might wonder how to customize the evaluation metric. mxnet provides an interface for users to define their own metric of interest:

demo.metric.mae <- mx.metric.custom("mae", function(label, pred) {
  res <- mean(abs(label-pred))
  return(res)
})

This is an example for mean absolute error. We can simply plug it into the training function:

mx.set.seed(0)
model <- mx.model.FeedForward.create(lro, X=train.x, y=train.y,
                                     ctx=mx.cpu(), num.round=50, array.batch.size=20,
                                     learning.rate=2e-6, momentum=0.9, eval.metric=demo.metric.mae)

In the previous examples, our target was to predict the last column ("medv") in the dataset. It is also possible to build a regression model with multiple outputs. This time we use the last two columns as the targets:

train.x <- data.matrix(BostonHousing[train.ind, -(13:14)])
train.y <- BostonHousing[train.ind, c(13:14)]
test.x <- data.matrix(BostonHousing[-train.ind, -(13:14)])
test.y <- BostonHousing[-train.ind, c(13:14)]

and build a similar network symbol:

data <- mx.symbol.Variable("data")
# Two output neurons, one per target column
fc2 <- mx.symbol.FullyConnected(data, num_hidden=2)
lro2 <- mx.symbol.LinearRegressionOutput(fc2)

We use mx.io.arrayiter to build an iterator for our training set and train the model using mx.model.FeedForward.create:

mx.set.seed(0)
# mxnet stores examples along columns, hence the transposes
train_iter <- mx.io.arrayiter(data = t(train.x), label = t(train.y))

model <- mx.model.FeedForward.create(lro2, X=train_iter,
                                     ctx=mx.cpu(), num.round=50, array.batch.size=20,
                                     learning.rate=2e-6, momentum=0.9)

After training, we can see that the dimensions of the predictions are the same as those of our targets:

preds <- t(predict(model, test.x))
dim(preds)
dim(test.y)
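As a small follow-up sketch using the objects above, the RMSE of each of the two targets can be computed column by column:

# Per-target RMSE; as.matrix aligns the data.frame targets with the prediction matrix
sqrt(colMeans((preds - as.matrix(test.y))^2))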

Congratulations! Now you have learned the basics of using mxnet. Please check the other tutorials for advanced features.


