mistnet: Construct a mistnet model


Description

This function creates a network object for fitting a mistnet model.

Usage

mistnet(x, y, layer.definitions, loss, updater,
  sampler = gaussian.sampler(ncol = 10L, sd = 1), n.importance.samples = 25,
  n.minibatch = 25, training.iterations = 0, shuffle = TRUE,
  initialize.biases = TRUE, initialize.weights = TRUE)

Arguments

x

a numeric matrix of predictor variables. One row per record, one column per predictive feature.

y

a matrix of responses to x. One row per record, one column per response variable.

layer.definitions

a list of specifications for each layer in the network, as produced by defineLayer.

loss

a loss object, defining the function for optimization to minimize, as well as its gradient.

updater

an updater object, specifying how the model should move across the likelihood surface (e.g. stochastic gradient descent or adagrad).

sampler

a sampler object, specifying the distribution of the latent variables.

n.importance.samples

an integer number of importance samples used in each estimate of the likelihood gradient. More samples take more time to compute, but provide a more precise estimate.

n.minibatch

an integer specifying the number of rows to include in each stochastic estimate of the likelihood gradient.

training.iterations

an integer number of minibatches to process before terminating. Defaults to zero so that the user can adjust the network before training begins.

shuffle

logical. Should the data be shuffled after each epoch? Defaults to TRUE.

initialize.biases

logical. Should the network's final layer's biases be initialized to nonzero values? If TRUE, initial values will depend on the nonlinearity of the final layer. Otherwise, all values will be zero.

initialize.weights

logical. Should the weights in each layer be initialized automatically? If TRUE, each layer's weights will be sampled randomly from their priors. Otherwise, all values will be zero, which can prevent the network from learning.

Details

The mistnet function constructs a network object that defines a joint distribution over y given x. This distribution is defined by a stochastic feed-forward neural network (Neal 1992), which is trained using a variant of backpropagation described in Tang and Salakhutdinov (2013) and Harris (2014). During each training iteration, the model descends the gradient defined by its loss function, averaged over a number of Monte Carlo samples and a number of rows of data.
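The averaging step can be illustrated with a small sketch. This is not mistnet's internal code, just a toy Monte Carlo gradient estimate for a Bernoulli loss: several stochastic predictions are drawn for one minibatch of rows, and the per-sample loss gradients are averaged.

```r
# Conceptual sketch (not mistnet's internals): estimate the loss gradient
# for a minibatch by averaging over stochastic prediction samples.
set.seed(1)
y <- c(1, 0, 1)                                   # one minibatch of binary responses

# Gradient of the Bernoulli negative log-likelihood with respect to p
bernoulli.grad <- function(p, y) (p - y) / (p * (1 - p))

# 3 rows x 25 stochastic predictions (stand-ins for network output samples)
p.samples <- replicate(25, plogis(rnorm(3)))

# Average the gradient over the 25 Monte Carlo samples, one value per row
grad <- rowMeans(bernoulli.grad(p.samples, y))
```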

A network concatenates the predictor variables in x with random variables produced by a sampler and passes the resulting data vectors through one or more layer objects to make predictions about y. The weights and biases in each layer can be trained using the network's fit method (see example below).
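The concatenate-then-transform step can be sketched in a few lines of base R. This is a conceptual illustration only, with made-up weights and a single hypothetical layer; the real layer objects and their training are handled by the package.

```r
# Conceptual sketch (not mistnet's internals): one stochastic forward pass.
set.seed(1)
x <- matrix(rnorm(20), nrow = 4, ncol = 5)       # 4 records, 5 predictors
z <- matrix(rnorm(12), nrow = 4, ncol = 3)       # latent draws from a sampler

input <- cbind(x, z)                             # concatenate predictors and noise

W <- matrix(rnorm(16, sd = 0.1), nrow = 8, ncol = 2)  # (5 + 3) inputs -> 2 outputs
b <- rep(0, 2)                                        # biases
rectify <- function(a) pmax(a, 0)                     # rectified-linear nonlinearity

output <- rectify(sweep(input %*% W, 2, b, "+"))      # layer activation
dim(output)                                           # 4 rows, 2 outputs
```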

Note

network objects produced by mistnet are ReferenceClasses, and behave differently from other R objects. In particular, binding a network or other reference class object to a new variable name will not produce a copy of the original object, but will instead create a new alias for it.
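The aliasing behavior can be demonstrated with any ReferenceClass; the toy class below is purely illustrative, not part of mistnet. Assigning a reference object to a new name creates an alias, while the `$copy()` method (standard for ReferenceClasses) makes an independent copy.

```r
# Sketch of ReferenceClass aliasing, using a toy class (not a mistnet network)
Counter <- setRefClass("Counter",
  fields = list(n = "numeric"),
  methods = list(bump = function() { n <<- n + 1 })
)

a <- Counter$new(n = 0)
b <- a          # b is an alias for a, not a copy
b$bump()
a$n             # 1: modifying b also modified a

d <- a$copy()   # an independent copy must be requested explicitly
d$bump()
a$n             # still 1: d is detached from a
```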

References

Harris, D.J. (2014) Building realistic assemblages with a Joint Species Distribution Model. bioRxiv preprint. http://dx.doi.org/10.1101/003947

Neal, R.M. (1992) Connectionist learning of belief networks. Artificial Intelligence, 56, 71-113.

Tang, Y. & Salakhutdinov, R. (2013) Learning Stochastic Feedforward Neural Networks. Advances in Neural Information Processing Systems 26 (eds C.J.C. Burges, L. Bottou, M. Welling, Z. Ghahramani & K.Q. Weinberger), pp. 530-538.

See Also

network

layer

Examples

# 107 rows of fake data
x = matrix(rnorm(1819), nrow = 107, ncol = 17) 
y = dropoutMask(107, 14)

# Create the network object
net = mistnet(
  x = x,
  y = y,
  layer.definitions = list(
    defineLayer(
      nonlinearity = rectify.nonlinearity(), 
      size = 30, 
      prior = gaussian.prior(mean = 0, sd = 0.1)
    ),
    defineLayer(
      nonlinearity = rectify.nonlinearity(), 
      size = 12, 
      prior = gaussian.prior(mean = 0, sd = 0.1)
    ),
    defineLayer(
      nonlinearity = sigmoid.nonlinearity(), 
      size = ncol(y), 
      prior = gaussian.prior(mean = 0, sd = 0.1)
    )
  ),
  loss = bernoulliLoss(),
  updater = adagrad.updater(learning.rate = .01),
  sampler = gaussian.sampler(ncol = 10L, sd = 1),
  n.importance.samples = 30,
  n.minibatch = 10,
  training.iterations = 0
)

# Fit the model
net$fit(iterations = 10)

predict(net, newdata = x, n.importance.samples = 10)

davharris/mistnet documentation built on May 14, 2019, 9:28 p.m.