Mistnet models can require a lot of tuning. A few suggestions:

- Center and scale your predictor variables (x). Otherwise, the model will expect variables with larger means or larger ranges to play a disproportionate role in your predictions. Consider other preprocessing steps, such as independent components analysis, as well.
- Use the update method periodically---especially when first applying mistnet to a new problem---to share information among weights within the same layer, as in multilevel/hierarchical modeling or empirical Bayes.
- Use the adagrad.updater for routine work, and the sgd.updater when you can afford to do lots of tuning. For the sgd.updater, start with a momentum of 0.8 and see if you can get better results with 0.9 or 0.95.
- n.minibatch can range from 10 to 100. The default of 25 should work well in most cases. Larger minibatches will lead to more accurate gradient estimates, at the cost of additional computation per update.
- The choice of sampler is very problem-dependent. The default (a ten-dimensional isotropic Gaussian) seems to work for modeling species assemblages, but much higher dimensions could be necessary for other applications.
- n.importance.samples involves another speed-accuracy tradeoff. 25 is probably good for many applications.
- The final layer's nonlinearity should match the loss function you're optimizing. For the other layers, rectify nonlinearities tend to work best.
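As a sketch only, the advice above can be collected into a single model call. The argument names and helper constructors here (mistnet, defineLayer, rectify.nonlinearity, gaussian.prior, bernoulliLoss, adagrad.updater, gaussian.sampler) follow the package's README example as best I recall it; layer sizes, priors, and the learning rate are illustrative values, not recommendations, and everything should be checked against your installed version's documentation.

```r
library(mistnet)

# Center and scale predictors so no variable dominates by virtue of its units.
x <- scale(x)

# Illustrative mistnet call wiring together the tuning knobs discussed above.
net <- mistnet(
  x = x,
  y = y,
  layer.definitions = list(
    defineLayer(
      nonlinearity = rectify.nonlinearity(),   # hidden layer: rectifier
      size = 50,
      prior = gaussian.prior(mean = 0, sd = 0.1)
    ),
    defineLayer(
      nonlinearity = sigmoid.nonlinearity(),   # final layer matches the loss
      size = ncol(y),
      prior = gaussian.prior(mean = 0, sd = 0.1)
    )
  ),
  loss = bernoulliLoss(),                      # sigmoid output pairs with Bernoulli loss
  updater = adagrad.updater(learning.rate = 0.1),  # adagrad for routine work
  sampler = gaussian.sampler(ncol = 10L, sd = 1),  # ten-dimensional isotropic Gaussian
  n.importance.samples = 25,                   # the speed-accuracy defaults noted above
  n.minibatch = 25
)
```

To follow the sgd.updater advice instead, the momentum would presumably be passed to its constructor (e.g. a momentum argument starting at 0.8); check the updater's help page for the exact argument names in your version.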