snnR: snnR


Description

The snnR function fits a sparse neural network by minimizing the squared error subject to a penalty on the L1-norm of the parameters (weights and biases). The penalty counters the over-parameterization of neural networks and can improve predictive performance. The optimizer chooses the subgradient with minimum norm as a steepest-descent direction and uses an active-set method that sets many of the parameters to exactly zero.
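
As an illustration of the idea (not the package's internal routine), the minimum-norm subgradient of an L1-penalized loss can be computed as below; grad, w and lambda are hypothetical inputs:

# Illustration only: minimum-norm subgradient of f(w) = L(w) + lambda * sum(|w|)
min_norm_subgrad <- function(grad, w, lambda) {
  g <- grad + lambda * sign(w)                # unique subgradient where w != 0
  zero <- (w == 0)
  # at w = 0 the subdifferential is grad + lambda * [-1, 1];
  # its minimum-norm element is the soft-thresholded gradient
  g[zero] <- sign(grad[zero]) * pmax(abs(grad[zero]) - lambda, 0)
  g
}
# parameters at zero with |gradient| <= lambda receive a zero subgradient,
# which is what lets an active-set method keep them at exactly zero
min_norm_subgrad(grad = c(0.3, -2, 0.05), w = c(0, 1.5, 0), lambda = 0.5)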

Usage

snnR(x, y, nHidden, normalize = FALSE, verbose = TRUE, optimtol = 1e-5,
     prgmtol = 1e-9, iteramax = 100, decsuff = 1e-4, lambda)

Arguments

x

(numeric, n x p) incidence matrix.

y

(numeric, n) the response data-vector (NAs not allowed).

nHidden

(positive integer, 1 x h) matrix; h is the number of hidden layers and nHidden[1, k] gives the number of neurons in the k-th hidden layer.

normalize

logical, if TRUE the output is normalized; the default is FALSE.

verbose

logical, if TRUE a detailed training history is printed; the default is TRUE.

optimtol

numeric, a small tolerance used to check convergence of the subgradients.

prgmtol

numeric, a small tolerance used to check convergence of the network parameters.

iteramax

positive integer, the maximum number of epochs (iterations) for training; the default is 100.

decsuff

numeric, a small tolerance used to check the change (sufficient decrease) of the loss function.

lambda

numeric, the Lagrange multiplier for the L1-norm penalty.
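
As an illustration of how these arguments fit together (the values below are arbitrary, and x and y stand for the user's data), a fully specified call might look like:

nHidden <- matrix(c(10, 10), 1, 2)   # two hidden layers with 10 neurons each
fit <- snnR(x = x, y = y, nHidden = nHidden,
            normalize = TRUE, verbose = FALSE,
            optimtol = 1e-5, prgmtol = 1e-9,
            iteramax = 100, decsuff = 1e-4, lambda = 0.1)   # lambda value is illustrative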

Details

The software fits sparse deep neural networks, including the two-layer network described in Gianola et al. (2011). The two-layer network model is given by:

y_i = g(\boldsymbol{x}_i) + e_i = \sum_{k=1}^{s} w_k \, g_k\left(b_k + \sum_{j=1}^{p} x_{ij} \beta_j^{[k]}\right) + e_i, \quad i = 1, \ldots, n

where y_i is the response of the i-th observation, \boldsymbol{x}_i = (x_{i1}, \ldots, x_{ip}) is its vector of predictors, s is the number of neurons in the hidden layer, w_k are the output-layer weights, g_k are the activation functions, b_k are the biases, \beta_j^{[k]} is the input-layer weight connecting predictor j to neuron k, and e_i is the residual.

To estimate a sparse single-layer neural network (SLNN), the parameters are trained by minimizing the least-squares error subject to the sums of the absolute values of the parameters being bounded by constants:

\min_{\mathbf{W}, \mathbf{b}, \boldsymbol{\beta}} \hat{L}(\mathbf{W}, \mathbf{b}, \boldsymbol{\beta}) = \sum_{i=1}^{n} \left( \mu + \mathbf{W}^{T} \mathbf{g}\left(\mathbf{b} + \boldsymbol{\beta} \mathbf{x}_i\right) - y_i \right)^2

\text{subject to} \quad \sum_{k=1}^{\mathcal{S}} \sum_{j=1}^{p} |\beta_j^{[k]}| \le t_1, \quad \sum_{k=1}^{\mathcal{S}} |W_k| \le t_2, \quad \sum_{k=1}^{\mathcal{S}} |b_k| \le t_3.
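
As a sketch only (not the internal implementation of snnR), the penalized, Lagrangian form of this objective for a single hidden layer can be written in R as follows; W, b, beta and mu are hypothetical parameter objects and the activation g is taken to be tanh:

penalized_loss <- function(W, b, beta, mu, x, y, lambda, g = tanh) {
  # x: n x p predictors, beta: s x p input weights, b: length-s biases,
  # W: length-s output weights, mu: intercept
  hidden <- g(sweep(x %*% t(beta), 2, b, "+"))   # n x s hidden-layer activations
  yhat   <- mu + hidden %*% W                    # fitted values
  sum((yhat - y)^2) +
    lambda * (sum(abs(beta)) + sum(abs(W)) + sum(abs(b)))   # L1 penalty on all parameters
}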

Value

An object of class "snnR". Its structure is mostly internal, but it is a list containing:

$wDNNs

A list containing weights and biases.

$inputwgts

A list containing input weights and biases.

$outputwgts

A list containing output weights and biases.

$hidewgts

A list containing hidden weights and biases.

$Mse

The mean squared error between observed and predicted values.

$message

String that indicates the stopping criteria for the training process.
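
For example (illustrative only; fit denotes an object returned by snnR, as in the Examples section), these components can be inspected with:

str(fit$wDNNs)    # list of estimated weights and biases
fit$Mse           # mean squared error between observed and fitted values
fit$message       # stopping criterion that ended training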

References

Gianola, D., Okut, H., Weigel, K., and Rosa, G. 2011. Predicting complex quantitative traits with Bayesian neural networks: a case study with Jersey cows and wheat. BMC Genetics, 12(1), 87-92.

Perez-Rodriguez, P., Gianola, D., Weigel, K. A., Rosa, G. J., and Crossa, J. 2013. Technical note: an R package for fitting Bayesian regularized neural networks with applications in animal breeding. Journal of Animal Science, 91(8), 3522-3531.

Krishnan, D., Lin, P., and Yip, A. M. 2007. A primal-dual active-set method for non-negativity constrained total variation deblurring problems. IEEE Transactions on Image Processing, 16(11), 2766-2777.

Nocedal, J. 1980. Updating quasi-Newton matrices with limited storage. Mathematics of Computation, 35(35), 773-782.

Wang, Y., Lin, P., and Wang, L. 2012. Exponential stability of reaction-diffusion high-order Markovian jump Hopfield neural networks with time-varying delays. Nonlinear Analysis B, 13(3), 1353-1361.

Liang, X., Wang, L., Wang, Y., and Wang, R. 2015. Dynamical behavior of delayed reaction-diffusion Hopfield neural networks driven by infinite dimensional Wiener processes. IEEE Transactions on Neural Networks, 27(9), 1816-1826.

Examples


#Load the library
library(snnR)
###############################################################
#Example 1 
  #Nonlinear function  regression
  library(snnR)
  #Generate the data
  nsamples<-200
  nvariables<-1
  Xydata<-SimData("Nonlinearregress",nsamples,nvariables)
  x<-as.matrix(Xydata$X) 
  y<-as.vector(Xydata$y)
  #Generate the structure of neural network
  #5 hidden layers and 5 or 15 neurons in each layer
  nHidden <- matrix(c(5,5,15,5,5),1,5)
  # call function to train the sparse neural network
  network=snnR(x=x,y=y,nHidden=nHidden)
  # test data
  X_test<-as.matrix(seq(-5,5,0.05))
  #  predictive results
  yhat=predict(network,X_test)
  split.screen(c(1,2))
  screen(1)
  plot(x,y)
  screen(2)
  plot(X_test,yhat)
  ### please install R package NeuralNetTools to show the optimal structure of NN
  ### and use the following commands
   #library(NeuralNetTools)
   #optstru=write.NeuralNetTools(w =network$wDNNs,nHidden =nHidden ,x = x,y=y )
   #plotnet(optstru$w_re,struct = optstru$structure)

###############################################################
#Example 2
  #Jersey dataset
  data(Jersey) 
  #Fit the model with additive effects
  y<-as.vector(pheno$yield_devMilk)
  X_test<-G[partitions==2,]
  X_train<-G[partitions!=2,]
  y_test<-y[partitions==2]
  y_train<-y[partitions!=2]
  #Generate the structure of neural network   
  nHidden <- matrix(c(5,5,5),1,3)
  #call function to train the sparse neural network
  network=snnR(x=X_train,y=y_train,nHidden=nHidden,iteramax =10,normalize=TRUE)
  #predictive results
  yhat= predict(network,X_test)
  plot(y_test,yhat)
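  # a quick (optional) check of predictive ability on the test set
  cor(y_test, yhat)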
