gd: Gradient Descent.
In jeksterslabds/jeksterslabRds: Jekster's Lab Data Science and Statistics

Implements the gradient descent algorithm. Weights are updated using the following equation \mathbf{w} \rightarrow \mathbf{w} η Δ where Δ = f(\mathbf{Xw}) - \mathbf{y} and f is the activation function.

1	gd(X, y, aFUN, eta, epochs, criteria = 1e-08, final = TRUE, ...)

`X`	The data matrix, that is an n \times k matrix of n observations of k regressors, which includes a regressor whose value is 1 for each observation.
`y`	n \times 1 vector of observations on the dependent variable.
`aFUN`	Activation function.
`eta`	η learning rate.
`epochs`	Number of iterations.
`criteria`	Stopping criteria. The algorithm stops if the sum of the absolute values of delta is less than `criteria`.
`final`	Logical. If `TRUE`, returns the value of `w` for the final iteration. If `FALSE`, returns all values of `w` from the random start to the final iteration.
`...`	Arguments to pass to `aFUN`.

If final is TRUE, returns a vector of k estimated weights w for the final iteration. If final is FALSE, returns all the values of w from the random start to the final iteration.