knitr::opts_chunk$set(echo = TRUE)

library(NNS)
library(data.table)
require(knitr)
require(rgl)
require(meboot)

Below are some examples demonstrating unsupervised learning with NNS clustering and nonlinear regression using the resulting clusters. As always, for a more thorough description and definition, please view the References.

# NNS Partitioning `NNS.part()`

**`NNS.part`** is both a partitional and hierarchical clustering method. `NNS` iteratively partitions the joint distribution into partial moment quadrants, and then assigns a quadrant identification (1:4) at each partition.

**`NNS.part`** returns a `data.table` of observations along with their final quadrant identification. It also returns the regression points, which are the quadrant means used in `NNS.reg`.

x = seq(-5, 5, .05); y = x ^ 3
for(i in 1 : 4){NNS.part(x, y, order = i, Voronoi = TRUE, obs.req = 0)}
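To make the idea of quadrant means more concrete, here is a minimal base R sketch of a single partitioning step. It is an illustration only, not the `NNS.part` implementation: the helper `quadrant_id()` and its 1:4 labelling are assumptions made for this example, and the actual quadrant numbering used by `NNS` may differ.

```r
# Toy data matching the example above
x <- seq(-5, 5, 0.05)
y <- x^3

# One partitioning step: split the observations on the means of x and y.
# quadrant_id() is a hypothetical helper; its labelling is arbitrary.
quadrant_id <- function(x, y) {
  right <- x > mean(x)
  upper <- y > mean(y)
  1L + as.integer(right) + 2L * as.integer(upper)
}

quad <- quadrant_id(x, y)

# The mean of each occupied quadrant plays the role of a regression point
reg_points <- aggregate(cbind(x, y) ~ quad,
                        data = data.frame(x, y, quad),
                        FUN = mean)
reg_points
```

Repeating the same split inside each quadrant would give the next order of partitioning, which is the hierarchical aspect described above.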

**`NNS.part`** also offers a partitioning based on $x$ values only, via `NNS.part(x, y, type = "XONLY", ...)`.

for(i in 1 : 4){NNS.part(x, y, order = i, type = "XONLY", Voronoi = TRUE)}

Note the partition identifications are limited to 1's and 2's (left and right of the partition respectively), not the 4 values per the $x$ and $y$ partitioning.

NNS.part(x,y,order = 4, type = "XONLY")
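The 1's and 2's can be pictured with a short base R sketch that splits on the mean of $x$ alone and appends a left/right label at each level. The helper `x_only_ids()` and the `"q"` prefix are hypothetical names chosen for this illustration; this is not how `NNS.part` is implemented internally.

```r
x <- seq(-5, 5, 0.05)

# Recursively split each segment at its mean of x, appending
# 1 (left of the split) or 2 (right of the split) to the label.
# x_only_ids() is a hypothetical helper, not part of NNS.
x_only_ids <- function(x, order) {
  ids <- rep("q", length(x))
  for (level in seq_len(order)) {
    seg_mean <- ave(x, ids, FUN = mean)  # mean of x within each current segment
    ids <- paste0(ids, ifelse(x <= seg_mean, 1, 2))
  }
  ids
}

ids <- x_only_ids(x, order = 2)
table(ids)
```

Each additional order adds one more digit to the label, so only the digits 1 and 2 ever appear.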

The right column of plots shows the corresponding regression for the order of `NNS` partitioning.
```{r, results='hide'}
for(i in 1 : 3){
  NNS.part(x, y, order = i, obs.req = 0, Voronoi = TRUE)
  NNS.reg(x, y, order = i, ncores = 1)
}
```

# NNS Regression `NNS.reg()`

**`NNS.reg`** can fit any $f(x)$, for both uni- and multivariate cases. **`NNS.reg`** returns a self-evident list of values provided below.

## Univariate:

```r
NNS.reg(x, y, ncores = 1)
```

Multivariate regressions return a plot of $y$ and $\hat{y}$, as well as the regression points (`$RPM`) and partitions (`$rhs.partitions`) for each regressor.

f = function(x, y) x ^ 3 + 3 * y - y ^ 3 - 3 * x
y = x ; z <- expand.grid(x, y)
g = f(z[ , 1], z[ , 2])
NNS.reg(z, g, order = "max", plot = FALSE, ncores = 1)

`NNS.reg` can inter- or extrapolate any point of interest. The **`NNS.reg(x, y, point.est = ...)`** parameter accepts data of any size with the same dimensions as $x$, and the resulting estimates are retrieved with `NNS.reg(...)$Point.est`.

**`NNS.reg`** also provides a dimension reduction regression by including the parameter `NNS.reg(x, y, dim.red.method = "cor", ...)`. The equation used to reduce the regressors to a single dimension is returned by `NNS.reg(..., dim.red.method = "cor", ...)$equation`.

NNS.reg(iris[ , 1 : 4], iris[ , 5], dim.red.method = "cor", location = "topleft", ncores = 1)$equation

a = NNS.reg(iris[ , 1 : 4], iris[ , 5], dim.red.method = "cor", location = "topleft", ncores = 1, plot = FALSE)$equation

Thus, our model for this regression would be:
$$Species = \frac{`r round(a$Coefficient[1],3)`*Sepal.Length `r round(a$Coefficient[2],3)`*Sepal.Width + `r round(a$Coefficient[3],3)`*Petal.Length + `r round(a$Coefficient[4],3)`*Petal.Width}{4}$$

**`NNS.reg(x, y, dim.red.method = "cor", threshold = ...)`** offers a method of reducing regressors further by controlling the absolute value of required correlation.
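The thresholding idea can be sketched in base R as follows. This is a rough illustration rather than the `NNS.reg` internals: the variable names are invented for the example, and dividing by the number of retained regressors is an assumption about how the synthetic variable is scaled.

```r
# Correlation of each regressor with the numerically coded response
y_num <- as.numeric(iris[, 5])
cors  <- sapply(iris[, 1:4], function(col) cor(col, y_num))

# Zero out regressors whose absolute correlation falls below the threshold
threshold <- 0.75
weights   <- ifelse(abs(cors) >= threshold, cors, 0)

# Collapse the retained regressors into a single synthetic variable
# (assumption: scale by the number of retained regressors)
composite <- as.matrix(iris[, 1:4]) %*% weights / sum(weights != 0)

round(weights, 3)
```

With this threshold, `Sepal.Width` is the only regressor whose weight is set to zero, since its absolute correlation with the coded response is the weakest of the four.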

NNS.reg(iris[ , 1 : 4], iris[ , 5], dim.red.method = "cor", threshold = .75, location = "topleft", ncores = 1)$equation

a = NNS.reg(iris[ , 1 : 4], iris[ , 5], dim.red.method = "cor", threshold = .75, location = "topleft", ncores = 1, plot = FALSE)$equation

Thus, our model for this further reduced dimension regression would be:
$$Species = \frac{\: `r round(a$Coefficient[1],3)`*Sepal.Length + `r round(a$Coefficient[2],3)`*Sepal.Width + `r round(a$Coefficient[3],3)`*Petal.Length + `r round(a$Coefficient[4],3)`*Petal.Width}{3}$$

and the `point.est = (...)` argument operates in the same manner as the full regression above, again called with **`NNS.reg(...)$Point.est`**.

NNS.reg(iris[ , 1 : 4], iris[ , 5], dim.red.method = "cor", threshold = .75, point.est = iris[1 : 10, 1 : 4], location = "topleft", ncores = 1)$Point.est

For a classification problem, we simply set **`NNS.reg(x, y, type = "CLASS", ...)`**.

**NOTE: Base category of response variable should be 1, not 0 for classification problems.**

NNS.reg(iris[ , 1 : 4], iris[ , 5], type = "CLASS", point.est = iris[1 : 10, 1 : 4], location = "topleft", ncores = 1)$Point.est

# Cross-Validation `NNS.stack()`

The **`NNS.stack`** routine cross-validates, for a given objective function, the `n.best` parameter in the multivariate `NNS.reg` function as well as the `threshold` parameter in the dimension reduction `NNS.reg` version. `NNS.stack` can be used for classification:

`NNS.stack(..., type = "CLASS", ...)`

or continuous dependent variables:

**`NNS.stack(..., type = NULL, ...)`**.

Any objective function `obj.fn` can be called using `expression()` with the terms `predicted` and `actual`.
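For readers unfamiliar with `expression()`, the following base R sketch shows one way such an objective can be evaluated against specific vectors. The data are made up for illustration, and the use of `eval()` here is an assumption about the general mechanism, not a description of what `NNS.stack` does internally.

```r
# Hypothetical predictions and class labels, for illustration only
predicted <- c(1.2, 1.9, 3.1, 2.6, 1.0)
actual    <- c(1, 2, 3, 2, 1)

# An objective function stored as an unevaluated expression
obj.fn <- expression(mean(round(predicted) == actual))

# Evaluate the expression with `predicted` and `actual` bound to our vectors
accuracy <- eval(obj.fn, list(predicted = predicted, actual = actual))
accuracy  # 0.8: four of the five rounded predictions match
```

Because the expression refers to `predicted` and `actual` only by name, the same objective can be reused on any pair of vectors, which is what makes it convenient for cross-validation.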

NNS.stack(IVs.train = iris[ , 1 : 4], DV.train = iris[ , 5], IVs.test = iris[1 : 10, 1 : 4], dim.red.method = "cor", obj.fn = expression( mean(round(predicted) == actual) ), objective = "max", type = "CLASS", folds = 1, ncores = 1)

Folds Remaining = 0
Current NNS.reg(... , threshold = 0.95 ) MAX Iterations Remaining = 2
Current NNS.reg(... , threshold = 0.78 ) MAX Iterations Remaining = 1
Current NNS.reg(... , threshold = 0.415 ) MAX Iterations Remaining = 0
Current NNS.reg(... , n.best = 1 ) MAX Iterations Remaining = 12
Current NNS.reg(... , n.best = 2 ) MAX Iterations Remaining = 11
Current NNS.reg(... , n.best = 3 ) MAX Iterations Remaining = 10
Current NNS.reg(... , n.best = 4 ) MAX Iterations Remaining = 9
Current NNS.reg(... , n.best = 5 ) MAX Iterations Remaining = 8
Current NNS.reg(... , n.best = 6 ) MAX Iterations Remaining = 7
Current NNS.reg(... , n.best = 7 ) MAX Iterations Remaining = 6
Current NNS.reg(... , n.best = 8 ) MAX Iterations Remaining = 5
Current NNS.reg(... , n.best = 9 ) MAX Iterations Remaining = 4
Current NNS.reg(... , n.best = 10 ) MAX Iterations Remaining = 3
Current NNS.reg(... , n.best = 11 ) MAX Iterations Remaining = 2
Current NNS.reg(... , n.best = 12 ) MAX Iterations Remaining = 1
Current NNS.reg(... , n.best = 150 ) MAX Iterations Remaining = 0

$OBJfn.reg
[1] 0.9733333

$NNS.reg.n.best
[1] 1

$probability.threshold
[1] 0.5

$OBJfn.dim.red
[1] 0.96

$NNS.dim.red.threshold
[1] 0.78

$reg
[1] 1 1 1 1 1 1 1 1 1 1

$dim.red
[1] 1 1 1 1 1 1 1 1 1 1

$stack
[1] 1 1 1 1 1 1 1 1 1 1

Because multicollinearity is not an issue for nonparametric regressions as it is for OLS, a better option for an ill-fit univariate model may be to increase the dimensionality of the regressors with a copy of the regressor itself and cross-validate the number of clusters `n.best` via:

**`NNS.stack(IVs.train = cbind(x, x), DV.train = y, method = 1, ...)`**.

set.seed(123)
x = rnorm(100); y = rnorm(100)
nns.params = NNS.stack(IVs.train = cbind(x, x), DV.train = y, method = 1, ncores = 1)

set.seed(123)
x = rnorm(100); y = rnorm(100)
nns.params = list()
nns.params$NNS.reg.n.best = 9

NNS.reg(cbind(x, x), y, n.best = nns.params$NNS.reg.n.best, point.est = cbind(x, x), residual.plot = TRUE, ncores = 1)

If the user is so motivated, detailed arguments and further examples are provided within the following:

