fenn: Free Energy Nearest Neighbor (FENN)
In kouroshz/fenn: Free Energy Nearest Neighbor (FENN)


View source: R/fenn.R

Fits a von Neumann entropy-penalized distance metric learning model.

fenn(x, ...)

## S3 method for class 'formula'
fenn(formula, data, ..., subset, na.action)

## Default S3 method:
fenn(x, grouping, prior = proportions, tol = 1.0e-4, ...)
                      
## S3 method for class 'data.frame'
fenn(x, ...)

## S3 method for class 'matrix'
fenn(x, grouping, ..., subset, na.action)

`x`	(required if no formula is given as the principal argument.) a matrix or data frame or Matrix containing the explanatory variables.
`...`	arguments passed to or from other methods.
`formula`	A formula of the form `groups ~ x1 + x2 + ...`. That is, the response is the grouping factor and the right-hand side specifies the (non-factor) discriminators.
`data`	Data frame from which variables specified in `formula` are preferentially to be taken.
`grouping`	(required if no formula principal argument is given.) a factor specifying the class for each observation.
`prior`	the prior probabilities of class membership. If unspecified, the class proportions for the training set are used. If present, the probabilities should be specified in the order of the factor levels.
`tol`	A tolerance to decide if a matrix is singular; it is used to modify the scatter matrices by stabilizing eigenvalues less than `tol^2`.
`subset`	An index vector specifying the cases to be used in the training sample. (NOTE: If given, this argument must be named.)
`na.action`	A function to specify the action to be taken if `NA`s are found. The default action is for the procedure to fail. An alternative is `na.omit`, which leads to rejection of cases with missing values on any required variable. (NOTE: If given, this argument must be named.)
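The effect of `subset` and `na.action` can be illustrated with base R alone. The sketch below uses `stats::model.frame()` on a small made-up data frame (`d` and its columns are hypothetical); it assumes that `fenn` treats these two arguments in the usual formula-interface way, which the source does not state explicitly.

```r
# Toy data with one missing value (hypothetical, for illustration only).
d <- data.frame(g  = factor(c("a", "a", "b", "b")),
                x1 = c(1.0, 2.0, NA, 4.0),
                x2 = c(0.5, 0.1, 0.3, 0.9))

# na.omit rejects every case with a missing value in a required variable.
mf_omit <- model.frame(g ~ x1 + x2, data = d, na.action = na.omit)
nrow(mf_omit)  # 3 rows remain

# na.fail makes the call stop with an error instead.
res <- try(model.frame(g ~ x1 + x2, data = d, na.action = na.fail),
           silent = TRUE)
inherits(res, "try-error")  # TRUE

# subset is an index vector of the cases to keep; both arguments are named.
mf_sub <- model.frame(g ~ x1 + x2, data = d,
                      subset = c(1, 2, 4), na.action = na.omit)
nrow(mf_sub)  # 3 rows (cases 1, 2 and 4)
```

Naming both arguments avoids them being swallowed by `...` and passed on to other methods.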

The function solves a von Neumann entropy-penalized distance metric learning problem to identify informative features and directions of maximum dissimilarity in the multi-class case. The method automatically selects the value of the entropy tuning parameter by maximizing the Fisher information. It can be used in the single-class case to identify informative directions, and in the multi-class case to identify directions of maximum dissimilarity. In the multi-class case, the optimal solution is an optimally scaled LDA for maximum separability between classes, which can result in more accurate classification. These directions are referred to as FENN directions.
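For orientation, the standard definitions behind the terms used above can be written out as follows. This is the textbook form only; the symbols `\rho`, `H`, and `F_\mu` are notation introduced here, and the exact objective optimized by `fenn` is not given in this page, so reading `mu` as the temperature in these formulas is an assumption.

```latex
% von Neumann entropy of a density matrix \rho
% (symmetric positive semi-definite, trace 1) with eigenvalues \lambda_i:
S(\rho) = -\operatorname{tr}(\rho \log \rho) = -\sum_i \lambda_i \log \lambda_i

% Free energy of \rho for a Hamiltonian H at temperature \mu > 0:
F_\mu(\rho) = \operatorname{tr}(\rho H) - \mu \, S(\rho)

% Its minimizer is the Gibbs state, with average energy E(\mu):
\rho_\mu = \frac{\exp(-H/\mu)}{\operatorname{tr}\exp(-H/\mu)},
\qquad
E(\mu) = \operatorname{tr}(\rho_\mu H)
```

Under this reading, the returned component `E` would correspond to `E(\mu)` evaluated along `muVec`.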

Specifying the prior will affect the classification unless overridden in `predict.fenn`.

An object of class "fenn" containing the following components:

`prior`	The prior probabilities used.
`counts`	The group counts.
`means`	The group means.
`X.S`	The matrix of projections into similarity directions. Equal to the average of the within-class covariance matrices.
`X.D`	The matrix of projections into dissimilarity directions. Equal to `X.S + (n/(n-1)) * Cov(m)`, where `m` denotes the class means.
`X_D_neg_1_2`	The matrix `X.D` raised to the power -1/2. This is the scaling that is applied to data points prior to tilde transformation.
`S_1_2`	The learned optimal transformation that should be applied to tilde transformed data points.
`X_tilde_S`	The Hamiltonian generated from data points.
`informative.dims`	The index of informative directions in the optimal space. Can be used for dimension reduction.
`muVec`	The automatically selected path of `mu` (temperature) values.
`best.mu`	The optimal `mu` (temperature) parameter obtained by maximizing the Fisher information.
`E`	The average Energy for an automatically selected path of `mu` values `muVec`.
`dE`	The Fisher Information of `mu`.
`scaling`	The weights (eigenvalues) of maximum dissimilarity directions (eigenvectors of `S_1_2`).
`x.tilde`	The tilde transformed data.
`x.fenn`	The `fenn` transformed data.
`N`	The number of observations used.
`groupings`	The class variable of original data points.
`call`	The (matched) function call.
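Two of the quantities described above, the average within-class covariance and a matrix raised to the power -1/2, can be sketched in base R. This is an illustration under stated assumptions, not the package's internal code: `inv_sqrt` is a hypothetical helper, the eigenvalue floor at `tol^2` is one possible reading of the `tol` argument, and the power is applied to the within-class matrix here only to keep the example self-contained.

```r
# Illustrative only: base R on the built-in iris data, not the fenn internals.
X <- as.matrix(iris[, 1:4])
g <- iris$Species

# Average of the per-class covariance matrices (the description given for X.S).
covs <- lapply(split(as.data.frame(X), g), cov)
X_S  <- Reduce(`+`, covs) / length(covs)

# Inverse square root of a symmetric matrix via its eigendecomposition.
# Assumption: eigenvalues below tol^2 are floored at tol^2 for stability.
inv_sqrt <- function(A, tol = 1e-4) {
  e    <- eigen(A, symmetric = TRUE)
  vals <- pmax(e$values, tol^2)
  e$vectors %*% diag(1 / sqrt(vals), nrow = length(vals)) %*% t(e$vectors)
}

W <- inv_sqrt(X_S)

# Sanity check: W %*% X_S %*% W should be numerically the identity matrix.
err <- max(abs(W %*% X_S %*% W - diag(ncol(X))))
err
```

The same helper could be applied to a matrix such as `X.D` to obtain the kind of whitening transform described for `X_D_neg_1_2`.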

This function may be called giving either a formula and optional data frame, or a matrix and grouping factor as the first two arguments. All other arguments are optional, but subset= and na.action=, if required, must be fully named.

If a formula is given as the principal argument, the object may be modified using `update()` in the usual way.

predict.fenn

library(fenn)

Iris <- data.frame(rbind(iris3[,,1], iris3[,,2], iris3[,,3]),
                   Sp = rep(c("s","c","v"), rep(50,3)))
train <- sample(1:150, 75)
table(Iris$Sp[train])
## your answer may differ
##  c  s  v
## 22 23 30
z <- fenn(Sp ~ ., Iris, prior = c(1,1,1)/3, subset = train)
predict(z, Iris[-train, ])$class
##  [1] s s s s s s s s s s s s s s s s s s s s s s s s s s s c c c
## [31] c c c c c c c v c c c c v c c c c c c c c c c c c v v v v v
## [61] v v v v v v v v v v v v v v v
(z1 <- update(z, . ~ . - Petal.W.))