Classification task. The Neighborhood measures analyze the neighborhoods of the data items and try to capture class overlapping and the shape of the decision boundary. They work over a distance matrix storing the distances between all pairs of data points in the dataset.
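As a rough illustration of the distance matrix these measures start from, the sketch below builds one with base R. The Euclidean metric is an assumption made for the example; the metric the package actually uses is not stated here.

```r
# Pairwise distance matrix over the input attributes (Euclidean is an assumption).
data(iris)
d <- as.matrix(dist(iris[, 1:4]))

dim(d)      # one row and one column per example
d[1:3, 1:3] # distances among the first three examples
```

Every sketch that follows in this page rebuilds this matrix internally so that each one can be run on its own.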
neighborhood(...)

## Default S3 method:
neighborhood(x, y, measures = "all", ...)

## S3 method for class 'formula'
neighborhood(formula, data, measures = "all", ...)
... | Not used.
x | A data.frame containing only the input attributes.
y | A factor response vector with one label for each row/component of x.
measures | A list of measure names or "all" (the default shown in the usage).
formula | A formula to define the class column.
data | A data.frame dataset containing the input attributes and class.
The following measures are allowed for this method:
Fraction of borderline points (N1) computes the percentage of vertices incident to edges connecting examples of opposite classes in a Minimum Spanning Tree (MST). The default package to build the MST is igraph. If you are handling data with duplicated instances, we suggest using ape.
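The N1 idea can be sketched without igraph or ape. The function below is a hypothetical helper, not the package's implementation: it grows an MST with Prim's algorithm in base R over an assumed Euclidean distance matrix and marks every vertex that touches an edge joining two classes.

```r
# Hypothetical sketch of N1: fraction of vertices on class-crossing MST edges.
n1_sketch <- function(x, y) {
  d <- as.matrix(dist(x))          # Euclidean distance is an assumption
  n <- nrow(d)
  in_tree <- c(TRUE, rep(FALSE, n - 1))
  best <- d[1, ]                   # cheapest known link from the tree to each vertex
  parent <- rep(1L, n)             # tree vertex that provides that link
  border <- rep(FALSE, n)
  for (step in seq_len(n - 1)) {
    best[in_tree] <- Inf           # never pick a vertex already in the tree
    v <- which.min(best)
    u <- parent[v]
    # the MST edge (u, v) joins two classes: both ends are borderline
    if (y[u] != y[v]) border[c(u, v)] <- TRUE
    in_tree[v] <- TRUE
    closer <- !in_tree & d[v, ] < best
    best[closer] <- d[v, closer]
    parent[closer] <- v
  }
  mean(border)
}

n1_sketch(iris[, 1:4], iris$Species)
```

When several edges have the same length the MST is not unique, so a different MST routine may return a slightly different value.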
Ratio of intra/extra class nearest neighbor distance (N2) computes the ratio of two sums: intra-class and inter-class. The former corresponds to the sum of the distances between each example and its closest neighbor from the same class. The latter is the sum of the distances between each example and its closest neighbor from another class (nearest enemy).
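A minimal sketch of the ratio described for N2, again with a hypothetical helper name and an assumed Euclidean distance. It returns the raw ratio of the two sums; any further normalization the package may apply is not reproduced here.

```r
# Hypothetical sketch of N2: sum of nearest same-class distances
# divided by the sum of nearest-enemy distances.
n2_sketch <- function(x, y) {
  d <- as.matrix(dist(x))          # Euclidean distance is an assumption
  diag(d) <- Inf                   # an example is not its own neighbor
  y <- as.character(y)
  idx <- seq_len(nrow(d))
  intra <- sapply(idx, function(i) min(d[i, y == y[i]]))
  inter <- sapply(idx, function(i) min(d[i, y != y[i]]))
  sum(intra) / sum(inter)
}

n2_sketch(iris[, 1:4], iris$Species)
```

Low values suggest that examples sit much closer to their own class than to the nearest enemy.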
Error rate of the nearest neighbor classifier (N3) corresponds to the error rate of a one Nearest Neighbor (1NN) classifier, estimated using a leave-one-out procedure on the dataset.
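The leave-one-out estimate for N3 can be sketched directly on the distance matrix: masking the diagonal removes each example from its own neighbor search. The helper name and the Euclidean distance are assumptions.

```r
# Hypothetical sketch of N3: leave-one-out error of a 1NN classifier.
n3_sketch <- function(x, y) {
  d <- as.matrix(dist(x))          # Euclidean distance is an assumption
  diag(d) <- Inf                   # leave the example itself out
  nn <- apply(d, 1, which.min)     # index of each example's nearest neighbor
  y <- as.character(y)
  mean(y[nn] != y)
}

n3_sketch(iris[, 1:4], iris$Species)
```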
Non-linearity of the nearest neighbor classifier (N4) creates a new dataset by randomly interpolating pairs of training examples of the same class, then induces a 1NN classifier on the original data and measures its error rate on the new data points.
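The interpolation step of N4 can be sketched as follows. The helper name, the `n_new` parameter, the uniform interpolation weights and the Euclidean distance are all assumptions made for illustration; the package may sample differently.

```r
# Hypothetical sketch of N4: 1NN error on points interpolated within each class.
n4_sketch <- function(x, y, n_new = nrow(x)) {
  x <- as.matrix(x)
  y <- as.character(y)
  # pick a random example, then a random partner from the same class
  i <- sample(nrow(x), n_new, replace = TRUE)
  j <- sapply(i, function(k) {
    same <- which(y == y[k])
    same[sample.int(length(same), 1)]
  })
  w <- runif(n_new)                # one interpolation weight per new point
  new_x <- w * x[i, , drop = FALSE] + (1 - w) * x[j, , drop = FALSE]
  new_y <- y[i]
  # classify each new point with 1NN fitted on the original data
  pred <- sapply(seq_len(n_new), function(k) {
    dk <- sqrt(colSums((t(x) - new_x[k, ])^2))
    y[which.min(dk)]
  })
  mean(pred != new_y)
}

set.seed(42)
n4_sketch(iris[, 1:4], iris$Species)
```

Because the new points are drawn at random, the value changes between runs unless a seed is set.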
Fraction of hyperspheres covering data (T1) builds hyperspheres centered at each one of the training examples, whose radii grow until the hypersphere reaches an example of another class. Afterwards, smaller hyperspheres contained in larger hyperspheres are eliminated. T1 is finally defined as the ratio between the number of the remaining hyperspheres and the total number of examples in the dataset.
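A literal reading of the T1 description can be sketched as below. This is a simplification under stated assumptions, not the package's procedure: each radius is taken as the Euclidean distance to the nearest enemy, and a hypersphere is dropped when it fits entirely inside a strictly larger one.

```r
# Hypothetical sketch of T1: fraction of hyperspheres left after absorption.
t1_sketch <- function(x, y) {
  d <- as.matrix(dist(x))          # Euclidean distance is an assumption
  y <- as.character(y)
  n <- nrow(d)
  # radius of sphere i: distance from example i to its nearest enemy (assumption)
  r <- sapply(seq_len(n), function(i) min(d[i, y != y[i]]))
  # sphere i is absorbed if some strictly larger sphere j satisfies d(i, j) + r_i <= r_j
  absorbed <- sapply(seq_len(n), function(i) any(d[i, ] + r[i] <= r & r > r[i]))
  mean(!absorbed)
}

t1_sketch(iris[, 1:4], iris$Species)
```

Requiring a strictly larger radius keeps duplicated examples from eliminating each other.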
Local Set Average Cardinality (LSC) is based on the Local Set (LS) of an example, defined as the set of points from the dataset whose distance to that example is smaller than the distance between the example and its nearest example of a different class. LSC is the average cardinality of the LS over all examples.
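The local set construction can be sketched as below. The helper name and the Euclidean distance are assumptions, and so is counting the example itself as a member of its own local set. The sketch returns the plain average cardinality; any rescaling the package applies to the final value is not reproduced.

```r
# Hypothetical sketch of LSC: average size of the local sets.
lsc_sketch <- function(x, y) {
  d <- as.matrix(dist(x))          # Euclidean distance is an assumption
  y <- as.character(y)
  card <- sapply(seq_len(nrow(d)), function(i) {
    enemy_dist <- min(d[i, y != y[i]])  # distance to the nearest enemy
    sum(d[i, ] < enemy_dist)            # points closer than that enemy (self included)
  })
  mean(card)
}

lsc_sketch(iris[, 1:4], iris$Species)
```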
X in [N1,N2,N3,N4,T1]. It is the decomposed version of the corresponding X function. Instead of giving a single complexity value for the dataset, it returns one complexity value per class.
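To illustrate what a per-class decomposition might look like, the sketch below splits a leave-one-out 1NN error by the true class of each example. The helper name, the Euclidean distance and the choice of grouping by true class are assumptions; the package's decomposition may be defined differently.

```r
# Hypothetical sketch: leave-one-out 1NN error reported per class.
n3_per_class_sketch <- function(x, y) {
  d <- as.matrix(dist(x))          # Euclidean distance is an assumption
  diag(d) <- Inf                   # leave the example itself out
  nn <- apply(d, 1, which.min)
  y <- as.character(y)
  # average the misclassification indicator within each true class
  tapply(y[nn] != y, y, mean)
}

n3_per_class_sketch(iris[, 1:4], iris$Species)
```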
A list named by the requested neighborhood measure.
Other complexity-measures: linearity.class, overlapping
## Extract all neighborhood measures
data(iris)
neighborhood(Species ~ ., iris)