ds.knn | R Documentation |
Perform a non-disclosive distributed K-Nearest Neighbour Classification
ds.knn( x, classificator, query, neigh = 3, method.indicator = "knn", k = 3, noise = 0.25, datasources = NULL )
x |
|
classificator |
|
query |
|
neigh |
|
method.indicator |
|
k |
|
noise |
|
datasources |
a list of |
If the argument method is set to 'knn', the server-side function searches for the k-1 nearest neighbors of each data point and calculates the centroid of these k points.
The proximity is defined by the minimum Euclidean distances of z-score transformed data.
Once the coordinates of all centroids have been estimated, the function applies scaling to expand the centroids back to the dispersion of the original data. The scaling is achieved by multiplying the centroids by a scaling factor equal to the ratio between the standard deviation of the original variable and the standard deviation of the calculated centroids. The coordinates of the scaled centroids are then returned to the client-side.
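The 'knn' procedure described above can be sketched in base R as follows. This is an illustration only, not the DataSHIELD server-side code: the function name knn_centroids is hypothetical, and the sketch assumes two numeric variables x and y, neighbours found on z-scores, and centroids averaged on the original coordinates before rescaling.

```r
# Illustrative sketch only; knn_centroids is a hypothetical helper,
# not a function of any DataSHIELD package.
knn_centroids <- function(x, y, k = 3) {
  stopifnot(length(x) == length(y), k >= 2, k <= length(x))

  # z-score transform so both variables weigh equally in the distance
  z <- cbind((x - mean(x)) / sd(x), (y - mean(y)) / sd(y))

  # pairwise Euclidean distances between the standardized points
  d <- as.matrix(dist(z))

  cx <- numeric(length(x))
  cy <- numeric(length(y))
  for (i in seq_along(x)) {
    # the k closest rows: the point itself plus its k-1 nearest neighbours
    idx <- order(d[i, ])[seq_len(k)]
    # assumption: the centroid is taken on the original coordinates
    cx[i] <- mean(x[idx])
    cy[i] <- mean(y[idx])
  }

  # scaling factor = sd(original variable) / sd(centroids)
  list(x = cx * sd(x) / sd(cx), y = cy * sd(y) / sd(cy))
}

# Toy data; the seed and distribution parameters are arbitrary
set.seed(42)
x <- rnorm(100, mean = 50, sd = 10)
y <- rnorm(100, mean = 20, sd = 5)
masked <- knn_centroids(x, y, k = 3)
```

After scaling, the standard deviation of each masked variable matches that of the corresponding original variable, which is the stated purpose of the scaling factor.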
The value of k is specified by the user. The suggested and default value is 3, which is also the suggested minimum threshold used to prevent disclosure, as specified in the protection filter nfilter.kNN. When the value of k increases, the disclosure risk decreases but the utility loss increases. The value of k is used only if the argument method is set to 'knn'. Any value of k is ignored if the argument method is set to 'noise'.
If the argument method is set to 'noise', the server-side function generates a random normal noise of zero mean
and variance equal to 10\
The noise is added to each x and y variable, and the data perturbed by the added noise are returned to the client-side. Note that the seed of the random number generator is fixed to a specific number generated from the data, and therefore the user gets the same figure every time they choose the knn method for a given set of variables.
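The 'noise' procedure described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the DataSHIELD server-side code: the function name add_noise is hypothetical, the noise argument is assumed to be the fraction of each variable's variance used as the noise variance, and the rule that derives the seed from the data is invented for the example.

```r
# Illustrative sketch only; add_noise is a hypothetical helper,
# not a function of any DataSHIELD package.
add_noise <- function(x, y, noise = 0.25) {
  stopifnot(length(x) == length(y), noise >= 0)

  # Assumption: any deterministic function of the data can act as the seed;
  # the actual derivation is not given in the text above.
  seed <- round(sum(abs(c(x, y)))) %% 2147483647
  set.seed(seed)  # note: this resets the global RNG state

  # zero-mean normal noise whose variance is a fraction of each variable's variance
  list(
    x = x + rnorm(length(x), mean = 0, sd = sqrt(noise * var(x))),
    y = y + rnorm(length(y), mean = 0, sd = sqrt(noise * var(y)))
  )
}

# The same input gives the same perturbed output on every call
x <- c(1.2, 3.4, 2.2, 5.1, 4.8)
y <- c(10.5, 9.1, 12.3, 8.7, 11.0)
first  <- add_noise(x, y)
second <- add_noise(x, y)
identical(first, second)
```

Because the seed depends only on the data, repeated calls on the same variables return the same perturbed values, so a user cannot average the noise away by calling the function many times.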
The value of noise is used only if the argument method is set to 'noise'. Any value of noise is ignored if the argument method is set to 'knn'.
A character with the classification assigned to the queried vector.