Cross-validated kernel k-nearest-neighbors using a distance matrix
distMat.KernelKnnCV(DIST_mat, y, k = 5, folds = 5, h = 1,
                    weights_function = NULL, regression = F, threads = 1, extrema = F,
                    Levels = NULL, minimize = T, seed_num = 1)
DIST_mat
a distance matrix (square matrix) whose diagonal is filled with either zeros (0) or NAs (missing values)
y
a numeric vector (in classification the labels must be numeric from 1:Inf)
k
an integer specifying the number of nearest neighbors
folds
the number of cross-validation folds (must be greater than 1)
h
the bandwidth (applicable if the weights_function is not NULL, defaults to 1.0)
weights_function
there are various ways of specifying the kernel function. See the details section.
regression
a boolean (TRUE, FALSE) specifying whether regression or classification should be performed
threads
the number of cores to be used in parallel (OpenMP will be employed)
extrema
if TRUE then the minimum and maximum values from the k-nearest-neighbors will be removed (can be thought of as outlier removal)
Levels
a numeric vector. In case of classification the unique levels of the response variable are necessary
minimize
either TRUE or FALSE. If TRUE then lower values will be considered as relevant for the k-nearest search, otherwise higher values.
seed_num
a numeric value specifying the seed of the random number generator
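The argument requirements above can be checked before calling the function. The following is a minimal base R sketch: `check_inputs` is a hypothetical helper that is not part of the package, and the checks that `y` has one entry per row of `DIST_mat` and that class labels are whole numbers are assumptions read into the descriptions above.

```r
# Hypothetical helper (not part of the package): verifies the documented
# requirements on DIST_mat, y and folds before a cross-validation run.
check_inputs = function(DIST_mat, y, folds, regression = FALSE) {
  # square matrix
  stopifnot(is.matrix(DIST_mat), nrow(DIST_mat) == ncol(DIST_mat))
  # diagonal filled with zeros or NAs
  d = diag(DIST_mat)
  stopifnot(all(is.na(d) | d == 0))
  # numeric response, one value per observation (assumption)
  stopifnot(is.numeric(y), length(y) == nrow(DIST_mat))
  # more than one fold
  stopifnot(folds > 1)
  # classification labels start at 1; whole numbers are an assumption
  if (!regression) stopifnot(all(y >= 1), all(y == round(y)))
  invisible(TRUE)
}

set.seed(1)
X = matrix(rnorm(40), nrow = 10)                     # 10 observations, 4 features
dist_mat = as.matrix(dist(X))                        # square, zero diagonal
y = as.numeric(factor(rep(c("a", "b"), each = 5)))   # labels 1 and 2

check_inputs(dist_mat, y, folds = 3)
```

If any requirement is violated, `stopifnot` raises an error naming the failed condition.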
This function takes a number of arguments (including the number of cross-validation folds) and returns the predicted values and the indices for each fold. There are three possible ways to specify the weights function:

1st option: if the weights_function is NULL then a simple k-nearest-neighbor is performed.

2nd option: the weights_function is one of 'uniform', 'triangular', 'epanechnikov', 'biweight', 'triweight', 'tricube', 'gaussian', 'cosine', 'logistic', 'gaussianSimple', 'silverman', 'inverse', 'exponential'. This option can be extended by combining existing kernels (adding or multiplying). For instance, 'tricube_gaussian_MULT' multiplies the tricube kernel with the gaussian kernel, and 'tricube_gaussian_ADD' adds the two.

3rd option: a user-defined kernel function.
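To make the difference between an unweighted and a kernel-weighted vote concrete, here is a rough base R sketch for a single observation. It does not reproduce the package's internal code: the kernel formulas are the usual textbook forms, scaling the neighbor distances by their maximum is an assumption, and `knn_vote`, `gaussian_k`, `tricube_k`, `mult_k` and `add_k` are hypothetical names.

```r
# Textbook kernel forms on scaled distances u (assumed, not taken from the package)
gaussian_k = function(u) exp(-0.5 * u^2) / sqrt(2 * pi)
tricube_k  = function(u) ifelse(abs(u) <= 1, (70 / 81) * (1 - abs(u)^3)^3, 0)

# Combined kernels, mirroring the idea behind '_MULT' and '_ADD'
mult_k = function(u) tricube_k(u) * gaussian_k(u)
add_k  = function(u) tricube_k(u) + gaussian_k(u)

# Class probabilities for observation i from its k nearest neighbors
knn_vote = function(DIST_mat, y, i, k, h = 1, kernel = NULL) {
  d_row = DIST_mat[i, ]
  d_row[i] = NA                            # exclude the observation itself
  idx = order(d_row)[seq_len(k)]           # k smallest distances (NA sorts last)
  if (is.null(kernel)) {
    w = rep(1, k)                          # plain k-nearest-neighbor vote
  } else {
    u = d_row[idx] / max(d_row[idx])       # scale to [0, 1] (assumption)
    w = kernel(u / h)                      # h acts as the bandwidth
  }
  lev = sort(unique(y))
  votes = sapply(lev, function(l) sum(w[y[idx] == l]))
  names(votes) = lev
  votes / sum(votes)
}

x = c(0, 1, 2, 10, 11, 12)                 # six points on a line
y = c(1, 1, 1, 2, 2, 2)
dist_mat = as.matrix(dist(x))

p_plain = knn_vote(dist_mat, y, i = 2, k = 3)
p_mult  = knn_vote(dist_mat, y, i = 2, k = 3, kernel = mult_k)
p_add   = knn_vote(dist_mat, y, i = 2, k = 3, kernel = add_k)
```

In this toy case the third neighbor belongs to the other class and is much farther away, so both combined kernels shift the vote toward class 1 compared with the unweighted result.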
a list of length 2. The first sublist is a list of predictions (the length of the list equals the number of the folds). The second sublist is a list with the indices for each fold.
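One way to score a result of this shape is to match each fold's predictions with the true responses at that fold's indices. The sketch below builds a mock object rather than calling the function. It assumes a regression run in which each fold's predictions form a numeric vector, and that the indices refer to the held-out rows of that fold; neither detail is stated above.

```r
set.seed(1)
y = rnorm(12)                                   # mock response

# Mock fold indices: 3 folds of held-out row indices (assumed meaning)
fold_idx = split(sample(seq_along(y)), rep(1:3, length.out = length(y)))

# Mock predictions: the true values plus a little noise
fold_preds = lapply(fold_idx, function(i) y[i] + rnorm(length(i), sd = 0.1))

# Same shape as the documented return value: list(predictions, indices)
out = list(fold_preds, fold_idx)

# Mean squared error per fold, then averaged over folds
mse_per_fold = mapply(function(p, i) mean((y[i] - p)^2), out[[1]], out[[2]])
cv_mse = mean(mse_per_fold)
```

Accessing the two sublists by position (`out[[1]]`, `out[[2]]`) avoids relying on element names, which are not given here.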
Lampros Mouselimis
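## Not run:
data(ionosphere)
# predictors: drop column 2 and the last (response) column
X = ionosphere[, -c(2, ncol(ionosphere))]
# response: the last column, converted to numeric labels
y = as.numeric(ionosphere[, ncol(ionosphere)])
# pairwise distances, expanded to a full square matrix
dist_obj = dist(X)
dist_mat = as.matrix(dist_obj)
# 3-fold cross-validation with k = 5 neighbors
out = distMat.KernelKnnCV(dist_mat, y, k = 5, folds = 3, Levels = unique(y))
## End(Not run)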