SL.kernelKnn: SL wrapper for KernelKNN

View source: R/SL.kernelKnn.R

SL.kernelKnnR Documentation

SL wrapper for KernelKNN

Description

Wrapper for a configurable implementation of k-nearest neighbors. Supports both binomial and gaussian outcome distributions.

Usage

SL.kernelKnn(Y, X, newX, family, k = 10, method = "euclidean",
  weights_function = NULL, extrema = F, h = 1, ...)

Arguments

Y

Outcome variable

X

Training dataframe

newX

Test dataframe

family

Gaussian or binomial

k

Number of nearest neighbors to use

method

Distance method, can be 'euclidean' (default), 'manhattan', 'chebyshev', 'canberra', 'braycurtis', 'pearson_correlation', 'simple_matching_coefficient', 'minkowski' (by default the order 'p' of the minkowski parameter equals k), 'hamming', 'mahalanobis', 'jaccard_coefficient', 'Rao_coefficient'

weights_function

Weighting method for combining the nearest neighbors. Can be 'uniform' (default), 'triangular', 'epanechnikov', 'biweight', 'triweight', 'tricube', 'gaussian', 'cosine', 'logistic', 'gaussianSimple', 'silverman', 'inverse', 'exponential'.

extrema

if TRUE then the minimum and maximum values from the k-nearest-neighbors will be removed (can be thought as outlier removal).

h

the bandwidth, applicable if the weights_function is not NULL. Defaults to 1.0.

...

Any additional parameters, not currently passed through.

Value

List with predictions and the original training data & hyperparameters.

Examples


# Load a test dataset.
data(PimaIndiansDiabetes2, package = "mlbench")

data = PimaIndiansDiabetes2

# Omit observations with missing data.
data = na.omit(data)

Y_bin = as.numeric(data$diabetes)
X = subset(data, select = -diabetes)

set.seed(1)

sl = SuperLearner(Y_bin, X, family = binomial(),
                 SL.library = c("SL.mean", "SL.kernelKnn"))
sl


ecpolley/SuperLearner documentation built on Feb. 21, 2024, 11:38 p.m.