SL.kernelKnn: SL wrapper for KernelKNN
In ecpolley/SuperLearner: Super Learner Prediction

Description

Wrapper for a configurable implementation of k-nearest neighbors. Supports both binomial and gaussian outcome distributions.

Usage

SL.kernelKnn(Y, X, newX, family, k = 10, method = "euclidean",
  weights_function = NULL, extrema = F, h = 1, ...)

Arguments

`Y`	Outcome variable
`X`	Training dataframe
`newX`	Test dataframe
`family`	Gaussian or binomial
`k`	Number of nearest neighbors to use
`method`	Distance method. One of 'euclidean' (default), 'manhattan', 'chebyshev', 'canberra', 'braycurtis', 'pearson_correlation', 'simple_matching_coefficient', 'minkowski' (by default the order 'p' of the Minkowski distance equals k), 'hamming', 'mahalanobis', 'jaccard_coefficient', 'Rao_coefficient'.
`weights_function`	Weighting method for combining the nearest neighbors. Can be NULL (the default in the signature, corresponding to 'uniform' weighting), 'uniform', 'triangular', 'epanechnikov', 'biweight', 'triweight', 'tricube', 'gaussian', 'cosine', 'logistic', 'gaussianSimple', 'silverman', 'inverse', 'exponential'.
`extrema`	If TRUE, the minimum and maximum values from the k nearest neighbors are removed (this can be thought of as outlier removal).
`h`	The bandwidth, applicable only if weights_function is not NULL. Defaults to 1.0.
`...`	Any additional parameters, not currently passed through.
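
As a rough illustration of how `k`, the distance, a weighting kernel, and the bandwidth `h` interact, the following base R sketch implements kernel-weighted k-nearest-neighbors regression with Euclidean distance. It is not the KernelKnn implementation: `knn_predict_sketch` and `gaussian_kernel` are hypothetical names introduced here, and dividing the raw distances by `h` before applying the kernel is an assumption made for the example.

```r
# Illustrative kernel-weighted k-NN regression in base R.
# knn_predict_sketch is a hypothetical helper, not part of KernelKnn or SuperLearner.
knn_predict_sketch = function(X, Y, newX, k = 3, h = 1, kernel = NULL) {
  X = as.matrix(X)
  newX = as.matrix(newX)
  apply(newX, 1, function(x0) {
    # Euclidean distance from the query point to every training row.
    d = sqrt(rowSums(sweep(X, 2, x0)^2))
    # Indices of the k nearest training rows.
    idx = order(d)[seq_len(k)]
    if (is.null(kernel)) {
      # No kernel supplied: plain (uniform) average of the neighbors' outcomes.
      return(mean(Y[idx]))
    }
    # Kernel weights from bandwidth-scaled distances (assumed scaling: d / h).
    w = kernel(d[idx] / h)
    sum(w * Y[idx]) / sum(w)
  })
}

# Toy data: one predictor, four training points, one query point.
X_train = matrix(c(0, 1, 2, 10), ncol = 1)
Y_train = c(0, 1, 2, 10)
X_test = matrix(0.5, ncol = 1)

gaussian_kernel = function(u) exp(-u^2 / 2)

# Uniform weighting: mean of the outcomes 0, 1, 2 of the three nearest points.
pred_uniform = knn_predict_sketch(X_train, Y_train, X_test, k = 3)

# Gaussian weighting: closer neighbors (outcomes 0 and 1) count for more.
pred_gauss = knn_predict_sketch(X_train, Y_train, X_test, k = 3, h = 1,
                                kernel = gaussian_kernel)
```

In this toy case the kernel-weighted prediction is pulled below the uniform one, because the third neighbor (outcome 2) is farther from the query point and receives less weight. A larger `h` flattens the weights and moves the result back toward the uniform average.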

Value

A list containing the predictions, along with the original training data and hyperparameters.

Examples


library(SuperLearner)

# Load a test dataset.
data(PimaIndiansDiabetes2, package = "mlbench")

data = PimaIndiansDiabetes2

# Omit observations with missing data.
data = na.omit(data)

# Code the outcome as 0/1 (1 = "pos"); as.numeric() on the factor would give 1/2.
Y_bin = as.numeric(data$diabetes == "pos")
X = subset(data, select = -diabetes)

set.seed(1)

sl = SuperLearner(Y_bin, X, family = binomial(),
                 SL.library = c("SL.mean", "SL.kernelKnn"))
sl

ecpolley/SuperLearner documentation built on Feb. 21, 2024, 11:38 p.m.