Nearest neighbor methods need a distance matrix of the dataset they work on. When fitting models repeatedly on subsets of the entire dataset, recalculating the matrix every time is unnecessary. This function therefore requires the user to calculate it manually prior to resampling and supply it through a wrapper function.
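The idea of reusing one distance matrix across subsets can be sketched in base R as follows. This is only an illustration: the names `full_dist`, `folds` and `fold_dists` are made up for the example, and the random subsets stand in for whatever resampling scheme is actually used.

```r
# Sketch only: base R, no package functions assumed.
x <- iris[-5]

# Calculate the full distance matrix once (150 x 150).
full_dist <- as.matrix(dist(x))

# Hypothetical resampling scheme: three random fitting subsets of 100 rows.
set.seed(1)
folds <- replicate(3, sample(nrow(x), 100), simplify = FALSE)

# Each subset's distances are obtained by indexing rows and columns
# of the precomputed matrix instead of calling dist() again.
fold_dists <- lapply(folds, function(idx) full_dist[idx, idx])
```

Indexing the precomputed matrix gives the same distances as calling `dist()` on each subset, without repeating the pairwise calculation.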
pre_impute_knn(data, k = 0.05, distance_matrix)
data | Fitting and testing data sets, as returned by
k | Number of nearest neighbors to calculate mean from. Set to < 1 to specify a fraction.
distance_matrix | A matrix,
Features with fewer than k non-missing values will be removed automatically.
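To make the behavior described above concrete, here is a minimal base R sketch of nearest neighbor mean imputation with a precomputed distance matrix. It is not the package's implementation: `knn_impute_sketch` is a hypothetical helper, and reading a fractional `k` as a fraction of the number of observations is an assumption.

```r
# Hypothetical helper, NOT the package's code: a base R sketch of
# kNN mean imputation using a precomputed distance matrix.
knn_impute_sketch <- function(x, k, distance_matrix) {
  x <- as.matrix(x)
  d <- as.matrix(distance_matrix)
  # Assumption: k < 1 is read as a fraction of the observations.
  if (k < 1) k <- max(1, round(k * nrow(x)))
  # Drop features with fewer than k non-missing values.
  keep <- colSums(!is.na(x)) >= k
  x <- x[, keep, drop = FALSE]
  for (j in seq_len(ncol(x))) {
    missing <- which(is.na(x[, j]))
    observed <- which(!is.na(x[, j]))
    for (i in missing) {
      # The k nearest observations that have a value for feature j.
      nearest <- observed[order(d[i, observed])[seq_len(k)]]
      x[i, j] <- mean(x[nearest, j])
    }
  }
  x
}

x <- iris[-5]
set.seed(1)
x[sample(nrow(x), 30), 3] <- NA
imputed <- knn_impute_sketch(x, k = 4, distance_matrix = dist(x))
```

Each missing value is replaced by the mean of that feature among the `k` closest observations that have it, and the neighbors are looked up in the supplied matrix rather than recalculated.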
Christofer Bäcklin
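# Drop the class column and introduce missing values in the third feature
x <- iris[-5]
x[sample(nrow(x), 30), 3] <- NA
# Calculate the distance matrix once, prior to resampling
my.dist <- dist(x)
# Supply the precalculated distances through a wrapper function
evaluate(modeling_procedure("lda"), x = x, y = iris$Species,
    pre_process = function(...){
        pre_split(...) %>% pre_impute_knn(k = 4, distance_matrix = my.dist)
    }
)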