my_knn_cv: k-Nearest Neighbour Classification
In txqtiffany/STAT302package: Package Building Demonstration


View source: R/my_knn_cv.R

Performs k-nearest neighbour classification of a test set from a training set. For each row of the test set, the k nearest training set vectors (in Euclidean distance) are found, and the class is decided by majority vote, with ties broken at random. If there are ties for the kth nearest vector, all candidates are included in the vote. The misclassification error is estimated by cross-validation with `k_cv` folds.
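
The voting rule described above can be illustrated for a single test row with a few lines of base R. This is only a sketch: `knn_vote_one` is a hypothetical helper that is not part of the package, and the package itself may delegate this step to an existing routine instead.

```r
# Hypothetical helper (not part of the package): classify one query row.
knn_vote_one <- function(train, cl, query, k) {
  train <- as.matrix(train)
  # Euclidean distance from the query to every training row
  d <- sqrt(rowSums(sweep(train, 2, query)^2))
  # Distance of the kth nearest neighbour; rows tied at that distance are kept
  kth <- sort(d)[k]
  votes <- table(cl[d <= kth])
  winners <- names(votes)[votes == max(votes)]
  # Break ties in the vote at random
  if (length(winners) > 1) sample(winners, 1) else winners
}

# Toy data: two well-separated groups
toy_train <- rbind(c(0, 0), c(0, 1), c(5, 5), c(5, 6))
toy_cl <- c("a", "a", "b", "b")
knn_vote_one(toy_train, toy_cl, query = c(5, 5.4), k = 3)
```

With `k = 3` the two closest rows belong to class "b" and the third to class "a", so the majority vote picks "b".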

my_knn_cv(train, cl, k_nn, k_cv)

`train`	input data frame
`cl`	true class values of the training data
`k_nn`	integer representing the number of neighbors
`k_cv`	integer representing the number of folds

A list with the following objects:

`class`	a vector of the predicted class Ŷ_i for all observations
`cv_err`	a numeric with the cross-validation misclassification error
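
To show how the arguments and the returned values could fit together, here is a minimal sketch of k-fold cross-validation built on `class::knn()`. It is not the package source. The name `knn_cv_sketch`, the random fold assignment, the use of out-of-fold predictions for `class`, and the averaging of per-fold error rates for `cv_err` are all assumptions made for illustration.

```r
library(class)  # provides knn(); a recommended package shipped with R

# Hypothetical re-implementation for illustration only.
knn_cv_sketch <- function(train, cl, k_nn, k_cv) {
  train <- as.data.frame(train)
  cl <- as.factor(cl)
  n <- nrow(train)
  # Assign each row to one of k_cv folds at random, sizes as equal as possible
  fold <- sample(rep(seq_len(k_cv), length.out = n))
  pred <- factor(rep(NA, n), levels = levels(cl))
  fold_err <- numeric(k_cv)
  for (i in seq_len(k_cv)) {
    held_out <- fold == i
    # Predict the held-out fold from the remaining folds
    pred[held_out] <- knn(train[!held_out, , drop = FALSE],
                          train[held_out, , drop = FALSE],
                          cl[!held_out], k = k_nn)
    fold_err[i] <- mean(pred[held_out] != cl[held_out])
  }
  # Assumption: cv_err is the average of the per-fold misclassification rates
  list(class = pred, cv_err = mean(fold_err))
}
```

Because the folds are drawn with `sample()`, calling `set.seed()` beforehand makes the result reproducible.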

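library(palmerpenguins)
# Load the penguins data set and drop rows with missing values
data(penguins, package = "palmerpenguins")
penguins_df <- na.omit(penguins)
# Numeric predictors as a data frame (lapply() alone would return a list)
train <- as.data.frame(lapply(
  penguins_df[c("bill_length_mm", "bill_depth_mm", "flipper_length_mm", "body_mass_g")],
  as.numeric))
cl <- as.numeric(penguins_df$species)
# 5-fold cross-validation with 1 and 5 nearest neighbours
nearest_1 <- my_knn_cv(train, cl, 1, 5)
nearest_5 <- my_knn_cv(train, cl, 5, 5)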