knitr::opts_chunk$set(echo = TRUE)

First, we load the needed libraries.

library(datasets)
library(rDML)

We will use the iris dataset.

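# Load the iris dataset: 150 samples, 4 numeric features, 3 species
data(iris)
X = iris[1:4]
# Encode the species factor as numeric labels 1, 2, 3
y = as.array(as.numeric(iris[5][,1]))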
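
The numeric labels come from the factor's level codes, so it can help to check which number stands for which species. The following is a minimal base-R sketch; the names `label_map` and `y_check` are illustrative and not part of rDML.

```r
library(datasets)
data(iris)

# as.numeric() on a factor returns the integer level codes,
# so the mapping follows the order of levels(iris$Species).
label_map <- data.frame(code = seq_along(levels(iris$Species)),
                        species = levels(iris$Species))
print(label_map)

# Same encoding as in the chunk above, written with the column name
y_check <- as.numeric(iris$Species)
print(table(y_check))  # number of samples per class code
```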

Let's use the k-NN classifier together with a distance metric learning (DML) algorithm, here NCA (Neighborhood Components Analysis).

nca = dml$NCA()
knn = distance_clf$kNN(n_neighbors = 7, dml_algorithm = nca)

Now, we fit the transformer and the predictor.

nca$fit(X,y)
knn$fit(X,y)

We can now predict labels with the k-NN classifier under the learned distance,

# Predictions for the training set. Each sample is left out when its own label is predicted (leave-one-out).
knn$predict()

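# We can also make predictions for newly collected data.
# byrow = TRUE makes each line of the literal one sample; matrix() fills by column by default.
X_ = matrix(nrow = 3, ncol = 4, byrow = TRUE,
            data = c(1,0,0,0,
                     1,1,0,0,
                     1,1,1,0))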
knn$predict(X_)

and see the classification scores.

# Score for the training set (leaving one out)
knn$score()

# Scoring test data
y_ = as.numeric(c(1,2,2))
knn$score(X_,y_)
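
To illustrate what a leave-one-out score on the training set means, here is a base-R sketch of 7-nearest-neighbour classification in which each sample is excluded from its own neighbourhood. It uses the plain Euclidean distance rather than a learned one, and it is only an illustration of the idea, not rDML's implementation. The names `X_mat`, `labels`, `D`, `loo_pred` and `loo_accuracy` are illustrative.

```r
library(datasets)
data(iris)

X_mat <- as.matrix(iris[1:4])
labels <- as.numeric(iris$Species)
k <- 7

# Pairwise Euclidean distances between all samples
D <- as.matrix(dist(X_mat))
diag(D) <- Inf  # a sample must not vote for itself

loo_pred <- sapply(seq_len(nrow(D)), function(i) {
  nn <- order(D[i, ])[1:k]                    # indices of the k nearest other samples
  votes <- table(labels[nn])                  # class counts among those neighbours
  as.numeric(names(votes)[which.max(votes)])  # majority class (lowest label on ties)
})

# Fraction of samples whose left-out prediction matches the true label
loo_accuracy <- mean(loo_pred == labels)
print(loo_accuracy)
```

With a learned distance, the same procedure would be applied to the transformed data instead of `X_mat`.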

Another interesting classifier is NCMC (Nearest Class with Multiple Centroids). It predicts the class that owns the centroid nearest to the sample. The number of centroids can be set separately for each class, and the centroids are computed by running k-Means on each class's subset of the data. It is used in the same way as the previous classifier.

ncmc = distance_clf$NCMC_Classifier(centroids_num = c(2,3,4))
ncmc$fit(X,y)
ncmc$score(X,y)
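
The nearest-centroid rule described above can be sketched in base R with `kmeans()` from the stats package. This is a hand-written illustration under the assumption that the classifier works as described (k-Means per class, then nearest centroid); it is not rDML's code, and all names in it (`centroids`, `centroid_class`, `nearest_centroid_class`, `pred`) are illustrative.

```r
library(datasets)
data(iris)
set.seed(42)  # k-means starts from random centres

X_mat <- as.matrix(iris[1:4])
labels <- as.numeric(iris$Species)
centroids_num <- c(2, 3, 4)  # centroids per class, in label order

# Run k-means separately on each class and stack the resulting centroids
centroids <- NULL
centroid_class <- c()
for (cl in sort(unique(labels))) {
  km <- kmeans(X_mat[labels == cl, , drop = FALSE],
               centers = centroids_num[cl], nstart = 10)
  centroids <- rbind(centroids, km$centers)
  centroid_class <- c(centroid_class, rep(cl, centroids_num[cl]))
}

# Assign each sample the class of its nearest centroid
nearest_centroid_class <- function(x) {
  d2 <- rowSums(sweep(centroids, 2, x)^2)  # squared distances to all centroids
  centroid_class[which.min(d2)]
}
pred <- apply(X_mat, 1, nearest_centroid_class)
print(mean(pred == labels))  # training accuracy of the sketch
```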

To use this classifier with the learned distance, we first transform the data with the fitted NCA.

Lx = nca$transform()
ncmc = distance_clf$NCMC_Classifier(centroids_num = c(2,3,4))
ncmc$fit(Lx,y)
ncmc$score(Lx,y)
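
Transforming the data first works because NCA learns a linear map: Euclidean distances between transformed points equal the learned distance between the original points. The sketch below checks this identity numerically in base R. The matrix `L` is a random stand-in for a learned transformation (an assumption for illustration, not the output of `nca`), and `M`, `x`, `z` are likewise illustrative names.

```r
set.seed(1)
L <- matrix(rnorm(16), nrow = 4)  # hypothetical stand-in for a learned 4x4 linear map
M <- t(L) %*% L                   # Mahalanobis matrix induced by L

x <- c(5.1, 3.5, 1.4, 0.2)
z <- c(6.3, 3.3, 6.0, 2.5)

# Euclidean distance after mapping both points with L
d_transformed <- sqrt(sum((L %*% x - L %*% z)^2))
# Distance under M, computed in the original feature space
d_mahalanobis <- sqrt(drop(t(x - z) %*% M %*% (x - z)))

print(c(d_transformed, d_mahalanobis))
```

Any classifier that relies on Euclidean distance, such as the nearest-centroid rule, therefore uses the learned distance implicitly once it is fitted on the transformed data.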


jlsuarezdiaz/rDML documentation built on May 24, 2019, 12:35 a.m.