GRFClassifier: Label propagation using Gaussian Random Fields and Harmonic...

View source: R/GRFClassifier.R

GRFClassifierR Documentation

Label propagation using Gaussian Random Fields and Harmonic functions

Description

Implements the approach proposed in Zhu et al. (2003) to label propagation over an affinity graph. Note, as in the original paper, we consider the transductive scenario, so the implementation does not generalize to out of sample predictions. The approach minimizes the squared difference in labels assigned to different objects, where the contribution of each difference to the loss is weighted by the affinity between the objects. The default in this implementation is to use a knn adjacency matrix based on euclidean distance to determine this weight. Setting adjacency="heat" will use an RBF kernel over euclidean distances between objects to determine the weights.

Usage

GRFClassifier(X, y, X_u, adjacency = "nn",
  adjacency_distance = "euclidean", adjacency_k = 6,
  adjacency_sigma = 0.1, class_mass_normalization = FALSE, scale = FALSE,
  x_center = FALSE)

Arguments

X

matrix; Design matrix for labeled data

y

factor or integer vector; Label vector

X_u

matrix; Design matrix for unlabeled data

adjacency

character; "nn" for nearest neighbour graph or "heat" for radial basis adjacency matrix

adjacency_distance

character; distance metric for nearest neighbour adjacency matrix

adjacency_k

integer; number of neighbours for the nearest neighbour adjacency matrix

adjacency_sigma

double; width of the rbf adjacency matrix

class_mass_normalization

logical; Should the Class Mass Normalization heuristic be applied? (default: FALSE)

scale

logical; Should the features be normalized? (default: FALSE)

x_center

logical; Should the features be centered?

References

Zhu, X., Ghahramani, Z. & Lafferty, J., 2003. Semi-supervised learning using gaussian fields and harmonic functions. In Proceedings of the 20th International Conference on Machine Learning. pp. 912-919.

See Also

Other RSSL classifiers: EMLeastSquaresClassifier, EMLinearDiscriminantClassifier, ICLeastSquaresClassifier, ICLinearDiscriminantClassifier, KernelLeastSquaresClassifier, LaplacianKernelLeastSquaresClassifier(), LaplacianSVM, LeastSquaresClassifier, LinearDiscriminantClassifier, LinearSVM, LinearTSVM(), LogisticLossClassifier, LogisticRegression, MCLinearDiscriminantClassifier, MCNearestMeanClassifier, MCPLDA, MajorityClassClassifier, NearestMeanClassifier, QuadraticDiscriminantClassifier, S4VM, SVM, SelfLearning, TSVM, USMLeastSquaresClassifier, WellSVM, svmlin()

Examples

library(RSSL)
library(ggplot2)
library(dplyr)

set.seed(1)
df_circles <- generateTwoCircles(400,noise=0.1) %>% 
  add_missinglabels_mar(Class~.,0.99)

# Visualize the problem
df_circles %>% 
  ggplot(aes(x=X1,y=X2,color=Class)) +
  geom_point() + 
  coord_equal()

# Visualize the solution
class_grf <- GRFClassifier(Class~.,df_circles,
                           adjacency="heat",
                           adjacency_sigma = 0.1)
df_circles %>%
  filter(is.na(Class)) %>% 
  mutate(Responsibility=responsibilities(class_grf)[,1]) %>% 
  ggplot(aes(x=X1,y=X2,color=Responsibility)) +
  geom_point() + 
  coord_equal()

# Generate problem
df_para <- generateParallelPlanes()
df_para$Class <- NA
df_para$Class[1] <- "a"
df_para$Class[101] <- "b"
df_para$Class[201] <- "c"
df_para$Class <- factor(df_para$Class)

# Visualize problem
df_para %>% 
  ggplot(aes(x=x,y=y,color=Class)) +
  geom_point() + 
  coord_equal()

# Estimate GRF classifier with knn adjacency matrix (default)
class_grf <- GRFClassifier(Class~.,df_para)

df_para %>%
  filter(is.na(Class)) %>% 
  mutate(Assignment=factor(apply(responsibilities(class_grf),1,which.max))) %>% 
  ggplot(aes(x=x,y=y,color=Assignment)) +
  geom_point()

jkrijthe/RSSL documentation built on Jan. 13, 2024, 1:56 a.m.