democraticG: Democratic generic method


View source: R/Democratic.R

Description

Democratic is a semi-supervised learning algorithm with a co-training style. It trains N classifiers with the different learning schemes defined in the gen.learners list. During the iterative process, the classifiers, which have different inductive biases, label data for each other.

Usage

democraticG(y, gen.learners, gen.preds)

Arguments

y

A vector with the labels of training instances. In this vector the unlabeled instances are specified with the value NA.
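For instance, a labels vector in which the last two of four training instances are unlabeled could look as follows (a sketch):

y <- factor(c("a", "b", NA, NA))  # instances 3 and 4 are unlabeled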

gen.learners

A list of functions for training N different supervised base classifiers. Each function must have two parameters, indexes and cls, where indexes indicates which instances to use and cls specifies their classes.

gen.preds

A list of functions for predicting the per-class probabilities. Each function must have two parameters, model and indexes, where model is a classifier trained with the corresponding gen.learners function and indexes indicates which instances to predict.
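For illustration, one matching pair of gen.learners and gen.preds entries might look as follows (a sketch, assuming a feature matrix xtrain in the enclosing environment and the knn3 learner from the caret package; the Examples below give complete definitions):

library(caret)
# learner function: fit a base classifier on the selected training rows
my.learner <- function(indexes, cls) {
  knn3(x = xtrain[indexes, ], y = cls, k = 1)
}
# prediction function: return the per-class probability matrix for those rows
my.pred <- function(model, indexes) {
  predict(model, xtrain[indexes, ])  # knn3 returns probabilities by default
}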

Details

democraticG can be helpful in those cases where the method selected as base classifier needs learner and prediction functions with other specifications. For more information about the general democratic method, see the democratic function; essentially, democratic is a wrapper of democraticG.

Value

A list object of class "democraticG" containing:

W

A vector with the confidence-weighted vote assigned to each classifier.

model

A list with the final N base classifiers trained using the enlarged labeled set.

model.index

A list of N vectors of indexes of the training instances used to train each classifier. These indexes are relative to the y argument.

instances.index

The indexes of all training instances used to train the N models. These indexes include the initial labeled instances and the newly labeled instances. These indexes are relative to the y argument.

model.index.map

A list of N vectors with the same information as model.index, but with indexes relative to the instances.index vector.

classes

The levels of the y factor.
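For instance, the components of a fitted object can be inspected as follows (a sketch; m1 denotes a model trained as in the Examples below):

m1$W                 # one confidence weight per base classifier
length(m1$model)     # the N trained base classifiers
m1$model.index[[1]]  # indexes (relative to y) used to train the first classifier
m1$classes           # the levels of y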

References

Yan Zhou and Sally Goldman.
Democratic co-learning.
In Proceedings of the 16th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2004), pages 594-602. IEEE, Nov 2004. doi: 10.1109/ICTAI.2004.48.

Examples

## Not run: 
# this is a long running example

library(ssc)

## Load Wine data set
data(wine)

cls <- which(colnames(wine) == "Wine")
x <- wine[, -cls] # instances without classes
y <- wine[, cls] # the classes
x <- scale(x) # scale the attributes

## Prepare data
set.seed(20)
# Use 50% of instances for training
tra.idx <- sample(x = length(y), size = ceiling(length(y) * 0.5))
xtrain <- x[tra.idx,] # training instances
ytrain <- y[tra.idx]  # classes of training instances
# Use 70% of train instances as unlabeled set
tra.na.idx <- sample(x = length(tra.idx), size = ceiling(length(tra.idx) * 0.7))
ytrain[tra.na.idx] <- NA # remove class information of unlabeled instances

# Use the other 50% of instances for inductive testing
tst.idx <- setdiff(1:length(y), tra.idx)
xitest <- x[tst.idx,] # testing instances
yitest <- y[tst.idx] # classes of testing instances

## Example A: 
# Training from a set of instances with 
# 1-NN and C-svc (SVM) as base classifiers.

### Define knn base classifier using knn3 from caret package
library(caret)
# learner function
knn <- function(indexes, cls) {
  knn3(x = xtrain[indexes, ], y = cls, k = 1)
}
# function to predict probabilities
knn.prob <- function(model, indexes) {
  predict(model, xtrain[indexes, ])
}

### Define svm base classifier using ksvm from kernlab package
library(kernlab)
library(proxy)
# learner function
svm <- function(indexes, cls) {
  rbf <- function(x, y) {
    sigma <- 0.048
    d <- dist(x, y, method = "Euclidean", by_rows = FALSE)
    exp(-sigma *  d * d)
  }
  class(rbf) <- "kernel"
  ksvm(x = xtrain[indexes, ], y = cls, scaled = FALSE,
       type = "C-svc",  C = 1,
       kernel = rbf, prob.model = TRUE)
}
# function to predict probabilities
svm.prob <- function(model, indexes) {
  predict(model, xtrain[indexes, ], type = "probabilities")
}

### Train
m1 <- democraticG(y = ytrain, 
                  gen.learners = list(knn, svm),
                  gen.preds = list(knn.prob, svm.prob))
### Predict
# predict labels using each classifier
m1.pred1 <- predict(m1$model[[1]], xitest, type = "class")
m1.pred2 <- predict(m1$model[[2]], xitest)
# combine predictions
m1.pred <- list(m1.pred1, m1.pred2)
cls1 <- democraticCombine(m1.pred, m1$W, m1$classes)
table(cls1, yitest)

## Example B: 
# Training from a distance matrix and a kernel matrix with 
# 1-NN and C-svc (SVM) as base classifiers.

### Define knn2 base classifier using oneNN from ssc package
library(ssc)
# Compute distance matrix D
# D is used in knn2.prob
D <- as.matrix(dist(x = xtrain, method = "euclidean", by_rows = TRUE))
# learner function
knn2 <- function(indexes, cls) {
  model <- oneNN(y = cls)
  attr(model, "tra.idxs") <- indexes
  model
}
# function to predict probabilities
knn2.prob <- function(model, indexes)  {
  tra.idxs <- attr(model, "tra.idxs")
  predict(model, D[indexes, tra.idxs], distance.weighting = "none")
}

### Define svm2 base classifier using ksvm from kernlab package
library(kernlab)

# Compute kernel matrix K
# K is used in svm2 and svm2.prob functions
sigma <- 0.048
K <- exp(- sigma * D * D)

# learner function
svm2 <- function(indexes, cls) {
  model <- ksvm(K[indexes, indexes], y = cls, 
                type = "C-svc", C = 1,
                kernel = "matrix", 
                prob.model = TRUE)
  attr(model, "tra.idxs") <- indexes
  model
}
# function to predict probabilities
svm2.prob <- function(model, indexes)  {
  tra.idxs <- attr(model, "tra.idxs")
  sv.idxs <- tra.idxs[SVindex(model)]
  predict(model, 
          as.kernelMatrix(K[indexes, sv.idxs]),
          type = "probabilities") 
}
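
### Train (a sketch; the original example stops after defining the
### base classifiers, so this mirrors Example A)
m2 <- democraticG(y = ytrain,
                  gen.learners = list(knn2, svm2),
                  gen.preds = list(knn2.prob, svm2.prob))
# Prediction proceeds as in Example A, after computing the distances and
# kernel values between the test instances and the training instances.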


## End(Not run)
