EMQ: Expectation-Maximization Quantification


View source: R/EMQ_method.r

Description

This method is an instance of the well-known Expectation-Maximization (EM) algorithm for finding maximum-likelihood estimates of a model's parameters. It quantifies events based on test scores, applying the Expectation Maximization for Quantification (EMQ) method proposed by Saerens et al. (2002).
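For orientation, the EM iteration from Saerens et al. (2002) can be written as below. The notation is an assumption introduced for this sketch: \hat{p}_t(\omega_i) is the class prior estimated from the training set, \hat{p}_t(\omega_i \mid x_k) is the classifier score for test instance x_k, N is the number of test instances, and s is the iteration index. This page does not show how the package implements these steps internally.

```latex
% Initialisation: start from the training-set priors
\hat{p}^{(0)}(\omega_i) = \hat{p}_t(\omega_i)

% E-step: rescale each score by the ratio of current to training prior, then renormalise
\hat{p}^{(s)}(\omega_i \mid x_k) =
  \frac{\dfrac{\hat{p}^{(s)}(\omega_i)}{\hat{p}_t(\omega_i)}\,\hat{p}_t(\omega_i \mid x_k)}
       {\sum_{j} \dfrac{\hat{p}^{(s)}(\omega_j)}{\hat{p}_t(\omega_j)}\,\hat{p}_t(\omega_j \mid x_k)}

% M-step: the new prior is the mean adjusted posterior over the test set
\hat{p}^{(s+1)}(\omega_i) = \frac{1}{N} \sum_{k=1}^{N} \hat{p}^{(s)}(\omega_i \mid x_k)
```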

Usage

EMQ(train, test, it=5, e=1e-4)

Arguments

train

a data.frame of the labeled set.

test

a numeric matrix of scores predicted for each test set instance. The first column must hold the positive-class score.

it

maximum number of iteration steps (default 5).

e

a numeric value for the stop threshold (default 1e-4). If the difference between two consecutive steps is less than or equal to e, the iterative process stops. If e is NULL, the number of iterations is determined solely by the it parameter.
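To illustrate how it and e might interact, here is a minimal sketch of an EM prior-adjustment loop. It is not the package's source: emq_sketch and train_prior are hypothetical names, the training priors are assumed to be passed in directly (for example as class frequencies of the labeled set), and the "difference between two consecutive steps" is assumed to be the summed absolute change in the estimated priors.

```r
# Hypothetical helper for illustration only; not the EMQ() implementation.
# train_prior: class priors from the labeled set, in the column order of `scores`.
# scores:      numeric matrix of class scores, one row per test instance.
emq_sketch <- function(train_prior, scores, it = 5, e = 1e-4) {
  prior <- train_prior
  for (step in seq_len(it)) {
    # E-step: rescale each column by (current prior / training prior),
    # then renormalise every row so that it sums to 1.
    weighted  <- sweep(scores, 2, prior / train_prior, `*`)
    posterior <- weighted / rowSums(weighted)

    # M-step: the new prior is the column-wise mean of the adjusted posteriors.
    new_prior <- colMeans(posterior)

    # Assumed stopping measure: total absolute change between steps.
    delta <- sum(abs(new_prior - prior))
    prior <- new_prior

    # With e = NULL the loop always runs for the full `it` steps.
    if (!is.null(e) && delta <= e) break
  }
  prior
}

# Toy scores for four instances; the first column is the positive class.
scores <- cbind(c(0.9, 0.8, 0.7, 0.4), c(0.1, 0.2, 0.3, 0.6))
est <- emq_sketch(c(0.5, 0.5), scores)
```

Because the loop stops as soon as either limit is reached, a small it (such as the default 5) can end the iteration before the threshold e is met.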

Value

A numeric vector containing the class distribution estimated from the test set.

References

Saerens, M., Latinne, P., & Decaestecker, C. (2002). Adjusting the outputs of a classifier to new a priori probabilities: a simple procedure. Neural Computation. <doi.org/10.1162/089976602753284446>.

Examples

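library(mlquantify)   # provides EMQ() and the aeAegypti data set
library(randomForest)
library(caret)

# -- Splitting aeAegypti into two folds: one for training, one for testing --
cv <- createFolds(aeAegypti$class, 2)
tr <- aeAegypti[cv$Fold1,]
ts <- aeAegypti[cv$Fold2,]

# -- Getting a sample from ts with 80 positive and 20 negative instances --
ts_sample <- rbind(ts[sample(which(ts$class==1),80),],
                   ts[sample(which(ts$class==2),20),])

# -- Training the scorer and predicting class probabilities for the sample --
scorer <- randomForest(class~., data=tr, ntree=500)
test.scores <- predict(scorer, ts_sample, type = "prob")

# -- Estimating the class distribution of the sample --
EMQ(train=tr, test=test.scores)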

mlquantify documentation built on Jan. 20, 2022, 5:07 p.m.