EMQ: Expectation-Maximization Quantification
In mlquantify: Algorithms for Class Distribution Estimation


EMQ is an instance of the well-known Expectation-Maximization (EM) algorithm for finding maximum-likelihood estimates of a model's parameters. It estimates the class distribution of a test set from the scores a classifier assigns to the test instances, applying the Expectation Maximization for Quantification (EMQ) method proposed by Saerens et al. (2002).
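For orientation, the two alternating steps of the procedure in Saerens et al. (2002) can be written as follows. The notation is an assumption made here for illustration and does not come from the package documentation: \hat{p}_t denotes quantities estimated on the training set, \hat{p}^{(s)} the estimate at iteration s, x_k the k-th of N test instances, and \omega_i the i-th class. The iteration starts from \hat{p}^{(0)}(\omega_i) = \hat{p}_t(\omega_i).

```latex
% E-step: rescale each training-time posterior by the ratio of current to training prior
\hat{p}^{(s)}(\omega_i \mid x_k) =
  \frac{\dfrac{\hat{p}^{(s)}(\omega_i)}{\hat{p}_t(\omega_i)}\,\hat{p}_t(\omega_i \mid x_k)}
       {\sum_{j} \dfrac{\hat{p}^{(s)}(\omega_j)}{\hat{p}_t(\omega_j)}\,\hat{p}_t(\omega_j \mid x_k)}

% M-step: the new prior is the average adjusted posterior over the test set
\hat{p}^{(s+1)}(\omega_i) = \frac{1}{N} \sum_{k=1}^{N} \hat{p}^{(s)}(\omega_i \mid x_k)
```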

EMQ(train, test, it=5, e=1e-4)

`train`	a `data.frame` of the labeled set.
`test`	a numeric `matrix` of scores predicted for each test set instance. The first column must contain the positive-class score.
`it`	maximum number of iteration steps (default `5`).
`e`	a numeric value for the stop threshold (default `1e-4`). If the difference between two consecutive steps is less than or equal to `e`, the iterative process stops. If `e` is `NULL`, the number of iterations is determined solely by the `it` parameter.
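How `it` and `e` interact can be illustrated with a minimal sketch of the EM loop. This is not the mlquantify source: `emq_sketch` and `train_prior` are hypothetical names, the training priors are assumed to be supplied directly rather than derived from `train`, and the "difference between two consecutive steps" is assumed to be the largest absolute change in any class prior.

```r
# Hypothetical helper (not part of mlquantify): EM prior re-estimation
# in the style of Saerens et al. (2002).
# train_prior: numeric vector of class proportions in the training set
# scores:      numeric matrix of posterior scores, one column per class
emq_sketch <- function(train_prior, scores, it = 5, e = 1e-4) {
  prior <- train_prior                       # start from the training priors
  for (step in seq_len(it)) {
    # E-step: reweight each column by (current prior / training prior),
    # then renormalize every row so it sums to 1
    w <- sweep(scores, 2, prior / train_prior, `*`)
    post <- w / rowSums(w)
    # M-step: the new prior is the mean adjusted posterior per class
    new_prior <- colMeans(post)
    # Assumed stop rule: largest absolute change across classes
    delta <- max(abs(new_prior - prior))
    prior <- new_prior
    if (!is.null(e) && delta <= e) break     # e = NULL: run all `it` steps
  }
  prior
}

# Toy scores for four instances; column 1 is the positive class
scores <- cbind(c(0.9, 0.8, 0.7, 0.2), c(0.1, 0.2, 0.3, 0.8))
est <- emq_sketch(c(0.5, 0.5), scores)
```

With `e = NULL` the loop always runs exactly `it` steps; with a numeric `e` it may stop earlier once the estimate stabilizes.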

A numeric vector containing the class distribution estimated from the test set.

Saerens, M., Latinne, P., & Decaestecker, C. (2002). Adjusting the outputs of a classifier to new a priori probabilities: a simple procedure. Neural Computation. doi:10.1162/089976602753284446

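library(mlquantify)    # provides EMQ() (aeAegypti is assumed to ship with the package)
library(randomForest)
library(caret)

# Split the data into two folds: one for training, one for testing
cv <- createFolds(aeAegypti$class, 2)
tr <- aeAegypti[cv$Fold1,]
ts <- aeAegypti[cv$Fold2,]

# -- Getting a sample from ts with 80 positive and 20 negative instances --
ts_sample <- rbind(ts[sample(which(ts$class==1),80),],
                   ts[sample(which(ts$class==2),20),])

# Train the scorer and obtain class-probability scores for the test sample
scorer <- randomForest(class~., data=tr, ntree=500)
test.scores <- predict(scorer, ts_sample, type = "prob")

# Estimate the class distribution of the test sample
EMQ(train=tr, test=test.scores)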