HDy_LP: HDy with Laplace smoothing
In mlquantify: Algorithms for Class Distribution Estimation

Description Usage Arguments Value Author(s) References Examples

It computes the class distribution using the HDy algorithm proposed by González-Castro et al. (2013) with Laplace smoothing (Maletzke et al. (2019)).

1	HDy_LP(p.score, n.score, test)

`p.score`	a numeric `vector` of positive scores estimated either from a validation set or from a cross-validation method.
`n.score`	a numeric `vector` of negative scores estimated either from a validation set or from a cross-validation method.
`test`	a numeric `vector` containing the score estimated for the positive class from each test set instance.

A numeric vector containing the class distribution estimated from the test set.

Andre Maletzke <andregustavom@gmail.com>

González-Castro, V., Alaíz-Rodriguez, R., & Alegre, E. (2013). Class distribution estimation based on the Hellinger distance. Information Sciences.<doi.org/10.1016/j.ins.2012.05.028>

Maletzke, A., Reis, D., Cherman, E., & Batista, G. (2019). DyS: a Framework for Mixture Models in Quantification. in Proceedings of the The Thirty-Third AAAI Conference on Artificial Intelligence, ser. AAAI’19, 2019. <doi.org/10.1609/aaai.v33i01.33014552>.

library(randomForest)
library(caret)
cv <- createFolds(aeAegypti$class, 3)
tr <- aeAegypti[cv$Fold1,]
validation <- aeAegypti[cv$Fold2,]
ts <- aeAegypti[cv$Fold3,]

# -- Getting a sample from ts with 80 positive and 20 negative instances --
ts_sample <- rbind(ts[sample(which(ts$class==1),80),],
                   ts[sample(which(ts$class==2),20),])
scorer <- randomForest(class~., data=tr, ntree=500)
scores <- cbind(predict(scorer, validation, type = c("prob")), validation$class)
test.scores <- predict(scorer, ts_sample, type = c("prob"))
HDy_LP(p.score = scores[scores[,3]==1,1], n.score=scores[scores[,3]==2,1],
test=test.scores[,1])