HDy_LP: HDy with Laplace smoothing


View source: R/HDy_LP_method.r

Description

Estimates the class distribution of a test set using the HDy algorithm proposed by González-Castro et al. (2013), with Laplace smoothing (Maletzke et al., 2019).
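To make the idea concrete, the following is a minimal base-R sketch of an HDy-style estimator, not the package's source. It assumes scores lie in [0, 1], that Laplace smoothing means adding one to every histogram bin, and that the final estimate is the median over bin counts from 10 to 110; the names `smoothed_hist`, `hellinger` and `hdy_sketch` are hypothetical.

```r
# Hypothetical helper: histogram of scores in [0, 1] with add-one (Laplace)
# smoothing, so no bin has zero mass and the bins sum to 1.
smoothed_hist <- function(scores, n_bins) {
  breaks <- seq(0, 1, length.out = n_bins + 1)
  counts <- table(cut(scores, breaks = breaks, include.lowest = TRUE))
  as.numeric(counts + 1) / (length(scores) + n_bins)
}

# Hellinger distance between two discrete distributions (constant factor
# omitted, which does not change the minimiser).
hellinger <- function(p, q) sqrt(sum((sqrt(p) - sqrt(q))^2))

# Sketch of the estimator: for each bin count, find the mixture weight alpha
# whose mixture of validation histograms is closest to the test histogram,
# then aggregate the per-bin-count estimates with the median (an assumption).
hdy_sketch <- function(p.score, n.score, test,
                       bin_sizes = seq(10, 110, by = 10)) {
  alphas <- seq(0, 1, by = 0.01)
  best <- sapply(bin_sizes, function(b) {
    hp <- smoothed_hist(p.score, b)
    hn <- smoothed_hist(n.score, b)
    ht <- smoothed_hist(test, b)
    d <- sapply(alphas, function(a) hellinger(a * hp + (1 - a) * hn, ht))
    alphas[which.min(d)]
  })
  pos <- median(best)
  c(positive = pos, negative = 1 - pos)
}

# Synthetic scores for illustration only
set.seed(1)
p <- rbeta(500, 5, 2)                        # scores of positive instances
n <- rbeta(500, 2, 5)                        # scores of negative instances
tst <- c(rbeta(80, 5, 2), rbeta(20, 2, 5))   # test set with 80% positives
hdy_sketch(p, n, tst)
```

The smoothing matters mostly when a bin is empty in one histogram but not in another, which otherwise makes the distance sensitive to a handful of instances.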

Usage

HDy_LP(p.score, n.score, test)

Arguments

p.score

a numeric vector of scores for the positive instances, estimated either on a validation set or by cross-validation.

n.score

a numeric vector of scores for the negative instances, estimated either on a validation set or by cross-validation.

test

a numeric vector containing the positive-class score estimated for each test set instance.
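When no separate validation set is available, the scores can come from cross-validation. The sketch below shows one way to collect out-of-fold scores in base R; the synthetic data, the logistic-regression scorer and the variable names are assumptions made for illustration, not part of the package.

```r
# Synthetic two-class data; class 1 is treated as positive, class 2 as negative
set.seed(42)
n <- 300
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$class <- factor(ifelse(dat$x1 + dat$x2 + rnorm(n) > 0, 1, 2))
dat$y <- as.integer(dat$class == 1)   # 0/1 response for the scorer

# Assign every instance to one of k folds
k <- 5
fold <- sample(rep(seq_len(k), length.out = n))

# Score each instance with a model that never saw it during training
cv_score <- numeric(n)
for (i in seq_len(k)) {
  held_out <- fold == i
  fit <- glm(y ~ x1 + x2, data = dat[!held_out, ], family = binomial)
  cv_score[held_out] <- predict(fit, newdata = dat[held_out, ],
                                type = "response")
}

# Split the positive-class scores by true class
p.score <- cv_score[dat$class == 1]
n.score <- cv_score[dat$class == 2]
```

Both vectors hold scores for the positive class; they differ only in the true class of the instances they come from.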

Value

A numeric vector containing the class distribution estimated from the test set.

Author(s)

Andre Maletzke <andregustavom@gmail.com>

References

González-Castro, V., Alaíz-Rodriguez, R., & Alegre, E. (2013). Class distribution estimation based on the Hellinger distance. Information Sciences. <doi.org/10.1016/j.ins.2012.05.028>

Maletzke, A., Reis, D., Cherman, E., & Batista, G. (2019). DyS: a Framework for Mixture Models in Quantification. In Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence (AAAI'19). <doi.org/10.1609/aaai.v33i01.33014552>

Examples

library(randomForest)
library(caret)
library(mlquantify)  # provides HDy_LP() and the aeAegypti data set

# Split the data into training, validation and test folds
cv <- createFolds(aeAegypti$class, 3)
tr <- aeAegypti[cv$Fold1,]
validation <- aeAegypti[cv$Fold2,]
ts <- aeAegypti[cv$Fold3,]

# Draw a test sample with 80 positive (class 1) and 20 negative (class 2) instances
ts_sample <- rbind(ts[sample(which(ts$class==1),80),],
                   ts[sample(which(ts$class==2),20),])

# Train the scorer and score the validation set; column 3 holds the true class
scorer <- randomForest(class~., data=tr, ntree=500)
scores <- cbind(predict(scorer, validation, type = "prob"), validation$class)

# Score the test sample and estimate its class distribution
test.scores <- predict(scorer, ts_sample, type = "prob")
HDy_LP(p.score = scores[scores[,3]==1,1], n.score = scores[scores[,3]==2,1],
       test = test.scores[,1])

mlquantify documentation built on Jan. 20, 2022, 5:07 p.m.