ebc: Expected Loss of the Bayesian Classifier

Description

The function selects variables by univariate filtering, based on the estimated expected loss of the univariate Bayesian classifier. The statistic relies on the parametric assumption that each variable follows a mixture of Gaussian distributions.
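
As a rough illustration of the quantity being estimated, the sketch below computes the expected loss of a Bayes classifier for a single variable whose two class-conditional distributions are Gaussian, using the same cost and prior layout as the oc argument. The name bayes_loss_sketch, its parameters, and the use of numerical integration are assumptions made for illustration; this is not necessarily how ebc computes its statistic internally.

```r
# Hypothetical helper, not part of the package: expected loss of the Bayes
# rule for one variable with Gaussian class-conditional densities.
# oc[1]: cost of misclassifying a negative, oc[2]: cost of misclassifying a
# positive, oc[3]: share of negatives in the population.
bayes_loss_sketch <- function(mu_neg, sd_neg, mu_pos, sd_pos,
                              oc = c(1, 1, 0.5)) {
  # At each x the Bayes rule picks the decision with the smaller
  # cost-weighted joint density, so the loss integrates the pointwise minimum.
  integrand <- function(x) {
    pmin(oc[1] * oc[3] * dnorm(x, mu_neg, sd_neg),
         oc[2] * (1 - oc[3]) * dnorm(x, mu_pos, sd_pos))
  }
  # Integrate over a range that covers both densities generously.
  lower <- min(mu_neg - 10 * sd_neg, mu_pos - 10 * sd_pos)
  upper <- max(mu_neg + 10 * sd_neg, mu_pos + 10 * sd_pos)
  integrate(integrand, lower, upper)$value
}

# Identical classes cannot be separated; well-separated classes can.
loss_same <- bayes_loss_sketch(0, 1, 0, 1)
loss_far  <- bayes_loss_sketch(0, 1, 8, 1)
```

With equal costs and equal class shares, identical classes give a loss of about 0.5, and the loss shrinks toward 0 as the class means move apart. Under this reading, variables with a smaller value separate the classes better.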

Usage

ebc(class, data, oc = c(1, 1, 0.5), positive = levels(class)[1],
  robust = FALSE, p.val = FALSE, adj.method = "BH")

Arguments

class

a factor vector indicating the class membership of the instances. Must have exactly two levels.

data

a data frame with the variables to filter in columns.

oc

a numeric vector of three elements: oc[1], the cost of misclassifying a negative instance; oc[2], the cost of misclassifying a positive instance; and oc[3], the share of negative instances in the population.

positive

a character object indicating the factor label of the positive class.

robust

a logical indicating whether a robust estimator of the mean and variance of the two classes should be used.

p.val

a logical indicating whether the p-values of ebc values under the null hypothesis that both classes are equal should be calculated. Currently the null distribution is calculated by permutation.
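
The idea of a permutation null distribution can be sketched as follows. The helper perm_pvalue_sketch and the stand-in statistic neg_mean_gap are hypothetical and are not taken from the package; the sketch only assumes that smaller values of the statistic indicate better class separation, as with a loss.

```r
# Hypothetical helper: permutation p-value for a statistic where smaller
# values indicate better class separation.
perm_pvalue_sketch <- function(x, class, stat_fun, n_perm = 999) {
  observed <- stat_fun(x, class)
  # Recompute the statistic after shuffling the class labels, which mimics
  # the null hypothesis that both classes are equal.
  permuted <- replicate(n_perm, stat_fun(x, sample(class)))
  # Share of permutations at least as small as the observed value,
  # counting the observed labelling itself as one permutation.
  (1 + sum(permuted <= observed)) / (1 + n_perm)
}

# Stand-in statistic (NOT ebc): negative absolute difference in class means.
neg_mean_gap <- function(x, class) -abs(diff(tapply(x, class, mean)))

set.seed(1)
class <- factor(rep(c("neg", "pos"), each = 25))
x <- c(rnorm(25, 0, 1), rnorm(25, 3, 1))
p <- perm_pvalue_sketch(x, class, neg_mean_gap)
```

Because each permutation recomputes the statistic, the run time grows linearly with the number of permutations and the number of variables.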

adj.method

a character string indicating the method with which to correct the p-values for multiple testing. See ?p.adjust.
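
For reference, the default "BH" (Benjamini-Hochberg) correction can be tried directly with base R's p.adjust. The p-values below are made up for illustration.

```r
# Made-up raw p-values for three variables
p_raw <- c(0.01, 0.04, 0.03)

# Benjamini-Hochberg adjustment, the default for adj.method
p_adj <- p.adjust(p_raw, method = "BH")
p_adj
# 0.03 0.04 0.04
```

The other accepted method names are listed in p.adjust.methods.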

Value

a list containing three components:

ebc

a numeric vector containing the ebc values for every variable of data.

p.val

the corresponding p-values of ebc. (optional)

p.val.adj

the corresponding adjusted p-values of ebc. (optional)

Examples

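# Misclassification costs (negative, positive) and share of negatives
oc <- c(1, 3, 0.5)
# Two balanced classes of 25 instances each
class <- factor(c(rep(0, 25), rep(1, 25)), labels = c("neg", "pos"))
# One variable, Gaussian within each class
data <- data.frame(var1 = c(rnorm(25, 0, 1/2), rnorm(25, 1, 2)))
# ebc values with permutation p-values
res <- ebc(class, data, oc, positive = "pos", p.val = TRUE)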

SchroederFabian/CVOC documentation built on May 9, 2019, 1:18 p.m.