etc: Expected Loss of the Threshold Classifier.

Description

The function selects variables by univariate filtering based on the estimated loss of the optimal univariate threshold classifier. No parametric assumption about the class-conditional distributions is required.
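To illustrate the quantity involved, the following is a minimal sketch of the estimated loss of a univariate threshold classifier, minimised over all thresholds and both directions. It is not the package's implementation: the helper name threshold_loss is hypothetical, and the way the costs and class share are combined (cost times class share times error rate, summed over both classes) is an assumption based on the description of oc.

```r
# Hypothetical helper, NOT part of CVOC: estimated expected loss of the best
# rule of the form "predict positive if x > t" or "predict positive if x < t".
# Assumed loss: oc[1] * oc[3] * FPR + oc[2] * (1 - oc[3]) * FNR.
threshold_loss <- function(x, is_pos, oc) {
  # Candidate cut points: midpoints between sorted unique values, plus
  # -Inf and Inf so that the two trivial classifiers are also considered.
  ux <- sort(unique(x))
  cuts <- c(-Inf, (head(ux, -1) + tail(ux, -1)) / 2, Inf)

  loss_at <- function(t, upper) {
    pred_pos <- if (upper) x > t else x < t
    fpr <- mean(pred_pos[!is_pos])   # negatives classified as positive
    fnr <- mean(!pred_pos[is_pos])   # positives classified as negative
    oc[1] * oc[3] * fpr + oc[2] * (1 - oc[3]) * fnr
  }

  # Smallest loss over all cut points and both directions
  min(sapply(cuts, loss_at, upper = TRUE),
      sapply(cuts, loss_at, upper = FALSE))
}

# Perfectly separable toy data: some threshold classifies every instance correctly
x_sep <- c(1, 2, 3, 4, 5, 6)
pos_sep <- c(FALSE, FALSE, FALSE, TRUE, TRUE, TRUE)
loss_sep <- threshold_loss(x_sep, pos_sep, oc = c(1, 3, 0.5))

# Interleaved toy data: no threshold separates the classes
x_mix <- c(1, 2, 3, 4)
pos_mix <- c(FALSE, TRUE, FALSE, TRUE)
loss_mix <- threshold_loss(x_mix, pos_mix, oc = c(1, 1, 0.5))
```

A lower value indicates a variable that separates the classes better under the given costs, which is the sense in which such a loss can serve as a filter criterion.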

Usage

etc(class, data, oc, positive = levels(class)[1], p.val = TRUE,
  plot = FALSE, adj.method = "BH")

Arguments

class

a factor vector indicating the class membership of the instances. Must have exactly two levels.

data

a data frame with variables in columns.

oc

a numeric vector of three elements: oc[1], the cost of misclassifying a negative instance; oc[2], the cost of misclassifying a positive instance; and oc[3], the share of negative instances in the population.

positive

a character string indicating the factor label of the positive class. Defaults to the first level of class.

p.val

a logical indicating whether to calculate the p-values of the etc values under the null hypothesis that both classes are equal. The exact null distribution is calculated by means of a recursive algorithm.

plot

a logical. If TRUE, a plot of the null distribution is generated.

adj.method

a character string indicating the method used to correct the p-values for multiple testing. See ?p.adjust.
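As a small illustration of the adj.method argument, the sketch below applies p.adjust from base R (package stats) to a vector of raw p-values. The p-values are made up for illustration and are not output of etc; whether etc calls p.adjust in exactly this way internally is an assumption.

```r
# Made-up raw p-values for five variables, for illustration only
p_raw <- c(0.001, 0.008, 0.039, 0.041, 0.60)

# Benjamini-Hochberg adjustment, the default adj.method = "BH"
p_bh <- p.adjust(p_raw, method = "BH")

# A more conservative alternative
p_bonf <- p.adjust(p_raw, method = "bonferroni")

# All method names accepted by p.adjust
p.adjust.methods
```

Adjusted p-values are never smaller than the raw ones, so a variable that is significant after adjustment is also significant before it.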

Value

a list containing three components:

etc

a numeric vector containing the etc values for every variable of data.

p.val

the corresponding p-values of etc. (optional)

p.val.adj

the corresponding adjusted p-values of etc. (optional)

Examples

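library(CVOC)
# Misclassification costs (negative, positive) and share of negative instances
oc <- c(1, 3, 0.5)
# 25 negative and 25 positive instances
class <- factor(c(rep(0, 25), rep(1, 25)), labels = c("neg", "pos"))
# One variable whose distribution differs between the two classes
data <- data.frame(var1 = c(rnorm(25, 0, 1/2), rnorm(25, 1, 2)))
res <- etc(class, data, oc, positive = "pos", p.val = TRUE)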

SchroederFabian/CVOC documentation built on May 9, 2019, 1:18 p.m.