diffExamples: The number of different (unique) examples in a dataset
In RatingScaleReduction: Rating Scale Reduction Procedure


View source: R/diffExamples.R

Datasets often contain replicated examples. In the extreme case, a single example is replicated n times, where n is the total number of examples, so the dataset contains no other examples. Such a situation would distort the computations and should be detected early. Ideally, no example is replicated, but if the replication rate is small, we can proceed to computing the AUC.
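The check described above can be sketched in base R. This is only an illustration of the idea, not the package's own implementation; the helper name `count_examples` and the toy data are hypothetical.

```r
# Hypothetical helper: count total, distinct, and duplicated rows
# of a matrix or data.frame using base R only.
count_examples <- function(attribute) {
  attribute <- as.data.frame(attribute)
  total <- nrow(attribute)
  # duplicated() flags every row that repeats an earlier row
  dup <- sum(duplicated(attribute))
  list(total = total, distinct = total - dup, duplicated = dup)
}

# Toy data: rows 1 and 3 are identical
toy <- data.frame(a1 = c(1, 2, 1, 3), a2 = c(5, 6, 5, 7))
counts <- count_examples(toy)

# Replication rate: share of rows that repeat an earlier row
counts$duplicated / counts$total
```

A row counts as a duplicate here only if every attribute value matches an earlier row, which is one plausible reading of "replicated example".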

diffExamples(attribute)

`attribute`	a matrix or data.frame containing attributes

`total.examples`	the number of examples in the dataset
`diff.examples`	the number of different (unique) examples in the dataset
`dup.examples`	the number of duplicate examples in the dataset

Waldemar W. Koczkodaj, Feng Li, Alicja Wolny-Dominiak

# Load the package (per the startup messages, this also loads pROC,
# which provides the aSAH dataset)
library(RatingScaleReduction)

# Create the matrix of attributes; every column must be numeric
data(aSAH)
is.numeric(aSAH)  # a data.frame is not a numeric vector

attribute <- data.frame(
  as.numeric(aSAH$gender), as.numeric(aSAH$age), as.numeric(aSAH$wfns),
  as.numeric(aSAH$s100b), as.numeric(aSAH$ndka)
)
colnames(attribute) <- c("a1", "a2", "a3", "a4", "a5")

# Show the number of different examples
diffExamples(attribute)

Loading required package: pROC
Type 'citation("pROC")' for a citation.

Attaching package: 'pROC'

The following objects are masked from 'package:stats':

    cov, smooth, var

Loading required package: ggplot2
[1] FALSE
$total.examples
[1] 113

$diff.examples
[1] 113

$dup.examples
[1] 0
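
The returned counts can be turned into a replication rate before deciding whether to go on to AUC. The sketch below does not call the package: `result` is a hand-built stand-in with the values printed above, and the cut-off `max_rate` is a hypothetical choice, not a value recommended by the authors.

```r
# Stand-in for the value returned by diffExamples(attribute);
# the numbers are copied from the output printed above.
result <- list(total.examples = 113, diff.examples = 113, dup.examples = 0)

# Share of examples that are duplicates
dup_rate <- result$dup.examples / result$total.examples

# Hypothetical cut-off; choose one that suits the data at hand
max_rate <- 0.05
if (dup_rate <= max_rate) {
  message("Replication rate is small; proceeding to AUC is reasonable.")
} else {
  warning("Many replicated examples; inspect the data first.")
}
```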