diffExamples: The number of different (unique) examples in a dataset

Description

Datasets often contain replicated examples. In the extreme case, a single example is replicated n times, where n is the total number of examples, so the dataset contains no other examples. Such a situation would distort computations and should be detected early. Ideally, no example should be replicated, but if the duplication rate is small, we can proceed to computing AUC.

Usage

diffExamples(attribute)

Arguments

attribute

a matrix or data.frame containing attributes

Value

total.examples

the total number of examples in the data

diff.examples

the number of different (unique) examples in the data

dup.examples

the number of duplicate examples in the data
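
To make the three returned counts concrete, here is a minimal base R sketch on a toy data frame. It is one plausible reading of the values listed above, not the package's own implementation; the names `toy`, `total`, `diff`, `dup` and `counts` are made up for illustration, and it assumes a duplicate is any row that repeats an earlier row.

```r
# Toy data (hypothetical): rows 1 and 3 are identical
toy <- data.frame(a1 = c(1, 2, 1, 3), a2 = c(5, 6, 5, 7))

total <- nrow(toy)             # all examples
diff  <- nrow(unique(toy))     # distinct rows
dup   <- sum(duplicated(toy))  # rows that repeat an earlier row

# Same element names as in the Value section
counts <- list(total.examples = total,
               diff.examples  = diff,
               dup.examples   = dup)
print(counts)
```

Under this reading the total always equals the sum of the other two counts, which agrees with the example output (113 = 113 + 0).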

Author(s)

Waldemar W. Koczkodaj, Feng Li, Alicja Wolny-Dominiak

Examples

# Create the data frame of attributes; every column must be numeric,
# so each one is converted with as.numeric()
library(RatingScaleReduction)
data(aSAH)
is.numeric(aSAH)  # FALSE: aSAH is a data.frame, not a numeric vector

attribute <- data.frame(as.numeric(aSAH$gender),
  as.numeric(aSAH$age), as.numeric(aSAH$wfns),
  as.numeric(aSAH$s100b), as.numeric(aSAH$ndka))
colnames(attribute) <- c("a1", "a2", "a3", "a4", "a5")

# Show the number of different examples
diffExamples(attribute)

Example output

Loading required package: pROC
Type 'citation("pROC")' for a citation.

Attaching package: 'pROC'

The following objects are masked from 'package:stats':

    cov, smooth, var

Loading required package: ggplot2
[1] FALSE
$total.examples
[1] 113

$diff.examples
[1] 113

$dup.examples
[1] 0
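
The description says that a small duplication rate is acceptable before computing AUC. The following sketch shows one way such a check could look. The list `res` simply copies the values printed above, and the tolerance `max.rate` is a hypothetical threshold chosen for illustration, not something taken from the package.

```r
# Values copied from the example output above
res <- list(total.examples = 113, diff.examples = 113, dup.examples = 0)

# Share of examples that duplicate another example
dup.rate <- res$dup.examples / res$total.examples

# Hypothetical tolerance (an assumption, not defined by the package)
max.rate <- 0.05
proceed <- dup.rate <= max.rate
print(proceed)
```

For the aSAH attributes there are no duplicates, so the rate is zero and any non-negative tolerance would let the analysis continue.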

RatingScaleReduction documentation built on Jan. 21, 2021, 5:06 p.m.