rsr: Rating scale reduction

Description Usage Arguments Value Author(s) References Examples

View source: R/rsr.R

Description

This package implements a rather sophisticated method published in (Koczkodaj et al., 2017) In essence, it is a stepwise method fro maximizing the area under the area (AUC) of receiver operating characteristic (ROC). In this description, data mining terminology will be used:

The implemented algorithm (when reduced to its minimum) comes to using a loop for all attributes (with the class excluded) to compute AUC. Subsequently, attributes are sorted in the descending order by AUC. The attribute with the largest AUC is added to a subset of all attributes (evidently, it cannot be empty since it is supposed to be the minimum subset S of all attributes with the maximum AUC). We keep adding the next in line (according to AUC) attribute to the subset S checking AUC. If it decreases, we stop the procedure. The above procedure can be described by the following algorithm.

Algorithm:

  1. compute AUC of all attributes excluding class

  2. sort attributes by their AUC in the ascending order

  3. select the attribute with the largest AUC to subset S

  4. select the next attribute A with the largest AUC to subset S

  5. if the AUC of the subset S is larger that AUC of the former AUC then go to 3

There are a lot of checking (e.g., if the dataset is not empty or full of replications) involved.

Usage

1
rsr(attribute, D, plotRSR = FALSE, method=c('Stop1Max', 'StopGlobalMax'))

Arguments

attribute

a matrix or data.frame containing attributes

D

the decision vector

plotRSR

If TRUE the ROC curve is ploted

method

the Stop reduction criteria: First Max of AUC or Global Max of AUC, default: 'Stop1Max'

Value

rsr.auc

total AUC of atrtibutes

rsr.label

attribute labels

summary

a summary table

Author(s)

Waldemar W. Koczkodaj, Alicja Wolny-Dominiak

References

1. W.W. Koczkodaj, T. Kakiashvili, A. Szymanska, J. Montero-Marin, R. Araya, J. Garcia-Campayo, K. Rutkowski, D. Strzalka, How to reduce the number of rating scale items without predictability loss? Scientometrics, 909(2):581-593(open access), 2017
https://link.springer.com/article/10.1007/s11192-017-2283-4

2. T. Kakiashvili, W. W. Koczkodaj, and M. Woodbury-Smith. Improving the medical scale predictability by the pairwise comparisons method: Evidence from a clinical data study. Computer Methods and Programs in Biomedicine, 105(3), 2012
https://www.sciencedirect.com/science/article/abs/pii/S0169260711002586

3. X. Robin, N. Turck, A. Hainard, N. Tiberti, F. Lisacek, J.-C. Sanchez, and M. Muller. proc: an opensource package for r and s+ to analyze and compare roc curves. BMC Bioinformatics, 2011
https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-12-77

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
#creating the matrix of attributes and the decision vector
#must be as.numeric()
data(aSAH)
attach(aSAH)
is.numeric(aSAH)

attribute <-data.frame(as.numeric(gender), 
as.numeric(age), as.numeric(wfns), as.numeric(s100b), as.numeric(ndka))
colnames(attribute) <-c("a1", "a2", "a3", "a4", "a5")
decision <-as.numeric(outcome)

#rating scale reduction procedure
rsred <-rsr(attribute, decision, plotRSR=TRUE)
rsred

Example output

Loading required package: pROC
Type 'citation("pROC")' for a citation.

Attaching package: 'pROC'

The following objects are masked from 'package:stats':

    cov, smooth, var

Loading required package: ggplot2
[1] FALSE
Warning message:
In if (method == "Stop1Max") { :
  the condition has length > 1 and only the first element will be used
$rsr.auc
[1] 0.8236789 0.8241870

$rsr.label
[1] "a3" "a4"

$summary
   AUC one variable AUC running total
a3        0.8236789         0.8236789
a4        0.7313686         0.8241870

[[4]]

attr(,"class")
[1] "RatingScaleReduction"

RatingScaleReduction documentation built on Jan. 21, 2021, 5:06 p.m.