mergeInfreqLevelsByCumsumprop: Merge infrequent levels by setting the threshold of the...


View source: R/cumsumprop.R

Description

Merge infrequent levels by setting a threshold on the cumulative proportion of level counts, that is, the cumulative sum divided by the total sum (a.k.a. cumsumprop).
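
To make the term concrete, the sketch below computes the cumulative sum proportion for a small vector using base R only. It is an illustration and does not call or reproduce the ribiosUtils code; the variable names (`classes`, `counts`, `csp`) are made up for this example.

```r
## Illustration in base R; not the ribiosUtils source
classes <- c(rep("A", 4), rep("B", 3), rep("C", 2), "D")
## Count each level and order the counts from most to least frequent
counts <- sort(table(classes), decreasing = TRUE)
## Cumulative sum divided by the total sum
csp <- cumsum(counts) / sum(counts)
csp
##   A   B   C   D
## 0.4 0.7 0.9 1.0
```

Read from left to right, each value is the share of all instances covered by that level and every more frequent level.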

Usage

mergeInfreqLevelsByCumsumprop(
  classes,
  thr = 0.9,
  mergedLevel = "others",
  returnFactor = TRUE
)

Arguments

classes

Character strings or factor.

thr

Numeric, between 0 and 1; the cumulative proportion used to define frequent levels. Default: 0.9, namely the most frequent levels that together make up 90% of all instances are treated as frequent.

mergedLevel

Character, how the merged level should be named.

returnFactor

Logical, whether the returned value should be coerced into a factor.

Value

A character vector or a factor of the same length as the input classes, but with potentially fewer levels.

Note

If only one class is deemed infrequent, its label is left unchanged.
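
The merging rule, including the single-level exception in the note, can be sketched in base R as follows. This is a minimal sketch under stated assumptions, not the package's implementation: `mergeRareSketch` is a hypothetical helper, it always returns a character vector, and the rule it uses (keep levels up to and including the first one whose cumulative proportion reaches `thr`) is inferred from the comments in the examples on this page.

```r
## Hypothetical helper for illustration; not the ribiosUtils source
mergeRareSketch <- function(classes, thr = 0.9, mergedLevel = "others") {
  classes <- as.character(classes)
  counts <- sort(table(classes), decreasing = TRUE)
  csp <- cumsum(counts) / sum(counts)
  ## Assumption: levels up to and including the first one whose
  ## cumulative proportion reaches thr count as frequent
  nKeep <- which(csp >= thr)[1]
  rare <- names(csp)[seq_along(csp) > nKeep]
  ## A single infrequent level keeps its own label
  if (length(rare) > 1) {
    classes[classes %in% rare] <- mergedLevel
  }
  classes
}

x <- c(rep("A", 4), rep("B", 3), rep("C", 2), "D")
table(mergeRareSketch(x, 0.9)) ## D is the only infrequent level: unchanged
table(mergeRareSketch(x, 0.5)) ## C and D are merged into "others"
```

Relabelling only when more than one level is infrequent avoids renaming a single level to `mergedLevel`, which would lose its name without reducing the number of levels.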

Examples

set.seed(1887)
myVals <- sample(c(rep("A", 4), rep("B", 3), rep("C", 2), "D"))
## In the example below, A, B, and C together make up 90% of the total,
## so D is infrequent. Since it is the only infrequent level, it is not merged
mergeInfreqLevelsByCumsumprop(myVals, 0.9)
mergeInfreqLevelsByCumsumprop(myVals, 0.9, returnFactor=FALSE) ## return characters
## In the example below, A and B make up only 70% of the total, which is
## below the threshold of 0.8, while A, B, and C make up 90%. A, B, and C are
## therefore all frequent and D is infrequent.
## Following the logic above, no merging happens
mergeInfreqLevelsByCumsumprop(myVals, 0.8)
mergeInfreqLevelsByCumsumprop(myVals, 0.7) ## A and B are left, C and D are merged
mergeInfreqLevelsByCumsumprop(myVals, 0.5) ## A and B are left, C and D are merged
mergeInfreqLevelsByCumsumprop(myVals, 0.4) ## A is left
mergeInfreqLevelsByCumsumprop(myVals, 0.3) ## A is left

ribiosUtils documentation built on March 13, 2020, 2:54 a.m.