MergeLevels: Combines least-frequently occurring levels of a factor into...


Description

Takes a nominal variable and merges the least frequently occurring levels into an Other category, leaving at most max.levels distinct categories (including Other). For example, if there are 15 levels in the data and we request max.levels = 10, then the 9 most frequent levels are retained and the 6 least frequent levels are merged into Other.
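
The rule described above can be sketched in base R. This is an illustration only, not the package's implementation: the helper name merge_levels_sketch is hypothetical, and both the handling of ties between equally frequent levels and the behaviour when the number of levels equals max.levels are assumptions.

```r
# Hypothetical helper illustrating the merging rule; not the Causata source.
merge_levels_sketch <- function(f, max.levels, other.name = "Other") {
  stopifnot(is.factor(f), max.levels >= 2)
  # Assumption: nothing is merged when the factor already fits in max.levels.
  if (nlevels(f) <= max.levels) return(f)
  # Count occurrences and order levels from most to least frequent.
  counts <- sort(table(f), decreasing = TRUE)
  # Keep the (max.levels - 1) most frequent levels; the rest become other.name.
  keep <- names(counts)[seq_len(max.levels - 1)]
  x <- as.character(f)
  x[!is.na(x) & !(x %in% keep)] <- other.name
  factor(x, levels = c(keep, other.name))
}

# 15 levels: l1 occurs 15 times, l2 occurs 14 times, ..., l15 occurs once.
f <- factor(rep(paste0("l", 1:15), times = 15:1))
g <- merge_levels_sketch(f, max.levels = 10)
nlevels(g)  # 10: the 9 most frequent levels plus "Other"
```

In this example the six least frequent levels (l10 to l15) are the ones relabelled as Other.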

Usage

## S3 method for class 'factor'
MergeLevels(this, max.levels, other.name="Other", ...)

Arguments

this

A factor, i.e. a nominal variable.

max.levels

The maximum number of levels in the result, including Other. For example, if we request 10 levels, the result has 9 of the original levels plus Other. max.levels must be at least 2. If max.levels is greater than the number of levels in the data, no merging is done.

other.name

The name of the new level to which the merged levels are assigned. Defaults to "Other".

...

Unused extra arguments.

Value

Returns a new factor in which the least frequent levels have been merged.

Author(s)

Jason McFall, Justin Hemann <support@causata.com>

Examples

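library(Causata)  # provides MergeLevels
library(stringr)
# Build a factor with 8 levels: a, b and c occur three times each; d to h occur once
f <- factor(str_split("a a a b b b c c c d e f g h", " ")[[1]])
# d, e, f, g and h are merged into Other
MergeLevels(f, max.levels=4)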
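
To see why those particular levels end up in Other, the level frequencies of the example factor can be inspected with base R. This sketch does not call MergeLevels; the variable names counts, kept and merged are illustrative only.

```r
# Rebuild the example factor with base R (strsplit instead of stringr).
f <- factor(strsplit("a a a b b b c c c d e f g h", " ")[[1]])
# Frequency of each level, most frequent first.
counts <- sort(table(f), decreasing = TRUE)
print(counts)
# With max.levels = 4, three levels are kept and the rest are merged.
kept <- names(counts)[1:3]
merged <- setdiff(levels(f), kept)
```

Here a, b and c each occur three times, so they are the three levels retained, and the five single-occurrence levels are the candidates for merging.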

Causata documentation built on May 2, 2019, 3:26 a.m.
