factor.remove.thin.levels: Removes thin levels from factor variables in a data frame

Description

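Removes thin (rarely occurring) levels from factor variables in a data frame. A level is thin when it occurs less often than thresh allows; thin levels are replaced by tag.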

Usage

factor.remove.thin.levels(data, vars.list, thresh = 0.001, tag = NA)

Arguments

data

A data frame

vars.list

A list of pairs (variable.name,variable.type) such as those produced by allvariables.manual.review

thresh

The minimum number of occurrences (or, if < 1, the minimum proportion of observations) a level must have to be kept

tag

(defaults to NA) A value that will replace the thin levels, i.e. the levels that occur less often than thresh allows
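
To make the roles of thresh and tag concrete, here is a minimal sketch for a single factor. It is not the package's implementation: the helper name replace_thin_levels is hypothetical, and the choices to measure the proportion against the non-missing observations and to keep a level whose count exactly equals the threshold are assumptions.

```r
# Hypothetical helper, NOT part of the package: one possible reading of
# `thresh` and `tag`, applied to a single factor.
replace_thin_levels <- function(f, thresh = 0.001, tag = NA) {
  counts <- table(f)  # occurrences per level, NA values excluded
  # Assumption: thresh < 1 is a proportion of the non-missing observations,
  # otherwise it is an absolute count.
  min.count <- if (thresh < 1) thresh * sum(counts) else thresh
  # Assumption: a level with exactly min.count occurrences is kept.
  thin <- names(counts)[counts < min.count]
  x <- as.character(f)
  x[x %in% thin] <- tag  # thin levels become `tag` (or NA)
  factor(x)              # rebuild the factor so thin levels disappear
}

f <- factor(c(rep("a", 6), rep("b", 3), "c"))
g <- replace_thin_levels(f, thresh = 0.2, tag = "other")
table(g)
```

In this toy input, "c" occurs once out of 10 observations, which is below the 0.2 proportion, so it is the only level replaced by "other". With tag = NA the same observations would become missing values instead.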

Value

A modified version of data

Examples

set.seed(1)
X <- data.frame(a = factor(sample(1:5,100,TRUE)),
                b = factor(sample(letters[1:5],100,TRUE)))
table(X$a)
table(X$b)
Y <- factor.remove.thin.levels(X, thresh = 0.19, tag = "unk")
table(Y$a)
table(Y$b)

ahdxb/data.exploration documentation built on May 11, 2019, 11:31 p.m.