scale_onehot: One-Hot Binary Matrix Scaling

Description

One-Hot Binary Matrix Scaling

Usage

scale_onehot(data)

Arguments

data

a one-hot binary matrix

Details

The scaling technique is taken from Outlier Analysis (Aggarwal, 2017), section 8.3. For each column j in the binary transformed matrix, a normalization factor is defined as sqrt(n_i * p_j * (1 - p_j)), where n_i is the number of distinct categories in the reference variable i from the raw data set and p_j is the proportion of records taking the value 1 for the jth variable.
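
The factor described above can be illustrated with a small base-R sketch. This is not the package's implementation: the helper name scale_onehot_sketch and its n_categories argument are hypothetical, and the sketch assumes that each column is divided by its normalization factor.

```r
# Hypothetical helper, not part of onehotter: scales a one-hot matrix
# column by column using the factor sqrt(n_i * p_j * (1 - p_j)).
scale_onehot_sketch <- function(x, n_categories) {
  # x: numeric 0/1 matrix
  # n_categories: for each column, the number of distinct categories of
  #   the raw variable that the column was derived from (n_i)
  stopifnot(is.matrix(x), length(n_categories) == ncol(x))
  p <- colMeans(x)                              # p_j: share of rows equal to 1
  factors <- sqrt(n_categories * p * (1 - p))   # one factor per column
  # Assumption: each column is divided by its factor.
  sweep(x, 2, factors, "/")
}

# Toy data: a 3-level variable (columns a, b, c) and a 2-level one (y, n).
x <- cbind(a = c(1, 0, 0, 1), b = c(0, 1, 0, 0), c = c(0, 0, 1, 0),
           y = c(1, 1, 0, 0), n = c(0, 0, 1, 1))
scaled <- scale_onehot_sketch(x, n_categories = c(3, 3, 3, 2, 2))
```

In this sketch a column that is constant (all 0 or all 1) has a factor of 0, so it would need to be dropped or handled separately before dividing.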

Value

A scaled numerical matrix is returned.

Examples

scale_onehot(data = onehot(mydata))
onehot(data = mydata, scale = TRUE) # alternative when working with raw data

dannymorris/onehotter documentation built on May 15, 2019, 9:08 p.m.