Categorical Data Matrix to One-Hot Binary Matrix
data: a categorical data matrix.
minus_level: logical (default FALSE); if TRUE, create binary encodings for m-1 levels of a variable with m original levels.
clarify_levels: logical (default TRUE); if TRUE, disambiguate the resulting column names.
scale: logical; if FALSE, the binary matrix is returned. If TRUE, normalization (see details) is applied to each binary-transformed variable.
The normalization technique is taken from Outlier Analysis (Aggarwal, 2017), section 8.3. For each column j in the binary-transformed matrix, a normalization factor is defined as sqrt(ni * pj * (1 - pj)), where ni is the number of distinct categories in the reference variable from the raw data set and pj is the proportion of records taking the value 1 for the jth variable.
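The factor can be worked through by hand on a small example. The following is a minimal base R sketch, not the package's own implementation: the toy `colour` variable is hypothetical, `model.matrix` stands in for the one-hot step, and dividing each column by its factor is an assumption about how the factor is applied.

# Hypothetical toy data, for illustration only.
df <- data.frame(colour = factor(c("red", "blue", "red", "green")))

# One-hot encode with base R: one 0/1 column per level, no intercept.
onehot <- model.matrix(~ colour - 1, data = df)

n_i <- nlevels(df$colour)   # number of distinct categories in the raw variable
p_j <- colMeans(onehot)     # proportion of 1s in each binary column

# Normalization factor for each binary column j.
norm_factor <- sqrt(n_i * p_j * (1 - p_j))

# Assumption: each binary column is divided by its factor.
scaled <- sweep(onehot, 2, norm_factor, "/")

Under this reading, rare categories (small pj) get a smaller factor, so their indicator columns are scaled up relative to common ones. A column in which every record takes the same value has pj equal to 0 or 1, which makes the factor zero; such columns would need separate handling.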
A transformed one-hot encoded matrix is returned.
# Build a small data frame of two categorical variables
df <- data.frame(gender = sample(c("male", "female"), 25, TRUE),
                 age = sample(c("young", "old", "unknown"), 25, TRUE))
# One-hot encode with the default arguments
make_onehot(data = df)