Data dummification is also known as one-hot encoding or feature binarization. It turns each category into a distinct column with binary (numeric) values.
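As a rough illustration of the idea, the following base R sketch encodes one discrete feature by hand. The toy data, the `colour_` column naming, and the variable names are assumptions made for this example; this is not the package's implementation.

```r
# Hypothetical toy data with a single discrete feature.
df <- data.frame(colour = factor(c("red", "green", "red", "blue")))

# One indicator column per level: 1 where the row has that level, else 0.
lvls <- levels(df$colour)
encoded <- sapply(lvls, function(l) as.integer(df$colour == l))

# Column naming here is an assumption chosen for readability.
colnames(encoded) <- paste0("colour_", lvls)
encoded
```

Each row has exactly one 1 across the indicator columns, which is what distinguishes this encoding from a single integer code per category.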
dummify(data, maxcat = 50L, select = NULL)
maxcat: maximum categories allowed for each discrete feature. Default is 50.
select: names of selected features to be dummified. Default is NULL.
Continuous features will be ignored if added in select.
Features listed in select will be ignored if their categories exceed maxcat.
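The category limit can be pictured as a filter over discrete features. The base R sketch below assumes that categories are counted as distinct values and that factor and character columns count as discrete; the toy data and the names `is_discrete`, `n_cat`, and `eligible` are hypothetical, and the package may count differently (for example in how it treats missing values).

```r
# Hypothetical toy data: two discrete features and one continuous feature.
df <- data.frame(
  size   = factor(c("S", "M", "L", "M")),
  id     = factor(c("a", "b", "c", "d")),
  weight = c(1.2, 3.4, 5.6, 7.8)
)
maxcat <- 3L

# Treat factor and character columns as discrete (an assumption).
is_discrete <- vapply(df, function(x) is.factor(x) || is.character(x), logical(1))

# Count distinct values per discrete feature.
n_cat <- vapply(df[is_discrete], function(x) length(unique(x)), integer(1))

# Keep only features whose category count does not exceed the limit.
eligible <- names(n_cat)[n_cat <= maxcat]
eligible
```

Here `id` has four distinct values and is skipped, `weight` is continuous and is skipped, and only `size` remains eligible.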
Dummified dataset (discrete features only), preserving the original features. However, column order might differ from the input.
This differs from model.matrix, which aims to create a full-rank matrix for regression-like use cases. If you intend to create a design matrix, use model.matrix instead.
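The full-rank behaviour of model.matrix can be seen with the built-in iris data. This is a small sketch using base R only; the variable name `mm` is chosen for this example.

```r
# Default treatment contrasts: an intercept plus k - 1 indicator columns.
mm <- model.matrix(~ Species, data = iris)
colnames(mm)

# Species has three levels, but only two of them get their own column;
# the first level (setosa) is absorbed into the intercept as the reference.
nlevels(iris$Species)
```

A one-hot encoding would instead keep one column for each of the three levels, which is redundant alongside an intercept and therefore not full rank.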
## Dummify iris dataset
str(dummify(iris))

## Dummify diamonds dataset ignoring features with more than 5 categories
data("diamonds", package = "ggplot2")
str(dummify(diamonds, maxcat = 5))
str(dummify(diamonds, select = c("cut", "color")))