encode_dummy: Encode a given factor variable using dummy variables


View source: R/encode_dummy.R

Description

Transforms the original design matrix using a dummy variable encoding.

Usage

encode_dummy(
  X,
  fact,
  keep_factor = FALSE,
  encoding_only = FALSE,
  use_reference = TRUE,
  reference_value = 0
)

Arguments

X

The data.frame/data.table to transform.

fact

The factor variable to encode by - either a positive integer specifying the column number, or the name of the column.

keep_factor

Whether to keep the original factor column (defaults to **FALSE**).

encoding_only

Whether to return only the new columns instead of the full transformed dataset. Defaults to FALSE, which returns the full dataset.

use_reference

Whether to include a reference level (i.e. whether the new encoding contains an **intercept-like** constant term). Defaults to **TRUE**.

reference_value

What the reference value should be if **use_reference** is set to **TRUE**. Defaults to 0.

Details

Applies the basic dummy variable encoding, with the reference class level set to 0. The reference class is always the first class observed.
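
To make the idea concrete, here is a minimal base-R sketch of a dummy encoding whose reference class is the first class observed. It is an illustration only and not the package's implementation: the toy vector `x` and the names `f` and `dummies` are hypothetical, and it assumes the common reading in which rows belonging to the reference class are encoded as all zeros.

```r
# Illustrative sketch only; NOT the code used by encode_dummy().
# Toy categorical vector (hypothetical data).
x <- c("b", "a", "c", "a", "b")

# Order the levels by first appearance, so the first observed class ("b")
# becomes the reference class.
f <- factor(x, levels = unique(x))

# Build one 0/1 indicator column per non-reference level.
# Rows that belong to the reference class are all zeros (assumed convention).
dummies <- sapply(levels(f)[-1], function(lv) as.integer(f == lv))

dummies
```

Each row has at most one 1, and a factor with k classes yields k - 1 indicator columns under this convention.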

Value

A new data.table containing the new columns and, if keep_factor is TRUE, the original factor column.

Examples

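library(categoryEncodings)

# Five numeric columns plus one categorical column drawn from 10 random letters
design_mat <- cbind( data.frame( matrix(rnorm(5*100), ncol = 5) ),
                     sample( sample(letters, 10), 100, replace = TRUE)
                     )
colnames(design_mat)[6] <- "factor_var"

# Encode factor_var with dummy variables, dropping the original column
encode_dummy(X = design_mat, fact = "factor_var", keep_factor = FALSE)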

JSzitas/categoryEncodings documentation built on Sept. 29, 2021, 9:54 a.m.