ohe_commas: One Hot Encoding for a Vector with Comma Separated Values

ohe_commas    R Documentation

One Hot Encoding for a Vector with Comma Separated Values

Description

This function applies one hot encoding to one or more variables whose values hold several elements separated by a delimiter (a comma by default).

Usage

ohe_commas(df, ..., sep = ",", noval = "NoVal", remove = FALSE)

Arguments

df

Dataframe. May contain one or more columns with comma separated values, which will be split into one hot encoded columns.

...

Variables. Which variables to split into new columns?

sep

Character. Which regular expression separates the elements?

noval

Character. Text to use when an element has no value.

remove

Boolean. Remove original variables?
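Because sep is read as a regular expression, separators that are regex metacharacters need escaping, which is why the example for z uses "\\+". The following is a small base R illustration of that point using strsplit(); it is not the lares implementation, and the vector z here is just sample data.

```r
# Base R illustration only; this is not how lares is implemented.
z <- c("AA+BB+AA", "AA", "BB+AA")

# "+" is a regex quantifier, so a literal plus sign must be escaped
parts <- strsplit(z, "\\+")

# In base R, fixed = TRUE is an alternative when no regex features are needed
parts_fixed <- strsplit(z, "+", fixed = TRUE)

# Both approaches split on the literal "+"
identical(parts, parts_fixed)
```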

Value

data.frame in which all features are numerical by nature or transformed with one hot encoding.
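To make the transformation concrete, here is a minimal base R sketch of one hot encoding a single delimited vector. The helper one_hot_sketch() is hypothetical and written only for illustration; the column names and column types that ohe_commas() actually returns may differ.

```r
# Hypothetical helper for illustration; not part of lares.
one_hot_sketch <- function(x, sep = ",", noval = "NoVal") {
  x <- as.character(x)
  x[is.na(x)] <- noval                        # label missing entries
  tokens <- lapply(strsplit(x, sep), trimws)  # split, then trim whitespace
  lvls <- sort(unique(unlist(tokens)))        # distinct elements found
  # One column per distinct element, one row per input value
  out <- matrix(FALSE, nrow = length(x), ncol = length(lvls),
                dimnames = list(NULL, lvls))
  for (i in seq_along(tokens)) {
    out[i, tokens[[i]]] <- TRUE               # flag the elements present in row i
  }
  as.data.frame(out)
}

res <- one_hot_sketch(c("AA, D", "AA,B", "B,  D", "A,D,B", NA))
res
```

Each input value becomes one row, and each distinct element (plus the noval label for missing input) becomes one indicator column.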

See Also

Other Data Wrangling: balance_data(), categ_reducer(), cleanText(), date_cuts(), date_feats(), file_name(), formatHTML(), holidays(), impute(), left(), normalize(), num_abbr(), ohse(), quants(), removenacols(), replaceall(), replacefactor(), textFeats(), textTokenizer(), vector2text(), year_month(), zerovar()

Other One Hot Encoding: date_feats(), holidays(), ohse()

Examples

df <- data.frame(
  id = c(1:5),
  x = c("AA, D", "AA,B", "B,  D", "A,D,B", NA),
  z = c("AA+BB+AA", "AA", "BB,  AA", NA, "BB+AA")
)
ohe_commas(df, x, remove = TRUE)
ohe_commas(df, z, sep = "\\+")
ohe_commas(df, x, z)

laresbernardo/lares documentation built on Oct. 23, 2024, 12:05 p.m.