View source: R/featureEngineering.R
This function applies a function, lapply-style, to the columns of a table whose names match a regular expression or a fixed string.
X | a data.table or data.frame object.
M | a character vector containing regular expressions prepended with tilde (~) and/or a fixed string (without the tilde).
FUN | the function to be applied to each element of X. This can be a value if assign is supplied.
... | optional arguments to FUN.
assign | a character string containing what column name to prepend each assignment to. Can be left an empty string "" for in-place transformation.
by | a character vector of column names to group the operation by.
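The M argument mixes two kinds of selectors. As a rough illustration of how such matching could work (not the package's actual implementation), the base R sketch below treats entries starting with ~ as regular expressions and every other entry as an exact column name. The helper name match_cols and the exact-match reading of fixed strings are assumptions made for this example.

```r
# Hypothetical helper, not part of the package: resolve M against column names.
match_cols <- function(col_names, M) {
  hits <- lapply(M, function(m) {
    if (startsWith(m, "~")) {
      # Tilde-prefixed entry: strip the tilde and treat the rest as a regex.
      grep(substring(m, 2), col_names, value = TRUE)
    } else {
      # Plain entry: assumed here to be an exact column name.
      col_names[col_names == m]
    }
  })
  # Flatten and drop duplicates when several patterns hit the same column.
  unique(unlist(hits))
}

cols <- names(iris)
match_cols(cols, c("~Sepal", "~Petal"))     # regex selectors only
match_cols(cols, c("~Length$", "Species"))  # regex plus a fixed name
```

Resolving the selectors to a plain character vector first keeps the later apply step independent of how the columns were chosen.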
library(data.table) # needed for data.table(), melt() and the [, .()] syntax used below
data(iris)
iris.dt <- data.table(iris)
rsapply(iris.dt, "~*Sepal", as.character, assign = "ch") # build new features with a character conversion, each column prepended with 'ch'
rsapply(iris.dt, "~*ch$", 1, assign = "") # different type: will have data.table warning
rsapply(iris.dt, "~*ch$", "2", assign = "") # same type: no data.table warning
rsapply(iris.dt, "~*ch$", NULL, assign = "") # remove all the columns that end in 'ch'
str(rsapply(iris.dt, "~*Sepal", as.character))
rsapply(iris.dt, c("~Sepal","~Petal"), quantile, probs = 1:3/4) # calculate the three quartiles (25%, 50%, 75%) for all columns whose names contain Sepal or Petal
rsapply(iris.dt, c("~Sepal","~Petal"), quantile, probs = 1:3/4, by = "Species") # calculate the same three quartiles for all Sepal or Petal columns, grouped by Species
# Find the mean difference between the 1st and 3rd quartiles across all species, for the Length columns only
rsapply(
  rsapply(
    rsapply(iris.dt, c("~Sepal","~Petal"), quantile, probs = c(1,3)/4, by = "Species"),
    c("~Sepal","~Petal"), function(x) max(x) - min(x), by = "Species"),
  c("~Length"), mean
)
rsapply(iris.dt, c("~Sepal","~Petal"), mean, by = "Species")[, .(ratio = Sepal.Length / Sepal.Width)] # Chain a new column called ratio which computes the ratio of Sepal Length and Width
melt(rsapply(iris.dt, c("~Sepal","~Petal"), mean, by = "Species"), id.vars = "Species") # Naturally can use melt and dcast for pivoting
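# dt below is assumed to be a user-supplied data.table with columns whose names contain SEGMENT; it is not defined in these examples
rsapply(rsapply(dt, "~*SEGMENT*", function(x) ifelse(is.na(x), -1, x), assign = "_NEW"), "~*SEGMENT", print) # imputation into new columns
rsapply(rsapply(dt, "~*SEGMENT*", function(x) ifelse(is.na(x), -1, x), assign = ""), "~*SEGMENT", print) # in-place imputation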
# dd below is assumed to be a user-supplied data.table; it is not defined in these examples
num.col <- colnames(dd)[dd[, lapply(.SD, function(x) class(x)[1]) == "numeric"]] # only get the numeric attributes
rsapply(dd, num.col, print) # Print the columns
rsapply(dd, num.col, function(x) ifelse(is.na(x),-1,x), assign = "") # in place imputation for numeric only attributes
rsapply(dd, rsClass(dd, "numeric"), rsPrint) # fetch only the numeric attributes utilising rsClass
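For readers without the package at hand, the in-place imputation of numeric attributes shown above can be approximated in base R. The sketch below uses a small made-up data frame (the name dd_demo and its values are purely illustrative) and replaces NA with -1 in the numeric columns only; it does not use rsapply and is not meant to describe how rsapply is implemented.

```r
# Illustrative data only; not taken from the package documentation.
dd_demo <- data.frame(
  id    = c("a", "b", "c"),
  score = c(1.5, NA, 3.0),
  count = c(NA, 2, 4),
  stringsAsFactors = FALSE
)

# Identify numeric columns, mirroring the class(x)[1] == "numeric" check above.
num_cols <- names(dd_demo)[
  vapply(dd_demo, function(x) class(x)[1] == "numeric", logical(1))
]

# Apply the imputation function to each selected column and assign back in place.
dd_demo[num_cols] <- lapply(
  dd_demo[num_cols],
  function(x) ifelse(is.na(x), -1, x)
)

dd_demo
```

The character column id is left untouched because it never enters the selected set, which is the same effect the numeric-only selection has in the examples above.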