dt_mutate: 'dplyr'-like interface for data.table.
In hope-data-science/tidydt0: Tidy Verbs for `data.table`

A subset of 'dplyr' verbs that work with data.table. Note that there is no group_by verb; use the by or keyby argument when needed.

dt_mutate adds new variables or modifies existing ones. If data is a data.table, it is modified in place.
dt_summarize computes summary statistics. It splits the data into subsets, computes summary statistics for each, and returns the result as a data.table.
dt_summarize_all is the same as dt_summarize but works over all non-grouping variables.
dt_filter selects rows/cases where the conditions are true. Rows where a condition evaluates to NA are dropped.
dt_select selects columns/variables from the data set.
dt_arrange sorts the data set by one or more variables. Use '-' to sort in descending order. If data is a data.table, it is modified in place.
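
The in-place behaviour and the NA handling described above come from data.table itself. The following is a minimal sketch using plain data.table calls (`:=`, `setorder()`, and row subsetting), on the assumption that the dt_ verbs delegate to them; the object names `dt`, `alias`, and `snapshot` are made up for illustration and are not part of this package.

```r
library(data.table)

dt <- data.table(x = c(3, 1, NA, 2), g = c("a", "b", "a", "b"))
alias    <- dt        # a second name for the same object, not a copy
snapshot <- copy(dt)  # an explicit deep copy

# `:=` adds the column by reference, so `alias` sees it and `snapshot` does not
dt[, y := x * 2]
names(alias)     # "x" "g" "y"
names(snapshot)  # "x" "g"

# setorder() reorders rows by reference; NAs come first by default
setorder(dt, x)
alias$x          # NA 1 2 3

# Subsetting rows drops those where the condition evaluates to NA
nrow(dt[x > 1])  # 2
```

If the original table must stay untouched, taking a `copy()` before a modifying call is one way to protect it.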

dt_mutate(data, ..., by)

dt_summarize(data, ..., by, keyby, fun = NULL)

dt_summarize_all(data, fun, by, keyby)

dt_summarise(data, ..., by, keyby, fun = NULL)

dt_summarise_all(data, fun, by, keyby)

dt_select(data, ...)

dt_filter(data, ...)

dt_arrange(data, ..., na.last = FALSE)

`data`	A data.table or data.frame. A data.frame is automatically converted to a data.table. `dt_mutate` and `dt_mutate_if` modify a data.table object in place.
`...`	List of variables or name-value pairs of summary/modification functions. The name will be the name of the variable in the result. In `dt_mutate` you can use either the `a = b` or the `a := b` notation. The advantages of `:=` are multi-assignment (`c("a", "b") := list(1,2)`) and parametric assignment (`(a) := 2`).
`by`	Unquoted name of a grouping variable, or a list of unquoted names of grouping variables. For details see data.table.
`keyby`	Same as `by`, but with an additional `setkey()` run on the by columns of the result, for convenience. It is common practice to use 'keyby=' routinely when you wish the result to be sorted. For details see data.table.
`fun`	function which will be applied to all variables in `dt_summarize` and `dt_summarize_all`.
`na.last`	logical. FALSE by default. If TRUE, missing values in the data are put last; if FALSE, they are put first.
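
To make the `by`/`keyby` difference and the two `:=` forms concrete, here is a small sketch written directly against data.table rather than the dt_ verbs. It assumes the verbs pass these arguments through unchanged; the table `dt` and the column name stored in `new_col` are hypothetical.

```r
library(data.table)

dt <- data.table(g = c("b", "a", "b", "a"), v = 1:4)

# `by` keeps groups in order of first appearance;
# `keyby` additionally sorts the result and sets a key on the grouping column
res_by    <- dt[, .(total = sum(v)), by = g]
res_keyby <- dt[, .(total = sum(v)), keyby = g]
res_by$g        # "b" "a"
res_keyby$g     # "a" "b"
key(res_keyby)  # "g"

# Multi-assignment: several columns in a single `:=` call
dt[, c("a", "b") := list(1, 2)]

# Parametric assignment: the target column name is taken from a variable
new_col <- "doubled"  # hypothetical column name
dt[, (new_col) := v * 2L]
names(dt)       # "g" "v" "a" "b" "doubled"
```

Wrapping the name in parentheses is what tells data.table to evaluate `new_col` instead of creating a column literally called `new_col`.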

library(tidydt0)
library(magrittr) # provides %>%

# examples from 'dplyr'
# newly created variables are available immediately
mtcars  %>%
    dt_mutate(
        cyl2 = cyl * 2,
        cyl4 = cyl2 * 2
    ) %>%
    head()


# you can also use dt_mutate() to remove variables and
# modify existing variables
mtcars %>%
    dt_mutate(
        mpg = NULL,
        disp = disp * 0.0163871 # convert to litres
    ) %>%
    head()


# window functions are useful for grouped mutates
# (dt_mutate takes 'by' for grouping; it has no 'keyby' argument)
mtcars %>%
    dt_mutate(
        rank = rank(-mpg, ties.method = "min"),
        by = cyl
    ) %>%
    print()


# You can drop variables by setting them to NULL
mtcars %>% dt_mutate(cyl = NULL) %>% head()

# A summary applied without by returns a single row
mtcars %>%
    dt_summarise(mean = mean(disp), n = .N)

# Usually, you'll want to group first
mtcars %>%
    dt_summarise(mean = mean(disp), n = .N, by = cyl)


# Multiple 'by' variables
mtcars %>%
    dt_summarise(cyl_n = .N, by = list(cyl, vs))

# Newly created summaries do not immediately overwrite existing
# variables: sd(disp) below is computed from the original disp column
mtcars %>%
    dt_summarise(disp = mean(disp),
                 sd = sd(disp),
                 by = cyl)

# You can group by expressions:
mtcars %>%
    dt_summarise_all(mean, by = list(vsam = vs + am))

# filter by condition
mtcars %>%
    dt_filter(am == 0)

# filter by compound condition
mtcars %>%
    dt_filter(am == 0, mpg > mean(mpg))


# select
mtcars %>% dt_select(vs:carb, cyl)
mtcars %>% dt_select(-am, -cyl)

# sorting
dt_arrange(mtcars, cyl, disp)
dt_arrange(mtcars, -disp)