summarizing | R Documentation |
Convenient functions for summarizing a data.table. These functions are convenient to use with data.table's .SD symbol. Shorthand functions are also available that only require a data.table, a character vector of variables (vars), and an optional names.extra argument:
DT_summarize(DT, fun, vars, names.extra = NULL)
DT_mean(DT, vars, names.extra = ".mean", na.rm = FALSE)
DT_sd(DT, vars, names.extra = ".sd", na.rm = FALSE)
DT_var(DT, vars, names.extra = ".var", na.rm = FALSE)
DT_sum(DT, vars, names.extra = ".sum", na.rm = FALSE)
DT_log_diff(DT, vars, names.extra = ".log.diff")
DT_perc_diff(DT, vars, names.extra = ".perc.diff")
DT | a data.table |
fun | a function used to summarize the data |
vars | a character vector with variable names |
names.extra | a string used as a suffix for new variable names |
na.rm | set to TRUE to remove NA values before summarizing |
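As a rough illustration of how these arguments fit together, the following is a minimal sketch written directly in data.table. The function summarize_sketch and the toy table are hypothetical names introduced here; this is an assumption about the general pattern, not the package's actual implementation of DT_summarize.

```r
library(data.table)

# Hypothetical stand-in for illustration only; NOT the package's source code.
# Applies `fun` to each column named in `vars`, then appends `names.extra`
# to the resulting column names (names are left unchanged when it is NULL).
summarize_sketch <- function(DT, fun, vars, names.extra = NULL) {
  out <- DT[, lapply(.SD, fun), .SDcols = vars]
  if (!is.null(names.extra)) {
    setnames(out, paste0(vars, names.extra))
  }
  out
}

# Made-up data: one grouping column and two numeric columns
toy <- data.table(g = c("a", "a", "b"), x = c(1, 3, 5), y = c(2, 4, 6))

# Whole-table summary: one row with columns x.mean and y.mean
overall <- summarize_sketch(toy, fun = mean, vars = c("x", "y"),
                            names.extra = ".mean")

# Grouped summary through .SD: one row per value of g
by_group <- toy[, summarize_sketch(.SD, fun = mean, vars = c("x", "y"),
                                   names.extra = ".mean"), by = g]
```

Because the sketch takes a data.table as its first argument, it can be called on a whole table or on .SD inside a grouped call, which is the same calling pattern the functions documented here are designed for.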
Each shorthand function corresponds to DT_summarize with a preset fun and a default names.extra:
DT_mean: fun = mean and names.extra defaults to ".mean"
DT_sd: fun = sd and names.extra defaults to ".sd"
DT_var: fun = var and names.extra defaults to ".var"
DT_sum: fun = sum and names.extra defaults to ".sum"
DT_log_diff: fun = function(x) log(x[length(x)]) - log(x[1]) and names.extra defaults to ".log.diff"
DT_perc_diff: fun = function(x) (x[length(x)] - x[1]) / x[1] and names.extra defaults to ".perc.diff"
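To make the last two definitions concrete, here is a small base R sketch that applies the same expressions to a plain numeric vector. The names log_diff, perc_diff, and x are hypothetical and exist only for this illustration.

```r
# The two anonymous functions described above, given hypothetical names
log_diff  <- function(x) log(x[length(x)]) - log(x[1])
perc_diff <- function(x) (x[length(x)] - x[1]) / x[1]

# Illustrative series, ordered from first to last observation
x <- c(100, 110, 121)

log_diff(x)   # equals log(121 / 100)
perc_diff(x)  # (121 - 100) / 100 = 0.21
```

Both expressions compare only the first and last elements, so the result depends on row order within each group. As written, the percent difference is returned as a proportion (0.21) rather than multiplied by 100.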
Returns a summarized data.table.
library(data.table)
library(magrittr) ##Provides the %>% pipe used below
data(mtcars)
setDT(mtcars) ##Convert to a data.table
##Base use of DT_summarize()
DT_summarize(mtcars, fun = mean, vars = c("mpg","hp"))
##Using with by and .SD
mtcars %>%
.[, DT_summarize(.SD, fun = mean, vars = c("mpg","hp")), by = cyl]
##Using the convenience function DT_mean (and leaving the names.extra
##argument as the default)
DT_mean(mtcars, vars = c("mpg", "hp"))
mtcars %>%
.[, DT_mean(.SD, vars = c("mpg","hp")), by = cyl]
##Take the mean of hp and mpg, and the variance of disp and wt
##Note, we concatenate using c()
mtcars %>%
.[, c(DT_mean(.SD, vars = c("mpg", "hp")),
DT_var(.SD, vars = c("disp", "wt"))
)]
##by cyl
mtcars %>%
.[, c(DT_mean(.SD, vars = c("mpg", "hp")),
DT_var(.SD, vars = c("disp", "wt"))
), by = cyl]
##The mean, standard deviation, and variance of mpg and hp
summary.vars <- c("mpg", "hp")
mtcars %>%
.[, c(DT_mean(.SD, vars = summary.vars),
DT_sd(.SD, vars = summary.vars),
DT_var(.SD, vars = summary.vars)), by = cyl]
## Other useful summary functions (may not have a
##useful interpretation here)
summary.vars <- c("mpg", "hp")
mtcars %>%
.[, c(DT_sum(.SD, vars = summary.vars),
DT_log_diff(.SD, vars = summary.vars),
DT_perc_diff(.SD, vars = summary.vars)),
by = cyl]