| data_summary | R Documentation |
This function can be used to compute summary statistics for a data frame or a matrix.
data_summary(x, ...)
## S3 method for class 'data.frame'
data_summary(x, ..., by = NULL, remove_na = FALSE, suffix = NULL)
x |
A (grouped) data frame. |
... |
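One or more named expressions that define the new variable name
and the function to compute the summary statistic. Example:
`MW = mean(Sepal.Width)`. Expressions can also be supplied as character
strings, e.g. `"MW = mean(mpg)"`, and `n()` counts the observations. |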
by |
Optional character vector with the names of one or more variables in the data frame. If supplied, the data are split by these variables and summary statistics are computed for each group. |
remove_na |
Logical; defaults to `FALSE`. |
suffix |
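Optional, suffixes to be added to the new variable names,
especially useful when a function returns several values (e.g. `quantile()`).
The new column names are a combination of the left-hand side (i.e.,
the name) of the expression and the related suffixes. If `suffix = NULL`
(the default) and an expression returns several values, the names of the
returned values are used as suffixes; if the values are unnamed, the
suffixes are numbered automatically. |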
A data frame with the requested summary statistics.
data(iris)
data_summary(iris, MW = mean(Sepal.Width), SD = sd(Sepal.Width))
data_summary(
  iris,
  MW = mean(Sepal.Width),
  SD = sd(Sepal.Width),
  by = "Species"
)
# same as
d <- data_group(iris, "Species")
data_summary(d, MW = mean(Sepal.Width), SD = sd(Sepal.Width))
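# for comparison only: a rough base R sketch of the same grouped summary.
# It illustrates the idea behind `by` (split, then summarise each group) and
# is an assumption for illustration, not how data_summary() is implemented;
# `sw` and `base_summary` are hypothetical names.
sw <- split(iris$Sepal.Width, iris$Species)
base_summary <- data.frame(
  Species = names(sw),
  MW = unname(vapply(sw, mean, numeric(1))),
  SD = unname(vapply(sw, sd, numeric(1)))
)
base_summary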
# multiple groups
data(mtcars)
data_summary(mtcars, MW = mean(mpg), SD = sd(mpg), by = c("am", "gear"))
# expressions can also be supplied as character strings
data_summary(mtcars, "MW = mean(mpg)", "SD = sd(mpg)", by = c("am", "gear"))
# count observations within groups
data_summary(mtcars, observations = n(), by = c("am", "gear"))
# first and last observations of "mpg" within groups
data_summary(
  mtcars,
  first = mpg[1],
  last = mpg[length(mpg)],
  by = c("am", "gear")
)
# expressions may return more than one value, giving several summary columns
d <- data.frame(
  x = rnorm(100, 1, 1),
  y = rnorm(100, 2, 2),
  groups = rep(1:4, each = 25)
)
# since we have multiple columns for one expression, the names of the
# returned summary results are used as suffix by default
data_summary(
  d,
  quant_x = quantile(x, c(0.25, 0.75)),
  mean_x = mean(x),
  quant_y = quantile(y, c(0.25, 0.5, 0.75))
)
# if a summary function, like `fivenum()`, returns no named vector, suffixes
# are automatically numbered
data_summary(
  d,
  quant_x = quantile(x, c(0.25, 0.75)),
  mean_x = mean(x),
  fivenum_y = fivenum(y)
)
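# why the default suffixes differ, shown with base R only (an illustration,
# `v`, `named_out` and `unnamed_out` are hypothetical names):
# quantile() returns a named vector, fivenum() returns an unnamed one
v <- c(1, 2, 3, 4, 5)
named_out <- names(quantile(v, c(0.25, 0.75)))
unnamed_out <- names(fivenum(v))
named_out    # "25%" "75%"
unnamed_out  # NULL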
# specify column suffix for expressions, matching by names
data_summary(
  d,
  quant_x = quantile(x, c(0.25, 0.75)),
  mean_x = mean(x),
  quant_y = quantile(y, c(0.25, 0.5, 0.75)),
  suffix = list(quant_y = c("_Q1", "_Q2", "_Q3"))
)
# name multiple expression suffixes, grouped by variable
data_summary(
  d,
  quant_x = quantile(x, c(0.25, 0.75)),
  mean_x = mean(x),
  quant_y = quantile(y, c(0.25, 0.5, 0.75)),
  suffix = list(quant_x = c("Q1", "Q3"), quant_y = c("_Q1", "_Q2", "_Q3")),
  by = "groups"
)