| descript | R Documentation |
Function returns univariate data summaries for each variable supplied. For presentation
purposes, discrete and continuous variables are treated separately: the former
are summarised with count/proportion information, while the latter are passed to a (customizable) list
of univariate summary functions. As such, quantitative/continuous variable
information is kept distinct in the output, while summaries of discrete variables (e.g.,
factors and character vectors) are returned by setting the
discrete argument to TRUE.
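The split between discrete and continuous variables can be illustrated with a minimal base R sketch. The helper sketch_descript() and the three summary functions below are hypothetical and written only for illustration; they are not the package's implementation, and the real output layout of descript() may differ.

```r
# Hypothetical helper for illustration; NOT the package's implementation.
sketch_descript <- function(df, funs, discrete = FALSE) {
  # Factors and character vectors count as discrete variables.
  is_disc <- vapply(df, function(x) is.factor(x) || is.character(x),
                    logical(1))
  if (discrete) {
    # Discrete variables: counts and proportions per level.
    return(lapply(df[is_disc], function(x) {
      counts <- table(x)
      data.frame(level = names(counts),
                 count = as.vector(counts),
                 prop  = as.vector(counts) / sum(counts))
    }))
  }
  # Continuous variables: apply every summary function to every column.
  cont <- df[!is_disc]
  stats <- lapply(funs, function(f) vapply(cont, f, numeric(1)))
  data.frame(stats, row.names = names(cont), check.names = FALSE)
}

# Illustrative summary functions; each must return a scalar.
funs <- list(n    = function(x) sum(!is.na(x)),
             mean = function(x) mean(x, na.rm = TRUE),
             sd   = function(x) sd(x, na.rm = TRUE))

fmtcars <- within(mtcars, am <- factor(am, labels = c("automatic", "manual")))
res  <- sketch_descript(fmtcars, funs)                   # continuous only
disc <- sketch_descript(fmtcars, funs, discrete = TRUE)  # discrete only
res
disc$am
```

Here the factor am is dropped from the continuous summary and appears only when discrete = TRUE, which mirrors the behaviour described above.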
descript(df, funs = get_descriptFuns(), discrete = FALSE)
get_descriptFuns()
df |
typically a Note that |
funs |
functions to apply when
Note that by default the |
discrete |
logical; include summary statistics for discrete variables only? |
The purpose of this function is to provide
a more pipe-friendly API for selecting and subsetting variables using the
dplyr syntax, where conditional statistics are evaluated
internally using the by function (when multiple variables are
to be summarised). As a special case,
if only a single variable is being summarised then the canonical output
from dplyr::summarise will be returned.
Conditioning: As the function is intended to support
pipe-friendly code specifications, conditioning/group subset
specifications are declared using group_by
and subsequently passed to descript.
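As a rough illustration of how grouped summaries can be evaluated with by(), the base R sketch below computes one summary table per level of cyl. The funs list and the helper summarise_block() are hypothetical names introduced here; this shows the general technique only, not the internals of descript().

```r
# Summary functions used for illustration only (hypothetical choices).
funs <- list(n    = function(x) sum(!is.na(x)),
             mean = function(x) mean(x, na.rm = TRUE))

# Apply each function to each column of one group's rows.
summarise_block <- function(d) {
  stats <- lapply(funs, function(f) vapply(d, f, numeric(1)))
  data.frame(stats, row.names = names(d))
}

# One summary table per level of cyl, analogous to group_by(cyl).
grouped <- by(mtcars[c("mpg", "wt")], mtcars$cyl, summarise_block)
grouped[["4"]]  # summaries for the four-cylinder cars
```

Each element of grouped holds the summaries for one group, so conditioning on several variables amounts to passing a list of grouping vectors as the second argument to by().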
summarise, group_by, xtabs
library(dplyr)
data(mtcars)
if(FALSE){
# run the following to see behavior with NA values in dataset
mtcars[sample(1:nrow(mtcars), 3), 'cyl'] <- NA
mtcars[sample(1:nrow(mtcars), 5), 'mpg'] <- NA
}
fmtcars <- within(mtcars, {
cyl <- factor(cyl)
am <- factor(am, labels=c('automatic', 'manual'))
vs <- factor(vs)
})
# with and without factor variables
mtcars |> descript()
fmtcars |> descript() # factors/discrete vars omitted
fmtcars |> descript(discrete=TRUE) # discrete variables only
# for discrete variables, xtabs() is generally nicer as cross-tabs can
# be specified explicitly (though can be cumbersome)
xtabs(~ am, fmtcars)
xtabs(~ am, fmtcars) |> prop.table()
xtabs(~ am + cyl + vs, fmtcars)
xtabs(~ am + cyl + vs, fmtcars) |> prop.table()
# usual pipe chaining
fmtcars |> select(mpg, wt) |> descript()
fmtcars |> filter(mpg > 20) |> select(mpg, wt) |> descript()
# conditioning with group_by()
fmtcars |> group_by(cyl) |> descript()
fmtcars |> group_by(cyl, am) |> descript()
fmtcars |> group_by(cyl, am) |> select(mpg, wt) |> descript()
# with single variables, typical dplyr::summarise() output returned
fmtcars |> select(mpg) |> descript()
fmtcars |> group_by(cyl) |> select(mpg) |> descript()
fmtcars |> group_by(cyl, am) |> select(mpg) |> descript()
# discrete variables also work with group_by(), though again
# xtabs() is generally more flexible
fmtcars |> group_by(cyl) |> descript(discrete=TRUE)
fmtcars |> group_by(am) |> descript(discrete=TRUE)
fmtcars |> group_by(cyl, am) |> descript(discrete=TRUE)
# only return a subset of summary statistics
funs <- get_descriptFuns()
sfuns <- funs[c('n', 'mean', 'sd')] # subset
fmtcars |> descript(funs=sfuns) # only n, miss, mean, and sd
# add new functions
funs2 <- c(sfuns,
trim_20 = \(x) mean(x, trim=.2, na.rm=TRUE),
median= \(x) median(x, na.rm=TRUE))
fmtcars |> descript(funs=funs2)