descr: Univariate Statistics for Numerical Data

View source: R/descr.R

descr R Documentation

Univariate Statistics for Numerical Data

Description

Calculates mean, sd, min, Q1\*, median, Q3\*, max, MAD, IQR\*, CV, skewness\*, SE.skewness\*, and kurtosis\* on numerical vectors. (\*) Not available when using sampling weights.
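As a rough orientation, most of the unweighted statistics listed above can be reproduced with base R. The sketch below is not the package's code: the helper name univariate_stats is hypothetical, CV is assumed to be sd / mean, and the quartiles use R's default quantile type, which may differ from what descr() does internally. Skewness and kurtosis are left out because their estimators are not specified on this page.

```r
# Base-R sketch of the unweighted statistics; not the package's own code.
# univariate_stats() is a hypothetical helper used only for illustration.
univariate_stats <- function(x, na.rm = TRUE) {
  # Quartiles with R's default quantile type (assumption)
  q <- quantile(x, probs = c(0.25, 0.75), na.rm = na.rm, names = FALSE)
  m <- mean(x, na.rm = na.rm)
  s <- sd(x, na.rm = na.rm)
  c(
    mean = m,
    sd   = s,
    min  = min(x, na.rm = na.rm),
    q1   = q[1],
    med  = median(x, na.rm = na.rm),
    q3   = q[2],
    max  = max(x, na.rm = na.rm),
    mad  = mad(x, na.rm = na.rm),   # median absolute deviation
    iqr  = IQR(x, na.rm = na.rm),   # interquartile range
    cv   = s / m                    # coefficient of variation (assumed sd / mean)
  )
}

x <- c(2, 4, 4, 5, 7, 9, NA)
round(univariate_stats(x), 2)
```

With na.rm = TRUE (the default here, as in descr), the missing value is dropped before each statistic is computed.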

Usage

descr(
  x,
  var = NULL,
  stats = st_options("descr.stats"),
  na.rm = TRUE,
  round.digits = st_options("round.digits"),
  transpose = st_options("descr.transpose"),
  order = "sort",
  style = st_options("style"),
  plain.ascii = st_options("plain.ascii"),
  justify = "r",
  headings = st_options("headings"),
  display.labels = st_options("display.labels"),
  split.tables = 100,
  weights = NULL,
  rescale.weights = FALSE,
  ...
)

Arguments

x

A numerical vector or a data frame.

var

Unquoted expression referring to a specific column in x. Provides support for piped function calls (e.g. my_df |> descr(my_var)).

stats

Character. Which stats to produce. Either “all” (default), “fivenum”, “common” (see Details), or a selection of: “mean”, “sd”, “min”, “q1”, “med”, “q3”, “max”, “mad”, “iqr”, “cv”, “skewness”, “se.skewness”, “kurtosis”, “n.valid”, “n”, and “pct.valid”. Can be set globally via st_options, option “descr.stats”. See Details.

na.rm

Logical. Argument to be passed to statistical functions. Defaults to TRUE.

round.digits

Numeric. Number of significant digits to display. Defaults to 2. Can be set globally with st_options.

transpose

Logical. Make variables appear as columns, and stats as rows. Defaults to FALSE. Can be set globally with st_options, option “descr.transpose”.

order

Character. When analyzing more than one variable, this parameter determines how to order variables. Valid values are “sort” (or simply “s”), “preserve” (or “p”), or a vector containing all variable names in the desired order. Defaults to “sort”.

style

Character. Style to be used by pander. One of “simple” (default), “grid”, “rmarkdown”, or “jira”. Can be set globally with st_options.

plain.ascii

Logical. pander argument; when TRUE (default), no markup characters will be used (useful when printing to console). If style = 'rmarkdown' is specified, value is set to FALSE automatically. Can be set globally using st_options.

justify

Character. Alignment of numbers in cells; “l” for left, “c” for center, or “r” for right (default). Has no effect on html tables.

headings

Logical. Set to FALSE to omit heading section. Can be set globally via st_options. TRUE by default.

display.labels

Logical. Show variable / data frame labels in heading section. Defaults to TRUE. Can be set globally with st_options.

split.tables

Numeric. pander argument that specifies how many characters wide a table can be. 100 by default.

weights

Numeric. Vector of weights having the same length as x. NULL (default) indicates that no weights are used.

rescale.weights

Logical. When set to TRUE, a global constant is applied to the weights so that the total count equals nrow(x). FALSE by default.

...

Additional arguments passed to pander or format.
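To make the rescale.weights argument more concrete, here is a minimal sketch of what applying a global constant to the weights can look like. It assumes the constant is n / sum(weights); the helper rescale_weights is hypothetical and is not part of summarytools.

```r
# Sketch of weight rescaling; rescale_weights() is a hypothetical helper.
# Assumption: the global constant is n / sum(w), so the weights sum to n.
rescale_weights <- function(w, n = length(w)) {
  w * n / sum(w)
}

x <- c(1, 2, 3, 4)
w <- c(10, 20, 30, 40)
w_rescaled <- rescale_weights(w)

sum(w)            # total count implied by the raw weights
sum(w_rescaled)   # total count after rescaling, equal to length(x)

# Multiplying all weights by one constant leaves the weighted mean unchanged
weighted.mean(x, w)
weighted.mean(x, w_rescaled)
```

Under this assumption, rescaling changes the weighted counts but not ratio-type statistics such as the weighted mean.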

Details

Since version 1.1, the stats argument can be set in a more flexible way; keywords (all, common, fivenum) can be combined with single statistics, or their “negation”. For instance, using stats = c("all", "-q1", "-q3") would show all except q1 and q3.
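The combination rules described above can be pictured with a small base-R sketch. It is an illustration only, not the package's implementation: resolve_stats and the presets list are hypothetical, "all" is assumed to be the full list of statistic names accepted by the stats argument, and "fivenum" is assumed to mean min, q1, med, q3 and max. The "common" preset is left out because its members are not listed on this page.

```r
# Illustration of keyword + negation handling; not summarytools' own code.
# Assumed preset contents (see the lead-in paragraph).
all_stats <- c("mean", "sd", "min", "q1", "med", "q3", "max", "mad", "iqr",
               "cv", "skewness", "se.skewness", "kurtosis",
               "n.valid", "n", "pct.valid")
presets <- list(
  all     = all_stats,
  fivenum = c("min", "q1", "med", "q3", "max")
)

# resolve_stats() is a hypothetical helper
resolve_stats <- function(stats) {
  is_negated <- startsWith(stats, "-")
  negated    <- sub("^-", "", stats[is_negated])
  positive   <- stats[!is_negated]
  # Expand keywords into their member statistics; keep single names as-is
  expanded <- unlist(lapply(positive, function(s) {
    if (s %in% names(presets)) presets[[s]] else s
  }))
  # Drop negated entries and duplicates, then restore the canonical order
  kept <- setdiff(unique(expanded), negated)
  all_stats[all_stats %in% kept]
}

resolve_stats(c("all", "-q1", "-q3"))   # everything except q1 and q3
resolve_stats(c("fivenum", "mean"))     # five-number summary plus the mean
```

The first call corresponds to the stats = c("all", "-q1", "-q3") example in the text; the second shows a keyword combined with a single statistic.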

For further customization, you could redefine any preset in the following manner: .st_env$descr.stats$common <- c("mean", "sd", "n"). Use caution when modifying .st_env, and reload the package if errors ensue. Changes are temporary and will not persist across R sessions.

Value

An object having classes “matrix” and “summarytools” containing the statistics, with extra attributes useful to other functions/methods.

Author(s)

Dominic Comtois, dominic.comtois@gmail.com

Examples

data("exams")

# All stats (default behavior) for all numerical variables
descr(exams)

# Show only "common" statistics, plus "n"
descr(exams, stats = c("common", "n"))

# Selection of statistics, transposing the results
descr(exams, stats = c("mean", "sd", "min", "max"), transpose = TRUE)

# Rmarkdown-ready
descr(exams, plain.ascii = FALSE, style = "rmarkdown")

# Grouped statistics
data("tobacco")
with(tobacco, stby(BMI, gender, descr, check.nas = FALSE))

# Grouped statistics in tidy table:
tb(with(tobacco, stby(BMI, age.gr, descr, stats = "common")))

## Not run: 
# Show in Viewer (or browser if not in RStudio)
view(descr(exams))

# Save to html file with title
print(descr(exams),
      file = "descr_exams.html", 
      report.title = "Descriptive Statistics for exams",
      footnote = "<b>Schoolyear:</b> 2018-2019<br/><b>Semester:</b> Fall")

## End(Not run)


dcomtois/summarytools documentation built on March 1, 2025, 8:50 p.m.