varlist: Generate a comprehensive summary of the variables

View source: R/varlist.R

varlist    R Documentation

Description

varlist() lists the variables of a data frame and extracts essential metadata, including variable names, labels, summary values, classes, number of distinct values, number of valid (non-missing) observations, and number of missing values.

vl() is a shorthand for varlist() with identical functionality.

Usage

varlist(
  x,
  ...,
  values = FALSE,
  tbl = FALSE,
  include_na = FALSE,
  .raw_expr = substitute(x)
)

vl(x, ..., values = FALSE, tbl = FALSE, include_na = FALSE)

Arguments

x

A data frame, or an expression that transforms one (e.g. head(df) or df[, 1:3]). Must be named and identifiable.

...

Optional tidyselect-style column selectors (e.g. starts_with("var"), where(is.numeric), etc.).

values

Logical. If FALSE (the default), only min/max or representative values are displayed. If TRUE, all unique values are listed.

tbl

Logical. If FALSE (the default), the summary is opened in the Viewer (if interactive). If TRUE, a tibble is returned instead.

include_na

Logical. If TRUE, missing values (NA) are included in the Values column. Default is FALSE.

.raw_expr

Internal; do not use. Captures the original expression passed from vl() so that an informative title can be generated.

Details

The function can also apply tidyselect-style variable selectors to filter columns dynamically.
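To make the selector behaviour concrete, the sketch below reproduces in base R which columns two common selectors would pick. It is only an approximation for illustration, not how varlist() resolves selectors internally; the variable names (sepal_cols, numeric_cols, d_cols) are hypothetical. Note that tidyselect's starts_with() ignores case by default, which is why the names are lower-cased before matching.

```r
# Base-R approximations of two tidyselect selectors (illustrative only).

# Roughly what starts_with("Sepal") matches in iris:
sepal_cols <- names(iris)[startsWith(tolower(names(iris)), "sepal")]

# Roughly what where(is.numeric) matches in iris:
numeric_cols <- names(Filter(is.numeric, iris))

# Roughly what starts_with("d") matches in mtcars:
d_cols <- names(mtcars)[startsWith(tolower(names(mtcars)), "d")]

print(sepal_cols)
print(numeric_cols)
print(d_cols)
```

With the package loaded, the corresponding calls would be varlist(iris, starts_with("Sepal"), tbl = TRUE) and similar, as in the Examples section.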

If used interactively (e.g. in RStudio), the summary is displayed in the Viewer pane with a contextual title like VARLIST iris. If the data frame has been transformed or subset, the title displays an asterisk (*), e.g. VARLIST iris*.
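One way such a title could be derived is from the unevaluated argument, in the spirit of the .raw_expr = substitute(x) default shown under Usage. The following is a minimal sketch under that assumption; viewer_title() is a hypothetical helper, not a function exported by the package, and the real logic may differ.

```r
# Hypothetical helper: build a "VARLIST <name>" title from the call expression.
viewer_title <- function(x) {
  expr <- substitute(x)                 # unevaluated argument, e.g. iris or head(iris)
  if (is.name(expr)) {
    # A bare data frame name: use it as-is.
    paste("VARLIST", as.character(expr))
  } else {
    # Anything else is treated as a transformation: take the first variable
    # mentioned in the expression and flag it with an asterisk.
    paste0("VARLIST ", all.vars(expr)[1], "*")
  }
}

viewer_title(iris)          # "VARLIST iris"
viewer_title(head(mtcars))  # "VARLIST mtcars*"
```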

vl() is documented together with varlist() on this page.

Value

A tibble with one row per (selected) variable, containing the following columns:

  • Variable: variable names

  • Label: variable labels (if available via the label attribute)

  • Values: a summary of the variable's values, depending on the values and include_na arguments. If values = FALSE, a compact summary is shown (at most 4 values: the first 3, then ..., then the last). If values = TRUE, all unique non-missing values are displayed. For labelled variables, prefixed labels are displayed using labelled::to_factor(levels = "prefixed"). For factors, levels are used as-is. Missing values (NA, NaN) are optionally appended at the end (controlled via include_na).

  • Class: the class of each variable (possibly multiple, e.g. "labelled", "numeric")

  • Ndist_val: number of distinct non-missing values

  • N_valid: number of non-missing observations

  • NAs: number of missing observations

If tbl = FALSE and used interactively, the summary is displayed in the Viewer pane. If the data frame is a transformation (e.g. head(df) or df[, 1:3]), an asterisk (*) is appended to the name in the title (e.g. VARLIST df*).
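The columns described above can be approximated by hand in base R, which may help when checking what a given row of the summary means. The sketch below is an illustration only: summarise_vars() is a hypothetical helper, it omits the Label column and labelled-variable handling, and its reading of the compact rule (first three values, "...", last value when there are more than four distinct values) and its comma separator are assumptions rather than the package's actual formatting.

```r
# Hypothetical base-R approximation of the summary columns (not the spicy code).
summarise_vars <- function(df, include_na = FALSE) {
  # Compact text summary of the distinct non-missing values of one column.
  compact <- function(v) {
    u <- sort(unique(v[!is.na(v)]))
    shown <- if (length(u) > 4) {
      c(as.character(u[1:3]), "...", as.character(u[length(u)]))
    } else {
      as.character(u)
    }
    # Optionally flag the presence of missing values at the end.
    if (include_na && anyNA(v)) shown <- c(shown, "NA")
    paste(shown, collapse = ", ")
  }
  data.frame(
    Variable  = names(df),
    Values    = vapply(df, compact, character(1)),
    Class     = vapply(df, function(v) paste(class(v), collapse = ", "), character(1)),
    Ndist_val = vapply(df, function(v) length(unique(v[!is.na(v)])), integer(1)),
    N_valid   = vapply(df, function(v) sum(!is.na(v)), integer(1)),
    NAs       = vapply(df, function(v) sum(is.na(v)), integer(1)),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# Small made-up data frame with some missing values.
toy <- data.frame(
  id    = 1:6,
  group = c("a", "b", NA, "a", "b", "a"),
  score = c(2.5, NA, 3.0, 2.5, NA, 4.0)
)

res <- summarise_vars(toy, include_na = TRUE)
print(res)
```

For every row, N_valid and NAs add up to the number of rows in the data frame (6 for toy), which is a quick consistency check that also applies to the output of varlist(..., tbl = TRUE).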

Examples

varlist(iris)
iris |> varlist()
iris |> varlist(starts_with("Sepal"), tbl = TRUE)
varlist(mtcars, where(is.numeric), values = TRUE, tbl = TRUE)
varlist(head(mtcars), tbl = TRUE)
varlist(mtcars, tbl = TRUE)
varlist(iris[, 1:3], tbl = TRUE)
varlist(mtcars[1:10, ], tbl = TRUE)

vl(iris)
iris |> vl()
vl(mtcars, starts_with("d"))
vl(head(iris), include_na = TRUE)
vl(iris[, 1:3], values = TRUE, tbl = TRUE)

spicy documentation built on June 8, 2025, 11:06 a.m.