tibble: Build a data frame

View source: R/tibble.R

tibbleR Documentation

Build a data frame

Description

tibble() constructs a data frame. It is used like base::data.frame(), but with a couple notable differences:

  • The returned data frame has the class tbl_df, in addition to data.frame. This allows so-called "tibbles" to exhibit some special behaviour, such as enhanced printing. Tibbles are fully described in tbl_df.

  • tibble() is much lazier than base::data.frame() in terms of transforming the user's input.

    • Character vectors are not coerced to factor.

    • List-columns are expressly anticipated and do not require special tricks.

    • Column names are not modified.

    • Inner names in columns are left unchanged.

  • tibble() builds columns sequentially. When defining a column, you can refer to columns created earlier in the call. Only columns of length one are recycled.

  • If a column evaluates to a data frame or tibble, it is nested or spliced. If it evaluates to a matrix or a array, it remains a matrix or array, respectively. See examples.

tibble_row() constructs a data frame that is guaranteed to occupy one row. Vector columns are required to have size one, non-vector columns are wrapped in a list.

Usage

tibble(
  ...,
  .rows = NULL,
  .name_repair = c("check_unique", "unique", "universal", "minimal")
)

tibble_row(
  ...,
  .name_repair = c("check_unique", "unique", "universal", "minimal")
)

Arguments

...

<dynamic-dots> A set of name-value pairs. These arguments are processed with rlang::quos() and support unquote via !! and unquote-splice via !!!. Use ⁠:=⁠ to create columns that start with a dot.

Arguments are evaluated sequentially. You can refer to previously created elements directly or using the .data pronoun. To refer explicitly to objects in the calling environment, use !! or .env, e.g. !!.data or .env$.data for the special case of an object named .data.

.rows

The number of rows, useful to create a 0-column tibble or just as an additional check.

.name_repair

Treatment of problematic column names:

  • "minimal": No name repair or checks, beyond basic existence,

  • "unique": Make sure names are unique and not empty,

  • "check_unique": (default value), no name repair, but check they are unique,

  • "universal": Make the names unique and syntactic

  • a function: apply custom name repair (e.g., .name_repair = make.names for names in the style of base R).

  • A purrr-style anonymous function, see rlang::as_function()

This argument is passed on as repair to vctrs::vec_as_names(). See there for more details on these terms and the strategies used to enforce them.

Value

A tibble, which is a colloquial term for an object of class tbl_df. A tbl_df object is also a data frame, i.e. it has class data.frame.

See Also

Use as_tibble() to turn an existing object into a tibble. Use enframe() to convert a named vector into a tibble. Name repair is detailed in vctrs::vec_as_names(). See quasiquotation for more details on tidy dots semantics, i.e. exactly how the ... argument is processed.

Examples

# Unnamed arguments are named with their expression:
a <- 1:5
tibble(a, a * 2)

# Scalars (vectors of length one) are recycled:
tibble(a, b = a * 2, c = 1)

# Columns are available in subsequent expressions:
tibble(x = runif(10), y = x * 2)

# tibble() never coerces its inputs,
str(tibble(letters))
str(tibble(x = list(diag(1), diag(2))))

# or munges column names (unless requested),
tibble(`a + b` = 1:5)

# but it forces you to take charge of names, if they need repair:
try(tibble(x = 1, x = 2))
tibble(x = 1, x = 2, .name_repair = "unique")
tibble(x = 1, x = 2, .name_repair = "minimal")

## By default, non-syntactic names are allowed,
df <- tibble(`a 1` = 1, `a 2` = 2)
## because you can still index by name:
df[["a 1"]]
df$`a 1`
with(df, `a 1`)

## Syntactic names are easier to work with, though, and you can request them:
df <- tibble(`a 1` = 1, `a 2` = 2, .name_repair = "universal")
df$a.1

## You can specify your own name repair function:
tibble(x = 1, x = 2, .name_repair = make.unique)

fix_names <- function(x) gsub("\\s+", "_", x)
tibble(`year 1` = 1, `year 2` = 2, .name_repair = fix_names)

## purrr-style anonymous functions and constants
## are also supported
tibble(x = 1, x = 2, .name_repair = ~ make.names(., unique = TRUE))

tibble(x = 1, x = 2, .name_repair = ~ c("a", "b"))

# Tibbles can contain columns that are tibbles or matrices
# if the number of rows is compatible. Unnamed tibbled are
# spliced, i.e. the inner columns are inserted into the
# tibble under construction.
tibble(
  a = 1:3,
  tibble(
    b = 4:6,
    c = 7:9
  ),
  d = tibble(
    e = tibble(
      f = b
    )
  )
)
tibble(
  a = 1:3,
  b = diag(3),
  c = cor(trees),
  d = Titanic[1:3, , , ]
)

# Data can not contain tibbles or matrices with incompatible number of rows:
try(tibble(a = 1:3, b = tibble(c = 4:7)))

# Use := to create columns with names that start with a dot:
tibble(.dotted := 3)

# This also works, but might break in the future:
tibble(.dotted = 3)

# You can unquote an expression:
x <- 3
tibble(x = 1, y = x)
tibble(x = 1, y = !!x)

# You can splice-unquote a list of quosures and expressions:
tibble(!!!list(x = rlang::quo(1:10), y = quote(x * 2)))

# Use .data, .env and !! to refer explicitly to columns or outside objects
a <- 1
tibble(a = 2, b = a)
tibble(a = 2, b = .data$a)
tibble(a = 2, b = .env$a)
tibble(a = 2, b = !!a)
try(tibble(a = 2, b = .env$bogus))
try(tibble(a = 2, b = !!bogus))

# Use tibble_row() to construct a one-row tibble:
tibble_row(a = 1, lm = lm(Height ~ Girth + Volume, data = trees))

tibble documentation built on March 31, 2023, 11 p.m.