distinct.duckplyr_df: Keep distinct/unique rows

distinct.duckplyr_df    R Documentation

Description

This is a method for the dplyr::distinct() generic. It keeps only the unique/distinct rows of a data frame, similar to unique.data.frame() but considerably faster.

Usage

## S3 method for class 'duckplyr_df'
distinct(.data, ..., .keep_all = FALSE)

Arguments

.data

A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr). See the Methods section of dplyr::distinct() for more details.

...

<data-masking> Optional variables to use when determining uniqueness. If there are multiple rows for a given combination of inputs, only the first row will be preserved. If omitted, will use all variables in the data frame.

.keep_all

If TRUE, keep all variables in .data. If a combination of ... is not distinct, this keeps the first row of values.

See Also

dplyr::distinct()

Examples

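library(duckplyr)

# 100 rows drawn from a 10 x 10 grid of values, so duplicates are very likely
df <- duckdb_tibble(
  x = sample(10, 100, replace = TRUE),
  y = sample(10, 100, replace = TRUE)
)
nrow(df)            # total number of rows
nrow(distinct(df))  # number of unique (x, y) combinations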
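
The Arguments section describes two options that the example above does not exercise: naming variables in ... and setting .keep_all. The following is a minimal sketch of both, assuming the dplyr and duckplyr packages are installed. The data and the names small, all_cols, by_x, and by_x_all are made up for illustration and do not come from the package documentation.

```r
library(dplyr)
library(duckplyr)

# Small deterministic input (illustrative data): x repeats, y never does.
small <- duckdb_tibble(
  x = c(1, 1, 2, 2, 3),
  y = c("a", "b", "c", "d", "e")
)

# No variables named: uniqueness is judged on all columns (x and y).
all_cols <- distinct(small)

# Uniqueness judged on x only: the result contains just the x column.
by_x <- distinct(small, x)

# Uniqueness judged on x, but every column of the input is kept.
# Per the Arguments section, the first row for each x is preserved.
by_x_all <- distinct(small, x, .keep_all = TRUE)

nrow(all_cols)
nrow(by_x)
names(by_x_all)
```

Because every (x, y) pair in this input is already unique, distinct(small) returns all rows, while restricting uniqueness to x collapses the repeated values.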

duckdblabs/duckplyr documentation built on March 5, 2025, 3:46 a.m.