stream_top_k: Stream top-k rows by a numeric column

View source: R/table_stream.R

stream_top_kR Documentation

Stream top-k rows by a numeric column

Description

Finds the top k rows by col without collecting the full dataset.

Usage

stream_top_k(x, col, k = 10L, decreasing = TRUE, na_drop = TRUE)

Arguments

x

A shard_row_groups or shard_dataset handle.

col

Column name to rank by.

k

Number of rows to keep.

decreasing

Logical; TRUE for largest values (default TRUE).

na_drop

Logical; drop rows where col is NA (default TRUE).

Details

For native-encoded partitions, this selects candidate rows using the numeric column without decoding strings, then decodes only the chosen rows for the returned result.

Value

A data.frame (or tibble if the tibble package is installed) with at most k rows ordered by col.

Examples


s <- schema(x = float64())
sink <- table_sink(s, mode = "row_groups")
table_write(sink, 1L, data.frame(x = c(3.0, 1.0, 2.0)))
rg <- table_finalize(sink)
stream_top_k(rg, "x", k = 2L)


shard documentation built on April 3, 2026, 9:08 a.m.