window_rank: Windowed Rank Functions

window_rank    R Documentation

Windowed Rank Functions

Description

Six variations on ranking functions, mimicking the ranking functions described in SQL:2003. They are currently implemented using the built-in rank() function. All ranking functions map the smallest inputs to the smallest outputs. Use desc() to reverse the direction.
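
As a rough illustration of the direction rule, the sketch below uses base R's rank() directly rather than the functions documented here. It assumes that, for a numeric vector, reversing the direction amounts to ranking the negated values; that is an assumption about how desc() behaves, not something this page states.

```r
# Base R sketch only: rank() stands in for the ranking functions on this page.
x <- c(5, 1, 3, 2, 2)

# Smallest input gets the smallest rank.
ascending <- rank(x, ties.method = "min")

# Assumption: negating a numeric vector reverses the direction,
# so the largest input now gets the smallest rank.
descending <- rank(-x, ties.method = "min")

ascending   # 5 1 4 2 2
descending  # 1 5 2 3 3
```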

Usage

cume_dist(x)

dense_rank(x)

min_rank(x)

ntile(x = row_number(), n)

percent_rank(x)

row_number(x)

Arguments

x

A vector of values to rank. Missing values are left as is. If you want to treat them as the smallest or largest values, replace them with -Inf or Inf, respectively, before ranking.

n

integer(1). The number of groups to split x into.
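
The missing-value note for x can be illustrated with a small base R sketch. It uses rank() with na.last = "keep" as a stand-in for the functions on this page, which is an assumption about how they treat NA; the variable names are hypothetical.

```r
# Base R sketch only: rank() stands in for the ranking functions on this page.
x <- c(5, 1, 3, 2, 2, NA)

# Missing values left as is: the NA stays NA in the result.
kept <- rank(x, ties.method = "min", na.last = "keep")

# Treat missing values as the smallest value by replacing them with -Inf.
as_smallest <- rank(replace(x, is.na(x), -Inf), ties.method = "min")

# Treat missing values as the largest value by replacing them with Inf.
as_largest <- rank(replace(x, is.na(x), Inf), ties.method = "min")

kept         # 5 1 4 2 2 NA
as_smallest  # 6 2 5 3 3 1
as_largest   # 5 1 4 2 2 6
```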

Details

  • cume_dist(): a cumulative distribution function. The proportion of all values less than or equal to the current value.

  • dense_rank(): like min_rank(), but with no gaps between ranks

  • min_rank(): equivalent to rank(ties.method = "min")

  • ntile(): a rough rank, which breaks the input vector into n buckets. The sizes of the buckets may differ by up to one; larger buckets have lower rank.

  • percent_rank(): a number between 0 and 1 computed by rescaling min_rank to [0, 1]

  • row_number(): equivalent to rank(ties.method = "first")
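
The definitions above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the *_sketch names are hypothetical, the input contains no missing values except in ntile_sketch(), and the ntile rule is written directly from the description (bucket sizes differ by at most one, larger buckets come first).

```r
x <- c(5, 1, 3, 2, 2)

# Ties broken by position / ties share the lowest rank.
row_number_sketch <- rank(x, ties.method = "first")
min_rank_sketch   <- rank(x, ties.method = "min")

# Like min_rank, but ranks are consecutive: position among the sorted unique values.
dense_rank_sketch <- match(x, sort(unique(x)))

# min_rank rescaled to [0, 1].
percent_rank_sketch <- (min_rank_sketch - 1) / (length(x) - 1)

# Proportion of values less than or equal to each value.
cume_dist_sketch <- rank(x, ties.method = "max") / length(x)

# Hypothetical helper: split ranks into n buckets whose sizes differ by at most one.
ntile_sketch <- function(x, n) {
  r <- rank(x, ties.method = "first", na.last = "keep")
  len <- sum(!is.na(r))
  # The first (len %% n) buckets receive one extra element.
  sizes <- rep(len %/% n, n) + (seq_len(n) <= len %% n)
  bucket_of_rank <- rep(seq_len(n), times = sizes)
  bucket_of_rank[r]  # NA ranks index to NA, so missing values stay missing
}

row_number_sketch    # 5 1 4 2 3
min_rank_sketch      # 5 1 4 2 2
dense_rank_sketch    # 4 1 3 2 2
percent_rank_sketch  # 1.00 0.00 0.75 0.25 0.25
cume_dist_sketch     # 1.0 0.2 0.8 0.6 0.6
ntile_sketch(1:8, 3) # 1 1 1 2 2 2 3 3
```

With eight values and three buckets, the sizes come out as 3, 3 and 2, which matches the rule that the larger buckets receive the lower ranks.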

Examples

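# Attach poorman so the ranking functions, mutate(), filter(), between() and %>% are available
library(poorman)

x <- c(5, 1, 3, 2, 2, NA)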
row_number(x)
min_rank(x)
dense_rank(x)
percent_rank(x)
cume_dist(x)

ntile(x, 2)
ntile(1:8, 3)

# row_number can be used with single table verbs without specifying x
# (for data frames and databases that support windowing)
mutate(mtcars, row_number() == 1L)
mtcars %>% filter(between(row_number(), 1, 10))


poorman documentation built on Nov. 2, 2023, 5:27 p.m.