# percent_rank: Proportional ranking functions In dplyr: A Grammar of Data Manipulation

## Proportional ranking functions

### Description

These two ranking functions implement two slightly different ways to compute a percentile. For each `x_i` in `x`:

• `cume_dist(x)` counts the total number of values less than or equal to `x_i`, and divides it by the number of observations.

• `percent_rank(x)` counts the total number of values less than `x_i`, and divides it by the number of observations minus 1.

In both cases, missing values are ignored when counting the number of observations.
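The two definitions, including the treatment of missing values, can be sketched by hand in base R. The helper names `cume_dist_by_hand()` and `percent_rank_by_hand()` are made up for this illustration and are not part of dplyr; the sketch assumes that a missing `x_i` itself yields `NA`.

```r
# Hypothetical helpers for illustration only; not part of dplyr.
cume_dist_by_hand <- function(x) {
  n <- sum(!is.na(x))  # number of non-missing observations
  vapply(x, function(xi) {
    if (is.na(xi)) NA_real_ else sum(x <= xi, na.rm = TRUE) / n
  }, numeric(1))
}

percent_rank_by_hand <- function(x) {
  n <- sum(!is.na(x))  # number of non-missing observations
  vapply(x, function(xi) {
    if (is.na(xi)) NA_real_ else sum(x < xi, na.rm = TRUE) / (n - 1)
  }, numeric(1))
}

x_na <- c(5, NA, 3, 2, 2)
cume_dist_by_hand(x_na)     # 1, NA, 0.75, 0.5, 0.5
percent_rank_by_hand(x_na)  # 1, NA, 2/3, 0, 0
```

Note that the denominator uses the four non-missing values, not the full length of five.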

### Usage

``````
percent_rank(x)

cume_dist(x)
``````

### Arguments

 `x` A vector to rank.

By default, the smallest values will get the smallest ranks. Use `desc()` to reverse the direction so the largest values get the smallest ranks.

Missing values will be given rank `NA`. Use `coalesce(x, Inf)` or `coalesce(x, -Inf)` if you want to treat them as the largest or smallest values respectively.

To rank by multiple columns at once, supply a data frame.
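The options for `x` can be tried out as follows. This is a small sketch that assumes a dplyr version recent enough for ranking functions to accept data frames; the vectors `x` and `y` and the data frame `df` are made up for illustration.

```r
library(dplyr)

x <- c(5, 1, 3, 2, 2)
percent_rank(x)        # smallest value gets 0
percent_rank(desc(x))  # largest value gets 0

y <- c(5, NA, 3)
percent_rank(y)                  # the missing value stays NA
percent_rank(coalesce(y, Inf))   # NA treated as the largest value
percent_rank(coalesce(y, -Inf))  # NA treated as the smallest value

# Ranking by several columns: ties in `a` are broken by `b`.
df <- data.frame(a = c(1, 1, 2), b = c(2, 1, 1))
percent_rank(df)
```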

### Value

A numeric vector containing a proportion.

### See Also

Other ranking functions: `ntile()`, `row_number()`

### Examples

``````
library(dplyr)

x <- c(5, 1, 3, 2, 2)

cume_dist(x)
percent_rank(x)

# You can understand what's going on by computing it by hand
sapply(x, function(xi) sum(x <= xi) / length(x))
sapply(x, function(xi) sum(x < xi)  / (length(x) - 1))
# The real computations are a little more complex in order to
# correctly deal with missing values
``````
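One way to cope with missing values, sketched here in base R, is to express both quantities through `rank()` with `na.last = "keep"`. This is an assumed equivalent formulation for illustration, not a statement of how dplyr computes the results internally; the names `x_na`, `n`, `cd` and `pr` are made up for the example.

```r
x_na <- c(10, NA, 30, 20, 20, NA)
n <- sum(!is.na(x_na))  # 4 non-missing observations

# cume_dist analogue: tied values take the largest rank in their group.
cd <- rank(x_na, ties.method = "max", na.last = "keep") / n
cd  # 0.25, NA, 1, 0.75, 0.75, NA

# percent_rank analogue: tied values take the smallest rank in their group.
pr <- (rank(x_na, ties.method = "min", na.last = "keep") - 1) / (n - 1)
pr  # 0, NA, 1, 1/3, 1/3, NA
```

With `na.last = "keep"`, missing inputs stay `NA` in the output and do not take part in the ranking, so only the denominators need to be adjusted.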

dplyr documentation built on Nov. 17, 2023, 5:08 p.m.