as_discrete | R Documentation |
This is a cheapr version of cut.numeric() which is more efficient and
prioritises pretty-looking breaks by default through the use of get_breaks().
Out-of-bounds values can be included naturally through the include_oob argument.
Left-closed (right-open) intervals are returned by default, in contrast to
cut's default right-closed intervals.
Furthermore, there is flexibility in formatting the interval bins,
allowing the user to specify formatting functions and symbols for
the interval close and open symbols.
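To make the interval convention concrete, the sketch below uses base R's cut() to contrast right-closed bins (cut's default) with left-closed bins, which is the convention as_discrete() uses by default. The values of x and brks are illustrative only, and the final as_discrete() call is guarded on the assumption that an installed cheapr version provides it.

```r
x <- c(0, 5, 10)
brks <- c(0, 5, 10)

# cut() default: right-closed intervals (0,5] and (5,10]; 0 falls outside
right_closed <- cut(x, brks)

# right = FALSE: left-closed intervals [0,5) and [5,10); 10 falls outside
left_closed <- cut(x, brks, right = FALSE)

as.character(right_closed)  # NA, "(0,5]", "(5,10]"
as.character(left_closed)   # "[0,5)", "[5,10)", NA

# as_discrete() follows the left-closed convention by default
# (assumes cheapr is installed and exports as_discrete())
if (requireNamespace("cheapr", quietly = TRUE)) {
  print(cheapr::as_discrete(x, breaks = brks))
}
```

Values that land outside every interval become NA in cut(); with as_discrete() the include_endpoint and include_oob arguments control whether such values are binned.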
as_discrete(x, ...)

## S3 method for class 'numeric'
as_discrete(
  x,
  breaks = if (left_closed) get_breaks(x) else cheapr_rev(-get_breaks(-x)),
  left_closed = TRUE,
  include_endpoint = FALSE,
  include_oob = FALSE,
  ordered = FALSE,
  intv_start_fun = prettyNum,
  intv_end_fun = prettyNum,
  intv_closers = c("[", "]"),
  intv_openers = c("(", ")"),
  intv_sep = ",",
  inf_label = NULL,
  ...
)

## S3 method for class 'integer64'
as_discrete(x, ...)
x |
A numeric vector. |
... |
Extra arguments passed onto methods. |
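breaks |
Break-points.
The default option creates pretty looking breaks.
Unlike cut(), a single number is not interpreted as the number of intervals; use get_breaks() to generate break-points from a count, as in the cheapr_cut() example. |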
left_closed |
Left-closed intervals or right-closed intervals? |
include_endpoint |
Include endpoint? Default is FALSE. |
include_oob |
Include out-of-bounds values? Default is FALSE. Note that cut(10, c(9, 10, Inf), right = F, include.lowest = T) != as_discrete(10, c(9, 10), include_endpoint = T, include_oob = T), i.e. the two calls do not return the same result. |
ordered |
Should result be an ordered factor? Default is FALSE. |
intv_start_fun |
Function used to format interval start points. |
intv_end_fun |
Function used to format interval end points. |
intv_closers |
A length 2 character vector denoting the symbol to use for closing either left or right closed intervals. |
intv_openers |
A length 2 character vector denoting the symbol to use for opening either left or right closed intervals. |
intv_sep |
A length 1 character vector used to separate the start and end points. |
inf_label |
Label to use for intervals that include infinity.
If left NULL (the default), a default label is used. |
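The include_oob note above can be checked with a small sketch. The base R half shows what cut() returns when Inf is appended to the breaks; the as_discrete() half is guarded and assumes cheapr is installed, and its output is deliberately not stated here.

```r
# cut(): append Inf as a break and close the last interval
base_bin <- cut(10, c(9, 10, Inf), right = FALSE, include.lowest = TRUE)
as.character(base_bin)  # "[10,Inf]"

# as_discrete(): keep the breaks as c(9, 10) and ask for the endpoint
# and out-of-bounds values to be included instead
# (assumes cheapr is installed and exports as_discrete())
if (requireNamespace("cheapr", quietly = TRUE)) {
  cheapr_bin <- cheapr::as_discrete(
    10, c(9, 10),
    include_endpoint = TRUE,
    include_oob = TRUE
  )
  print(as.character(cheapr_bin))
}
```

In cut() the value 10 falls into the interval that runs to Inf, because the endpoint of that last interval is the one being closed; comparing the printed as_discrete() label with it shows where the two approaches differ.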
A factor of discrete bins (intervals of start/end pairs).
See also: bin(), get_breaks()
library(cheapr)
# `as_discrete()` is very similar to `cut()`
# but more flexible as it allows you to supply
# formatting functions and symbols for the discrete bins
# Here is an example of how to use the formatting functions to
# categorise age groups nicely
ages <- 1:100
age_group <- function(x, breaks){
  age_groups <- as_discrete(
    x,
    breaks = breaks,
    intv_sep = "-",
    intv_end_fun = function(x) x - 1,
    intv_openers = c("", ""),
    intv_closers = c("", ""),
    include_oob = TRUE,
    ordered = TRUE
  )
  # Below is just renaming the last age group
  lvls <- levels(age_groups)
  n_lvls <- length(lvls)
  max_ages <- paste0(max(breaks), "+")
  attr(age_groups, "levels") <- c(lvls[-n_lvls], max_ages)
  age_groups
}
age_group(ages, seq(0, 80, 20))
age_group(ages, seq(0, 25, 5))
age_group(ages, 5)
# To closely replicate `cut()` with `as_discrete()` we can use the following
cheapr_cut <- function(x, breaks, right = TRUE,
                       include.lowest = FALSE,
                       ordered.result = FALSE){
  if (length(breaks) == 1){
    breaks <- get_breaks(x, breaks, pretty = FALSE,
                         expand_min = FALSE, expand_max = FALSE)
    adj <- diff(range(breaks)) * 0.001
    breaks[1] <- breaks[1] - adj
    breaks[length(breaks)] <- breaks[length(breaks)] + adj
  }
  as_discrete(x, breaks, left_closed = !right,
              include_endpoint = include.lowest,
              ordered = ordered.result,
              intv_start_fun = function(x) formatC(x, digits = 3, width = 1),
              intv_end_fun = function(x) formatC(x, digits = 3, width = 1))
}
x <- rnorm(100)
cheapr_cut(x, 10)
identical(cut(x, 10), cheapr_cut(x, 10))
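The 0.001 adjustment in cheapr_cut() mirrors what base cut() does when breaks is a single number: the data range is split into equal-width pieces and the outer limits are then moved outward by 0.1% of the range. The base-R-only sketch below rebuilds those breaks by hand as an illustration; the seed and the names rng, adj, brks and manual are arbitrary choices for this sketch and not part of cheapr.

```r
set.seed(42)  # arbitrary seed so the sketch is reproducible
x <- rnorm(100)
n <- 10

rng <- range(x)
adj <- diff(rng) / 1000  # 0.1% of the data range

# n equal-width intervals need n + 1 break-points
brks <- seq.int(rng[1], rng[2], length.out = n + 1)

# Move the outer limits outward so the minimum and maximum are both binned
brks[1] <- brks[1] - adj
brks[n + 1] <- brks[n + 1] + adj

manual <- cut(x, brks)
identical(manual, cut(x, n))
```

Without the outward shift, the minimum of x would sit exactly on the first break and be dropped from the right-closed interval that starts there.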