compute_breakpoints: Compute Breakpoints Based on Sorting Variable

View source: R/compute_breakpoints.R

compute_breakpoints    R Documentation

Compute Breakpoints Based on Sorting Variable

Description

[Experimental]

This function computes breakpoints based on a specified sorting variable. It can optionally filter the data by exchange before computing the breakpoints. The function requires either the number of portfolios to create or specific percentiles for the breakpoints, but not both. It can also handle cases where the sorting variable clusters at the edges of its distribution, by assigning all extreme values to the edge portfolios and attempting to compute equally populated breakpoints from the remaining values.
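The core idea in the description can be sketched in base R as quantiles of the sorting variable, optionally computed on a subset of exchanges. This is only an illustration under stated assumptions, not the package's implementation: sketch_breakpoints is a hypothetical helper, and the choice to include the 0 and 1 endpoints among the breakpoints is an assumption.

```r
# Hypothetical helper (not part of tidyfinance): quantile-based breakpoints
# with an optional exchange filter applied before the quantiles are taken.
sketch_breakpoints <- function(data, sorting_variable, probs, exchanges = NULL) {
  if (!is.null(exchanges)) {
    # Keep only rows whose exchange is in the requested set.
    data <- data[data$exchange %in% exchanges, , drop = FALSE]
  }
  quantile(data[[sorting_variable]], probs = probs, na.rm = TRUE, names = FALSE)
}

data <- data.frame(
  exchange = rep(c("NYSE", "NASDAQ"), times = 50),
  market_cap = 1:100
)

# Two ways to specify the same split into five portfolios
# (assumption: endpoints 0 and 1 are part of the breakpoint vector).
n_portfolios <- 5
probs_from_n <- seq(0, 1, length.out = n_portfolios + 1)
percentiles <- c(0.2, 0.4, 0.6, 0.8)
probs_from_percentiles <- c(0, percentiles, 1)

# Breakpoints from all rows, and from NYSE rows only.
bp_all <- sketch_breakpoints(data, "market_cap", probs_from_n)
bp_nyse <- sketch_breakpoints(
  data, "market_cap", probs_from_percentiles, exchanges = "NYSE"
)
```

Filtering first means the breakpoints reflect only the selected exchanges, even if they are later applied to every row in data.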

Usage

compute_breakpoints(
  data,
  sorting_variable,
  breakpoint_options,
  data_options = NULL
)

Arguments

data

A data frame containing the dataset for breakpoint computation.

sorting_variable

A string specifying the column name in data to be used for determining breakpoints.

breakpoint_options

A named list of options for computing the breakpoints, typically created with breakpoint_options() as in the examples. The arguments include:

  • n_portfolios An optional integer specifying the number of equally sized portfolios to create. This parameter is mutually exclusive with percentiles.

  • percentiles An optional numeric vector specifying the percentiles for determining the breakpoints of the portfolios. This parameter is mutually exclusive with n_portfolios.

  • breakpoint_exchanges An optional character vector specifying exchange names to filter the data before computing breakpoints. Exchanges must be stored in a column named exchange in data. If NULL, no filtering is applied.

  • smooth_bunching An optional logical specifying whether to attempt to smooth the non-extreme portfolios when the sorting variable bunches at the extremes (TRUE, the default) or not (FALSE). In some cases, smoothing will not produce equal-sized portfolios away from the edges because of multiple clusters. If sufficiently large bunching is detected, percentiles is ignored and equally spaced portfolios are returned with a warning.
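To see why bunching matters, the following base-R sketch uses a sorting variable with half of its observations at the minimum. Plain quantile breakpoints then coincide, so some portfolios would be empty. The "smoothed" variant below is one possible reading of the description (set the bunched values aside in the edge portfolio, then split the rest evenly); it is an assumption for illustration, not the algorithm tidyfinance actually uses.

```r
# Sorting variable with half of the observations bunched at the lower edge.
x <- c(rep(0, 50), 1:50)
n_portfolios <- 5
probs <- seq(0, 1, length.out = n_portfolios + 1)

# Naive quantile breakpoints: several of them collapse onto the bunched value.
naive <- quantile(x, probs = probs, names = FALSE)

# Illustrative smoothing (assumption, not the package's method):
# the bunched minimum gets its own edge portfolio, and the remaining
# values are split into n_portfolios - 1 equally populated groups.
edge <- min(x)
rest <- x[x > edge]
rest_probs <- seq(0, 1, length.out = n_portfolios)
smoothed <- c(edge, quantile(rest, probs = rest_probs, names = FALSE))

# Portfolio sizes implied by the smoothed breakpoints.
sizes <- table(findInterval(x, smoothed, rightmost.closed = TRUE))
```

With the naive breakpoints, the duplicated values make the first portfolios indistinguishable; the smoothed vector keeps all breakpoints distinct while still returning n_portfolios + 1 values.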

data_options

A named list of data_options with character values, indicating the column names required to run this function. The required column name identifies the exchange. Defaults to exchange = exchange.

Value

A vector of breakpoints of the desired length.

Note

This function stops with an error if both n_portfolios and percentiles are provided, or if neither is provided. Supply exactly one of these parameters.
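The mutual-exclusivity rule in this note can be sketched as a small input check. Both check_breakpoint_inputs and fails are hypothetical helpers written for this illustration, and the error messages are made up; the package's own checks and wording may differ.

```r
# Hypothetical validation helper (not part of tidyfinance): exactly one of
# n_portfolios and percentiles must be supplied.
check_breakpoint_inputs <- function(n_portfolios = NULL, percentiles = NULL) {
  if (!is.null(n_portfolios) && !is.null(percentiles)) {
    stop("Provide either n_portfolios or percentiles, not both.")
  }
  if (is.null(n_portfolios) && is.null(percentiles)) {
    stop("Provide one of n_portfolios or percentiles.")
  }
  invisible(TRUE)
}

# Small helper that reports whether an expression raises an error.
fails <- function(expr) inherits(try(expr, silent = TRUE), "try-error")

both_given <- fails(check_breakpoint_inputs(n_portfolios = 5, percentiles = 0.5))
none_given <- fails(check_breakpoint_inputs())
one_given <- fails(check_breakpoint_inputs(n_portfolios = 5))
```

Here both_given and none_given flag the two invalid combinations, while one_given corresponds to a valid call.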

Examples

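library(tidyfinance)

# Toy data: 100 stocks with a randomly assigned exchange
data <- data.frame(
  id = 1:100,
  exchange = sample(c("NYSE", "NASDAQ"), 100, replace = TRUE),
  market_cap = 1:100
)

# Breakpoints for five equally sized portfolios
compute_breakpoints(data, "market_cap", breakpoint_options(n_portfolios = 5))

# Breakpoints at given percentiles, computed from NYSE stocks only
compute_breakpoints(
  data, "market_cap",
  breakpoint_options(
    percentiles = c(0.2, 0.4, 0.6, 0.8),
    breakpoint_exchanges = c("NYSE")
  )
)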


tidyfinance documentation built on April 3, 2025, 6:10 p.m.