ps_streaks_get_max_rank_by_sampling: ps_streaks_get_max_rank_by_sampling

View source: R/mod_plot_functions_max_rank.R

ps_streaks_get_max_rank_by_sampling    R Documentation

ps_streaks_get_max_rank_by_sampling

Description

Give an estimate of the rank returned by ps_streaks_get_max_rank_simple using this method:

  • First, apply the algorithm of ps_streaks_get_max_rank_simple to a limited set of intensity levels (e.g. c(25,50,75) instead of 1:101).

  • Second, take the returned rank and increase it by a scaling factor (e.g. 1.5).

  • Third, restrict the full streaks table to Rank values below the scaled initial estimate.

  • Finally, apply ps_streaks_get_max_rank_simple to the restricted streaks table, this time across all intensity levels.
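The four steps above can be sketched in base R on a toy table. This is only an illustration under stated assumptions, not the package's implementation: max_rank_simple is a hypothetical stand-in that takes the nth smallest Rank at each intensity level and returns the largest of those values, and the Level and Rank column names, the in-memory data frame (instead of a lazy table), and the all_levels parameter are likewise assumed.

```r
# Hypothetical stand-in for ps_streaks_get_max_rank_simple (an assumption):
# for each intensity level take the n-th smallest Rank (or the largest
# available if there are fewer than n rows), then return the maximum.
max_rank_simple <- function(streaks, n, levels) {
  per_level <- vapply(levels, function(lv) {
    ranks <- sort(streaks$Rank[streaks$Level == lv])
    if (length(ranks) == 0) return(NA_real_)
    as.numeric(ranks[min(n, length(ranks))])
  }, numeric(1))
  max(per_level, na.rm = TRUE)
}

# Sketch of the sampling method described above.
max_rank_by_sampling <- function(streaks, n, levels, scaling,
                                 all_levels = sort(unique(streaks$Level))) {
  # Step 1: run the simple algorithm on the sampled levels only.
  initial <- max_rank_simple(streaks, n, levels)
  # Step 2: scale the initial estimate up.
  bound <- initial * scaling
  # Step 3: keep only rows whose Rank is below the scaled estimate.
  restricted <- streaks[streaks$Rank < bound, , drop = FALSE]
  # Step 4: run the simple algorithm on the restricted table, all levels.
  max_rank_simple(restricted, n, all_levels)
}

# Toy data: 10 intensity levels, 5 streaks each, Rank = Level * (1..5).
streaks <- data.frame(
  Level = rep(1:10, each = 5),
  Rank  = rep(1:10, each = 5) * rep(1:5, times = 10)
)

exact <- max_rank_simple(streaks, n = 3, levels = 1:10)
est15 <- max_rank_by_sampling(streaks, n = 3, levels = c(3, 5, 7), scaling = 1.5)
est12 <- max_rank_by_sampling(streaks, n = 3, levels = c(3, 5, 7), scaling = 1.2)

print(c(exact = exact, scaling_1.5 = est15, scaling_1.2 = est12))
```

On this toy table the larger scaling factor recovers the exact value, while the smaller one cuts off rows that the exact answer needs and so returns a lower estimate.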

Usage

ps_streaks_get_max_rank_by_sampling(
  lzy_streaks,
  n,
  min_year,
  max_year,
  teams,
  levels,
  scaling
)

Arguments

lzy_streaks

Lazy streaks table

n

The function maximizes the value of the nth highest rank.

min_year

Minimum year for filter

max_year

Maximum year for filter

teams

Vector of team IDs for filter.

levels

Intensity levels for the sampling, e.g. c(25,50,75)

scaling

Scaling factor, e.g. 1.5

Details

Notes:

  • This estimate will always be less than or equal to the true value.

  • This function calls ps_streaks_get_max_rank_simple twice, each time with a filter applied to the lzy_streaks table. It is less efficient than ps_streaks_get_max_rank_simple on smaller datasets, but much faster on larger datasets.

  • Increasing the scaling factor or the intensity sample space increases the accuracy at the cost of speed.

  • Smaller datasets require larger scaling factors, and larger datasets require smaller scaling factors.

Value

Estimate of the maximum rank that ps_streaks_get_max_rank_simple would return.


tor-gu/streakexplorer documentation built on Aug. 2, 2022, 8:22 p.m.