plot_beta_distrib: plot_beta_distrib

View source: R/plot_beta_distrib.R

plot_beta_distrib R Documentation

plot_beta_distrib

Description

Estimates the shape parameters of a beta distribution that replicates a set of data, based on the data's population mean and standard deviation.

The function estimates the shape parameters of the beta distribution by the method of moments, using the submitted mean and standard deviation. The returned data are drawn on the beta distribution's 0 to 1 range and also rescaled to the user's min/max units. A ggplot2 histogram of the estimated distribution is also returned.
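The method-of-moments step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the package's actual source: the helper name beta_mom_sketch and the names of its returned list elements are hypothetical, and it assumes the mean and standard deviation are first rescaled to the 0 to 1 interval using min_val and max_val.

```r
# Hypothetical helper for illustration only; not part of RregressPkg.
beta_mom_sketch <- function(n, mean, sd, min_val = 0, max_val = 1, seed = NULL) {
  range_val <- max_val - min_val

  # Rescale the submitted moments to the unit interval of the beta distribution.
  mu <- (mean - min_val) / range_val
  sigma2 <- (sd / range_val)^2

  # The method of moments only yields positive shapes when
  # 0 < mu < 1 and the variance is below mu * (1 - mu).
  stopifnot(mu > 0, mu < 1, sigma2 > 0, sigma2 < mu * (1 - mu))

  # Method-of-moments estimates of the two shape parameters.
  common <- mu * (1 - mu) / sigma2 - 1
  shape1 <- mu * common
  shape2 <- (1 - mu) * common

  # Draw on the 0 to 1 scale, then rescale to the user's units.
  if (!is.null(seed)) set.seed(seed)
  unit_draws <- stats::rbeta(n, shape1 = shape1, shape2 = shape2)

  list(
    shape1 = shape1,
    shape2 = shape2,
    unscaled = unit_draws,
    scaled = min_val + unit_draws * range_val
  )
}

# A mean of 40 and sd of 20 on a 0 to 100 scale give shape1 = 2 and shape2 = 3.
est <- beta_mom_sketch(
  n = 1000, mean = 40, sd = 20,
  min_val = 0, max_val = 100, seed = 8675309
)
```

In this sketch the rescaled mean is 0.4 and the rescaled variance is 0.04, so the common factor is 0.24 / 0.04 - 1 = 5, which gives the two shape values noted in the comment.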

Usage

plot_beta_distrib(
  n = NULL,
  mean = NULL,
  sd = NULL,
  min_val = 0,
  max_val = 1,
  digits = 3,
  seed = NULL,
  title = NULL,
  subtitle = NULL,
  x_title = NULL,
  y_title = NULL,
  bins = 100,
  binwidth = NULL,
  bin_breaks = NULL,
  bin_class = NULL,
  bar_fill = NA,
  bar_color = "black",
  bar_alpha = 0.4,
  bar_lwd = 1,
  y_limits = NULL,
  y_major_breaks = waiver(),
  y_minor_breaks = waiver(),
  y_labels = waiver()
)

Arguments

n

An integer that sets the number of beta-distributed values to generate using stats::rbeta().

mean

A numeric that sets the population mean of the observed data.

sd

A numeric that sets the population standard deviation of the data.

min_val

A numeric that sets the population's minimum value.

max_val

A numeric that sets the population's maximum value.

digits

An integer that sets the number of digits to round the returned data.

seed

An integer that sets the random seed for the values drawn by stats::rbeta().

title

A string that sets the overall title.

subtitle

A string that sets the overall subtitle.

x_title

A string that sets the x axis title.

y_title

A string that sets the y axis title.

bins

An integer that sets the number of bins for the histogram. Default is 100.

binwidth

A numeric that sets the width of each bin, which in turn determines the number of bins. If the histogram x axis depicts a date variable, 'binwidth' is the number of days; if it depicts a time variable, 'binwidth' is the number of seconds.

bin_breaks

A numeric vector that sets the number of bins by giving the bin boundaries explicitly.

bin_class

A character string that sets the number of bins by selecting one of three types of formulas. Acceptable values are "Sturges", "Scott", or "FD".

bar_fill

A string that sets the fill color attribute for the bars.

bar_color

A string that sets the outline color attribute for the bars.

bar_alpha

A numeric that sets the alpha component attribute for 'bar_color'.

bar_lwd

A numeric that sets the outline thickness attribute of the bars.

y_limits

A numeric two-element vector or function that sets the minimum and maximum for the y axis. Use NA to refer to the existing minimum and maximum.

y_major_breaks

A numeric vector or function that sets the major tick locations along the y axis.

y_minor_breaks

A numeric vector or function that sets the minor tick locations along the y axis.

y_labels

A character vector or function giving y axis tick labels. Must be the same length as 'y_major_breaks'.
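The three 'bin_class' values share their names with base R's bin-count rules in grDevices. The sketch below shows how those rules differ on the same sample. It assumes, without confirmation from the package source, that 'bin_class' maps to something like these functions; the sample itself is made up for illustration.

```r
# Illustrative sample: 1000 draws from a beta distribution on the 0 to 1 scale.
set.seed(8675309)
x <- stats::rbeta(1000, shape1 = 2, shape2 = 5)

# Base R bin-count rules with the same names as the 'bin_class' values.
n_bins <- c(
  Sturges = grDevices::nclass.Sturges(x), # based on the sample size only
  Scott   = grDevices::nclass.scott(x),   # based on the standard deviation
  FD      = grDevices::nclass.FD(x)       # based on the interquartile range
)

print(n_bins)
```

Sturges' rule depends only on the number of observations, while the Scott and Freedman-Diaconis rules also respond to the spread of the data, so the three counts generally differ.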

Value

A list with the shape parameters used to estimate the beta distribution, along with numeric vectors of the estimated data in both unscaled and scaled format. The list also includes a ggplot2 histogram of the distribution.

Examples

library(data.table)
library(ggplot2)
library(RplotterPkg)
library(RregressPkg)

therms_dt <- data.table::as.data.table(RregressPkg::Therms18) |>
  _[, .(na.omit(fttrump))] |>
  data.table::setnames(old = "V1", new = "Rating")

beta_est_lst <- RregressPkg::plot_beta_distrib(
  n = nrow(therms_dt),
  mean = mean(therms_dt$Rating),
  sd = sd(therms_dt$Rating),
  min_val = 0,
  max_val = 100,
  seed = 8675309,
  x_title = "Trump Rating",
  y_title = "Count",
  bar_fill = "blue",
  bar_alpha = 0.5,
  y_limits = c(0, 600),
  y_major_breaks = seq(from = 0, to = 600, by = 50)
)
a_plot <- beta_est_lst$histo_plot


deandevl/RregressPkg documentation built on Feb. 5, 2025, 12:11 p.m.