knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
histogramR is a tool based on dplyr and ggplot2 that creates classical frequency distribution tables, histograms and frequency polygons. It also compares the methods for computing the number of classes (Sturges, Freedman-Diaconis and Scott). This package is part of a final project for the Computational Statistics course of the Master of Applied Statistics at Universidad del Norte, Colombia.
histogramR is hosted in a GitHub repository, so the devtools package is needed to install it. On a fresh install of R, the following code will also install a large number of dependency packages.
install.packages("devtools")
devtools::install_github("rodianf/histogramR")
library(histogramR)
The tab_freq function creates a classical frequency distribution table, of class tibble, with five columns:
variable name: Class intervals computed by the selected method; the default is "Sturges".
f: Counts or frequency of the variable in a class interval.
rf: Relative frequency or density.
cf: Cumulative frequency.
crf: Cumulative relative frequency.
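As a rough illustration of what the five columns listed above contain, the following base R sketch builds an equivalent table by hand. It is not histogramR's implementation: the data vector, the equal-width breaks and the object names (x, k, breaks, freq_table) are assumptions made only for this example.

```r
# Illustrative data (hypothetical values, chosen so the breaks are exact)
x <- c(0, 1, 1.5, 2.5, 3, 4.5, 5, 6.5, 7, 10)

# Number of classes by Sturges' rule (grDevices ships with base R)
k <- grDevices::nclass.Sturges(x)

# Equal-width class limits spanning the data range
breaks <- seq(min(x), max(x), length.out = k + 1)

# Assign each value to a class interval; include.lowest keeps min(x)
classes <- cut(x, breaks = breaks, include.lowest = TRUE)

# Counts per class, then the derived columns
f <- as.vector(table(classes))
freq_table <- data.frame(
  class = levels(classes),   # class intervals
  f     = f,                 # frequency
  rf    = f / sum(f),        # relative frequency
  cf    = cumsum(f),         # cumulative frequency
  crf   = cumsum(f) / sum(f) # cumulative relative frequency
)
print(freq_table)
```

The last value of cf equals the number of observations and the last value of crf equals 1, which is a quick sanity check for any table of this kind.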
As the return object is a tibble, functions from dplyr can be applied to it. To include the table in an R Markdown document, use knitr::kable for better results.
Classes with zero frequency are dropped from the table. This is caused by the group_by function from the dplyr package; a correction for this behavior will be implemented soon. See https://github.com/tidyverse/dplyr/pull/3492.
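For comparison, the base R sketch below shows what a table that keeps zero-frequency classes looks like. It uses cut and table rather than dplyr, so it only illustrates the expected result and is not a fix for histogramR; the data and breaks are made up for the example.

```r
# Hypothetical data with a gap between 3 and 18
x <- c(1, 2, 3, 18, 19)

# Four classes; the two middle ones contain no observations
classes <- cut(x, breaks = c(0, 5, 10, 15, 20))

# table() on a factor reports every level, including empty ones
counts <- table(classes)
print(counts)
```

Because cut returns a factor whose levels include every interval, table reports the empty classes with a count of zero instead of omitting them.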
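library(MASS)
library(dplyr)  # provides %>% and rename()
data("Melanoma")
attach(Melanoma)

# Default method (Sturges) and Freedman-Diaconis
tab_freq(thickness)
tab_freq(thickness, nclass = "FD")

# Rename columns with dplyr
tab_freq(thickness) %>%
  rename("Frequency" = f, "Relative frequency" = rf)

# Scott's method, formatted for R Markdown
tab_freq(thickness, nclass = "scott") %>%
  rename("Frequency" = f, "Relative frequency" = rf) %>%
  knitr::kable()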
The plot_freq function creates a histogram with a frequency polygon, or a cumulative frequency polygon. The return object is a ggplot2 plot, so additional layers can be added to it.
library(ggplot2)  # provides theme_classic()

plot_freq(thickness)
plot_freq(thickness, nclass = "FD", density = TRUE)
plot_freq(thickness, nclass = "scott", density = TRUE, cfp = TRUE) +
  theme_classic()
The nc_comp function compares the methods for calculating the number of classes of a numerical random variable. It uses the plot_freq function to generate the plots. The generics print, summary and ggplot can be used on its result.
nc_comp(thickness)

comparison <- nc_comp(thickness)
print(comparison)
summary(comparison)
ggplot(comparison)
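To see how the three rules can disagree on the same data, the sketch below calls the base R helpers nclass.Sturges, nclass.FD and nclass.scott from grDevices. Whether nc_comp relies on these helpers internally is not stated here, so treat this as an independent illustration; the simulated sample and the name n_classes are assumptions.

```r
# Simulated sample (hypothetical data, fixed seed for reproducibility)
set.seed(1)
x <- rnorm(200)

# Number of classes suggested by each rule
n_classes <- c(
  Sturges = grDevices::nclass.Sturges(x),  # based on sample size only
  FD      = grDevices::nclass.FD(x),       # based on the interquartile range
  Scott   = grDevices::nclass.scott(x)     # based on the standard deviation
)
print(n_classes)
```

Sturges' rule depends only on the number of observations, while the Freedman-Diaconis and Scott rules also depend on the spread of the data, which is why the three can return different values.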