
histogramR

Overview

histogramR is a tool built on dplyr and ggplot2 that creates classical frequency distribution tables, histograms and frequency polygons. It also compares the methods for computing the number of classes (Sturges, Freedman-Diaconis and Scott). This package is part of the final project for the Computational Statistics course in the Master of Applied Statistics program at Universidad del Norte, Colombia.

Installation

histogramR is hosted on GitHub, so the devtools package is needed to install it. On a fresh installation of R, the following code will also install a large number of dependency packages.

install.packages("devtools")
devtools::install_github("rodianf/histogramR")

library(histogramR)

Usage

tab_freq

This function creates a classical frequency distribution table, returned as a tibble with five columns.

Because the returned object is a tibble, dplyr functions can be applied to it. To include the table in an R Markdown document, pass it to knitr::kable for better results.
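To make the idea of a classical frequency distribution table concrete, here is a minimal base R sketch. It is not the package's implementation: the data are hypothetical, and the column names lower, upper and mark are assumptions made for illustration (only f and rf appear in the examples in this document).

```r
# Hypothetical data; any numeric vector works.
x <- c(2.1, 3.4, 3.9, 5.0, 5.2, 6.8, 7.7, 9.5)

# hist() computes class limits and counts; plot = FALSE suppresses drawing.
h <- hist(x, breaks = "Sturges", plot = FALSE)

# One row per class. Column names other than f and rf are hypothetical.
freq_table <- data.frame(
  lower = head(h$breaks, -1),       # lower class limit
  upper = tail(h$breaks, -1),       # upper class limit
  mark  = h$mids,                   # class midpoint
  f     = h$counts,                 # absolute frequency
  rf    = h$counts / sum(h$counts)  # relative frequency
)

freq_table
```

The absolute frequencies add up to the number of observations and the relative frequencies add up to 1, which is a quick sanity check for any table of this kind.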

Note

Classes with zero frequency are dropped from the table. This is caused by the group_by function from the dplyr package; a fix for this behavior is planned. See https://github.com/tidyverse/dplyr/pull/3492.
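Until then, empty classes can be kept when counting by hand. The sketch below assumes dplyr 0.8.0 or later, where count() accepts a .drop argument; the data, the breaks and the column name class are hypothetical, and nothing here reflects how tab_freq works internally.

```r
library(dplyr)

# Hypothetical data with a gap between 3 and 9.
x <- c(1, 2, 2, 3, 9, 10)
breaks <- seq(0, 10, by = 2)   # classes (0,2], (2,4], (4,6], (6,8], (8,10]

# cut() returns a factor whose levels include the empty classes.
classes <- data.frame(class = cut(x, breaks = breaks))

# .drop = FALSE keeps factor levels that have no observations.
freq <- classes %>%
  count(class, .drop = FALSE, name = "f")

freq
```

With the default .drop = TRUE, the two empty classes would be missing from the result.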

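library(MASS)   # provides the Melanoma data set
library(dplyr)  # provides %>% and rename() used below

data("Melanoma")

# Makes the columns of Melanoma (such as thickness) available by name
attach(Melanoma)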

tab_freq(thickness)

tab_freq(thickness, nclass = "FD")

tab_freq(thickness) %>% 
  rename("Frequency" = f,
         "Relative frequency" = rf)

tab_freq(thickness, nclass = "scott") %>% 
  rename("Frequency" = f,
         "Relative frequency" = rf) %>% 
  knitr::kable()

plot_freq

This function creates a histogram with a frequency polygon, or a cumulative frequency polygon. The returned object is a ggplot2 plot, so additional layers can be added to it.

library(ggplot2)  # provides theme_classic() and ggplot() used below

plot_freq(thickness)

plot_freq(thickness, nclass = "FD", density = TRUE)

plot_freq(thickness, nclass = "scott", density = TRUE, cfp = TRUE) +
  theme_classic()
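For comparison, a similar histogram with an overlaid frequency polygon can be sketched directly in ggplot2. This is an illustration only and not how plot_freq is implemented: the bin count is taken from the base R function grDevices::nclass.FD, and the variable names are hypothetical.

```r
library(ggplot2)

# Same data as in the examples above.
df <- data.frame(thickness = MASS::Melanoma$thickness)

# Number of classes from the Freedman-Diaconis rule in base R.
k <- grDevices::nclass.FD(df$thickness)

p <- ggplot(df, aes(x = thickness)) +
  geom_histogram(bins = k, fill = "grey80", colour = "black") +  # histogram
  geom_freqpoly(bins = k) +                                      # frequency polygon
  theme_classic()

p
```

Both layers use the same number of bins so that the vertices of the polygon sit at the centers of the histogram bars.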

nc_comp

This function compares the methods for calculating the number of classes for a numeric variable. It uses the plot_freq function to generate the plots. Generics such as print, summary and ggplot can be used on the result.

nc_comp(thickness)

comparison <- nc_comp(thickness)

print(comparison)

summary(comparison)

ggplot(comparison)
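The three rules compared above are also available in base R through the grDevices package, which gives a quick way to see how much they disagree for a given variable. Whether histogramR calls these functions internally is an assumption; the sketch only shows the rules themselves.

```r
# Same data as in the examples above.
x <- MASS::Melanoma$thickness

# Suggested number of classes under each rule (base R implementations).
classes <- c(
  Sturges = grDevices::nclass.Sturges(x),  # ceiling(log2(n) + 1)
  FD      = grDevices::nclass.FD(x),       # Freedman-Diaconis, based on the IQR
  scott   = grDevices::nclass.scott(x)     # Scott, based on the standard deviation
)

classes
```

Sturges depends only on the sample size, while the Freedman-Diaconis and Scott rules also depend on the spread of the data, so the three values can differ noticeably for skewed variables.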


rodianf/histogramR documentation built on May 14, 2019, 7:33 a.m.