Introduction to fdth

knitr::opts_chunk$set(
  collapse  = TRUE,
  comment   = "#>",
  fig.width = 6,
  fig.height = 4,
  fig.align = "center"
)

Overview

fdth builds frequency distribution tables (fdt) and their associated graphics from vectors, data frames, and matrices for both numerical and categorical variables.

Core functions:

| Function | Purpose |
|---|---|
| fdt() | Frequency table for numerical data |
| fdt_cat() | Frequency table for categorical data |
| make.fdt() | Reconstruct a table from frequencies alone |
| make.fdt_cat() | Reconstruct a categorical table from frequencies |
| mfv() | Most frequent value (mode) |
| sd() / var() | Standard deviation / variance for grouped data |

library(fdth)

1. Numerical data — fdt()

1.1 Basic usage

set.seed(42)
x <- rnorm(200,
           mean = 10,
           sd = 2)

ft <- fdt(x)
ft

The default table has six columns:

| Column | Description |
|---|---|
| Class limits | Interval notation |
| f | Absolute frequency |
| rf | Relative frequency |
| rf(%) | Relative frequency (%) |
| cf | Cumulative frequency |
| cf(%) | Cumulative frequency (%) |
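To make the relationship between these columns concrete, the sketch below recomputes them in base R with hist(plot = FALSE). It is only an illustration, not fdth's implementation: hist() chooses its own "pretty" breaks, so the class limits will generally differ from those printed by fdt(), and the column names used here (lower, upper, rf_pct, cf_pct) are made up for the example.

```r
set.seed(42)
x <- rnorm(200, mean = 10, sd = 2)

# Bin the data without drawing; right = FALSE gives left-closed classes [a, b)
h <- hist(x, plot = FALSE, right = FALSE)

f  <- h$counts        # absolute frequency
rf <- f / sum(f)      # relative frequency
cf <- cumsum(f)       # cumulative frequency

tab <- data.frame(
  lower  = head(h$breaks, -1),   # lower class limit
  upper  = tail(h$breaks, -1),   # upper class limit
  f      = f,
  rf     = rf,
  rf_pct = 100 * rf,
  cf     = cf,
  cf_pct = 100 * cf / sum(f)
)
tab
```

The last row of cf equals the sample size and the last row of cf_pct equals 100, which is a quick sanity check for any frequency table.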

1.2 Choosing the number of classes

# Sturges (default)
fdt(x, breaks = "Sturges")

# Scott
fdt(x, breaks = "Scott")

# Freedman-Diaconis
fdt(x, breaks = "FD")

# Fixed number of classes
fdt(x, k = 8)
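To see roughly how many classes each rule suggests before building a table, base R ships the helpers nclass.Sturges(), nclass.scott() and nclass.FD() in grDevices. The sketch below assumes fdth's rules follow the same textbook definitions; the exact counts that fdt() produces may differ.

```r
set.seed(42)
x <- rnorm(200, mean = 10, sd = 2)

# Suggested number of classes under each rule (grDevices, shipped with R)
k <- c(Sturges = nclass.Sturges(x),
       Scott   = nclass.scott(x),
       FD      = nclass.FD(x))
k

# Sturges depends only on the sample size: ceiling(log2(n) + 1)
ceiling(log2(length(x)) + 1)   # 9 for n = 200
```

Scott and Freedman-Diaconis also use the spread of the data (standard deviation and interquartile range, respectively), so their counts change with the sample, not just with its size.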

1.3 Custom interval boundaries

# Fixed start, end and width
ft2 <- fdt(x,
           start = 4,
           end   = 16,
           h     = 2)
ft2
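The three arguments map onto a simple sequence of break points. A base R sketch of the same binning with seq() and cut() follows; the variable names lower, upper and width are placeholders for start, end and h, and this is not how fdth computes the table internally.

```r
set.seed(42)
x <- rnorm(200, mean = 10, sd = 2)

lower <- 4    # plays the role of start
upper <- 16   # plays the role of end
width <- 2    # plays the role of h

brk <- seq(lower, upper, by = width)   # 4, 6, 8, 10, 12, 14, 16
length(brk) - 1                        # 6 classes

# Left-closed classes [a, b); values outside [4, 16) become NA in cut()
classes <- cut(x, breaks = brk, right = FALSE)
table(classes, useNA = "ifany")
```

When choosing fixed limits, check that they cover the range of the data; range(x) is a quick way to do that.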

1.4 Formatting class limits

Use format.classes = TRUE together with pattern to control the number of decimal places displayed in the class limits:

# Two decimal places
print(ft,
      format.classes = TRUE,
      pattern        = "%.2f")

# Summary with the same formatting
summary(ft,
        format.classes = TRUE,
        pattern        = "%.2f")
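The pattern value reads like a C-style format string of the kind accepted by base R's sprintf(). Assuming the class limits are rendered that way, the effect of a pattern can be previewed on a few hypothetical limits:

```r
# Hypothetical class limits, used only to preview format strings
limits <- c(4.123456, 10, 15.98765)

sprintf("%.2f", limits)    # "4.12"  "10.00" "15.99"
sprintf("%.0f", limits)    # "4"  "10" "16"
sprintf("%05.1f", limits)  # "004.1" "010.0" "016.0"
```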

1.5 Right-closed intervals

By default intervals are left-closed [a, b). Use right = TRUE for right-closed (a, b]:

fdt(x, right = TRUE)
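The difference only matters for values that fall exactly on a class limit. Base R's cut() uses the same right convention, so a small sketch shows where a boundary value lands in each case:

```r
v   <- c(1, 2, 3)
brk <- c(0, 2, 4)

# Left-closed [a, b): the boundary value 2 goes to the upper class
as.character(cut(v, breaks = brk, right = FALSE))
# "[0,2)" "[2,4)" "[2,4)"

# Right-closed (a, b]: the boundary value 2 goes to the lower class
as.character(cut(v, breaks = brk, right = TRUE))
# "(0,2]" "(0,2]" "(2,4]"
```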

1.6 Missing values

x_na <- c(x, 
          NA, 
          NA)

# This errors by design:
tryCatch(fdt(x_na), error = function(e) message("Error: ", e$message))

# Remove NAs explicitly:
fdt(x_na, na.rm = TRUE)
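Before dropping missing values it is worth checking how many there are, since the table will describe fewer observations than the input holds. A base R sketch with a small hypothetical vector:

```r
# Hypothetical data with two missing values
x_small <- c(2.5, NA, 7.1, NA, 4.0)

anyNA(x_small)         # TRUE
sum(is.na(x_small))    # 2 values would be dropped

# Equivalent manual removal before tabulating
x_clean <- x_small[!is.na(x_small)]
length(x_clean)        # 3
```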

2. Plots — plot.fdt.default()

All plot types are selected with the type argument.

2.1 Absolute frequency histogram and polygon

plot(ft, 
     type = "fh", 
     main = "Frequency histogram")
plot(ft, 
     type = "fp", 
     main = "Frequency polygon")

2.2 Relative frequency (proportion and percentage)

plot(ft,
     type = "rfh",
     main = "Relative frequency histogram")
plot(ft,
     type = "rfph",
     main = "Relative frequency (%) histogram")

2.3 Density

plot(ft,
     type = "d",
     main = "Density histogram")
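A density histogram rescales each bar so that the bar areas, not the bar heights, sum to 1. The sketch below shows the usual definition (relative frequency divided by class width) with base R's hist(); it assumes fdth's "d" type follows the same convention.

```r
set.seed(42)
x <- rnorm(200, mean = 10, sd = 2)

h     <- hist(x, plot = FALSE)
width <- diff(h$breaks)                       # class widths

# Density = relative frequency / class width
dens <- h$counts / (sum(h$counts) * width)

all.equal(dens, h$density)   # matches hist()'s own density
sum(dens * width)            # total area is 1
```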

2.4 Cumulative frequency

plot(ft,
     type = "cfp",
     main = "Cumulative frequency polygon")
plot(ft,
     type = "cfpp",
     main = "Cumulative frequency (%) polygon")

2.5 Value labels on bars

plot(ft,
     type    = "fh",
     v       = TRUE,
     v.round = 0,
     main    = "Histogram with counts")

3. Summary statistics from grouped data

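Once an fdt object exists, the usual statistics can be computed directly from the grouped (tabulated) data, with no access to the original vector. Because only class limits and frequencies are used, the results approximate the statistics of the raw data rather than reproduce them exactly.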

ft3 <- fdt(x)

mean(ft3)
median(ft3)
mfv(ft3)          # mode(s)
var(ft3)
sd(ft3)

# Quartiles (default)
quantile(ft3)

# Deciles
quantile(ft3,
         i = 1:9,
         probs = seq(0,
                     1,
                     0.1))
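For intuition, the textbook grouped-data formulas can be worked by hand on a tiny hypothetical table. This is a sketch of the standard formulas (class midpoints for the mean and variance, linear interpolation within the median class); fdth's methods may differ in detail, for example in the variance denominator.

```r
# Hypothetical grouped table: classes [0,10), [10,20), [20,30)
lower <- c(0, 10, 20)   # lower class limits
h     <- 10             # common class width
f     <- c(2, 5, 3)     # absolute frequencies

mid <- lower + h / 2    # class midpoints: 5, 15, 25
n   <- sum(f)           # 10 observations

# Mean: frequency-weighted average of the midpoints
grouped_mean <- sum(f * mid) / n                 # 16

# Median: interpolate inside the first class whose cumulative
# frequency reaches n / 2
cf        <- cumsum(f)                           # 2, 7, 10
j         <- which(cf >= n / 2)[1]               # class 2
cf_before <- if (j > 1) cf[j - 1] else 0         # 2
grouped_median <- lower[j] + (n / 2 - cf_before) / f[j] * h   # 16

# Variance with an n - 1 denominator (an assumption of this sketch)
grouped_var <- sum(f * (mid - grouped_mean)^2) / (n - 1)      # 490 / 9
```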

4. Multiple numerical variables — fdt.data.frame()

When the input is a data frame or matrix, fdt() builds one table per numeric column and returns an fdt.multiple object.

4.1 All numeric columns

ft_iris <- fdt(iris[, 1:4])
ft_iris

4.2 Grouped by a factor

Use the by argument to stratify each numeric variable by a categorical column:

ft_by <- fdt(iris[, c(1, 2, 5)],
             k  = 5,
             by = "Species")
ft_by
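Conceptually, stratifying means splitting a numeric column by the factor and tabulating each piece. The base R sketch below does this for Sepal.Length with split() and cut(); the shared breaks from 4 to 8 are an assumption made for the example and are not necessarily the class limits fdt() would choose.

```r
# Common class limits covering the full range of Sepal.Length (4.3 to 7.9)
brk <- seq(4, 8, by = 1)

# One frequency table per species
by_species <- lapply(
  split(iris$Sepal.Length, iris$Species),
  function(v) table(cut(v, breaks = brk, right = FALSE))
)

by_species$setosa
sapply(by_species, sum)   # 50 observations in each species
```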

4.3 Plotting multiple tables

plot(ft_iris, type = "fh")

4.4 Statistics on multiple tables

mean(ft_iris)

5. Categorical data — fdt_cat()

5.1 Basic usage

set.seed(7)
fruits <- sample(c("apple", 
                   "banana", 
                   "cherry",
                   "strawberry",
                   "melon"),
                 size = 150,
                 replace = TRUE)

ft_cat <- fdt_cat(fruits)
ft_cat

By default the table is sorted by descending frequency.

5.2 Preserving natural order

fdt_cat(fruits, sort = FALSE)
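The two orderings can be compared with base R's table() on a small hypothetical vector. The sketch assumes that "natural order" means the level order that table() returns, which is alphabetical for character data.

```r
# Hypothetical categorical data
v <- c("pear", "apple", "pear", "kiwi", "pear", "kiwi")

tab <- table(v)
names(tab)                            # "apple" "kiwi" "pear" (level order)

sorted <- sort(tab, decreasing = TRUE)
names(sorted)                         # "pear" "kiwi" "apple" (by frequency)
```

To control the unsorted order explicitly in base R, convert the vector to a factor with the desired levels before tabulating.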

5.3 Formatting

print(ft_cat, round = 3)

5.4 Plots for categorical data

plot(ft_cat,
     type = "fb",
     main = "Frequency bar chart")
plot(ft_cat,
     type = "fd",
     main = "Frequency dotchart")
plot(ft_cat,
     type = "pa",
     main = "Pareto chart")
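A Pareto chart combines bars sorted by descending frequency with a cumulative percentage line. The numbers behind such a chart can be sketched in base R; the counts below are hypothetical and the code does not reproduce fdth's plotting.

```r
# Hypothetical category counts
f <- c(apple = 12, banana = 30, cherry = 8)

# Bars: categories in descending order of frequency
f_sorted <- sort(f, decreasing = TRUE)               # banana, apple, cherry

# Line: cumulative percentage across the sorted categories
cum_pct <- 100 * cumsum(f_sorted) / sum(f_sorted)    # 60, 84, 100
cum_pct
```

Reading the line at a given bar tells you what share of all observations is covered by that category and every more frequent one.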

6. Reconstructing a table from frequencies

If the original data is no longer available but the frequency table is known, make.fdt() and make.fdt_cat() rebuild complete fdt objects.

# Numerical
ft_ref <- fdt(x)

ft_new <- make.fdt(f     = ft_ref$table$f,
                   start = ft_ref$breaks["start"],
                   end   = ft_ref$breaks["end"])

print(ft_new,
      format.classes = TRUE,
      pattern = "%.2f")

# Categorical
ft_new_cat <- make.fdt_cat(f = ft_cat$f,
                           categories = ft_cat$Category)
ft_new_cat
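Supplying only frequencies, a start and an end is enough because equal-width classes are implied: the width is the range divided by the number of frequencies. A base R sketch of that reconstruction on hypothetical values (the names lower_end and upper_end stand in for start and end, and this is not fdth's code):

```r
# Hypothetical frequencies and overall limits
f         <- c(3, 9, 14, 10, 4)
lower_end <- 0    # plays the role of start
upper_end <- 50   # plays the role of end

k   <- length(f)                                   # 5 classes
h   <- (upper_end - lower_end) / k                 # class width: 10
brk <- seq(lower_end, upper_end, length.out = k + 1)

tab <- data.frame(
  lower = head(brk, -1),
  upper = tail(brk, -1),
  f     = f,
  rf    = f / sum(f),
  cf    = cumsum(f)
)
tab
```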

7. LaTeX export

For publication-ready LaTeX tables use xtable::xtable() on any fdt object. A dedicated vignette covers this workflow in detail:

vignette("latex_fdt", package = "fdth")

Session information

sessionInfo()


fdth documentation built on May 26, 2026, 1:06 a.m.