Working with the tsg package

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(tsg)

Throughout the examples, we will use the person_record sample dataset, which is included in the tsg package. This dataset contains demographic information about individuals, including person_id, sex, age, marital_status, employed status, and functional difficulties.

dim(person_record)
head(person_record)

Generate frequency table

The generate_frequency() function creates frequency tables for one or more categorical variables in a data frame. It supports a variety of enhancements, such as sorting, adding totals and percentages, handling missing values, and customizing labels. This function is highly versatile and can work with grouped data, outputting either a single table or a list of tables.

Basic usage

person_record |> 
  generate_frequency(sex)

Multiple variables

If you pass multiple variables, the function returns a list containing one frequency table per variable.

person_record |>
  generate_frequency(sex, age, marital_status)

Grouping

You can also specify grouping with group_by() from dplyr, and the function will calculate the frequency table for each group.

person_record |>
  dplyr::group_by(sex) |>
  generate_frequency(marital_status)

By default, the function will generate a single frequency table for the grouped data. If you want to generate a list of frequency tables for each group, you can set group_as_list = TRUE.

person_record |>
  dplyr::group_by(sex) |>
  generate_frequency(marital_status, group_as_list = TRUE)
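The difference between the two output shapes can be sketched in base R. This only illustrates the idea and is not how tsg implements it; the people data frame and its values are made up for the example.

```r
# Hypothetical toy data; not the person_record dataset
people <- data.frame(
  sex = c("Male", "Female", "Female", "Male", "Female"),
  marital_status = c("Single", "Married", "Single", "Married", "Married")
)

# One combined table: a row for every group/category combination
combined <- as.data.frame(
  table(sex = people$sex, marital_status = people$marital_status),
  responseName = "frequency"
)

# A list with one frequency table per group
per_group <- lapply(split(people$marital_status, people$sex), table)
```

The combined form is convenient for further filtering or plotting, while the list form is closer to what you would write out as one sheet or table per group.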

Sorting

By default, the output is sorted by frequency in descending order. If sort_value is set to FALSE, the output will be sorted by the variable values in ascending order.

person_record |>
  generate_frequency(age, sort_value = TRUE)

person_record |>
  generate_frequency(age, sort_value = FALSE)
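As a rough base-R analogy for the two orderings (an illustration with made-up ages, not the package's internals):

```r
# Hypothetical toy data; not the person_record dataset
age <- c(30, 25, 30, 41, 25, 30)

counts <- table(age)

# Ordered by frequency, highest first (analogous to sort_value = TRUE)
by_frequency <- sort(counts, decreasing = TRUE)

# Ordered by the variable's own values, lowest first
# (analogous to sort_value = FALSE)
by_value <- counts[order(as.numeric(names(counts)))]
```

For a variable such as age, ordering by value usually reads more naturally, which is why excluding it from frequency sorting can be useful.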

If multiple variables are specified, you can use the sort_except argument to name the variables that should be excluded from sorting.

person_record |>
  generate_frequency(
    sex, 
    age, 
    marital_status, 
    # vector of variable names (character) to exclude from sorting
    sort_except = "age" 
  )

Top n values

You can limit the frequency table to the n most frequent values with the top_n argument; this applies only when sort_value is TRUE. By default, the table shows the top n values plus the remaining values grouped into "Others".

person_record |>
  generate_frequency(
    marital_status,
    top_n = 3
  )

If you want to show only the top-n values and exclude the rest, set top_n_only = TRUE.

person_record |>
  generate_frequency(
    marital_status, 
    top_n = 3,
    top_n_only = TRUE
  )
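The lumping behaviour described above can be sketched in base R as follows. This is an assumption-laden illustration, not tsg's code: the status vector is invented and n stands in for top_n.

```r
# Hypothetical toy data; not the person_record dataset
status <- c(
  rep("Single", 4), rep("Married", 3),
  rep("Widowed", 2), "Separated"
)

# Named vector of counts, most frequent first
counts <- table(status)
freq <- sort(setNames(as.integer(counts), names(counts)), decreasing = TRUE)

n <- 2  # stands in for top_n

# Top n values plus everything else lumped into "Others"
with_others <- c(head(freq, n), Others = sum(tail(freq, -n)))

# Top n values only (analogous to top_n_only = TRUE)
top_only <- head(freq, n)
```

Note that the "Others" row keeps the overall total intact, whereas the top-n-only variant drops the remaining observations from the table.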

Handling missing values

You can also specify whether to include or exclude NAs (missing values) from the frequency table.

person_record |>
  generate_frequency(
    employed,
    include_na = TRUE # default
  )

# Exclude NA values
person_record |>
  generate_frequency(
    employed,
    include_na = FALSE
  )
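The choice matters because it changes the denominator of every percentage. A small base-R sketch with invented values (not the package's implementation) shows the effect:

```r
# Hypothetical toy data; not the person_record dataset
employed <- c("Yes", "No", NA, "Yes", NA, "Yes")

# NA counted as its own category
with_na <- table(employed, useNA = "ifany")

# NA dropped before counting
without_na <- table(employed, useNA = "no")

# Share of "Yes" under each choice
share_with_na <- as.numeric(with_na["Yes"]) / sum(with_na)
share_without_na <- as.numeric(without_na["Yes"]) / sum(without_na)
```

Here the share of "Yes" is 3 out of 6 when missing values are kept and 3 out of 4 when they are excluded.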

Collapse list

If all the variables passed to generate_frequency() have the same structure (i.e., the same levels or categories), you can collapse them into a single frequency table by setting collapse_list = TRUE.

person_record |>
  generate_frequency(
    seeing,
    hearing,
    walking,
    remembering,
    self_caring,
    communicating, 
    collapse_list = TRUE
  )

Equivalently, you can use the collapse_list() helper function.

person_record |>
  generate_frequency(
    seeing,
    hearing,
    walking,
    remembering,
    self_caring,
    communicating
  ) |> 
  collapse_list()
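Collapsing works because every variable shares the same set of categories, so the per-variable counts line up column by column. A minimal base-R sketch of that idea follows; the category labels and the records data frame are hypothetical, and the layout of the real collapsed table may differ.

```r
# Hypothetical categories and toy data; not the person_record dataset
levels_used <- c("No difficulty", "Some difficulty", "A lot of difficulty")

records <- data.frame(
  seeing = factor(
    c("No difficulty", "Some difficulty", "No difficulty"),
    levels = levels_used
  ),
  hearing = factor(
    c("No difficulty", "No difficulty", "A lot of difficulty"),
    levels = levels_used
  )
)

# One named vector of counts per variable (a list of tables)
tables <- lapply(records, function(x) {
  setNames(as.integer(table(x)), levels(x))
})

# Stack them: one row per variable, one column per shared category
collapsed <- do.call(rbind, tables)
```

If the variables did not share the same categories, the rows would not align, which is why the same-structure condition is required.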

More options

You can also add cumulative frequency and percentage to the frequency table.

person_record |>
  generate_frequency(
    sex, 
    add_cumulative = TRUE, 
    add_cumulative_percent = TRUE 
  )
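For reference, cumulative columns are running totals of the frequency and percentage columns. The base-R sketch below uses made-up data and hypothetical column names; it is not meant to reproduce the exact output of generate_frequency().

```r
# Hypothetical toy data; not the person_record dataset
sex <- c("Male", "Female", "Female", "Male", "Female")

counts <- table(sex)
freq_table <- data.frame(
  category = names(counts),
  frequency = as.integer(counts),
  stringsAsFactors = FALSE
)

# Percentage of the total for each category
freq_table$percent <- 100 * freq_table$frequency / sum(freq_table$frequency)

# Running totals down the rows
freq_table$cumulative <- cumsum(freq_table$frequency)
freq_table$cumulative_percent <- cumsum(freq_table$percent)
```

The last row of the cumulative columns always equals the overall total and 100 percent, which is a quick sanity check on any frequency table.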

You can also express the values as proportions rather than percentages.

person_record |>
  generate_frequency(
    marital_status,
    as_proportion = TRUE
  )

You can also position the total row at the top of the table.

person_record |>
  generate_frequency(
    marital_status,
    position_total = "top"
  )

NOTE: For labelled data, the value for the total row is automatically set to the lowest numeric value. The default label for the total row is "Total"; to set a custom label, use the label_total argument.

Generate cross-tabulation

The generate_crosstab() function allows you to create cross-tabulations between two variables, which is useful for exploring relationships between categorical variables.

Basic usage

person_record |>
  generate_crosstab(marital_status, sex)

NOTE: If you pass only one variable, the function falls back to generate_frequency() and generates a frequency table for the variable specified.

Multiple variables

If you pass multiple variables, it will generate cross-tabulations for each pair of variables separately in a list.

person_record |>
  generate_crosstab(
    sex,
    seeing,
    hearing,
    walking,
    remembering,
    self_caring,
    communicating
  )

Grouping

You can also specify grouping with group_by() from dplyr and it will calculate the cross-tabulation for each group.

person_record |>
  dplyr::group_by(sex) |>
  generate_crosstab(marital_status, employed)

If you want to generate a list of cross-tabulations for each group, you can set group_as_list = TRUE.

person_record |>
  dplyr::group_by(sex) |>
  generate_crosstab(marital_status, employed, group_as_list = TRUE)

Percent or proportion by row or column

You can specify whether to calculate the percentage or proportion by row or column using the percent_by_column argument. If it is set to TRUE, the percentage will be calculated by column; if set to FALSE, it will be calculated by row. The default is FALSE.

person_record |>
  generate_crosstab(
    marital_status,
    sex,
    percent_by_column = TRUE
  )
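The distinction between row and column percentages can be illustrated with base R's prop.table(). The data below are invented, and this sketch is not the package's implementation:

```r
# Hypothetical toy data; not the person_record dataset
marital_status <- c(
  "Single", "Married", "Single", "Married",
  "Single", "Married", "Single"
)
sex <- c("Male", "Male", "Female", "Female", "Female", "Male", "Female")

counts <- table(marital_status, sex)

# Each row sums to 100 (analogous to percent_by_column = FALSE)
row_percent <- 100 * prop.table(counts, margin = 1)

# Each column sums to 100 (analogous to percent_by_column = TRUE)
col_percent <- 100 * prop.table(counts, margin = 2)
```

Row percentages answer "of the married respondents, what share are female?", while column percentages answer "of the female respondents, what share are married?".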

More options

As with generate_frequency(), you can express the values as proportions rather than percentages.

person_record |>
  generate_crosstab(
    marital_status,
    sex,
    as_proportion = TRUE
  )

You can also position the total row at the top of the table.

person_record |>
  generate_crosstab(
    marital_status,
    sex,
    position_total = "top"
  )

Generate output

You can export your frequency table or cross-tabulation to Excel using write_xlsx().

Basic usage

person_record |> 
  generate_frequency(sex) |> 
  write_xlsx(path = "table-01.xlsx")

Add table info

You can add a title and subtitle to your table using the add_table_title() and add_table_subtitle() functions.

person_record |> 
  generate_crosstab(marital_status, sex) |> 
  add_table_title("Marital Status by Sex") |>
  add_table_subtitle("Sample dataset: person_record") |>
  write_xlsx(path = "table-02.xlsx")

You can also add end notes to your table using the add_source_note() and add_footnote() functions.

person_record |> 
  generate_crosstab(marital_status, sex) |> 
  add_table_title("Marital Status by Sex") |>
  add_table_subtitle("Sample dataset: person_record") |>
  add_source_note("Source: person_record dataset") |>
  add_footnote("This is a footnote for the table") |>
  write_xlsx(path = "table-03.xlsx")

Alternatively, you can add the table title, subtitle, source note, and footnotes directly through the arguments of write_xlsx().

person_record |> 
  generate_crosstab(marital_status, sex) |> 
  write_xlsx(
    path = "table-03.xlsx",
    table_title = "Marital Status by Sex",
    table_subtitle = "Sample dataset: person_record",
    source_note = "Source: person_record dataset",
    footnotes = "This is a footnote for the table"
  )

Facade

You can use the add_facade() function to apply a facade to your table. A facade is a set of styling options that can be applied to the table to customize its appearance.

person_record |> 
  generate_frequency(sex) |> 
  add_facade(
    table.offsetRow = 2, 
    table.offsetCol = 1
  ) |> 
  write_xlsx(
    path = "table-04.xlsx",
    # Using built-in facade
    facade = get_tsg_facade("yolo")
  )

To further customize the appearance of your table, pass a facade to the facade argument of write_xlsx(). A facade is defined in a YAML file that contains styling options for the table, such as font size, border style, background color, and text alignment.

person_record |> 
  generate_frequency(sex) |> 
  write_xlsx(
    path = "table-05.xlsx",
    # Using built-in facade
    facade = get_tsg_facade("yolo")
  )

You can generate a template facade file using the generate_template() function and then customize it to your needs.

The generate_output() function

generate_output() generates and saves the output file in the specified format. It is designed to cover several formats (e.g., Excel, HTML, PDF, Word) and to handle different data structures.

person_record |> 
  generate_frequency(sex) |> 
  generate_output(path = "table-06.xlsx")

NOTE: At the moment, it only supports Excel output. The other formats are not yet implemented.




tsg documentation built on Feb. 22, 2026, 5:08 p.m.