tab_functions: Tabulate counts and proportions
In R4EPI/tuni: Tables for Epidemiological Analysis

tab_linelist

R Documentation

Tabulate counts and proportions

Description

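Tabulate counts and proportions of categorical variables from linelist data (`tab_linelist()`) or survey data (`tab_survey()`), optionally split by a stratifier.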

Usage

tab_linelist(
  x,
  ...,
  strata = NULL,
  keep = TRUE,
  drop = NULL,
  na.rm = TRUE,
  prop_total = FALSE,
  row_total = FALSE,
  col_total = FALSE,
  wide = TRUE,
  transpose = NULL,
  digits = 1,
  pretty = TRUE
)

tab_survey(
  x,
  ...,
  strata = NULL,
  keep = TRUE,
  drop = NULL,
  na.rm = TRUE,
  prop_total = FALSE,
  row_total = FALSE,
  col_total = FALSE,
  wide = TRUE,
  transpose = NULL,
  digits = 1,
  method = "logit",
  deff = FALSE,
  pretty = TRUE
)

Arguments

`x`	a `data.frame()` or tbl_svy object
`...`	categorical variables to tabulate
`strata`	a stratifier to split the data
`keep`	a character vector specifying which values to retain in the tabulation. Defaults to `TRUE`, which keeps all the values.
`drop`	a character vector specifying which values to drop in the tabulation. Defaults to `NULL`, which keeps all values.
`na.rm`	When `TRUE` (default), missing (`NA`) values in the tabulated variables are removed from the data set with a warning, which changes the denominator for the tabulations. Setting this to `FALSE` creates an explicit missing value called "(Missing)".
`prop_total`	if `TRUE` and `strata` is not `NULL`, the totals of the rows are reported as proportions of the total data set; otherwise (default), they are proportions within the stratum.
`row_total`	create a new column with the total counts for each row of stratified data.
`col_total`	create a new row with the total counts for each column of stratified data.
`wide`	if `TRUE` (default) and `strata` is defined, the results are presented in a wide table with the counts and estimates for each stratum in separate columns. If `FALSE`, the data are presented in a long format where the counts and estimates are each in a single column. This has no effect if `strata` is not defined.
`transpose`	if `wide = TRUE`, this transposes the columns to the rows, which is useful when you stratify by age group. Default is `NULL`, which does not transpose anything. There are three options for transpose:
	`transpose = "variable"`: uses the variable column (dropping values if strata exists). Use this if you know that your values are all identical or at least identifiable by the variable name.
	`transpose = "value"`: uses the value column (dropping variables if strata exists). Use this if your values are important and the variable names are generic placeholders.
	`transpose = "both"`: combines the variable and value columns. Use this if both the variables and values are important.
`digits`	(survey only) if `pretty = FALSE`, this indicates the number of digits used for proportion and CI
`pretty`	(survey only) if `TRUE`, default, the proportion and CI are merged
`method`	(survey only) a method from `survey::svyciprop()` to calculate the confidence interval. Defaults to "logit".
`deff`	(survey only) a logical indicating if the design effect should be reported. Defaults to `FALSE`.
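The effect of `na.rm` and `prop_total` on the denominators can be illustrated with a small base-R sketch. This is not the package's implementation: the toy data frame `df` and its columns `sex` and `stratum` are hypothetical, and the `prop_total` part reflects one reading of the description above (proportions of the whole data set versus proportions within each stratum).

```r
# Hypothetical toy data: one categorical variable with a missing value,
# plus a stratifier.
df <- data.frame(
  sex     = c("F", "F", "M", NA, "M", "F", "M", "M"),
  stratum = c("A", "A", "A", "A", "B", "B", "B", "B")
)

# Analogue of na.rm = TRUE: missing rows are dropped, so the denominator is 7.
kept <- df[!is.na(df$sex), ]
prop_na_rm <- prop.table(table(kept$sex))

# Analogue of na.rm = FALSE: missing values become an explicit "(Missing)"
# category, so the denominator stays at 8.
sex_explicit <- ifelse(is.na(df$sex), "(Missing)", df$sex)
prop_explicit <- prop.table(table(sex_explicit))

# Analogue of prop_total = FALSE: proportions within each stratum
# (each column sums to 1).
counts <- table(kept$sex, kept$stratum)
within_stratum <- prop.table(counts, margin = 2)

# Analogue of prop_total = TRUE: proportions of the whole data set
# (all cells together sum to 1).
of_total <- prop.table(counts)

print(prop_na_rm)
print(prop_explicit)
print(within_stratum)
print(of_total)
```

Comparing `prop_na_rm` with `prop_explicit` shows why removing missing values changes every reported proportion, not only the row for the missing category.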

Value

a `tibble::tibble()` with a column for variables, a column for values, and columns for counts and proportions. If `strata` is not `NULL` and `wide = TRUE`, there will be separate count and proportion columns for each stratum. Survey data will also report confidence intervals.
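The difference between the long and wide layouts can be sketched in base R. This only mimics the shape of the result and is not how the package builds its output; the toy data frame `df` is hypothetical, and the `"E n"`-style column names are an assumption inferred from the note about the space before `n` in the survey example.

```r
# Hypothetical toy data: a binary variable and a three-level stratifier.
df <- data.frame(
  awards = c("Yes", "Yes", "No", "Yes", "No", "No", "Yes"),
  stype  = c("E", "E", "E", "H", "H", "M", "M")
)

# Long layout: one row per value/stratum combination, counts in a single column.
long <- as.data.frame(
  table(value = df$awards, stype = df$stype),
  responseName = "n"
)

# Wide layout: one row per value, one count column per stratum.
counts <- table(df$awards, df$stype)
wide <- cbind(value = rownames(counts), as.data.frame.matrix(counts))
rownames(wide) <- NULL
names(wide)[-1] <- paste(names(wide)[-1], "n")  # assumed naming: "E n", "H n", "M n"

print(long)
print(wide)
```

The long layout is usually easier to filter or plot, while the wide layout is closer to what a report table looks like.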

Examples

have_packages <- require("matchmaker") && require("epidict")

if (have_packages) {
  withAutoprint({

    # Simulating linelist data

    linelist <- epidict::gen_data("Measles", numcases = 1000, org = "MSF")
    measles_dict <- epidict::msf_dict("Measles", compact = FALSE)

    # Cleaning linelist data
    linelist_clean <- matchmaker::match_df(
      x = linelist,
      dictionary = measles_dict,
      from = "option_code",
      to = "option_name",
      by = "data_element_shortname",
      order = "option_order_in_set"
    )

    # get a descriptive table by sex
    tab_linelist(linelist_clean, sex)

    # describe pregnancy statistics, but remove missing data from the tally
    tab_linelist(linelist_clean, trimester, na.rm = TRUE)

    # describe by symptom

    tab_linelist(linelist_clean,
      cough, nasal_discharge, severe_oral_lesions,
      transpose = "value"
    )
    # describe pregnancy statistics, stratifying by vitamin A prescription
    tab_linelist(linelist_clean, trimester, sex,
      strata = prescribed_vitamin_a,
      na.rm = TRUE, row_total = TRUE
    )
  })
}

have_survey_packages <- require("survey") && require("srvyr")
if (have_survey_packages) {
  withAutoprint({
    data(api)

    # stratified sample
    surv <- apistrat %>%
      as_survey_design(strata = stype, weights = pw)

    s <- surv %>%
      tab_survey(awards, strata = stype, col_total = TRUE, row_total = TRUE, deff = TRUE)
    s

    # making things pretty
    s %>%
      # wrap all "n" variables in parentheses (note space before n).
      epikit::augment_redundant(" (n)" = " n") %>%
      # relabel all columns containing "ci" to "% (95% CI)"
      # and all columns containing "deff" to "Design Effect"
      epikit::rename_redundant(
        "% (95% CI)" = ci,
        "Design Effect" = deff
      )

    # long data
    surv %>%
      tab_survey(awards, strata = stype, wide = FALSE)

    # tabulate binary variables
    surv %>%
      tab_survey(yr.rnd, sch.wide, awards, keep = "Yes")

    # stratify the binary variables
    surv %>%
      tab_survey(yr.rnd, sch.wide, awards,
        strata    = stype,
        keep      = "Yes"
      )

    # invert the tabulation
    surv %>%
      tab_survey(yr.rnd, sch.wide, awards,
        strata = stype,
        drop = "Yes",
        deff = TRUE,
        row_total = TRUE
      )
  })
}
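The `method` argument names a confidence-interval method from `survey::svyciprop()`. As a rough illustration of what that choice affects, the sketch below calls `survey::svyciprop()` directly on the same `apistrat` data. It is not the internals of `tab_survey()`; the design is built with `survey::svydesign()` instead of srvyr, and the comparison with `method = "beta"` is only an example of an alternative.

```r
if (requireNamespace("survey", quietly = TRUE)) {
  data(api, package = "survey")

  # Stratified design equivalent to the srvyr call in the example above
  design <- survey::svydesign(
    ids     = ~1,
    strata  = ~stype,
    weights = ~pw,
    data    = apistrat
  )

  # Proportion of schools with awards == "Yes", with a logit-based CI
  ci_logit <- survey::svyciprop(~I(awards == "Yes"), design, method = "logit")

  # Same proportion with a different interval method, for comparison
  ci_beta <- survey::svyciprop(~I(awards == "Yes"), design, method = "beta")

  print(ci_logit)
  print(ci_beta)
}
```

The point estimate is the same in both calls; only the interval bounds depend on `method`.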

R4EPI/tuni documentation built on March 20, 2023, 4:37 p.m.