The goal of {epikit} is to provide miscellaneous functions for applied epidemiologists. This is a product of the R4EPIs project; learn more at https://r4epis.netlify.app/.

Installation

You can install {epikit} from CRAN (see details for the latest version):

install.packages("epikit")

If there is a bugfix or feature that is not yet on CRAN, you can install it via the {drat} package.

You can also install the in-development version from GitHub using the {remotes} package (but there's no guarantee that it will be stable):

# install.packages("remotes")
remotes::install_github("R4EPI/epikit") 


library("epikit")

The {epikit} package was primarily designed to house convenience functions for applied epidemiologists to use in tidying their reports. The functions in {epikit} come in a few categories:

Age categories

A couple of functions are dedicated to constructing age categories and partitioning them into separate chunks.

library("knitr")
library("magrittr")

set.seed(1)
x <- sample(0:100, 20, replace = TRUE)
y <- ifelse(x < 2, sample(48, 20, replace = TRUE), NA)
df <- data.frame(
  age_years = age_categories(x, upper = 80), 
  age_months = age_categories(y, upper = 16, by = 6)
)
df %>% 
  group_age_categories(years = age_years, months = age_months)
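
For readers who want to see the underlying idea, fixed-width age binning can be sketched in base R with cut(). This is only an illustration, not the {epikit} implementation: the `ages` vector, the ten-year width, and the label format ("0-9", ..., "80+") are assumptions made for the example.

```r
# Base R sketch of fixed-width age binning; not the {epikit} source.
set.seed(1)
ages <- sample(0:100, 20, replace = TRUE)

# Lower bound of each bin: 0, 10, ..., 80
lower <- seq(0, 80, by = 10)

# Labels "0-9", "10-19", ..., "70-79", plus an open-ended "80+" (assumed format)
labels <- c(paste0(head(lower, -1), "-", head(lower, -1) + 9), "80+")

age_group <- cut(
  ages,
  breaks = c(lower, Inf),  # the last bin has no upper limit
  labels = labels,
  right = FALSE            # bins are [0, 10), [10, 20), ...
)

table(age_group)
```

Using right = FALSE keeps each boundary age (10, 20, ...) in the bin it opens rather than the one it closes.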

Quick proportions with confidence intervals

Three functions provide quick statistics for different rates, with confidence intervals based on binomial estimates of proportions from binom::binom.wilson():

attack_rate(10, 50)
case_fatality_rate(2, 50)
mortality_rate(40, 50000)
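
To make the interval itself less of a black box, the Wilson score interval can be computed by hand in base R. The helper name `wilson_ci` and its arguments are hypothetical and exist only for this sketch; it is not how {epikit} or {binom} implement the calculation.

```r
# Wilson score interval in base R; an illustrative sketch only.
# `wilson_ci` is a hypothetical helper, not part of {epikit} or {binom}.
wilson_ci <- function(x, n, conf_level = 0.95) {
  z <- qnorm(1 - (1 - conf_level) / 2)  # normal quantile for the chosen level
  p <- x / n                            # observed proportion
  denom <- 1 + z^2 / n
  centre <- (p + z^2 / (2 * n)) / denom
  half_width <- z * sqrt(p * (1 - p) / n + z^2 / (4 * n^2)) / denom
  c(estimate = p, lower = centre - half_width, upper = centre + half_width)
}

# 10 cases in a population of 50, expressed as percentages
round(100 * wilson_ci(10, 50), 2)
```

Unlike the simple normal approximation, the Wilson interval is not centred on the observed proportion, which keeps it inside 0 to 1 even for small counts.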

In addition, it's possible to rapidly calculate the case fatality rate from a linelist, stratified by different groups (e.g. gender):

library("outbreaks")
case_fatality_rate_df(ebola_sim_clean$linelist, 
  outcome == "Death", 
  group = gender,
  add_total = TRUE,
  mergeCI = TRUE
)

Inline functions

The inline functions make it easier to print estimates with confidence intervals in reports with the correct number of digits.

The functions with the _df suffix (fmt_ci_df(), fmt_pci_df()) print the confidence intervals for data stored in data frames. They are designed to work with the outputs of the rate functions. For example, fmt_ci_df(attack_rate(10, 50)) formats the estimate and its confidence interval as a single string. Each of these functions takes three arguments, e, l, and u, which refer to the estimate, lower, and upper column positions or names.
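
The general idea of formatting an estimate with its interval at a fixed number of digits can be sketched in base R with sprintf(). The helper `format_ci_sketch` is hypothetical, the input values are illustrative, and the output layout is an assumption for this example; the exact string produced by the {epikit} functions may differ.

```r
# Base R sketch of inline CI formatting; not the {epikit} source.
# `format_ci_sketch` is a hypothetical helper and its layout is assumed.
format_ci_sketch <- function(e, l, u, digits = 2) {
  # Build a template such as "%.2f%% (CI %.2f-%.2f)" for the requested digits
  fmt <- paste0("%.", digits, "f%% (CI %.", digits, "f-%.", digits, "f)")
  sprintf(fmt, e, l, u)
}

# Illustrative values: estimate, lower bound, upper bound
format_ci_sketch(20, 11.2, 33)
```

In an R Markdown report, a call like this would typically sit in an inline code chunk so that the text updates whenever the data change.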

Confidence interval manipulation

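The confidence interval manipulation functions take in a data frame and combine its confidence interval columns into a single character string, much like the inline functions do. There are two flavors: unite_ci(), which has more options, and merge_ci_df(), which only needs to know where the estimate is.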

This is useful for reporting models:

fit <- lm(100/mpg ~ disp + hp + wt + am, data = mtcars)
df  <- data.frame(v = names(coef(fit)), e = coef(fit), confint(fit), row.names = NULL)
names(df) <- c("variable", "estimate", "lower", "upper")
print(df)

# unite CI has more options
unite_ci(df, "slope (CI)", estimate, lower, upper, m100 = FALSE, percent = FALSE)

# merge_ci just needs to know where the estimate is
merge_ci_df(df, e = 2)

Give me a break

If you need a quick function to determine the number of breaks you need for a grouping or color scale, you can use find_breaks(). This will always start from 1, so that you can include zero in your scale when you need to.

find_breaks(100) # four breaks from 1 to 100
find_breaks(100, snap = 20) # four breaks, snap to the nearest 20
find_breaks(100, snap = 20, ceiling = TRUE) # include the highest number

Pull together population counts

To quickly pull together population counts for use in surveys or demographic pyramids, the gen_population() function can help. If you only know the proportions in each group, the function will convert these to counts for you; if you have counts, you can type those in directly. The default proportions are based on Doctors Without Borders general emergency intervention standard values.

# get population counts based on proportion, stratified
gen_population(groups = c("0-4","5-14","15-29","30-44","45+"), 
               strata = c("Male", "Female"), 
               proportions = c(0.079, 0.134, 0.139, 0.082, 0.067))
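
The conversion from proportions to counts can be pictured with a small base R sketch. It assumes a total population of 1000 and that the same five proportions apply within each stratum; both are assumptions for illustration, and the names `total_pop` and `pop` are introduced here rather than taken from {epikit}.

```r
# Base R sketch of turning proportions into counts; not the {epikit} source.
groups <- c("0-4", "5-14", "15-29", "30-44", "45+")
strata <- c("Male", "Female")
proportions <- c(0.079, 0.134, 0.139, 0.082, 0.067)
total_pop <- 1000  # assumed total population for this example

# One row per group/stratum combination (groups vary fastest)
pop <- expand.grid(groups = groups, strata = strata, stringsAsFactors = FALSE)

# Assumption: the same proportions are reused within each stratum
pop$proportions <- rep(proportions, times = length(strata))
pop$n <- round(pop$proportions * total_pop)

pop
```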

Type in counts directly to get the groups in a data frame.

# get population counts based on counts, stratified - type out counts
# for each group and strata
gen_population(groups = c("0-4","5-14","15-29","30-44","45+"), 
               strata = c("Male", "Female"), 
               counts = c(20, 10, 30, 40, 0, 0, 40, 30, 20, 20))

Table modification

These functions all modify the appearance of a table displayed in a report and work best with the knitr::kable() function.

df <- data.frame(
  `a n` = 1:6,
  `a prop` = round((1:6) / 6, 2),
  `a deff` = round(pi, 2),
  `b n` = 6:1,
  `b prop` = round((6:1) / 6, 2),
  `b deff` = round(pi * 2, 2),
  check.names = FALSE
)
knitr::kable(df)
df %>%
  rename_redundant("%" = "prop", "Design Effect" = "deff") %>%
  augment_redundant(" (n)" = " n$") %>%
  knitr::kable()



epikit documentation built on Feb. 16, 2023, 7:42 p.m.