The goal of {epikit} is to provide miscellaneous functions for applied epidemiologists. This is a product of the R4EPIs project; learn more at https://r4epis.netlify.com.
You can install {epikit} from CRAN (see details for the latest version):
```r
install.packages("epikit")
```
Alternative installation options:
If there is a bugfix or feature that is not yet on CRAN, you can install it via
the {drat} package:
You can also install the in-development version from GitHub using the {remotes} package (but there's no guarantee that it will be stable):
```r
# install.packages("remotes")
remotes::install_github("R4EPI/epikit")
```

```r
library("epikit")
```
The {epikit} package was primarily designed to house convenience functions that applied epidemiologists can use to tidy their reports. The functions in {epikit} come in a few categories:
A couple of functions are dedicated to constructing age categories and partitioning them into separate chunks.
- `age_categories()` takes in a vector of numbers and returns formatted age categories.
- `group_age_categories()` will take a data frame with different age categories in columns (e.g. years, months, weeks) and combine them into a single column, selecting the column with the lowest priority.

```r
library("knitr")
library("magrittr")
set.seed(1)
x <- sample(0:100, 20, replace = TRUE)
y <- ifelse(x < 2, sample(48, 20, replace = TRUE), NA)
df <- data.frame(
  age_years = age_categories(x, upper = 80),
  age_months = age_categories(y, upper = 16, by = 6)
)
df %>%
  group_age_categories(years = age_years, months = age_months)
```
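To make the idea of formatted age categories concrete, here is a minimal base R sketch that bins ages into ten-year groups with `cut()`. It is not how {epikit} implements `age_categories()`; the `ages` vector, the label format, and the open-ended `"80+"` group are assumptions chosen for illustration.

```r
# Hypothetical ages in years (not taken from the example above)
ages <- c(0, 4, 15, 37, 52, 80, 95)

# Lower bound of each category; the last category is open-ended
lower <- seq(0, 80, by = 10)

# Labels such as "0-9", "10-19", ..., "70-79", plus "80+"
labels <- c(paste(head(lower, -1), lower[-1] - 1, sep = "-"), "80+")

# right = FALSE makes each interval closed on the left: [0, 10), [10, 20), ...
age_group <- cut(ages, breaks = c(lower, Inf), labels = labels, right = FALSE)

data.frame(ages, age_group)
```

The result is a factor, so the categories keep their natural order when tabulated or plotted.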
The inline functions make it easier to print estimates with confidence intervals in reports with the correct number of digits.
- `fmt_ci()` formats confidence intervals from three numbers (e.g. `fmt_ci(50, 10, 80)`).
- `fmt_pci()` formats confidence intervals from three fractions, multiplying by 100 beforehand.

The confidence interval manipulation functions take in a data frame and combine their confidence intervals into a single character string, much like the inline functions do. There are two flavors:
- `merge_ci_df()` and `merge_pci_df()` will merge just the values of the confidence interval and leave the estimate alone. Note: this WILL remove the lower and upper columns.
- `unite_ci()` merges both the confidence interval and the estimate into a single character column. This generally has more options than `merge_ci_df()`.

This is useful for reporting models:
```r
fit <- lm(100/mpg ~ disp + hp + wt + am, data = mtcars)
df <- data.frame(v = names(coef(fit)), e = coef(fit), confint(fit), row.names = NULL)
names(df) <- c("variable", "estimate", "lower", "upper")
print(df)
# unite CI has more options
unite_ci(df, "slope (CI)", estimate, lower, upper, m100 = FALSE, percent = FALSE)
# merge_ci just needs to know where the estimate is
merge_ci_df(df, e = 2)
```
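For readers who want to see what "combining an estimate and its interval into a single character column" amounts to, the following base R sketch does it with `sprintf()`. The data frame `est` and the output format string are assumptions for illustration; they are not the format that {epikit} produces.

```r
# Hypothetical estimates with lower and upper confidence bounds
est <- data.frame(
  variable = c("a", "b"),
  estimate = c(1.234, -0.5),
  lower    = c(0.9, -1.25),
  upper    = c(1.567, 0.25)
)

# Combine the three numeric columns into one character column,
# fixing the number of digits so every row lines up in a report
est$ci <- sprintf("%.2f (CI %.2f to %.2f)", est$estimate, est$lower, est$upper)

# Keep only the label and the combined column, as a report table would
est[, c("variable", "ci")]
```

Because the combined column is character, it should be built as the last step, after any numeric filtering or sorting.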
If you need a quick function to determine the number of breaks you need for a
grouping or color scale, you can use `find_breaks()`. This will always start
from 1, so that you can include zero in your scale when you need to.
```r
find_breaks(100) # four breaks from 1 to 100
find_breaks(100, snap = 20) # four breaks, snap to the nearest 20
find_breaks(100, snap = 20, ceiling = TRUE) # include the highest number
```
To quickly pull together population counts for use in surveys or demographic
pyramids, the `gen_population()` function can help.
If you only know the proportions in each group, the function will convert them to
counts for you; if you already have counts, you can type those in directly.
The default proportions are based on Doctors Without Borders general emergency
intervention standard values.
```r
# get population counts based on proportion, stratified
gen_population(groups = c("0-4","5-14","15-29","30-44","45+"),
               strata = c("Male", "Female"),
               proportions = c(0.079, 0.134, 0.139, 0.082, 0.067))
```
Type in counts directly to get the groups in a data frame.
```r
# get population counts based on counts, stratified - type out counts
# for each group and strata
gen_population(groups = c("0-4","5-14","15-29","30-44","45+"),
               strata = c("Male", "Female"),
               counts = c(20, 10, 30, 40, 0, 0, 40, 30, 20, 20))
```
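The conversion from proportions to counts can be sketched in base R as follows. This is only an illustration of the arithmetic, not the internals of `gen_population()`; the total population of 1000 and the choice to reuse the same proportions for each stratum are assumptions.

```r
groups      <- c("0-4", "5-14", "15-29", "30-44", "45+")
strata      <- c("Male", "Female")
proportions <- c(0.079, 0.134, 0.139, 0.082, 0.067)
total_pop   <- 1000  # hypothetical total population

# One row per group/stratum pair; groups vary fastest within each stratum
pop <- expand.grid(groups = groups, strata = strata, stringsAsFactors = FALSE)

# Assumption: the same five proportions apply to each stratum
pop$proportions <- rep(proportions, times = length(strata))

# Convert each proportion to a whole-number count
pop$n <- round(pop$proportions * total_pop)

pop
```

Rounding means the counts need not add up to exactly `total_pop`; here the ten proportions sum to 1.002, so the counts sum to 1002.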
These functions all modify the appearance of a table displayed in a report and
work best with the `knitr::kable()` function.
- `rename_redundant()` renames redundant columns with a single name (e.g. `hospitalized_percent` and `confirmed_percent` can both be renamed to `%`).
- `augment_redundant()` is similar to `rename_redundant()`, but it modifies the redundant column names (e.g. `hospitalized_n` and `confirmed_n` can become `hospitalized (n)` and `confirmed (n)`).
- `merge_ci()` combines estimate, lower bound, and upper bound columns into a single column.

```r
df <- data.frame(
  `a n` = 1:6,
  `a prop` = round((1:6) / 6, 2),
  `a deff` = round(pi, 2),
  `b n` = 6:1,
  `b prop` = round((6:1) / 6, 2),
  `b deff` = round(pi * 2, 2),
  check.names = FALSE
)
knitr::kable(df)
df %>%
  rename_redundant("%" = "prop", "Design Effect" = "deff") %>%
  augment_redundant(" (n)" = " n$") %>%
  knitr::kable()
```
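The difference between renaming and augmenting column names can be sketched with base R pattern matching. This is a rough analogue under assumed column names, not the {epikit} implementation: `sub()` edits part of a name, while direct assignment replaces the whole name.

```r
# Hypothetical table with repeated "n" and "prop" columns per group
tab <- data.frame(
  `a n` = 1:3, `a prop` = c(0.2, 0.3, 0.5),
  `b n` = 3:1, `b prop` = c(0.5, 0.3, 0.2),
  check.names = FALSE
)

new_names <- names(tab)

# Augment: keep the group prefix and rewrite only the suffix, "a n" -> "a (n)"
new_names <- sub(" n$", " (n)", new_names)

# Rename: every column ending in "prop" gets the same short header
new_names[grepl("prop$", new_names)] <- "%"

names(tab) <- new_names
names(tab)
```

Duplicate headers such as `%` are fine for a table that is only printed, but they make the columns awkward to select by name, so this step belongs at the very end of a pipeline.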