{farrago} is an R package that collects tools for data workflows and analysis, with a focus on health surveillance data. Although {farrago} primarily serves as a personal collection of odds and ends picked up or created over the past several years, it may assist wider audiences as well. The package is organized by general purpose/functionality, and these groups may eventually be separated into discrete packages.
{farrago} is only available from GitHub and the latest version can be installed with:
# install.packages("devtools")
devtools::install_github("al-obrien/farrago")
{farrago} has a variety of functions available. They are roughly organized into the following categories:
Calculations: provides algorithms for some routine processes such as…
Determining pregnancy trimesters
Assigning episode periods (e.g. for repeat infections)
Collapsing time-steps
Determining overlaps in time
Basic metrics such as rates, max/min, etc.
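To make the idea of episode periods concrete, here is a minimal base-R sketch. It is not the {farrago} implementation: `sketch_episode()` is a hypothetical helper, and the rule it applies (a new episode starts once a date falls more than `threshold` days after the first date of the current episode) is an assumption chosen because it matches the `episode` column in the example output later in this README.

```r
# Hypothetical helper, not part of {farrago}: number the episodes within a
# single group. A new episode starts when a date is more than `threshold`
# days after the first date of the current episode (assumed rule).
sketch_episode <- function(dates, threshold = 10) {
  dates <- as.Date(dates)
  episode <- rep(NA_integer_, length(dates))
  ord <- order(dates, na.last = NA)  # chronological order, missing dates dropped
  current <- 0L
  start <- as.Date(NA)
  for (i in ord) {
    if (is.na(start) || as.numeric(dates[i] - start) > threshold) {
      current <- current + 1L
      start <- dates[i]
    }
    episode[i] <- current
  }
  episode
}

# Dates for group 2 of the example data
sketch_episode(c("2020-01-01", "2020-09-10", "2020-09-21"), threshold = 10)
#> [1] 1 2 3
```

To cover several groups, the helper would be applied once per group, for example after `split()` or inside `dplyr::group_by()`.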
Conversions: helper functions to convert between common formats in epidemiology
Replace all blank values with NA (e.g. when importing data from SAS)
Switch between flu and calendar weeks
Determine flu season from date
Quickly convert a table to image (png)
Basic conversions such as from numbers to percent, number to factor, etc.
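As an illustration of the blank-to-NA conversion, the following base-R sketch shows one way it could be done. `sketch_blank_to_na()` is a hypothetical name, and treating whitespace-only strings as blank is an assumption; the package's own `convert_blank2NA()` may behave differently.

```r
# Hypothetical helper, not the {farrago} implementation: replace empty or
# whitespace-only strings with NA in every character column of a data frame.
sketch_blank_to_na <- function(df) {
  is_chr <- vapply(df, is.character, logical(1))
  df[is_chr] <- lapply(df[is_chr], function(x) {
    x[!is.na(x) & trimws(x) == ""] <- NA_character_
    x
  })
  df
}

raw <- data.frame(id = 1:3,
                  condition = c("alive", "", "  "),
                  stringsAsFactors = FALSE)
clean <- sketch_blank_to_na(raw)
clean$condition
#> [1] "alive" NA      NA
```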
Creation: generate new content
Make multi-level factors similar to SAS ‘multi-label’ functionality
Create hypercubes (i.e. n-dimensional table including group summaries and totals)
Determine break points from set of values
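The `create_breaks()` call in the example below turns ages into labels such as `50-59` and `>=90`. A rough base-R equivalent, built on `cut()`, could look like the following. `sketch_age_group()` is a hypothetical helper and the label format is inferred from the example output, so treat it as a sketch rather than the package's logic.

```r
# Hypothetical helper, not the {farrago} implementation: bin numeric values
# into left-closed intervals with "lower-upper" labels and an open-ended
# ">=last" category for everything at or above the final break.
sketch_age_group <- function(age, breaks = seq(0, 90, by = 10)) {
  lower <- breaks
  upper <- c(breaks[-1] - 1, Inf)
  labels <- ifelse(is.finite(upper),
                   paste0(lower, "-", upper),
                   paste0(">=", lower))
  cut(age, breaks = c(breaks, Inf), labels = labels, right = FALSE)
}

as.character(sketch_age_group(c(51, 41, 91, NA, 72)))
#> [1] "50-59" "40-49" ">=90"  NA      "70-79"
```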
Transferal: methods to move objects and data
Easily stow() and retrieve() data-sets to make efficient use of RAM
File transfer using WinSCP wrapper
Locate files
Pass code and retrieve data from SAS (primarily for use with Classic 9.4)
Plotting: helper functions for shared legends and less common plots such as bulls-eye charts and X-splines
Miscellaneous
This is a basic example using a sub-set of functions from {farrago}…
# Load libraries
library(farrago)
library(magrittr)
library(dplyr)
library(lubridate)
# Download from configured SFTP location
transfer_winscp(file = 'my_rmt_file.csv',
                direction = 'download',
                connection = 'sftp://myusername:mypwd@hostlocation.ca/',
                rmt_path = './location/',
                drop_location = 'C:/PATH/TO/DESIRED/FOLDER/')
# Nonsense data for example
my_rmt_file <- tibble::tribble(~grp_id, ~date, ~date_of_birth, ~condition, ~date_of_birth_child,
1, '2020-01-01', '1970-06-04', 'alive', '1991-01-01',
1, '2020-01-01', '1980-04-05', '', '1990-02-04',
1, '2020-01-03', '1930-04-05', 'alive', '',
1, '2020-01-04', '1967-04-05', 'alive', '1998-01-21',
2, '2020-01-01', '1978-04-05', 'alive', '1998-06-21',
2, '2020-09-10', '1970-04-05', 'alive', '1992-09-13',
2, '2020-09-21', '1949-04-05', 'dead', '1987-01-03',
3, '2020-01-01', '1977-04-05', '', '1992-01-21',
3, '2020-01-02', '1944-04-05', 'alive', '',
3, '2020-01-21', '1943-06-05', 'alive', '1967-09-12',
3, '2020-01-22', '1969-07-05', 'alive', '2006-12-21',
3, '2020-04-22', '', NA, NA,
3, '2021-06-09', '1978-09-21', 'dead', '1992-01-21') %>%
dplyr::mutate_at(vars(contains('date')), ymd)
# Remove blanks
my_rmt_file <- convert_blank2NA(my_rmt_file)
# Determine episode period based on first date by group
my_rmt_file$episode <- assign_episode(data = my_rmt_file,
grp_id = grp_id,
date = date,
threshold = 10)
# Determine age and age group from date
my_rmt_file$age <- calculate_age(my_rmt_file$date_of_birth)
my_rmt_file$age_grp <- create_breaks(my_rmt_file$age, breaks = c(0, 10, 20, 30, 40, 50, 60, 70, 80, 90), format = TRUE)
# Calculate trimester based on dob (of child)
my_rmt_file <- calculate_trimesters(my_rmt_file, date_of_birth_child)
#> Warning in calculate_trimesters(my_rmt_file, date_of_birth_child): No variable
#> for gestation length was provided, all pregnancies will assume the average
#> pregnancy length of: 40
# View final dataset
knitr::kable(my_rmt_file)
| grp_id | date | date_of_birth | condition | date_of_birth_child | episode | age | age_grp | tri1_s | tri1_e | tri2_s | tri2_e | tri3_s | preterm |
|--------:|:-----------|:----------------|:----------|:-----------------------|--------:|----:|:---------|:-----------|:-----------|:-----------|:-----------|:-----------|--------:|
| 1 | 2020-01-01 | 1970-06-04 | alive | 1991-01-01 | 1 | 51 | 50-59 | 1990-03-27 | 1990-06-26 | 1990-06-27 | 1990-09-26 | 1990-09-27 | 0 |
| 1 | 2020-01-01 | 1980-04-05 | NA | 1990-02-04 | 1 | 41 | 40-49 | 1989-04-30 | 1989-07-30 | 1989-07-31 | 1989-10-30 | 1989-10-31 | 0 |
| 1 | 2020-01-03 | 1930-04-05 | alive | NA | 1 | 91 | >=90 | NA | NA | NA | NA | NA | NA |
| 1 | 2020-01-04 | 1967-04-05 | alive | 1998-01-21 | 1 | 54 | 50-59 | 1997-04-16 | 1997-07-16 | 1997-07-17 | 1997-10-16 | 1997-10-17 | 0 |
| 2 | 2020-01-01 | 1978-04-05 | alive | 1998-06-21 | 1 | 43 | 40-49 | 1997-09-14 | 1997-12-14 | 1997-12-15 | 1998-03-16 | 1998-03-17 | 0 |
| 2 | 2020-09-10 | 1970-04-05 | alive | 1992-09-13 | 2 | 51 | 50-59 | 1991-12-08 | 1992-03-08 | 1992-03-09 | 1992-06-08 | 1992-06-09 | 0 |
| 2 | 2020-09-21 | 1949-04-05 | dead | 1987-01-03 | 3 | 72 | 70-79 | 1986-03-29 | 1986-06-28 | 1986-06-29 | 1986-09-28 | 1986-09-29 | 0 |
| 3 | 2020-01-01 | 1977-04-05 | NA | 1992-01-21 | 1 | 44 | 40-49 | 1991-04-16 | 1991-07-16 | 1991-07-17 | 1991-10-16 | 1991-10-17 | 0 |
| 3 | 2020-01-02 | 1944-04-05 | alive | NA | 1 | 77 | 70-79 | NA | NA | NA | NA | NA | NA |
| 3 | 2020-01-21 | 1943-06-05 | alive | 1967-09-12 | 2 | 78 | 70-79 | 1966-12-06 | 1967-03-07 | 1967-03-08 | 1967-06-07 | 1967-06-08 | 0 |
| 3 | 2020-01-22 | 1969-07-05 | alive | 2006-12-21 | 2 | 52 | 50-59 | 2006-03-16 | 2006-06-15 | 2006-06-16 | 2006-09-15 | 2006-09-16 | 0 |
| 3 | 2020-04-22 | NA | NA | NA | 3 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 2021-06-09 | 1978-09-21 | dead | 1992-01-21 | 4 | 43 | 40-49 | 1991-04-16 | 1991-07-16 | 1991-07-17 | 1991-10-16 | 1991-10-17 | 0 |
# Save file for easy retrieval later
my_rmt_file_stowed <- stow(my_rmt_file, cleanup = TRUE)
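Returning to the trimester columns in the example output: the dates are consistent with counting back 40 weeks (280 days) from the child's date of birth and then using fixed day offsets for each boundary. The sketch below reproduces those columns in base R. `sketch_trimesters()` is a hypothetical helper, and the offsets (91, 92, 183, 184 days) are inferred from the table rather than taken from the package source; the `preterm` column is not covered.

```r
# Hypothetical helper, not the {farrago} implementation: derive trimester
# boundaries from a child's date of birth, assuming a fixed gestation length.
# Day offsets are inferred from the example output (assumption).
sketch_trimesters <- function(dob_child, gestation_weeks = 40) {
  dob_child <- as.Date(dob_child)
  start <- dob_child - gestation_weeks * 7  # estimated start of pregnancy
  data.frame(
    tri1_s = start,
    tri1_e = start + 91,
    tri2_s = start + 92,
    tri2_e = start + 183,
    tri3_s = start + 184
  )
}

# First row of the example data
sketch_trimesters("1991-01-01")
#>       tri1_s     tri1_e     tri2_s     tri2_e     tri3_s
#> 1 1990-03-27 1990-06-26 1990-06-27 1990-09-26 1990-09-27
```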