knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

Purpose

The purpose of this package is to calculate test utilization rate normalization (TURN) of respiratory pathogen data.

A note on data inputs

The turnr package is currently very particular about the inputs to its functions. To limit functions breaking in unforeseen ways, I've developed them to throw errors if the input columns etc. aren't as expected; this comes at the expense of flexibility. Note that this package is still being developed, so breaking changes can be expected. To get documentation about any of the turnr functions, run ?function_name.

Installation

If turnr hasn't already been installed run:

install.packages("devtools")

devtools::install_github("https://github.com/MartinHoldrege/turnr")

Alternatively, run the following code to also build this vignette (slower because additional dependencies are installed).

devtools::install_github("https://github.com/MartinHoldrege/turnr", build_vignettes = TRUE)

Processing data

turnr comes with a built-in fake data set that will be used here. To read more about the data set, run ?rp_raw. Data used by this package need to have the same form, with the same columns.

library(turnr)
library(dplyr)
head(rp_raw)

First, run a check on the data set to see if it has the elements required by subsequent functions.

initial_check(rp_raw)
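To illustrate the kind of strict input checking described above, here is a minimal sketch in base R. The helper `check_columns` is hypothetical and is not part of turnr; the required column names are only those mentioned in this vignette, and the real `initial_check` may test more (or different) things.

```r
# Hypothetical helper (not a turnr function): stop with an informative
# error if any required column is missing from the data frame.
check_columns <- function(df, required) {
  stopifnot(is.data.frame(df))
  missing_cols <- setdiff(required, names(df))
  if (length(missing_cols) > 0) {
    stop("Missing required column(s): ",
         paste(missing_cols, collapse = ", "),
         call. = FALSE)
  }
  invisible(TRUE)
}

# Column names taken from this vignette; the full required set is an assumption
required <- c("SiteID", "InstrumentVersion", "PouchTitle", "TargetName")

good <- data.frame(SiteID = 1, InstrumentVersion = "v1",
                   PouchTitle = "x", TargetName = "a")
check_columns(good, required)  # passes silently

# Dropping columns triggers the error; capture the message instead of stopping
bad <- good[, c("SiteID", "TargetName")]
result <- tryCatch(check_columns(bad, required),
                   error = function(e) conditionMessage(e))
result
```

Failing early like this trades flexibility for clearer error messages, which is the design choice noted at the start of this vignette.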

Extract site information from the raw data file (to be used by later functions).

site_info <-  get_site_info(rp_raw)
# add cdc/hhs regions--this could be used for later grouping
site_info$cdc_region <- cdc_region(state_abb = site_info$Region)
head(site_info)

Now pre-process the data (parse columns, etc.). Additionally, when no pathogen is detected, a new TargetName of 'negative' is added. See ?pre_process. This function also lets you replace TargetName values with synonyms.

df1 <- pre_process(rp_raw)
head(df1)

Replace co-detections of pathogens with a new TargetName, 'co-detection'. For example, if a test detected pathogens 'a' and 'b', these TargetNames would be replaced with 'co-detection'. This step can be skipped if you want to keep the total counts of each pathogen. The benefit of this step is that negatives, co-detections, and detections of single pathogens then sum to the total test utilization rate.

df2 <- co_detection(df1)

unique(df2$TargetName)[19]
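The idea behind collapsing co-detections can be sketched on toy data in base R. This is not how `co_detection` is implemented; the `test_id` column and the toy values are assumptions made only for illustration.

```r
# Toy data: one row per detected target per test.
# 'test_id' is a hypothetical identifier, not necessarily a turnr column.
toy <- data.frame(
  test_id    = c(1, 1, 2, 3, 3, 3, 4),
  TargetName = c("a", "b", "a", "b", "c", "d", "negative"),
  stringsAsFactors = FALSE
)

# Number of targets detected in each test, repeated on every row of that test
n_targets <- ave(seq_along(toy$test_id), toy$test_id, FUN = length)

# Relabel every row from a multi-target test, then keep one row per test
toy$TargetName[n_targets > 1] <- "co-detection"
collapsed <- unique(toy)

collapsed
table(collapsed$TargetName)
```

After collapsing, each test contributes exactly one row, so the counts of single detections, co-detections, and negatives add up to the number of tests.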

Calculate test utilization and active instruments by site

This step calculates the number of respiratory panel tests and the number of all tests per day/SiteID/InstrumentVersion, as well as the number of instruments active in the previous 3 months. A separate InstrumentVersion called 'all' is also created, which combines all instrument versions.

TUR_dat <- calc_active_instruments(df2)
head(TUR_dat)

Get pathogen counts

Calculate the number of each pathogen per date, SiteID and InstrumentVersion.

path_dat <- calc_count_by_site_inst(df2)
head(path_dat)
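As a rough illustration of counting pathogens per date, SiteID and InstrumentVersion, the sketch below uses base R on toy data. The column name `date` and the toy values are assumptions; `calc_count_by_site_inst` may use different column names and do additional processing.

```r
# Toy data: one row per test result (values are made up for illustration)
toy <- data.frame(
  date = as.Date(c("2020-01-06", "2020-01-06", "2020-01-06", "2020-01-07")),
  SiteID = c("s1", "s1", "s1", "s2"),
  InstrumentVersion = c("v1", "v1", "v1", "v2"),
  TargetName = c("a", "a", "negative", "a"),
  stringsAsFactors = FALSE
)

# Count rows within each date/SiteID/InstrumentVersion/TargetName combination
toy$n <- 1L
counts <- aggregate(n ~ date + SiteID + InstrumentVersion + TargetName,
                    data = toy, FUN = sum)
counts
```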

TURN

Overall TURN

First, run calc_TURN to get the national TURN along with the mean tests per adjusted active instrument for each InstrumentVersion. These means are needed when calculating regional TURN or TURN for individual pathogens.

l <- calc_TURN(TUR_dat, return_means = TRUE)
# mean of Y for each instrument version
means <- l$means
means
# data frame containing national TURN
TURN_national <- l$df

# calculating TURN by cdc region
cdc_turn <- calc_TURN(TUR_dat,
                        group_vars = dplyr::vars(InstrumentVersion, cdc_region),
                        site_info = site_info,
                        means = means)

Calculate TURN for individual pathogens

Normalize the count of each pathogen.

# calculate TURN for each pathogen -- national
path_TURN_national <- calc_path_TURN(df = path_dat,
               TURN_df = TURN_national,
               means = means)

# calculate TURN for each pathogen -- regional
path_TURN_cdc <-  calc_path_TURN(df = path_dat,
               TURN_df = cdc_turn,
               means = means,
               group_vars = vars(InstrumentVersion, cdc_region),
               site_info = site_info)

Simplify final dataframes

The output from the previous functions contains rows and columns that aren't necessarily important to keep.

head(path_TURN_national)

To simplify the output, use the clean_TURN_output function. For example:

# national overall turn
TURN1 <- clean_TURN_output(TURN_national)
head(TURN1)

# regional
TURN2 <- clean_TURN_output(cdc_turn, extra_cols = "cdc_region")
head(TURN2)

# output contains overall TURN and TURN by pathogen (the 'path_TURN' column)

# national
path_TURN1 <- clean_TURN_output(path_TURN_national, is_path = TRUE)
head(path_TURN1)

# regional
path_TURN2 <- clean_TURN_output(path_TURN_cdc, extra_cols = "cdc_region",
                                is_path = TRUE)
head(path_TURN2)

Plotting national turn

plot(TURN ~ epidate, data = TURN1, type = 'l', main = "TURN of fake RP data")

Plotting TUR (total tests/week) for comparison--this is unsmoothed (no 3-week moving window average taken)

plot(epi_TUR_rp ~ epidate,
     TURN_national[TURN_national$InstrumentVersion == "all", ],
     type = "l")
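For readers unfamiliar with moving window averages, the sketch below shows a 3-week moving average on made-up weekly counts using `stats::filter`. A centered window is assumed here; whether turnr uses a centered or trailing window, and how it treats the end points, is not stated in this vignette.

```r
# Made-up weekly test counts (not from rp_raw)
weekly_tests <- c(10, 12, 20, 18, 15, 30, 28)

# Centered 3-week moving average: each value is the mean of the previous,
# current, and next week. The first and last weeks have no full window,
# so they come back as NA.
smoothed <- as.numeric(stats::filter(weekly_tests, rep(1 / 3, 3), sides = 2))
smoothed
```

Plotting `smoothed` next to `weekly_tests` shows how the averaging damps week-to-week spikes.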

Example for calculating TURN for a different PouchTitle

These functions work with piping (%>%) from dplyr.

GI_TURN <- rp_raw %>% 
  pre_process(target_PouchTitle = "Gastro_Intestinal") %>% 
  calc_active_instruments(target_PouchTitle = "Gastro_Intestinal") %>% 
  calc_TURN() 

GI_TURN_clean <- GI_TURN %>% 
  clean_TURN_output()

plot(TURN ~ epidate, data = GI_TURN_clean, type = 'l', main = "TURN of fake GI data")

Plotting TUR of GI data for comparison--this is unsmoothed (no 3-week moving window average taken)

plot(epi_TUR_rp ~ epidate,
     GI_TURN[GI_TURN$InstrumentVersion == "all", ],
     type = "l")


MartinHoldrege/turnr documentation built on May 16, 2020, 10:39 a.m.