knitr::opts_chunk$set(echo = TRUE, fig.height = 6, fig.width = 8, message = FALSE, warning = FALSE)
In 2017, HIV/AIDS was responsible for the deaths of one million people globally, including 50,000 children less than one year old [@GBD2017paper; @GBD2017web]. Although mathematical modeling has provided important insights into the dynamics of HIV infection during anti-retroviral treatment (ART), there is still a lack of accessible tools for researchers unfamiliar with modeling techniques to apply them to their own datasets.
ushr
is an open-source R package that models the decline of HIV during ART using a popular mathematical framework. The package can be applied to longitudinal data of viral load measurements, and automates all stages of the model fitting process. By mathematically fitting the data, important biological parameters can be estimated, including the lifespans of short and long-lived HIV-infected cells, and the time to suppress viral load below a defined detection threshold. The package also provides visualization and summary tools for fast assessment of model results.
Overall, we hope ushr
will increase accessibility to mathematical modeling techniques so that greater insights on HIV infection and treatment dynamics may be gained.
Sinead E Morris (author and maintainer), Luise Dziobek-Garrett (contributor) and Andrew J Yates (contributor).
Citation information can be found using citation("ushr")
; the package paper is open access and available at BMC Bioinformatics.
Versions 0.2.2 and 0.2.3
Version 0.2.1
Version 0.2.0:
?ushr_triphasic()
)If you encounter any bugs related to this package please contact the package author directly. Additional descriptions of the model and analyses performed by the package can be found in the vignette; details are also be available in Morris et al. (2020) BMC Bioinformatics. Further details on the mathematical theory can also be found in the references cited below.
Please read the package vignette for full details on the mathematical model and its implementation in ushr
, including data processing, model fitting, and parameter estimation.
HIV dynamics in an infected individual can be mathematically described as the production and spread of virus by two groups of infected target cells: so called 'short-lived' infected cells that die at a fast rate (such as CD4 T cells), and other 'long-lived' infected cells that die more slowly (Fig A).
{width=650px}
Once ART has begun, the decline of HIV viral load, $V$, can be modeled using the following expression
$V(t)~ =$ A $\exp(-\delta ~t) ~+$ B $\exp(- \gamma~ t),$
where $\delta$ and $\gamma$ are the death rates of short and long-lived infected cells, respectively [@perelson1996hiv; @perelson1997a; @nowak2000book; @Shet2016]. This equation is referred to as the biphasic model: viral decay is fast initially, reflecting the loss of short-lived infected cells (at rate $\delta$), but then enters a slower decline phase reflecting the loss of long-lived infected cells (at rate $\gamma$) (Fig B). Eventually, viral load is suppressed below the detection threshold of the experiment (dashed line, Fig B). Note that for patient data exhibiting only one decline phase (for example, due to sparse or delayed viral load measurements), one can use a single phase version of the biphasic model given by
$V(t) =$ B̂ $\exp(-$γ̂ $t)$,
where decay could reflect the fast or the slow phase of virus suppression.
By fitting the model, we can estimate the death rate parameters and use these to calculate the lifespans of infected cells: $1/\delta$ and $1/\gamma$ for short and long-lived infected cells from the biphasic model, and 1/γ̂ for the single phase model. We can also estimate the time taken to suppress viral load ('time to suppression' (TTS)) by calculating the first time at which $V(t) = x$, where $x$ is a user-defined suppression threshold, and $V(t)$ is given by either the biphasic or single phase equation.
To install the package from CRAN use
install.packages("ushr")
The vignette can be viewed here. To install the package from Github, first install and load devtools
, then install ushr
as follows
install.packages("devtools") library("devtools") install_github("SineadMorris/ushr")
Note that to install the package from Github with its vignette, you must first install knitr
, then install ushr
with build_vignettes = TRUE
. The package vignette can then be viewed using browseVignettes()
.
install.packages("knitr") install_github("SineadMorris/ushr", build_vignettes = TRUE) browseVignettes(package = "ushr")
To illustrate basic usage of the package, we include a publicly available data set from the ACTG315 clinical trial. The raw data (actg315raw
) consist of longitudinal HIV viral load measurements from 46 chronically-infected adults up to 28 weeks following ART initiation. The detection threshold was 100 copies/ml and observations are recorded as $\log_{10}$ RNA copies/ml. These data are available online (originally accessed 2019-09-15) and have been described previously [@Lederman1998; @wu1999biometrics; @Connick2000].
To begin, we load the package and print the first six rows of the raw data to identify our columns of interest; these are the viral load observations ('log.10.RNA.'), the timing of these observations ('Day'), and the identifier for each subject ('Patid').
library(ushr) print(head(actg315raw))
Since ushr
requires absolute viral load (VL) measurements, and specific column names ('vl', 'time', 'id'), we first back-transform the $\log_{10}$ viral load measurements into absolute values, and rename the column headings.
actg315 <- actg315raw %>% mutate(vl = 10^log10.RNA.) %>% select(id = Patid, time = Day, vl) print(head(actg315))
We can then visualize these data using the plot_data()
function. The detection_threshold
argument defines the detection threshold of the measurement assay.
plot_data(actg315, detection_threshold = 100)
Each panel represents a different individual and the dashed horizontal line is the assay detection threshold. We can see that the data is noisy, individuals have different numbers of observations, and only a subset suppress viral load below the detection threshold.
To fit the model to these data in just one line of code we use the ushr()
function. This processes the data to filter out any individuals who do not suppression viral load, or who violate other inclusion criteria (described in the vignette), and then fits the model to each remaining trajectory. Note that only subjects with a minimum number of measurements above the detection threshold can reliably be fit. The number can be specified by the user, but we recommend at least six observations for the biphasic model and three for the single phase model. Note the censor_value
argument specifies how measurements below the detection threshold should be treated (here we set them to half of the detection theshold in line with previous work [@wu1999characterization]).
model_output <- ushr(data = actg315, detection_threshold = 100, censor_value = 50)
With the fitted model output, we can then plot both the biphasic and single phase fits as follows
plot_model(model_output, type = "biphasic", detection_threshold = 100)
plot_model(model_output, type = "single", detection_threshold = 100)
The solid lines are the best-fit model for each subject. Twelve were successfully fit with the biphasic model, and four with the single phase model. Although some single phase subjects had sufficient data to fit the biphasic model (i.e. at least six observations), the resulting 95\% parameter confidence intervals were either unattainable or sufficiently wide to indicate an unreliable fit. As a result, the subjects were automatically re-fit with the single phase model.
We can also summarize the fitting procedure and parameter estimates using summarize_model()
. This creates a list with the following elements: (i) a summary of which subjects were successfully fit, with the corresponding infected cell lifespan estimates (summary
); (ii) summary statistics for the biphasic model parameter estimates (biphasicstats
); and (iii) summary statistics for the single phase parameter estimates (singlestats
).
actg315_summary <- summarize_model(model_output, data = actg315, stats = TRUE) head(actg315_summary$summary) actg315_summary$biphasicstats actg315_summary$singlestats
For a better understanding of parameter identifiability, one can also print the parameter estimates for each individual and model, along with their corresponding 95\% confidence intervals.
head(model_output$biphasicCI) head(model_output$singleCI)
To calculate the time to viral suppression (TTS) we use the fitted model output and the get_TTS()
function (see the vignette for more details). Here we set the suppression threshold to be the same as the detection threshold (i.e. we want to know when viral load drops below the detection threshold of the assay). We can subsequently obtain median and SD statistics, and the total number of subjects included in the analysis, using the summarize()
function from dplyr
.
TTS <- get_TTS(model_output = model_output, suppression_threshold = 100) head(TTS) TTS %>% summarize(median = median(TTS), SD = sd(TTS), N = n())
We can also plot the distribution of estimates using plot_TTS()
.
plot_TTS(TTS, bins = 6, textsize = 7)
ushr
provides additional functionality to the examples documented here. For example, noisy clinical data can be simulated from an underlying biphasic model using the simulate_data()
function. We also provide an alternative, non-parametric method for estimating TTS that does not require prior model fitting. Further details of all functions and user-specific customizations can be found in the documentation.
ushr
provides additional functionality to the examples documented here. Notable examples are:
ushr_triphasic()
(see ?ushr_triphasic()
); this may be more appropriate than the biphasic model [@Cardozo2017]. Results can be visualized using the same plotting/summary functions as above.simulate_data()
function. Further details of all functions and user-specific customizations can be found in the documentation.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.