txvis - Data Visualization for Treatment Patterns
The txvis package for R is intended to let users generate high-quality, customizable figures for both initial exploratory data analysis and presentation. The package uses a txvis data object composed of both treatment sequence data and treatment-related events. It provides print and summary methods for the txvis object, and includes a number of options for plotting.
Many packages are now available from GitHub. Currently txvis is not available through the Comprehensive R Archive Network (CRAN), so the installation procedure differs somewhat from that of other common packages. To install txvis you must use the devtools package:
library(devtools)
devtools::install_github('johnwilsonICON/txvis')
The package itself comes with a data object, treat, that provides the opportunity to test the package. treat is intended to be a sample of treatment data a client might provide, and does not reflect the required data format within the package. Data is formatted by the function create_txvis.
Most treatment data includes a patient identifier, a treatment name, and a start and end date for each treatment.
This is the minimum information supported by the package. The data model can be extended to include one or more treatment classes, or hierarchies for the treatment types. These are user defined, so, for example, the hierarchy may be: warfarin -> anticoagulant
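A user-defined hierarchy of this kind can be thought of as a simple lookup from treatment to class. The sketch below uses base R only. The `tx_class` vector, the treatment names other than warfarin, and the column names are illustrative assumptions and are not part of the txvis API.

```r
# Hypothetical lookup table mapping each treatment to a user-defined class.
tx_class <- c(warfarin = "anticoagulant",
              heparin  = "anticoagulant",
              aspirin  = "antiplatelet")

# Made-up treatment records for three patients.
tx <- data.frame(pat_id    = c(1, 1, 2, 3),
                 treatment = c("warfarin", "aspirin", "heparin", "warfarin"),
                 stringsAsFactors = FALSE)

# Attach the class by indexing the lookup with the treatment name.
tx$class <- unname(tx_class[tx$treatment])

print(tx)
```

Keeping the hierarchy in a separate lookup means the raw treatment records stay untouched and the classes can be revised without re-encoding the data.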
The plotting functions in txvis expect a txvis object. The txvis object is a list with two elements, a treatment object and an event object. Each of these is optional, but most plotting cannot occur without treatment data.
library(txvis)
head(treat)
In the supplied data, a set of uniquely encoded patients have up to 8 unique treatments. The treat dates are text strings, but represent date/time values. This data is meant to represent data that might be obtained from a clinical trial, which could require cleaning prior to analysis.
head(events)
For individual patients a number of events have also been encoded. These are either point events or events that occur over multiple days.
The txvis package makes use of a number of other visualization packages, but is intended to wrap the pre-processing. Doing this requires a special txvis object. We simplify the process of generating the object using the create_txvis function:
# Using the existing data. Event data is optional, but treatment data is required.
# Here the date encoding for the start and end date is slightly peculiar:
head(treat$start)
# So we use the `create_txvis` argument `date_format` to ensure that the formatting is correct:
hlth_data <- create_txvis(patient        = treat$pat_id,
                          treatment      = treat$treatment,
                          start          = treat$start,
                          end            = treat$end,
                          date_format    = "%d%B%Y",
                          ev_patient     = events$pat_id,
                          events         = events$event,
                          event_date     = events$start,
                          event_end_date = events$end)
The package was designed to accept multiple date formats, since researchers encode and manage dates in many different ways. The encoding follows the format specification documented in strptime, and as such can easily accommodate many standard and non-standard formats.
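To illustrate how strptime-style format strings work, the following base R sketch parses a few made-up date strings. The values are invented for the example and are not taken from treat. Matching of month names depends on the current locale, and an English locale is assumed here.

```r
# Illustrative date strings; these are made-up values, not taken from `treat`.
raw_dates <- c("01January2010", "15March2011")

# %d = day of month, %B = full month name, %Y = four-digit year.
# Month-name matching depends on the current locale (English assumed here).
parsed <- as.Date(raw_dates, format = "%d%B%Y")

# The same dates encoded numerically parse with a different format string.
parsed_iso <- as.Date(c("2010/01/01", "2011/03/15"), format = "%Y/%m/%d")

# Once parsed, dates support arithmetic, e.g. the number of days between them.
days_between <- as.numeric(diff(parsed_iso))

print(parsed)
print(days_between)
```

If a format string does not match the input, as.Date returns NA rather than raising an error, so it is worth checking for missing values after conversion.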
With the newly created txvis object it becomes possible to produce any of the plots in the package. The package also provides print and summary methods that give an overview of the data structure and contents:
print(hlth_data)
shows the head and tail of the treatment data, and provides a brief summary.
The summary
method:
summary(hlth_data)
provides more extensive insight into the object data and individual treatment sequences. Ultimately the txvis object is a list, assigned the class c('txvis', 'list'). Each of the events and treat objects is simply a data.frame, and as such can be manipulated with any of R's tools for manipulating data.
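Because the object is an ordinary list of data frames, standard tools apply directly. The following is a hand-built stand-in, not the output of create_txvis. The element names (`treat`, `events`) and the column names are assumptions made for illustration, so inspect a real object with str() before relying on them.

```r
# Hand-built stand-in for a txvis-like object (names are assumed, see above).
mock <- structure(
  list(
    treat  = data.frame(pat_id    = c(1, 1, 2),
                        treatment = c("Tx1", "Tx3", "Tx1"),
                        stringsAsFactors = FALSE),
    events = data.frame(pat_id = c(1, 2),
                        event  = c("visit", "visit"),
                        stringsAsFactors = FALSE)
  ),
  class = c("txvis", "list")
)

# List-style access still works because the object is a list underneath.
tx1_only <- subset(mock$treat, treatment == "Tx1")

# Count treatment records per patient with base R.
records_per_patient <- table(mock$treat$pat_id)

print(inherits(mock, "txvis"))
print(nrow(tx1_only))
```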
The alluvial plot wraps the alluvial function from the alluvial package, which is available from GitHub. The first time the function is run it will prompt you to download the package if you haven't already.
# Given that data has already been loaded into `hlth_data`
tx_alluvial(hlth_data)
It may be the case that we are interested in seeing how patient data sequences change over time. Using the ggplot2 package, we can show treatment data over time using tx_indiv:
tx_indiv(hlth_data, events = FALSE)
Here we see that the data is plotted out, with ten patients sampled randomly from the set of patients. We can sample more patients and add the set of events to the mapping:
tx_indiv(hlth_data, nsample = 50, events = TRUE)
Or we can align the plots so that they all appear to start at time 0, and provide some extra customization. Note that to provide extra customization we need to load the ggplot2 package.
library(ggplot2)
tx_indiv(hlth_data, nsample = 50, events = TRUE, align = TRUE) +
  scale_x_continuous(expand = c(0,0)) +
  xlab("Days Since Treatment Start") +
  ylab("Patient") +
  theme_bw() +
  theme(axis.text.y = element_blank())
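Aligning means measuring each patient's time from their own first treatment rather than by calendar date. The sketch below shows one way to compute such offsets in base R with made-up data. It is an assumption about the idea behind align = TRUE, not the package's actual implementation, and the column names are hypothetical.

```r
# Made-up treatment intervals for two patients.
tx <- data.frame(
  pat_id = c(1, 1, 2),
  start  = as.Date(c("2010-01-10", "2010-02-01", "2011-06-05")),
  end    = as.Date(c("2010-01-31", "2010-03-01", "2011-07-05"))
)

# Earliest start date per patient, repeated for each of that patient's rows.
first_start <- ave(as.numeric(tx$start), tx$pat_id, FUN = min)

# Days since each patient's own first treatment start.
tx$start_day <- as.numeric(tx$start) - first_start
tx$end_day   <- as.numeric(tx$end) - first_start

print(tx)
```

After this step every patient's first treatment begins at day 0, which is what makes sequences of different calendar periods directly comparable on one axis.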
With each method it is possible to visualize treatment outcomes, and emphasise particular aspects of the treatment sequencing. For example, transitions from one treatment to another can be examined using tx_transmat, a visualization for transition matrices. The method examines treatment sequences rather than dates or date ranges. The sequences of interest can be passed using the sequences parameter. By default the first two sequences are examined, although any two sequences may be chosen.
tx_transmat(hlth_data, sequences = c(1,2))
Here we can easily see that the most frequent treatment shift is from Treatment Tx1 to Tx3. Treatments Tx4 - 8 are infrequently administered in the early phase of the treatment sequence. We could arrange multiple sequence comparisons together:
# Show transition matrices for the first four sequences.
par(mfrow = c(2,2))
tx_transmat(hlth_data, sequences = c(1,2))
tx_transmat(hlth_data, sequences = c(2,3))
tx_transmat(hlth_data, sequences = c(3,4))
We can see now that the main sequences of treatments do vary, but that in general Tx1-3 dominate the treatment regime.
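A transition matrix of this kind is a cross-tabulation of the treatment at one sequence position against the treatment at the next. A minimal base R sketch with made-up sequences follows. It illustrates the concept only and does not reproduce how tx_transmat computes or draws its output.

```r
# Made-up data: each row is one patient's first and second treatment.
seqs <- data.frame(
  first  = c("Tx1", "Tx1", "Tx2", "Tx1", "Tx3"),
  second = c("Tx3", "Tx3", "Tx1", "Tx2", "Tx1"),
  stringsAsFactors = FALSE
)

# Use a shared set of levels so the matrix is square.
lv <- sort(unique(c(seqs$first, seqs$second)))

# Rows are the first treatment, columns the second; cells count patients.
trans <- table(from = factor(seqs$first, levels = lv),
               to   = factor(seqs$second, levels = lv))

# Row-wise proportions: share of patients moving from each treatment.
trans_prop <- prop.table(trans, margin = 1)

print(trans)
```

Using shared factor levels keeps treatments that never appear in one position as all-zero rows or columns, so matrices for different sequence pairs remain comparable.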
Recent work porting the d3.js JavaScript library to R has allowed us to leverage various plotting devices that are HTML-ready and interactive. These include sunburst plots:
tx_sunburst(hlth_data)
and d3 style alluvial plots:
tx_d3alluvial(hlth_data)