In johnschwenck/bp: Blood Pressure Analysis in R

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

library(bp)

Introduction

Despite the tremendous progress in the medical field, cardiovascular disease (CVD) remains the leading cause of death worldwide. Hypertension, specifically, affects over 1.1 billion people annually according to the American Heart Association [9]. This package serves to visualize and quantify various aspects of hypertension in a more digestible format using various metrics proposed in the literature.

Blood pressure data can be analyzed at varying degrees of granularity depending on the reading frequency, presence of a sleep indicator, and whether or not the temporal structure is accounted for. These factors almost always depend on the type of device used, where ABPM monitors are predominantly used for the short term (within 24 hours) and home monitoring devices or office readings are used for measuring variability over the medium and long term (day-to-day, visit-to-visit, etc) [11]. Unlike continuous heart rate monitors or continuous glucose monitors, there are currently no commercially-available continuous blood pressure monitors available for the middle to long term, posing a unique challenge for research.

Of primary concern to researchers is the ability to accurately quantify blood pressure variability (BPV). BPV has been shown to be an important factor in predicting cardiovascular events and sudden death, especially during susceptible periods such as the first two hours of waking up [10]. There have been many proposed methods for characterizing this variability; this package seeks to incorporate as many of these metrics as possible.

We introduce the first comprehensive open-source R package, bp, that both analyzes and visualizes blood pressure data. In an effort to help clinicians make sense of their patients' data without requiring multiple software platforms for data processing, bp uses only a minimal amount of code to do so and offers additional capabilities beyond the traditional proprietary counterparts. At the time of writing, to the best of the authors' knowledge, there are currently no other available software packages through the Comprehensive R Archive Network (CRAN) dedicated to blood pressure analysis.

In this paper, we demonstrate the main functionality of the bp package by exploring and analyzing both a single-subject pilot study of [ 8 ] and a multi-subject study, HYPNOS [ 11 ], to illustrate the differences between dataset structures and elaborate how to adjust the settings within the R package to accommodate either.

Blood Pressure Monitoring Overview

Blood pressure monitoring devices work by measuring the pressure of the artery’s restricted blood flow; for digital devices, the vibrations are translated into electrical signals. Unlike home monitors that only take readings upon the subject's initiation, ambulatory blood pressure monitoring (ABPM) devices take automatic readings at pre-specified intervals over a 24-hour period or longer.

ABPM allows medical professionals to analyze blood pressure during sleep which has been shown to be a more accurate predictor of cardiovascular events than daytime blood pressure. ABPM also allows researchers to discern true hypertension from "whitecoat" hypertension in an office or laboratory setting. Because of the burden of assembling the device (and because of the lack of a commercial-grade alternative), ABPM measurements are intended for the short-term of 24-hours to a few days.

Home monitoring on the other hand, offers individuals the ability to record their blood pressure at will and can be tracked easily using mobile apps over the long-term of weeks, months, or years. However, because the user has to initiate the recording, readings cannot be taken during sleep.

As the nature of the two devices inhibits certain functionality depending on which device is used, we outline how to effectively analyze data for both types of devices.

According to the American Heart Association, there are currently 6 blood pressure stages that correspond to the readings from the monitoring devices: Low (Hypotension), Normal, Elevated, Stage 1 Hypertension, Stage 2 Hypertension, Hypertensive Crisis. Below is a table outlining the categories according to their definitions. Note that because of the ambiguity between Normal, Elevated, and Stage 1 diastolic blood pressure readings (because of the similar thresholds), this package splits the difference and sets a default threshold for Elevated DBP from 80 - 85 and Stage 1 Hypertension from 85 - 90. These thresholds can be adjusted by the user where applicable.

Blood Pressure Category | Systolic (mmHg) | | Diastolic (mmHg) ------------------------|-----------------|--------|------------------ Low (Hypotension) | Less than 100 | and | Less than 60 Normal | 100 - 120 | and | 60 - 80 Elevated | 120 - 129 | and | 60 - 80 Stage 1 Hypertension | 130 - 139 | or | 80 - 89 Stage 2 Hypertension | 140 - 180 | or | 90 - 120 Hypertensive Crisis | Higher than 180 | and/or | Higher than 120

Recently, an adaption of these guidelines by (Lee et al 2020) provides a two-to-one mapping of SBP and DBP in order to avoid the instances where there is an unusually high SBP value and low DBP value, or vice versa. These new guidelines are as follows:

BP Stage | Systolic (mmHg) | | Diastolic (mmHg) ------------------------|------------------|-------|----------------- Low (optional) | <100 | and | <60 Normal | <120 | and | <80 Elevated | 120 - 129 | and | <80 Stage 1 - All | 130 - 139 | and | 80 - 89 Stage 1 - ISH (ISH - S1)| 130 - 139 | and | <80 Stage 1 - IDH (IDH - S1)| <130 | and | 80 - 89 Stage 2 - All | >140 | and | >90 Stage 2 - ISH (ISH - S2)| >140 | and | <90 Stage 2 - IDH (IDH - S2)| <140 | and | >90 Crisis (optional) | >180 | or | >120

The `bp` Package

The general workflow of the bp package consists of 1) a data processing stage and 2) an analysis stage, in ideally as little as two lines of code. The processing stage formats the user's supplied input data in such a way that it adheres to the rest of the bp functions. The analysis stage uses the processed data to quantify various attributes of the blood pressure relationships or to provide various visualizations. One of the key abilities of the bp package is bp_report function which generates a report that combines such visualization plots into one easily digestible summary for clinicians or researchers to interpret an individual's (or multiple individuals') blood pressure results. We will walk through each of these stages in the subsequent sections.

Data Processing with the `process_data` function

Before any analysis can be done, the user-supplied data set must be first processed into the proper format using process_data to adhere to package data structure requirements and naming conventions. This function ensures that user-supplied data columns aren't double counted or missed, since blood pressure data are often inconsistent and come from a wide variety of formats. While a tedious initial step, it will save time in the long-run as the resulting processed data will not require any future specification, which can then be directly plugged into the analysis functions. It is worth noting that if the user-supplied data set already adheres to the column naming conventions and data types, then the process_data function will be unnecessary. However, it is good practice to still make use of this function as a sanity check to verify all available variables.

The basic workflow is to load in the user-supplied unprocessed raw data, process it with the process_data function and save to a new dataframe. Note that the capitalization does not matter when specifying the columns.

## Load the sample bp_hypnos
## In this scenario, the bp_hypnos acts as the "user-supplied" data that is to be processed
data("bp_hypnos")

## Assign the output of the process_data function to a new dataframe object
hypnos_proc <- process_data(bp_hypnos,
                     bp_type = "ABPM",
                     sbp = 'syst',
                     dbp = 'diast',
                     date_time = 'date.time',
                     id = 'id',
                     visit = 'visit',
                     hr = 'hr',
                     wake = 'wake',
                     pp = 'pp',
                     map = 'map',
                     rpp = 'rpp')

Notice how the column names of the original bp_hypnos changed in the processed data. Notably, SYST became SBP, DIAST became DBP, and DATE.TIME became DATE_TIME.

names(bp_hypnos)
names(hypnos_proc)

While the results seem to be trivial at first glance, let's see what happens when we use a much different data set with a completely different naming convention: the bp_jhs data set. Unlike bp_hypnos which has all of the available columns needed in the process_data function with multiple subjects, bp_jhs is a single-subject data set without many of the multi-subject identifiers such as ID, VISIT, or WAKE (as it is non-ABPM data). Further, there is no MAP or PP column, but these (as we will see) can be automatically created.

## Load the sample bp_jhs data set
## As before, this is what will be referred to as the "user-supplied" data set
data("bp_jhs")

## Assign the output of the process_data function to a new dataframe object
jhs_proc <- process_data(bp_jhs,
                     sbp = 'sys.mmhg.',
                     dbp = 'dias.mmhg.',
                     date_time = 'datetime',
                     hr = 'PULSE.BPM.')
head(jhs_proc, 5)

After a quick inspection of the original bp_jhs data set and the newly processed data data set, it should be evident that there was a lot going on "under the hood" of the process_data function. As we can see from the column names, the awkward nuisance of typing sys.mmhg., dias.mmhg., pulse.bpm., and datetime have now been replaced with the more concise SBP, DBP, HR, and DATE_TIME names, respectively. Additionally, MAP, PP, RPP, SBP_Category, and DBP_Category were all calculated as additional columns which previously did not exist in the data. Additionally, if the supplied data has a column corresponding to a "date/time" format, the columns Time_of_Day and DAY_OF_WEEK will also be created for ease.

names(bp_jhs)
names(jhs_proc)

NOTE: For consistency, process_data will coerce all column names to upper-case.

Blood Pressure Metrics

After the data has been processed, we can now utilize the built-in metrics from the literature to characterize the blood pressure variability. To start, the following metrics are what is currently offered through the bp package:

Function | Metric Name | Source ----------- | --------------------------------------- | ---------- bp_arv | Average Real Variability | Mena et al (2005) bp_center | Mean and Median | Amaro Lijarcio et al (2006) bp_cv | Coefficient of Variation | Munter et al (2011) bp_mag | Blood Pressure Magnitude (peak and trough) | Munter et al (2011) bp_range | Blood Pressure Range | Levitan et al (2013) bp_sv | Successive Variation | Munter et al (2011) bp_sleep_metrics | Blood Pressure Sleep Metrics | (Multiple - see documentation) bp_stages | Blood Pressure Stages Classification | American Heart Association bp_stats | Aggregation of statistical summaries | N/A dip_calc | Nocturnal Dipping \% and Classification | Okhubo et al (1997)

Time-Dependent Dispersion Metrics

bp_arv - Average Real Variability
A measure of dispersion using the sum of absolute differences in successive observations
bp_sv - Successive Variation
A measure of dispersion using the sum of squared differences in successive observations

Time-Independent Dispersion Metrics

bp_mag - Blood Pressure Magnitude (peak and trough)
Peak measures the distance from the average value to the maximum value
Trough measures the distance from the minimum value to the average value
bp_range - Blood Pressure Range
Range measures the distance from the minimum value to the maximum value
bp_cv - Coefficient of Variation
Coefficient of Variation is a ratio of the standard deviation / average

Sleep-dependent Metrics

dip_calc - Nocturnal Dipping \% and Classification
Nocturnal dipping percentage is the \% drop in blood pressure while asleep compared with awake. Requires an indication of when a subject is asleep to know how to calculate: 1 - (avg sleep BP / avg daytime BP). The severity of the dipping percentage indicates the corresponding classification of that individual (dipper, non-dipper, reverse dipper).
bp_sleep_metrics - Sleep-specific metrics for BP
This function includes the following sleep metrics: dipping calculation, nocturnal fall, morning blood pressure surge (MBPS) calculated both sleep-trough and pre-wake variants, morningness-eveningness average and difference, and weighted standard deviation (wSD)

Let's say we are working with the bp_hypnos and would like to compare the time-dependent nature of the bp_arv with the bp_sv for each subject.

head(bp_arv(hypnos_proc, bp_type = 'both'))
head(bp_sv(hypnos_proc, bp_type = 'both'))

Comparing vertically can be challenging, so with some help from dplyr we can obtain the following:

head(dplyr::left_join(bp_arv(hypnos_proc, bp_type = 'both'), bp_cv(hypnos_proc, bp_type = 'both')))

Note that this is possible thanks to the work we did in standardizing column names from the processing step.

Because the hypnos_proc data contains BP readings during sleep, sleep-specific metrics through the bp_sleep_metrics function can be used to garner further insight:

bp_sleep_metrics(hypnos_proc)

Turning back to the bp_jhs data set, let's examine the peaks and troughs of the BP readings. We would call the bp_mag function on our data.

head(bp_mag(jhs_proc, bp_type = 'both'))

Here, we notice something different. Because there weren't ID, VISIT, or WAKE columns, the bp_mag aggregated everything together. This is technically correct, but say we wanted to glean more information by breaking our data down by DATE instead; we would need to include the inc_date = TRUE optional argument to the function.

tail(bp_mag(jhs_proc, inc_date = TRUE, bp_type = 'both'))

Interpretation: While it may not seem obvious at first glance, the blood pressure magnitude (whether a peak or a trough) is calculated as $peak = max(BP) - mean(BP)$ and $trough = mean(BP) - min(BP)$ where BP could correspond to either SBP or DBP. If we manually inspect the data from 2019-07-31 we see that N = 2 measurements and within the bp_jhs data set we see the two measurements are 126 and 128 for SBP and 77, and 76 for DBP. $\bar{x}{SBP} = \frac{(126+128)}{2} = 127$ and $\bar{x}{DBP} = \frac{(78+76)}{2} = 76.5$ so the respective peak and trough values from our output make sense.

Visualization

So far, we have processed the original data and ran a couple metrics to get a clearer picture of the variability, now let's visualize it. Though the processed data can easily be incorporated into other visualization packages or code (such as ggplot which we will demonstrate in the first example below with bp_mag), the following visuals are currently included with the bp package:

Function | Visual
---------------- | ---------------------------------- bp_scatter | Scatter plot of BP stages (American Heart Association) bp_ts_plots | Time series plots bp_hist | Histograms of BP stages dip_class_plot | Dipping % category plot bp_report | Exportable report of BP summary dow_tod_plots | Day of Week / Time of Day chart

The bp_hist returns three histograms of all readings corresponding to total number within each stage, frequency of SBP readings, and frequency of DBP readings. Furthermore, it breaks the data down by color according to which blood pressure stage it falls under:

bp_hist(jhs_proc)

Let's now suppose that we wish to break down our readings by Time of Day and Day of Week. For this, we can implement the dow_tod_plots function. Because this function is mainly used as a helper function for the bp_report function, we need to add a couple steps using the gridExtra package

bptable_ex <- dow_tod_plots(jhs_proc)
#gridExtra::grid.arrange(bptable_ex[[1]], bptable_ex[[2]], bptable_ex[[3]], bptable_ex[[4]], nrow = 2)

As a final step before returning to our other example, let's compile everything that we have done so far into a more compact and digestible report that visualizes everything simultaneously. To do so, we will rely on the bp_report function, which will generate a report in PDF (although other formats such as PNG are available):

bp_report(jhs_proc, save_report = 0)

img <- png::readPNG("vignette_report_1_subj.png")
grid::grid.raster(img)

Suppose we wanted to look at the observations for the bp_jhs data set over time. The time series plotting capabilities allow this to be visualized. The output returns two plots, one for the total duration through time, and another showing repeated measurements within a 24 hour period (by hour). In the event that there are more than one series of recordings (such as the bp_hypnos data set where there are two visits per subject), the wrap_var and group_var function arguments can be employed.

bp_ts_plots(jhs_proc)

Suppose now that we turn our attention back to the bp_hypnos example, suppose now we wish to examine sleep patterns. Using the dipping category plot we can analyze BP behavior during sleep

dip_class_plot(hypnos_proc)

Future Directions

As our understanding of cardiovascular disease continues to grow, this package will remain ongoing project. As such, collaboration is highly encouraged. Corrections to existing metrics, extensions or new method proposals and visualizations, and code optimization are all welcome.

In the short term, the following new features are to be incorporated with the next release of the package:

Extensions to the bp_report function to include more visuals on another page, and include the ability to break down visuals by individuals (if data includes multiple subjects)
Shiny App for visualization and PDF export
Package website
Additional metrics: morning bp surge power, SDIM

References

Mancia, G., Di Rienzo, M., & Parati, G. (1993). Ambulatory blood pressure monitoring use in hypertension research and clinical practice. Hypertension, 21(4), 510-524.
Levitan, E., Kaciroti, N., Oparil, S. et al. Relationships between metrics of visit-to-visit variability of blood pressure. J Hum Hypertens 27, 589–593 (2013). doi: 10.1038/jhh.2013.19
Muntner, Paula,b; Joyce, Carac; Levitan, Emily B.a; Holt, Elizabethd; Shimbo, Daichie; Webber, Larry S.c; Oparil, Suzanneb; Re, Richardf; Krousel-Wood, Maried,g Reproducibility of visit-to-visit variability of blood pressure measured as part of routine clinical care, Journal of Hypertension: December 2011 - Volume 29 - Issue 12 - p 2332-2338 doi: 10.1097/HJH.0b013e32834cf213
O’Brien E, Sheridan J, O’Malley K . Dippers and non-dippers. Lancet 1988; 2: 397.
Ohkubo T, Imai Y, Tsuji I, Nagai K, Watanabe N, Minami N, Kato J, Kikuchi N, Nishiyama A, Aihara A, Sekino M, Satoh H, Hisamichi S. Relation between nocturnal decline in blood pressure and mortality. The Ohasama Study. Am J Hypertens. 1997 Nov;10(11):1201-7. doi: 10.1016/s0895-7061(97)00274-4. PMID: 9397237.
Mena L, Pintos S, Queipo NV, Aizpúrua JA, Maestre G, Sulbarán T. A reliable index for the prognostic significance of blood pressure variability. J Hypertens. 2005 Mar;23(3):505-11. doi: 10.1097/01.hjh.0000160205.81652.5a. PMID: 15716690.
Holt-Lunstad J, Jones BQ, Birmingham W. The influence of close relationships on nocturnal blood pressure dipping. Int J Psychophysiol. 2009 Mar;71(3):211-7. doi: 10.1016/j.ijpsycho.2008.09.008. Epub 2008 Oct 5. PMID: 18930771.
Schwenck J. Riding for Research: A 5,775-mile Cycling Journey Across North America. Harvard Dataverse https://dataverse.harvard.edu/dataverse/r4r
Webb S. AHA 2019 Heart Disease and Stroke Statistics. American College of Cardiology. https://www.acc.org/latest-in-cardiology/ten-points-to-remember/2019/02/15/14/39/aha-2019-heart-disease-and-stroke-statistics
Bilo G, Grillo A, Guida V, Parati G. Morning blood pressure surge: pathophysiology, clinical relevance and therapeutic aspects. Integr Blood Press Control. 2018;11:47-56. Published 2018 May 24. https://doi.org/10.2147/IBPC.S130277
Irina Gaynanova, Naresh Punjabi, Ciprian Crainiceanu, Modeling continuous glucose monitoring (CGM) data during sleep, Biostatistics, 2020. https://doi.org/10.1093/biostatistics/kxaa023