vignette-spinners/bp-distributions_ARTICLE.R

#'---
#'title: "Pediatric Blood Pressure Distributions"
#'output:
#'  rmarkdown::html_vignette:
#'    toc: true
#'    number_sections: true
#'bibliography: references.bib
#'resource_files:
#'  - '../man/figures/flowchart.png'
#'vignette: >
#'  %\VignetteIndexEntry{Pediatric Blood Pressure Distributions}
#'  %\VignetteEngine{knitr::rmarkdown}
#'  %\VignetteEncoding{UTF-8}
#'---
#'
#+ label = "setup", include = FALSE
#/*
devtools::load_all() # load the dev version while editing
#*/
################################################################################
#                        !!! DO NOT EDIT .Rmd files !!!                        #
#                                                                              #
# .Rmd files are generated by their corresponding .R files found in the        #
# vignette-spinners/ directory.  Any changes needed to the .Rmd file need to   #
# be made in the .R file                                                       #
################################################################################
knitr::opts_chunk$set(collapse = TRUE, fig.align = "center")
library(qwraps2)

#'
#'
#+ label = "helper functions", include = FALSE
percentile_factor <- function(p) {
  factor(p, levels = sort(unique(p)), labels = paste0(sort(unique(p)) * 100, "th"))
}

#'
#'
library(pedbp)

#'
#' # Introduction
#'
#' Part of the work of @martin2022machine required transforming blood
#' pressure measurement into percentiles based on published norms.  This
#' work was complicated by the fact that data for pediatric blood pressure
#' percentiles is sparse and generally only applicable to children at least one
#' year of age and requires height, a commonly unavailable data point in
#' electronic health records for a variety of reasons.
#'
#' A solution to building pediatric blood pressure percentiles was developed and
#' is presented here for others to use.  Inputs for the developed method are:
#'
#' 1. Patient sex (male/female) _required_
#' 2. Systolic blood pressure (mmHg) _required_
#' 3. Diastolic blood pressure (mmHg) _required_
#' 4. Patient height (cm) _if known_.
#'
#' Given the inputs, the following logic is used to determine which data sets
#' will be used to inform the blood pressure percentiles.  Under one year of
#' age, the data from @gemelli1990longitudinal will be used; a height input is
#' not required for this patient subset. For those at least one year of age with
#' a known height, data from @nhlbi2011expert (hereafter referred to as
#' 'NHLBI/CDC' as the report incorporates recommendations and inputs from the
#' National Heart, Lung, and Blood Institute [NHLBI] and the Centers for Disease
#' Control and Prevention [CDC]). If height is unknown and age is at least three
#' years, then data from @lo2013prehypertension is used.  Lastly, for children
#' between one and three years of age with unknown height, blood pressure
#' percentiles are estimated by the NHLBI/CDC data using as a default the median
#' height for each patient's sex and age.
#'
#' <img src = "../man/figures/flowchart.png"/>
#'
#' # Estimating Pediatric Blood Pressure Distributions
#'
#' There are two functions provided for working with blood pressure
#' distributions.  These methods use Gaussian distributions for both systolic
#' and diastolic blood pressures with means and standard deviations either
#' explicitly provided in an aforementioned source or derived by optimizing the parameters
#' such that the sum of squared errors between the provided quantiles from an
#' aforementioned source and the distribution quantiles is minimized.  The
#' provided functions, a distribution function and a quantile function, follow a
#' similar naming convention to the distribution functions found in the stats
#' library in R.
#'
#+ Distribution Function
args(p_bp)

# Quantile Function
args(q_bp)

#'
#'
#' Both methods expect an age in months and an indicator for sex.
{{qwraps2::backtick(height) %s% ", "}}
#' in centimeters, is used preferentially over
{{qwraps2::backtick(height_percentile) %s% "."}}
#' The
{{qwraps2::backtick(default_height_percentile) %s% "."}}
#' is set to 50 by default to match the flowchart above, but can be adjusted
#' here to meet the end users needs.
#'
#' The reference look up tables for the @nhlbi2011expert and @flynn2017clinical
#' require height percentiles.
#' If
{{qwraps2::backtick(height)}}
#' is entered, then the height percentile is determined via an LMS
#' method for age and sex using corresponding LMS data from either the Centers
#' for Disease control (CDC) or the World Health Organization (WHO) based on
#' age.  Under 36 months use the WHO data to estimate the height percentile and
#' for 36 months and over use the CDC data.  The look up table will use the
#' percentile nearest the calculated value.  Look up height percentiles values
#' are: 5, 10, 25, 50, 75, 90, and 95.
#'
#' If you want to restrict to CDC or WHO values regardless of age, we recommend
#' using
{{qwraps2::backtick(p_height_for_age)}}
#' and
{{qwraps2::backtick(p_length_for_age)}}
#' to get height (stature) percentiles and pass the result to the
{{qwraps2::backtick(height_percentile)}}
#' argument.
#'
#' ## Percentiles
#'
#' What percentile for systolic and diastolic blood pressure is 100/60 for a 44
#' month old male with unknown height?
#'
p_bp(q_sbp = 100, q_dbp = 60, age = 44, male = 1)

#'
#' Those percentiles would be modified if height was 103 cm:
p_bp(q_sbp = 100, q_dbp = 60, age = 44, male = 1, height = 103)

#'
#' For the age and sex, the height of 103 is approximately the
{{ frmt(as.integer(100 * p_height_for_age(103, male = 1, age = 44))) %s% "th" }}
#' percentile.
p_height_for_age(103, male = 1, age = 44)
x <- p_bp(q_sbp = 100, q_dbp = 60, age = 44, male = 1, height_percentile = 0.80, source = "nhlbi")
x

#'
#' A plotting method to show where the observed blood pressures are on the
#' distribution function is also provided.
#+ fig.width = 5, fig.height = 5
bp_cdf(sbp = 90, dbp = 55, age = 44, male = 1, height = 103, source = "nhlbi")

#'
#' Vectors of blood pressures can be used as well.  NA values will return NA.
bps <-
  p_bp(
         q_sbp  = c(100, NA, 90)
       , q_dbp  = c(60, 82, 48)
       , age    = 44
       , male   = 1
       , height_percentile = 0.80
      )
bps

#'
#' If you want to know which data source was used in computing each of the
#' percentile estimates you can look at the
{{ qwraps2::backtick(bp_params) }}
#' attribute:
attr(bps, "bp_params")

#'
#' ## Quantiles
#'
#' If you have a percentile value and want to know the associated
#' systolic and diastolic blood pressures:
#'
q_bp(
       p_sbp = c(0.701, NA, 0.36)
     , p_dbp = c(0.85, 0.99, 0.50)
     , age = 44
     , male = 1
     , height_percentile = 0.80
    )

#'
#'
#' ## Working With More Than One Patient
#'
#' The
{{ qwraps2::backtick(p_bp) }}
#' and
{{ qwraps2::backtick(q_bp) }}
#' methods are designed accept vectors for each of the arguments.
#' These methods expected each argument to be length 1 or all the same length.
#'
#+ label = "bp_batch_example"
eg_data <- read.csv(system.file("example_data", "for_batch.csv", package = "pedbp"))
eg_data

bp_percentiles <-
  p_bp(
         q_sbp  = eg_data$sbp..mmHg.
       , q_dbp  = eg_data$dbp..mmHg.
       , age    = eg_data$age
       , male   = eg_data$male
       , height = eg_data$height
       )
bp_percentiles

str(bp_percentiles)

#'
#' Going from percentiles back to quantiles:
q_bp(
       p_sbp  = bp_percentiles$sbp_p
     , p_dbp  = bp_percentiles$dbp_p
     , age    = eg_data$age
     , male   = eg_data$male
     , height = eg_data$height
     )


#'
#' # Select Source Data
#'
#' The default method for estimating blood pressure percentiles is based on the
#' method of @martin2022machine and @martin2022development which uses three
#' different references depending on age and known/unknown stature.  If you want
#' to use a specific reference then you can do so by using the
{{ qwraps2::backtick(source) }}
#' argument.
#'
#' If you have a project with the want/need to use a specific source and you'd
#' to use you can set it as an option:
#'
#+ eval = FALSE
# /*
while (FALSE) {
# */
options(pedbp_bp_source = "martin2022")  # default
# /*
}
# */
#'
#' There are four sources:
#'
#' 1. @gemelli1990longitudinal for kids under one year of age.
#' 2. @lo2013prehypertension for kids over three years of age and when height is unknown.
#' 3. @nhlbi2011expert for kids 1 through 18 years of age with known stature.
#' 4. @flynn2017clinical for kids 1 through 18 years of age with known stature.
#'
#' The data from @flynn2017clinical and @nhlbi2011expert are similar but for one
#' major difference.  @flynn2017clinical excluded overweight and obese ( BMI
#' above the 85th percentile) children.
#'
#' For example, the estimated percentile for a blood pressure of 92/60 in a 39.2
#' month old female in the 95th height percentile are:
#'
d <- data.frame(source = c("martin2022", "gemelli1990", "lo2013", "nhlbi", "flynn2017"),
                p_sbp = NA_real_,
                p_dbp = NA_real_)

for(i in 1:nrow(d)) {
  bp <- p_bp(q_sbp = 92, q_dbp = 60, age = 39.2, male = 0, height_percentile = 95, source = d$source[i])
  d[i, "p_sbp"] <- bp$sbp
  d[i, "p_dbp"] <- bp$dbp
}
d

#'
#' The estimated 85th quantile SBP/DBP for a 39.2 month old female, who is in
#' the 95th height percentile are:
#'
d <- data.frame(source = c("martin2022", "gemelli1990", "lo2013", "nhlbi", "flynn2017"),
                q_sbp = NA_real_,
                q_dbp = NA_real_)

for(i in 1:nrow(d)) {
  bp <- q_bp(p_sbp = 0.85, p_dbp = 0.85, age = 39.2, male = 0, height_percentile = 95, source = d$source[i])
  d[i, "q_sbp"] <- bp$sbp
  d[i, "q_dbp"] <- bp$dbp
}
d

#'
#'
#' # Comparing to Published Percentiles
#'
#' The percentiles published in @nhlbi2011expert and @flynn2017clinical where
#' used to estimate a Gaussian mean and standard deviation.  This was in part to
#' be consistent with the values from @gemelli1990longitudinal and
#' @lo2013prehypertension.  As a result, the calculated percentiles and
#' quantiles from the pedbp package for @nhlbi2011expert and @flynn2017clinical
#' will be slightly different from the
#' published values.
#'
# /*
#  NOTE: This is also in the test suite
# */
#'
#' ## Flynn et al.
#'
fq <-
  q_bp(
     p_sbp = flynn2017$bp_percentile/100,
     p_dbp = flynn2017$bp_percentile/100,
     male  = flynn2017$male,
     age   = flynn2017$age,
     height_percentile = flynn2017$height_percentile,
     source = "flynn2017")

fp <-
  p_bp(
     q_sbp = flynn2017$sbp,
     q_dbp = flynn2017$dbp,
     male  = flynn2017$male,
     age   = flynn2017$age,
     height_percentile = flynn2017$height_percentile,
     source = "flynn2017")

f_bp <-
  cbind(flynn2017,
        pedbp_sbp = fq$sbp,
        pedbp_dbp = fq$dbp,
        pedbp_sbp_p = fp$sbp_p * 100,
        pedbp_dbp_p = fp$dbp_p * 100
  )

#'
#' All the quantile estimates are within 2 mmHg:
summary(f_bp$pedbp_sbp - f_bp$sbp)
summary(f_bp$pedbp_dbp - f_bp$dbp)

#+ include = FALSE
stopifnot(max(abs(f_bp$pedbp_sbp - f_bp$sbp)) < 2)
stopifnot(max(abs(f_bp$pedbp_dbp - f_bp$dbp)) < 2)

#'
#' All the percentiles estimates are within are within 2 percentile points:
#'
summary(f_bp$pedbp_sbp_p - f_bp$bp_percentile)
summary(f_bp$pedbp_dbp_p - f_bp$bp_percentile)

#'
#+ include = FALSE
stopifnot(max(abs(f_bp$pedbp_sbp_p - f_bp$bp_percentile)) < 2)
stopifnot(max(abs(f_bp$pedbp_dbp_p - f_bp$bp_percentile)) < 2)

#'
#' A helpful set of graphics are shown below.  Panels A and C show the estimated
#' blood pressure quantiles provide by the
{{ qwraps2::Rpkg(pedbp) }}
#' package (y-axis) against the published quantiles from @flynn2017clinical for
#' systolic and diastolic blood pressures respectively.
#' Panels B and D are Bland-Altman plots showing the difference vs average
#' between the two estimates.
#+ echo = FALSE, fig.width = 9, fig.height = 9
fsbp <- ggplot2::ggplot(f_bp) +
  ggplot2::theme_bw() +
  ggplot2::aes(x = sbp, y = pedbp_sbp) +
  ggplot2::geom_point() +
  ggplot2::geom_abline(intercept = 0, slope = 1) +
  ggplot2::ylab("pedbp Package\nSystolic Blood Pressure (mmHg)") +
  ggplot2::xlab("Published Flynn (2017)\nSystolic Blood Pressure (mmHg)")
fdbp <- ggplot2::ggplot(f_bp) +
  ggplot2::theme_bw() +
  ggplot2::aes(x = dbp, y = pedbp_dbp) +
  ggplot2::geom_point() +
  ggplot2::geom_abline(intercept = 0, slope = 1) +
  ggplot2::ylab("pedbp Package\nDiastolic Blood Pressure (mmHg)") +
  ggplot2::xlab("Published Flynn (2017)\nDiastolic Blood Pressure (mmHg)")

ggpubr::ggarrange(
  fsbp, qwraps2::qblandaltman(f_bp[, c("sbp", "pedbp_sbp")]) + ggplot2::theme_bw(),
  fdbp, qwraps2::qblandaltman(f_bp[, c("dbp", "pedbp_dbp")]) + ggplot2::theme_bw(),
  labels = LETTERS
)


#'
#' ## NHLBI/CDC
#'
nq <-
  q_bp(
     p_sbp = nhlbi_bp_norms$bp_percentile/100,
     p_dbp = nhlbi_bp_norms$bp_percentile/100,
     male  = nhlbi_bp_norms$male,
     age   = nhlbi_bp_norms$age,
     height_percentile = nhlbi_bp_norms$height_percentile,
     source = "nhlbi")

np <-
  p_bp(
     q_sbp = nhlbi_bp_norms$sbp,
     q_dbp = nhlbi_bp_norms$dbp,
     male  = nhlbi_bp_norms$male,
     age   = nhlbi_bp_norms$age,
     height_percentile = nhlbi_bp_norms$height_percentile,
     source = "nhlbi")

nhlbi_bp <-
  cbind(nhlbi_bp_norms,
        pedbp_sbp = nq$sbp,
        pedbp_dbp = nq$dbp,
        pedbp_sbp_p = np$sbp_p * 100,
        pedbp_dbp_p = np$dbp_p * 100
  )


#'
#' All the quantile estimates are within 2 mmHg:
#'
summary(nhlbi_bp$pedbp_sbp - nhlbi_bp$sbp)
summary(nhlbi_bp$pedbp_dbp - nhlbi_bp$dbp)

#'
#+ include = FALSE
stopifnot(max(abs(nhlbi_bp$pedbp_sbp - nhlbi_bp$sbp)) < 2)
stopifnot(max(abs(nhlbi_bp$pedbp_dbp - nhlbi_bp$dbp)) < 2)

#'
#' All the percentiles are within 2 percentile points:
#'
summary(nhlbi_bp$pedbp_sbp_p - nhlbi_bp$bp_percentile)
summary(nhlbi_bp$pedbp_dbp_p - nhlbi_bp$bp_percentile)

#'
#+ include = FALSE
stopifnot(max(abs(nhlbi_bp$pedbp_sbp_p - nhlbi_bp$bp_percentile)) < 2)
stopifnot(max(abs(nhlbi_bp$pedbp_dbp_p - nhlbi_bp$bp_percentile)) < 2)

#'
#' A helpful set of graphics are shown below.  Panels A and C show the estimated
#' blood pressure quantiles provide by the
{{ qwraps2::Rpkg(pedbp) }}
#' package (y-axis) against the published quantiles from @nhlbi2011expert for
#' systolic and diastolic blood pressures respectively.
#' Panels B and D are Bland-Altman plots showing the difference vs average
#' between the two estimates.
#+ echo = FALSE, fig.width = 9, fig.height = 9
nsbp <- ggplot2::ggplot(nhlbi_bp) +
  ggplot2::theme_bw() +
  ggplot2::aes(x = sbp, y = pedbp_sbp) +
  ggplot2::geom_point() +
  ggplot2::geom_abline(intercept = 0, slope = 1) +
  ggplot2::ylab("pedbp Package\nSystolic Blood Pressure (mmHg)") +
  ggplot2::xlab("Published NHLBI\nSystolic Blood Pressure (mmHg)")
ndbp <- ggplot2::ggplot(nhlbi_bp) +
  ggplot2::theme_bw() +
  ggplot2::aes(x = dbp, y = pedbp_dbp) +
  ggplot2::geom_point() +
  ggplot2::geom_abline(intercept = 0, slope = 1) +
  ggplot2::ylab("pedbp Package\nDiastolic Blood Pressure (mmHg)") +
  ggplot2::xlab("Published NHLBI (2017)\nDiastolic Blood Pressure (mmHg)")

ggpubr::ggarrange(
  nsbp, qwraps2::qblandaltman(nhlbi_bp[, c("sbp", "pedbp_sbp")]) + ggplot2::theme_bw(),
  ndbp, qwraps2::qblandaltman(nhlbi_bp[, c("dbp", "pedbp_dbp")]) + ggplot2::theme_bw(),
  labels = LETTERS
)

#'
#' ## NHLBI vs Flynn 2017
#'
#' The NHLBI data included overweight and obese children whereas Flynn excluded
#' them.  As a result, the estimates for blood pressures can differ
#' significantly between the two sources.
#'
#' The graphic below shows the estimated systolic and diastolic blood pressures
#' provided by the
{{ qwraps2::Rpkg(pedbp) }}
#' package.  As expected, the values estimated based on Flynn are lower, on
#' average, than those estimated by data from the NHLBI.
#'
#+ echo = FALSE, fig.width = 9, fig.height = 9
nhlbi_vs_flynn <- f_bp
names(nhlbi_vs_flynn)[names(nhlbi_vs_flynn) == "bp_percentile"] <- "flynn_bp_percentile"
names(nhlbi_vs_flynn)[names(nhlbi_vs_flynn) == "sbp"] <- "flynn_sbp"
names(nhlbi_vs_flynn)[names(nhlbi_vs_flynn) == "dbp"] <- "flynn_dbp"
names(nhlbi_vs_flynn)[names(nhlbi_vs_flynn) == "pedbp_sbp"] <- "pedbp_flynn_sbp"
names(nhlbi_vs_flynn)[names(nhlbi_vs_flynn) == "pedbp_dbp"] <- "pedbp_flynn_dbp"
names(nhlbi_vs_flynn)[names(nhlbi_vs_flynn) == "pedbp_sbp_p"] <- "pedbp_flynn_sbp_p"
names(nhlbi_vs_flynn)[names(nhlbi_vs_flynn) == "pedbp_dbp_p"] <- "pedbp_flynn_dbp_p"

nhlbi_vs_flynn <-
  merge(nhlbi_vs_flynn, nhlbi_bp, all = TRUE, by = c("male", "age", "height_percentile"))

names(nhlbi_vs_flynn)[names(nhlbi_vs_flynn) == "bp_percentile"] <- "nhlbi_bp_percentile"
names(nhlbi_vs_flynn)[names(nhlbi_vs_flynn) == "sbp"] <- "nhlbi_sbp"
names(nhlbi_vs_flynn)[names(nhlbi_vs_flynn) == "dbp"] <- "nhlbi_dbp"
names(nhlbi_vs_flynn)[names(nhlbi_vs_flynn) == "pedbp_sbp"] <- "pedbp_nhlbi_sbp"
names(nhlbi_vs_flynn)[names(nhlbi_vs_flynn) == "pedbp_dbp"] <- "pedbp_nhlbi_dbp"
names(nhlbi_vs_flynn)[names(nhlbi_vs_flynn) == "pedbp_sbp_p"] <- "pedbp_nhlbi_sbp_p"
names(nhlbi_vs_flynn)[names(nhlbi_vs_flynn) == "pedbp_dbp_p"] <- "pedbp_nhlbi_dbp_p"

pA <- ggplot2::ggplot(nhlbi_vs_flynn) +
  ggplot2::theme_bw() +
  ggplot2::aes(x = pedbp_nhlbi_sbp, y = pedbp_flynn_sbp) + #, color = factor(height_percentile), shape = factor(male, 0:1, c("Female", "Male"))) +
  ggplot2::geom_point() +
  ggplot2::geom_abline(intercept = 0, slope = 1) +
  ggplot2::ylab("pedbp NHLBI Systolic BP") +
  ggplot2::xlab("pedbp Flynn (2017) Systolic BP")

pC <- ggplot2::ggplot(nhlbi_vs_flynn) +
  ggplot2::theme_bw() +
  ggplot2::aes(x = pedbp_nhlbi_dbp, y = pedbp_flynn_dbp) + #, color = factor(height_percentile), shape = factor(male, 0:1, c("Female", "Male"))) +
  ggplot2::geom_point() +
  ggplot2::geom_abline(intercept = 0, slope = 1) +
  ggplot2::ylab("pedbp NHLBI Diastolic BP") +
  ggplot2::xlab("pedbp Flynn (2017) Diastolic BP")

ggpubr::ggarrange(
  pA, qwraps2::qblandaltman(nhlbi_vs_flynn[, c("pedbp_nhlbi_sbp", "pedbp_flynn_sbp")]) + ggplot2::theme_bw(),
  pC, qwraps2::qblandaltman(nhlbi_vs_flynn[, c("pedbp_nhlbi_dbp", "pedbp_flynn_dbp")]) + ggplot2::theme_bw(),
  labels = LETTERS
)



#'
#'
#' # Blood Pressure Charts
#'
#' To you can get blood pressure charts for any combination of inputs using
{{ qwraps2::backtick(bp_chart) %s% "."}}
#' For example, the blood pressure percentiles when using
{{ qwraps2::backtick("source = 'martin2022'", dequote = TRUE) }}
#' and height is unknown are:
#+ fig.height = 7, fig.width = 10
bp_chart(p = c(0.05, 0.1, 0.25, 0.5, 0.75, 0.90, 0.95), source = "martin2022") # default

#'
#' And if height is known (say it is the 25th percentile)
#+ fig.height = 7, fig.width = 10
bp_chart(p = c(0.05, 0.1, 0.25, 0.5, 0.75, 0.90, 0.95),
         height_percentile = 25,
         source = "martin2022")

#'
#' Additionally, charts for each of the specific data sources can be generated
#+ fig.height = 7, fig.width = 10
bp_chart(source = "gemelli1990")
bp_chart(source = "lo2013")
bp_chart(source = "nhlbi")
bp_chart(source = "flynn2017")


#'
#'
#' # Shiny Application
#'
#' An interactive [Shiny](https://shiny.posit.co/) application is also
#' available. After installing the pedbp package and the suggested packages, you
#' can run the app locally via
#'
#+ label = "shiny", eval = FALSE
# /*
while (FALSE) {
# */
shiny::runApp(system.file("shinyapps", "pedbp", package = "pedbp"))
# /*
}
# */
#'
#' The shiny application allows for interactive exploration of blood pressure
#' percentiles for an individual patient and allows for batch processing a set
#' of patients as well.
#'
#' # References
#'<div id="refs"></div>
#'
#/*
#' # Session Info
#+ label = "sessioninfo"
sessionInfo()
#*/
dewittpe/pedbp documentation built on Jan. 26, 2025, 8:02 p.m.