knitr::opts_chunk$set(echo = TRUE) library(tidyverse) library(predictBoundCP)
This package contains the scripts necessary to find a change point by finding the point where the observed data systematically deviate outside of, and above, the prediction bound of a model fit to a training period before the diagnostic opportunity window.
The following steps are applied to find the change-point:
Specify the training period [low
,high
] days prior to diagnosis, where low
is > the furthest point in the "feasible" opportunity window
Fit a model to the data over the period [low
,high
].
Generate forecasts over the period [0, low
-1] along with a corresponding
prediction interval
Find the change-point cp where the observed number of SSD visits exceeds the upper prediction bound for all points t<cp prior to diagnosis
The package contains a count dataset corresponding to SSD visits prior to HSV encephalitis.
The change-point functions require a count dataset with two variables: days_since_dx
is the number of days since diagnosis (negative values before diagnosis) and n
is the number of visits
visit_counts
The following figure depicts the trend in visit counts prior to HSV diagnosis, for the 180 days prior to diagnosis
plot_ssd_curve(count_data = visit_counts, max_window = 180)
The main function is fit_cp_model()
. This function takes an SSD visit count dataset
(visit_count), low and high values for the training window, a model to fit
over the training window, the confidence level for the prediction bound. The
function also has an option to return a plot of the fitted model along with prediction
bounds and the change-point.
The current options for models includes:
lm
- Linear Modelexp
- Exponential Modellm_period
- Linear Model with 7-day Periodicityexp_period
- Exponential Model with 7-day Periodicityauto_arima
- Auto ARIMA Model``` {r warning=FALSE, message=FALSE}
mod1 <- fit_cp_model(count_data = visit_counts, low = 40, high = 180, # training window model = "lm", # model level = .95, # prediction bound plot = TRUE)
mod2 <- fit_cp_model(count_data = visit_counts, low = 40, high = 180, # training window model = "exp", # model level = .95, # prediction bound plot = TRUE)
mod3 <- fit_cp_model(count_data = visit_counts, low = 40, high = 180, # training window model = "lm_period", # model level = .95, # prediction bound plot = TRUE)
mod4 <- fit_cp_model(count_data = visit_counts, low = 40, high = 180, # training window model = "exp_period", # model level = .95, # prediction bound plot = TRUE)
mod5 <- fit_cp_model(count_data = visit_counts, low = 40, high = 180, # training window model = "auto_arima", # model level = .95, # prediction bound plot = TRUE)
We can compare the results of the different models ``` {r} gridExtra::grid.arrange(mod1$plot, mod2$plot, mod3$plot, mod4$plot, nrow = 2) mod5$plot
The function fit_cp_profile()
creates a profile of change-points using a range
of different models, training windows and prediction bounds.
``` {r warning=FALSE, message=FALSE} profile_res <- fit_cp_profile(count_data = visit_counts, low = 40, high = 180)
We can view the change-points and the results across the various parameters: ``` {r} profile_res
Or we can plot the results across different lower bounds used in the training data: ``` {r} profile_res %>% ggplot2::ggplot(aes(low,cp,color=model)) + ggplot2::geom_line() + ggplot2::facet_wrap(~bound) + ggplot2::theme_minimal() + ggplot2::theme(legend.position="bottom") ````
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.