thin_ts_iter | R Documentation |
Thinning (i.e., retaining every nth observation in a dataset) is a commonly used method to reduce serial autocorrelation and simplify models. The effectiveness of thinning as a strategy for dealing with autocorrelation depends on the volume of available data and the level of thinning required to remove autocorrelation. This function implements an iterative thinning approach to determine how the estimated autocorrelation parameter and the volume of data change as a dataset is thinned by progressively greater amounts. To do this, the user must supply a dataframe to be thinned along with parameters that are passed to thin_ts, which implements thinning.

The user supplies a starting value for the thinning index (i.e., specifying that every nth observation should be retained on the first iteration of the algorithm) and an increment, which specifies the increase in the thinning index with each iteration of the algorithm. For models with high serial autocorrelation, the user is advised to begin with a reasonably large value of increment, which will result in faster convergence but less smooth profiles (i.e., estimated changes in the autocorrelation parameter with the thinning index). On each iteration, user-defined functions are used to (a) evaluate a model using the thinned dataset and (b) compute the autocorrelation parameter. These functions provide the flexibility to implement the approach for a wide variety of models in which the fitting procedure may or may not estimate the value of the autocorrelation parameter (e.g., bam versus gamm). This process continues until the autocorrelation parameter falls below a user-defined threshold.

Following algorithm convergence (i.e., the reduction in the autocorrelation parameter below the user-defined threshold), a list of outputs is returned which can be saved. A customisable plot can also be produced to demonstrate the change in (a) the estimated autocorrelation parameter and (b) the volume of data with (c) the level of thinning. This indicates the extent of thinning necessary to reduce autocorrelation consistently below the level expected from white noise (a 95 percent confidence interval is added to the plot), and the volume of data which remains for model fitting after this level of thinning. To facilitate plot customisation, a list of outputs from a previous implementation of the function can be passed to the function, which bypasses the algorithm and simply produces the plot, enabling quick adjustments to each plot component.
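The iterative procedure described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: thinning is done with a plain `seq()` index rather than thin_ts, the loop starts from a thinning index of 1, the threshold of 0.05 is arbitrary, and `max_nth` is a hypothetical safety cap that is not an argument of thin_ts_iter.

```r
# Minimal base-R sketch of iterative thinning (illustrative only).
set.seed(1)
n <- 2000
x <- seq_len(n)
# Simulate a linear trend with AR(1) errors (ar = 0.6)
y <- 0.01 * x +
  as.numeric(stats::arima.sim(list(order = c(1, 0, 0), ar = 0.6), n = n))
dat <- data.frame(x = x, y = y)

# Stand-ins for the user-supplied eval_mod and est_AR1 functions
eval_mod <- function(data) stats::lm(y ~ x, data = data)
est_AR1  <- function(mod) stats::acf(stats::resid(mod), plot = FALSE)$acf[2]

AR1_req   <- 0.05  # stop once the estimate falls below this value
nth       <- 1     # assumed starting thinning index (keep every observation)
increment <- 1     # increase in the thinning index per iteration
max_nth   <- 50    # hypothetical safety cap so the loop always terminates

record <- data.frame(nth = integer(0), AR1_est = numeric(0), n_obs = integer(0))
repeat {
  # Retain every nth observation, starting from the first
  thinned <- dat[seq(1, nrow(dat), by = nth), , drop = FALSE]
  AR1_est <- est_AR1(eval_mod(thinned))
  record  <- rbind(record,
                   data.frame(nth = nth, AR1_est = AR1_est, n_obs = nrow(thinned)))
  if (AR1_est < AR1_req || nth >= max_nth) break
  nth <- nth + increment
}
print(record)
```

Each row of `record` pairs a thinning index with the estimated lag-1 autocorrelation and the number of observations that remain, which is the trade-off the function's plot is designed to show.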
thin_ts_iter(
  dat,
  ind = NULL,
  flag1,
  first = 1,
  AR1_req = 0.01,
  nth = 0,
  increment = 1,
  eval_mod,
  resid_method = function(mod) { stats::resid(mod) },
  est_AR1 = function(mod) { stats::acf(stats::resid(mod), plot = FALSE)$acf[2] },
  thin_ts_iter_ls = NULL,
  plot = TRUE,
  p1_pretty_axis_args = list(side = 1:2, pretty = list(n = 5)),
  p2_pretty_axis_args = list(side = 4, pretty = list(n = 5)),
  p1_args = list(type = "b", pch = 21, bg = "black", col = "black", cex = 0.5),
  p2_args = list(type = "b", pch = 21, bg = "dimgrey", col = "dimgrey", cex = 0.5),
  add_error_envelope_args = list(),
  add_legend = TRUE,
  legend_args = list(),
  mtext_args = list(
    list(side = 1, text = "Thinning Index", line = 2.5),
    list(side = 2, text = expression(paste(hat(AR1)["lag =" ~ 1])), line = 2.5),
    list(side = 4, text = "log[n(obs)]", line = 2.5)
  ),
  verbose = TRUE
)
dat |
A dataframe which is thinned and used to fit the model on the first iteration. |
ind |
A character which defines the column name in |
flag1 |
A character which defines the column name in |
first |
A numeric value which defines the starting position in each independent time series from which every |
AR1_req |
A numeric value which defines the threshold below which the AR1 parameter should be reduced by thinning. |
nth |
A numeric input that defines the starting thinning index value. For a given |
increment |
A numeric value that defines the value by which the thinning index, |
eval_mod |
A function which acts as a wrapper for implementing a model. The only input to this function should be the data used to evaluate the model. |
resid_method |
A function which extracts the residuals from a model. |
est_AR1 |
A function which computes an autocorrelation parameter for a model, such as an AR1 parameter. The only input to this function should be the model. For models without a correlation structure, the default option is usually appropriate: this uses the autocorrelation function of residuals to estimate the approximate AR1 parameter. However, some adjustments may be required to this function (e.g. to extract model residuals appropriately or to extract a different parameter). In other cases, a model may estimate the autocorrelation parameter and the user can define a function here to extract the model estimate from the model object. |
thin_ts_iter_ls |
A named list of outputs from a previous implementation of |
plot |
A logical input which defines whether or not to produce a plot demonstrating the change in (a) the autocorrelation parameter and (b) the volume of data with (c) the thinning index. |
p1_pretty_axis_args |
A named list of arguments passed to |
p2_pretty_axis_args |
A named list of arguments passed to |
p1_args |
A named list of arguments to customise the first plot, which demonstrates the decline in the autocorrelation parameter with the thinning index. |
p2_args |
A named list of arguments to customise the second plot, which is added on top of the first plot to demonstrate the simultaneous decline in the volume of data with the thinning index. |
add_error_envelope_args |
A list of arguments passed to |
add_legend |
A logical input which defines whether or not to add a legend. |
legend_args |
A named list of arguments passed to |
mtext_args |
A named list of arguments passed to |
verbose |
A logical input which defines whether or not to print messages; namely, the estimated autocorrelation parameter on each algorithm iteration, which can be used to monitor algorithm progress and speed of convergence. |
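To illustrate the two situations described for eval_mod and est_AR1, the sketch below defines one pair of functions for a model without a correlation structure and another pair for a model that estimates the AR1 parameter itself. The function names and the simulated data are hypothetical, and stats::lm and stats::arima are used only because they ship with base R; they stand in for whichever modelling functions are actually of interest.

```r
set.seed(1)
n <- 500
x <- seq_len(n) / n
# Linear trend with AR(1) errors (ar = 0.6)
y <- 2 * x +
  as.numeric(stats::arima.sim(list(order = c(1, 0, 0), ar = 0.6), n = n))
d <- data.frame(x = x, y = y)

# (a) Model without a correlation structure:
#     approximate AR1 from the lag-1 autocorrelation of the residuals
eval_mod_lm   <- function(data) stats::lm(y ~ x, data = data)
est_AR1_resid <- function(mod) stats::acf(stats::resid(mod), plot = FALSE)$acf[2]

# (b) Model that estimates the AR1 parameter during fitting:
#     extract the estimate from the fitted model object
eval_mod_ar  <- function(data) {
  stats::arima(data$y, order = c(1, 0, 0), xreg = data$x)
}
est_AR1_coef <- function(mod) unname(stats::coef(mod)["ar1"])

AR1_a <- est_AR1_resid(eval_mod_lm(d))
AR1_b <- est_AR1_coef(eval_mod_ar(d))
print(c(residual_acf = AR1_a, model_estimate = AR1_b))
```

In both cases the wrapper takes only the data and the estimator takes only the model, matching the argument descriptions above.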
The function returns a list and/or a plot. On the first implementation of the algorithm, a list is returned with five elements: (1) start_time, the start time of the algorithm; (2) end_time, the end time of the algorithm; (3) iteration_duration, the duration (in minutes) of the algorithm; (4) iteration_record, a dataframe providing a record of algorithm outputs including (a) nth (the value of the thinning index on each iteration), (b) AR1_est (the value of the autocorrelation parameter on each iteration), (c) nrw_log (the logarithm of the number of observations remaining in dat on each iteration), (d) lowerCI (the lower 95 percent confidence limit for the AR1 parameter in white noise) and (e) upperCI (the upper 95 percent confidence limit for the AR1 parameter in white noise); and (5) CI, a list with three elements (x, the values of nth, as above; lowerCI, as above; upperCI, as above) used to add confidence intervals to the plot via add_error_envelope.
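For orientation, white-noise confidence limits of this kind are commonly approximated as plus or minus the 97.5th percentile of the standard normal distribution divided by the square root of the number of observations, which is the approximation drawn by default when an autocorrelation function is plotted in R. The sketch below assumes that approximation; this page does not state how the function itself computes lowerCI and upperCI, and the observation counts are made up for illustration.

```r
# Hypothetical number of observations remaining at thinning indices 1 to 4
nth   <- 1:4
n_obs <- c(365, 183, 122, 92)

# Approximate 95 percent limits for the lag-1 autocorrelation of white noise
half_width <- stats::qnorm(0.975) / sqrt(n_obs)

# A list shaped like the CI element described above
CI <- list(x = nth, lowerCI = -half_width, upperCI = half_width)
str(CI)
```

Under this approximation the limits widen as thinning removes data, so an estimate can fall inside them partly because fewer observations remain.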
Edward Lavender
#### Define model parameters and simulate observations
# Imagine the depth of an animal changes in a concave down pattern throughout the year.
# Define x values, the number of days since January 1st
set.seed(1)
x <- 1:365
# Define expected y values based on a concave-down effect of x:
quadratic <- function(a, b, x, h, k){
  step1 <- (b * x - h)^2 + k
  step2 <- a * step1
  return(step2)
}
mu <- quadratic(a = -0.001, b = 1, x = x, h = median(x), k = 100)
# Define observed y values with autocorrelation
y <- mu + arima.sim(list(order = c(1, 0, 0), ar = 0.6), n = length(mu), sd = 5)
# Define dataframe
d <- data.frame(x = x, y = y)
# Visualise simulated observations
graphics::plot(d$x, d$y, type = "l")

#### Flag independent sections of time series
d <- cbind(d, flag_ts(x = d$x, duration_threshold = 2880, flag = 1:3))
head(d)

#### Example (1): Implement thin_ts_iter with default options
thin_ts_iter_ls1 <-
  thin_ts_iter(dat = d,
               ind = "flag3",
               flag1 = "flag1",
               first = 1,
               AR1_req = 0.01,
               nth = 0,
               increment = 1,
               eval_mod = function(data){ mgcv::bam(y ~ s(x), data = data) },
               resid_method = function(mod) { stats::resid(mod) },
               est_AR1 = function(mod){ stats::acf(stats::resid(mod), plot = FALSE)$acf[2] },
               thin_ts_iter_ls = NULL,
               plot = TRUE,
               p1_pretty_axis_args = list(side = 1:2, pretty = list(n = 5)),
               p2_pretty_axis_args = list(side = 4, pretty = list(n = 5)),
               p1_args = list(type = "b", pch = 21, bg = "black", col = "black", cex = 0.5),
               p2_args = list(type = "p", pch = 24, bg = "dimgrey", col = "dimgrey", cex = 0.5),
               add_error_envelope_args = list(),
               add_legend = TRUE,
               legend_args = list(),
               mtext_args = list(list(side = 1, text = "Thinning Index", line = 2.5),
                                 list(side = 2, text = expression(paste(hat(AR1)["lag =" ~ 1])), line = 2.5),
                                 list(side = 4, text = "log[n(obs)]", line = 1)
                                 ),
               verbose = TRUE
               )