tk_acf_diagnostics: Group-wise ACF, PACF, and CCF Data Preparation

View source: R/diagnostics-tk_acf_diagnostics.R

tk_acf_diagnosticsR Documentation

Group-wise ACF, PACF, and CCF Data Preparation

Description

The tk_acf_diagnostics() function provides a simple interface to detect Autocorrelation (ACF), Partial Autocorrelation (PACF), and Cross Correlation (CCF) of Lagged Predictors in one tibble. This function powers the plot_acf_diagnostics() visualization.

Usage

tk_acf_diagnostics(.data, .date_var, .value, .ccf_vars = NULL, .lags = 1000)

Arguments

.data

A data frame or tibble with numeric features (values) in descending chronological order

.date_var

A column containing either date or date-time values

.value

A numeric column with a value to have ACF and PACF calculations performed.

.ccf_vars

Additional features to perform Lag Cross Correlations (CCFs) versus the .value. Useful for evaluating external lagged regressors.

.lags

A seqence of one or more lags to evaluate.

Details

Simplified ACF, PACF, & CCF

We are often interested in all 3 of these functions. Why not get all 3 at once? Now you can!

  • ACF - Autocorrelation between a target variable and lagged versions of itself

  • PACF - Partial Autocorrelation removes the dependence of lags on other lags highlighting key seasonalities.

  • CCF - Shows how lagged predictors can be used for prediction of a target variable.

Lag Specification

Lags (.lags) can either be specified as:

  • A time-based phrase indicating a duraction (e.g. ⁠2 months⁠)

  • A maximum lag (e.g. .lags = 28)

  • A sequence of lags (e.g. .lags = 7:28)

Scales to Multiple Time Series with Groupes

The tk_acf_diagnostics() works with grouped_df's, meaning you can group your time series by one or more categorical columns with dplyr::group_by() and then apply tk_acf_diagnostics() to return group-wise lag diagnostics.

Special Note on Dots (...)

Unlike other plotting utilities, the ... arguments is NOT used for group-wise analysis. Rather, it's used for processing Cross Correlations (CCFs).

Use dplyr::group_by() for processing multiple time series groups.

Value

A tibble or data.frame containing the autocorrelation, partial autocorrelation and cross correlation data.

See Also

  • Visualizing ACF, PACF, & CCF: plot_acf_diagnostics()

  • Visualizing Seasonality: plot_seasonal_diagnostics()

  • Visualizing Time Series: plot_time_series()

Examples

library(dplyr)

# ACF, PACF, & CCF in 1 Data Frame
# - Get ACF & PACF for target (adjusted)
# - Get CCF between adjusted and volume and close
FANG %>%
    filter(symbol == "FB") %>%
    tk_acf_diagnostics(date, adjusted,                # ACF & PACF
                       .ccf_vars = c(volume, close),  # CCFs
                       .lags     = 500)

# Scale with groups using group_by()
FANG %>%
    group_by(symbol) %>%
    tk_acf_diagnostics(date, adjusted,
                       .ccf_vars = c(volume, close),
                       .lags     = "3 months")

# Apply Transformations
FANG %>%
    group_by(symbol) %>%
    tk_acf_diagnostics(
        date, diff_vec(adjusted),  # Apply differencing transformation
        .lags = 0:500
    )



business-science/timekit documentation built on Feb. 2, 2024, 2:51 a.m.