View source: R/prewhitened-ccf.R
prewhitened_ccf | R Documentation |
prewhitened_ccf() prewhitens time series, calculates cross-correlation coefficients, and returns statistically significant values.
prewhitened_ccf(
input,
output,
input_col,
output_col,
keep_input = "both",
keep_ccf = "both",
max.order
)
input |
tsibble; The influential or "predictor-like" time series |
output |
tsibble; The affected or "response-like" time series |
input_col |
string; Name of the numeric column of the input tsibble |
output_col |
string; Name of the numeric column of the output tsibble |
keep_input |
string; values: "input_lags", "input_leads" or "both"; Default is "both". |
keep_ccf |
string; values: "positive", "negative", or "both"; Default is "both".
max.order |
integer; The maximum lag used in the CCF calculation. |
In a cross-correlation in which the direction of influence between two time series is hypothesized or known:
The influential time series is called the "input" time series.
The affected time series is called the "output" time series.
The cross-correlation function calculates correlation values between lags and leads of the input series and the output series. Sometimes only correlations between the leads or lags of the input series and the output series make theoretical sense, or only positive or negative correlations make theoretical sense.
The "keep_input" argument specifies whether to keep only the CCF values involving lags of the input series, only those involving leads of the input series, or both.
The "keep_ccf" argument specifies whether to keep only the positive CCF values, only the negative CCF values, or both.
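As a rough illustration of what these two arguments select, the sketch below filters a small, made-up table of CCF values by hand. The data frame and its values are hypothetical and only mimic the column names of the returned tibble; this is not the function's internal code.

```r
# Hypothetical CCF results shaped like the tibble this function returns
ccf_tbl <- data.frame(
  input_type   = c("lag", "lag", "lead", "lead"),
  input_series = c(3, 7, 2, 5),
  ccf          = c(0.42, -0.31, 0.28, -0.36)
)

# Roughly what keeping only lags of the input series asks for
lags_only <- ccf_tbl[ccf_tbl$input_type == "lag", ]

# Roughly what keeping only positive CCF values asks for
positive_lags <- lags_only[lags_only$ccf > 0, ]
positive_lags
```

Restricting the output this way is useful when, for example, theory says the input can only affect later values of the output, so correlations with leads of the input would be spurious.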
prewhitened_ccf() differences the series if needed, prewhitens them, and returns either the statistically significant values of the CCF or, if none are found, the top non-statistically significant value. The prewhitening method is from Cryer and Chan (2008, Chapter 11).
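The general idea of prewhitening can be sketched in base R as follows. This is a minimal sketch on simulated data, not the package's implementation: the AR order selection via ar(), the filter construction, and the 95% threshold of qnorm(0.975) / sqrt(n) are assumptions chosen for illustration, and the differencing step is skipped because the simulated series is already stationary.

```r
set.seed(42)
n <- 300

# Simulated input: an AR(1) series
x <- as.numeric(arima.sim(model = list(ar = 0.7), n = n))
# Simulated output: responds to the input 3 time steps later, plus noise
y <- c(rep(0, 3), 0.8 * x[1:(n - 3)]) + rnorm(n)

# 1. Fit an AR model to the input series (order chosen by AIC)
ar_fit <- ar(x, order.max = 10)

# 2. Apply the same AR filter (1 - phi_1 B - ... - phi_p B^p) to both series
ar_filter <- c(1, -ar_fit$ar)
x_pw <- stats::filter(x, filter = ar_filter, method = "convolution", sides = 1)
y_pw <- stats::filter(y, filter = ar_filter, method = "convolution", sides = 1)

# 3. Cross-correlate the filtered series, dropping the leading NAs
keep <- complete.cases(x_pw, y_pw)
cc <- ccf(x_pw[keep], y_pw[keep], lag.max = 20, plot = FALSE)

# 4. Flag values beyond an approximate 95% threshold (an assumed formula)
threshold <- qnorm(0.975) / sqrt(sum(keep))
ccf_vals <- data.frame(lag = as.vector(cc$lag), ccf = as.vector(cc$acf))
significant <- ccf_vals[abs(ccf_vals$ccf) > threshold, ]
significant
```

In stats::ccf(x, y), the value at lag k estimates the correlation between x[t + k] and y[t], so in this simulation the strongest correlation is expected at lag -3, meaning the input three steps earlier is associated with the current output.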
A tibble with the following columns:
input_type: "lag" or "lead"
input_series: lag or lead number
signif_type: "Statistically Significant" or "Not Statistically Significant"
signif_threshold: Threshold CCF value for statistical significance at the 95% level
ccf: Calculated ccf value
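A returned tibble with these columns can be summarised with ordinary subsetting. The sketch below uses a made-up result (all values are hypothetical) and picks the input lag or lead with the largest absolute CCF among the rows flagged as significant.

```r
# Hypothetical result with the documented columns; the values are made up
res <- data.frame(
  input_type       = c("lag", "lag", "lead"),
  input_series     = c(14, 21, 3),
  signif_type      = "Statistically Significant",
  signif_threshold = 0.11,
  ccf              = c(0.35, 0.22, -0.18)
)

# Keep rows flagged as significant, then take the largest absolute correlation
sig  <- res[res$signif_type == "Statistically Significant", ]
best <- sig[which.max(abs(sig$ccf)), ]
best[, c("input_type", "input_series", "ccf")]
```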
Cryer, Jonathan, and Chan, Kung-Sik. 2008. Time Series Analysis With Applications in R. New York: Springer Science+Business Media, pp. 260-271.
# Build the input (cases) and output (deaths) tsibbles, indexed by date
oh_cases <- ohio_covid %>%
  dplyr::select(date, cases) %>%
  tsibble::as_tsibble(index = date)
oh_deaths <- ohio_covid %>%
  dplyr::select(date, deaths_lead0) %>%
  tsibble::as_tsibble(index = date)
# Keep only positive correlations between lags of cases and deaths
oh_ccf_tbl <- prewhitened_ccf(input = oh_cases,
                              output = oh_deaths,
                              input_col = "cases",
                              output_col = "deaths_lead0",
                              max.order = 40,
                              keep_input = "input_lags",
                              keep_ccf = "positive")
oh_ccf_tbl
library(dplyr, warn.conflicts = FALSE)

# Nest each region's cases into its own tsibble and give each row an id
reg_cases_tsb <- us_regional_cases %>%
  tsibble::as_tsibble(index = date, key = region) %>%
  tsibble::group_by_key() %>%
  tidyr::nest() %>%
  arrange(region) %>%
  ungroup() %>%
  mutate(id = as.character(row_number()))

# Do the same for deaths so the rows line up with the cases by region
reg_deaths_tsb <- us_regional_deaths %>%
  tsibble::as_tsibble(index = date, key = region) %>%
  tsibble::group_by_key() %>%
  tidyr::nest() %>%
  arrange(region) %>%
  ungroup() %>%
  mutate(id = as.character(row_number()))

# Run prewhitened_ccf() on each region's pair of series;
# .id records the list position, which is joined back to the region name
reg_ccf_vals <- purrr::map2_dfr(reg_cases_tsb$data,
                                reg_deaths_tsb$data,
                                prewhitened_ccf,
                                input_col = "reg_sev_day_cases",
                                output_col = "reg_sev_day_deaths",
                                max.order = 40,
                                .id = "id") %>%
  left_join(reg_cases_tsb %>%
              select(id, region), by = "id")

head(reg_ccf_vals)