CausalArima | R Documentation |
The function estimates the causal effect of an intervention on a time series by using the best-fitting ARIMA model on pre-intervention data to forecast what would have happened afterwards in the absence of the intervention. The prediction step can be improved by adding covariates that are linked to the outcome variable. Then, the causal effect is estimated by difference between the observed outcome post-intervention ad the prediction.
CausalArima(
y,
auto = TRUE,
order = c(0, 0, 0),
seasonal = c(0, 0, 0),
ic = "aic",
xreg = NULL,
dates,
int.date,
arima.args = list(),
auto.args = list(),
nboot = NULL,
alpha = 0.05
)
y |
Vector of observations, univariate time series of class |
auto |
If set to TRUE, |
order |
Order of the non-seasonal part of the ARIMA model, (p, d, q). Required when |
seasonal |
Order of the seasonal part of the ARIMA model, (P, D, Q). Defaults to |
ic |
Information criterion to use for model selection. Possible values in c("aicc", "aic", "bic").
Defaults to |
xreg |
Optional vector, matrix or data.frame of regressors to be included in the model. It must have the same number of rows as y. |
dates |
Vector of dates of length t (with elements of class |
int.date |
Date of the intervention (must be of class |
arima.args |
Additional arguments to be passed to Arima. |
auto.args |
Additional arguments to be passed to auto.arima. |
nboot |
Number of bootstrap iteration. Optional, if |
alpha |
Confidence level for the prediction intervals. Defaults to |
The function accepts both ts
and numeric
objects. In the latter case, y
is internally transformed in a ts
object whose frequency is set by
findfrequency
from the forecast
package.
NA
values are allowed in y but not in xreg
. Users are then left to
choose their preferred technique to deal with missing data in the outcome (e.g., imputation,
deletion, analysis etc.)
A list with the following components:
norm |
The inferred point, cumulative and temporal average effects at each point in time after the intervention with their standard deviations and p-values (left-sided, right-sided, bidirectional) estimated under the assumption of Normally distributed residuals. |
boot |
The inferred point, cumulative and temporal average effects at each point in time after the intervention with their standard deviations and p-values (left-sided, right-sided, bidirectional) estimated by bootstrap. |
causal.effect |
Estimated causal effect at each point in time after the intervention date (point effect). |
model |
The model estimated in the pre-intervention period,
result of a call to |
dates |
The vector of dates considered in the analysis. |
int.date |
The intervention date. |
y |
The vector of observations considered in the analysis. |
xreg |
The covariates, if any, used in the analysis. |
forecast |
Forecasted time series in the absence of intervention. |
forecast_lower |
Lower confidence bound for the forecasted series. |
forecast_upper |
Upper confidence bound for the forecasted series. |
alpha |
Confidence level for the prediction intervals. |
Rubin, D. B. (1976). Inference and missing data. Biometrika, 63(3), 581-592.
Little, R. J. A., and Rubin, D.B. (1987). Statistical Analysis with Missing Data. New York: John Wiley & Sons.
Imbens, G. W., and and Rubin, D.B. (2015). Causal Inference in Statistics, Social, and Biomedical Sciences. Cambridge, U.K.: Cambridge University Press.
## Example 1 (daily data, no predictors)
# Generating a time series of length 1000 and a vector of dates
y <- 0.5*seq(0.5, 250, by = 0.5) + rnorm(500, sd = 6)
dates <- seq.Date(from = as.Date("2014-01-05"), by = "days", length.out = 500)
# Adding a fictional intervention
int.date <- as.Date("2015-04-01")
horizon <- as.Date(c("2015-04-10", "2015-04-20"))
y.new <- y ; y.new[dates >= int.date] <- y.new[dates >= int.date]*1.25
# Plot
plot(y = y.new, x = dates, type = "l", col = "cadetblue", xlim = c(as.Date("2014-10-01"), tail(dates, 1)))
lines(y = y, x = dates, col = "orange")
# Causal effect estimation
ce <- CausalArima(y = y.new, dates = dates, int.date = int.date)
## Example 2 (daily data, with predictors)
# Loading data and setting dates
data(sales)
y <- sales[, "Sales"]
dates <- as.Date(sales[, "Dates"])
int.date <- as.Date("2018-10-04")
xreg <- sales[, "Price"]
# Plot
oldpar <- par(no.readonly = TRUE)
par(mfrow = c(1,2))
plot(y = y, x = dates, type = "l", main = "Time series of daily sales")
abline(v = int.date, col = "red"); Acf(y, main = "Acf sales")
par(oldpar)
# Causal effect estimation
# The autocorrelation function indicates a weekly sesonal pattern
ce <- CausalArima(y = ts(y, frequency = 7), xreg = xreg, int.date = int.date,
dates = dates, nboot = 100)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.