knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width = 7, fig.height = 4.5, out.width = "100%" )
library(MacroFilters) library(ggplot2) data("fr_gdp", package = "MacroFilters") data("es_gdp", package = "MacroFilters")
A fundamental task in applied macroeconomics is separating the trend — the long-run trajectory of a variable — from the cycle — transitory deviations around it. This decomposition underpins business-cycle analysis, the output gap, and potential GDP estimation.
Every series can be written as:
$$ y_t = \tau_t + c_t $$
where $\tau_t$ is the trend and $c_t$ is the cyclical component. The challenge is that any filter must decide whether an unusual observation represents a genuine shock to the trend or a transitory deviation that belongs in the cycle.
Classical filters minimise squared loss. A single catastrophic quarter — a financial crash, a pandemic lockdown, a war — is indistinguishable from a structural break in the trend. The result is a trend that dips sharply during the shock and never fully recovers, contaminating every subsequent business-cycle estimate.
MacroFilters solves this with the mbh_filter() function, which uses Huber loss to automatically down-weight extreme observations while fitting a smooth trend via gradient boosting.
Many filter packages force you to convert data to a specific time-series class before calling them. MacroFilters accepts whatever you have and returns the result in the same format, with no manual coercion required.
Supported input classes:
| Class | Package | Example |
|---|---|---|
| numeric | base R | c(100, 102, 98, ...) |
| ts | base R | ts(y, start = c(2000, 1), frequency = 4) |
| xts | xts | xts(y, order.by = dates) |
| zoo | zoo | zoo(y, order.by = dates) |
set.seed(7) y_raw <- cumsum(rnorm(60)) + (1:60) * 0.3 # a simple integrated series # As plain numeric hp_num <- hp_filter(y_raw) class(hp_num$trend) # numeric # As a monthly ts object y_ts <- ts(y_raw, start = c(2019, 1), frequency = 12) hp_ts <- hp_filter(y_ts) class(hp_ts$trend) # ts — output matches input
The filters normalise the input internally, perform all computations on a plain numeric vector, then restore the original class and index before returning.
hp_filter() — Sparse Hodrick-PrescottThe Hodrick-Prescott (1997) filter is the workhorse of macroeconomic trend extraction. It solves the penalised least-squares problem:
$$ \min_{\tau} \sum_{t=1}^{n}(y_t - \tau_t)^2 + \lambda \sum_{t=2}^{n-1}[(\tau_{t+1} - \tau_t) - (\tau_t - \tau_{t-1})]^2 $$
The second term penalises curvature in the trend; $\lambda$ controls the smoothness.
Most implementations solve this by inverting a dense $n \times n$ matrix, which is $O(n^3)$. MacroFilters recognises that the penalty matrix $D'D$ is pentadiagonal (a sparse banded structure) and solves the system using Matrix::bandSparse() and sparse Cholesky factorisation — bringing the cost down to O(n) in time and memory.
set.seed(42) n <- 100 y <- ts(100 + 0.4 * (1:n) + 5 * sin(2 * pi * (1:n) / 20) + rnorm(n, sd = 2), start = c(2000, 1), frequency = 4) hp <- hp_filter(y) hp
The smoothing parameter $\lambda$ is auto-selected from the series frequency via the Ravn-Uhlig (2002) heuristic:
$$ \lambda = 6.25 \times \text{freq}^4 $$
which gives $\lambda = 1{,}600$ for quarterly and $\lambda = 129{,}600$ for monthly data — the conventional values. You can override it explicitly: hp_filter(y, lambda = 1600).
df_hp <- data.frame( t = as.numeric(time(y)), data = as.numeric(y), trend = as.numeric(hp$trend), cycle = as.numeric(hp$cycle) ) ggplot(df_hp, aes(x = t)) + geom_line(aes(y = data, colour = "Observed"), linewidth = 0.6, linetype = "dashed") + geom_line(aes(y = trend, colour = "Trend"), linewidth = 1.1) + scale_colour_manual(values = c("Observed" = "grey60", "Trend" = "#0072B2")) + labs(title = "HP Filter", subtitle = paste0("\u03bb = ", hp$meta$lambda), x = "Year", y = "Value", colour = NULL) + theme_minimal(base_size = 12) + theme(legend.position = "bottom")
ggplot(df_hp, aes(x = t, y = cycle)) + geom_hline(yintercept = 0, linetype = "dashed", colour = "grey40") + geom_line(colour = "#0072B2", linewidth = 0.8) + labs(title = "HP Cycle", x = "Year", y = "Cycle") + theme_minimal(base_size = 12)
hamilton_filter() — Regression-Based AlternativeHamilton (2018) proposes replacing the HP filter entirely with an OLS regression. The idea: project $y_{t+h}$ on a constant and $p$ lags of $y_t$:
$$ y_{t+h} = \alpha_0 + \alpha_1 y_t + \alpha_2 y_{t-1} + \cdots + \alpha_p y_{t-p+1} + c_t $$
The fitted values form the trend; the residuals form the cycle. The horizon $h$ is set to two years ahead by default (e.g., $h = 8$ quarters), long enough to capture business-cycle variation without filtering it out.
Advantages over HP: - No end-point distortion - No spurious cycles from integrated series - Produces stationary cycle estimates by construction
ham <- hamilton_filter(y) # auto-detects h = 8 for quarterly ham
Note that the first $h + p - 1$ observations of the trend and cycle are NA, because there is insufficient history for the regression.
bhp_filter() — Boosted HPPhillips & Shi (2021) propose iteratively applying the HP filter: at each step, the filter is re-run on the residuals from the previous pass, and the resulting increment is added to the trend estimate. This procedure converges to a trend that better tracks stochastic variation.
$$ \tau^{(m)} = \tau^{(m-1)} + S_\lambda \cdot c^{(m-1)}, \qquad c^{(m)} = y - \tau^{(m)} $$
where $S_\lambda$ is the HP smoothing operator. Three stopping rules are available:
| Rule | Description |
|---|---|
| "bic" (default) | Minimise the Schwarz information criterion |
| "adf" | Stop when the cycle passes an Augmented Dickey-Fuller stationarity test |
| "fixed" | Run exactly iter_max iterations |
bhp <- bhp_filter(y, stopping = "bic") bhp
Internally, MacroFilters precomputes the sparse penalty matrix $Q = (I + \lambda D'D)^{-1}$ once and reuses it across all iterations, so the cost per iteration is a single sparse matrix–vector multiply rather than a full solve.
mbh_filter()Every filter above minimises squared residuals. When a pandemic shock drops GDP by 15% in a single quarter, that one observation exerts enormous leverage — it is 100× more influential than a typical quarterly fluctuation. The filter accommodates it by bending the trend downward, producing a spurious trend break that infects every subsequent output gap estimate.
The MacroBoost Hybrid (MBH) filter replaces squared loss with Huber loss:
$$ L_\delta(r) = \begin{cases} \dfrac{r^2}{2} & |r| \leq \delta \[6pt] \delta!\left(|r| - \dfrac{\delta}{2}\right) & |r| > \delta \end{cases} $$
Observations with residuals smaller than $\delta$ are treated like ordinary squared-error observations. Observations with residuals larger than $\delta$ — the structural shocks — contribute only linearly, massively reducing their influence on the trend estimate.
The trend is decomposed into two additive base learners fitted via component-wise L2-boosting (mboost):
$$ \hat{\tau}t = \underbrace{b_0 + b_1 \cdot t}{\text{Global linear drift}} + \underbrace{f(t)}_{\text{Local curvature (P-spline)}} $$
bols(time, intercept = TRUE) — captures the global linear drift.bbs(time, knots, degree = 3, differences = 2, boundary.knots) — a cubic P-spline with knots interior knots that captures smooth nonlinear curvature.The default knots = min(max(20, floor(n/2)), 250) is deliberately generous — it gives the spline enough flexibility to follow genuine low-frequency movements without overfitting, while the Huber loss ensures that shock-contaminated quarters are downweighted. The cap of 250 keeps the basis bounded on long / high-frequency series, where extra knots add cost but not flexibility (the penalty, not the knot count, governs smoothness).
| Parameter | Default | Role |
|---|---|---|
| knots | min(max(20, n/2), 250) | P-spline flexibility — higher = more local adaptability (capped at 250) |
| mstop | 500 | Boosting iterations — more = finer approximation |
| d | "auto" | Huber delta — if "auto", calibrated via the MAD of the HP cyclical residual (output-gap scale) |
| nu | 0.1 | Shrinkage / learning rate — controls step size per iteration |
| boundary.knots | NULL | B-spline domain anchor — if NULL, uses range(time_idx); fix to c(1, T_max) for real-time vintage stability |
By default, d is auto-calibrated as the MAD of the HP cyclical residual, anchoring the Huber threshold to the output-gap (business-cycle) scale rather than the growth-rate scale. You can override it with an explicit numeric value: d = 0.01 is typical for growth rates, while larger values suit index-level series.
For real-time applications where the sample grows one period at a time,
set boundary.knots = c(1, T_max) to anchor the B-spline domain to the
full-sample range — this prevents the basis from shifting between vintages
and makes trends comparable across publication dates.
France and Spain had two of the sharpest COVID-19 contractions in the EU (approximately −14 % and −18 % quarter-on-quarter in 2020 Q2 respectively), both followed by a rapid V-shaped recovery — making them a demanding real-world stress test for any trend filter.
# FRED public endpoint — no API key needed. # See data-raw/intl_gdp.R for the full reproducible download script. read_fred <- function(id) { url <- sprintf("https://fred.stlouisfed.org/graph/fredgraph.csv?id=%s", id) dt <- read.csv(url, col.names = c("date", "gdp_real"), na.strings = ".") dt$date <- as.Date(dt$date) dt$gdp_log <- log(as.numeric(dt$gdp_real)) dt[!is.na(dt$gdp_real), ] } fr_raw <- read_fred("CLVMNACSCAB1GQFR") es_raw <- read_fred("CLVMNACSCAB1GQES")
# Apply HP + MBH per country. # For log-level series, auto d (MAD of diff) is too tight — calibrate d on the # cycle scale instead (see vignette "Hyperparameter Tuning for the MBH Filter"). make_trend_df <- function(raw, country) { dt <- raw[raw$date >= as.Date("2000-01-01"), ] g <- ts(dt$gdp_log, start = c(2000, 1), frequency = 4) hp <- hp_filter(g) mbh <- mbh_filter(g, d = mad(hp$cycle)) data.frame(country = country, t = as.numeric(time(g)), observed = as.numeric(g), hp = as.numeric(hp$trend), mbh = as.numeric(mbh$trend)) } df_plot <- rbind( make_trend_df(fr_gdp, "France"), make_trend_df(es_gdp, "Spain") ) # Keep Spain filter objects for the S3 class examples in Section 5 dt_es <- es_gdp[es_gdp$date >= as.Date("2000-01-01"), ] gdp <- ts(dt_es$gdp_log, start = c(2000, 1), frequency = 4) hp_res <- hp_filter(gdp) mbh_res <- mbh_filter(gdp, d = mad(hp_res$cycle)) mbh_res
ggplot(df_plot, aes(x = t)) + geom_line(aes(y = observed, colour = "Observed"), linewidth = 0.6, linetype = "dashed") + geom_line(aes(y = hp, colour = "HP trend"), linewidth = 1.0) + geom_line(aes(y = mbh, colour = "MBH trend"), linewidth = 1.1) + annotate("rect", xmin = 2020.00, xmax = 2020.75, ymin = -Inf, ymax = Inf, alpha = 0.12, fill = "firebrick") + annotate("text", x = 2020.375, y = Inf, vjust = 1.4, label = "COVID-19\n2020 Q2", size = 3.2, colour = "firebrick") + scale_colour_manual( values = c("Observed" = "grey60", "HP trend" = "#0072B2", "MBH trend" = "#E69F00") ) + facet_wrap(~country, scales = "free_y") + labs( title = "HP vs MBH under a Structural Shock", subtitle = "MBH trend (orange) stays smooth;\nHP trend (blue) is pulled down by the COVID crash", x = "Year", y = "Log Real GDP", colour = NULL ) + theme_minimal(base_size = 12) + theme(legend.position = "bottom", strip.text = element_text(face = "bold"))
macrofilter S3 ClassAll four functions return a list of class c("macrofilter", "list") with four
core named elements:
| Element | Type | Description |
|---|---|---|
| $trend | numeric | Trend component |
| $cycle | numeric | Cyclical component |
| $data | numeric | Original series (immutable) |
| $meta | named list | Filter method, parameters, temporal identity (ts_class, tsp, idx), compute time |
When a filter is called with boot_iter > 0, the object additionally carries
two bootstrap confidence bands (see the Uncertainty bands vignette):
| Element | Type | Description |
|---|---|---|
| $trend_lower | numeric | Lower 95% band (trend - 1.96 * bootstrap sd) |
| $trend_upper | numeric | Upper 95% band (trend + 1.96 * bootstrap sd) |
mbh_res
The print method shows the method, the number of observations, the key parameters, the cycle range, and how long the filter took to run.
# Trend and cycle as plain vectors head(mbh_res$trend, 8) head(mbh_res$cycle, 8) # Verify the fundamental identity: trend + cycle == data max(abs((mbh_res$trend + mbh_res$cycle) - mbh_res$data)) # should be < 1e-9
str(mbh_res$meta)
The meta list stores every parameter used by the filter, making results fully reproducible from the object alone — no need to track arguments separately.
df_cycle <- data.frame( t = as.numeric(time(gdp)), HP_cycle = as.numeric(hp_res$cycle), MBH_cycle = as.numeric(mbh_res$cycle) ) ggplot(df_cycle, aes(x = t)) + geom_hline(yintercept = 0, linetype = "dashed", colour = "grey40") + geom_line(aes(y = HP_cycle, colour = "HP cycle"), linewidth = 0.8) + geom_line(aes(y = MBH_cycle, colour = "MBH cycle"), linewidth = 0.8) + annotate("rect", xmin = 2020.00, xmax = 2020.75, ymin = -Inf, ymax = Inf, alpha = 0.12, fill = "firebrick") + scale_colour_manual(values = c("HP cycle" = "#0072B2", "MBH cycle" = "#E69F00")) + labs( title = "Cyclical Components", subtitle = "HP cycle absorbs the shock; MBH cycle faithfully records it", x = "Year", y = "Cycle", colour = NULL ) + theme_minimal(base_size = 12) + theme(legend.position = "bottom")
In the MBH cycle, the COVID quarters appear as large negative spikes — the filter correctly recognises them as transitory deviations rather than a change in the long-run level. The HP cycle, by contrast, spreads the shock over several surrounding quarters as it tries to reconcile the trend distortion.
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.