WesterlundPlain: Compute Raw Westerlund ECM Panel Cointegration Statistics...

View source: R/Westerlund.R

WesterlundPlain    R Documentation

Compute Raw Westerlund ECM Panel Cointegration Statistics (Plain Routine)

Description

Internal plain (non-bootstrap) routine for computing the four Westerlund (2007) ECM-based panel cointegration test statistics G_t, G_a, P_t, and P_a. The function estimates unit-specific ECM regressions to form the mean-group statistics and then constructs pooled (panel) statistics using cross-unit aggregation and partialling-out steps. Time indexing is handled strictly via gap-aware lag/difference helpers.

Usage

WesterlundPlain(
  data,
  touse,
  idvar,
  timevar,
  yvar,
  xvars,
  constant = FALSE,
  trend = FALSE,
  lags,
  leads = NULL,
  lrwindow = 2,
  westerlund = FALSE,
  aic = TRUE,
  bootno = FALSE,
  indiv.ecm = FALSE,
  verbose = FALSE
)

Arguments

data

A data.frame containing panel data.

touse

Logical vector of length nrow(data) indicating rows eligible for estimation. Rows are further filtered to remove missing yvar and xvars.

idvar

String. Column identifying cross-sectional units.

timevar

String. Column identifying time.

yvar

String. Name of the dependent variable (levels).

xvars

Character vector. Names of regressors in the long-run relationship (levels).

constant

Logical. If TRUE, includes a constant term in the ECM design matrix.

trend

Logical. If TRUE, includes a linear time trend in the ECM design matrix.

lags

Integer or length-2 integer vector. Fixed lag order or range c(min,max) for short-run dynamics. If a range is supplied, the routine performs an information-criterion search over candidate lag/lead combinations.

leads

Integer or length-2 integer vector, or NULL. Fixed lead order or range c(min,max). If NULL, defaults to 0.

lrwindow

Integer. Bartlett kernel window (maximum lag) used in long-run variance calculations via calc_lrvar_bartlett.

westerlund

Logical. If TRUE, uses a Westerlund-specific information criterion and trimming logic for variance estimation.

aic

Logical. If TRUE, uses AIC for lag/lead selection when lags and/or leads are supplied as ranges. If FALSE, uses BIC.

bootno

Logical. If TRUE, prints a short header and progress dots (intended for higher-level routines).

indiv.ecm

Logical. If TRUE, returns the output of the individual ECM regressions (see reg_coef under Value).

verbose

Logical. If TRUE, prints additional output.
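The lags, leads, and aic arguments together describe an information-criterion search over candidate lag/lead combinations. The sketch below illustrates one way such a search could be organized. It is not the package's implementation: select_lag_lead and toy_fit are hypothetical names, and the criterion shown (n * log(RSS / n) plus a penalty of 2 or log(n) per parameter) is a textbook form assumed for illustration. A real search would also need to hold the estimation sample fixed across candidates.

```r
# Hypothetical helper (not part of the package): enumerate lag/lead candidates
# and return the combination with the smallest information criterion.
# `fit_fun(lag, lead)` is a stand-in for a unit-level ECM fit and must return
# a list with `rss` (residual sum of squares), `nobs`, and `k` (parameters).
select_lag_lead <- function(lags, leads = NULL, fit_fun, aic = TRUE) {
  if (is.null(leads)) leads <- 0
  # A length-2 vector is read as c(min, max); a scalar is a fixed order.
  lag_seq  <- if (length(lags)  == 2) seq(lags[1],  lags[2])  else lags
  lead_seq <- if (length(leads) == 2) seq(leads[1], leads[2]) else leads
  grid <- expand.grid(lag = lag_seq, lead = lead_seq)
  grid$ic <- vapply(seq_len(nrow(grid)), function(i) {
    f <- fit_fun(grid$lag[i], grid$lead[i])
    penalty <- if (aic) 2 else log(f$nobs)   # AIC vs. BIC penalty per parameter
    f$nobs * log(f$rss / f$nobs) + penalty * f$k
  }, numeric(1))
  grid[which.min(grid$ic), ]
}

# Toy stand-in for a model fit: the residual sum of squares is smallest at lag 2.
toy_fit <- function(lag, lead) {
  list(rss = 10 + (lag - 2)^2 + lead, nobs = 50, k = 3 + lag + lead)
}

best <- select_lag_lead(lags = c(1, 3), leads = c(0, 1), fit_fun = toy_fit)
print(best)
```

With fixed orders (for example lags = 1, leads = 0) the grid has a single row, so the search reduces to fitting one model.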

Details

Purpose and status. WesterlundPlain() is typically called internally by westerlund_test. It returns the four raw test statistics and lag/lead diagnostics needed for printing and standardization.

Workflow overview. The routine proceeds in two main stages:

  1. Unit-specific ECM regressions (Loop 1): For each cross-sectional unit, it constructs an ECM with \Delta y_t as the dependent variable and includes deterministic terms (optional), y_{t-1}, x_{t-1}, lagged \Delta y_t, and leads/lags of \Delta x_t. Lags and leads are computed using strict time-indexed helpers (get_lag, get_diff), which respect gaps in the time index. If lags and/or leads are provided as ranges, an information-criterion search selects the lag/lead orders for each unit. The routine stores the unit-level error-correction estimate \hat{\alpha}_i and its standard error.

  2. Pooled (panel) aggregation (Loop 2): Using the mean of selected lag/lead orders across units, the routine constructs pooled quantities needed for P_t and P_a via partialling-out regressions and cross-unit aggregation of residual products.
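To make the first stage concrete, the sketch below builds a single-unit ECM by hand in base R and extracts the error-correction estimate and its standard error. It is a minimal illustration under stated assumptions: the data are simulated, the series has no gaps, the shift helper is hypothetical (the package uses get_lag and get_diff instead), the lag order is fixed at 1 with no leads, and lm() adds a constant by default.

```r
set.seed(1)
n <- 200
x <- cumsum(rnorm(n))          # simulated I(1) regressor
y <- numeric(n)
for (i in 2:n) {
  # Error-correction DGP: y closes half of last period's gap to x.
  y[i] <- y[i - 1] - 0.5 * (y[i - 1] - x[i - 1]) + rnorm(1, sd = 0.5)
}

# Hypothetical positional shift; valid only for gap-free series.
# k > 0 gives a lag, k < 0 gives a lead.
shift <- function(v, k) {
  m <- length(v)
  if (k == 0) return(v)
  if (k > 0) c(rep(NA, k), v[seq_len(m - k)]) else c(v[(1 - k):m], rep(NA, -k))
}

dy <- c(NA, diff(y))
dx <- c(NA, diff(x))

ecm <- data.frame(
  dy    = dy,
  y_lag = shift(y, 1),    # y_{t-1}: its coefficient is the error-correction term
  x_lag = shift(x, 1),    # x_{t-1}
  dy_l1 = shift(dy, 1),   # lagged Delta y
  dx_0  = dx,             # contemporaneous Delta x
  dx_l1 = shift(dx, 1)    # lagged Delta x
)

# lm() drops rows with NA, which removes the observations lost to lagging.
fit <- lm(dy ~ y_lag + x_lag + dy_l1 + dx_0 + dx_l1, data = ecm)

alpha_hat <- coef(fit)[["y_lag"]]
se_alpha  <- summary(fit)$coefficients["y_lag", "Std. Error"]
t_alpha   <- alpha_hat / se_alpha
print(c(alpha = alpha_hat, se = se_alpha, t = t_alpha))
```

Because the simulated adjustment speed is -0.5, the estimate on y_lag should come out negative and in that neighbourhood.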

Long-run variance calculations. Long-run variances are computed using calc_lrvar_bartlett with maxlag = lrwindow. When westerlund = TRUE, the routine first applies Stata-like trimming to the start and end of the differenced series, based on the selected lags and leads, and then estimates the long-run variance.
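For readers unfamiliar with the Bartlett kernel, the following sketch shows the standard estimator: the sample variance plus twice the weighted autocovariances, with weights 1 - j / (maxlag + 1). The function lrvar_bartlett_sketch is hypothetical and is not claimed to match calc_lrvar_bartlett; in particular, the demeaning step and the silent removal of missing values are assumptions of this sketch.

```r
# Hypothetical Bartlett-kernel long-run variance (illustration only).
lrvar_bartlett_sketch <- function(u, maxlag, demean = TRUE) {
  u <- u[!is.na(u)]            # assumption: missing values are simply dropped
  if (demean) u <- u - mean(u)
  n <- length(u)
  lrv <- sum(u^2) / n          # gamma_0
  for (j in seq_len(min(maxlag, n - 1))) {
    gamma_j <- sum(u[(j + 1):n] * u[1:(n - j)]) / n   # j-th autocovariance
    lrv <- lrv + 2 * (1 - j / (maxlag + 1)) * gamma_j # Bartlett weight
  }
  lrv
}

u <- c(1, -1, 1, -1)
lrvar_bartlett_sketch(u, maxlag = 0)  # plain variance: 1
lrvar_bartlett_sketch(u, maxlag = 1)  # 1 + 2 * 0.5 * (-0.75) = 0.25
```

The alternating series has strong negative first-order autocorrelation, so adding one lag pulls the long-run variance well below the plain variance.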

Returned statistics. Let \hat{\alpha}_i denote the unit-specific error-correction coefficient on y_{t-1} (as constructed in the ECM), with standard error \widehat{\mathrm{se}}(\hat{\alpha}_i). The routine computes:

  • G_t: the mean of the individual t-ratios \hat{\alpha}_i/\widehat{\mathrm{se}}(\hat{\alpha}_i),

  • G_a: a scaled mean-group statistic using a unit-specific normalization factor derived from long-run variances,

  • P_t: a pooled t-type statistic based on a pooled \hat{\alpha} and its pooled standard error,

  • P_a: a pooled scaled statistic using an average effective time dimension.
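For orientation, the four statistics can be written in the form given in Westerlund (2007). The mapping to this routine is an assumption: T_i is taken to stand for the unit-specific normalization (tnorm), \bar{T} for the average effective time dimension, \hat{\alpha}_i(1) for the ratio of long-run standard deviations, and \hat{\alpha} for the pooled error-correction estimate.

```latex
\begin{aligned}
G_t &= \frac{1}{N}\sum_{i=1}^{N} \frac{\hat{\alpha}_i}{\widehat{\mathrm{se}}(\hat{\alpha}_i)},
&\qquad
G_a &= \frac{1}{N}\sum_{i=1}^{N} \frac{T_i\,\hat{\alpha}_i}{\hat{\alpha}_i(1)},
\qquad \hat{\alpha}_i(1) = \frac{\hat{\omega}_{ui}}{\hat{\omega}_{yi}}, \\
P_t &= \frac{\hat{\alpha}}{\widehat{\mathrm{se}}(\hat{\alpha})},
&\qquad
P_a &= \bar{T}\,\hat{\alpha}.
\end{aligned}
```

Here \hat{\omega}_{ui} and \hat{\omega}_{yi} denote long-run standard deviation estimates based on the ECM residuals and on \Delta y_t, respectively.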

Value

A nested list containing:

  • stats: A list of the four raw Westerlund test statistics:

    • Gt: Mean-group tau statistic.

    • Ga: Mean-group alpha statistic.

    • Pt: Pooled tau statistic.

    • Pa: Pooled alpha statistic.

  • indiv_data: A named list where each element corresponds to a cross-sectional unit (ID), containing:

    • ai: The estimated speed of adjustment (alpha).

    • seai: The standard error of alpha (adjusted for degrees of freedom).

    • betai: Vector of long-run coefficients (\beta = -\lambda / \alpha).

    • blag, blead: The lags and leads selected for that specific unit.

    • ti: Raw observation count for the unit.

    • tnorm: Degrees of freedom used for normalization.

    • reg_coef: If indiv.ecm = TRUE, the full coefficient matrix from westerlund_test_reg.

  • results_df: A summary data.frame containing all unit-level results in vectorized format.

  • settings: A list of routine metadata:

    • meanlag, meanlead: Integer averages of the selected unit lags/leads.

    • realmeanlag, realmeanlead: Numeric averages of the selected unit lags/leads.

    • auto: Logical; TRUE if automatic selection (ranges) was used.
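The betai entry in the list above is defined by \beta = -\lambda / \alpha. As a worked example, assuming \lambda denotes the ECM coefficients on x_{t-1} and \alpha the coefficient on y_{t-1} (all numbers below are made up for illustration):

```r
# Hypothetical ECM coefficients for one unit (illustrative values only).
alpha  <- -0.4                      # coefficient on y_{t-1}
lambda <- c(x1 = 0.2, x2 = -0.1)    # coefficients on x_{t-1}

# Implied long-run coefficients: beta = -lambda / alpha
beta <- -lambda / alpha
print(beta)   # x1 = 0.5, x2 = -0.25
```

The division by \alpha means the long-run coefficients become unstable when the estimated adjustment speed is close to zero.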

Internal Logic

Two-stage structure

Loop 1 (mean-group) estimates unit-specific ECMs. Each unit produces an estimated error-correction coefficient on y_{t-1} and an associated standard error. These are aggregated into G_t and G_a.
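The aggregation into G_t can be sketched directly from unit-level output. The list below is a hand-written mock that mimics the documented indiv_data layout (ai and seai per unit); its values are invented, and whether the routine's Gt equals exactly this quantity computed from the returned seai is an assumption.

```r
# Mock of the documented indiv_data structure (values are made up).
indiv_data <- list(
  "1" = list(ai = -0.30, seai = 0.10),
  "2" = list(ai = -0.20, seai = 0.10),
  "3" = list(ai = -0.45, seai = 0.15)
)

# Unit-level t-ratios, then their cross-unit mean.
t_ratios <- vapply(indiv_data, function(u) u$ai / u$seai, numeric(1))
Gt <- mean(t_ratios)
print(t_ratios)   # -3, -2, -3
print(Gt)         # -8/3
```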

Loop 2 (pooled) fixes a common short-run structure based on the average selected lag/lead orders and constructs pooled residual products to obtain P_t and P_a.
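The partialling-out idea behind the pooled step can be illustrated with the Frisch-Waugh-Lovell device: residualize both \Delta y_t and y_{t-1} on the nuisance regressors within each unit, then pool the residual cross-products across units. The sketch below uses stationary toy data and a single stand-in nuisance regressor purely to show the algebra; it omits the long-run variance scaling used in Westerlund (2007) and is not the package's code.

```r
set.seed(42)

# Three toy units sharing a true coefficient of -0.3 on y_lag.
units <- lapply(1:3, function(i) {
  n <- 50
  z <- cbind(1, rnorm(n))                       # constant + stand-in short-run term
  y_lag <- rnorm(n) + z[, 2]
  dy <- -0.3 * y_lag + 0.5 * z[, 2] + rnorm(n, sd = 0.1)
  list(z = z, y_lag = y_lag, dy = dy)
})

num <- 0
den <- 0
for (u in units) {
  qz   <- qr(u$z)
  dy_t <- qr.resid(qz, u$dy)      # Delta y with nuisance terms partialled out
  yl_t <- qr.resid(qz, u$y_lag)   # y_{t-1} with nuisance terms partialled out
  num  <- num + sum(yl_t * dy_t)  # cross-unit sum of residual products
  den  <- den + sum(yl_t^2)
}

alpha_pooled <- num / den
print(alpha_pooled)
```

Residualizing unit by unit lets each unit keep its own nuisance coefficients while the coefficient of interest is restricted to be common.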

Strict time indexing and gaps

All lags and differences are computed using strict time-based helpers (get_lag, get_diff). This ensures that gaps in the time index propagate as missing values rather than shifting across gaps.
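The difference between positional and time-based lagging is easiest to see on a series with a missing period. The helpers below are hypothetical stand-ins written for this illustration (they are not the package's get_lag and get_diff) and handle a single unit only.

```r
# Hypothetical time-indexed lag: look up the value observed at time - k.
lag_by_time <- function(x, time, k = 1L) {
  x[match(time - k, time)]   # NA wherever time - k was not observed
}

# Hypothetical time-indexed first difference.
diff_by_time <- function(x, time) {
  x - lag_by_time(x, time, 1L)
}

time <- c(1, 2, 3, 5, 6)     # period 4 is missing
x    <- c(10, 11, 13, 20, 22)

lag_by_time(x, time)         # NA 10 11 NA 20
diff_by_time(x, time)        # NA  1  2 NA  2

# A positional shift would wrongly pair t = 5 with the value from t = 3:
c(NA, head(x, -1))           # NA 10 11 13 20
```

In a panel, the same lookup would be applied within each cross-sectional unit so that values never cross unit boundaries.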

References

Westerlund, J. (2007). Testing for error correction in panel data. Oxford Bulletin of Economics and Statistics, 69(6), 709–748.

See Also

westerlund_test, WesterlundBootstrap, get_lag, get_diff, calc_lrvar_bartlett

Examples


set.seed(123)
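N <- 5
n_periods <- 20  # avoid the name `T`, which would mask R's shorthand for TRUE
df <- data.frame(
  id = rep(1:N, each = n_periods),
  t  = rep(1:n_periods, N),
  y  = rnorm(N * n_periods),
  x1 = rnorm(N * n_periods),
  x2 = rnorm(N * n_periods)
)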

touse <- rep(TRUE, nrow(df))

plain_res <- WesterlundPlain(
  data       = df,
  touse      = touse,
  idvar      = "id",
  timevar    = "t",
  yvar       = "y",
  xvars      = c("x1","x2"),
  lags       = 1,
  leads      = 0
)

# Accessing results from the nested structure:
stats <- plain_res$stats
print(c(Gt = stats$Gt, Ga = stats$Ga, Pt = stats$Pt, Pa = stats$Pa))

# Checking unit-specific coefficients for ID '1' (the simulated IDs run from 1 to N)
unit_1 <- plain_res$indiv_data[["1"]]
print(unit_1$ai)


Westerlund documentation built on Feb. 7, 2026, 5:07 p.m.