crossnobis.md

Implementation Plan – Crossnobis Support for MS-ReVE / Contrast-RSA

Status (May 2025): estimation_method = "crossnobis" is currently disabled in contrast_rsa_model(). The previous code only whitened the fold means; it did not compute the unbiased squared Euclidean distances required by Eq. (5) of Diedrichsen & Kriegeskorte (2017). This document specifies the steps needed to implement Crossnobis correctly, end-to-end.

1 High-Level Requirements

2 API Surface Changes

| Component | Change |
|-----------|--------|
| contrast_rsa_model() | Re-allow estimation_method = "crossnobis"; add argument whitening_matrix_W = NULL that is stored in the model spec. |
| compute_crossvalidated_means_sl() | No change – new function will be used. |
| New helper compute_crossnobis_distances_sl() | Returns a named vector d_crossnobis of length K(K-1)/2. |
| train_model.contrast_rsa_model() | Branch: if obj$estimation_method == "crossnobis", call the new helper to obtain dvec_sl directly (skip Ĝ, lower-tri step). Still compute U_hat_sl (via existing function) so that Δ and Σ_q β_q Δ_q,v remain available. |
| Searchlight plumbing | Ensure whitening_matrix_W is forwarded (either via ... at run_searchlight() or by storing it in the model spec and passing down). |

3 Algorithm – compute_crossnobis_distances_sl()

  1. Inputs
     • sl_data (N × P)
     • mvpa_design (with Y conditions & block_var/cv grouping)
     • cv_spec (provides get_nfolds() & train_indices())
     • whitening_matrix_W (optional)
  2. Preparation
     • Let cond_levels = levels(mvpa_design$Y), K = length(cond_levels) and Kpairs = K(K-1)/2.
     • Determine the number of folds M and allocate an accumulator cross_sum (length = Kpairs) initialised to 0.
  3. Per-fold means
     • For each fold m:
       • μ̂_{c,m} = mean pattern of condition c computed only on training samples of that fold (same logic as existing mean function).
       • If whitening requested: μ̂_{c,m} ← μ̂_{c,m} %*% W.
     • Store the means in an array M_folds[m, c, p].
  4. Cross-products
     • For every unordered condition pair (i,j) (vector index k):
       • For all ordered fold pairs (m,n) with m ≠ n:
         • δ̂_{k,m} ← μ̂_{i,m} − μ̂_{j,m}
         • δ̂_{k,n} ← μ̂_{i,n} − μ̂_{j,n}
         • cross_sum[k] += crossprod(δ̂_{k,m}, δ̂_{k,n})
     • Complexity: O(M² Kpairs P). This is feasible for typical sizes (M ≤ 10, K ≤ 12, P ≤ 500).
     • Memory: δ̂ can be computed on the fly, so the full array need not be stored if the loops are ordered appropriately.
  5. Normalisation
     • d_crossnobis[k] ← cross_sum[k] / (M*(M-1)*P)
  6. Return
     • Named vector with names like "condA_vs_condB" in lower-tri order to match how contrast RDMs are vectorised elsewhere.
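
The steps above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package implementation: the function name crossnobis_sketch() and the arguments cond (a condition label per row) and fold_rows (a list of row indices per fold) are hypothetical stand-ins for what mvpa_design and cv_spec would supply. The order of the two condition names within each "_vs_" label is also an assumption. Instead of an explicit loop over ordered fold pairs, the sketch uses the identity that the sum over m ≠ n of δ̂_m · δ̂_n equals the sum of all entries of the fold-by-fold Gram matrix minus its diagonal.

```r
# Hypothetical, simplified stand-in for compute_crossnobis_distances_sl().
# sl_data:   N x P numeric matrix (samples x voxels)
# cond:      condition label for each row of sl_data
# fold_rows: list of integer vectors; rows used for each fold's means
# W:         optional P x P whitening matrix
crossnobis_sketch <- function(sl_data, cond, fold_rows, W = NULL) {
  cond <- factor(cond)
  cond_levels <- levels(cond)
  K <- length(cond_levels)
  M <- length(fold_rows)
  stopifnot(is.matrix(sl_data), K >= 2, M >= 2)

  # Per-fold condition means (K x P per fold), whitened if W is given.
  fold_means <- lapply(unname(fold_rows), function(rows) {
    mu <- do.call(rbind, lapply(cond_levels, function(lv) {
      colMeans(sl_data[rows[cond[rows] == lv], , drop = FALSE])
    }))
    if (!is.null(W)) mu <- mu %*% W
    mu
  })
  P <- ncol(fold_means[[1]])

  # combn(K, 2) enumerates pairs in the same order as the lower triangle
  # of a K x K matrix read column by column.
  pairs <- combn(K, 2)
  d <- apply(pairs, 2, function(ij) {
    # M x P matrix whose rows are the fold-wise differences for this pair.
    D <- do.call(rbind, lapply(fold_means, function(mu) mu[ij[1], ] - mu[ij[2], ]))
    G <- tcrossprod(D)  # G[m, n] = sum(delta_m * delta_n)
    # Sum over ordered fold pairs m != n, then normalise.
    (sum(G) - sum(diag(G))) / (M * (M - 1) * P)
  })
  names(d) <- paste0(cond_levels[pairs[1, ]], "_vs_", cond_levels[pairs[2, ]])
  d
}

# Usage: 3 runs, 2 conditions, 4 voxels; each run contributes one fold.
set.seed(1)
runs <- rep(1:3, each = 4)
cond <- rep(c("A", "B"), times = 6)
X <- matrix(rnorm(12 * 4), nrow = 12)
d_example <- crossnobis_sketch(X, cond, split(seq_along(runs), runs))
```

Note that the cross-validated estimate is unbiased only when the noise in the two fold means being multiplied is independent, which is why the usage example assigns disjoint runs to the folds.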

4 Integration into RSA Regression

5 Edge-Case Handling

6 Tests & Validation

  1. Unit tests (tests/testthat)
     • Simulated data with known true pattern differences; verify bias of naïve vs. crossnobis distance.
     • Check invariance to permutation of folds.
     • Confirm equal results when W = I vs. no whitening.
  2. Integration test
     • Run a tiny searchlight with contrast_rsa_model(estimation_method="crossnobis"); ensure no error and reasonable output.
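
The bias check in the first unit test could be prototyped with a small simulation like the one below. It is a sketch under simplifying assumptions, not a test from the package: fold-wise mean patterns are drawn directly as independent unit-variance Gaussian noise, both conditions share the same true pattern (so the true distance is 0), and no whitening is applied (equivalent to W = I). All variable names are illustrative.

```r
set.seed(42)
P <- 20       # voxels
M <- 4        # folds with independent noise
n_rep <- 2000 # simulation repetitions

sim_once <- function() {
  # Fold-wise mean patterns for two conditions with identical true pattern.
  A <- matrix(rnorm(M * P), M, P)
  B <- matrix(rnorm(M * P), M, P)
  D <- A - B  # M x P fold-wise differences

  # Naive estimate: squared distance between fold-averaged patterns.
  naive <- sum(colMeans(D)^2) / P

  # Cross-validated estimate: only products between different folds.
  G <- tcrossprod(D)
  crossval <- (sum(G) - sum(diag(G))) / (M * (M - 1) * P)

  c(naive = naive, crossval = crossval)
}

res <- replicate(n_rep, sim_once())
bias <- rowMeans(res)  # true distance is 0, so the mean estimate is the bias
```

Under these assumptions the naive estimate has expected value 2/M = 0.5, while the cross-validated estimate should average close to 0.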

7 Documentation Updates

8 Migration Steps

  1. Phase 1 – Backend logic
     • Implement compute_crossnobis_distances_sl() + tests.
     • Wire into train_model.contrast_rsa_model().
  2. Phase 2 – API & examples
     • Re-enable constructor option, add parameter whitening_matrix_W.
     • Update searchlight examples.
  3. Phase 3 – Performance optimisation (optional)
     • C++/Rcpp implementation for heavy workloads.

9 Timeline (suggestion)

| Week | Deliverable |
|------|-------------|
| 1 | Helper function + unit tests pass |
| 2 | Integration into train_model + searchlight plumbing |
| 3 | Documentation, vignette, example rebuild |
| 4 | Profiling & optional Rcpp optimisation |



bbuchsbaum/rMVPA documentation built on June 10, 2025, 8:23 p.m.