contrasts: Generate Contrast Matrices

View source: R/contrasts.R

contrasts    R Documentation

Description

Creates a numeric contrast matrix for use in RSA or encoding models, based on condition labels and a specification.

Usage

contrasts(
  labels = NULL,
  spec,
  metadata = NULL,
  data = NULL,
  centre = TRUE,
  scale = c("none", "sd", "l2"),
  orth = FALSE,
  keep_attr = TRUE
)

Arguments

labels

Character vector. Required if 'metadata' is NULL. Specifies all unique condition labels in the desired order for the rows of the contrast matrix.

spec

Formula. Defines the contrasts. If 'metadata' is NULL, uses the mini-DSL (see Details). If 'metadata' is provided, uses standard R formula syntax referencing columns in 'metadata' (excluding the 'label' column).

metadata

Optional tibble/data.frame. If provided, it must contain a 'label' column matching the conditions, plus other columns representing the features or factors used in the 'spec' formula. The 'labels' argument is ignored if 'metadata' is provided.

data

Ignored in this version. Reserved for future extensions allowing direct input of feature matrices or RDMs for PCA/MDS contrasts.

centre

Logical. If TRUE (default), columns of the resulting matrix are mean-centered.

scale

Character string specifying the scaling method applied after centering (only if 'orth = FALSE'). Options: '"none"' (default), '"sd"' (divide by the sample standard deviation), '"l2"' (divide by the L2, i.e. Euclidean, norm to get unit vectors). This argument is *ignored* if 'orth = TRUE'.

orth

Logical. If FALSE (default), the matrix columns represent the specified contrasts directly (after centering/scaling). If TRUE, an orthonormal basis for the column space is computed via QR decomposition. Resulting columns will be orthogonal and have unit length (L2 norm = 1).

keep_attr

Logical. If TRUE (default) and 'orth = TRUE', the original column names (before orthogonalization) are stored in 'attr(C, "source")', and the names of any columns dropped due to rank deficiency are stored in 'attr(C, "dropped")'.

Details

This function provides two main ways to define contrasts:

  1. Via a 'labels' vector and a 'spec' formula using a mini-DSL like '~ factor1(levelA + levelB ~ levelC + .) + factor2(...)'.

  2. Via a 'metadata' tibble (containing condition labels and predictor columns) and a standard R formula 'spec' (e.g., '~ pred1 + pred2 + pred1:pred2').

The function automatically handles centering, scaling, and optional orthogonalization.
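
The metadata route can be pictured with base R alone. The following is a minimal sketch, not the package's implementation: the small 'meta' table is made up for illustration, and dropping the intercept column is an assumption of this sketch.

```r
# Illustrative metadata table (hypothetical values, four conditions).
meta <- data.frame(
  label = c("faces", "plants", "tools", "vehicles"),
  anim  = c(1, 1, 0, 0),
  size  = c(0, 1, 0, 1)
)

# Expand a standard R formula into numeric predictor columns.
# Dropping the intercept column is an assumption made for this sketch.
X <- model.matrix(~ anim + size + anim:size, data = meta)
X <- X[, colnames(X) != "(Intercept)", drop = FALSE]
rownames(X) <- meta$label

# Mean-centre each column, mirroring 'centre = TRUE'.
X <- sweep(X, 2, colMeans(X))
print(round(colMeans(X), 10))
```

Rows follow the order of 'meta$label', and the interaction column is the product of the two predictor columns before centering.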

**Mini-DSL for 'spec' (when 'metadata' is NULL):** The formula should be of the form '~ name1(levelsA ~ levelsB) + name2(...)'.

  • 'name1', 'name2', etc., become the factor/contrast names. Each one generates an initial signed indicator column coded +1/-1/0.

  • 'levelsA' are condition labels (from 'labels' argument) separated by '+'. These get coded +1 for the named factor.

  • 'levelsB' are condition labels separated by '+', or '.' (period). These get coded -1 for the named factor. '.' means "all labels not listed in 'levelsA'".

  • Labels not mentioned in a factor definition get coded 0 for that factor.

  • Interaction terms (e.g., 'factorName1:factorName2') can be included in 'spec'. These are passed to 'model.matrix' which computes them based on the previously generated factor columns.

If 'centre = TRUE' (default), the resulting columns from 'model.matrix' are mean-centered. A DSL column coded +1/-1/0 already has zero mean when its +1 and -1 groups contain the same number of labels. The explicit centering step guarantees zero-mean columns regardless of input or balance.
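
The +1/-1/0 coding and the effect of centering can be sketched in base R. The helper 'code_term' below is hypothetical and only illustrates the rules listed above; it does not parse formulas and is not the package's parser.

```r
# Hypothetical helper (not part of rMVPA): code one DSL term as +1/-1/0.
# A 'minus' value of "." is taken to mean "all labels not listed in plus".
code_term <- function(labels, plus, minus = ".") {
  if (identical(minus, ".")) minus <- setdiff(labels, plus)
  col <- numeric(length(labels))   # unmentioned labels stay 0
  names(col) <- labels
  col[plus]  <- 1
  col[minus] <- -1
  col
}

labs <- c("faces", "animals", "plants", "tools")

# One-vs-rest term, analogous to 'faces(faces ~ .)'.
faces_raw <- code_term(labs, plus = "faces")

# The groups are unbalanced (1 vs 3), so the raw mean is not zero ...
print(mean(faces_raw))

# ... but it is after explicit mean-centering.
faces_centred <- faces_raw - mean(faces_raw)
print(mean(faces_centred))
```

With one label coded +1 and three coded -1, the raw column mean is -0.5, which is why the centering step matters for unbalanced terms.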

Value

A numeric matrix (K x Q), where K is the number of labels and Q is the number of contrasts/orthogonal components. If 'orth = TRUE' and 'keep_attr = TRUE', the matrix carries a '"source"' attribute naming the original columns and, if any columns were dropped due to rank deficiency, a '"dropped"' attribute naming them.

Orthogonalization

If 'orth = TRUE', uses 'qr.Q(qr(C))' to find an orthonormal basis. The number of columns in the output equals the rank of the input matrix. Columns are renamed 'Orth1', 'Orth2', etc. Scaling is ignored because the columns already have unit L2 norm. If 'keep_attr = TRUE', 'attr(C_orth, "source")' stores the names of the original columns that formed the basis for the orthogonalized matrix, and 'attr(C_orth, "dropped")' stores the names of any original columns that were linearly dependent and thus not part of the basis.
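
As a rough illustration of this step, the base R sketch below orthogonalizes a small rank-deficient matrix. How it selects the kept and dropped columns (via the rank and pivot reported by 'qr()') is an assumption of the sketch; the package may track them differently.

```r
# Three centred columns; 'c' equals a + b, so the matrix has rank 2.
a <- c(1, 1, -1, -1)
b <- c(1, -1, 1, -1)
C <- cbind(a = a, b = b, c = a + b)

qrC <- qr(C)
r   <- qrC$rank                               # numerical rank
Q   <- qr.Q(qrC)[, seq_len(r), drop = FALSE]  # orthonormal basis of the column space
colnames(Q) <- paste0("Orth", seq_len(r))

kept    <- colnames(C)[qrC$pivot[seq_len(r)]] # columns that formed the basis
dropped <- setdiff(colnames(C), kept)         # linearly dependent columns

print(round(crossprod(Q), 10))                # identity matrix: orthonormal columns
print(dropped)
```

The output has as many columns as the rank of the input, and the cross-product of the basis with itself is the identity matrix.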

Scaling

Applied *after* centering if 'orth=FALSE'.

  • '"none"': No scaling.

  • '"sd"': 'scale(..., center=FALSE, scale=TRUE)'. Uses the sample standard deviation (N-1 denominator). With 'center=FALSE', 'scale()' divides by the root mean square 'sqrt(sum(x^2)/(N-1))', which equals the sample SD only when the column is already mean-centered. Note that for columns with few unique values (e.g., a centered +/-1 contrast), the SD can differ slightly depending on whether the number of items is even or odd, because of the N-1 denominator. This might lead to minor differences in scaled norms.

  • '"l2"': Divides each column by its L2 norm ('sqrt(sum(x^2))').
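
The two non-trivial scaling options can be compared on a single centered column. This is a minimal base R sketch of the arithmetic described above, not the package's code; the example column is made up.

```r
x <- c(1, -1, -1, -1, -1)    # raw one-vs-rest column, N = 5
x <- x - mean(x)             # centre first

# "sd": divide by the sample standard deviation (N - 1 denominator).
# On an already-centred column this matches scale(x, center = FALSE, scale = TRUE).
x_sd <- x / sd(x)

# "l2": divide by the Euclidean norm.
x_l2 <- x / sqrt(sum(x^2))

print(sd(x_sd))              # unit standard deviation
print(sqrt(sum(x_l2^2)))     # unit L2 norm
print(sqrt(sum(x_sd^2)))     # sqrt(N - 1) for a centred column
```

After '"sd"' scaling, the L2 norm of a centered column is 'sqrt(N - 1)', so the two options differ only by a constant factor once the column is centered.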

Specific Behaviors

  • If 'orth = TRUE' and the input matrix has only one column after potential centering, that column is scaled to unit L2 norm. Centering still depends on the 'centre' argument.

  • If 'centre = FALSE' and 'orth = TRUE', the QR decomposition is performed on the *uncentered* columns.

  • If the mini-DSL '.' notation is used for 'levelsB' and 'levelsA' already contains all 'labels', 'levelsB' becomes empty, potentially resulting in a constant (zero) column before centering. A warning is issued in this case.

Masking

This function masks the 'stats::contrasts' function. To use the base R function, explicitly call 'stats::contrasts()'.
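
A namespace-qualified call reaches the base R function regardless of which packages are attached. The factor below is made up for illustration.

```r
f <- factor(c("low", "mid", "high"), levels = c("low", "mid", "high"))

# Explicit 'stats::' prefix: always dispatches to the base R function,
# even when another attached package defines its own 'contrasts'.
base_contr <- stats::contrasts(f)
print(base_contr)
```

For a three-level factor this returns a 3 x 2 coding matrix with one row per level.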

See Also

'transform_contrasts()', 'make_feature_contrasts()', 'stats::contrasts()'

Examples

labs <- c("faces","animals","plants","tools",
          "vehicles","furniture","buildings","food")

# 1) Mini-DSL: 2x2 Factorial (Animacy x Size) + Interaction, Orthonormal
C1 <- contrasts(
        labels = labs,
        spec   = ~ anim( faces + animals + plants + food ~ . )
                 + size( faces + animals + tools + furniture ~ . )
                 + anim:size,
        orth   = TRUE)
print(colnames(C1))            # Orth1, Orth2, ...
print(attr(C1, "source"))      # Original column names behind the basis
print(round(crossprod(C1), 5)) # Should be the identity matrix

# 2) Mini-DSL: One-vs-rest, Centered, Unit Length (L2)
C2 <- contrasts(labels = labs,
                spec   = ~ faces( faces ~ . ) + tools( tools ~ . ),
                scale = "l2")
print(round(colSums(C2^2), 5)) # Should be 1

# 3) Metadata + Formula: Centered, Scaled (SD)
meta <- tibble::tribble(
  ~label,      ~anim, ~size,
  "faces",        1,    0,
  "animals",      1,    0,
  "plants",       1,    1,
  "tools",        0,    0,
  "vehicles",     0,    1,
  "furniture",    0,    0,
  "buildings",    0,    1,
  "food",         1,    1)
# Note: labels argument is ignored here, order comes from meta$label
# Also note: This function masks stats::contrasts
C3 <- contrasts(metadata = meta,
                spec     = ~ anim + size + anim:size,
                scale    = "sd")
print(round(colMeans(C3), 5)) # Should be 0
print(round(apply(C3, 2, sd), 5)) # Should be 1

bbuchsbaum/rMVPA documentation built on June 10, 2025, 8:23 p.m.