feature_rsa_model: Create a Feature-Based RSA Model
In bbuchsbaum/rMVPA: Multivoxel Pattern Analysis in R

Description

Creates a model for feature-based Representational Similarity Analysis (RSA) that relates neural patterns (X) to a predefined feature space (F).

Usage

feature_rsa_model(
  dataset,
  design,
  method = c("scca", "pls", "pca", "glmnet"),
  crossval = NULL,
  cache_pca = FALSE,
  alpha = 0.5,
  cv_glmnet = FALSE,
  lambda = NULL,
  nperm = 0,
  permute_by = c("features", "observations"),
  save_distributions = FALSE,
  ...
)

Arguments

`dataset`	An `mvpa_dataset` object containing the neural data (`X`).
`design`	A `feature_rsa_design` object specifying the feature space (`F`) and including the component limit ('max_comps').
`method`	Character string specifying the analysis method. One of:
- scca: Sparse Canonical Correlation Analysis relating X and F.
- pls: Partial Least Squares regression predicting X from F.
- pca: Principal Component Analysis on F, followed by regression predicting X from the PCs.
- glmnet: Elastic net regression predicting X from F using glmnet with multivariate Gaussian response.
`crossval`	Optional cross-validation specification.
`cache_pca`	Logical, if TRUE and method is "pca", cache the PCA decomposition of the feature matrix F across cross-validation folds involving the same training rows. Defaults to FALSE.
`alpha`	Numeric value between 0 and 1, only used when method="glmnet". Controls the elastic net mixing parameter: 1 for lasso, 0 for ridge, values in between for a mixture. Defaults to 0.5 (equal mix of ridge and lasso).
`cv_glmnet`	Logical, if TRUE and method="glmnet", use cv.glmnet to automatically select the optimal lambda value via cross-validation. Defaults to FALSE.
`lambda`	Optional numeric value or sequence of values, only used when method="glmnet" and cv_glmnet=FALSE. Specifies the regularization parameter. If NULL (default), a sequence will be automatically determined by glmnet.
`nperm`	Integer, number of permutations to run for statistical testing of model performance metrics after merging cross-validation folds. Default 0 (no permutation testing).
`permute_by`	DEPRECATED. Permutation is always done by shuffling rows of the predicted matrix.
`save_distributions`	Logical, if TRUE and nperm > 0, save the full null distributions from the permutation test. Defaults to FALSE.
`...`	Additional arguments (currently unused).
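
The `alpha`, `cv_glmnet` and `lambda` arguments correspond to standard glmnet options. The sketch below calls glmnet directly on simulated data to show what those options mean. It assumes the glmnet package is installed, the names `Fmat`, `X` and `B` are hypothetical, and it is not taken from the rMVPA source, so how `feature_rsa_model` passes these values on internally is an assumption.

```r
library(glmnet)

set.seed(1)
n_obs  <- 40   # trials/conditions
n_feat <- 6    # columns of the feature matrix (called Fmat because F is an alias for FALSE in R)
n_vox  <- 5    # columns of the neural data matrix X

# Simulated feature matrix and neural data (illustration only)
Fmat <- matrix(rnorm(n_obs * n_feat), n_obs, n_feat)
B    <- matrix(rnorm(n_feat * n_vox), n_feat, n_vox)
X    <- Fmat %*% B + matrix(rnorm(n_obs * n_vox, sd = 0.5), n_obs, n_vox)

# alpha = 0.5 mixes ridge and lasso; lambda = NULL lets glmnet build its own sequence
fit <- glmnet(Fmat, X, family = "mgaussian", alpha = 0.5, lambda = NULL)

# Analogue of cv_glmnet = TRUE: pick lambda by internal cross-validation
cvfit <- cv.glmnet(Fmat, X, family = "mgaussian", alpha = 0.5, nfolds = 5)

# Predictions at the selected lambda come back as an obs x voxels x 1 array
pred <- predict(cvfit, newx = Fmat, s = "lambda.min")[, , 1]
dim(pred)
```

Setting `alpha = 1` or `alpha = 0` in the calls above gives pure lasso or pure ridge regularization, respectively.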

Details

Feature RSA models analyze how well a feature matrix F (defined in the 'design') relates to neural data X. The 'max_comps' parameter, inherited from the 'design' object, sets an upper limit on the number of components used:
- pca: Performs PCA on F. 'max_comps' limits the number of principal components (selected by variance explained) used to predict X. Actual components used: 'min(max_comps, available_PCs)'.
- pls: Performs PLS regression predicting X from F. 'max_comps' sets the maximum number of PLS components to compute. Actual components used may be fewer based on the PLS algorithm.
- scca: Performs SCCA between X and F. 'max_comps' limits the number of canonical components retained (selected by correlation strength). Actual components used: 'min(max_comps, effective_components)'.
- glmnet: Performs elastic net regression predicting X from F using the glmnet package with multivariate Gaussian response family. The regularization (lambda) can be automatically selected via cross-validation if cv_glmnet=TRUE. The alpha parameter controls the balance between L1 (lasso) and L2 (ridge) regularization.
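
As a rough illustration of the pca route described above, the following base R sketch runs PCA on a training feature matrix, keeps 'min(max_comps, available_PCs)' components, regresses the neural data on those component scores, and predicts held-out patterns. The data are simulated, all variable names are hypothetical, and the least-squares step is an assumption about the kind of regression involved rather than a copy of the package's code.

```r
set.seed(42)
n_train <- 30; n_test <- 10
n_feat  <- 8;  n_vox  <- 5
max_comps <- 3

# Simulated training/test features and training neural data (illustration only)
F_train <- matrix(rnorm(n_train * n_feat), n_train, n_feat)
F_test  <- matrix(rnorm(n_test * n_feat), n_test, n_feat)
B       <- matrix(rnorm(n_feat * n_vox), n_feat, n_vox)
X_train <- F_train %*% B + matrix(rnorm(n_train * n_vox, sd = 0.3), n_train, n_vox)

# PCA on the training features; prcomp orders components by variance explained
pca   <- prcomp(F_train, center = TRUE, scale. = FALSE)
ncomp <- min(max_comps, ncol(pca$x))

# Least-squares regression of X on the retained PC scores (with an intercept)
scores_train <- pca$x[, seq_len(ncomp), drop = FALSE]
beta <- qr.solve(cbind(1, scores_train), X_train)

# Project test features with the training centering and rotation, then predict X
rot         <- pca$rotation[, seq_len(ncomp), drop = FALSE]
scores_test <- sweep(F_test, 2, pca$center) %*% rot
X_pred      <- cbind(1, scores_test) %*% beta
dim(X_pred)
```

Centering the test features with the training means keeps the held-out rows out of the PCA fit, which is the usual practice in cross-validated prediction.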

**Performance Metrics** (computed by 'evaluate_model' after cross-validation):
- 'mean_correlation': Average correlation between predicted and observed patterns for corresponding trials/conditions (diagonal of the prediction-observation correlation matrix).
- 'cor_difference': The 'mean_correlation' minus the average off-diagonal correlation ('mean_correlation' - 'off_diag_correlation'). Measures how much better the model predicts the correct trial/condition compared to incorrect ones.
- 'mean_rank_percentile': Average percentile rank of the diagonal correlations. For each condition, ranks how well the model's prediction correlates with the correct observed pattern compared to incorrect patterns. Values range from 0 to 1, with 0.5 expected by chance and 1 indicating perfect discrimination.
- 'voxel_correlation': Correlation between the vectorized predicted and observed data matrices across all trials and voxels.
- 'mse': Mean Squared Error between predicted and observed values.
- 'r_squared': Proportion of variance in the observed data explained by the predicted data.
- 'p_*', 'z_*': If 'nperm > 0', permutation-based p-values and z-scores for the above metrics, assessing significance against a null distribution generated by shuffling predicted trial labels.
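
To make the metric definitions above concrete, here is a minimal base R sketch that computes them from an observed and a predicted trials-by-voxels matrix. The function name `feature_rsa_metrics_sketch` is hypothetical, and the tie handling in the rank percentile and the grand-mean form of R-squared are assumptions. The package's 'evaluate_model' may differ in these details.

```r
feature_rsa_metrics_sketch <- function(observed, predicted) {
  stopifnot(all(dim(observed) == dim(predicted)))
  n <- nrow(observed)

  # cmat[i, j] = correlation between predicted pattern i and observed pattern j
  cmat     <- cor(t(predicted), t(observed))
  diag_cor <- diag(cmat)
  off_diag <- cmat[row(cmat) != col(cmat)]

  # Per trial: share of incorrect observed patterns that match the prediction
  # less well than the correct one (0.5 expected by chance, 1 is perfect)
  rank_pct <- vapply(seq_len(n), function(i) {
    mean(cmat[i, -i] < cmat[i, i])
  }, numeric(1))

  list(
    mean_correlation     = mean(diag_cor),
    cor_difference       = mean(diag_cor) - mean(off_diag),
    mean_rank_percentile = mean(rank_pct),
    voxel_correlation    = cor(as.vector(predicted), as.vector(observed)),
    mse                  = mean((predicted - observed)^2),
    # Assumed definition: 1 - SSE / SST around the grand mean of the observed data
    r_squared            = 1 - sum((observed - predicted)^2) /
                               sum((observed - mean(observed))^2)
  )
}

set.seed(7)
observed <- matrix(rnorm(12 * 20), 12, 20)   # 12 trials x 20 voxels
noise    <- matrix(rnorm(12 * 20, sd = 2), 12, 20)

perfect <- feature_rsa_metrics_sketch(observed, observed)
noisy   <- feature_rsa_metrics_sketch(observed, observed + noise)
str(perfect)
```

With identical matrices the sketch returns the ceiling values (correlation 1, rank percentile 1, MSE 0), which is a quick sanity check on the definitions.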

The number of components actually used ('ncomp') for the region/searchlight is also included in the performance output.
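
The permutation-based 'p_*' and 'z_*' values can be pictured with the following base R sketch, which shuffles the rows of the predicted matrix to build a null distribution for 'mean_correlation'. The function name `perm_test_sketch`, the one-sided p-value with a +1 correction, and the z-score formula are assumptions chosen for illustration. The package may compute these quantities differently.

```r
perm_test_sketch <- function(observed, predicted, nperm = 200) {
  # Statistic: mean diagonal of the prediction-observation correlation matrix
  stat <- function(pred) mean(diag(cor(t(pred), t(observed))))
  observed_stat <- stat(predicted)

  # Null distribution: break the trial correspondence by shuffling predicted rows
  null <- replicate(
    nperm,
    stat(predicted[sample(nrow(predicted)), , drop = FALSE])
  )

  list(
    stat = observed_stat,
    # One-sided p-value with +1 correction so it can never be exactly zero
    p    = (sum(null >= observed_stat) + 1) / (nperm + 1),
    z    = (observed_stat - mean(null)) / sd(null),
    null = null   # analogous to keeping the full null distribution
  )
}

set.seed(11)
observed  <- matrix(rnorm(15 * 25), 15, 25)                   # 15 trials x 25 voxels
predicted <- observed + matrix(rnorm(15 * 25, sd = 1), 15, 25)

res <- perm_test_sketch(observed, predicted, nperm = 200)
res[c("stat", "p", "z")]
```

Returning the `null` vector mirrors what `save_distributions = TRUE` is described as doing. Dropping it keeps the output small when many regions or searchlights are tested.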