library(rmarkdown)
library(knitr)
opts_chunk$set(fig.width=9, fig.height=6, fig.path='Figs/',
               warning=FALSE, message=FALSE)
zscore<-TRUE

Introduction

Decoding from blood oxygenation level-dependent (BOLD) magnetic resonance imaging (MRI) data remains an esoteric art. Relatively few public datasets exist, and these are rarely accompanied by accessible, validated, general-purpose decoding methodology. This package documents some relatively standard approaches to processing BOLD data and follows them with a few novel extensions. These are implemented in R, its packages, and ANTsR.

RKRNS Algorithms

This package integrates several frameworks for BOLD processing.

In combination, these tools enable one to go from near-raw medical imaging data to a BOLD decoding experiment.

Data organization expected by RKRNS

RKRNS makes several assumptions about data organization.

Contributions of the package

RKRNS includes the components demonstrated in the remainder of this document.

Decoding in simulated data

Here, we outline test data similar to what one may acquire in a real event-related design. Unlike real data, however, this data requires no motion correction.

Example BOLD Data

We adapted methods from the neuRosim package (@neuRosim) in order to simulate event-related BOLD data. This data involves a 4-class decoding task: one seeks to identify brain responses to novel and famous faces, each at their first and second presentation to the subject. We distribute the underlying anatomical BOLD data along with a cortical mask within the package. The data may be loaded by:

library(RKRNS)
fn<-paste(path.package("RKRNS"),"/extdata/111157_mocoref_masked.nii.gz",sep="") 
eximg<-antsImageRead(fn,3)
fn<-paste(path.package("RKRNS"),"/extdata/subaal.nii.gz",sep="") 
mask<-antsImageRead(fn,3)

which gives you both the cortical mask and the example BOLD signal image eximg. ANTsR also provides AAL label names (@Tzourio-Mazoyer2002) via:

data(aal,package='ANTsR')
labs<-1:90

with cortical labels indexed by labs.
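For example, one can print a few of the cortical label names (the aal$label_name column is also used later in this document):

print( aal$label_name[ labs[1:5] ] )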

Run the simulation

We generate simulated BOLD data as in neuRosim.
The simulateBOLD function returns a BOLD image of class antsImage. We subsequently convert this image to an $n\times p$ matrix with $n$ time volumes and $p$ voxels in the mask.

bb<-simulateBOLD(option="henson",eximg=eximg,mask=mask)
boldImage<-bb$simbold
mat<-timeseries2matrix( boldImage, bb$mask )

The object returned by simulateBOLD also contains the design matrix. One should inspect that matrix to get an idea of the structure expected by RKRNS; both row names and column names are expected to exist in the design matrix data frame.
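A minimal way to inspect that structure, using base R accessors on the simulated design:

# rows index time volumes; columns include the stimulus classes and
# run information (bb$desmat$Run) used later in this document
print( head( bb$desmat ) )
print( colnames( bb$desmat ) )
print( rownames( bb$desmat )[1:5] )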

Simulated BOLD time series

Perform some diagnostic visualizations on the data.

randvox<-sample(1:ncol(mat),4)
plot(ts(mat[,randvox]))

The time series is noisy, as expected.

Frequency filtering

Let's remove high-frequency noise.

# only filter high frequencies 1/(2*tr)
filtmat<-filterfMRI4KRNS( data.matrix(mat), tr=1, NA, NA, trendfrequency=2, trendfrequency2=NA )
plot(ts(filtmat[,randvox]))

We can eliminate low frequency contamination with polynomial regressors and other denoising techniques at a later step.
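As a sketch of that later step, slow drift can be modeled with polynomial regressors and removed by ordinary regression; the degree here is illustrative and matches the polynomial degree used below:

# regress polynomial drift terms out of each voxel and keep the residuals
polynuis<-stats::poly( 1:nrow(filtmat), degree=4 )
detrended<-residuals( lm( filtmat ~ polynuis ) )
plot( ts( detrended[,randvox] ) )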

Autoregression Correction

armat<-arCorrection( mat )
print(armat$arcoefs)
plot(ts(armat$outmat[,randvox]))

The values in armat$arcoefs indicate the autocorrelation level of the input data; arCorrection shrinks these levels toward zero.
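One can verify the shrinkage directly by comparing the lag-one autocorrelation of a sample voxel before and after correction:

# lag-1 autocorrelation before and after AR correction
v<-randvox[1]
print( c( before=acf( mat[,v], plot=FALSE )$acf[2],
          after=acf( armat$outmat[,v], plot=FALSE )$acf[2] ) )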

Estimate the hemodynamic response function

Use glmDenoiseR to estimate a finite impulse response (FIR) HRF.

mypolydegree<-4
glmda<-glmDenoiseR( data.matrix(mat), bb$desmat[,1:4], hrfBasis=NA, 
  hrfShifts = 12, maxnoisepreds=0 , selectionthresh=0.1 , 
  collapsedesign=TRUE, polydegree=mypolydegree, baseshift=0 )

Visualize the HRF

Take a quick look at the estimated hemodynamic response function.

plot(ts(glmda$hrf))

If this does not look like a typical HRF, there may be a problem. One might also perform an SVD on the run-wise HRFs to derive a more flexible basis set for a given study.
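As a rough, illustrative sanity check (not part of the RKRNS pipeline), one can overlay the estimate on the canonical double-gamma HRF from neuRosim, peak-normalizing both:

library(neuRosim)
canonical<-canonicalHRF( 0:( length(glmda$hrf) - 1 ) )
plot( ts( cbind( estimated=glmda$hrf/max(glmda$hrf),
                 canonical=canonical/max(canonical) ) ) )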

Estimate $\beta$ values for each event

We estimate a $\beta^i$ vector for each of $i \in 1 \cdots q$ events. This vector contains a scalar $\beta$ value for each voxel, i.e. $\beta_j^i$ where $j \in 1 \cdots p$. This example can be found via ?bold2betas.

runs<-bb$desmat$Run; 
btsc<-bold2betas( boldmatrix=data.matrix(mat), 
      designmatrix=bb$desmat[,1:4], baseshift=0,
      blockNumb=runs, maxnoisepreds=0, hrfBasis=glmda$hrf,
      hrfShifts=0, polydegree=mypolydegree, selectionthresh=0.2 )
if ( zscore ) {  # z-score each event's beta pattern across voxels
  btsc$eventbetas<-as.data.frame( t( scale( t( btsc$eventbetas ) ) ) )
}
# derive class labels from the first two characters of the event row names
mylabs<-as.factor( substr( rownames(btsc$eventbetas), 1, 2 ) )

Estimate $\beta$ values for each event: 2

We recommend inspecting the estimated hemodynamic response function in order to help set parameters; it is available as output from bold2betas (and glmDenoiseR). Key parameters are the basis length for HRF estimation, the maximum number of noise predictors (e.g. maxnoisepreds=4, or maxnoisepreds=2:10 to search a range), the polynomial degree, and selectionthresh, which should be kept below 0.5. Real data processing should proceed similarly. The latter part of the sample code extracts class labels from the row names of the output event beta data frame; this sets us up for decoding.
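For instance, one might rerun the denoising step while searching over a range of noise predictor counts and then re-inspect the HRF (a sketch reusing the inputs from above):

glmdaSearch<-glmDenoiseR( data.matrix(mat), bb$desmat[,1:4], hrfBasis=NA,
  hrfShifts = 12, maxnoisepreds=2:10, selectionthresh=0.1,
  collapsedesign=TRUE, polydegree=mypolydegree, baseshift=0 )
plot( ts( glmdaSearch$hrf ) )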

Decode from event-wise $\beta$

Use effect size to select a subset of the full voxel matrix.

library(e1071) # provides svm
# sample events for training
inds<-sample(1:nrow(btsc$eventbetas),size=round(nrow(btsc$eventbetas)*3./4.))
# basic voxel selection using effect size & classification
zz<-apply(btsc$eventbetas,FUN=mean,MARGIN=2)
zze<-zz/apply(btsc$eventbetas,FUN=sd,MARGIN=2)
th<-0.4
ff<-which( abs(zze) > th )
mydf<-data.frame( lab=mylabs,  vox=data.matrix(btsc$eventbetas)[,ff])
mdl<-svm( lab ~., data=mydf[inds,])
# proportion of correctly classified held-out events
err<-sum(mydf[-inds,]$lab==predict( mdl, newdata=mydf[-inds,]))/nrow(mydf[-inds,])
print(paste("NPredVox",length(ff),"Correct",err*100))

The "high effect size" voxels are sent to a naive support vector machine to do training (on a fraction of the data) and testing on the left out data.

Cross-validated effect size

Effect size is large when a measurement has a high mean value and low variance. The effect size of cross-validated beta estimates may aid voxel selection.

# ccnuis<-compcor( mat, 2 ) # a denoising option not used here ....
stb<-stableEventResponse( mat,  bb$desmat[,1:4], bb$desmat$Run,  
            polydegree=mypolydegree, hrf=btsc$hrf )

Voxels of clustered effect size

Use a threshold to find clusters of voxels with large effect.

stb2<-abs(stb)
stb2[ stb2 < 5 ]<-0
ee<-eigSeg(mask, matrixToImages( stb2, mask) )
ImageMath(3,ee,'ClusterThresholdVariate',ee,mask,5)
ff<-which( ee[mask==1] > 0 )
ofn<-"Figs/eigseg.jpg"
plotANTsImage( eximg, list(ee), slices='12x56x1',
  thresh='1x4',color='red',outname=ofn)

The eigSeg function labels each region of the mask with the class that has the largest effect size.
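A quick tabulation shows how many in-mask voxels each class claims (0 marks voxels assigned to no cluster):

print( table( ee[ mask == 1 ] ) )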

FIXME: eigSeg should also return the magnitude of the effects; the colors in the figure above need fixing.

eigSeg Viz

Decode with effect size clusters

Use only those voxels that exist within clusters and have maximum response for some label class.

mydf<-data.frame( lab=mylabs,  vox=data.matrix(btsc$eventbetas)[,ff])
mdl<-svm( lab ~., data=mydf[inds,])
err<-sum(mydf[-inds,]$lab==predict( mdl, newdata=mydf[-inds,]))/nrow(mydf[-inds,])
print(paste("EigSegPredVox",length(ff),"Correct",err*100))

Time series per stimulus

Investigate the time series for each stimulus.

stimbold<-list()
for ( k in 1:4 ) {  # mean in-mask time series for each eigSeg class
  ff<-which( ee[mask==1] == k )
  stimbold[[ paste("class",k,sep="") ]]<-ts( scale( rowMeans( mat[,ff] ) ) )
}
plot( do.call( cbind, stimbold ) )

Sparse canonical correlation between BOLD and stimuli

CCA maximizes $PearsonCorrelation( XW^T, ZY^T )$, where $X$ is the event-by-voxel $\beta$ matrix, $Z$ is the corresponding design matrix, and $W, Y$ are the canonical weight matrices. CCA optimizes the matrices $W, Y$ operating on $X, Z$ to find a low-dimensional representation of the data pair $( X , Z )$ in which correlation is maximal. Following ideas outlined in @Dhillon2014 and @Avants2014, this method can be extended with sparsity constraints that yield rows of $W, Y$ with a controllable number of non-zero entries.
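Schematically, each component pair solves

$$\max_{W_i, Y_i} \; PearsonCorrelation( X W_i^T, Z Y_i^T ) \quad \text{subject to} \quad \Vert W_i \Vert_0 \leq s_1, \; \Vert Y_i \Vert_0 \leq s_2,$$

where $W_i, Y_i$ are rows of $W, Y$ and $s_1, s_2$ control the number of non-zero entries (compare the sparseness argument to sparseDecom2 below).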

SCCAN voxel selection & classification

Set up the CCA by pairing the beta matrix with the design matrix.

ccamats<-list( data.matrix(btsc$eventbetas)[inds,] , 
               data.matrix(bb$desmat[btsc$eventrows,1:4])[inds,] )

Initialize SCCAN

Use the SVD of the beta matrix to initialize sparse CCA.

initcca<-t( svd( btsc$eventbetas, nu=0, nv=10 )$v )
initcca<-initializeEigenanatomy( initcca, mask=mask, nreps=1 )$initlist
nv<-length(initcca)

These nv = 10 vectors initialize the sparse optimizer at a good starting point.

Supervised clustering with SCCAN

The design matrix guides the dimensionality reduction performed on the $\beta$ matrix.

mycca<-sparseDecom2( inmatrix=ccamats, initializationList=initcca,
  sparseness=c( -0.001, -0.95 ), nvecs=nv, its=10, cthresh=c(250,0),
  uselong=0, smooth=0.0, mycoption=1, inmask=c(mask,NA) )
ccaout<-(data.matrix(imageListToMatrix( mycca$eig1, mask )))
ff<-which(colSums(abs(ccaout))>1.e-4)

We also count the non-zero voxels, which cover 100*length(ff)/ncol(mat) percent of the brain.
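This coverage can be printed directly:

print( paste( "non-zero CCA voxels:", length(ff), "=",
  100*length(ff)/ncol(mat), "percent of the mask" ) )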

Predictive model

Given the CCA solution matrix $W$, one may employ the low-dimensional representation, $XW^T$, in multi-label classification. Currently, we employ SVMs or random forests as multi-label learners for the problem:

$$L_i = f( XW^T ),$$

that is, learning a (sentence) label function from the BOLD data, where $L_i$ denotes the label of event $i$.

Predictive model: 2

mydf<-data.frame( lab=mylabs,  
  vox=data.matrix(btsc$eventbetas) %*% t(ccaout) )
mdl<-svm( lab ~., data=mydf[inds,])
err<-sum(mydf[-inds,]$lab==predict( mdl, newdata=mydf[-inds,]))/nrow(mydf[-inds,])
print(paste("CCA: Correct",err*100))
# here is another approach ... use cca to transform BOLD signal to stimuli 
mydf3<-data.frame( lab=mylabs, 
  vox=data.matrix(btsc$eventbetas) %*% t(ccaout) %*% t(mycca$eig2) )
mdl2<-svm( lab ~., data=mydf3[inds,])
err<-sum(mydf3[-inds,]$lab==predict( mdl2, newdata=mydf3[-inds,]))/nrow(mydf3[-inds,])
print(paste("CCA2: Correct",err*100))

The input predictors are both clustered and sparse.
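The predictive model section above also mentions random forests. Here is a minimal sketch using the randomForest package (an assumption: it is installed, as it is not used elsewhere in this document) on the same low-dimensional representation:

library(randomForest)
rfmdl<-randomForest( lab ~., data=mydf[inds,] )
rferr<-sum(mydf[-inds,]$lab==predict( rfmdl, newdata=mydf[-inds,]))/nrow(mydf[-inds,])
print(paste("RF: Correct",rferr*100))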

Quick look at CCA results

Rescale the CCA results and visualize them.

for ( img in mycca$eig1 ) {
  img[mask==1]<-abs(img[mask==1])
  img[mask==1]<-img[mask==1]/max(img[mask==1])
}
ofn<-"Figs/temp.jpg"
plotANTsImage( eximg, mycca$eig1, slices='12x56x1' , 
 thresh='0.01x1.1', color=rainbow(nv), outname=ofn)

CCA results in image space

Interpreting SCCAN results

Confusion matrix

A heatmap of the co-occurrence of predicted and true labels summarizes where the classifier succeeds and where it fails.

library(caret) 
truth<-mydf[-inds,]$lab
pred<-predict( mdl, newdata=mydf[-inds,])
xtab <- table(pred, truth)
cm<-confusionMatrix(xtab)
print(cm$table)

Confusion matrix 2

library(pheatmap)
pheatmap(cm$table, cluster_rows = FALSE, cluster_cols = FALSE)

Anatomical coordinates of SCCAN predictors

ANTs propagates AAL labels to the cortex (@Avants2014a, @Tustison2014).

fn<-paste(path.package("RKRNS"),"/extdata/111157_aal.nii.gz",sep="") 
aalimg<-antsImageRead(fn,3)
ccaanat<-list()
for ( img in mycca$eig1 ) {
  nzind<-img[ mask == 1 ] > 0
  aalvals<-aalimg[ mask == 1 ][ nzind ]
  # most frequent AAL label among this component's non-zero voxels
  aalmax<-which.max( hist( aalvals, breaks=1:90, plot=FALSE )$counts )+1
  ccaanat[[ length(ccaanat)+1 ]]<-aal$label_name[aalmax]
}

The SCCAN predictors include the anatomical region names collected in unlist(ccaanat).

How much of the known network do we actually find?

Associating classes to SCCAN predictors

Recalling that CCA maximizes $PearsonCorrelation( XW^T, ZY^T )$, we can study the matrix $Y$, which contrasts or combines columns of the design matrix.

rownames(mycca$eig2)<-levels(mylabs)
temp<-(mycca$eig2)
temp[ abs(mycca$eig2) < 0.03 ]<-0

Associating classes to SCCAN predictors: 2

pheatmap(temp)

Conclusions

With the current RKRNS, one may carry out each of the steps demonstrated above, from simulation through decoding and interpretation.

The package needs evaluation at this level of detail on real data.

See RKRNS for all source code and documentation and RKRNS-talk for HTML slides.

Real data processing with RKRNS

We recommend that you motion correct the data before you use RKRNS tools. This should be done once and done well. Good approaches exist in ANTsR which yield both motion matrices and relevant summary measurements such as framewise displacement (FD) and DVARS. See ?antsPreprocessfMRI for a simplified utility function. This function could be used on each run of an experiment and the results stored in an organized fashion for later use.

Motion correction

To motion correct the data, one might run:

# get an average image
averageImage <- getAverageOfTimeSeries( boldImage )
motionCorrectionResults <- motion_correction(boldImage, 
   fixed = averageImage, moreaccurate = 0 )

Set the moreaccurate flag to 1 or 2 for usable (not merely test) results. You might also estimate FD and DVARS from these results, or use antsPreprocessfMRI to obtain them directly. Note, however, that this function's frequency filtering should be turned off if you want to do decoding.

FD and DVARS

Framewise displacement (FD) summarizes volume-to-volume head motion derived from the estimated motion parameters, while DVARS measures the root-mean-square change in BOLD signal between consecutive volumes; both help flag time points corrupted by motion.
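As a minimal, package-free sketch, DVARS can be computed from any motion-corrected time-by-voxel matrix, such as mat from the simulation above:

# DVARS: root-mean-square signal change between consecutive volumes
dvars<-c( 0, sqrt( rowMeans( diff( data.matrix(mat) )^2 ) ) )
plot( ts( dvars ) )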

References


