library( rcissus )
runk = FALSE # controls whether we actually train the models and report results
# per-label Dice overlap between two label images (or label vectors):
# dice = 2 * |x == l & y == l| / ( |x == l| + |y == l| ) for each label l
diceOverlap <- function( x, y ) {
  ulabs = sort( unique( c( unique( x ), unique( y ) ) ) )
  dicedf = data.frame( labels = ulabs, dice = rep( NA, length( ulabs ) ) )
  for ( ct in seq_along( ulabs ) ) {
    denom = sum( x == ulabs[ ct ] ) + sum( y == ulabs[ ct ] )
    dd = sum( x == ulabs[ ct ] & y == ulabs[ ct ] ) * 2
    dicedf[ ct, 'dice' ] = dd / denom
  }
  return( dicedf )
}
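# toy usage: per-label Dice on plain label vectors (the same call works
# voxelwise on label images, as in the segmentation example below)
diceOverlap( c( 0, 1, 1, 2 ), c( 0, 1, 2, 2 ) )
# -> labels 0, 1, 2 with dice 1.000, 0.667, 0.667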

RCISSUS

After Narcissus, who sought his reflection.

Background and motivation

Rcissus provides fast eigenpatch-based deep learning. As a package, it is valuable for both segmentation and image translation (mapping intensities between imaging modalities). However, the framework is extensible to many applications beyond those covered here.

Rcissus uses an eigenvector representation of image patches in order to allow rapid and generalizable training from small $n$ datasets. The representation is based on RIPMMARC and RIPMMARC-POP. The core idea is to use patch-based dimensionality reduction ( e.g. [@Dhillon2014; @Avants2014; @Kandel2015] ) to simplify the image representation and avoid the need for convolutional networks. Each image patch is compressed to a k-dimensional basis representation.
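To make the patch compression concrete, here is a minimal base-R sketch under illustrative assumptions (a random stand-in image; rcBasis performs the analogous step on real images): sample patches, learn an eigenbasis via SVD, and represent each patch by its k projection coefficients.

# conceptual sketch of eigenpatch compression in base R (not rcissus internals)
set.seed( 1 )
img = matrix( rnorm( 64 * 64 ), 64, 64 )   # stand-in "image" for illustration
p = 7      # patch width
n = 500    # number of sampled patches
patches = matrix( NA, n, p * p )
for ( s in 1:n ) {
  i = sample( 1:( 64 - p ), 1 )
  j = sample( 1:( 64 - p ), 1 )
  patches[ s, ] = as.numeric( img[ i:( i + p - 1 ), j:( j + p - 1 ) ] )
}
k = 4                                       # basis size
basis = svd( scale( patches, scale = FALSE ) )$v[ , 1:k ]  # p*p x k eigenpatches
coeffs = patches %*% basis                  # each patch becomes k coefficients
dim( coeffs )  # n x k: training size scales with patches and k, not image count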

With this strategy, the model training instances -- and their size -- are determined by the number of patches and the size of the basis, not the number of image examples. This makes the approach computationally fast, memory efficient, and relevant to much smaller datasets, potentially reducing the need for augmentation and high-performance GPUs. These models run fairly efficiently on CPUs. Information scale is controlled by both the patch size and the number of eigenvectors used for the patch-based representation.

In the examples below, the number of epochs is limited to keep the build quick; one should increase it to get better results. The keras-based models are much more flexible than those provided by h2o and will likely serve as the foundation for further development.

Regression

One source modality

An example predicting raw intensity from gradient images. The examples here are 2D but the same code should work in 3D.

First, we organize the input training dataset.

if ( ! exists( "mth" ) ) mth = 'h2o'
# to use a custom python install and keras models
# mth='kerasdnn'
# Sys.setenv(TENSORFLOW_PYTHON='/usr/local/bin/python3')
popfns = getANTsRData('show')[1:5]
testfn = getANTsRData('show')[6]
# below, we use image lists, but one can alternatively replace the lists
# with vectors of filenames - just do so consistently
popGT = list( )  # ground truth
popGR = list( )  # gradient
masks = list( )  # sample masks
nsam = 5000 # samples
################################### train features below
for ( i in 1:length( popfns ) ) {
  popGT[[ i ]] = antsImageRead( getANTsRData( popfns[ i ] ) )
  popGR[[ i ]] = iMath( popGT[[ i ]], "Grad", 1 )
  masks[[ i ]] = randomMask( thresholdImage( popGT[[ i ]], 1 , 255  ) , nsam )
  }

Use the high-level interface to run the study's training component.

# define model hyperparameters and train; out-of-sample prediction follows
nBases = 4
nBasesSeg = 6
nEpochs = 125
hdn = c( 200, 200 ) # hidden layers: two layers of 200 units each
dmdl = rcTrainTranslation( popGR, popGT, masks, mdlMethod = mth,
  nBases = nBases, patchRadius = 4, nsamples = nsam,
  hidden = hdn,  epochs = nEpochs )
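
The returned model object carries the patch parameters that the later, lower-level steps reuse; inspecting it shows what is available.

names( dmdl )  # includes patchRadius and meanCenter, which are reused below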

Out of sample prediction.

oosMask =  getANTsRData( testfn ) %>% antsImageRead(  ) %>% getMask()
oos = getANTsRData( testfn ) %>% antsImageRead(  ) %>% iMath("Grad", 1 )
translatedImages = rcTranslate( x = oos, rcmdl = dmdl, mask = oosMask, classification = FALSE  )
invisible( plot( translatedImages[[ 1 ]] ) )
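
A quick sanity check, offered here as a hedged suggestion rather than part of the original pipeline: correlate the translated image with the true intensities inside the mask.

# correlation between translated and true intensity within the mask
truth = antsImageRead( getANTsRData( testfn ) )
cor( translatedImages[[ 1 ]][ oosMask == 1 ], truth[ oosMask == 1 ] )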

Two source modalities

An example predicting raw intensity from two "modalities": gradient and laplacian images.
We have to go deeper into the guts of the method to implement this special case.

# step 1 - collect data
library( rcissus )
popfns = getANTsRData('show')[1:5]
testfn = getANTsRData('show')[6]
# below, we use image lists, but one can alternatively replace the lists
# with vectors of filenames - just do so consistently
popGT = list( )  # ground truth
popGR = list( )  # gradient
popLA = list( )  # laplacian
masks = list( )  # sample masks
mc = FALSE
################################### train features below
for ( i in 1:length( popfns ) ) {
  popGT[[ i ]] = antsImageRead( getANTsRData( popfns[ i ] ) )
  popGR[[ i ]] = iMath( popGT[[ i ]], "Grad", 1 )
  popLA[[ i ]] = iMath( popGT[[ i ]], "Laplacian", 1 )
  masks[[ i ]] = randomMask( thresholdImage( popGT[[ i ]], 1 , 255  ) , nsam )
  }
################################### critical step - build the basis set from both features
trnBas = rcBasis( lappend( popGR, popLA  ),
  patchRadius = dmdl$patchRadius, meanCenter = dmdl$meanCenter )
trnBas$basisMat = trnBas$basisMat[ 1:nBases,  ] # select basis vectors
myseeds = seq_along( popGT )
################################### project features to the basis
trnMat1 = rcTrainingMatrix( popGR, popGT, masks, trnBas, seeds = myseeds,
  patchRadius = dmdl$patchRadius, meanCenter = dmdl$meanCenter )
trnMat2 = rcTrainingMatrix( popLA, popGT, masks, trnBas, seeds = myseeds,
  patchRadius = dmdl$patchRadius, meanCenter = dmdl$meanCenter )
################################### test features below
popGTtest = list( )  # ground truth
popGRtest = list( )  # gradient
popLAtest = list( )  # laplacian
maskstest = list( )  # sample masks
for ( i in 1:length( testfn ) ) {
  popGTtest[[ i ]] = antsImageRead( getANTsRData( testfn[ i ] ) )
  popGRtest[[ i ]] = iMath( popGTtest[[ i ]], "Grad", 1 )
  popLAtest[[ i ]] = iMath( popGTtest[[ i ]], "Laplacian", 1 )
  maskstest[[ i ]] = getMask( popGTtest[[ i ]] ) # NOTE: dense prediction!
  }
testMat1 = rcTestingMatrix( popGRtest, maskstest, trnBas, seeds = 1,
  patchRadius = dmdl$patchRadius, meanCenter = dmdl$meanCenter )
testMat2 = rcTestingMatrix( popLAtest, maskstest, trnBas, seeds = 1,
  patchRadius = dmdl$patchRadius, meanCenter = dmdl$meanCenter )

################################### train/test below
traindf = data.frame( trnMat1$x, trnMat2$x, trnMat1$position ) / 100
testdf = data.frame( testMat1$x, testMat2$x, testMat1$position ) / 100
trn = rcTrain( trnMat1$y, traindf, mdlMethod = mth,
  epochs = nEpochs, hidden = hdn )
prd = rcPredict( trn, testdf, mdlMethod = mth )
mm = makeImage( maskstest[[1]], as.numeric( prd[,1] ) )
invisible( plot( mm ) )
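
A similar hedged check quantifies the two-channel prediction against the ground truth within the dense test mask.

# correlation between the predicted image and ground truth in the test mask
cor( mm[ maskstest[[ 1 ]] == 1 ], popGTtest[[ 1 ]][ maskstest[[ 1 ]] == 1 ] )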

Ground truth: regression

invisible( plot( popGTtest[[ 1 ]]  ) )

Segmentation

The same approach can be used for segmentation. We simply switch the features and, internally, the cost functions and optimizers.

Note that the results - particularly for keras - are impacted by how one scales the feature matrix.
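
A small base-R illustration of the point: constant scaling (like the / 100 used above) preserves the relative spread of the columns, while per-column standardization equalizes it; the two choices can lead the optimizer to different solutions.

# two common feature scalings and their effect on column spread
X = data.frame( a = rnorm( 100, sd = 1 ), b = rnorm( 100, sd = 50 ) )
apply( X / 100, 2, sd )     # constant scaling: columns keep their relative spread
apply( scale( X ), 2, sd )  # standardization: every column has sd = 1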

# step 1 - collect data
# below, we use image lists but one can alternatively replaces lists
# with vectors of filenames - just do so consistently
popGT = list( )  # ground truth
popGR = list( )  # segmentation target (Otsu labels, computed below)
masks = list( )  # sample masks
for ( i in 1:length( popfns ) ) {
  popGT[[ i ]] = antsImageRead( getANTsRData( popfns[ i ] ) )
  popGR[[ i ]] = thresholdImage( popGT[[ i ]], "Otsu", 3 )
  masks[[ i ]] = randomMask( thresholdImage( popGT[[ i ]], 1, 255  ) , nsam )
  }

# training
dmdl = rcTrainTranslation( popGT, popGR, masks, mdlMethod = mth,
  nBases = nBasesSeg, patchRadius = dmdl$patchRadius, meanCenter = dmdl$meanCenter,
  nsamples = nsam, dataScaling = 1000, epochs = nEpochs, classification = TRUE )

# out of sample prediction ....
oosMask =  getANTsRData( testfn ) %>% antsImageRead(  ) %>% getMask()
oos = getANTsRData( testfn ) %>% antsImageRead(  )
mm2 = rcTranslate( x = oos, rcmdl = dmdl, mask = oosMask, classification = TRUE  )
plot( oos, mm2[[ 1 ]], window.overlay = c(0,4), alpha = 0.6 )
plot( mm2[[ 4 ]] ) # GM probability

Ground truth: segmentation

gtimg = thresholdImage( oos, "Otsu", 3 )
invisible(
  plot( oos, gtimg,
        window.overlay = c(0,3), alpha = 0.6 ) )

Dice overlap

pander::pander( diceOverlap( mm2[[ 1 ]],  gtimg ) )

Write the model for later

# implement this yourself - substitute your own trained components
writeRcissus( basis, patchRadius, meanCenter, dataScaling = 1, deepNet = NA,
  directoryname )
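
A hedged sketch of one possible call: the component names dmdl$basis and dmdl$mdl are assumptions, not documented fields, so inspect names( dmdl ) for the actual layout first.

# sketch only: dmdl$basis and dmdl$mdl are assumed component names -
# check names( dmdl ) before relying on them
writeRcissus( basis = dmdl$basis, patchRadius = dmdl$patchRadius,
  meanCenter = dmdl$meanCenter, dataScaling = 1000, deepNet = dmdl$mdl,
  directoryname = tempdir() )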

Further optimizations

One may want to customize the deep learning components used here; they are simply the defaults provided by h2o and a minimally customized keras model. Inspecting the package code, along with the h2o and keras documentation, is a good way to get started. To optimize performance for a specific problem, one will need to define a validation objective and take advantage of the many parameter search strategies available, as sketched below.
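
A minimal sketch of such a search, reusing the training and testing matrices from the two-modality example above and treating correlation with the ground truth as an assumed validation objective:

# minimal search over hidden-layer configurations; the objective used here
# (correlation within the dense test mask) is an assumption - substitute your own
gtvec = popGTtest[[ 1 ]][ maskstest[[ 1 ]] == 1 ]
grid = list( c( 100, 100 ), c( 200, 200 ), c( 400, 400 ) )
scores = rep( NA, length( grid ) )
for ( g in seq_along( grid ) ) {
  mdl = rcTrain( trnMat1$y, traindf, mdlMethod = mth,
    epochs = nEpochs, hidden = grid[[ g ]] )
  scores[ g ] = cor( rcPredict( mdl, testdf, mdlMethod = mth )[ , 1 ], gtvec )
}
grid[[ which.max( scores ) ]]  # best configuration under this objective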

Please open issues at the rcissus site to discuss concerns, propose new methods, etc.

Future work:

References


