The purpose of this vignette is to give an easy to understand tutorial on how to use my new package, medals! As you may know, my package is available here. medals is a R package to implement a new method I've been working on for lesion segmentation, which I'm calling MEDALS: Memory Efficient Decomposition for Analysis of Local neighborhood moments for Segmentation.

For this vignette, we will use the medals package to train on two subjects with multimodal MRI (FLAIR, DWI). Every subject's image is skull-stripped coregistered to the FLAIR, and smoothed. Then, every image is normalized using a trimmed normalization. We will analyze only the 1st order moment image. Then we will test the fitting on a separate subject.

Before doing anything, we need to get our data in a particular form. We need a list of paths to MRI images, a list of paths to brain masks, and a list of paths to manual segmentations. Below I show the structure of these lists. Images are denoted by image.file, brain masks by mask.file and manual segmentations by y.file. Notice that image.file is a list of sublists, each sublist is a list of imaging sequences for a particular subject. The orders of image.file, mask.file, and y.file should all match. We also need a vector of subject IDs. The data I'm using is from the 2015 ISLES challenge, which is here.

library(medals)
datadir <- "../inst/extdata/"
subj <- c("05","06")

image.file<-vector(length = length(subj), mode = "list")
mask.file<-vector(length = length(subj), mode = "list")
y.file<-vector(length = length(subj), mode = "list")

for(i in 1:length(subj)){
  imgs = file.path(datadir,
                   paste0(c("flair", "dwi"),
                          "TrimmedNormImg_Subject",
                          subj[i], ".nii.gz"))
  image.file[[i]][[1]] <- imgs[1]
  image.file[[i]][[2]] <- imgs[2]
  mask.file[[i]] <- file.path(datadir, paste0("brainmask_Subject", subj[i], ".nii.gz"))
  y.file[[i]] <- file.path(datadir, 
                           paste0("ymask_Subject", subj[i], ".nii.gz")
  )
}
print(image.file)
print(mask.file)
print(y.file)
print(subj)

We can visualize the images thanks to John Muschelli's ortho2() function in his neurobase package.

#install.packages("fslr")
library(neurobase)
ortho2(readnii(image.file[[1]][[1]]))

The first step is the computational workhorse of the package. The goal of this package is to create a matrix of local neighborhoods for each voxel, of each imaging modality, of each moment, for every subject in the training set with centered and scaled columns. Since we're interested in doing a principal component analysis on this matrix, we actually contruct the cross-product matrix since we can decompose this matrix equivalently and it's much smaller in dimension. This matrix is calculated using a "load and forget" approach so that only one subject's set of images is loaded in memory at any given time. This function takes in the list of images, list of brain masks, and highest order moment wanted, and it outputs the centered and scaled crossproduct matrix and the total number of voxels across all subjects.

#### library(devtools) ### for install_github()
#### install_github("jmmaronge/medals") ### installing medals
library(medals)
cp<-moment.cor.mat(path.img.list = image.file,
                   path.mask.list = mask.file,
                   mpower = 1,
                   verbose = TRUE)

\

The next function we will use is pc.var(). This will gives us information about the variance and cumulative variance explained by each principal component so that we can make a decision on how many PCs to keep for prediction. The arguments this function takes are the population level cross-product matrix and the total number of voxels in the brain mask across all subjects, which is contained within the suff object that we created.

pc.var(mat=cp$c.mat,n=cp$n)$var[1:8]
pc.var(mat=cp$c.mat,n=cp$n)$cuml_pct[1:8]

For demonstration purposes, I will only use the first three PCs. In general one would want to use more PCs for segmentation I'll use the next function, get.loadings() to create the loadings matrix which will be used to rotate the original images into score images. I'll then use the function make.score.img() to create score images for each subject for each principal component. These will be used as predictors in the logistic regression model.

loadings <- get.loadings(cp$c.mat)
scores <- make.score.img(
  path.img.list = image.file,
  path.mask.list = mask.file,
  loads = loadings,
  which.scores = 1:3
)

Now we can visualize the score images

ortho2(scores[[1]][[2]]) # gives you the score image of the 1st subject, 2nd PC

Now, we need to get the model fit for the regression, this fit is then used to create the list of prediction images for each subject. Then I used ortho2() to visualize the prediction image for the first subject.

fit1 <- get.model.fit(
  score.img.list = scores,
  path.mask.list = mask.file,
  path.y.list = y.file,
  # subj.id = subj,
  verbose = TRUE
)
preds <- make.pred.img(
  score.img.list = scores,
  path.mask.list = mask.file,
  fit = fit1,
  # subj.id = subj
  verbose = TRUE
)
ortho2(preds[[1]]) #prediction image for the first subject
ortho2(image.file[[1]][[1]], preds[[1]] > 0.5) #prediction image for the first subject

That is the end of the pipeline for training a model! Now, if we wanted to use this trained model to predict on a out of sample subject, it's fairly simple. We will use the loadings matrix (load) and model fit (fit1) to predict on a new sample.

First we need a list of paths to images, a list of paths to brain masks, and a vector of subject IDs for this new subject (or subjects). It should be arranged like so.

test.subj<-c("08")
test.image.file<-vector(length = length(test.subj), mode = "list")
test.mask.file<-vector(length = length(test.subj), mode = "list")
test.y.file<-vector(length = length(test.subj), mode = "list")

for(i in 1:length(test.subj)){
  imgs = file.path(datadir,
                   paste0(c("flair", "dwi"),
                          "TrimmedNormImg_Subject",
                          test.subj[i], ".nii.gz"))
  test.image.file[[i]][[1]] <- imgs[1]
  test.image.file[[i]][[2]] <- imgs[2]
  test.mask.file[[i]] <- file.path(datadir, 
                                   paste0("brainmask_Subject", test.subj[i], ".nii.gz"))
  test.y.file[[i]] <- file.path(datadir, 
                           paste0("ymask_Subject", test.subj[i], ".nii.gz")
  )
}
print(test.image.file)
print(test.mask.file)
print(test.subj)

Now, we will use make.score.img() to make predictors for this new subject, but use the loading that we trained.

  test.scores <- make.score.img(
  path.img.list = test.image.file,
  path.mask.list = test.mask.file,
  loads = loadings,  #training loadings matrix
  which.scores = 1:3  #make sure this is the same as what your training set used
)

And now make the prediction image and visualize it...

test.preds <- preds <- make.pred.img(
  score.img.list = test.scores,
  path.mask.list = test.mask.file,
  fit = fit1, # training fit
  # subj.id = subj
  verbose = TRUE
)
mask = readnii(test.mask.file[[1]]) # 230x230x154
img = readnii(test.image.file[[1]][[1]]) # 230x230x153
# ortho2(test.preds[[1]]) #prediction image for the first subject
ortho2(img, test.preds[[1]] > 0.5) #binary prediction

# comparison to ground truth
y_img = readnii(test.y.file[[1]]) # 230x230x154
double_ortho(x = y_img, y = test.preds[[1]] > 0.5, 
             NA.x = TRUE, 
             NA.y = TRUE, 
             xyz = xyz(y_img),
             col = "white",
             col.y = "white") #binary prediction

ortho_diff(img, 
           pred = test.preds[[1]] > 0.5, 
          roi = y_img) 

I've now taken you through how to use all the features of medals to both train a model and apply that training to a new subject! This results in prediction images. In reality you would want to use the prediction images from the training set to establish a probability threshold for prediction on the testing data. You might also calculate DICE, partial AUC, etc. It's also advisable to train on a data set bigger than two subjects. This whole analysis was run on a Macbook Pro with 8GB of RAM. If you have more RAM you may want to set mpower=4, which will analyze up to the 4th order moment.



JMMaronge/medals documentation built on May 7, 2019, 10:12 a.m.