Privacy preserving Evaporative Cooling feature selection and classification with Relief-F and Random Forests

Methods are described in the following publication.

Trang Le, W. K. Simmons, M. Misaki, B.C. White, J. Savitz, J. Bodurka, and B. A. McKinney. “Differential privacy-based Evaporative Cooling feature selection and classification with Relief-F and Random Forests,” Bioinformatics. Accepted. Bioinformatics Abstract. 2017.

Load a subset of the resting state fMRI region correlation

library(privateEC)
data(rsfMRIcorrMDD)
# ~100 variables for a test
data.width <- ncol(rsfMRIcorrMDD)
real.data.sets <- splitDataset(all.data = rsfMRIcorrMDD[, (data.width - 101):data.width],
                               pct.train = 0.5,
                               pct.holdout = 0.5,
                               pct.validation = 0,
                               label = "class")
pec.result <- privateEC(train.ds = real.data.sets$train,
                         holdout.ds = real.data.sets$holdout,
                         validation.ds = NULL,
                         label = "class",
                         is.simulated = FALSE,
                         importance.algorithm = "ReliefFequalK",  # ReliefF
                         relief.k.method = "k_half_sigma",        # ReliefF knn
                         update.freq = 5,
                         verbose = FALSE)
knitr::kable(pec.result$algo.acc, 
             caption = "Algorithm Iterations",
             row.names = FALSE, 
             digits = 3)
plotRunResults(pec.result)

pEC selected features

multiple.max.indices <- which(pec.result$algo.acc$holdout.acc==max(pec.result$algo.acc$holdout.acc))
last.max <- multiple.max.indices[length(multiple.max.indices)]
cat("\n Max Holdout Accuracy Step [",last.max,"]\n")
cat("\n Accuracies: ")
print(pec.result$algo.acc[last.max,])
cat("\n Selected Features \n [",pec.result$atts.remain[[last.max]],"]\n")


insilico/privateEC documentation built on May 22, 2020, 5:12 p.m.