SICS is a package for microbiome-based prediction from OTU profiles of 16S rRNA experiments.
We develop a phylogeny-regularized sparse regression model for “sparse and clustered” microbiome signals. The proposed method uses a novel phylogeny-based smoothness penalty, defined from the inverse of the phylogeny-induced correlation matrix. The new penalty addresses the two major drawbacks of the Laplacian-type penalty. It encourages local smoothing, i.e., smoothing effects from close neighbors, and it still permits data-driven grouping if the tree is mis-specified.
Run the following commands in R:
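To make the idea of the smoothness penalty concrete, here is a minimal base-R sketch. It is not taken from the SICS source. It assumes the phylogeny-induced correlation matrix has the exponential form C = exp(-2 * rho * D) for a patristic distance matrix D, and that the penalty is the quadratic form beta' C^{-1} beta. The toy distances, the value of `rho`, and the helper `smooth_penalty` are all hypothetical.

```r
# Toy patristic distances among 4 OTUs: {1,2} and {3,4} are close relatives.
D <- matrix(c(0,   0.2, 2,   2,
              0.2, 0,   2,   2,
              2,   2,   0,   0.2,
              2,   2,   0.2, 0), nrow = 4, byrow = TRUE)

rho <- 1                    # assumed decay rate (illustrative value)
C <- exp(-2 * rho * D)      # assumed form of the correlation matrix

# Hypothetical helper: quadratic penalty beta' C^{-1} beta.
smooth_penalty <- function(beta, C) {
  as.numeric(crossprod(beta, solve(C, beta)))
}

beta_clustered <- c(1, 1, 0, 0)  # two close relatives share the signal
beta_scattered <- c(1, 0, 1, 0)  # same number of signals, on distant OTUs

smooth_penalty(beta_clustered, C)
smooth_penalty(beta_scattered, C)
```

Under these assumptions the clustered coefficient vector receives the smaller penalty, which is the local-smoothing behavior described above.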
library(devtools)
install_github("lichen-lab/SICS")
3. Use SICS
SICS starts from an OTU abundance matrix (normalized counts), where each row corresponds to an individual and each column corresponds to an OTU, and a distance matrix among all OTUs.
Below we use a simulated dataset (the training set includes 100 individuals and 200 OTUs; the testing set includes 200 individuals and 200 OTUs) to illustrate the workflow of SICS.
The data are distributed with SICS as data_SICS.
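The expected input shapes can be sketched with toy data in base R. Everything here is illustrative: total-sum scaling stands in for whichever normalization you use, and the Euclidean distances between random points stand in for a tree-derived distance matrix. The names `counts`, `z`, and `D` are chosen for this example only.

```r
set.seed(1)
n <- 5; p <- 4                               # individuals, OTUs
counts <- matrix(rpois(n * p, lambda = 20), n, p,
                 dimnames = list(paste0("id", 1:n), paste0("OTU", 1:p)))

# Stand-in normalization: divide each row by its total (relative abundance).
z <- counts / rowSums(counts)

# Stand-in OTU-by-OTU distance matrix; a real analysis would derive it
# from the phylogenetic tree rather than from random points.
pts <- matrix(rnorm(p * 2), p, 2, dimnames = list(colnames(z), NULL))
D <- as.matrix(dist(pts))

# Shape checks: one column of z per OTU, D square, symmetric, zero diagonal,
# with OTUs in the same order as the columns of z.
stopifnot(ncol(z) == nrow(D), nrow(D) == ncol(D),
          isSymmetric(D), all(diag(D) == 0),
          identical(colnames(z), colnames(D)))
```

Keeping the OTU order of the distance matrix aligned with the columns of the abundance matrix avoids silently pairing coefficients with the wrong tree tips.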
1. Load library and example data
library(SICS)
data(data_SICS)
library(ROCR)
help(SICS)
2. Continuous outcome: train a model based on the training set, and test the prediction using the testing set
set.seed(1234)
beta.sics=SICS(data_gaussian$z,data_gaussian$y,data_gaussian$D,family='gaussian',pho=c(1/4,4),lambda2=c(1/4,4))
yhat=predict(beta.sics,data_gaussian$z.te,family='gaussian')
plot(data_gaussian$y.te, yhat, main='Continuous Outcome',xlab='Observed',ylab='Predicted',col=1,lwd=3)
legend("bottomright",legend=paste('R:',cor(data_gaussian$y.te, yhat)),pch=16)
3. Binary outcome: train a model based on the training set, and test the prediction using the testing set
set.seed(1234)
beta.sics=SICS(data_binary$z,data_binary$y,data_binary$D,family='binomial',pho=c(1/4,4),lambda2=c(1/4,4))
yhat=predict(beta.sics,data_binary$z.te,family='binomial')
pred=prediction(yhat,data_binary$y.te)
perf=performance(pred,"tpr","fpr")
auc=performance(pred,"auc")@y.values[[1]]
plot(perf,main="Binary Outcome",col=1,lwd=3)
abline(0,1)
legend("bottomright",legend=paste('AUC:',auc),pch=16)
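As a cross-check on the AUC reported by ROCR, the same quantity can be computed from ranks in base R: it is the proportion of positive/negative pairs in which the positive case gets the higher score. The helper `auc_rank` and the toy scores below are illustrative and not part of SICS or ROCR.

```r
# Rank-based (Mann-Whitney) AUC; labels are assumed to be coded 0/1.
auc_rank <- function(score, label) {
  r <- rank(score)                       # average ranks handle ties
  n_pos <- sum(label == 1)
  n_neg <- sum(label == 0)
  (sum(r[label == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

score <- c(0.9, 0.8, 0.35, 0.6, 0.2, 0.1)
label <- c(1,   1,   1,    0,   0,   0)
auc_rank(score, label)                   # 8 of 9 pairs are ordered correctly
```

With the fitted model, `auc_rank(yhat, data_binary$y.te)` should agree with the ROCR value, assuming the test labels are coded 0/1.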
We demonstrate two real-data examples that compare SICS with other prediction methods. The following packages must be installed before running the real-data examples.
install.packages(c('ape','ade4','cluster','randomForest','glmnet','glmgraph','ncvreg'))
install.packages('devtools')
library(devtools)
install_github("lichen-lab/SICS")
install_github("lichen-lab/GMPR")
The tutorial for the caffeine data analysis (continuous outcome) is at https://github.com/lichen-lab/SICS/blob/master/caff.md
The tutorial for the smoking data analysis (binary outcome) is at https://github.com/lichen-lab/SICS/blob/master/smoking.md