knitr::opts_chunk$set(comment="", cache=FALSE, fig.align="center")
library(reactable)
library(magrittr)
devtools::load_all(".")
There is always a limit to the number of variables that can be used in high-dimensional graphical modeling. This limit is typically determined by the number of samples available as well as the methodology. In some cases, it may be suitable to build networks on curated genes of interest. In other cases, we may need to filter out genes that fail to meet a requirement, or select genes based on a favorable property. We provide some useful functions for variable filtering and selection that consider multiple networks.
library(shine)
We use a toy expression set object with three subtypes to illustrate...
data(toy)
dim(toy)
head(pData(toy))
table(toy$subtype)
exprs(toy)[1:5,1:5]
Graphical modeling considers the covariance of variables across samples. Therefore, variables must have non-zero variance to be useful. Additionally, when learning multiple networks, the same variable must have non-zero variance within each subtype. The following function finds variables with non-zero variance within every subtype.
head(rownames(toy))
# Edit in zero variance in at least one subtype
exprs(toy)["G2", toy$subtype == "A"] <- 0
genes.filtered <- keep.var(toy, column="subtype", subtypes=c("A", "B", "C"), fn=var)
head(genes.filtered)
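To make the idea concrete, here is a minimal base R sketch of a within-subtype filter on a simulated matrix. It is an illustration under stated assumptions, not the implementation of `keep.var`: the helper name `keep_nonzero_within`, the simulated data, and the choice to treat `NA` statistics as failures are all hypothetical.

```r
# Hypothetical helper (not part of shine): keep rows whose statistic `fn`
# is strictly positive within every group of columns.
keep_nonzero_within <- function(mat, groups, fn = var) {
  stopifnot(ncol(mat) == length(groups))
  # One logical column per group: does each row pass within that group?
  ok <- do.call(cbind, lapply(unique(groups), function(g) {
    stat <- apply(mat[, groups == g, drop = FALSE], 1, fn)
    !is.na(stat) & stat > 0
  }))
  # A row is kept only if it passes in every group
  rownames(mat)[rowSums(ok) == ncol(ok)]
}

# Simulated data: 5 genes, 12 samples, 3 groups of 4 samples each
set.seed(1)
mat <- matrix(rnorm(5 * 12), nrow = 5,
              dimnames = list(paste0("G", 1:5), NULL))
groups <- rep(c("A", "B", "C"), each = 4)

# Force zero variance for G2 within group A only
mat["G2", groups == "A"] <- 0

kept <- keep_nonzero_within(mat, groups)
print(kept)
```

In this sketch G2 is dropped even though it still varies in groups B and C, because a single failing group is enough to exclude a variable.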
Even after filtering, there may still be too many variables. One way to reduce the number is to rank variables by some favorable property and select the top x. When finding structural constraints, we rely on biweight midcorrelation, which is more reliable when variables have a high median absolute deviation. Since there are multiple networks, we give each network an opportunity to contribute a high-ranking variable in a turn-based system until the limit has been reached.
genes.selected <- rank.var(toy, column="subtype", subtypes=c("A", "B", "C"), limit=30, genes=genes.filtered, fn=mad)
head(genes.selected)
rankings <- mapply(function(x) {
    rank <- sort(apply(exprs(toy)[,toy$subtype == x], 1, mad), decreasing=TRUE)
    match(genes.selected, names(rank))
}, unique(toy$subtype), SIMPLIFY = FALSE) %>%
as.data.frame() %>%
set_rownames(genes.selected)
reactable(rankings, compact=TRUE, fullWidth=TRUE, defaultPageSize=30, striped=TRUE, showPageSizeOptions=FALSE)
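The turn-based selection described above can be sketched in base R as follows. This is one plausible reading of the procedure, not the actual code behind `rank.var`: the helper name `pick_round_robin`, the simulated matrix, and the assumption that scores contain no missing values are all hypothetical.

```r
# Hypothetical helper (not shine's rank.var): groups take turns contributing
# their highest-ranked row that has not been selected yet.
pick_round_robin <- function(mat, groups, limit, fn = mad) {
  # One vector of row names per group, ordered from best to worst score
  ranked <- lapply(unique(groups), function(g) {
    score <- apply(mat[, groups == g, drop = FALSE], 1, fn)
    names(sort(score, decreasing = TRUE))
  })
  limit <- min(limit, nrow(mat))
  selected <- character(0)
  turn <- 1
  while (length(selected) < limit) {
    # setdiff() keeps the ranked order, so the first element is the best left
    candidates <- setdiff(ranked[[turn]], selected)
    selected <- c(selected, candidates[1])
    # Pass the turn to the next group, wrapping around at the end
    turn <- turn %% length(ranked) + 1
  }
  selected
}

# Simulated data: 6 genes, 9 samples, 3 groups of 3 samples each
set.seed(42)
mat <- matrix(rnorm(6 * 9), nrow = 6,
              dimnames = list(paste0("G", 1:6), NULL))
groups <- rep(c("A", "B", "C"), each = 3)

selected <- pick_round_robin(mat, groups, limit = 4)
print(selected)
```

Because a group skips variables that another group has already contributed, a variable ranked highly in several groups is selected once, and the later group's turn goes to its next-best candidate.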