knitr::opts_chunk$set(message = FALSE, warning = FALSE)
Association between omic data and continuous variables such as exposures are in many cases not linear. Several methods are proposed to accomodate such relationship. In this package we propose an interface to the following methods:
The package provides tools to impute missing variables using r CRANpkg("mice")
and has functions to filter and obtain variable scores for each method as well and a comparison tool for the obtained results. Plots are also available for results.
Installation can be done directly from GitHub with the following instructions
if (!require("devtools")) install.packages("devtools")
devtools::install_github("isglobal-brge/nlOmicAssoc", force = TRUE)
The input data for nlAssoc are the following:
covars rownames should match to object columns
we use data in bioconductor r Biocpkg("brgedata")
.
This data set consists of the exposome, transcritopme, methylome and proteome obtained from the same spanish population. For the scope of this example, two data sets are selected, the exposome and the transcriptome.
The brgedata transcriptome was obtained from 75 individuals using the HTA 2.0 array (Affymetrix, USA), which provides gene expression for 67,528 transcript clusters. The object is the brge_gexp and is an ExpressionSet.
The brgedata exposome data set, consisting of 15 exposure variables. The object is brge_expo is an ExposomeSet.
In the following lines, we will load and study both data sets, find common patients and remove those variables with a more than 70% of missing values.
library(nlOmicAssoc) library(BiocStyle) library(brgedata) library(rexposome) library(mice) # ExposomeSet data("brge_expo") head(expos(brge_expo)) dim(expos(brge_expo)) head(pData(brge_expo)) pat_expo<-rownames(pData(brge_expo)) # ExpressionSet data("brge_gexp") brge_gexp pat_gexp<-rownames(pData(brge_gexp)) # Common patients common_pat<-intersect(pat_expo,pat_gexp) # Needed objects data_m<-exprs(brge_gexp)[,common_pat] vars_df<-as.data.frame(expos(brge_expo)[common_pat,]) # Missing values in variables pNA <- function(x){sum(is.na(x))/length(x)} which(apply(vars_df,1,pNA)>(70/100)) # Final vars_df object vars_df<-vars_df[!(rownames(vars_df) %in% c("x0031")),]
Once data are prepared, is time to call the method. Methods should be called using the following names:
as an example we will test Multivariable Fractional Polynomial (MFP) model using stepwise.
n<-200 brge.mfp <- nlOmicAssoc(object = data_m[1:n,], covars = vars_df, model = "mfp", imp_method = 'pmm', perc_imp = 70, cores = 1L, verbose = FALSE)
brge.mfp is an object of class nlAssoc that contains seven slots.
names(brge.mfp)
brge.mfp
Summary function gives information about the call and the results obtained and will help in following steps to filter results.
summary(brge.mfp)
Results can be filtered using different parameters as the FDR adjusted p-value (default)
brge.mfp.f <- getSignif(brge.mfp, threshold = 0.05) summary(brge.mfp.f)
Plots are obtained for each probe in the gene expression data set. A pdf is created with the comparison between the real data and the predicted by mfp in this case.
#sembla que no xuta... plot(brge.mfp.f, top.probes=2)
For the special case were the model applied is mfp, partial effects of a single model can be visualized with function plot.single.mfp. This funcion allows the adjustment of x and y limits in the plot.
#sembla que no xuta... plot.single.mfp(brge.mfp.f, probe=brge.mfp.f$table$probe[1],xlim=c(0,4))
To discern the most important variables in a data set, a scoring measure was defined as the total number of times the variable has been selected divided by the number of total tests performed. This can be obtained by calling funcion scoreVars
scoreVars(brge.mfp)
To compare among models the GAM model and ExWASsp will also be applied.
GAM model.
brge.gam <- nlOmicAssoc(object = data_m[1:n,], covars = vars_df, model = "gam", imp_method = 'pmm', perc_imp = 70, cores = 1L, verbose = FALSE) summary(brge.gam)
ExWAS model.
brge.exwas <- nlOmicAssoc(object = data_m[1:n,], covars = vars_df, model = "exwassp", imp_method = 'pmm', perc_imp = 70, cores = 1L, verbose = FALSE) summary(brge.exwas)
Comparison among the three models. Parameters param, comp and threshold allow to filter each set results by different parameters and compare among them.
list.meth<-list(brge.exwas,brge.gam,brge.mfp) compareMethods(list.meth,filter=T,threshold=0.05)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.