In isglobal-brge/nlOmicAssoc: non-linar omic association

knitr::opts_chunk$set(message = FALSE, warning = FALSE)

Introduction

Association between omic data and continuous variables such as exposures are in many cases not linear. Several methods are proposed to accomodate such relationship. In this package we propose an interface to the following methods:

Multivariable Fractional Polynomial (MFP) model using stepwise,
Exposome-wide association study (ExWASsp) with Natural Cubic Splines regressions,
Generalized Additive Splines model using backfitting (GAM),
Generalized Additive model using boosting (GAMboost),
Regression Trees model using the algorithm Deletion Substitution Addition (partDSA),
Random Forest using an implemented variable selection step (RF) and
Neural Network with an implemented variable selection step (NNet)

The package provides tools to impute missing variables using r CRANpkg("mice") and has functions to filter and obtain variable scores for each method as well and a comparison tool for the obtained results. Plots are also available for results.

Installation

Installation can be done directly from GitHub with the following instructions

if (!require("devtools")) install.packages("devtools")

devtools::install_github("isglobal-brge/nlOmicAssoc", force = TRUE)

Data

The input data for nlAssoc are the following:

object: is the omic data set containing probes in rows and samples in columns. Can be a numeric 'matrix', an 'ExpressionSet' or a 'SummarizedExperiment'
covars: 'data.frame' with exposome variables (continuous).
cores: number of cores to get to work in the case of parallelization (only for linux).

covars rownames should match to object columns

How to start: an example

Preparing data

we use data in bioconductor r Biocpkg("brgedata"). This data set consists of the exposome, transcritopme, methylome and proteome obtained from the same spanish population. For the scope of this example, two data sets are selected, the exposome and the transcriptome. The brgedata transcriptome was obtained from 75 individuals using the HTA 2.0 array (Affymetrix, USA), which provides gene expression for 67,528 transcript clusters. The object is the brge_gexp and is an ExpressionSet. The brgedata exposome data set, consisting of 15 exposure variables. The object is brge_expo is an ExposomeSet. In the following lines, we will load and study both data sets, find common patients and remove those variables with a more than 70% of missing values.

library(nlOmicAssoc)
library(BiocStyle)

library(brgedata)
library(rexposome)
library(mice)

# ExposomeSet
data("brge_expo")
head(expos(brge_expo))

dim(expos(brge_expo)) 
head(pData(brge_expo)) 

pat_expo<-rownames(pData(brge_expo))

# ExpressionSet 
data("brge_gexp")
brge_gexp 
pat_gexp<-rownames(pData(brge_gexp))

# Common patients
common_pat<-intersect(pat_expo,pat_gexp)

# Needed objects

data_m<-exprs(brge_gexp)[,common_pat]
vars_df<-as.data.frame(expos(brge_expo)[common_pat,])

# Missing values in variables
pNA <- function(x){sum(is.na(x))/length(x)}
which(apply(vars_df,1,pNA)>(70/100))

# Final vars_df object
vars_df<-vars_df[!(rownames(vars_df) %in% c("x0031")),]

Applying the model

Once data are prepared, is time to call the method. Methods should be called using the following names:

mfp: Multivariable Fractional Polynomial model using stepwise,
exwassp: Exposome-wide association study with Natural Cubic Splines regressions,
gam: Generalized Additive Splines model using backfitting,
gamboost: Generalized Additive model using boosting,
partdsa: Regression Trees model using the algorithm Deletion Substitution Addition,
rforest: Random Forest using an implemented variable selection step and
nnetwork: Neural Network with an implemented variable selection step

as an example we will test Multivariable Fractional Polynomial (MFP) model using stepwise.

n<-200

brge.mfp <- nlOmicAssoc(object = data_m[1:n,],
                      covars = vars_df,
                      model = "mfp",
                      imp_method = 'pmm',
                      perc_imp = 70,
                      cores = 1L,
                      verbose = FALSE)

nlAssoc class

brge.mfp is an object of class nlAssoc that contains seven slots.

names(brge.mfp)

brge.mfp

Summary

Summary function gives information about the call and the results obtained and will help in following steps to filter results.

summary(brge.mfp)

Filtering results

Results can be filtered using different parameters as the FDR adjusted p-value (default)

brge.mfp.f <- getSignif(brge.mfp, threshold = 0.05)
summary(brge.mfp.f)

Plotting

Plots are obtained for each probe in the gene expression data set. A pdf is created with the comparison between the real data and the predicted by mfp in this case.

#sembla que no xuta...
plot(brge.mfp.f, top.probes=2)

Plotting mfp single results

For the special case were the model applied is mfp, partial effects of a single model can be visualized with function plot.single.mfp. This funcion allows the adjustment of x and y limits in the plot.

#sembla que no xuta...
plot.single.mfp(brge.mfp.f, probe=brge.mfp.f$table$probe[1],xlim=c(0,4))

Scoring variables

To discern the most important variables in a data set, a scoring measure was defined as the total number of times the variable has been selected divided by the number of total tests performed. This can be obtained by calling funcion scoreVars

scoreVars(brge.mfp)

Compare results among methods

To compare among models the GAM model and ExWASsp will also be applied.

GAM model.

brge.gam <- nlOmicAssoc(object = data_m[1:n,],
                      covars = vars_df,
                      model = "gam",
                      imp_method = 'pmm',
                      perc_imp = 70,
                      cores = 1L,
                      verbose = FALSE)
summary(brge.gam)

ExWAS model.

brge.exwas <- nlOmicAssoc(object = data_m[1:n,],
                      covars = vars_df,
                      model = "exwassp",
                      imp_method = 'pmm',
                      perc_imp = 70,
                      cores = 1L,
                      verbose = FALSE)
summary(brge.exwas)

Comparison among the three models. Parameters param, comp and threshold allow to filter each set results by different parameters and compare among them.

list.meth<-list(brge.exwas,brge.gam,brge.mfp)
compareMethods(list.meth,filter=T,threshold=0.05)

isglobal-brge/nlOmicAssoc documentation built on May 9, 2019, 8:16 a.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

isglobal-brge/nlOmicAssoc
non-linar omic association

In isglobal-brge/nlOmicAssoc: non-linar omic association

Introduction

Installation

Data

How to start: an example

Preparing data

Applying the model

nlAssoc class

Summary

Filtering results

Plotting

Plotting mfp single results

Scoring variables

Compare results among methods

R Package Documentation

Browse R Packages

We want your feedback!

isglobal-brge/nlOmicAssoc non-linar omic association

In isglobal-brge/nlOmicAssoc: non-linar omic association

Introduction

Installation

Data

How to start: an example

Preparing data

Applying the model

nlAssoc class

Summary

Filtering results

Plotting

Plotting mfp single results

Scoring variables

Compare results among methods

R Package Documentation

Browse R Packages

We want your feedback!

isglobal-brge/nlOmicAssoc
non-linar omic association