relative.contributions: Determine the relative contribution per data type

View source: R/functions.R

Description

For each data type, determine its relative contribution to the overall prediction.

Usage

relative.contributions(fit, x, data_types, lambda_glmnet = "lambda.1se")

Arguments

fit

Either a tandem-object or a cv.glmnet-object

x

The feature matrix used to train the fit, where the rows correspond to samples and the columns to features.

data_types

A vector with one element per feature (column of x) that indicates the data type to which each feature belongs. This vector doesn't need to correspond to the 'upstream' vector used in tandem(). For example, the upstream features could be spread across various data types (such as mutation, CNA, methylation and cancer type) and the downstream features could be gene expression.

lambda_glmnet

Only used when fit is a cv.glmnet object. Selects which cross-validated lambda glmnet should use: "lambda.min" or "lambda.1se". Default is "lambda.1se". Note that for tandem objects, the lambda_upstream and lambda_downstream parameters should be specified during the tandem() call, as they are used while fitting the model.
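
As an illustration of the data_types argument, the sketch below builds such a vector from column names. The feature matrix, the column names and the "type before the first underscore" naming convention are all assumptions made up for this example; they are not part of the package or its example data.

```r
# Hypothetical feature matrix whose column names carry a data type prefix.
set.seed(1)
feature_names <- c("mutation_TP53", "mutation_KRAS", "CNA_MYC",
                   "methylation_MLH1", "expression_EGFR", "expression_ERBB2")
x <- matrix(rnorm(5 * length(feature_names)), nrow = 5,
            dimnames = list(NULL, feature_names))

# Assumed convention: everything before the first underscore is the data type.
data_types <- sub("_.*$", "", colnames(x))

# data_types must have exactly one entry per feature (column of x).
stopifnot(length(data_types) == ncol(x))
table(data_types)
```

Any other way of labelling the features works equally well, as long as the resulting vector has one entry per column of x.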

Value

A vector that indicates the relative contribution per data type. These numbers sum up to one.
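
To give a feel for what such a vector looks like, here is a minimal sketch of one plausible way to derive per-data-type shares from a linear model. It is not taken from the package source and may differ from what relative.contributions() actually computes. The helper sketch_contributions(), the coefficients and the simulated data are hypothetical; the sketch assumes the contribution of a data type is measured by the sum of squares of its partial prediction.

```r
# Hypothetical helper, NOT part of TANDEM: share of each data type in a
# linear prediction, normalised so that the shares sum to one.
sketch_contributions <- function(x, beta, data_types) {
  stopifnot(ncol(x) == length(beta), ncol(x) == length(data_types))
  types <- unique(as.character(data_types))
  # Sum of squares of the partial prediction x_d %*% beta_d per data type d.
  ss <- sapply(types, function(d) {
    idx <- data_types == d
    partial <- x[, idx, drop = FALSE] %*% beta[idx]
    sum(partial^2)
  })
  names(ss) <- types
  # Note: yields NaN if all coefficients are zero.
  ss / sum(ss)
}

# Simulated data with made-up coefficients.
set.seed(1)
x <- matrix(rnorm(50 * 4), nrow = 50)
beta <- c(2, 0, 1, 0)
data_types <- c("mutation", "mutation", "expression", "expression")

contr <- sketch_contributions(x, beta, data_types)
print(contr)
```

The result is a named numeric vector with one non-negative entry per data type, which is the shape that barplot() expects in the examples.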

Examples

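## simple example
# load the package that provides tandem(), relative.contributions() and example_data
library(TANDEM)

# unpack example data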
x = example_data$x
y = example_data$y
upstream = example_data$upstream
data_types = example_data$data_types

# fit TANDEM model
fit = tandem(x, y, upstream, alpha=0.5)

# assess the relative contribution of upstream and downstream features
contr = relative.contributions(fit, x, data_types)
barplot(contr, ylab="Relative contribution", ylim=0:1)

## comparing TANDEM and classic model (glmnet)
# unpack example data
x = example_data$x
y = example_data$y
upstream = example_data$upstream
data_types = example_data$data_types

# fix the cv folds, to facilitate a comparison between models
set.seed(1)
n = nrow(x)
nfolds = 10
foldid = ceiling(sample(1:n)/n * nfolds)

# fit both a TANDEM and a classic model (glmnet)
fit = tandem(x, y, upstream, alpha=0.5)
library(glmnet)
fit2 = cv.glmnet(x, y, alpha=0.5, foldid=foldid)

# assess the relative contribution of upstream and downstream features
# using both methods
contr_tandem = relative.contributions(fit, x, data_types)
contr_glmnet = relative.contributions(fit2, x, data_types)
par(mfrow=c(1,2))
barplot(contr_tandem, ylab="Relative contribution", main="TANDEM", ylim=0:1)
barplot(contr_glmnet, ylab="Relative contribution", main="Classic approach", ylim=0:1)
par(mfrow=c(1,1))

TANDEM documentation built on Dec. 1, 2019, 1:12 a.m.