tandem: Fits a TANDEM model by performing a two-stage regression

View source: R/functions.R

Description

Fits a TANDEM model by performing a two-stage regression. In the first stage, the response y is regressed on the upstream features (x[,upstream]). In the second stage, the residuals of the first stage are regressed on the downstream features (x[,!upstream]). Both stages use Elastic Net regression as implemented in cv.glmnet() from the glmnet package.
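The two-stage idea can be sketched directly with cv.glmnet(). This is a minimal illustration on simulated data, not the package's actual implementation: the simulated x, y and upstream, and the object names fit_up, fit_down, pred_up and pred_down, are assumptions made for the example.

```r
library(glmnet)

set.seed(1)
n <- 100
p <- 20
x <- matrix(rnorm(n * p), nrow = n, ncol = p)

# Assumption for this sketch: the first 10 columns are upstream features
upstream <- rep(c(TRUE, FALSE), each = p / 2)

# Simulated response driven by one upstream and one downstream feature
y <- 2 * x[, 1] + x[, 11] + rnorm(n, sd = 0.5)

# Stage 1: regress y on the upstream features only
fit_up <- cv.glmnet(x[, upstream], y, family = "gaussian", nfolds = 10)
pred_up <- as.numeric(predict(fit_up, newx = x[, upstream], s = "lambda.1se"))

# Stage 2: regress the stage-1 residuals on the downstream features
res <- y - pred_up
fit_down <- cv.glmnet(x[, !upstream], res, family = "gaussian", nfolds = 10)
pred_down <- as.numeric(predict(fit_down, newx = x[, !upstream], s = "lambda.1se"))

# Combined prediction: upstream contribution plus downstream contribution
y_hat <- pred_up + pred_down
```

Because the downstream features only see what the upstream features left unexplained, any signal shared between the two groups is attributed to the upstream features.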

Usage

tandem(
  x,
  y,
  upstream,
  family = "gaussian",
  nfolds = 10,
  foldid = NULL,
  lambda_upstream = "lambda.1se",
  lambda_downstream = "lambda.1se",
  ...
)

Arguments

x

A feature matrix, where the rows correspond to samples and the columns to features.

y

A vector containing the response.

upstream

A logical vector with one element per column of x, indicating whether each feature is upstream (TRUE) or downstream (FALSE).

family

The family parameter that's passed to cv.glmnet(). Currently, only family='gaussian' is supported.

nfolds

Number of cross-validation folds (default is 10) used to determine the optimal lambda in cv.glmnet().

foldid

An optional vector that assigns each sample to a cross-validation fold. Overrides nfolds when supplied.

lambda_upstream

Which lambda glmnet should use in the first stage (upstream features): "lambda.min" or "lambda.1se". Default is "lambda.1se".

lambda_downstream

Which lambda glmnet should use in the second stage (downstream features): "lambda.min" or "lambda.1se". Default is "lambda.1se".

...

Other parameters that are passed to cv.glmnet().
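As an illustration of how the upstream and foldid arguments might be prepared, the base-R sketch below builds both from scratch. The column-name prefixes ("mut_", "expr_") and the simulated matrix are hypothetical and serve only to make the example runnable; real data will need its own rule for deciding which features are upstream.

```r
set.seed(42)
n <- 50

# Hypothetical feature matrix: two "mut_" columns and two "expr_" columns
x <- matrix(rnorm(n * 4), nrow = n,
            dimnames = list(NULL, c("mut_A", "mut_B", "expr_A", "expr_B")))

# One logical value per column: TRUE marks an upstream feature
upstream <- grepl("^mut_", colnames(x))

# Balanced, randomly shuffled fold assignment: one fold label per sample
nfolds <- 10
foldid <- sample(rep(seq_len(nfolds), length.out = n))

# These could then be passed on, e.g. (requires the TANDEM package):
# fit <- tandem(x, y, upstream, foldid = foldid, lambda_upstream = "lambda.min")
```

Fixing foldid makes repeated fits use the same cross-validation split, which keeps results comparable between runs.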

Value

A tandem object containing the fitted model, which can be passed to coef() and predict().

Examples

library(TANDEM)

# unpack the example data shipped with the package
x <- example_data$x
y <- example_data$y
upstream <- example_data$upstream

# fit a TANDEM model (alpha = 0.5 is passed on to cv.glmnet()),
# extract the coefficients, and predict on the training data
fit <- tandem(x, y, upstream, alpha = 0.5)
beta <- coef(fit)
y_hat <- predict(fit, newx = x)

NKI-CCB/TANDEM documentation built on Nov. 25, 2019, 11:18 p.m.