pathway2sample: Penalized estimation of a pathway's regulatory network from...


View source: R/pathwayFunctions.R

Description

The regulatory relationships between DNA copy number and gene expression within a pathway are modeled by a simultaneous-equations model. The parameters of this model are fitted by minimizing a penalized least squares criterion. The employed penalty is a combination of the lasso and the fused lasso. This combination encourages within-sample sparsity (lasso) and limits the between-sample differences (fused lasso).

Usage

pathway2sample(Y, X, id, lambda1 = 1, lambdaF = 1, method = "FL",
               constr = TRUE, startCis = numeric(), startTrans1 = matrix(),
               startTrans2 = matrix(), epsilon = 0, verbose = FALSE)

Arguments

Y

Object of class matrix. Rows are assumed to represent the samples, and columns represent the samples' gene expression levels.

X

Object of class matrix, containing the DNA copy number data. Rows are assumed to represent the samples, and columns represent the samples' genes or traits. The numbers of rows and columns of X must be identical to those of Y.

id

An indicator variable of class numeric for the two groups to be compared. The groups should be coded as 0 and 1.

lambda1

Either a numeric- or matrix-object. The lasso parameter. In case lambda1 is of class numeric its length is one, and the same penalty parameter is applied to all trans-effects. In case lambda1 is of class matrix its column and row dimension equal the number of columns of Y. A possibly different penalty parameter is applied to each trans-effect.

lambdaF

Either a numeric- or matrix-object. The fused lasso parameter. In case lambdaF is of class numeric and of length one, the same penalty parameter is applied to all differential trans-effects. In case lambdaF is of class matrix its column and row dimension equal the number of columns of Y. A possibly different penalty parameter is applied to each differential trans-effect.
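As an illustration of the two accepted forms of lambda1 and lambdaF, the sketch below builds a scalar and a matrix version of each. The object names (lambda1Mat, lambdaFMat) and the chosen values are hypothetical. Which of the row or column index refers to the regulated gene is not stated here, so the row-wise modification is only an assumption about the layout.

```r
p <- 10  # number of genes, i.e. the number of columns of Y

# scalar form: one penalty parameter shared by all (differential) trans-effects
lambda1 <- 1
lambdaF <- 0.01

# matrix form: a p x p matrix with a possibly different parameter per trans-effect
lambda1Mat <- matrix(1, nrow = p, ncol = p)
lambda1Mat[1, ] <- 5  # hypothetical: heavier lasso penalty on the first row

lambdaFMat <- matrix(0.01, nrow = p, ncol = p)
```

Either form would then be passed through the lambda1 and lambdaF arguments of pathway2sample.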

method

A character-object. Indicates which penalty to employ (see details).

constr

logical. Should the cis-effect (the direct effect of a column of X on the corresponding column of Y) be constrained to be positive?

startCis

numeric. Starting values for the cis-effect.

startTrans1

matrix. Starting values for the trans-effect of group 1 (coded as 0).

startTrans2

matrix. Starting values for the trans-effect of group 2 (coded as 1).

epsilon

A numeric. Must be non-negative in the low-dimensional case, and must assume a strictly positive value in the high-dimensional case.

verbose

logical. Should intermediate output be printed on the screen?

Details

The model is fitted equation-by-equation. This is warranted by the assumption of independent errors. The expression levels of one gene are regressed on its own DNA copy number data and the expression levels of all other genes in the pathway.
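The structure of a single equation can be sketched with ordinary (unpenalized) least squares in base R. This is only an illustration of which variables enter the regression for gene j. It is not the penalized estimator used by pathway2sample, and the toy data, the absence of an intercept, and the names design and coefs are assumptions made for the example.

```r
set.seed(1)
n <- 200  # samples
p <- 4    # genes

# toy data: DNA copy number (X) and gene expression (Y), same dimensions
X <- matrix(runif(n * p, min = -2, max = 2), ncol = p)
Y <- X + matrix(rnorm(n * p), ncol = p)
colnames(Y) <- paste0("gene", seq_len(p))

# equation for gene j: its expression is regressed on its own copy number
# (cis-effect) and on the expression of all other genes (trans-effects)
j <- 1
design <- cbind(cis = X[, j], Y[, -j, drop = FALSE])
coefs <- lm.fit(x = design, y = Y[, j])$coefficients
coefs
```

Repeating this for j = 1, ..., p gives one set of coefficients per gene; the penalized fit follows the same equation-by-equation layout.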

The method option indicates which penalty is combined with the least squares loss function. In case method = "FL", this is the fused lasso penalty (as described in Van Wieringen, W.N., Van de Wiel, M.A., 2012):

λ_1 \| Θ^{(a)} \|_1 + λ_1 \| Θ^{(b)} \|_1 + λ_F \| Θ^{(a)} - Θ^{(b)} \|_1.

When method = "FLs", this penalty is simplified to:

λ_1 \| Θ^{(a)} + Θ^{(b)} \|_1 + λ_F \| Θ^{(a)} - Θ^{(b)} \|_1.
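The two penalties can be written out as small R functions to make the formulas concrete. This is a sketch under assumptions: the norm is taken to be the element-wise L1 norm, lambda1 and lambdaF are scalars, and the matrices hold trans-effects only. The function names l1, penaltyFL and penaltyFLs are hypothetical and not part of sigaR.

```r
# element-wise L1 norm of a matrix
l1 <- function(M) sum(abs(M))

# FL: lasso on each group's trans-effects plus fused lasso on their difference
penaltyFL <- function(ThetaA, ThetaB, lambda1, lambdaF) {
  lambda1 * l1(ThetaA) + lambda1 * l1(ThetaB) + lambdaF * l1(ThetaA - ThetaB)
}

# FLs: lasso on the sum of the trans-effects plus the same fused lasso term
penaltyFLs <- function(ThetaA, ThetaB, lambda1, lambdaF) {
  lambda1 * l1(ThetaA + ThetaB) + lambdaF * l1(ThetaA - ThetaB)
}

# two toy 2 x 2 trans-effect matrices that differ in the sign of one entry
ThetaA <- matrix(c(0, 0.5, -0.2, 0), nrow = 2)
ThetaB <- matrix(c(0, 0.5,  0.2, 0), nrow = 2)

flValue  <- penaltyFL(ThetaA, ThetaB, lambda1 = 1, lambdaF = 2)
flsValue <- penaltyFLs(ThetaA, ThetaB, lambda1 = 1, lambdaF = 2)
c(FL = flValue, FLs = flsValue)
```

For these toy matrices the FL penalty evaluates to 2.2 and the FLs penalty to 1.8; the gap comes entirely from the single entry whose sign differs between the two groups.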

The use of this penalty may be motivated as follows. The two samples once shared a common network architecture, and only a relatively limited number of edges are expected to have changed. Hence, the majority of edges will have the same sign in both samples, in which case the two penalties are equal. Another motivation for this second penalty arises from the observation that it is computationally faster. And, as

λ_1 \| Θ^{(a)} \|_1 + λ_1 \| Θ^{(b)} \|_1 ≥ λ_1 \| Θ^{(a)} + Θ^{(b)} \|_1,

it penalizes less. As such, the resulting FLs penalized estimates may be used as starting values for fitting the model with the FL penalty.
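The warm-start idea can be sketched as a two-step wrapper. The wrapper name warmStartFL is hypothetical. It assumes that the sigaR package is installed and that the Cis, Trans1 and Trans2 slots of the fitted object (as accessed in the Examples) are in the form expected by the startCis, startTrans1 and startTrans2 arguments; that compatibility is an assumption, not something confirmed here.

```r
# Sketch only: calling this function requires the sigaR package.
warmStartFL <- function(Y, X, id, lambda1, lambdaF) {
  # step 1: fit with the simplified (and faster) FLs penalty
  fitFLs <- sigaR::pathway2sample(Y = Y, X = X, id = id,
                                  lambda1 = lambda1, lambdaF = lambdaF,
                                  method = "FLs")

  # step 2: refit with the FL penalty, started from the FLs estimates
  # (assumed: the slots can be passed directly as starting values)
  sigaR::pathway2sample(Y = Y, X = X, id = id,
                        lambda1 = lambda1, lambdaF = lambdaF,
                        method = "FL",
                        startCis = fitFLs@Cis,
                        startTrans1 = fitFLs@Trans1,
                        startTrans2 = fitFLs@Trans2)
}
```

With data laid out as in the Examples, a call could look like warmStartFL(rbind(Y1, Y2), rbind(X1, X2), id, lambda1 = 0.1, lambdaF = 0.01), where the penalty values are arbitrary.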

Value

Object of class pathwayFit.

Author(s)

Wessel N. van Wieringen: w.vanwieringen@vumc.nl

References

Van Wieringen, W.N., Van de Wiel, M.A. (2012), "Modeling the cis- and trans-effect of DNA copy number aberrations on gene expression levels in a pathway", submitted for publication.

See Also

See also pathwayFit and pathway1sample.

Examples

# load the packages used below (rmvnorm is provided by mvtnorm)
library(sigaR)
library(mvtnorm)

# set number of genes (p) and samples (n)
p <- 10
n <- 1000

# sample cis-effects
beta <- abs(rnorm(p))

# sample trans-effects for first sample
Theta1 <- matrix(sample(c(-1,1), p^2, replace=TRUE, prob=c(0.2, 0.8)), ncol=p) * 
	matrix(runif(p^2), ncol=p) / 4
diag(Theta1) <- 1

# sample trans-effects for second sample
idDiff <- sample(which(Theta1 != 1), 10)
Theta2 <- Theta1
Theta2[idDiff] <- -Theta1[idDiff]

# sample error variances
Sigma <- diag(rchisq(p, df=1)/5 + 0.5)

# sample DNA copy number data of sample 1
X1 <- matrix(runif(n*p, min=-2, max=2), ncol=p)

# sample gene expression data
Y1 <- t(apply(X1, 1, function(Y, beta){ Y * beta }, beta=beta)) %*% t(solve(Theta1)) + 
	rmvnorm(n, sigma=solve(Theta1) %*% Sigma %*% t(solve(Theta1)))

# sample DNA copy number data of sample 2
X2 <- matrix(runif(n*p, min=-2, max=2), ncol=p)

# sample gene expression data
Y2 <- t(apply(X2, 1, function(Y, beta){ Y * beta }, beta=beta)) %*% t(solve(Theta2)) + 
	rmvnorm(n, sigma=solve(Theta2) %*% Sigma %*% t(solve(Theta2)))

# construct id-vector
id <- c(rep(0, n), rep(1, n))

# fit model
pFit <- pathway2sample(Y=rbind(Y1, Y2), X=rbind(X1, X2), id=id, lambda1=0, lambdaF=0.01)

# compare fit to "truth" for cis-effects
plot(pFit@Cis ~ beta, pch=20)

# compare fit to "truth" for differential trans-effects
penFits1 <- c(pFit@Trans1[upper.tri(Theta1)], pFit@Trans1[lower.tri(Theta1)])
penFits2 <- c(pFit@Trans2[upper.tri(Theta2)], pFit@Trans2[lower.tri(Theta2)])
truth1 <- c(Theta1[upper.tri(Theta1)], Theta1[lower.tri(Theta1)])
truth2 <- c(Theta2[upper.tri(Theta2)], Theta2[lower.tri(Theta2)])
plot(penFits1 - penFits2, truth1 - truth2, pch=20)
cor(penFits1 - penFits2, truth1 - truth2, method="spearman")

sigaR documentation built on April 28, 2020, 6:05 p.m.