BayRepulsive_known: BayRepulsive_known is a deconvolution function designed for...


View source: R/BayRepulsive.R

Description

This function deconvolves the observed matrix and returns the estimated components together with the sum of squared residuals.

Usage

BayRepulsive_known(DATA, K, Nobs, Nfeature,
                  Niter = 100, epsilon = 0.0001, tau = 100,
                  a.theta = 0.01, b.theta = 0.01, seed = 1 )

Arguments

DATA

The observed data matrix. Each row represents a feature (gene); each column represents a sample.

K

The number of subclones.

Nobs

The number of samples, i.e., the number of columns of DATA.

Nfeature

The number of features, i.e., the number of rows of DATA.

Niter

The maximum number of iterations.

epsilon

Tolerance for convergence. Convergence is assessed on the estimated weight matrix: iteration stops when the L2-norm distance between two successive estimates of the weight matrix is less than epsilon.

tau

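A hyperparameter of the DPP prior (τ in Details). A large value is preferred. See Details.

a.theta

A hyperparameter of the DPP prior: the first parameter of the Gamma prior on θ. See Details.

b.theta

A hyperparameter of the DPP prior: the second parameter of the Gamma prior on θ. See Details.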

seed

The random seed.
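
The stopping rule described for epsilon can be sketched as follows. This is a minimal illustration, not the package's internal code: the helper name has_converged is hypothetical, and the L2 distance is assumed here to mean the Frobenius norm of the difference between two successive weight matrices.

```r
# Hypothetical helper (not part of BayRepulsive): TRUE when two successive
# weight matrices are closer than epsilon in Frobenius (entrywise L2) norm.
has_converged <- function(W_old, W_new, epsilon = 0.0001) {
  sqrt(sum((W_new - W_old)^2)) < epsilon
}

W_prev <- matrix(c(0.2, 0.8, 0.5, 0.5), nrow = 2)
W_next <- W_prev + 1e-6

has_converged(W_prev, W_next)        # tiny change: iteration would stop
has_converged(W_prev, W_prev + 0.1)  # large change: iteration would continue
```

With Niter as an upper bound, a loop built on this check would end at whichever comes first: the tolerance being met or the iteration limit being reached.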

Details

Given an observed matrix whose columns are mixed samples from a known number of subclones, the function returns the deconvolution results.

The deconvolution model is based on the assumption that

Y = ZW+E.

Here Y is the observed matrix DATA, Z is the subclone-specific expression matrix, W is the weight matrix, and E is a matrix whose entries are independent white noise with unknown variance σ^2.

Each column W_j of W has the prior W_j ~ Dir(α), where α is a vector of ones. The noise variance has a uniform prior, σ^2 ~ U(0, 10^6).

The prior on the subclone-specific expression matrix Z is a fixed-size determinantal point process (DPP) (Kulesza and Taskar, 2011). Suppose there are K subclones and let Z_k be the expression profile of subclone k. Zero-mean multivariate normal densities are commonly used as quality functions in a DPP. Because the subclone-specific expression matrix is nonnegative, we consider the transformation \tilde{Z}_k = Z_k - \bar{Y}, where the vector \bar{Y} contains each gene's mean expression level across the samples in DATA.

The prior of (\tilde{Z}_1, ..., \tilde{Z}_K) is proportional to the determinant of a K x K matrix L with entries

L_ij = q(\tilde{Z}_i) φ(\tilde{Z}_i, \tilde{Z}_j) q(\tilde{Z}_j),

where φ(\tilde{Z}_i, \tilde{Z}_j) = exp{||\tilde{Z}_i - \tilde{Z}_j||^2 / θ^2} and q is the density of a multivariate normal distribution with zero mean vector and covariance matrix τ^2 I. Here τ is the argument tau, and θ is an unknown parameter with prior θ ~ Gamma(a.theta, b.theta), where a.theta and b.theta are the arguments of the same names.
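
The data-generating part of the model can be sketched in base R. This is an illustration under stated assumptions, not the package's sampler: the dimensions, the noise level sigma, and the exponential draws used to fill Z are arbitrary choices made only so that the example runs. The Dirichlet columns of W are obtained by normalising independent Gamma(1, 1) draws.

```r
set.seed(1)
K        <- 3    # number of subclones (illustrative)
Nfeature <- 5    # number of genes (illustrative)
Nobs     <- 4    # number of mixed samples (illustrative)
sigma    <- 0.5  # noise standard deviation (illustrative)

# Z: nonnegative subclone-specific expression, one column per subclone.
# The exponential distribution is an arbitrary stand-in, not the DPP prior.
Z <- matrix(rexp(Nfeature * K, rate = 0.1), nrow = Nfeature, ncol = K)

# W: each column drawn from Dir(1, ..., 1) by normalising Gamma(1, 1) draws.
G <- matrix(rgamma(K * Nobs, shape = 1, rate = 1), nrow = K, ncol = Nobs)
W <- sweep(G, 2, colSums(G), "/")

# E: independent Gaussian noise with variance sigma^2.
E <- matrix(rnorm(Nfeature * Nobs, sd = sigma), nrow = Nfeature, ncol = Nobs)

# Observed matrix under the model Y = ZW + E.
Y <- Z %*% W + E

# Centering used before the DPP prior: subtract each gene's mean across samples.
Y_bar   <- rowMeans(Y)
Z_tilde <- Z - Y_bar   # Y_bar is recycled down each column of Z
```

Each column of W sums to one, so every column of Y is a convex combination of the subclone profiles plus noise.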

Value

A list with the following components:

Z The estimated subclone-specific expression matrix.
W The estimated weight matrix.
C The sum of squared residuals, used as a measure of performance.
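
As a rough guide to what a residual-based measure looks like, the sketch below computes the sum of squared residuals from a data matrix and a pair of factor matrices. It assumes the residuals are the entries of DATA - ZW, following the model in Details; whether C is computed in exactly this way inside the package is not confirmed here. The matrices are small stand-ins, not output of BayRepulsive_known.

```r
# Stand-in objects; in practice Z and W would be taken from the returned list.
Z    <- matrix(c(1, 2, 3, 4), nrow = 2)              # 2 features x 2 subclones
W    <- matrix(c(0.5, 0.5, 0.25, 0.75), nrow = 2)    # 2 subclones x 2 samples
DATA <- Z %*% W + matrix(c(0.1, -0.1, 0, 0.2), nrow = 2)

# Residuals of the fit and their sum of squares.
residuals <- DATA - Z %*% W
rss       <- sum(residuals^2)
rss
```

A smaller value indicates that the product ZW reproduces DATA more closely.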

Source

BayRepulsive: A Bayesian Repulsive Deconvolution Model for Inferring Tumor Heterogeneity

References

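Kulesza, A. and Taskar, B. (2011). k-DPPs: Fixed-size determinantal point processes. In Proceedings of the 28th International Conference on Machine Learning (ICML).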

Examples

rm(list = ls())
library(BayRepulsive)
data(CCLE)
set.seed(1)
Nobs     <- 24
Nfeature <- 100
K0       <- 3
### randomly generate the weight matrix W for the 24 mixed samples;
### each column is a normalized vector of gamma draws, so it sums to 1
W        <- matrix(0, nrow = K0, ncol = Nobs)
for (i in seq_len(Nobs)) {
  Theta  <- rgamma(K0, 1 / K0, 1)
  W[, i] <- Theta / sum(Theta)
}
### add some noise
error    <- t(matrix(rnorm(Nfeature * Nobs, mean = 0, sd = 0.5), nrow = Nobs))
DATA     <- CCLE$Z %*% W + error
### Note: make sure DATA contains no negative values after adding the noise
result1  <- BayRepulsive_known(DATA = DATA, K = K0, Nobs = Nobs,
                               Nfeature = Nfeature)
### correlation between the estimated and the true weights
cor(as.vector(result1$W), as.vector(W))
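
The note about negative values, together with the requirement that Nobs and Nfeature match the dimensions of DATA, can be turned into a small check that runs before the call. The function check_deconv_input below is a hypothetical helper written for this illustration; it is not part of the BayRepulsive package.

```r
# Hypothetical pre-flight check (not part of BayRepulsive).
# Stops with an error if DATA does not match the stated dimensions
# or contains negative values; returns TRUE invisibly otherwise.
check_deconv_input <- function(DATA, Nobs, Nfeature) {
  stopifnot(is.matrix(DATA), is.numeric(DATA))
  if (nrow(DATA) != Nfeature) stop("Nfeature must equal nrow(DATA)")
  if (ncol(DATA) != Nobs)     stop("Nobs must equal ncol(DATA)")
  if (any(DATA < 0))          stop("DATA contains negative values")
  invisible(TRUE)
}

toy <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 3)   # 3 features x 2 samples
check_deconv_input(toy, Nobs = 2, Nfeature = 3)
```

In the example above, the same check could be applied as check_deconv_input(DATA, Nobs = Nobs, Nfeature = Nfeature) right after the noise is added.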

bruce1995/BayRepulsive documentation built on May 4, 2019, 9:50 a.m.