This function deconvolves the observed matrix into a subclone-specific expression matrix and a weight matrix for a known number of subclones, and returns them along with the sum of squared residuals.
BayRepulsive_known(DATA, K, Nobs, Nfeature,
                   Niter = 100, epsilon = 0.0001, tau = 100,
                   a.theta = 0.01, b.theta = 0.01, seed = 1)
DATA |
The observed data matrix. Each row represents a feature (gene); each column represents a sample. |
K |
The number of subclones. |
Nobs |
The number of samples, i.e., the number of columns of DATA. |
Nfeature |
The number of features, i.e., the number of rows of DATA. |
Niter |
The maximum number of iterations. |
epsilon |
Tolerance for convergence. Convergence is assessed on the estimated weight matrix: iteration stops when the L2-norm distance between two successive estimates of the weight matrix is less than epsilon. |
tau |
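The scale τ of the normal quality function in the DPP prior (the covariance is τ^2 I). A large value is preferred. See Details. |
a.theta |
The first parameter of the Gamma(a.theta, b.theta) prior on the DPP kernel parameter θ. See Details. |
b.theta |
The second parameter of the Gamma(a.theta, b.theta) prior on the DPP kernel parameter θ. See Details. |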
seed |
The random seed. |
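The stopping rule described for epsilon can be sketched in base R as follows. This is only an illustration: the helper name has_converged is hypothetical, and reading "the distance induced by the L2 norm" as the Frobenius norm of the difference between the two weight matrices is an assumption, not a statement about the package internals.

```r
# Hypothetical helper (not part of BayRepulsive): TRUE when two successive
# weight-matrix estimates are closer than epsilon.
has_converged <- function(W_old, W_new, epsilon = 1e-4) {
  # Assumption: the L2 distance is taken entrywise (Frobenius norm).
  sqrt(sum((W_new - W_old)^2)) < epsilon
}

W_old <- matrix(c(0.2, 0.8, 0.5, 0.5), nrow = 2)

small_step <- has_converged(W_old, W_old + 1e-6)  # tiny change between iterations
large_step <- has_converged(W_old, W_old + 0.1)   # substantial change between iterations
```

With the default epsilon = 0.0001, the first comparison signals convergence and the second does not, so a loop of at most Niter iterations would break after the first case only.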
Given an observed matrix whose columns are mixed samples from a known number of subclones, the function returns the deconvolution results.
The deconvolution model is based on the assumption that
Y = ZW + E.
Here Y is the observed matrix DATA (Nfeature x Nobs);
Z is the subclone-specific expression matrix (Nfeature x K);
W is the weight matrix (K x Nobs);
E is a matrix whose entries are independent white noise with unknown variance σ^2.
We assume each column W_j of W has the prior W_j ~ Dir(α), where α is a vector of ones.
We also assume a vague uniform prior for σ^2: σ^2 ~ U(0, 10^6).
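To make the model concrete, here is a minimal base R sketch that simulates one data set from Y = ZW + E under the Dirichlet prior with all-ones α. The helper rdirichlet_flat, the toy dimensions, the uniform draw for Z, and the noise level are all assumptions chosen for illustration; none of them comes from the package.

```r
set.seed(1)
K <- 3; Nobs <- 5; Nfeature <- 4

# Hypothetical helper: one draw from Dir(1, ..., 1), obtained by normalizing
# independent Gamma(shape = 1, rate = 1) variates.
rdirichlet_flat <- function(K) {
  g <- rgamma(K, shape = 1, rate = 1)
  g / sum(g)
}

# W is K x Nobs: each column holds the mixing weights of one sample.
W <- sapply(seq_len(Nobs), function(j) rdirichlet_flat(K))

# Toy nonnegative subclone profiles (Nfeature x K); an arbitrary choice.
Z <- matrix(runif(Nfeature * K, min = 0, max = 10), nrow = Nfeature)

# Independent white noise with standard deviation sigma = 0.5.
E <- matrix(rnorm(Nfeature * Nobs, mean = 0, sd = 0.5), nrow = Nfeature)

Y <- Z %*% W + E  # observed matrix, Nfeature x Nobs
```

Each column of W is nonnegative and sums to one, so each column of ZW is a convex combination of the subclone profiles.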
We use a fixed-size determinantal point process (k-DPP; Kulesza and Taskar, 2011)
as a prior for the subclone-specific expression matrix Z.
Suppose there are K subclones and let Z_k be the expression profile of subclone k.
Mean-zero multivariate normal density functions are commonly used as quality functions in DPPs.
Since the subclone-specific expression matrix is nonnegative, we consider the transformation
\tilde{Z}_k = Z_k - \bar{Y}, where the vector \bar{Y} holds the average expression level of each gene in DATA.
The prior of (\tilde{Z}_1, ..., \tilde{Z}_K) is proportional to the determinant of a K x K matrix L,
defined by L_{ij} = q(\tilde{Z}_i) φ(\tilde{Z}_i, \tilde{Z}_j) q(\tilde{Z}_j),
where φ(\tilde{Z}_i, \tilde{Z}_j) = exp{-||\tilde{Z}_i - \tilde{Z}_j||^2 / θ^2}
and q(\tilde{Z}_j) is the density function of a multivariate normal distribution with mean zero and covariance τ^2 I.
Here τ is the argument tau of the function,
and θ is an unknown parameter with prior θ ~ Gamma(a.theta, b.theta),
where a.theta and b.theta are the arguments of the same names.
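The repulsive effect of this prior can be illustrated with a short base R sketch that evaluates log det(L) for a given Z. The function name dpp_log_prior is hypothetical, and the sketch assumes the Gaussian similarity φ = exp{-||·||^2 / θ^2}, which decreases with distance; it returns only the unnormalized log prior of Z for fixed θ and is not the package's implementation.

```r
# Hypothetical helper (not part of BayRepulsive): unnormalized log prior
# log det(L), with L_ij = q(Zt_i) * phi(Zt_i, Zt_j) * q(Zt_j).
dpp_log_prior <- function(Z, Ybar, tau, theta) {
  p  <- nrow(Z)
  Zt <- Z - Ybar  # centers every column; requires length(Ybar) == nrow(Z)
  # Log quality: log density of N(0, tau^2 I) evaluated at each column.
  log_q <- -0.5 * p * log(2 * pi * tau^2) - colSums(Zt^2) / (2 * tau^2)
  # Squared Euclidean distances between columns, then the similarity matrix.
  D2  <- as.matrix(dist(t(Zt)))^2
  Phi <- exp(-D2 / theta^2)
  # Since L = diag(q) Phi diag(q): log det(L) = 2 * sum(log q) + log det(Phi).
  log_det_phi <- as.numeric(determinant(Phi, logarithm = TRUE)$modulus)
  2 * sum(log_q) + log_det_phi
}

Ybar   <- c(1, 1)
Z_far  <- cbind(c(0, 0), c(3, 0))    # two well-separated profiles
Z_near <- cbind(c(0, 0), c(0.1, 0))  # two nearly identical profiles

far  <- dpp_log_prior(Z_far,  Ybar, tau = 100, theta = 1)
near <- dpp_log_prior(Z_near, Ybar, tau = 100, theta = 1)
```

With a large tau the quality term is nearly flat, so the comparison is driven by det(Phi): nearly identical profiles make Phi close to singular and are penalized, which is why far exceeds near here.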
A list with the following components:
Z | The estimated subclone-specific expression matrix. |
W | The estimated weight matrix. |
C | The sum of squared residuals, used as a measure of performance. |
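As an illustration of the component C, the sum of squared residuals for a fitted pair (Z, W) can be sketched in base R as below. The helper name residual_ss is hypothetical, and whether the package computes C in exactly this way is an assumption.

```r
# Hypothetical helper (not part of BayRepulsive): sum of squared residuals
# of the model Y = ZW + E for given Z and W.
residual_ss <- function(Y, Z, W) {
  sum((Y - Z %*% W)^2)
}

Z <- matrix(c(1, 2, 3, 4), nrow = 2)          # 2 features x 2 subclones
W <- matrix(c(0.5, 0.5, 0.2, 0.8), nrow = 2)  # 2 subclones x 2 samples
Y <- Z %*% W                                  # noise-free observation

exact_fit   <- residual_ss(Y, Z, W)      # residuals are all zero
shifted_fit <- residual_ss(Y + 1, Z, W)  # each of the 4 entries is off by 1
```

Smaller values indicate that the product of the estimated Z and W reproduces DATA more closely.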
BayRepulsive: A Bayesian Repulsive Deconvolution Model for Inferring Tumor Heterogeneity
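Kulesza, A. and Taskar, B. (2011). k-DPPs: Fixed-size determinantal point processes. In Proceedings of the 28th International Conference on Machine Learning (ICML).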
rm(list=ls())
library(BayRepulsive)
data(CCLE)
set.seed(1)
Nobs <- 24
Nfeature <- 100
K0 <- 3
### randomly generate weight matrix W for 24 mixing samples
W <- matrix(0, nrow = K0, ncol = Nobs)
for(i in 1:Nobs){
  Theta <- rgamma(K0, 1/K0, 1)
  W[,i] <- Theta/sum(Theta)
}
### add some noise
error <- t(matrix(rnorm(Nfeature * Nobs, mean = 0, sd = 0.5), nrow = Nobs))
DATA <- CCLE$Z %*% W + error
### Note: please make sure that there are no negative values after adding the noise
result1 <- BayRepulsive_known(DATA = DATA, K = K0, Nobs = Nobs,
                              Nfeature = Nfeature)
cor(as.vector(result1$W), as.vector(W))
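The note about negative values in the example can be checked with a few lines of base R. The sketch below uses a small toy matrix instead of CCLE, and truncating negative entries at zero is only one possible remedy, assumed here for illustration; the package documentation does not prescribe how to handle them.

```r
set.seed(1)

# Toy nonnegative signal plus Gaussian noise (stand-in for CCLE$Z %*% W + error).
signal <- matrix(runif(12, min = 0, max = 2), nrow = 4)
noisy  <- signal + matrix(rnorm(12, mean = 0, sd = 0.5), nrow = 4)

# Count the entries that became negative after adding the noise.
n_negative <- sum(noisy < 0)

# Assumed remedy: truncate negative entries at zero (pmax keeps the dimensions).
noisy_clean <- pmax(noisy, 0)
```

Running the same check on DATA before calling BayRepulsive_known shows whether any entries need attention.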