View source: R/function_eval_cor.R
| evaluate_cor | R Documentation |
Loss-function learning digital tissue deconvolution (DTD) finds a vector g
that minimizes the loss function L:
L(g) = - ∑ cor(true_C, estimated_C(g))
The evaluate_cor function returns the value of the Loss function.
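The formula above can be illustrated with a minimal base R sketch. It assumes the sum runs over cell types, that is, one Pearson correlation per row of the composition matrix, taken across all mixtures. The helper name sketch_cor_loss is hypothetical and is not part of the DTD package; evaluate_cor itself additionally estimates the compositions from the model before computing the loss.

```r
# Hypothetical helper for illustration only, not the DTD implementation.
# true.C and estimated.C: numeric matrices, cell types as rows, mixtures as columns.
sketch_cor_loss <- function(true.C, estimated.C) {
  stopifnot(identical(dim(true.C), dim(estimated.C)))
  # One Pearson correlation per cell type (row), taken across all mixtures (columns).
  per.type.cor <- vapply(
    seq_len(nrow(true.C)),
    function(i) stats::cor(true.C[i, ], estimated.C[i, ]),
    numeric(1)
  )
  # The loss is the negative sum, so better estimates give smaller values.
  -sum(per.type.cor)
}

set.seed(1)
true.C <- matrix(runif(3 * 20), nrow = 3)       # 3 cell types, 20 mixtures
noise <- matrix(rnorm(3 * 20, sd = 0.5), nrow = 3)
perfect <- sketch_cor_loss(true.C, true.C)      # every correlation equals 1
noisy <- sketch_cor_loss(true.C, true.C + noise)
```

Under this reading, a perfect estimate of three cell types gives a loss of -3, and dividing the loss by the number of cell types yields the negative average correlation.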
evaluate_cor( X.matrix = NA, new.data, true.compositions, DTD.model, estimate.c.type )
X.matrix |
numeric matrix, with features/genes as rows, and cell types as columns. Each column of X.matrix is a reference expression profile. A trained DTD model includes the X.matrix it has been trained on. Therefore, X.matrix should only be set if 'DTD.model' is not a DTD model. |
new.data |
numeric matrix with samples as columns, and features/genes as rows. |
true.compositions |
numeric matrix with cell types as rows, and mixtures as columns. Named true_C in the formula above. Each row of true_C holds the distribution of one cell type over all mixtures. |
DTD.model |
either a numeric vector with length of nrow(X), or a list
returned by |
estimate.c.type |
string, either "non_negative", or "direct". Indicates how the algorithm finds the solution of arg min_C ||diag(g)(Y - XC)||_2.
|
float, value of the Loss function
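To make the estimate.c.type argument more concrete, the following is a minimal sketch of an unconstrained ("direct") solution of arg min_C ||diag(g)(Y - XC)||_2, following the formula exactly as written above. The helper name sketch_direct_C is hypothetical, and the package's internal solver may differ, for example in how g enters the weighting. The "non_negative" variant would additionally require all entries of C to be non-negative and is not sketched here.

```r
# Hypothetical helper for illustration only, not the DTD implementation.
# X: features x cell types, Y: features x mixtures, g: one weight per feature.
sketch_direct_C <- function(X, Y, g) {
  stopifnot(nrow(X) == nrow(Y), length(g) == nrow(X))
  # Scaling row i by g[i] is the same as multiplying by diag(g) from the left.
  Xg <- X * g
  Yg <- Y * g
  # Normal equations of the weighted least-squares problem:
  # (Xg' Xg) C = Xg' Yg
  solve(crossprod(Xg), crossprod(Xg, Yg))
}

set.seed(1)
X <- matrix(runif(50 * 3), nrow = 50)        # 50 features, 3 cell types
C.true <- matrix(runif(3 * 4), nrow = 3)     # 3 cell types, 4 mixtures
Y <- X %*% C.true                            # noise-free mixtures
C.hat <- sketch_direct_C(X, Y, g = rep(1, nrow(X)))
C.hat.weighted <- sketch_direct_C(X, Y, g = runif(nrow(X), 0.5, 2))
```

On noise-free mixtures the estimate recovers the true compositions for any strictly positive g. The choice of g only matters once Y deviates from XC, which is the situation the learned weights are meant to address.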
library(DTD)
random.data <- generate_random_data(
n.types = 10,
n.samples.per.type = 150,
n.features = 250,
sample.type = "Cell",
feature.type = "gene"
)
# normalize all samples to the same amount of counts:
normalized.data <- normalize_to_count(random.data)
# extract indicator list.
# This list contains the Type of the sample as value, and the sample name as name
indicator.list <- gsub("^Cell[0-9]*\\.", "", colnames(random.data))
names(indicator.list) <- colnames(random.data)
# extract reference matrix X
# First, decide which cells should be deconvoluted.
# Notice, in the mixtures there can be more cells than in the reference matrix.
include.in.X <- paste0("Type", 2:7)
percentage.of.all.cells <- 0.2
sample.X <- sample_random_X(
included.in.X = include.in.X,
pheno = indicator.list,
expr.data = normalized.data,
percentage.of.all.cells = percentage.of.all.cells
)
X.matrix <- sample.X$X.matrix
samples.to.remove <- sample.X$samples.to.remove
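# remove the samples that were used to build X.matrix.
# Logical indexing is used instead of -which(...), which would drop every
# column if no sample name matched.
remaining.mat <- normalized.data[, !colnames(normalized.data) %in% samples.to.remove]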
indicator.remain <- indicator.list[names(indicator.list) %in% colnames(remaining.mat)]
training.data <- mix_samples(
expr.data = remaining.mat,
pheno = indicator.remain,
included.in.X = include.in.X,
n.samples = 500,
n.per.mixture = 100,
verbose = FALSE
)
start.tweak <- rep(1, nrow(X.matrix))
sum.cor <- evaluate_cor(
  X.matrix = X.matrix,
  new.data = training.data$mixtures,
  true.compositions = training.data$quantities,
  DTD.model = start.tweak,
  estimate.c.type = "direct"
)
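# the loss is the negative sum of correlations, one per cell type in X.matrix.
# Dividing by the number of cell types and flipping the sign gives the
# average correlation per cell type.
rel.cor <- sum.cor / ncol(X.matrix)
cat("Relative correlation: ", -rel.cor, "\n")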