estimate_c: Estimating C

View source: R/function_estimateC.R

estimate_c    R Documentation

Description

Given a reference matrix X, a matrix of bulks Y, and a vector g, "estimate_c" finds the solution of

arg min_C || diag(g) (Y - XC) ||_2

It uses one of two approaches:

  • 'direct' solution:

    C(g) = (X^T Γ X )^(-1) X^T Γ Y

  • 'non_negative' solution, where C_i ≥ 0
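
The 'direct' formula can be illustrated with base R alone. The following is a minimal sketch, not the package's implementation: the data are simulated, all variable names are hypothetical, and it assumes Γ = diag(g), which this page does not state explicitly. Under that assumption the closed form is the weighted least squares solution with weights g, so it can be cross-checked against lm.wfit from the stats package.

```r
# Sketch of the 'direct' solution using base R only (no DTD functions).
set.seed(1)
n.features <- 50
n.types <- 4
n.samples <- 3

# Simulated inputs (hypothetical names, for illustration only)
X <- matrix(runif(n.features * n.types), nrow = n.features)    # reference profiles
C.true <- matrix(runif(n.types * n.samples), nrow = n.types)   # true compositions
noise <- matrix(rnorm(n.features * n.samples, sd = 0.01), nrow = n.features)
Y <- X %*% C.true + noise                                      # bulk profiles
g <- runif(n.features, min = 0.5, max = 2)                     # hypothetical g-vector

# Assumption: Gamma = diag(g).
# X * g scales row i of X by g[i], so t(X * g) equals t(X) %*% diag(g).
XtG <- t(X * g)

# C(g) = (X^T Gamma X)^(-1) X^T Gamma Y, solved without forming the inverse
C.direct <- solve(XtG %*% X, XtG %*% Y)

# Cross-check: weighted least squares with weights g
C.wls <- unname(lm.wfit(x = X, y = Y, w = g)$coefficients)

dim(C.direct)                  # n.types rows, n.samples columns
all.equal(C.direct, C.wls)
```

Using solve(A, B) rather than solve(A) %*% B avoids computing the explicit inverse, which is usually the more stable choice numerically.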

Usage

estimate_c(
  X.matrix = NA,
  new.data,
  DTD.model,
  estimate.c.type = "decide.on.model"
)

Arguments

X.matrix

numeric matrix with features/genes as rows and cell types as columns. Each column of X.matrix is a reference expression profile. A trained DTD model already includes the X.matrix it was trained on. Therefore, X.matrix should only be set if 'DTD.model' is not a DTD model.

new.data

numeric matrix with samples as columns and features/genes as rows. Denoted Y in the formula above.

DTD.model

either a numeric vector of length nrow(X.matrix), or a list returned by train_deconvolution_model, DTD_cv_lambda_cxx, or descent_generalized_fista. In the equation above, DTD.model provides the vector g.

estimate.c.type

string, either "non_negative" or "direct". Indicates how the algorithm finds the solution of arg min_C ||diag(g)(Y - XC)||_2.

  • If 'estimate.c.type' is set to "direct", there is no regularization (see estimate_c).

  • If 'estimate.c.type' is set to "non_negative", the estimates "C" must not be negative (non-negative least squares; see estimate_nn_c).
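
To make the non-negativity constraint concrete, the sketch below solves a small g-weighted least squares problem with the bound c ≥ 0 using optim with method "L-BFGS-B" from base R. This is a swapped-in technique chosen for illustration, not the solver used by estimate_nn_c; the data are simulated, all names are hypothetical, and the loss assumes the weights g enter as sum(g * residual^2).

```r
# Sketch: non-negative weighted least squares for one bulk sample, base R only.
set.seed(2)
n.features <- 40
n.types <- 3

X <- matrix(runif(n.features * n.types), nrow = n.features)  # reference profiles
c.true <- c(0.7, 0.3, 0)                                      # third cell type absent
y <- as.vector(X %*% c.true) + rnorm(n.features, sd = 0.05)   # noisy bulk profile
g <- rep(1, n.features)                                       # hypothetical g-vector

# Weighted squared loss and its gradient with respect to c
loss <- function(c.vec) {
  sum(g * (y - as.vector(X %*% c.vec))^2)
}
grad <- function(c.vec) {
  as.vector(-2 * t(X) %*% (g * (y - as.vector(X %*% c.vec))))
}

# Box-constrained optimisation: every entry of c is bounded below by 0
fit <- optim(
  par = rep(0.1, n.types),
  fn = loss,
  gr = grad,
  method = "L-BFGS-B",
  lower = 0
)
c.nn <- fit$par

# Unconstrained solution for comparison; its entries may be negative
c.unconstrained <- as.vector(solve(t(X * g) %*% X, t(X * g) %*% y))
```

Comparing c.nn with c.unconstrained shows the effect of the constraint: when the unconstrained estimate of an absent cell type falls below zero, the constrained fit holds it at the bound instead.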

Value

numeric matrix with ncol(X.matrix) rows (one per cell type) and ncol(new.data) columns (one per sample)

Examples

library(DTD)
set.seed(1)
# simulate random data:
random.data <- generate_random_data(
  n.types = 5,
  n.samples.per.type = 1,
  n.features = 100
)

# simulate a true c
# (this is not used by the estimate_c function, it is only used to show the result!)
true.c <- rnorm(n = ncol(random.data), mean = 0.1, sd = 0.5)

# calculate bulk y = Xc * some_error
bulk <- random.data %*% true.c * rnorm(n = nrow(random.data), mean = 1, sd = 0.01)

# estimate c
estimated.c <- estimate_c(
  X.matrix = random.data,
  new.data = bulk,
  DTD.model = rep(1, nrow(random.data)),
  estimate.c.type = "direct"
)
# visualize that the estimated c is close to the true c
plot(true.c, estimated.c)

estimated.nn.c <- estimate_c(
  X.matrix = random.data,
  new.data = bulk,
  DTD.model = rep(1, nrow(random.data)),
  estimate.c.type = "non_negative"
)
# visualize the non-negative estimates of c (note the y axis: no estimate is below 0)
plot(true.c, estimated.nn.c)

MarianSchoen/DTD documentation built on April 29, 2022, 1:59 p.m.