train_deconvolution_model: Train a DTD model based on correlation loss function
In MarianSchoen/DTD: Digital Tissue Deconvolution

View source: R/function_trainDeconvMod.R

Description

Loss-function learning Digital Tissue Deconvolution (DTD) adapts a deconvolution model to its biological context. 'train_deconvolution_model' is the main function of the DTD package.
As input, it takes the reference matrix X, a list of training data, and a start vector 'tweak'. It then iteratively searches for the vector 'g' that deconvolutes best with respect to the loss function:

L(g) = - ∑_j cor(C_{j,.}, \widehat{C}_{j,.}(g)) + λ ||g||_1
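The loss can be evaluated by hand on toy data, which may help when reading the formula. The following is a minimal base-R sketch, not the package's implementation: the data are random, the helper names `estimate_c_direct` and `dtd_loss` are hypothetical, and the estimate is taken as the plain least-squares solution of min_C ||diag(g)(Y - XC)||_2 (the package's own `estimate_c` may differ in detail).

```r
# Toy dimensions and data (all values are made up for illustration).
set.seed(1)
n_genes <- 20
n_types <- 3
n_samples <- 15

# Reference matrix X: genes x cell types.
X <- matrix(runif(n_genes * n_types, min = 1, max = 10), nrow = n_genes)
# True quantities C: cell types x samples; noise-free mixtures Y = X C.
C <- matrix(runif(n_types * n_samples), nrow = n_types)
Y <- X %*% C

# Hypothetical helper: least-squares solution of min_C || diag(g) (Y - X C) ||_2.
# `g * X` scales row i of X by g[i], which equals diag(g) %*% X.
estimate_c_direct <- function(X, Y, g) {
  qr.solve(g * X, g * Y)
}

# Hypothetical helper: negative sum of per-cell-type correlations
# between true and estimated quantities, plus the L1 penalty on g.
dtd_loss <- function(g, X, Y, C, lambda) {
  C_hat <- estimate_c_direct(X, Y, g)
  cors <- vapply(
    seq_len(nrow(C)),
    function(j) cor(C[j, ], C_hat[j, ]),
    numeric(1)
  )
  -sum(cors) + lambda * sum(abs(g))
}

g <- rep(1, n_genes)
loss <- dtd_loss(g, X, Y, C, lambda = 0.01)
print(loss)
```

Because the toy mixtures are noise-free, the estimate reproduces C, every correlation is 1, and the loss reduces to -3 + 0.01 * 20 = -2.8 up to floating-point error. With noisy mixtures the correlation term depends on g, which is what the training exploits.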

'train_deconvolution_model' calls the cross-validation function DTD_cv_lambda_cxx (or DTD_cv_lambda_R, depending on 'use.implementation') to find the optimal λ. After the cross-validation, it optimizes a model on the complete dataset with the optimal λ.
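The two-stage procedure (cross-validate over λ, then refit on all data) can be illustrated with a small base-R sketch. It is a stand-in, not the package's algorithm: `optim` with box constraints replaces the package's optimizer, the bounds on g, the λ grid, the fold count, and the helper names `cor_loss` and `fit_g` are assumptions chosen for this toy example, and the data are random.

```r
set.seed(42)
n_genes <- 12
n_types <- 3
n_samples <- 30

# Toy reference matrix, quantities, and noisy mixtures (made-up data).
X <- matrix(runif(n_genes * n_types, 1, 10), nrow = n_genes)
C <- matrix(runif(n_types * n_samples), nrow = n_types)
Y <- X %*% C + matrix(rnorm(n_genes * n_samples, sd = 0.5), nrow = n_genes)

# Correlation part of the loss, using a plain weighted least-squares estimate.
cor_loss <- function(g, X, Y, C) {
  C_hat <- qr.solve(g * X, g * Y)
  -sum(vapply(
    seq_len(nrow(C)),
    function(j) cor(C[j, ], C_hat[j, ]),
    numeric(1)
  ))
}

# Stand-in optimizer: minimize correlation loss plus L1 penalty with optim().
# The bounds keep g positive and bounded so the least-squares problem stays
# well conditioned; they are an assumption of this sketch.
fit_g <- function(g_start, X, Y, C, lambda) {
  optim(
    par = g_start,
    fn = function(g) cor_loss(g, X, Y, C) + lambda * sum(abs(g)),
    method = "L-BFGS-B",
    lower = 0.05,
    upper = 10
  )$par
}

lambda_seq <- c(0, 0.01, 0.1)  # assumed grid
n_folds <- 3                   # assumed number of folds
fold_id <- sample(rep(seq_len(n_folds), length.out = n_samples))
g_start <- rep(1, n_genes)

# For each lambda: train on all folds but one, score on the held-out fold.
cv_loss <- vapply(lambda_seq, function(lambda) {
  mean(vapply(seq_len(n_folds), function(k) {
    train <- fold_id != k
    g_k <- fit_g(g_start, X, Y[, train, drop = FALSE],
                 C[, train, drop = FALSE], lambda)
    cor_loss(g_k, X, Y[, !train, drop = FALSE], C[, !train, drop = FALSE])
  }, numeric(1)))
}, numeric(1))

# Pick the lambda with the lowest held-out loss, then refit on all samples.
best_lambda <- lambda_seq[which.min(cv_loss)]
g_final <- fit_g(g_start, X, Y, C, best_lambda)
```

Held-out folds are scored with the correlation term only, so the penalty influences which g is learned but not how the learned g is judged.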

Usage

train_deconvolution_model(
  tweak,
  X.matrix,
  train.data.list,
  test.data.list = NULL,
  estimate.c.type,
  use.implementation = "cxx",
  ...
)

Arguments

`tweak`	numeric vector of length nrow(X). In the loss function above, tweak is named "g". Note that the names of the vector are kept and are used later on.
`X.matrix`	numeric matrix with features/genes as rows and cell types as columns. Each column of X.matrix is a reference expression profile.
`train.data.list`	list with two entries, each a numeric matrix, named 'mixtures' and 'quantities'. The train/test cross-validation is done within this list (see the vignette, 'browseVignettes("DTD")', for details). Generate 'train.data.list' using `mix_samples` or `mix_samples_with_jitter`.
`test.data.list`	list with two entries, each a numeric matrix, named 'mixtures' and 'quantities'. The trained model is tested on this data, which is not shown to the optimization (see the vignette, 'browseVignettes("DTD")', for details). Generate 'test.data.list' using `mix_samples` or `mix_samples_with_jitter`.
`estimate.c.type`	string, either "non_negative" or "direct". Indicates how the algorithm finds the solution of arg min_C ||diag(g)(Y - XC)||_2. If 'estimate.c.type' is set to "direct", there is no regularization (see `estimate_c`); if it is set to "non_negative", the estimates "C" must not be negative (non-negative least squares; see `estimate_nn_c`).
`use.implementation`	string, either "R" or "cxx". Chooses between the R reference implementation and the faster C++ implementation. Note that if 'use.implementation' is set to "R", the cross-validation function `DTD_cv_lambda_R` is used.
`...`	parameters passed to `DTD_cv_lambda_cxx`, or `DTD_cv_lambda_R`
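The shapes described in the arguments table can be made concrete with a hand-built toy input. This is only a sketch under stated assumptions: the gene, cell-type, and sample names are invented, the orientation of 'mixtures' (genes x samples) and 'quantities' (cell types x samples) is inferred from Y - XC in the formula above, `check_inputs` is a hypothetical helper, and real training data would normally come from `mix_samples` or `mix_samples_with_jitter` rather than from a matrix product.

```r
set.seed(7)
genes <- paste0("gene", 1:8)          # invented feature names
cell_types <- c("Bcell", "Tcell", "NK")  # invented cell-type names
samples <- paste0("mix", 1:5)         # invented sample names

# X.matrix: features/genes as rows, one reference profile per column.
X.matrix <- matrix(runif(8 * 3, 1, 10), nrow = 8,
                   dimnames = list(genes, cell_types))

# quantities: cell types x samples, here normalized so each column sums to 1.
quantities <- matrix(runif(3 * 5), nrow = 3,
                     dimnames = list(cell_types, samples))
quantities <- sweep(quantities, 2, colSums(quantities), "/")

# mixtures: genes x samples, built as X %*% C for this toy example only.
mixtures <- X.matrix %*% quantities

train.data.list <- list(mixtures = mixtures, quantities = quantities)

# tweak: one named start value per row of X.matrix.
tweak <- setNames(rep(1, nrow(X.matrix)), rownames(X.matrix))

# Hypothetical helper: checks that the pieces fit together dimensionally.
check_inputs <- function(tweak, X.matrix, data.list) {
  stopifnot(
    is.numeric(tweak), length(tweak) == nrow(X.matrix),
    all(c("mixtures", "quantities") %in% names(data.list)),
    nrow(data.list$mixtures) == nrow(X.matrix),
    nrow(data.list$quantities) == ncol(X.matrix),
    ncol(data.list$mixtures) == ncol(data.list$quantities)
  )
  TRUE
}

inputs_ok <- check_inputs(tweak, X.matrix, train.data.list)

# With the DTD package installed, a call following the Usage section would
# look like this (not run here; further arguments may be needed via `...`):
# model <- train_deconvolution_model(
#   tweak = tweak,
#   X.matrix = X.matrix,
#   train.data.list = train.data.list,
#   estimate.c.type = "direct",
#   use.implementation = "cxx"
# )
```

Naming `tweak` after the rows of `X.matrix` follows the note that the vector's names are kept and used later on.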