train_deconvolution_model: Train a DTD model based on correlation loss function

View source: R/function_trainDeconvMod.R

train_deconvolution_model    R Documentation

Train a DTD model based on correlation loss function

Description

Loss-function learning Digital Tissue Deconvolution (DTD) adapts a deconvolution model to its biological context. 'train_deconvolution_model' is the main function of the DTD package.
As input, it takes the reference matrix X, a list of training data, and a start vector 'tweak'. It then iteratively searches for the vector 'g' that deconvolutes best, as measured by the loss function:

L(g) = - ∑_j cor(C_{j,.}, \widehat C_{j,.}(g)) + λ ||g||_1

The 'train_deconvolution_model' function calls the cross-validation function DTD_cv_lambda_cxx (or DTD_cv_lambda_R, depending on 'use.implementation') to find the optimal λ. After the cross-validation, it optimizes a model on the complete dataset with the optimal λ.
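To make the loss concrete, the following base R sketch evaluates L(g) for a fixed 'g' on synthetic data. It is an illustration under stated assumptions, not the package's implementation: the helper names 'estimate_c_sketch' and 'loss_sketch', the simulated matrices, and the layout of C and Y (cell types or genes as rows, samples as columns) are all hypothetical. The estimate of C uses the unconstrained solution of the weighted least-squares problem arg min_C ||diag(g)(Y - XC)||_2 as written in the Arguments section.

```r
# Sketch of the loss L(g); NOT the DTD package implementation.
set.seed(1)
n.genes <- 20
n.types <- 3
n.samples <- 15

# Hypothetical reference matrix X (genes x cell types)
X <- matrix(rexp(n.genes * n.types), nrow = n.genes)
# Hypothetical true quantities C (cell types x samples), columns sum to 1
C <- matrix(runif(n.types * n.samples), nrow = n.types)
C <- sweep(C, 2, colSums(C), "/")
# Simulated mixtures Y (genes x samples) with a little noise
Y <- X %*% C + matrix(rnorm(n.genes * n.samples, sd = 0.05), nrow = n.genes)

# Unconstrained estimate: arg min_C ||diag(g) (Y - X C)||_2
estimate_c_sketch <- function(X, Y, g) {
  gX <- g * X  # row i scaled by g[i], same as diag(g) %*% X
  gY <- g * Y
  solve(crossprod(gX), crossprod(gX, gY))
}

# L(g) = - sum_j cor(C[j, ], C.hat[j, ]) + lambda * ||g||_1
loss_sketch <- function(g, X, Y, C, lambda) {
  C.hat <- estimate_c_sketch(X, Y, g)
  cors <- vapply(
    seq_len(nrow(C)),
    function(j) cor(C[j, ], C.hat[j, ]),
    numeric(1)
  )
  -sum(cors) + lambda * sum(abs(g))
}

g <- rep(1, n.genes)  # start vector, analogous to 'tweak'
loss.value <- loss_sketch(g, X, Y, C, lambda = 0.01)
```

Since each correlation is at most 1, the first term is bounded below by minus the number of cell types; the λ term then penalizes large entries of 'g'.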

Usage

train_deconvolution_model(
  tweak,
  X.matrix,
  train.data.list,
  test.data.list = NULL,
  estimate.c.type,
  use.implementation = "cxx",
  ...
)

Arguments

tweak

numeric vector with length nrow(X). In the loss function above, tweak is named "g". Note that the names of the vector are kept and are used later on.

X.matrix

numeric matrix, with features/genes as rows and cell types as columns. Each column of X.matrix is a reference expression profile.

train.data.list

list with two entries, each a numeric matrix, named 'mixtures' and 'quantities'. The train/test cross-validation is done within this list (see the vignette, 'browseVignettes("DTD")', for details). Generate 'train.data.list' using mix_samples or mix_samples_with_jitter.

test.data.list

list with two entries, each a numeric matrix, named 'mixtures' and 'quantities'. The trained model is tested on this data. Note that this data is not shown to the optimization (see the vignette, 'browseVignettes("DTD")', for details). Generate 'test.data.list' using mix_samples or mix_samples_with_jitter.
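As a rough picture of the expected input shapes, the sketch below builds a toy 'X.matrix', 'tweak' and 'train.data.list' by hand in base R. It is only an illustration: the gene, cell type and sample names are made up, the orientation of 'mixtures' (genes x samples) and 'quantities' (cell types x samples) is an assumption, and real training data should come from mix_samples or mix_samples_with_jitter as described above.

```r
# Toy illustration of the input structure; all names and values are hypothetical.
set.seed(42)
genes <- paste0("gene", 1:6)
cell.types <- c("A", "B", "C")
samples <- paste0("mix", 1:4)

# Reference matrix: genes as rows, cell types as columns
X.matrix <- matrix(rexp(18), nrow = 6, dimnames = list(genes, cell.types))

# Assumed layout: cell types as rows, samples as columns, columns sum to 1
quantities <- matrix(runif(12), nrow = 3, dimnames = list(cell.types, samples))
quantities <- sweep(quantities, 2, colSums(quantities), "/")

# Assumed layout: genes as rows, samples as columns
mixtures <- X.matrix %*% quantities

# List with the two named entries described above
train.data.list <- list(mixtures = mixtures, quantities = quantities)

# Named start vector with one entry per row of X.matrix
tweak <- rep(1, nrow(X.matrix))
names(tweak) <- rownames(X.matrix)
```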

estimate.c.type

string, either "non_negative", or "direct". Indicates how the algorithm finds the solution of arg min_C ||diag(g)(Y - XC)||_2.

  • If 'estimate.c.type' is set to "direct", there is no regularization (see estimate_c).

  • If 'estimate.c.type' is set to "non_negative", the estimates "C" must not be negative (non-negative least squares; see estimate_nn_c).
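The difference between the two options can be sketched in base R for a single mixture. This is a minimal illustration, not what estimate_c or estimate_nn_c do internally: the data are simulated, and the non-negative variant is approximated here with box-constrained 'optim' (method "L-BFGS-B") instead of a dedicated non-negative least-squares solver. Minimizing the squared norm gives the same minimizer as the norm itself.

```r
# Direct vs. non-negative estimate for one mixture; hypothetical data.
set.seed(7)
X <- matrix(rexp(30), nrow = 10)  # 10 genes x 3 cell types
c.true <- c(0.7, 0.3, 0)          # third cell type is absent
y <- as.vector(X %*% c.true) + rnorm(10, sd = 0.1)
g <- rep(1, 10)                   # weights, analogous to 'tweak'

gX <- g * X  # same as diag(g) %*% X
gy <- g * y

# "direct": unconstrained weighted least squares, entries may be negative
c.direct <- as.vector(solve(crossprod(gX), crossprod(gX, gy)))

# "non_negative": same objective, but every entry constrained to be >= 0
objective <- function(c.vec) sum((gy - gX %*% c.vec)^2)
gradient <- function(c.vec) as.vector(-2 * crossprod(gX, gy - gX %*% c.vec))
c.nn <- optim(
  par = rep(1 / 3, 3),
  fn = objective,
  gr = gradient,
  method = "L-BFGS-B",
  lower = 0
)$par
```

The unconstrained solution always fits at least as well as the constrained one, but it can assign negative quantities to cell types that are absent from the mixture.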

use.implementation

string, either "R" or "cxx". Chooses between the R reference implementation and the faster C++ implementation. Note that if 'use.implementation' is set to "R", the cross-validation function DTD_cv_lambda_R is used.

...

parameters passed to DTD_cv_lambda_cxx, or DTD_cv_lambda_R

Details

For an example see 'browseVignettes("DTD")'

Value

list with 5 entries:

  • 'cv.obj' (see DTD_cv_lambda_cxx)

  • 'best.model' (see DTD_cv_lambda_cxx)

  • 'reference.X'

  • 'estimate.c.type'

  • 'pics' (see 'browseVignettes("DTD")')


MarianSchoen/DTD documentation built on April 29, 2022, 1:59 p.m.