evaluate_multi_ligand_target_prediction_regression: Evaluation of target gene value prediction for multiple...

View source: R/evaluate_model_target_prediction.R

evaluate_multi_ligand_target_prediction_regression    R Documentation

Evaluation of target gene value prediction for multiple ligands (regression).

Description

evaluate_multi_ligand_target_prediction_regression evaluates how well a trained model predicts the observed response to a combination of ligands (e.g. the absolute log fold change of genes after treatment of cells with those ligands). A regression algorithm chosen by the user is trained to construct one model from the target gene predictions of all ligands of interest (each ligand is treated as a feature). The function reports several regression model fit metrics for the prediction. In addition, variable importance scores can be extracted to rank the possibly active ligands by their importance for response prediction.
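The idea of "ligands as features" can be illustrated with a minimal base-R sketch. It does not reproduce the package's internals: the data are simulated, the ligand names (ligandA, ligandB, ligandC) are hypothetical, and plain lm() stands in for whatever algorithm is passed to the function.

```r
# Toy data: 50 genes, 3 hypothetical ligands (all values are simulated).
set.seed(1)
n_genes <- 50
scores <- data.frame(
  ligandA = runif(n_genes),
  ligandB = runif(n_genes),
  ligandC = runif(n_genes)
)
rownames(scores) <- paste0("gene", seq_len(n_genes))

# Simulated response: driven mostly by ligandA, weakly by ligandB.
response <- 2 * scores$ligandA + 0.5 * scores$ligandB + rnorm(n_genes, sd = 0.2)

# One model, with every ligand's target scores used as a feature.
fit <- lm(response ~ ., data = cbind(scores, response = response))

# Two common regression fit metrics.
r_squared <- summary(fit)$r.squared
pearson   <- cor(fitted(fit), response)
```

Because the model is fit and scored on the same genes here, these metrics correspond to training performance and are optimistic.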

Usage

evaluate_multi_ligand_target_prediction_regression(setting, ligand_target_matrix,
  ligands_position = "cols", algorithm, var_imps = TRUE, cv = TRUE,
  cv_number = 4, cv_repeats = 2, parallel = FALSE, n_cores = 4,
  ignore_errors = FALSE)

Arguments

setting

A list containing the following elements: .$name: name of the setting; .$from: name(s) of the ligand(s) active in the setting of interest; .$response: named numeric vector giving the observed response value of each target gene (e.g. the absolute log fold change after treatment with the possibly active ligand(s)).
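As a rough illustration of that structure, the sketch below builds a setting list by hand. It assumes the response holds continuous values, as the description of this function suggests (absolute log fold change); the setting name, gene names and numbers are invented.

```r
# Hypothetical setting; the name, genes and values are invented for illustration.
setting <- list(
  name = "toy_TGFB1_IL6",
  from = c("TGFB1", "IL6"),
  response = c(GENE1 = 1.8, GENE2 = 0.1, GENE3 = 0.7)
)

# Inspect the three elements.
str(setting)
```

In practice such a list is produced by convert_expression_settings_evaluation_regression, as in the Examples section, rather than written by hand.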

ligand_target_matrix

A matrix of ligand-target probability scores (recommended) or discrete target assignments (not recommended).

ligands_position

Indicate whether the ligands in the ligand-target matrix are in the rows ("rows") or columns ("cols"). Default: "cols"
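The two layouts differ only by a transpose. The following sketch uses an invented 3-gene, 2-ligand matrix to show this; the scores and names are hypothetical.

```r
# Toy ligand-target matrix with ligands in columns (the "cols" layout).
ltm_cols <- matrix(
  c(0.9, 0.1, 0.4,   # scores for TGFB1
    0.2, 0.8, 0.3),  # scores for IL6
  nrow = 3,
  dimnames = list(c("GENE1", "GENE2", "GENE3"), c("TGFB1", "IL6"))
)

# The same scores with ligands in rows (the "rows" layout) are the transpose.
ltm_rows <- t(ltm_cols)

# Either layout yields the same per-ligand score vector.
same_scores <- identical(ltm_cols[, "TGFB1"], ltm_rows["TGFB1", ])
```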

algorithm

The name of the regression algorithm to be applied. Should be supported by the caret package. Examples of algorithms we recommend: "lm", "glmnet", "rf".

var_imps

Indicate whether, in addition to the regression evaluation performances, variable importances should be calculated. Default: TRUE.
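One way to picture a variable importance score is the absolute t-statistic of each coefficient in a linear model, which is what caret documents for linear models. Whether this function obtains its scores in exactly that way is an assumption; the sketch below uses base R only, with simulated data and hypothetical ligand names.

```r
# Simulated scores for two hypothetical ligands; only ligandA drives the response.
set.seed(2)
n_genes <- 60
scores <- data.frame(ligandA = runif(n_genes), ligandB = runif(n_genes))
response <- 3 * scores$ligandA + rnorm(n_genes, sd = 0.3)

fit <- lm(response ~ ligandA + ligandB, data = scores)

# Absolute t-statistic per ligand (intercept row dropped), largest first.
t_values   <- summary(fit)$coefficients[-1, "t value"]
importance <- sort(abs(t_values), decreasing = TRUE)

# The top-ranked ligand is the most important feature for this toy model.
top_ligand <- names(importance)[1]
```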

cv

Indicate whether model training and hyperparameter optimization should be done via cross-validation. Default: TRUE. Setting this to FALSE can be useful for applications that only require variable importances, or when the final model is not expected to be extremely overfit.

cv_number

The number of folds for the cross-validation scheme. Default: 4; only relevant when cv == TRUE.

cv_repeats

The number of repeats during cross-validation. Default: 2; only relevant when cv == TRUE.
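To make the interplay of cv_number and cv_repeats concrete, here is a hand-rolled repeated k-fold loop in base R. It is only a sketch of the general scheme, not the package's code (which delegates training to caret); the data are simulated and the ligand names are hypothetical.

```r
set.seed(3)
n_genes <- 80
dat <- data.frame(ligandA = runif(n_genes), ligandB = runif(n_genes))
dat$response <- 2 * dat$ligandA + rnorm(n_genes, sd = 0.3)

cv_number  <- 4
cv_repeats <- 2

# One squared correlation per held-out fold: cv_number * cv_repeats values.
fold_r2 <- unlist(lapply(seq_len(cv_repeats), function(r) {
  # Fresh random fold assignment for each repeat.
  folds <- sample(rep(seq_len(cv_number), length.out = n_genes))
  sapply(seq_len(cv_number), function(k) {
    test_idx <- folds == k
    fit  <- lm(response ~ ligandA + ligandB, data = dat[!test_idx, ])
    pred <- predict(fit, newdata = dat[test_idx, ])
    cor(pred, dat$response[test_idx])^2
  })
}))

n_resamples <- length(fold_r2)
mean_r2     <- mean(fold_r2)
```

With the defaults (4 folds, 2 repeats) each gene is held out twice, and the reported test performance is averaged over 8 resamples.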

parallel

Indicate whether model training should be parallelized. Default: FALSE. TRUE is only possible on non-Windows operating systems.

n_cores

The number of cores used for parallelized model training via cross-validation. Default: 4. Only relevant on non-Windows operating systems.
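The Windows restriction is typical of fork-based parallelism in R. The sketch below shows the same constraint with parallel::mclapply from base R; it is an analogy only, and the assumption that this function uses a fork-based backend is not confirmed by this page.

```r
library(parallel)

n_cores <- 4

# mclapply relies on forking, which is unavailable on Windows,
# so fall back to a single core there.
cores_to_use <- if (.Platform$OS.type == "windows") 1L else n_cores

# Toy workload standing in for per-fold model training.
squares <- mclapply(1:8, function(i) i^2, mc.cores = cores_to_use)
```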

ignore_errors

Indicate whether errors raised by caret during model training should be ignored, in which case training is retried until a model is trained without raising errors. Default: FALSE.
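The retry behaviour described for ignore_errors can be sketched with tryCatch. The helper train_with_retry and its max_tries cap are hypothetical additions for illustration (the cap avoids an endless loop in the sketch); they are not part of nichenetr or caret.

```r
# Hypothetical helper: call `train_fun` until it succeeds or `max_tries` is hit.
train_with_retry <- function(train_fun, max_tries = 5) {
  for (attempt in seq_len(max_tries)) {
    # Swallow the error and signal failure with NULL.
    result <- tryCatch(train_fun(), error = function(e) NULL)
    if (!is.null(result)) {
      return(list(model = result, attempts = attempt))
    }
  }
  stop("training failed after ", max_tries, " attempts")
}

# Toy trainer that fails on its first two calls, then succeeds.
calls <- 0
flaky_train <- function() {
  calls <<- calls + 1
  if (calls < 3) stop("simulated training error")
  "fitted model"
}

out <- train_with_retry(flaky_train)
```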

Value

A list with the following elements. $performances: data frame containing regression model fit metrics on the test folds during training via cross-validation; $performances_training: data frame containing regression model fit metrics of the final model on the complete data set (performance can be severely optimistic due to overfitting!); $var_imps: data frame containing the variable importances of the different ligands (embedded importance score for some algorithms, otherwise just the auroc); $prediction_response_df: data frame containing, for each gene, the ligand-target predictions of the individual ligands, the prediction of the complete model, and the response; $setting: name of the specific setting that was evaluated; $ligands: ligands of interest.

Examples

## Not run: 
library(dplyr)
# Build the weighted networks from the ligand-receptor, signaling and gene regulatory networks.
weighted_networks = construct_weighted_networks(lr_network, sig_network, gr_network, source_weights_df)
# Convert one expression setting to the regression format and wrap it in a list.
setting = convert_expression_settings_evaluation_regression(expression_settings_validation$TGFB_IL6_timeseries) %>% list()
ligands = extract_ligands_from_settings(setting)
ligand_target_matrix = construct_ligand_target_matrix(weighted_networks, ligands)
# Evaluate every setting in the list with a linear model.
output = lapply(setting, evaluate_multi_ligand_target_prediction_regression, ligand_target_matrix, ligands_position = "cols", algorithm = "lm")

## End(Not run)

saeyslab/nichenetr documentation built on March 26, 2024, 9:22 a.m.