estimate_source_weights_characterization: Estimate data source weights of data sources of interest...

estimate_source_weights_characterization    R Documentation

Estimate data source weights of data sources of interest based on leave-one-in and leave-one-out characterization performances.

Description

estimate_source_weights_characterization estimates weights for the data sources of interest, using a model trained to predict data source weights from leave-one-in and leave-one-out characterization performances.
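The idea described above can be sketched with base R. This is a toy illustration under stated assumptions, not the package's implementation: the performance columns (loi_auroc, loo_auroc), the source names source_1 to source_8, and all numbers are made up for the example. Only "kegg_cytokines" is taken from the Examples section.

```r
# Toy illustration only: column names and numbers are invented for this sketch.
# Sources with known weights, plus one performance value per characterization type.
known <- data.frame(
  source    = paste0("source_", 1:8),
  loi_auroc = c(0.55, 0.60, 0.52, 0.70, 0.65, 0.58, 0.62, 0.68),
  loo_auroc = c(0.74, 0.72, 0.75, 0.66, 0.69, 0.73, 0.71, 0.67),
  weight    = c(0.2, 0.4, 0.1, 0.9, 0.7, 0.3, 0.5, 0.8),
  stringsAsFactors = FALSE
)

# Regress the known weights on leave-one-in and leave-one-out performances
# (a linear model, analogous in spirit to random_forest = FALSE).
fit <- lm(weight ~ loi_auroc + loo_auroc, data = known)

# A source of interest: performances are available, the weight is not.
new_source <- data.frame(
  source    = "kegg_cytokines",
  loi_auroc = 0.63,
  loo_auroc = 0.70,
  stringsAsFactors = FALSE
)

# Predict its weight from the fitted regression.
new_source$weight <- predict(fit, newdata = new_source)
print(new_source)
```

In this sketch the estimated weight is simply the regression prediction; how the package post-processes or bounds the prediction is not shown here.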

Usage

estimate_source_weights_characterization(loi_performances, loo_performances, source_weights_df, sources_oi, random_forest = FALSE)

Arguments

loi_performances

Performances of models in which a particular data source of interest was the only data source in either the ligand-signaling or the gene regulatory network.

loo_performances

Performances of models in which a particular data source of interest was removed from the ligand-signaling or the gene regulatory network before model construction.

source_weights_df

A data frame / tibble containing the weight associated with each individual data source. Sources with higher weights will contribute more to the final model performance (required columns: source, weight). Note that only interactions described by sources included here will be retained during model construction.

sources_oi

The names of the data sources for which weights should be estimated based on leave-one-in and leave-one-out performances.

random_forest

Indicates whether the regression of data source weights on leave-one-in and leave-one-out performances should use a random forest model (TRUE) or a linear model (FALSE). Default: FALSE
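As a minimal sketch of the input shapes described in the arguments above, the following builds a source_weights_df with the two required columns and a sources_oi vector. The names example_source_a and example_source_b and all weights are hypothetical placeholders; "kegg_cytokines" is the source used in the Examples section. Excluding the sources of interest from the weights table mirrors the filter() call in the Examples and is shown here with base R.

```r
# Hypothetical weights table; only "kegg_cytokines" comes from the Examples section.
source_weights_df <- data.frame(
  source = c("example_source_a", "example_source_b", "kegg_cytokines"),
  weight = c(1, 0.5, 1),
  stringsAsFactors = FALSE
)

# Data sources whose weights are to be estimated.
sources_oi <- c("kegg_cytokines")

# Check that the required columns are present.
stopifnot(all(c("source", "weight") %in% colnames(source_weights_df)))

# Keep only the sources with known weights, as the Examples section does with filter().
known_weights_df <- source_weights_df[!source_weights_df$source %in% sources_oi, ]
print(known_weights_df)
```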

Value

A list containing two elements: $source_weights_df (the input source_weights_df extended with the estimated source weights for the data sources of interest) and $model (the model object of the regression between leave-one-in and leave-one-out performances and data source weights).
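The returned list could be inspected along the following lines. The object below is a mock with the documented shape (two elements named source_weights_df and model); its values are placeholders rather than real results, and example_source_a is a hypothetical source name.

```r
# Mock return value with the documented structure; values are placeholders.
mock_model <- lm(
  weight ~ loi,
  data = data.frame(weight = c(0.2, 0.5, 0.9), loi = c(0.5, 0.6, 0.7))
)
output <- list(
  source_weights_df = data.frame(
    source = c("example_source_a", "kegg_cytokines"),
    weight = c(1, 0.6),
    stringsAsFactors = FALSE
  ),
  model = mock_model
)

sources_oi <- c("kegg_cytokines")

# Extract the rows for the data sources of interest.
estimated <- output$source_weights_df[output$source_weights_df$source %in% sources_oi, ]
print(estimated)

# The regression object is available for further inspection.
print(class(output$model))
```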

Examples

## Not run: 
library(dplyr)
# Convert the first four validation expression settings for model evaluation
settings = lapply(expression_settings_validation[1:4], convert_expression_settings_evaluation)

# Leave-one-in characterization: prepare settings, add hyperparameters, evaluate
weights_settings_loi = prepare_settings_leave_one_in_characterization(lr_network = lr_network, sig_network = sig_network, gr_network = gr_network, source_weights_df)
weights_settings_loi = lapply(weights_settings_loi, add_hyperparameters_parameter_settings, lr_sig_hub = 0.25, gr_hub = 0.5, ltf_cutoff = 0, algorithm = "PPR", damping_factor = 0.2, correct_topology = TRUE)
doMC::registerDoMC(cores = 4)
# Only the first four settings are evaluated here
job_characterization_loi = parallel::mclapply(weights_settings_loi[1:4], evaluate_model, lr_network = lr_network, sig_network = sig_network, gr_network = gr_network, settings, calculate_popularity_bias_target_prediction = FALSE, calculate_popularity_bias_ligand_prediction = FALSE, ncitations, mc.cores = 4)
loi_performances = process_characterization_target_prediction_average(job_characterization_loi)

# Leave-one-out characterization: same steps with the leave-one-out settings
weights_settings_loo = prepare_settings_leave_one_out_characterization(lr_network = lr_network, sig_network = sig_network, gr_network = gr_network, source_weights_df)
weights_settings_loo = lapply(weights_settings_loo, add_hyperparameters_parameter_settings, lr_sig_hub = 0.25, gr_hub = 0.5, ltf_cutoff = 0, algorithm = "PPR", damping_factor = 0.2, correct_topology = TRUE)
doMC::registerDoMC(cores = 4)
job_characterization_loo = parallel::mclapply(weights_settings_loo[1:4], evaluate_model, lr_network = lr_network, sig_network = sig_network, gr_network = gr_network, settings, calculate_popularity_bias_target_prediction = FALSE, calculate_popularity_bias_ligand_prediction = FALSE, ncitations, mc.cores = 4)
loo_performances = process_characterization_target_prediction_average(job_characterization_loo)

# Estimate the weight of the source of interest; it is excluded from the known weights
sources_oi = c("kegg_cytokines")
output = estimate_source_weights_characterization(loi_performances, loo_performances, source_weights_df %>% filter(source != "kegg_cytokines"), sources_oi, random_forest = FALSE)

## End(Not run)


saeyslab/nichenetr documentation built on March 26, 2024, 9:22 a.m.