model.sensitivity: Run sensitivity analysis

model.sensitivity    R Documentation

Run sensitivity analysis

Description

The deterministic process is solved several times while varying the values of the unknown parameters, in order to identify the sensitive ones (i.e., those that have the greatest effect on the model behavior) by means of the Partial Rank Correlation Coefficients (PRCCs). Furthermore, a ranking of the simulations is returned according to the distance of each solution from the reference one.
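As an illustration of what a PRCC measures, the following is a minimal sketch in base R, not the code used by the package. It rank-transforms the sampled parameters and the output, removes the linear effect of the other parameters from both, and correlates the residuals. The function name `prcc_sketch`, the parameter names `a` and `b`, and the synthetic data are assumptions made for this example.

```r
# Sketch of a PRCC computation on synthetic data (illustrative only).
set.seed(1)
n <- 200
params <- data.frame(a = runif(n), b = runif(n))
# Synthetic model output: strongly driven by a, weakly by b.
y <- 5 * params$a + 0.1 * params$b + rnorm(n, sd = 0.1)

prcc_sketch <- function(params, y) {
  ranked <- as.data.frame(lapply(params, rank))  # rank-transform the inputs
  ry <- rank(y)                                  # rank-transform the output
  vapply(names(ranked), function(p) {
    others <- ranked[, setdiff(names(ranked), p), drop = FALSE]
    # Residuals after regressing out the remaining (ranked) parameters.
    res_p <- resid(lm(resp ~ ., data = cbind(resp = ranked[[p]], others)))
    res_y <- resid(lm(resp ~ ., data = cbind(resp = ry, others)))
    cor(res_p, res_y)
  }, numeric(1))
}

prcc_values <- prcc_sketch(params, y)
```

A coefficient close to 1 or -1 marks a parameter to which the output is sensitive; a coefficient near 0 marks one with little monotonic influence once the other parameters are accounted for.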

Usage

model.sensitivity(
  folder_trace = NULL,
  solver_fname = NULL,
  ini_v = NULL,
  i_time = 0,
  f_time,
  s_time,
  atol = 1e-06,
  rtol = 1e-06,
  n_config = 1,
  parameters_fname = NULL,
  functions_fname = NULL,
  volume = getwd(),
  timeout = "1d",
  parallel_processors = 1,
  reference_data = NULL,
  distance_measure = NULL,
  target_value = NULL,
  event_times = NULL,
  event_function = NULL,
  extend = FALSE,
  seed = NULL,
  out_fname = NULL,
  user_files = NULL,
  debug = FALSE,
  fba_fname = NULL,
  FVA = FALSE,
  flux_fname = NULL,
  fva_gamma = 0.9
)

Arguments

folder_trace

Folder storing the trace files used for the PRCC analysis.

solver_fname

.solver file (generated with the function *model_generation*).

i_time

Initial solution time.

f_time

Final solution time.

s_time

Time step defining the frequency at which explicit estimates for the system values are desired.

atol

Absolute error tolerance that determines the error control performed by the LSODA solver.

rtol

Relative error tolerance that determines the error control performed by the LSODA solver.

n_config

Number of configurations to generate. It is relevant only if some parameters are sampled from a stochastic distribution, which has to be encoded in the functions defined in *functions_fname* or in *parameters_fname*.

parameters_fname

A text file listing the parameters to be studied together with their range of variability. The file has three mandatory columns (*which must be separated by ;*): (1) a tag representing the parameter type: *i* for the complete initial marking (or condition), *m* for the initial marking of a specific place, *c* for a single constant rate, and *g* for a rate associated with general transitions (Pernice et al. 2019) (the user must define a file name consistent with the one used in the general transitions file); (2) the name of the transition that is varied (this must match the name used in the PN drawn in the GreatSPN editor); if the complete initial marking is considered (i.e., with tag *i*), the name *init* is used by default; (3) the function used to sample the value of the variable considered, which can be either an R function or a user-defined function (in the latter case it has to be implemented in the R script passed through the *functions_fname* input parameter). Note that the output of this function must have size equal to the length of the varying parameter, that is, 1 when tags *m*, *c* or *g* are used, and the size of the marking (number of places) when *i* is used. The remaining columns hold the input parameters required by the function given in the third column.
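The three-column layout can be sketched as follows. This is only an illustration: the transition names `Infection` and `Recovery`, the user-defined function name `init_marking`, the file name, the sampling ranges, and the `name = value` form of the trailing columns are assumptions, not values taken from the package.

```r
# Hypothetical parameters file: every name and range below is illustrative.
param_lines <- c(
  "c; Infection; runif; n = 1; min = 0; max = 0.01",  # constant rate, R function
  "c; Recovery; runif; n = 1; min = 0; max = 0.1",    # constant rate, R function
  "i; init; init_marking; n_places = 3"               # full marking, user function
)
params_file <- file.path(tempdir(), "parameters_list.csv")
writeLines(param_lines, params_file)

# Read the file back to inspect the mandatory columns (tag; name; function).
# fill = TRUE pads the shorter row, since rows may carry different numbers
# of trailing function arguments.
parsed <- read.table(params_file, sep = ";", header = FALSE,
                     strip.white = TRUE, fill = TRUE,
                     stringsAsFactors = FALSE)
mandatory <- parsed[, 1:3]
```

Each `runif` call returns a single value, matching the size required for tag *c*; the hypothetical `init_marking` would have to return one value per place.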

functions_fname

An R file storing: 1) the user-defined functions that generate instances of the parameters listed in the *parameters_fname* file, and 2) the functions that compute: the distance (or error) between the model output and the reference dataset (see *reference_data* and *distance_measure*), the discrete events that may modify the marking of the net at specific time points (see *event_function*), and the place or combination of places for which the PRCCs over time have to be calculated (see *target_value*).

volume

The folder to mount within the Docker image providing all the necessary files.

timeout

Maximum execution time allowed for each configuration.

parallel_processors

Integer for the number of available processors to use.

reference_data

csv file storing the data to be compared with the simulation results.

distance_measure

String reporting the name of the distance function, implemented in *functions_fname*, used to rank the simulations. The function takes two arguments: the reference data and a list of data.frames containing the simulation outputs. It has to return a data.frame with the id of each simulation and its corresponding distance from the reference data.
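A distance function of the shape described above could look like the following minimal sketch. The function name `sse_distance`, the column name `I`, and the assumption that the reference data is a numeric vector with one value per time point are all hypothetical; the sum of squared errors is just one possible choice of distance.

```r
# Hypothetical distance function: sum of squared errors on one place.
# Assumptions: `reference` is a numeric vector with one value per time point,
# and each element of `output_list` is a data.frame with a column named "I".
sse_distance <- function(reference, output_list) {
  distances <- vapply(
    output_list,
    function(out) sum((out$I - reference)^2),
    numeric(1)
  )
  data.frame(id = seq_along(output_list), distance = unname(distances))
}

# Usage on two toy simulation outputs.
reference <- c(1, 2, 3)
sims <- list(
  data.frame(Time = 0:2, I = c(1, 2, 3)),
  data.frame(Time = 0:2, I = c(1, 2, 5))
)
ranking <- sse_distance(reference, sims)
ranking <- ranking[order(ranking$distance), ]  # best-fitting simulation first
```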

event_times

Vector representing the time points at which the simulation has to stop in order to simulate a discrete event that modifies the marking of the net given a specific rule defined in *functions_fname*.

event_function

String reporting the name of the function, implemented in *functions_fname*, used to modify the total marking at a specific time point. The function takes as input: 1) a vector representing the marking of the net (called *marking*), and 2) the time point at which the simulation has stopped (called *time*). In particular, *time* takes its values from *event_times*.
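As a sketch of such an event function, the example below adds tokens to one place at a given event time. The function name `add_tokens_event`, the choice of the first place, the event time 10, and the convention that the function returns the updated marking are assumptions made for illustration.

```r
# Hypothetical event function. Assumptions: `marking` is a numeric vector with
# one entry per place, and the updated marking is the return value.
add_tokens_event <- function(marking, time) {
  new_marking <- marking
  if (time == 10) {
    # At the event time 10, add 5 tokens to the first place.
    new_marking[1] <- new_marking[1] + 5
  }
  new_marking
}

# Usage: the marking changes only at the matching event time.
at_event <- add_tokens_event(c(10, 0, 0), time = 10)
off_event <- add_tokens_event(c(10, 0, 0), time = 20)
```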

extend

If TRUE, the current set of configurations is extended with n_config new configurations.

seed

.RData file that can be used to initialize the internal random generator.

out_fname

Prefix to the output file name.

user_files

Vector of user files to copy inside the Docker directory.

debug

If TRUE, enables logging.

fba_fname

Vector of .txt files encoding different flux balance analysis problems, which have to be included in the general transitions (*transitions_fname*).

FVA

Flag to enable the flux variability analysis.

flux_fname

Vector of flux ids for which the FVA is computed.

fva_gamma

Parameter that controls whether the analysis is done with respect to suboptimal network states (0 <= fva_gamma < 1) or to the optimal state (fva_gamma = 1). It must be the same files vector passed to the function *model_generation* for generating the *solver_fname*. (default is NULL)

target_value

String reporting the name of the function, implemented in *functions_fname*, that returns the place or combination of places for which the PRCCs over time have to be calculated. In detail, the function takes as input a data.frame, namely output, with a number of columns equal to the number of places plus one (the time) and a number of rows equal to the number of time steps defined previously. It must return the column (or a combination of columns) corresponding to the place (or combination of places) for which the PRCCs have to be calculated at each time step.
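Two minimal sketches of such a target function follow, one returning a single place and one returning a combination of places. The function names and the column names `Time`, `S`, `I` and `R` are hypothetical.

```r
# Hypothetical target functions. Assumption: `output` has a "Time" column plus
# one column per place; "S", "I" and "R" are illustrative place names.
target_infected <- function(output) {
  output[, "I"]            # a single place
}

target_not_susceptible <- function(output) {
  output$I + output$R      # a combination of places
}

# Usage on a toy solver output with three time steps.
output <- data.frame(Time = 0:2, S = c(9, 7, 4), I = c(1, 2, 3), R = c(0, 1, 3))
single <- target_infected(output)
combined <- target_not_susceptible(output)
```

Both return one value per row of `output`, that is, one value per time step.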

Details

model.sensitivity takes as input a solver and all the parameters required to set up a dockerized running environment in which the sensitivity analysis of the model is performed. In order to run the simulations, the user must provide a reference dataset and the definition of a function that computes the distance (or error) between the model's output and the reference dataset. The function defining the distance has to be in the following form:

FUNCTION_NAME(reference_dataset, simulation_output)

Moreover, the function must return a column vector with one entry for each evaluation point (i.e. f_time/s_time entries). In addition, the user is asked to provide a function that, given the output of the solver, returns the relevant measure (one column) used to evaluate the quality of the solution.

The sensitivity analysis is performed via Monte Carlo sampling driven by user-defined functions. The parameters involved in the sensitivity analysis have to be listed in a csv file using the following structure:

OUTPUT_FILE_NAME, FUNCTION_NAME, LIST OF PARAMETERS (comma separated)

The functions allowed to compute the parameters are either R functions or user-defined functions. In the latter case, all the user-defined functions must be provided in a single .R file (which is passed to model.sensitivity through the parameter functions_fname).

Exploiting the same mechanism, the user can provide an initial marking to the solver. In that case, the corresponding file name in the parameter list must be set to "init". Note that: (i) the distance and target functions must have the same name as the corresponding R file, (ii) model.sensitivity also exploits parallel processing capabilities, and (iii) if the user is not interested in the ranking calculation, then distance_measure and reference_data are not necessary and can be omitted.

Author(s)

Beccuti Marco, Castagno Paolo, Pernice Simone, Baccega Daniele

See Also

model_generation


qBioTurin/epimod documentation built on June 29, 2024, 8:53 a.m.