README.md

SBtabVFGEN

Convert a model written in SBtab, saved as a series of .tsv files or, alternatively, as an Open Document Spreadsheet (.ods), into a VFGEN vector field file (.vf).

This project is supported by EBRAINS infrastructure and the Human Brain Project, with more detail in acknowledgements.

Install

Using remotes:

remotes::install_github("a-kramer/SBtabVFGEN")

You may have to check .libPaths() to verify that it includes a path that you have permission to write to (this is just generally the case, not just for this package).
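As a minimal sketch of that check, the following base R snippet lists the library paths and reports which of them are writable. The fallback to R_LIBS_USER is an assumption about a typical setup, not something this package requires.

```r
# List each library path and whether the current user can write to it.
lib.paths <- .libPaths()
writable <- file.access(lib.paths, mode = 2) == 0
print(data.frame(path = lib.paths, writable = writable))

# Assumption: if no path is writable, fall back to the personal library
# named by the R_LIBS_USER environment variable (it may not exist yet).
if (!any(writable)) {
  user.lib <- Sys.getenv("R_LIBS_USER")
  if (nzchar(user.lib)) {
    dir.create(user.lib, recursive = TRUE, showWarnings = FALSE)
    .libPaths(c(user.lib, .libPaths()))
  }
}
```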

Currently, this will work on any platform that has R. However, users who need SBML output must also install the libSBML package for R.

This is entirely optional, but SBML files may be useful for sharing models with others. Installing libSBML is not easy, as it is not a CRAN package, nor is it hosted on GitHub or GitLab. Browsing the SBML repository on SourceForge makes it possible to find the right version of the R interface.

Download the tar.gz file libSBML_*.tar.gz in the appropriate version, and then install it as a package using

$ R CMD INSTALL libSBML_*.tar.gz

Purpose

This model conversion tool can be used by scientists working in the field of systems biology and all adjacent fields that work with ordinary differential equation (ODE) models.

It can be helpful when collaborating with other researchers as it keeps the model separate from any programming language choice. The user writes the model in SBtab form, a simple, human readable format; afterwards this SBtab model can be converted to an ODE and further processed via vfgen.

The final result is code for the ODE right hand side function and analytical jacobian function (among other things) in the chosen programming language.
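To give a rough idea of what such a pair of functions looks like, here is a hand-written sketch in R for a hypothetical two-compound model A <-> B with mass action kinetics. The names vf, jac, kf and kr are made up for illustration; this is not output of vfgen or of this package. The analytical Jacobian is cross-checked against finite differences.

```r
# Hypothetical model A <-> B; state y = c(A, B), parameters p = c(kf, kr).
# Right hand side of the ODE: dy/dt = vf(t, y, p)
vf <- function(t, y, p) {
  flux <- p[["kf"]] * y[["A"]] - p[["kr"]] * y[["B"]]
  c(A = -flux, B = flux)
}

# Analytical Jacobian: entry [i, j] is d vf_i / d y_j
jac <- function(t, y, p) {
  matrix(c(-p[["kf"]],  p[["kr"]],
            p[["kf"]], -p[["kr"]]),
         nrow = 2, byrow = TRUE,
         dimnames = list(c("A", "B"), c("A", "B")))
}

y0 <- c(A = 1.0, B = 0.0)
p  <- c(kf = 2.0, kr = 0.5)

# Cross-check with forward finite differences, one column per state variable
h <- 1e-6
jac.fd <- sapply(names(y0), function(j) {
  y1 <- y0
  y1[[j]] <- y1[[j]] + h
  (vf(0, y1, p) - vf(0, y0, p)) / h
})
print(max(abs(jac.fd - jac(0, y0, p))))
```

For larger models, writing the Jacobian by hand becomes error-prone, which is the reason to generate it.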

This tool prepares a model M for use in a numerical analysis application, such as parameter estimation:

  User written:                           generated
  +-----------+      +-----------+      +------------+      +----------+
  |           |      |           |      |  (CVODE)   |      |          |
  | SBtab (M) +--+-->+   VFGEN   +----->+  ODE code  +----->+   MCMC   |
  |           |  |   |           |      | +jacobian  |      |  (e.g.)  |
  +-----------+  |   +-----------+      +------------+      +----------+
                 |
            +----+--------------+
            |                   |
            | sbtab_to_vfgen()  |
            |                   |
            +-------------------+

The above sketch illustrates this tool's location within a larger workflow (context).

As an alternative to VFGEN, we have written a less powerful tool that only creates C and R code (while VFGEN covers many languages): RPN-derivative (forked), which includes the shell script sh/ode.sh. The shell script uses the derivative program (same repository, written in C) to calculate model Jacobians analytically.

Usage Example

Within an interactive R session called from the folder that contains the tsv files:

library(SBtabVFGEN)
model.tsv <- dir(pattern="[.]tsv$");
model.sbtab <- sbtab_from_tsv(model.tsv)
sbtab_to_vfgen(model.sbtab)

Shell Interface via Rscript

An additional R script can be called from the command line; it uses sbtab_to_vfgen.R internally:

$ alias sbtab_to_vfgen='.../path/to/R/sbtab_to_vfgen'
$ sbtab_to_vfgen *.tsv

This will work if the .tsv files have acceptable SBtab content.

Other Output Formats

As a by-product, sbtab_to_vfgen() also produces a mod file intended as a starting point for use in NEURON. This is not the primary purpose of this function.

If libSBML is installed and R bindings available, an attempt will be made to produce an SBML file (see further below).

The type of systems biology models that we have in mind are (plain) ordinary differential equations (ODEs).

NEURON comprehensively models biochemistry and electrophysiology (e.g. membrane action potentials). These two different simulations have to be coupled, which NEURON does. This aspect is missing from the SBtab files we typically use, so the produced mod file is only a starting point; the user has to change and adapt the file to the intended purpose and make it work inside NEURON. The user must also be aware of NEURON's units and make the necessary unit conversions at some point.

There is no automatic conversion of units (yet).

VFGEN Output

VFGEN is a very useful tool that reformats an ODE (given in vfgen's xml format) and converts it into various programming languages, including right hand side functions for two ODE solvers in C: gsl and cvode. While doing so, it uses GiNaC to calculate the model's Jacobian analytically (among other things).

The R script sbtab_to_vfgen.R converts an SBtab model to vfgen's xml format (with normal math, as string attributes), among others; the name of that file ends in .vf.

Biological models don't necessarily map uniquely onto ODE models: a compound can be a state variable, an algebraic assignment, or a constant. This has to be inferred, to some extent, from the SBtab files.

SBtab

SBtab is a tabular format for biochemical models (as in Systems Biology). It is perhaps easier to understand than SBML and, because it is tabular (stored e.g. as tab separated value text files), it can be parsed and processed with shell scripts and line oriented tools such as sed and awk.

In addition to the official upstream documentation, we have summarised the SBtab entries that this script can use in sbtab.md. The SBtab specification does not go into detail about many use-cases, so some interpretation on our part was needed. The SBtab files we create may not adhere perfectly to the official specs and similarly SBtab files created by official SBtab software may not work here. This will probably improve over time, but currently we use no code/software from the SBtab authors (upstream).

Open Document Format, Gnumeric and Spreadsheets in General

Even though .tsv files are the more fundamental format (they have the fewest prerequisites), .ods files keep all the sheets in one file and are slightly more convenient. Gnumeric is spreadsheet software that handles both .ods and .tsv files fairly well (it also has its own format, .gnumeric).

Conversion between spreadsheet formats like .ods, .gnumeric and .tsv files is very convenient using ssconvert, a part of gnumeric. The shell scripts ods_to_tsv.sh and tsv_to_ods.sh in this repository are an example of ssconvert usage.

An SBtab document can be imported from an open document spreadsheet (.ods) directly using the readODS package:

library("SBtabVFGEN")
model.ods <- "examplemodel.ods"
if (file.exists(model.ods)) {
  model.sbtab <- sbtab_from_ods(model.ods)
  sbtab_to_vfgen(model.sbtab)
}

The result is written to several files (.vf, .mod, and .xml). Some other results with additional information are also created.

In either case, whether TSV or ODS was used, model.sbtab will be a list of data.frame objects.
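As a rough illustration of that structure, the sketch below builds a comparable list with base R only. It is not the package's parser: the table content, the column names and the helper read_table are hypothetical, and it assumes that an SBtab sheet starts with a `!!SBtab` declaration line followed by a header line of `!`-prefixed column names.

```r
# Write a tiny, hypothetical SBtab-like table to a temporary file.
tsv.file <- file.path(tempdir(), "Compound.tsv")
writeLines(c(
  "!!SBtab TableName='Compound' TableType='Compound'",
  "!ID\t!Name\t!InitialValue",
  "A\tsubstrate\t1.0",
  "B\tproduct\t0.0"
), tsv.file)

# Hypothetical helper: skip the '!!SBtab' line, keep column names unchanged.
read_table <- function(f) {
  read.delim(f, skip = 1, check.names = FALSE, stringsAsFactors = FALSE)
}

# One data.frame per sheet, collected in a named list.
tables <- list(Compound = read_table(tsv.file))

# The list can then be inspected sheet by sheet.
print(names(tables))
str(tables$Compound)
print(tables$Compound[["!ID"]])
```

The same inspection calls (names, str, and column access with `[[`) should apply to any list of data.frame objects, whatever function produced it.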

Other spreadsheet programs, such as Google spreadsheets and LibreOffice, export to tsv one sheet at a time (with no easy workarounds) and lack an option to export N sheets into N files. Gnumeric's ssconvert command has such an option.

Systems Biology Markup Language (SBML level 2 version 4)

The program sbtab_to_vfgen.R also produces an .xml file in the Systems Biology Markup Language (SBML). This is only done if libSBML is installed with R bindings, like this:

$ R CMD INSTALL libSBML_5.18.0.tar.gz

If this check: if (requireNamespace("libSBML")) succeeds, then the script attempts to make an SBML file. SBML is a format that has units, and the units defined in SBtab are forwarded to SBML. The formats are very different with regard to unit handling and math generally. The method we use to parse human readable text units is described in units.md.
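A minimal sketch of such an availability check follows. Only requireNamespace() is taken from the text above; the variable name has.sbml and the messages are illustrative and do not reproduce the package's internal code.

```r
# requireNamespace() takes the package name as a string and returns
# TRUE or FALSE without attaching the package.
has.sbml <- requireNamespace("libSBML", quietly = TRUE)

if (has.sbml) {
  message("libSBML R bindings found: SBML output can be attempted.")
} else {
  message("libSBML not found: SBML output will be skipped.")
}
```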

Here is a small (incomplete) list of libsbml functions in R (that we used). An auto-generated full list, without comments, is also present.

There is a more modern level of SBML, level 3, but this package lacks the ability to generate this format.



a-kramer/SBtabVFGEN documentation built on Nov. 14, 2024, 8:41 p.m.